-This document is current for OProfile version 0.9.8.
+This document is current for OProfile version 1.0.0.
This document provides some details on the internal workings of OProfile for the
interested hacker. This document assumes strong C, working C++, plus some knowledge of
kernel internals and CPU hardware.
@@ -606,7 +606,7 @@ information.
@@ -750,7 +750,7 @@ or enable on a per-counter basis, unlike the PPro models).
-
2.2. IA64 and perfmon
+
2.2. IA64 and perfmon
@@ -904,7 +904,7 @@ iterator. This provides an entirely lock-free method for extracting data
from the CPU buffers. This process is described in detail later in this chapter.
-
+
Figure 3.1. The OProfile buffers
diff --git a/doc/ocount.1.in b/doc/ocount.1.in
new file mode 100644
index 0000000..790356b
--- /dev/null
+++ b/doc/ocount.1.in
@@ -0,0 +1,274 @@
+.\" Man page for ocount
+.\" Author: Maynard Johnson
+.TH ocount 1 "@DATE@" "oprofile @VERSION@"
+.SH NAME
+ocount \- Event counting tool for Linux
+
+.SH SYNOPSIS
+.B ocount
+[
+.I options
+]
+[ --system-wide | --process-list | --thread-list | --cpu-list | [ command [ args ] ] ]
+
+.SH DESCRIPTION
+.BI ocount
+is an OProfile tool that can be used to count native hardware events occurring
+in either a given application, a set of processes or threads, a subset of active
+system processors, or the entire system. The data collected during
+a counting session is displayed to stdout by default or, optionally,
+to a file.
+.P
+When counting multiple events, the kernel may not be able to count all events
+simultaneously and, thus, may need to multiplex the counting of the events.
+If this happens, the "Percent time enabled" column in the
+.B ocount
+output will be less than 100, but counts are scaled up to a 100% estimated value.
+.br
+
+.SH RUN MODES
+One (and only one) of the following
+.SB run modes
+must be specified. If you run
+.BI ocount
+using a run mode other than
+.BI "command " [args]
+, press Ctrl-c to stop
+.BI ocount
+when finished counting (e.g., when the monitored process ends).
+If you background
+.BI ocount
+(i.e., with '&') while using one of these run modes, you
+.B must
+stop it in a controlled manner so that the data collection process can
+be shut down cleanly and final results can be displayed. Use
+.BI kill
+.BI -SIGINT
+.BI
+for this purpose.
+.TP
+.BI "command " [args]
+The
+.I command
+is the application for which to count events.
+.I args
+are the input arguments required by the application.
+The
+.I command
+and its arguments
+.B must
+be positioned at the
+end of the command line, after all ocount options.
+.br
+.TP
+.BI "--process-list / -p " pids
+Use this option to count events for one or more already-running applications, specified
+via a comma-separated list (
+.I pids
+). Event counts will be collected for all children of the passed process(es)
+as well. You must have privileges for the user ID under which the specified process(es)
+are running; e.g., a non-root user can only monitor processes that run under the same
+user ID as that used for running ocount. A lack of privileges will result in the following
+failure message:
+.br
+ perf_event_open failed with Permission denied
+.br
+
+.TP
+.BI "--thread-list / -r " tids
+Use this option to count events for one or more already-running threads, specified
+via a comma-separated list (
+.I tids
+). Event counts will
+.B not
+be collected for any children of the passed thread(s). See the description of
+.I --process-list
+concerning required privileges.
+.br
+
+.TP
+.BI "--system-wide / -s"
+This option is for counting events for all processes running on your system. You must
+have root authority to run ocount in this mode.
+.br
+
+.TP
+.BI "--cpu-list / -C " cpus
+This option is for counting events on a subset of processors on your system. You must
+have root authority to run ocount in this mode. This is a comma-separated list, where each
+element in the list may be either a single processor number or a range of processor numbers;
+for example: '-C 2,3,4-11,15'.
+.br
+
+.SH OTHER OPTIONS
+.TP
+.BI "--events / -e " event1[,event2[,...]]
+This option is for passing a comma-separated list of event specifications
+for counting. Each event spec is of the form:
+.br
+.I " name[:unitmask[:kernel[:user]]]"
+.br
+.B Note:
+Do
+.B not
+include a
+.I count
+value in the event spec, as that parameter is only needed when profiling.
+.P
+.RS
+You can specify
+.I unitmask
+values using either a numerical value (hex values
+.I must
+begin with "0x") or a symbolic name (if the
+.I name=
+field is shown in the
+.B ophelp
+output). For some named unit masks, the hex value is not unique; thus, OProfile
+tools enforce specifying such unit mask values by name.
+If no unit mask is specified, the default unit mask value for the event is used.
+.P
+The
+.I kernel
+and
+.I user
+parts of the event specification are binary values ('1' or '0') indicating
+whether or not to count events in kernel space and user space.
+.br
+.B Note:
+In order to specify the
+.I kernel/user
+bits, you must also specify a
+.I unitmask
+value, even if the running processor type does not use unit masks \(em
+in which case, use the value '0' to signify a null unit mask; for example:
+.br
+ -e INST_RETIRED_ANY_P:0:1:0
+.br
+ ^ ^ ^
+ | | |--- '0': do not count user space events
+ | |-- '1': count kernel space events
+ |-- '0': the null unit mask
+.P
+Event names for certain processor types include a
+.I "_GRP"
+suffix. For such cases, the
+.I --events
+option may be specified with or without the
+.I "_GRP"
+suffix.
+.P
+When no event specification is given, the default event for the running
+processor type will be used for counting.
+Use
+.BI ophelp
+to list the available events for your processor type.
+.RE
+.br
+
+.TP
+.BI "--separate-thread / -t"
+This option can be used in conjunction with either the
+.I --process-list
+or
+.I --thread-list
+option to display event counts on a per-thread (per-process) basis. Without this option, all counts
+are aggregated.
+.P
+.RS
+.BI NOTE:
+If new threads are started by the process(es) being monitored after counting begins,
+the counts for those threads are aggregated with their parent's counts.
+.RE
+
+.br
+.TP
+.BI "--separate-cpu / -c"
+This option can be used in conjunction with either the
+.I --system-wide
+or
+.I --cpu-list
+option
+to display event counts on a per-cpu basis. Without this option, all counts are aggregated.
+.br
+
+.TP
+.BI "--time-interval / -i " interval_length[:num_intervals]
+
+.B Note:
+The
+.I "interval_length"
+is given in milliseconds. However, the current implementation only supports
+100 ms granularity, so the given
+.I "interval_length"
+will be rounded to the nearest 100 ms.
+Results collected for each time interval are printed immediately
+instead of the default of one dump of cumulative event counts at the end of the run.
+Counters are reset to zero at the start of each interval.
+.P
+.RS
+If
+.I num_intervals
+is specified,
+.BI ocount
+exits after the specified number of intervals occur.
+.RE
+
+.TP
+.BI "--brief-format / -b"
+Use this option to print results in the following brief format:
+.br
+ [cpu or thread,][:umask[:K:U]],,
+.br
+ [ ,]< string >[< u32>[]],< u64 >,< double >
+
+The umask,
+.BR K ernel
+and
+.BR U ser
+modes are only printed if the values were specified as part of the event.
+The 'K' and 'U' fields are binary fields separated by colons, where the value for each binary
+field may be either '0' or '1'.
+.P
+.RS
+If
+.I --time-interval
+is specified, a separate line formatted as
+.br
+ timestamp,[.n]
+.br
+is printed ahead of each dump of event counts. If the time interval specified is
+less than one second, the timestamp will have 1/10 second precision.
+.RE
+
+.TP
+.BI "--output-file / -f " outfile_name
+Results are written to
+.I outfile_name
+instead of interactively to the terminal.
+.br
+.TP
+.BI "--verbose / -V"
+Use this option to increase the verbosity of the output.
+.br
+.TP
+.BI "--version / -v"
+Show ocount version.
+.br
+.TP
+.BI "--help / -h"
+Display brief usage message.
+.br
+.TP
+.BI "--usage / -u"
+Display brief usage message.
+.br
+
+.SH EXAMPLE
+$ ocount make
+
+.SH VERSION
+This man page is current for @PACKAGE@-@VERSION@.
+
+.SH SEE ALSO
+operf(1).
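The `--cpu-list` syntax described in the ocount man page above (single processor numbers or ranges, comma-separated, as in '-C 2,3,4-11,15') can be illustrated with a short Python sketch. This is not ocount's own parsing code; the function name `expand_cpu_list` and its error handling are assumptions made for illustration only.

```python
def expand_cpu_list(spec):
    """Expand a --cpu-list style string into a sorted list of CPU numbers.

    Hypothetical helper: each comma-separated element is either a single
    processor number ("3") or an inclusive range ("4-11").
    """
    cpus = set()
    for element in spec.split(","):
        element = element.strip()
        if not element:
            raise ValueError(f"empty element in CPU list: {spec!r}")
        if "-" in element:
            # A range such as "4-11" covers both end points.
            first, last = element.split("-", 1)
            start, end = int(first), int(last)
            if start > end:
                raise ValueError(f"descending range: {element!r}")
            cpus.update(range(start, end + 1))
        else:
            cpus.add(int(element))
    return sorted(cpus)


print(expand_cpu_list("2,3,4-11,15"))
```

Returning a sorted, de-duplicated list is a design choice of this sketch; the man page does not say how ocount treats overlapping elements.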
diff --git a/doc/op-check-perfevents.1.in b/doc/op-check-perfevents.1.in
new file mode 100644
index 0000000..fc98683
--- /dev/null
+++ b/doc/op-check-perfevents.1.in
@@ -0,0 +1,36 @@
+.TH OP-CHECK-PERFEVENTS 1 "@DATE@" "oprofile @VERSION@"
+.UC 4
+.SH NAME
+op-check-perfevents \- checks for kernel perf pmu support
+.SH SYNOPSIS
+.br
+.B op-check-perfevents
+[
+.I options
+]
+.SH DESCRIPTION
+
+The small helper program
+.B op-check-perfevents
+determines whether the kernel supports the perf interface
+and returns a zero exit status if the perf pmu support is available.
+.SH OPTIONS
+.TP
+.BI "--help / -h"
+Show usage help message.
+.br
+.TP
+.BI "--verbose / -v"
+Print a string describing the error number returned by the perf_event_open syscall.
+.br
+
+.SH ENVIRONMENT
+No special environment variables are recognized by op-check-perfevents.
+
+.SH VERSION
+.TP
+This man page is current for @PACKAGE@-@VERSION@.
+
+.SH SEE ALSO
+.BR @OP_DOCDIR@,
+.BR oprofile(1)
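Since the op-check-perfevents man page above states that the helper returns a zero exit status when perf pmu support is available, a script could gate on it roughly as follows. This is a minimal Python sketch: the function name `perf_events_supported`, the injectable `runner` parameter, and the choice to treat a missing helper binary as "not supported" are all assumptions, not part of OProfile.

```python
import subprocess


def perf_events_supported(runner=subprocess.run):
    """Return True if op-check-perfevents exits with status 0.

    `runner` defaults to subprocess.run and is injectable only so the
    logic can be exercised without the helper installed (an assumption
    of this sketch).
    """
    try:
        result = runner(
            ["op-check-perfevents"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Helper not found on PATH: this sketch reports "not supported".
        return False
    return result.returncode == 0
```

A caller that needs to tell "helper missing" apart from "kernel lacks support" would have to handle `FileNotFoundError` itself instead of relying on the boolean.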
diff --git a/doc/opannotate.1.in b/doc/opannotate.1.in
index ba57a38..98eda51 100644
--- a/doc/opannotate.1.in
+++ b/doc/opannotate.1.in
@@ -40,7 +40,21 @@ used --separate.
.br
.TP
.BI "--exclude-file [files]"
-Exclude all files in the given comma-separated list of glob patterns.
+Exclude all files in the given comma-separated list of glob patterns. This option
+is supported solely with the
+.I --source
+option. It can be used to filter out source files in the output using the
+following types of specifications:
+.RS
+.IP \(bu 2
+filenames (basename -- i.e., no path)
+.IP \(bu 2
+filename glob specifications (all files whose base filename matches the given pattern)
+.IP \(bu 2
+directory segments (all source files located in the specified directory; e.g. "libio")
+.IP \(bu 2
+directory segment glob specifications (e.g., "libi*")
+.RE
.br
.TP
.BI "--exclude-symbols / -e [symbols]"
@@ -62,6 +76,13 @@ A path to a filesystem to search for additional binaries.
.TP
.BI "--include-file [files]"
Only include files in the given comma-separated list of glob patterns.
+The same rules apply for this option as for the
+.I --exclude-file
+option.
+.br
+.TP
+.BI "--merge / -m [lib,cpu,tid,tgid,unitmask,all]"
+Merge any profiles separated in a --separate session.
.br
.TP
.BI "--include-symbols / -i [symbols]"
@@ -104,12 +125,11 @@ looking for them in --search-dirs.
.BI "--session-dir="dir_path
Use sample database from the specified directory
.I dir_path
-instead of the default locations. If
+instead of the default location. If
.I --session-dir
is not specified, then
.B opannotate
-will search for samples in
-.I /oprofile_data
+will search for samples in /oprofile_data
first. If that directory does not exist, the standard session-dir of /var/lib/oprofile is used.
.br
.TP
@@ -119,8 +139,13 @@ for the binaries.
.br
.TP
.BI "--threshold / -t [percentage]"
-Only output data for symbols that have more than the given percentage
-of total samples.
+For annotated assembly, only output data for symbols that have more than the given percentage
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the symbol is shown.
+
+For annotated source, only output data for source files that have more than the given percentage
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the source file is shown.
.br
.TP
.BI "--verbose / -V [options]"
@@ -134,15 +159,9 @@ Show version.
No special environment variables are recognised by opannotate.
.SH FILES
-.I /oprofile_data/samples
-.RS 7
-Or
-.RE
-.I /var/lib/oprofile/samples/
-.LP
-.RS 7
+.TP
+.I /samples
The location of the generated sample files.
-.RE
.SH VERSION
.TP
diff --git a/doc/oparchive.1.in b/doc/oparchive.1.in
index 0dba32d..753d6d5 100644
--- a/doc/oparchive.1.in
+++ b/doc/oparchive.1.in
@@ -36,12 +36,11 @@ Give verbose debugging output.
.BI "--session-dir="dir_path
Use sample database from the specified directory
.I dir_path
-instead of the default locations. If
+instead of the default location. If
.I --session-dir
is not specified, then
.B oparchive
-will search for samples in
-.I /oprofile_data
+will search for samples in /oprofile_data
first. If that directory does not exist, the standard session-dir of /var/lib/oprofile is used.
.br
.TP
@@ -68,18 +67,12 @@ used --separate.
Only list the files that would be archived, don't copy them.
.SH ENVIRONMENT
-No special environment variables are recognised by oparchive.
+No special environment variables are recognized by oparchive.
.SH FILES
-.I /oprofile_data/samples
-.RS 7
-Or
-.RE
-.I /var/lib/oprofile/samples/
-.LP
-.RS 7
+.TP
+.I /samples
The location of the generated sample files.
-.RE
.SH VERSION
.TP
diff --git a/doc/opcontrol.1.in b/doc/opcontrol.1.in
deleted file mode 100644
index 0b595f3..0000000
--- a/doc/opcontrol.1.in
+++ /dev/null
@@ -1,195 +0,0 @@
-.TH OPCONTROL 1 "@DATE@" "oprofile @VERSION@"
-.UC 4
-.SH NAME
-opcontrol \- control OProfile profiling
-.SH SYNOPSIS
-.br
-.B opcontrol
-[
-.I options
-]
-.SH DESCRIPTION
-.B opcontrol
-can be used to start profiling, end a profiling session,
-dump profile data, and set up the profiling parameters.
-
-.SH OPTIONS
-.TP
-.BI "--help"
-Show help message.
-.br
-.TP
-.BI "--version"
-Show version.
-.br
-.TP
-.BI "--list-events"
-Shows the monitorable events.
-.br
-.TP
-.BI "--init"
-Load the OProfile module if required and make the OProfile driver
-interface available.
-.br
-.TP
-.BI "--setup"
-Followed by list options for profiling setup. Store setup
-in ~root/.oprofile/daemonrc. Optional.
-.br
-.TP
-.BI "--status"
-Show configuration information.
-.br
-.TP
-.BI "--start-daemon"
-Start the oprofile daemon without starting profiling.
-.br
-.TP
-.BI "--start"
-Start data collection with either arguments provided by --setup
-or with information saved in ~root/.oprofile/daemonrc.
-.br
-.TP
-.BI "--dump"
-Force a flush of the collected profiling data to the daemon.
-.br
-.TP
-.BI "--stop"
-Stop data collection.
-.br
-.TP
-.BI "--shutdown"
-Stop data collection and kill the daemon.
-.br
-.TP
-.BI "--reset"
-Clear out data from current session, but leaves saved sessions.
-.br
-.TP
-.BI "--save="sessionname
-Save data from current session to sessionname.
-.br
-.TP
-.BI "--deinit"
-Shut down daemon. Unload the oprofile module and oprofilefs.
-.br
-.TP
-.BI "--session-dir="dir_path
-Use sample database out of directory dir_path instead of the default location (/var/lib/oprofile).
-.br
-.TP
-.BI "--buffer-size="num
-Set kernel buffer to num samples. The buffer watershed needs
-to be tweaked when changing this value.
-Rules: A non-zero value goes into effect after a '--shutdown/start' sequence.
-A value of zero sets this parameter back to default value, but does not go into
-effect until after '--deinit/init' sequence.
-.br
-.TP
-.BI "--buffer-watershed="num
-Set kernel buffer watershed to num samples. When
-buffer-size - buffer-watershed free entries remain in the kernel buffer, data will be
-flushed to the daemon. Most useful values are in the range [0.25 - 0.5] * buffer-size.
-Same rules as defined for buffer-size.
-.br
-.TP
-.BI "--cpu-buffer-size="num
-Set kernel per-cpu buffer to num samples. If you profile at high
-rate it can help to increase this if the log file show excessive count of
-sample lost cpu buffer overflow. Same rules as defined for buffer-size.
-.br
-.TP
-.BI "--event="[event|"default"]
-Specify an event to measure for the hardware performance counters,
-or "default" for the default event. The event is of the form
-"CPU_CLK_UNHALTED:30000:0:1:1" where the numeric values are
-count, unit mask, kernel-space counting, user-space counting,
-respectively. Note that this over-rides all previous events selected;
-if you want two or more counters used simultaneously, you must specify
-them on the same opcontrol invocation. The numerical unit mask
-can also be a string which matches the first word in the unit mask
-description, but only for events with "extra:" parameters shown.
-Unit masks with "extra:" parameters
-.I must
-be specified by first word.
-.br
-.TP
-.BI "--separate="[none,lib,kernel,thread,cpu,all]
-Separate samples based on the given separator. 'lib' separates
-dynamically linked library samples per application. 'kernel' separates
-kernel and kernel module samples per application; 'kernel'
-implies 'library'. 'thread' gives separation for each thread and
-task. 'cpu' separates for each CPU. 'all' implies all of the above
-options and 'none' turns off separation.
-.br
-.TP
-.BI "--callgraph=#depth"
-Enable callgraph sample collection with a maximum depth. Use 0 to disable
-callgraph profiling. This option is available on x86 using a
-2.6+ kernel with callgraph support enabled. It is also available on PowerPC using a 2.6.17+ kernel.
-.br
-.TP
-.BI "--image="[name,name...|"all"]
-Only profile the given absolute paths to binaries, or "all" to profile
-everything (the default).
-.br
-.TP
-.BI "--vmlinux="file
-vmlinux kernel image.
-.br
-.TP
-.BI "--no-vmlinux"
-Use this when you don't have a kernel vmlinux file, and you don't want to
-profile the kernel.
-.br
-.TP
-.BI "--verbose"
-Be verbose in the daemon log. This has a high overhead.
-.br
-.TP
-.BI "--kernel-range="start,end
-Set kernel range vma address in hexadecimal.
-
-.SH OPTIONS (specific to Xen)
-.TP
-.BI "--xen="file
-Xen image
-.br
-.TP
-.BI "--active-domains="
-List of domain ids participating in a multi-domain profiling session. If
-more than one domain is specified in they should be separated using
-commas. This option can only be used in domain 0 which is the only domain
-that can coordinate a multi-domain profiling session. Including domain 0 in
-the list of active domains is optional. (e.g. --active-domains=2,5,6 and
---active-domains=0,2,5,6 are equivalent)
-.br
-.SH OPTIONS (specific to System z)
-.TP
-.BI "--s390hwsampbufsize="num
-Number of 2MB areas used per CPU for storing sample data. The best
-size for the sample memory depends on the particular system and the
-workload to be measured. Providing the sampler with too little memory
-results in lost samples. Reserving too much system memory for the
-sampler impacts the overall performance and, hence, also the workload
-to be measured.
-.br
-
-.SH ENVIRONMENT
-No special environment variables are recognised by opcontrol.
-
-.SH FILES
-.TP
-.I /root/.oprofile/daemonrc
-Configuration file for opcontrol
-.TP
-.I /var/lib/oprofile/samples/
-The location of the generated sample files.
-
-.SH VERSION
-.TP
-This man page is current for @PACKAGE@-@VERSION@.
-
-.SH SEE ALSO
-.BR @OP_DOCDIR@,
-.BR oprofile(1)
diff --git a/doc/operf.1.in b/doc/operf.1.in
index b109324..efaceb9 100644
--- a/doc/operf.1.in
+++ b/doc/operf.1.in
@@ -13,43 +13,41 @@ operf \- Performance profiler tool for Linux
[ --system-wide | --pid | [ command [ args ] ] ]
.SH DESCRIPTION
-Operf is an OProfile tool that can be used in place of opcontrol for profiling. Operf
-uses the Linux Performance Events Subsystem, and hence, does not require the use of
-the opcontrol daemon -- in fact, operf and opcontrol usage are mutually exclusive.
+Operf is the profiler tool provided with OProfile. Operf
+uses the Linux Performance Events Subsystem and, thus, does not require the
+obsolete oprofile kernel driver.
.P
By default, operf uses /oprofile_data as the session-dir and stores profiling data there.
You can change this by way of the
.I --session-dir
-option.
-.P
-The usual post-profiling analysis tools such as
+option. The usual post-profiling analysis tools such as
.BI opreport(1)
and
.BI opannotate(1)
-can be used to generate profile reports. The post-processing analysis tools will search for samples in
-.I /oprofile_data
-first. If that directory does not exist, the post-processing tools use the standard session-dir of /var/lib/oprofile.
+can be used to generate profile reports. Unless a
+.I session-dir
+is specified, the post-processing analysis tools will search for samples in
+/oprofile_data first. If that directory does not exist, the
+post-processing tools use the standard session-dir of /var/lib/oprofile.
.P
Statistics, such as total samples received
and lost samples, are written to the operf.log file that can be found in the
/samples directory.
.br
-.SH OPTIONS
+.SH RUN MODES
+One (and only one) of the following
+.SB run modes
+must be specified:
.TP
.BI command [args]
The command or application to be profiled.
.I args
-are the input arguments that the command or application requires. One (and only one) of either
-.I command
-,
-.I --pid
-or
-.I --system-wide
-is required.
+are the input arguments that the command or application requires.
.br
.TP
.BI "--pid / -p " PID
+.RS
This option enables operf to profile a running application.
.I PID
should be the process ID of the process you wish to profile. When
@@ -65,7 +63,13 @@ data it has collected. Use
.BI -SIGINT
.BI
for this purpose.
-.br
+.P
+.B Limitation:
+When using this option to profile a multi-threaded application that also forks
+new processes, be aware that samples for processes that are forked before profiling
+is started may not be recorded (depending on timing of thread creation and when
+operf is started).
+.RE
.TP
.BI "--system-wide / -s"
This option is for performing a system-wide profile. You must
@@ -86,30 +90,85 @@ that when running operf with this option, the user's current working
directory should be /root or a subdirectory of /root to avoid
storing sample data files in locations accessible by regular users.
.br
+.SH OTHER OPTIONS
.TP
-.BI "--vmlinux / k " vmlinux_path
+.BI "--vmlinux / -k " vmlinux_path
+.RS
A vmlinux file that matches the running kernel that has symbol and/or debuginfo.
Kernel samples will be attributed to this binary, allowing post-processing tools
(like opreport) to attribute samples to the appropriate kernel symbols.
+.P
+The kernel symbol information may be obtained from /proc/kallsyms if
+the user does not specify a vmlinux file. The symbol addresses are given
+in /proc/kallsyms if permitted by the setting of /proc/sys/kernel/kptr_restrict.
+.P
+If the
+.I --vmlinux
+option is not used and kernel symbols cannot be obtained from /proc/kallsyms,
+then all kernel samples are attributed to "no-vmlinux", which is simply
+a bucket to hold the samples and not an actual file.
+.RE
.TP
.BI "--events / -e " event1[,event2[,...]]
This option is for passing a comma-separated list of event specifications
for profiling. Each event spec is of the form:
.br
.I " name:count[:unitmask[:kernel[:user]]]"
-.br
-When specifying a unit mask value, it may be either a hexadecimal value (which
-.I must
-begin with "0x") or a string (i.e, symbolic name) which matches the first word in
-the unit mask description. Specifying a symbolic name for the unit mask is valid only
-for unit masks having "extra:" parameters, as shown by the output of
-.B ophelp.
-Unit masks with "extra:" parameters
-.I must
-be specified using the symbolic name. If no unit mask is specified, 0x0 will be
-used as the default.
.P
.RS
+The
+.I count
+value is used to control the sampling rate for profiling; it is the number
+of events to occur between samples. The rate is lowered by specifying a higher
+.I count
+value \(em i.e., a higher number of events to occur between samples.
+.P
+You can specify
+.I unitmask
+values using either a numerical value (hex values
+.I must
+begin with "0x") or a symbolic name (if the
+.I name=
+field is shown in the
+.B ophelp
+output). For some named unit masks, the hex value is not unique; thus, OProfile
+tools enforce specifying such unit mask values by name.
+If no unit mask is specified, the default unit mask value for the event is used.
+.P
+The
+.I kernel
+and
+.I user
+parts of the event specification are binary values ('1' or '0') indicating
+whether or not to collect samples for kernel space and user space.
+.br
+.B Note:
+In order to specify the
+.I kernel/user
+bits, you must also specify a
+.I unitmask
+value, even if the processor type (or the specified event) does not use unit masks \(em
+in which case, use the value '0' to signify a null unit mask; for example:
+.br
+ -e INST_RETIRED_ANY_P:100000:0:1:0
+.br
+ ^ ^ ^ ^
+ | | | |--- '0': do not record user space samples
+ | | |-- '1': record kernel space samples
+ | |-- '0': the null unit mask
+ |--count value
+.P
+Event names for some IBM PowerPC systems include a
+.I _GRP
+(group number) suffix. You can pass either the full event name or the base event name
+(i.e., without the suffix) to
+.B operf.
+If the base event name is passed,
+.B operf
+will automatically choose an appropriate group number suffix
+for the event; thus, OProfile post-processing tools will always show real event
+names that include the group number suffix.
+.P
When no event specification is given, the default event for the running
processor type will be used for profiling.
Use
@@ -126,7 +185,7 @@ full callchain is recorded, so there is no depth limit.
.BI "--separate-thread / -t"
This option categorizes samples by thread group ID (tgid) and thread ID (tid).
The '--separate-thread' option is useful for seeing per-thread samples in
-multi-threaded applications. When used in conjuction with the '--system-wide'
+multi-threaded applications. When used in conjunction with the '--system-wide'
option, the '--separate-thread' option is also useful for seeing per-process
(i.e., per-thread group) samples for the case where multiple processes are
executing the same program during a profiling run.
@@ -144,6 +203,7 @@ directory on the current path.
.br
.TP
.BI "--lazy-conversion / -l"
+.RS
Use this option to reduce the overhead of
.BI operf
during profiling. Normally, profile data received from the kernel is converted
@@ -156,7 +216,16 @@ particularly on busy multi-processor systems. The
option directs
.BI operf
to wait until profiling is completed to do the conversion of profile data.
-.br
+.P
+.B Note:
+This option is
+.B not
+recommended to be used in conjunction with the
+.I --pid
+option for profiling multi-threaded processes. Depending
+on the order of thread creation (or forking of new processes),
+you may not get any samples for the new threads/processes.
+.RE
.TP
.BI "--append / -a"
By default,
@@ -184,7 +253,7 @@ Show operf version.
.br
.TP
.BI "--help / -h"
-Show a help message.
+Display brief usage message.
.br
.TP
.BI "--usage / -u"
@@ -199,6 +268,3 @@ This man page is current for @PACKAGE@-@VERSION@.
.SH SEE ALSO
opreport(1), opannotate(1).
-
-.SH BUGS
-Some parameters are still under development.
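The operf event specification form given above, `name:count[:unitmask[:kernel[:user]]]`, can be sketched as a small Python parser. This is illustrative only and is not how operf itself parses events; the names `EventSpec`, `parse_unitmask`, `parse_flag` and `parse_event_spec` are hypothetical, and unspecified fields are returned as `None` because the man page leaves their defaults to the tool.

```python
from typing import NamedTuple, Optional, Union


class EventSpec(NamedTuple):
    """Hypothetical container for one parsed operf event specification."""
    name: str
    count: int
    unitmask: Optional[Union[int, str]] = None
    kernel: Optional[bool] = None
    user: Optional[bool] = None


def parse_unitmask(text):
    # Hex values must begin with "0x"; other digit strings are decimal;
    # anything else is kept as a symbolic unit mask name.
    if text.lower().startswith("0x"):
        return int(text, 16)
    if text.isdigit():
        return int(text)
    return text


def parse_flag(text, label):
    # The kernel and user fields are binary values, '1' or '0'.
    if text not in ("0", "1"):
        raise ValueError(f"{label} field must be '0' or '1', got {text!r}")
    return text == "1"


def parse_event_spec(spec):
    fields = spec.split(":")
    if not 2 <= len(fields) <= 5 or not fields[0]:
        raise ValueError(f"malformed event spec: {spec!r}")
    name, count = fields[0], int(fields[1])
    # Positional format: kernel/user can only follow a unit mask field.
    unitmask = parse_unitmask(fields[2]) if len(fields) > 2 else None
    kernel = parse_flag(fields[3], "kernel") if len(fields) > 3 else None
    user = parse_flag(fields[4], "user") if len(fields) > 4 else None
    return EventSpec(name, count, unitmask, kernel, user)


print(parse_event_spec("INST_RETIRED_ANY_P:100000:0:1:0"))
```

Because the fields are positional, the sketch reflects the note in the man page: the kernel/user bits cannot be given without also giving a unit mask, with '0' standing for the null unit mask.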
diff --git a/doc/opgprof.1.in b/doc/opgprof.1.in
index 3e61ba9..5679289 100644
--- a/doc/opgprof.1.in
+++ b/doc/opgprof.1.in
@@ -30,7 +30,14 @@ Give verbose debugging output.
.br
.TP
.BI "--session-dir="dir_path
-Use sample database out of directory dir_path instead of the default location (/var/lib/oprofile).
+Use sample database from the specified directory
+.I dir_path
+instead of the default location. If
+.I --session-dir
+is not specified, then
+.B opgprof
+will search for samples in /oprofile_data
+first. If that directory does not exist, the standard session-dir of /var/lib/oprofile is used.
.br
.TP
.BI "--image-path / -p [paths]"
@@ -51,11 +58,11 @@ of total samples.
Output to the given file instead of the default, gmon.out
.SH ENVIRONMENT
-No special environment variables are recognised by opgprof.
+No special environment variables are recognized by opgprof.
.SH FILES
.TP
-.I /var/lib/oprofile/samples/
+.I /samples
The location of the generated sample files.
.SH VERSION
diff --git a/doc/ophelp.1.in b/doc/ophelp.1.in
index 3548d74..3d4a7de 100644
--- a/doc/ophelp.1.in
+++ b/doc/ophelp.1.in
@@ -27,6 +27,28 @@ Show the events for the given numerical CPU type.
Show the symbolic CPU name.
.br
.TP
+.BI "--get-default-event / -d"
+.br
+Show the default event for the specified CPU type.
+.TP
+.BI "--check-events / -e [events]"
+Check the given space-separated event descriptions for validity.
+If the events are valid, show which pmu counter each event would be assigned to.
+.br
+.TP
+.BI "--callgraph [callgraph_depth]"
+Use the callgraph depth to compute the higher minimum sampling intervals
+for the events.
+.br
+.TP
+.BI "--unit-mask / -u [event]"
+Show the default unit mask for the given event.
+.br
+.TP
+.BI "--extra-mask / -E [event]"
+Show the extra unit mask for the given event.
+.br
+.TP
.BI "--xml / -X"
List events in XML format.
.br
@@ -45,9 +67,6 @@ No special environment variables are recognised by ophelp.
.TP
.I $prefix/share/oprofile/
Event description files used by OProfile.
-.TP
-.I /var/lib/oprofile/samples/
-The location of the generated sample files.
.SH VERSION
.TP
diff --git a/doc/ophelp.xsd b/doc/ophelp.xsd
new file mode 100644
index 0000000..1270121
--- /dev/null
+++ b/doc/ophelp.xsd
@@ -0,0 +1,58 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/doc/opimport.1.in b/doc/opimport.1.in
index ef8ef5c..5bb59b8 100644
--- a/doc/opimport.1.in
+++ b/doc/opimport.1.in
@@ -43,11 +43,11 @@ Give verbose debugging output.
Show version.
.SH ENVIRONMENT
-No special environment variables are recognised by opimport
+No special environment variables are recognized by opimport.
.SH FILES
.TP
-.I /var/lib/oprofile/abi
+.I /abi
The abi file description of the sample database files
.SH VERSION
diff --git a/doc/opreport.1.in b/doc/opreport.1.in
index 1742886..0627aa9 100644
--- a/doc/opreport.1.in
+++ b/doc/opreport.1.in
@@ -77,7 +77,7 @@ Output full paths instead of basenames.
Merge any profiles separated in a --separate session.
.br
.TP
-.BI "--no-header"
+.BI "--no-header / -n"
Don't output a header detailing profiling parameters.
.br
.TP
@@ -92,12 +92,11 @@ Reverse the sort from the default.
.BI "--session-dir="dir_path
Use sample database from the specified directory
.I dir_path
-instead of the default locations. If
+instead of the default location. If
.I --session-dir
is not specified, then
.B opreport
-will search for samples in
-.I /oprofile_data
+will search for samples in /oprofile_data
first. If that directory does not exist, the standard session-dir of /var/lib/oprofile is used.
.br
.TP
@@ -114,10 +113,25 @@ binary image filename.
.BI "--symbols / -l"
List per-symbol information instead of a binary image summary.
.br
+Usually, the total of all per-symbol samples for a given binary image
+equals the summary count for the binary image (shown by running
+.B opreport
+with no options).
+However, it's possible for some sample addresses to fall outside the range of
+any symbols for a given binary image. In such cases, the total number of
+per-symbol samples for the binary image may be less than the summary count
+for the image. Running
+.B opreport
+with the
+.I --verbose=debug
+option will display an informational message when this difference is detected.
+This difference is typically very small and can be ignored.
+.br
.TP
.BI "--threshold / -t [percentage]"
Only output data for symbols that have more than the given percentage
-of total samples.
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the symbol is shown.
.br
.TP
.BI "--verbose / -V [options]"
@@ -135,15 +149,9 @@ Generate XML output.
No special environment variables are recognized by opreport.
.SH FILES
-.I /oprofile_data/samples
-.RS 7
-Or
-.RE
-.I /var/lib/oprofile/samples/
-.LP
-.RS 7
+.TP
+.I /samples
The location of the generated sample files.
-.RE
.SH VERSION
.TP
diff --git a/doc/opreport.xsd b/doc/opreport.xsd
index 682a0bf..28e3128 100644
--- a/doc/opreport.xsd
+++ b/doc/opreport.xsd
@@ -110,7 +110,7 @@
-
+
@@ -121,7 +121,7 @@
-
+
@@ -131,10 +131,13 @@
-
+
+
-
+
@@ -144,7 +147,7 @@
-
+
@@ -203,7 +206,7 @@
-
+
diff --git a/doc/oprofile.1 b/doc/oprofile.1
index fc4fcd8..efac053 100644
--- a/doc/oprofile.1
+++ b/doc/oprofile.1
@@ -1,14 +1,9 @@
-.TH OPROFILE 1 "Mon 27 August 2012" "oprofile 0.9.8"
+.TH OPROFILE 1 "Fri 12 September 2014" "oprofile 1.0.0"
.UC 4
.SH NAME
oprofile \- a system-wide profiler
.SH SYNOPSIS
.br
-.B opcontrol
-[
-.I options
-]
-.br
.B opreport
[
.I options
@@ -45,9 +40,6 @@ For a gentle guide to using OProfile, please read the HTML documentation
listed in SEE ALSO.
.br
.SH OPCONTROL
-.B opcontrol
-is used for starting and stopping the OProfile daemon, and providing set-up
-parameters.
.SH OPREPORT
.B opreport
gives image and symbol-based profile summaries for the whole system or
@@ -143,41 +135,39 @@ tgid: to restrict the results to particular threads within a process.
This is only useful when using per-process profile separation.
.SH ENVIRONMENT
-No special environment variables are recognised by oprofile.
+No special environment variables are recognized by oprofile.
.SH FILES
.TP
-.I $HOME/.oprofile/
-Configuration files
+.I /usr/local/share/doc/oprofile/oprofile.html
+OProfile user guide.
+.TP
+.I /usr/local/share/doc/oprofile/opreport.xsd
+Schema file for opreport XML output.
.TP
-.I /root/.oprofile/daemonrc
-Configuration file for opcontrol
+.I /usr/local/share/doc/oprofile/ophelp.xsd
+Schema file for ophelp XML output.
.TP
.I /usr/local/share/oprofile/
Event description files used by OProfile.
.TP
-.I /var/lib/oprofile/samples/oprofiled.log
-The user-space daemon logfile.
-.TP
-.I /dev/oprofile
-The device filesystem for communication with the Linux kernel module.
+.I /samples/operf.log
+The profiler log file.
.TP
-.I /var/lib/oprofile/samples/
+.I /samples/current
The location of the generated sample files.
.SH VERSION
.TP
-This man page is current for oprofile-0.9.8.
+This man page is current for oprofile-1.0.0.
.SH SEE ALSO
.BR /usr/local/share/doc/oprofile/,
-.BR opcontrol(1),
.BR opreport(1),
.BR opannotate(1),
.BR oparchive(1),
.BR opgprof(1),
.BR gprof(1),
-.BR readprofile(1),
.BR "CPU vendor architecture manuals"
.SH COPYRIGHT
diff --git a/doc/oprofile.1.in b/doc/oprofile.1.in
index 0f2cb83..550954a 100644
--- a/doc/oprofile.1.in
+++ b/doc/oprofile.1.in
@@ -4,11 +4,6 @@
oprofile \- a system-wide profiler
.SH SYNOPSIS
.br
-.B opcontrol
-[
-.I options
-]
-.br
.B opreport
[
.I options
@@ -45,9 +40,6 @@ For a gentle guide to using OProfile, please read the HTML documentation
listed in SEE ALSO.
.br
.SH OPCONTROL
-.B opcontrol
-is used for starting and stopping the OProfile daemon, and providing set-up
-parameters.
.SH OPREPORT
.B opreport
gives image and symbol-based profile summaries for the whole system or
@@ -143,26 +135,26 @@ tgid: to restrict the results to particular threads within a process.
This is only useful when using per-process profile separation.
.SH ENVIRONMENT
-No special environment variables are recognised by oprofile.
+No special environment variables are recognized by oprofile.
.SH FILES
.TP
-.I $HOME/.oprofile/
-Configuration files
+.I @prefix@/share/doc/oprofile/oprofile.html
+OProfile user guide.
+.TP
+.I @prefix@/share/doc/oprofile/opreport.xsd
+Schema file for opreport XML output.
.TP
-.I /root/.oprofile/daemonrc
-Configuration file for opcontrol
+.I @prefix@/share/doc/oprofile/ophelp.xsd
+Schema file for ophelp XML output.
.TP
.I @prefix@/share/oprofile/
Event description files used by OProfile.
.TP
-.I /var/lib/oprofile/samples/oprofiled.log
-The user-space daemon logfile.
-.TP
-.I /dev/oprofile
-The device filesystem for communication with the Linux kernel module.
+.I /samples/operf.log
+The profiler log file.
.TP
-.I /var/lib/oprofile/samples/
+.I /samples/current
The location of the generated sample files.
.SH VERSION
@@ -171,13 +163,11 @@ This man page is current for @PACKAGE@-@VERSION@.
.SH SEE ALSO
.BR @OP_DOCDIR@,
-.BR opcontrol(1),
.BR opreport(1),
.BR opannotate(1),
.BR oparchive(1),
.BR opgprof(1),
.BR gprof(1),
-.BR readprofile(1),
.BR "CPU vendor architecture manuals"
.SH COPYRIGHT
diff --git a/doc/oprofile.html b/doc/oprofile.html
index 4e1a04e..3b00224 100644
--- a/doc/oprofile.html
+++ b/doc/oprofile.html
@@ -47,51 +47,56 @@
-This manual applies to OProfile version 0.9.8.
-OProfile is a profiling system for Linux 2.6 and higher systems on a number of architectures. It is capable of profiling
-all parts of a running system, from the kernel (including modules and interrupt handlers) to shared libraries
-to binaries. OProfile can profile the whole system in the background, collecting information at a low overhead. These
-features make it ideal for profiling entire systems to determine bottle necks in real-world systems.
+This manual applies to OProfile version 1.0.0.
+OProfile is a set of performance monitoring tools for Linux 2.6 and higher systems, available on a number of architectures.
+OProfile provides the following features:
+
+
+
+
Profiler
+
Post-processing tools for analyzing profile data
+
Event counter
+
+
+
+
+
+OProfile is capable of monitoring native hardware events occurring in all parts of a running system, from the kernel
+(including modules and interrupt handlers) to shared libraries
+to binaries. OProfile can collect event information for the whole system in the background with very little overhead. These
+features make it ideal for monitoring entire systems to determine bottlenecks in real-world systems.
Many CPUs provide "performance counters", hardware registers that can count "events"; for example,
-cache misses, or CPU cycles. OProfile provides profiles of code based on the number of these occurring events:
+cache misses, or CPU cycles. OProfile can collect profiles of code based on the number of these occurring events:
repeatedly, every time a certain (configurable) number of events has occurred, the PC value is recorded.
-This information is aggregated into profiles for each binary image.
-
-Some hardware setups do not allow OProfile to use performance counters: in these cases, no
-events are available so OProfile operates in timer mode, as described in later chapters. Timer
-mode is only available in "legacy mode" (see Section 1, “OProfile legacy mode”).
-
-
1. OProfile legacy mode
-"Legacy" OProfile consists of the opcontrol shell script, the oprofiled daemon, and several post-processing tools (e.g.,
-opreport). The opcontrol script is used for configuring, starting, and stopping a profiling session. An OProfile
-kernel driver (usually built as a kernel module) is used for collecting samples, which are then recorded into sample files by
-oprofiled. Using OProfile in "legacy mode" requires root user authority since the profiling is done on a system-wide basis, which may
-(if misused) cause adverse effects to the system.
-
Note
-Profiling setup parameters that you specify using opcontrol are cached in /root/.oprofile/daemonrc.
-Subsequent runs of opcontrol --start will continue to use these cached values until you
-override them with new values.
-
-
2. OProfile perf_events mode
-As of release 0.9.8, OProfile now includes the ability to profile a single process versus the system-wide technique
-of legacy OProfile. With this new technique, the operf program is used to control profiling instead of the
-opcontrol script and oprofiled daemon of leagacy mode. Also, operf does not require the
-special OProfile kernel driver that legacy mode does; instead, it interfaces with the kernel to collect samples via the Linux Kernel
-Performance Events Subsystem (hereafter referred to as "perf_events"). Using operf to profile a single
-process can be done as a normal user; however, root authority is required to run operf in system-wide
-profiling mode.
-
Note 1
-The same OProfile post-processing tools are used whether you collect your profile with operf or opcontrol.
-
Note 2
+This information is aggregated into profiles for each binary image. Alternatively, OProfile's event counting
+tool can collect simple raw event counts.
+
1. OProfile legacy profiling mode
+Prior to release 1.0, OProfile included a profiling tool consisting of the opcontrol shell script, the oprofiled daemon,
+and the attendant oprofile kernel driver. This "legacy profiler" was deprecated in release 0.9.8 with the introduction of
+the operf profiling tool (see Section 2, “OProfile perf_events profiling mode”). Some older architectures/platforms
+do not support the use of operf. For those cases, oprofile users should install release 0.9.9, which is the
+last release to include the legacy profiler.
+
+
+
+
+
+
2. OProfile perf_events profiling mode
+
+
+
+
+OProfile has the ability to profile a single process or every currently running process (i.e., system-wide)
+via the operf program. operf interfaces with the
+kernel to collect samples via the Linux Kernel Performance Events Subsystem (hereafter
+referred to as "perf_events"). OProfile can co-exist with other tools on your system that
+may also be using the perf_events kernel subsystem.
+
+
+Using operf to profile a single
+process can be done as a normal user; however, root authority is required to run
+operf in system-wide profiling mode.
+
+
Note
Some older processor models are not supported by the underlying perf_events kernel and, thus, are not supported by operf.
If you receive the message
Your kernel's Performance Events Subsystem does not support your processor type
-when attempting to use operf, try profiling with opcontrol
+when attempting to use operf, install OProfile 0.9.9 and try profiling with opcontrol
to see if your processor type may be supported by OProfile's legacy mode.
-
-
+
+
+
+
+
3. OProfile event counting mode
+OProfile provides the ocount tool for
+collecting raw event counts on a per-application, per-process, per-cpu, or system-wide basis. Unlike the
+profiling tools, post-processing of the data collected is not necessary -- the data is displayed in the
+output of ocount. A common use case for event counting tools is when performance analysts
+want to determine the CPI (cycles per instruction) for an application. High CPI implies possible stalls,
+and many architectures provide events that give detailed information about the different types of stalls.
+The events provided are architecture-specific, so we refer the reader to the hardware manuals available for
+the processor type being used.
+
+
-
3. Applications of OProfile
+
4. Applications of OProfile
@@ -608,7 +611,7 @@ OProfile is useful in a number of situations. You might want to use OProfile whe
need to profile an application and its shared libraries
need to capture the performance behaviour of entire system
@@ -649,66 +652,41 @@ OProfile is not a panacea. OProfile might not be a complete solution when you :
-
-
-
-
-
3.1. Support for dynamically compiled (JIT) code
-
-
-
-
-Older versions of OProfile were not capable of attributing samples to symbols from dynamically
-compiled code, i.e. "just-in-time (JIT) code". Typical JIT compilers load the JIT code into
-anonymous memory regions. OProfile reported the samples from such code, but the attribution
-provided was simply:
-
-
-
-
-
anon: <tgid><address range>
-
-
-
-
-Due to this limitation, it wasn't possible to profile applications executed by virtual machines (VMs)
-like the Java Virtual Machine. OProfile now contains an infrastructure to support JITed code.
+
4.1. Support for dynamically compiled (JIT) code
+OProfile provides a framework to support JITed code ("just-in-time (JIT) compiled code").
A development library is provided to allow developers
-to add support for any VM that produces dynamically compiled code (see the OProfile JIT agent
+to add support for any VM (virtual machine) that produces dynamically compiled code (see the OProfile JIT agent
developer guide).
-In addition, built-in support is included for the following:
+In addition, built-in support is included for the following:
JVMTI agent library for Java (1.5 and higher)
JVMPI agent library for Java (1.5 and lower)
+These libraries make it possible for OProfile to attribute profile samples
+to Java methods. Without a VM-specific agent library, OProfile will typically report
+samples from JITed code similar to the following example:
+
OProfile currently does not support event-based profiling (i.e, using hardware events like cache misses,
-branch mispredicts) on virtual machine guests running under systems such as VMware. The list of
-supported events displayed by ophelp or 'opcontrol --list-events' is based on CPU type and does
+branch mispredicts) on virtual machine guests running under systems such as VMware.
+(Note: KVM guests are supported.) The list of
+supported events displayed by ophelp is based on CPU type and does
not take into account whether the running system is a guest system or real system. To use
-OProfile on such guest systems, you can use timer mode (see Section 6.2, “OProfile in timer interrupt mode”).
+OProfile on such guest systems, you must use the legacy profiler's timer mode (see Section 3.2, “OProfile timer interrupt mode”).
-
+
-
4. System requirements
+
5. System requirements
@@ -719,50 +697,15 @@ OProfile on such guest systems, you can use timer mode (see
@@ -889,11 +822,11 @@ OProfile on such guest systems, you can use timer mode (see
-
+
-
6. Installation
+
7. Installation
@@ -911,7 +844,7 @@ is often all you need, but note these arguments to
Use this option if you need to profile Java applications. Also, see
- Section 4, “System requirements”, "Required user account". This option
+ Section 5, “System requirements”, "Required user account". This option
is used to specify the location of the Java Development Kit (JDK)
source tree you wish to use. This is necessary to get the interface description
of the JVMPI (or JVMTI) interface to compile the JIT support code successfully.
@@ -1022,18 +955,13 @@ it's not sufficient to enable the local APIC -- you must also turn it on explici
time by providing the "lapic" option to the kernel.
If you use the NMI watchdog, be aware that the watchdog is disabled when profiling starts
and not re-enabled until the profiling is stopped.
-
-
-Please note that you must save or have available the vmlinux file
-generated during a kernel compile, as OProfile needs it (you can use
---no-vmlinux, but this will prevent kernel profiling).
-
+
-
7. Uninstalling OProfile
+
8. Uninstalling OProfile
@@ -1063,12 +991,17 @@ remove all installed files except your configuration file in the directory
@@ -1082,13 +1015,8 @@ remove all installed files except your configuration file in the directory
-Profiling with operf is the recommended profiling mode with OProfile. Using
-this mode not only allows you to target your profiling more precisely (i.e., single process
-or system-wide), it also allows OProfile to co-exist better with other tools on your system that
-may also be using the perf_events kernel subsystem.
-
-
-With operf, there is no initial setup needed -- simply invoke operf with
+Profiling with operf allows you to precisely target your profiling (i.e., single process
+or system-wide). With operf, there is no initial setup needed -- simply invoke operf with
the options you need; then run the OProfile post-processing tool(s). The operf syntax
is as follows:
@@ -1117,205 +1045,347 @@ and opreport and other post-proces
unless you pass the --session-dir option.
-
+
-
2. Getting started with OProfile using legacy mode
+
2. Getting started with OProfile using ocount
-Before you can use OProfile's legacy mode, you must set it up. The minimum setup required for this
-is to tell OProfile where the vmlinux file corresponding to the
-running kernel is, for example :
+ocount is an OProfile tool that can be used to count native hardware events occurring in either
+a specific application, a set of processes or threads, a set of active system processors, or the
+entire system. The data collected during a counting session is displayed to stdout by default, but may
+also be saved to a file. The ocount syntax is as follows:
-
-
-
-
opcontrol --vmlinux=/boot/vmlinux-`uname -r`
-
-
-
-If you don't want to profile the kernel itself,
-you can tell OProfile you don't have a vmlinux file :
-Note that unlike gprof, no instrumentation (-pg
-and -a options to gcc)
-is necessary.
-Periodically (or on opcontrol --shutdown or opcontrol --dump)
-the profile data is written out into the $SESSION_DIR/samples directory (by default at /var/lib/oprofile/samples).
-These profile files cover shared libraries, applications, the kernel (vmlinux), and kernel modules.
-You can clear the profile data (at any time) with opcontrol --reset.
+When my_test_program completes (or when you press Ctrl-C), counting
+stops and the results are displayed to the screen (as shown below).
-To place these sample database files in a specific directory instead of the default location
-(/var/lib/oprofile) use the --session-dir=dir option.
-You must also specify the --session-dir to tell the tools to continue using this directory.
-
-You can get summaries of this data in a number of ways at any time. To get a summary of
-data across the entire system for all of these profiles, you can do :
-
opreport [--session-dir=dir]
+
+Events were actively counted for 2.8 seconds.
+Event counts (actual) for /home/user1/my_test_program:
+ Event Count % time counted
+ CPU_CLK_UNHALTED 9,408,018,070 100.00
+ INST_RETIRED 16,719,918,108 100.00
+
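As a worked illustration of the CPI use case described earlier, the sample output above can be reduced to a single cycles-per-instruction figure. This is a minimal sketch, not part of OProfile: the helper name `parse_ocount_counts` is hypothetical, and it assumes data rows have exactly three whitespace-separated fields, as in the sample shown.

```python
SAMPLE_OUTPUT = """\
Events were actively counted for 2.8 seconds.
Event counts (actual) for /home/user1/my_test_program:
 Event Count % time counted
 CPU_CLK_UNHALTED 9,408,018,070 100.00
 INST_RETIRED 16,719,918,108 100.00
"""


def parse_ocount_counts(text):
    """Return {event_name: count} for rows shaped like the sample above.

    Hypothetical helper: assumes each data row is
    '<event> <count with thousands separators> <percent>'.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # banner and header lines have a different field count
        name, count, _percent = fields
        try:
            counts[name] = int(count.replace(",", ""))
        except ValueError:
            continue  # three fields, but the middle one is not a number
    return counts


counts = parse_ocount_counts(SAMPLE_OUTPUT)
# CPI = cycles / instructions retired
cpi = counts["CPU_CLK_UNHALTED"] / counts["INST_RETIRED"]
print(f"CPI = {cpi:.3f}")
```

For the counts shown, this gives a CPI of roughly 0.56, i.e. the program retired a little under two instructions per cycle on average.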
-Or to get a more detailed summary, for a particular image, you can do something like :
-
-
-
-
opreport -l /boot/vmlinux-`uname -r`
-
-
-
-There are also a number of other ways of presenting the data, as described later in this manual.
-Note that OProfile will choose a default profiling setup for you. However, there are a number
-of options you can pass to opcontrol if you need to change something,
-also detailed later.
-
+
-
3. Tools summary
+
3. Specifying performance counter events
-This section gives a brief description of the available OProfile utilities and their purpose.
+Whether profiling with operf or doing simple event counting with ocount,
+you can collect information about one or more native hardware events using the --events
+option, which takes a comma-separated list of event specifications. An event specification describes
+how each hardware performance counter should be set up.
+For profiling, the event specification is a colon-separated string of the form
+name:count:unitmask:kernel:user
+as described in the table below. For ocount, the specification is of the form
+name:unitmask:kernel:user.
+Note the presence of the count field for profiling. The count field tells the profiler
+how many events should occur between successive profile snapshots (each usually referred to as a "sample"). Since
+ocount does not do sampling, the count field is not needed.
-
-
-
-
- ophelp
-
-
-
-
- This utility lists the available events and short descriptions.
-
-
-
-
- operf
-
-
-
-
- This is the recommended program for collecting profile data.
-
- This utility can be used to produce annotated source, assembly or mixed source/assembly.
- Source level annotation is available only if the application was compiled with
- debugging symbols. See Section 3, “Outputting annotated source (opannotate)”.
-
+If no event specs are passed to operf or ocount,
+the default event will be used.
+
+
+
+
Note
The perf_events kernel subsystem allocates hardware counters as necessary, but some processor
+types have restrictions as to what hardware events may be counted simultaneously.
+The kernel employs a multiplexing technique when such
+hardware restrictions are encountered, such that events are monitored on a rotating basis.
+
+
+
+
+
+
+
+
+
+
+
+
+ name      The symbolic event name, e.g. CPU_CLK_UNHALTED
+ count     The counter reset value, e.g. 100000; use only for profiling
+ unitmask  The unit mask, as given in the events list: e.g. 0x0f; or a symbolic name if a name=<um_name> field is present
+ kernel    Enable profiling of kernel code
+ user      Enable profiling of userspace code
+
+
+
+
+
+The last three values are optional; if you omit them (e.g. operf --events=DATA_MEM_REFS:30000),
+they will be set to the default values (i.e., the default unit mask value for the given event, and profiling (or counting)
+both kernel and userspace code will be enabled). Note that on some architectures, some events may
+require a unit mask be specified.
+
+
+You can specify unit mask values using either a numerical value (hex values
+must begin with "0x") or a symbolic name (if the name=<um_name>
+field is shown in the ophelp output). For some named unit masks, the hex value is not unique; for these, OProfile
+tools require the unit mask to be specified by name.
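The field layout and defaults described above can be sketched as a small parser. This is illustrative only and is not OProfile's own parsing code: the function name `parse_event_spec` is hypothetical, and treating the kernel and user fields as 0/1 flags is an assumption inferred from the default event strings (e.g. CPU_CLK_UNHALTED:100000:0:1:1).

```python
def parse_event_spec(spec, profiling=True):
    """Split one event specification into named fields (illustrative only).

    profiling=True  -> name:count:unitmask:kernel:user  (operf form)
    profiling=False -> name:unitmask:kernel:user        (ocount form)
    """
    if profiling:
        names = ["name", "count", "unitmask", "kernel", "user"]
    else:
        names = ["name", "unitmask", "kernel", "user"]

    fields = spec.split(":")
    if len(fields) > len(names):
        raise ValueError(f"too many fields in {spec!r}")

    raw = dict(zip(names, fields))  # omitted trailing fields are simply absent
    if not raw.get("name"):
        raise ValueError("event name is required")

    event = {"name": raw["name"]}
    if profiling:
        if not raw.get("count"):
            raise ValueError("a profiling spec needs a count field")
        event["count"] = int(raw["count"])

    # None stands for "use the event's default unit mask"; a given mask is
    # kept as a string because it may be numeric (0x0f) or a symbolic name.
    event["unitmask"] = raw.get("unitmask") or None
    # Assumption: kernel/user are 0/1 flags, and both default to enabled.
    event["kernel"] = raw.get("kernel", "1") != "0"
    event["user"] = raw.get("user", "1") != "0"
    return event


print(parse_event_spec("DATA_MEM_REFS:30000"))
print(parse_event_spec("CPU_CLK_UNHALTED:0:0:1", profiling=False))
```

A full --events value would be handled by splitting on commas first and passing each piece to the function. Whether a given event actually requires an explicit unit mask depends on the architecture, which this sketch does not check.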
+
+
+The table below lists the default profiling event for various processor types. The same events
+can be used for ocount, minus the count field.
+
+
+
+
+
+
+
+
+
+
+
+Processor            cpu_type                       Default event
+Alpha EV67           alpha/ev67                     CYCLES:100000:0:1:1
+ARM/XScale PMU1      arm/xscale1                    CPU_CYCLES:100000:0:1:1
+ARM/XScale PMU2      arm/xscale2                    CPU_CYCLES:100000:0:1:1
+ARM/MPCore           arm/mpcore                     CPU_CYCLES:100000:0:1:1
+Athlon               i386/athlon                    CPU_CLK_UNHALTED:100000:0:1:1
+Pentium Pro          i386/ppro                      CPU_CLK_UNHALTED:100000:0:1:1
+Pentium II           i386/pii                       CPU_CLK_UNHALTED:100000:0:1:1
+Pentium III          i386/piii                      CPU_CLK_UNHALTED:100000:0:1:1
+Pentium M (P6 core)  i386/p6_mobile                 CPU_CLK_UNHALTED:100000:0:1:1
+Pentium 4 (non-HT)   i386/p4                        GLOBAL_POWER_EVENTS:100000:1:1:1
+Pentium 4 (HT)       i386/p4-ht                     GLOBAL_POWER_EVENTS:100000:1:1:1
+Hammer               x86-64/hammer                  CPU_CLK_UNHALTED:100000:0:1:1
+Family10h            x86-64/family10                CPU_CLK_UNHALTED:100000:0:1:1
+Family11h            x86-64/family11h               CPU_CLK_UNHALTED:100000:0:1:1
+IBM pseries          ppc64/power{ 4|5|6|7|8|970 }   CYCLES:100000:0:1:1
+IBM s390             s390/{ z10|z196|zEC12 }        HWSAMPLING:4127518:0:1:1
+
+
+
+
+
+
+
+
+
+
4. Tools summary
+
+
+
+
+This section gives a brief description of the available OProfile utilities and their purpose.
+
+
+
+
+
+ ophelp
+
+
+
+
+ This utility lists the available events and short descriptions.
+
+ This utility can be used to produce annotated source, assembly or mixed source/assembly.
+ Source level annotation is available only if the application was compiled with
+ debugging symbols. See Section 3, “Outputting annotated source (opannotate)”.
+
@@ -1336,7 +1406,7 @@ This section gives a brief description of the available OProfile utilities and t
This utility converts sample database files from a foreign binary format (abi) to
- the native format. This is useful only when moving sample files between hosts,
+ the native format. This is useful only when moving sample files between hosts
for analysis on platforms other than the one used for collection.
See Section 7, “Converting sample database files (opimport)”.
@@ -1349,7 +1419,7 @@ This section gives a brief description of the available OProfile utilities and t
-
Chapter 3. Controlling the profiler
+
Chapter 3. Controlling the profiler
@@ -1365,92 +1435,38 @@ This section gives a brief description of the available OProfile utilities and t
@@ -1483,7 +1499,7 @@ Additionally, each counter is programmed with a "count" value, which corresponds
detailed the profile is. The lower the value, the more frequently profile
samples are taken. You can choose to sample only kernel code, user-space code,
or both (both is the default). Finally, some events have a "unit mask"
--- this is a value that further restricts the types of event that are counted.
+-- this is a value that further restricts the type of event being counted.
You can see the event types and unit masks for your CPU using ophelp.
More information on event specification can be found at Section 3, “Specifying performance counter events”.
@@ -1510,12 +1526,12 @@ Following is a description of the operf
- command
+ command [args]
- The command or application to be profiled. args are the input arguments
+ The command or application to be profiled. The [args] are the input arguments
that the command or application requires. Either command, --pid or
--system-wide is required, but cannot be used simultaneously.
@@ -1561,8 +1577,11 @@ Following is a description of the operf
A vmlinux file that matches the running kernel that has symbol and/or debuginfo.
Kernel samples will be attributed to this binary, allowing post-processing tools
(like opreport) to attribute samples to the appropriate kernel symbols.
- If this option is not specified, all kernel samples will be attributed to a pseudo
- binary named "no-vmlinux".
+ If this option is not specified, the file /proc/kallsyms is used to obtain
+ kernel symbol addresses corresponding to sample addresses. However, the setting of
+ /proc/sys/kernel/kptr_restrict may restrict a non-root user's access to
+ /proc/kallsyms, in which case,
+ all kernel samples are attributed to a pseudo binary named "no-vmlinux".
@@ -1638,809 +1657,87 @@ Following is a description of the operf
The --separate-thread option is useful for seeing per-thread samples in
multi-threaded applications. When used in conjuction with the --system-wide
option, --separate-thread is also useful for seeing per-process
- (i.e., per-thread group) samples for the case where multiple processes are
- executing the same program during a profiling run.
-
-
-
-
- --separate-cpu / -c
-
-
-
-
- This option categorizes samples by cpu.
-
-
-
-
- --session-dir / -d [path]
-
-
-
-
- This option specifies the session directory to hold the sample data. If not specified,
- the data is saved in the oprofile_data directory on the current path.
-
-
-
-
- ---lazy-conversion / -l
-
-
-
-
- Use this option to reduce the overhead of operf during profiling.
- Normally, profile data received from the kernel is converted to OProfile format
- during profiling time. This is typically not an issue when profiling a single
- application. But when using the --system-wide option, this on-the-fly
- conversion process can cause noticeable overhead, particularly on busy
- multi-processor systems. The --lazy-conversion option directs
- operf to wait until profiling is completed to do the conversion
- of profile data.
-
-
-
-
- --verbose / -V [level]
-
-
-
-
- A comma-separated list of debugging control values used to increase the verbosity of the
- output. Valid values are: debug, record, convert, misc, sfile, arcs, and the special value, 'all'.
-
-
-
-
- --version -v
-
-
-
-
- Show operf version.
-
-
-
-
- --help / -h
-
-
-
-
- Show a help message.
-
-
-
-
-
-
-
-
-
-
2. Using opcontrol
-
-
-
-
-In this section we describe the configuration and control of the profiling system
-with opcontrol in more depth. See Section 1, “Using operf” for a description
-of the preferred profiling method.
-
-
-The opcontrol script has a default setup, but you
-can alter this with the options given below. In particular, you can select
-specific hardware events on which to base your profile. See Section 1, “Using operf” for an
-introduction to hardware events and performance counter configuration.
-The event types and unit masks for your CPU are listed by opcontrol
---list-events or ophelp.
-
-
-The opcontrol script provides the following actions :
-
-
-
-
-
- --init
-
-
-
-
- Loads the OProfile module if required and makes the OProfile driver
- interface available.
-
-
-
-
- --setup
-
-
-
-
- Followed by list arguments for profiling set up. List of arguments
- saved in /root/.oprofile/daemonrc.
- Giving this option is not necessary; you can just directly pass one
- of the setup options, e.g. opcontrol --no-vmlinux.
-
-
-
-
- --status
-
-
-
-
- Show configuration information.
-
-
-
-
- --start-daemon
-
-
-
-
- Start the oprofile daemon without starting actual profiling. The profiling
- can then be started using --start. This is useful for avoiding
- measuring the cost of daemon startup, as --start is a simple
- write to a file in oprofilefs.
-
-
-
-
- --start
-
-
-
-
- Start data collection with either arguments provided by --setup
- or information saved in /root/.oprofile/daemonrc. Specifying
- the addition --verbose makes the daemon generate lots of debug data
- whilst it is running.
-
-
-
-
- --dump
-
-
-
-
- Force a flush of the collected profiling data to the daemon.
-
-
-
-
- --stop
-
-
-
-
- Stop data collection.
-
-
-
-
- --shutdown
-
-
-
-
- Stop data collection and kill the daemon.
-
-
-
-
- --reset
-
-
-
-
- Clears out data from current session, but leaves saved sessions.
-
-
-
- --save=session_name
-
-
-
- Save data from current session to session_name.
-
-
-
-
- --deinit
-
-
-
-
- Shuts down daemon. Unload the OProfile module and oprofilefs.
-
-
-
-
- --list-events
-
-
-
-
- List event types and unit masks.
-
-
-
-
- --help
-
-
-
-
- Generate usage messages.
-
-
-
-
-
-There are a number of possible settings, of which, only
---vmlinux (or --no-vmlinux)
-is required. These settings are stored in ~/.oprofile/daemonrc.
-
-
-
-
- --buffer-size=num
-
-
-
- Number of samples in kernel buffer.
- Buffer watershed needs to be tweaked when changing this value.
-
-
-
- --buffer-watershed=num
-
-
-
- Set kernel buffer watershed to num samples. When remain only
- buffer-size - buffer-watershed free entries remain in the kernel buffer, data will be
- flushed to the daemon. Most useful values are in the range [0.25 - 0.5] * buffer-size.
-
-
-
- --cpu-buffer-size=num
-
-
-
- Number of samples in kernel per-cpu buffer. If you
- profile at high rate, it can help to increase this if the log
- file show excessive count of samples lost due to cpu buffer overflow.
-
- Create/use sample database out of directory dir_path instead of
- the default location (/var/lib/oprofile).
-
-
-
- --separate=[none,lib,kernel,thread,cpu,all]
-
-
-
- By default, every profile is stored in a single file. Thus, for example,
- samples in the C library are all accredited to the /lib/libc.o
- profile. However, you choose to create separate sample files by specifying
- one of the below options.
-
-
-
-
-
-
-
-
-
-
- none
-
-
No profile separation (default)
-
-
-
- lib
-
-
Create per-application profiles for libraries
-
-
-
- kernel
-
-
Create per-application profiles for the kernel and kernel modules
-
-
-
- thread
-
-
Create profiles for each thread and each task
-
-
-
- cpu
-
-
Create profiles for each CPU
-
-
-
- all
-
-
All of the above options
-
-
-
-
-
- Note that --separate=kernel also turns on --separate=lib.
-
- When using --separate=kernel, samples in hardware interrupts, soft-irqs, or other
- asynchronous kernel contexts are credited to the task currently running. This means you will see
- seemingly nonsense profiles such as /bin/bash showing samples for the PPP modules,
- etc.
-
-
- Using --separate=thread creates a lot
- of sample files if you leave OProfile running for a while; it's most
- useful when used for short sessions, or when using image filtering.
-
-
-
- --callgraph=#depth
-
-
-
- Enable call-graph sample collection with a maximum depth. Use 0 to disable
- callgraph profiling. NOTE: Callgraph support is available on a limited
- number of platforms at this time; for example:
-
-
-
-
-
-
-
x86 with 2.6 or higher kernel
-
-
-
ARM with 2.6 or higher kernel
-
-
-
PowerPC with 2.6.17 or higher kernel
-
-
-
-
-
-
-
-
-
- --image=image,[images]|"all"
-
-
-
- Image filtering. If you specify one or more absolute
- paths to binaries, OProfile will only produce profile results for those
- binary images. This is useful for restricting the sometimes voluminous
- output you may get otherwise, especially with
- --separate=thread. Note that if you are using
- --separate=lib or
- --separate=kernel, then if you specification an
- application binary, the shared libraries and kernel code
- are included. Specify the value
- "all" to profile everything (the default).
-
-
-
- --vmlinux=file
-
-
-
- vmlinux kernel image.
-
-
-
-
- --no-vmlinux
-
-
-
-
- Use this when you don't have a kernel vmlinux file, and you don't want
- to profile the kernel. This still counts the total number of kernel samples,
- but can't give symbol-based results for the kernel or any modules.
-
-
-
-
-
-
-
-
-
2.1. Examples
-
-
-
-
-
-
-
-
2.1.1. Intel performance counter setup
-
-
-
-
-Here, we have a Pentium III running at 800MHz, and we want to look at where data memory
-references are happening most, and also get results for CPU time.
-
2.1.3. Separate profiles for libraries and the kernel
-
-
-
-
-Here, we want to see a profile of the OProfile daemon itself, including when
-it was running inside the kernel driver, and its use of shared libraries.
-
-It can often be useful to split up profiling data into several different
-time periods. For example, you may want to collect data on an application's
-startup separately from the normal runtime data. You can use the simple
-command opcontrol --save to do this. For example :
-
-
-
-
-
-# opcontrol --save=blah
-
-
-
-
-
-will create a sub-directory in $SESSION_DIR/samples containing the samples
-up to that point (the current session's sample files are moved into this
-directory). You can then pass this session name as a parameter to the post-profiling
-analysis tools, to only get data up to the point you named the
-session. If you do not want to save a session, you can do
-rm -rf $SESSION_DIR/samples/sessionname or, for the
-current session, opcontrol --reset.
-
-
-
-
-
-
-
-
-
3. Specifying performance counter events
-
-
-
-
-Both methods of profiling (operf and opcontrol)
-allow you to give one or more event specifications to provide details of how each
-hardware performance counter should be setup. With operf, you
-can provide a comma-separated list of event specfications using the --events
-option. With opcontrol, you use the --event option
-for each desired event specification.
-The event specification is a colon-separated string of the form
-name:count:unitmask:kernel:user
-as described in the table below.
-
-
-If no event specs are passed to operf or opcontrol,
-the default event will be used for profiling. With opcontrol, if you have
-previously specified some non-default event but want to revert to the default event, use
---event=default. Use of this option overrides all previous event selections
-that have been cached.
-
-
-
-
Note
OProfile will allocate hardware counters as necessary, but some processor
-types have restrictions as to what hardware events may be counted simultaneously.
-The operf program uses a multiplexing technique when such
-hardware restrictions are encountered, but opcontrol does
-not have this capability; instead, opcontrol will display an
-error message if you select an incompatible set of events.
-
-
-
-
-
-
-
-
-
-
-
-
- name
-
-
The symbolic event name, e.g. CPU_CLK_UNHALTED
-
-
-
- count
-
-
The counter reset value, e.g. 100000
-
-
-
- unitmask
-
-
The unit mask, as given in the events list: e.g. 0x0f; or a symbolic name as
-given by the first word of the description (only valid for unit masks having an "extra:" parameter)
-
-
-
- kernel
-
-
Whether to profile kernel code
-
-
-
- user
-
-
Whether to profile userspace code
-
-
-
-
-
-The last three values are optional, if you omit them (e.g. --event=DATA_MEM_REFS:30000),
-they will be set to the default values (a unit mask of 0, and profiling both kernel and
-userspace code). Note that some events require a unit mask.
-
-
-When specifying a unit mask value, it may be either a hexadecimal value (which
-must begin with "0x") or a string (i.e, symbolic name) which matches
-the first word in the unit mask description. Specifying a symbolic name for
-the unit mask is valid only for unit masks having "extra:" parameters, as
-shown by the output of ophelp. Unit masks with "extra:" parameters must be
-specified using the symbolic name.
-
-
-
Note
-
-When using legacy mode opcontrol on PowerPC platforms, all events specified must be in the same group;
-i.e., the group number appended to the event name (e.g. <some-event-name>_GRP9
-) must be the same.
-
-
-
-If OProfile is using timer-interrupt mode, there is no event configuration possible.
-
-
-The table below lists the default event for various processor types:
-
-
-
-
-
-
-
-
-
-
-
Processor
-
cpu_type
-
Default event
-
-
-
Alpha EV4
-
alpha/ev4
-
CYCLES:100000:0:1:1
-
-
-
Alpha EV5
-
alpha/ev5
-
CYCLES:100000:0:1:1
-
-
-
Alpha PCA56
-
alpha/pca56
-
CYCLES:100000:0:1:1
-
-
-
Alpha EV6
-
alpha/ev6
-
CYCLES:100000:0:1:1
-
-
-
Alpha EV67
-
alpha/ev67
-
CYCLES:100000:0:1:1
-
-
-
ARM/XScale PMU1
-
arm/xscale1
-
CPU_CYCLES:100000:0:1:1
-
-
-
ARM/XScale PMU2
-
arm/xscale2
-
CPU_CYCLES:100000:0:1:1
-
-
-
ARM/MPCore
-
arm/mpcore
-
CPU_CYCLES:100000:0:1:1
-
-
-
AVR32
-
avr32
-
CPU_CYCLES:100000:0:1:1
-
-
-
Athlon
-
i386/athlon
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Pentium Pro
-
i386/ppro
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Pentium II
-
i386/pii
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Pentium III
-
i386/piii
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Pentium M (P6 core)
-
i386/p6_mobile
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Pentium 4 (non-HT)
-
i386/p4
-
GLOBAL_POWER_EVENTS:100000:1:1:1
-
-
-
Pentium 4 (HT)
-
i386/p4-ht
-
GLOBAL_POWER_EVENTS:100000:1:1:1
-
-
-
Hammer
-
x86-64/hammer
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Family10h
-
x86-64/family10
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Family11h
-
x86-64/family11h
-
CPU_CLK_UNHALTED:100000:0:1:1
-
-
-
Itanium
-
ia64/itanium
-
CPU_CYCLES:100000:0:1:1
-
-
-
Itanium 2
-
ia64/itanium2
-
CPU_CYCLES:100000:0:1:1
-
-
-
TIMER_INT
-
timer
-
None selectable
-
-
-
IBM pseries
-
PowerPC 4/5/6/7/970/Cell
-
CYCLES:100000:0:1:1
-
-
-
IBM s390
-
timer
-
None selectable
-
-
-
IBM s390x
-
timer
-
None selectable
-
-
-
+ (i.e., per-thread group) samples for the case where multiple processes are
+ executing the same program during a profiling run.
+
+
+
+
+ --separate-cpu / -c
+
+
+
+
+ This option categorizes samples by CPU.
+
+
+
+
+ --session-dir / -d [path]
+
+
+
+
+ This option specifies the session directory to hold the sample data. If not specified,
+ the data is saved in the oprofile_data directory on the current path.
+
+
+
+
+ --lazy-conversion / -l
+
+
+
+
+ Use this option to reduce the overhead of operf during profiling.
+ Normally, profile data received from the kernel is converted to OProfile format
+ during profiling time. This is typically not an issue when profiling a single
+ application. But when using the --system-wide option, this on-the-fly
+ conversion process can cause noticeable overhead, particularly on busy
+ multi-processor systems. The --lazy-conversion option directs
+ operf to wait until profiling is completed to do the conversion
+ of profile data.
+
+
+
+
+ --verbose / -V [level]
+
+
+
+
+ A comma-separated list of debugging control values used to increase the verbosity of the
+ output. Valid values are: debug, record, convert, misc, sfile, arcs, and the special value, 'all'.
+
+
+
+
+ --version / -v
+
+
+
+
+ Show operf version.
+
+
+
+
+ --help / -h
+
+
+
+
+ Show a help message.
+
+
+
-
+
-
4. Setting up the JIT profiling feature
+
2. Setting up the JIT profiling feature
@@ -2449,14 +1746,14 @@ The table below lists the default event for various processor types:
it needs to be instrumented with an agent library. We use the
agent libraries for Java in the following example. To use the
Java profiling feature, you must build OProfile with the "--with-java" option
- (Section 6, “Installation”).
+ (Section 7, “Installation”).
-
+
-
4.1. JVM instrumentation
+
2.1. JVM instrumentation
@@ -2514,54 +1811,19 @@ The table below lists the default event for various processor types:
-
-
-
-
-
5. Using oprof_start
-
-
-
-
-The oprof_start application provides a convenient way to start the profiler.
-Note that oprof_start is just a wrapper around the opcontrol script,
-so it does not provide more services than the script itself.
-
-
-After oprof_start is started you can select the event type for each counter;
-the sampling rate and other related parameters are explained in Section 2, “Using opcontrol”.
-The "Configuration" section allows you to set general parameters such as the buffer size, kernel filename
-etc. The counter setup interface should be self-explanatory; Section 6.1, “Hardware performance counters” and related
-links contain information on using unit masks.
-
-
-A status line shows the current status of the profiler: how long it has been running, and the average
-number of interrupts received per second and the total, over all processors.
-Note that quitting oprof_start does not stop the profiler.
-
-
-Your configuration is saved in the same file as opcontrol uses; that is,
-~/.oprofile/daemonrc.
-
-
-
-
Note
oprof_start does not currently support operf.
-
-
-
-
+
-
6. Configuration details
+
3. Configuration details
-
+
-
6.1. Hardware performance counters
+
3.1. Hardware performance counters
@@ -2574,7 +1836,8 @@ events other than the default event chosen by OProfile.
@@ -2588,82 +1851,92 @@ https://www.power.org/events/Power7 contains specific information on the per
monitor unit for the IBM POWER7.
-These processors are capable of delivering an interrupt when a counter overflows.
+A physical performance monitor counter (PMC) is configured by a profiling tool to count a particular
+type of event. When the counter overflows, an interrupt is delivered to the processor.
This is the basic mechanism on which OProfile is based. The delivery mode is NMI,
so blocking interrupts in the kernel does not prevent profiling. When the interrupt handler is called,
-the current PC value and the current task are recorded into the profiling structure.
-This allows the overflow event to be attached to a specific assembly instruction in a binary image.
-OProfile receives this data from the kernel and writes it to the sample files.
+the current PC (program counter) value and the current task are recorded into the profiling structure.
+This allows the overflow event to be attributed to a specific assembly instruction in a specific binary image.
+OProfile receives this data (commonly referred to as a "sample") from the kernel and writes it to the sample files.
If we use an event such as CPU_CLK_UNHALTED or INST_RETIRED
(GLOBAL_POWER_EVENTS or INSTR_RETIRED, respectively, on the Pentium 4), we can
-use the overflow counts as an estimate of actual time spent in each part of code. Alternatively we can profile interesting
+use the overflow counts (samples) as an estimate of actual time spent in each part of code. Alternatively we can profile interesting
data such as the cache behaviour of routines with the other available counters.
However there are several caveats. First, there are those issues listed in the Intel manual. There is a delay
between the counter overflow and the interrupt delivery that can skew results on a small scale - this means
you cannot rely on the profiles at the instruction level as being perfectly accurate.
-If you are using an "event-mode" counter such as the cache counters, a count registered against it doesn't mean
-that it is responsible for that event. However, it implies that the counter overflowed in the dynamic
-vicinity of that instruction, to within a few instructions. Further details on this problem can be found in
+For example, if you are profiling an application with an event that counts L1 cache misses, a sample attributed
+to a particular instruction in the application doesn't necessarily mean that exact instruction is responsible
+for that event; instead, it means the sample was taken in the dynamic vicinity of that instruction,
+usually with a margin of error of a few instructions. Further details on this problem can be found in
Chapter 5, Interpreting profiling results and also in the Digital paper "ProfileMe: A Hardware Performance Counter".
-Each counter has several configuration parameters.
-First, there is the unit mask: this simply further specifies what to count.
-Second, there is the counter value, discussed below. Third, there is a parameter whether to increment counts
+Each counter has several configuration parameters besides the type of event to count.
+First, there is the unit mask, which is used to further qualify exactly what to count.
+Second, there is the count field, discussed below. Third, there are parameters
+to specify whether to increment counts
whilst in kernel or user space. You can configure these separately for each counter.
-After each overflow event, the counter will be re-initialized
-such that another overflow will occur after this many events have been counted. Thus, higher
-values mean less-detailed profiling, and lower values mean more detail, but higher overhead.
-Picking a good value for this
-parameter is, unfortunately, somewhat of a black art. It is of course dependent on the event
-you have chosen.
+When the profiler is initially set up, a performance monitor counter is chosen for counting the
+event, and it is initialized using the count value.
+Once profiling begins, the counter increments with each event detected, and the counter
+overflows when the count value is reached.
+As described above, the counter overflow generates an interrupt, and the sample is recorded.
+After each overflow event, the counter is re-initialized using the count value,
+and counting begins anew for the next sample. Higher values for count
+result in samples being taken less frequently, and therefore less-detailed (and, potentially,
+less accurate) profiling. Lower values mean more detail, but higher overhead.
+Picking a good value for this parameter is, unfortunately, somewhat of a black art. It is
+of course dependent on the event you have chosen.
Specifying too large a value will mean not enough interrupts are generated
-to give a realistic profile (though this problem can be ameliorated by profiling for longer).
-Specifying too small a value can lead to higher performance overhead.
+to give a realistic profile (though this problem can be ameliorated by profiling for
+longer time periods). Specifying too small a value can lead to higher performance overhead.
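As a rough illustration of the trade-off described above, the following Python sketch estimates how many samples per second a given count value produces. It is only a back-of-the-envelope aid: the function name is hypothetical, and the 800 MHz figure is an assumed, illustrative event rate for a cycle-counting event on a CPU that is never halted. Real event rates depend on the event chosen and on the workload.

```python
def samples_per_second(event_rate_hz: float, count: int) -> float:
    """Estimate the sampling rate: one sample is taken every `count` events."""
    if count <= 0:
        raise ValueError("count must be positive")
    return event_rate_hz / count


# Assumed, illustrative event rate: about 800e6 cycle events per second.
cycles_per_second = 800e6

# Lower count values give more samples (more detail, more overhead);
# higher count values give fewer samples (less detail, less overhead).
for count in (100000, 1000000, 10000000):
    rate = samples_per_second(cycles_per_second, count)
    print(f"count={count:>8}: about {rate:.0f} samples/s")
```

Under these assumptions, raising the count by a factor of ten cuts the number of samples by the same factor, so a profile of comparable statistical quality needs roughly ten times the profiling duration.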
-
+
-
6.2. OProfile in timer interrupt mode
+
3.2. OProfile timer interrupt mode
-Some CPU types do not provide the needed hardware support to use the hardware performance counters. This includes
-some laptops, classic Pentiums, and other CPU types not yet supported by OProfile (such as Cyrix).
-On these machines, OProfile falls back to using the timer interrupt for profiling,
-back to using the real-time clock interrupt to collect samples. In timer mode, OProfile
-is not able to profile code that has interrupts disabled.
-
-
-You can force use of the timer interrupt by using the timer=1 module
-parameter (or oprofile.timer=1 on the boot command line if OProfile is
-built-in). If OProfile was built as a kernel module, then you must pass the 'timer=1'
-parameter with the modprobe command. Do this before executing 'opcontrol --init' or
-edit the opcontrol command's invocation of modprobe to pass the 'timer=1' parameter.
-
-
-
Note
Timer mode is only available using the legacy opcontrol command.
+Some CPU types do not provide the needed hardware support for hardware performance counters.
+Additionally, some older architectures are not supported by the perf_events kernel subsystem.
+On such machines, the operf and ocount commands will exit with a message indicating the
+processor type is not supported. However, you can install OProfile 0.9.9 and use the legacy
+opcontrol-based profiler, which will fall back to using timer interrupts for profiling.
+Note that in timer mode, OProfile is not able to profile code that has interrupts disabled.
+
+
Note
Timer mode is only available using the legacy opcontrol command,
+available in releases prior to 1.0.
-
+
-
6.3. Pentium 4 support
+
3.3. Architecture-specific configuration notes
-
+
+
+
+
+
3.3.1. Pentium 4 support
+
+
+
+
The Pentium 4 / Xeon performance counters are organized around 3 types of model specific registers (MSRs): 45 event
selection control registers (ESCRs), 18 counter configuration control registers (CCCRs) and 18 counters. ESCRs describe a
particular set of events which are to be recorded, and CCCRs bind ESCRs to counters and configure their
@@ -2672,370 +1945,34 @@ another at any time. There is, however, a subset of 8 counters, 8 ESCRs, and 8 C
one another, so OProfile only accesses those registers, treating them as a bank of 8 "normal" counters, similar
to those in the P6 or Athlon/Opteron/Phenom/Turion families of CPU.
-
+
There is currently no support for Precision Event-Based Sampling (PEBS), nor any advanced uses of the Debug Store
(DS). Current support is limited to the conservative extension of OProfile's existing interrupt-based model described
above.
-
-
-
-
-
-
-
6.4. Intel Itanium 2 support
-
-
-
-
-The Itanium 2 performance monitoring unit (PMU) organizes the counters as four
-pairs of performance event monitoring registers. Each pair is composed of a
-Performance Monitoring Configuration (PMC) register and Performance Monitoring
-Data (PMD) register. The PMC selects the performance event being monitored and
-the PMD determines the sampling interval. The IA64 Performance Monitoring Unit
-(PMU) triggers sampling with maskable interrupts. Thus, samples will not occur
-in sections of the IA64 kernel where interrupts are disabled.
-
-
-None of the advance features of the Itanium 2 performance monitoring unit
-such as opcode matching, address range matching, or precise event sampling are
-supported by this version of OProfile. The Itanium 2 support only maps OProfile's
-existing interrupt-based model to the PMU hardware.
-
-
-
-
-
-
-
6.5. PowerPC64 support
-
-
-
-
-The performance monitoring unit (PMU) for the IBM PowerPC 64-bit processors
-consists of between 4 and 8 counters (depending on the model), plus three
-special purpose registers used for programming the counters -- MMCR0, MMCR1,
-and MMCRA. Advanced features such as instruction matching and thresholding are
-not supported by this version of OProfile.
-
-
Note
Later versions of the IBM POWER5+ processor (beginning with revision 3.0)
-run the performance monitor unit in POWER6 mode, effectively removing OProfile's
-access to counters 5 and 6. These two counters are dedicated to counting
-instructions completed and cycles, respectively. In POWER6 mode, however, the
-counters do not generate an interrupt on overflow and so are unusable by
-OProfile. Kernel versions 2.6.23 and higher will recognize this mode
-and export "ppc64/power5++" as the cpu_type to the oprofilefs pseudo filesystem.
-OProfile userspace responds to this cpu_type by removing these counters from
-the list of potential events to count. Without this kernel support, attempts
-to profile using an event from one of these counters will yield incorrect
-results -- typically, zero (or near zero) samples in the generated report.
-
-
-
-
-
-
-
-
-
6.6. Cell Broadband Engine support
-
-
-
-
-The Cell Broadband Engine (CBE) processor core consists of a PowerPC Processing
-Element (PPE) and 8 Synergistic Processing Elements (SPE). PPEs and SPEs each
-consist of a processing unit (PPU and SPU, respectively) and other hardware
-components, such as memory controllers.
-
-
-A PPU has two hardware threads (aka "virtual CPUs"). The performance monitor
-unit of the CBE collects event information on one hardware thread at a time.
-Therefore, when profiling PPE events,
-OProfile collects the profile based on the selected events by time slicing the
-performance counter hardware between the two threads. The user must ensure the
-collection interval is long enough so that the time spent collecting data for
-each PPU is sufficient to obtain a good profile.
-
-
-To profile an SPU application, the user should specify the SPU_CYCLES event.
-When starting OProfile with SPU_CYCLES, the opcontrol script enforces certain
-separation parameters (separate=cpu,lib) to ensure that sufficient information
-is collected in the sample data in order to generate a complete report. The
---merge=cpu option can be used to obtain a more readable report if analyzing
-the performance of each separate SPU is not necessary.
-
-
-Profiling with an SPU event (events 4100 through 4163) is not compatible with any other
-event. Further more, only one SPU event can be specified at a time. The hardware only
-supports profiling on one SPU per node at a time. The OProfile kernel code time slices
-between the eight SPUs to collect data on all SPUs.
-
-
-SPU profile reports have some unique characteristics compared to reports for
-standard architectures:
-
-
-
-
Typically no "app name" column. This is really standard OProfile behavior
-when the report contains samples for just a single application, which is
-commonly the case when profiling SPUs.
-
"CPU" equates to "SPU"
-
Specifying '--long-filenames' on the opreport command does not always result
-in long filenames. This happens when the SPU application code is embedded in
-the PPE executable or shared library. The embedded SPU ELF data contains only the
-short filename (i.e., no path information) for the SPU binary file that was used as
-the source for embedding. The reason that just the short filename is used is because
-the original SPU binary file may not exist or be accessible at runtime. The performance
-analyst must have sufficient knowledge of the application to be able to correlate the
-SPU binary image names found in the report to the application's source files.
-
Note
-Compile the application with -g and generate the OProfile report
-with -g to facilitate finding the right source file(s) on which to focus.
-
-
-
-
-
-
-
-
-
6.7. AMD64 (x86_64) Instruction-Based Sampling (IBS) support
-
-
-
-
-Instruction-Based Sampling (IBS) is a new performance measurement technique
-available on AMD Family 10h processors. Traditional performance counter
-sampling is not precise enough to isolate performance issues to individual
-instructions. IBS, however, precisely identifies instructions which are not
-making the best use of the processor pipeline and memory hierarchy.
-For more information, please refer to the "Instruction-Based Sampling:
-A New Performance Analysis Technique for AMD Family 10h Processors" (
-
-http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf).
-There are two types of IBS profile types, described in the following sections.
-
-
Note
Profiling on IBS events is only supported with legacy mode profiling
-(i.e., with opcontrol).
-
-
-
-
-
-
-
6.7.1. IBS Fetch
-
-
-
-
-IBS fetch sampling is a statistical sampling method which counts completed
-fetch operations. When the number of completed fetch operations reaches the
-maximum fetch count (the sampling period), IBS tags the fetch operation and
-monitors that operation until it either completes or aborts. When a tagged
-fetch completes or aborts, a sampling interrupt is generated and an IBS fetch
-sample is taken. An IBS fetch sample contains a timestamp, the identifier of
-the interrupted process, the virtual fetch address, and several event flags
-and values that describe what happened during the fetch operation.
-
+
-
6.7.2. IBS Op
+
3.3.2. PowerPC64 support
-IBS op sampling selects, tags, and monitors macro-ops as issued from AMD64
-instructions. Two options are available for selecting ops for sampling:
-
-
-
-
-Cycles-based selection counts CPU clock cycles. The op is tagged and monitored
-when the count reaches a threshold (the sampling period) and a valid op is
-available.
-
-
-Dispatched op-based selection counts dispatched macro-ops.
-When the count reaches a threshold, the next valid op is tagged and monitored.
-
-
-
-
-In both cases, an IBS sample is generated only if the tagged op retires.
-Thus, IBS op event information does not measure speculative execution activity.
-The execution stages of the pipeline monitor the tagged macro-op. When the
-tagged macro-op retires, a sampling interrupt is generated and an IBS op
-sample is taken. An IBS op sample contains a timestamp, the identifier of
-the interrupted process, the virtual address of the AMD64 instruction from
-which the op was issued, and several event flags and values that describe
-what happened when the macro-op executed.
-
-
-
-Enabling IBS profiling is done simply by specifying IBS performance events
-through the "--event=" options. These events are listed in the
-opcontrol --list-events.
-
-
-
-
-
-opcontrol --event=IBS_FETCH_XXX:<count>:<um>:<kernel>:<user>
-opcontrol --event=IBS_OP_XXX:<count>:<um>:<kernel>:<user>
-
-Note: * All IBS fetch event must have the same event count and unitmask,
- as do those for IBS op.
-
-
-
-
-
-
-
-
-
-
6.8. IBM System z hardware sampling support
-
-
-
-
-IBM System z provides a facility which does instruction sampling as
-part of the CPU. This has great advantages over the timer based
-sampling approach like better sampling resolution with less overhead
-and the possibility to get samples within code sections where
-interrupts are disabled (useful especially for Linux kernel code).
-
-
Note
Profiling with the instruction sampling facility is currently only supported
-with legacy mode profiling (i.e., with opcontrol).
-System z hardware sampling can be used for Linux instances in LPAR
-mode. The hardware sampling support used by OProfile was introduced
-for System z10 in October 2008.
-
-
-To enable hardware sampling for an LPAR you must activate the LPAR
-with authorization for basic sampling control. See the "Support
-Element Operations Guide" for your mainframe system for more
-information.
-
-
-The hardware sampling facility can be enabled and disabled using the
-event interface. A `virtual' counter 0 has been defined that only supports
-a single event, HWSAMPLING. By default the HWSAMPLING event is
-enabled on machines providing the facility. For both events only the
-`count', `kernel' and `user' options are evaluated by the kernel
-module.
-
-
-The `count' value is the sampling rate as it is passed to the CPU
-measurement facility. A sample will be taken by the hardware every
-`count' cycles. Using low values here will quickly fill up the
-sampling buffers and will generate CPU load on the OProfile daemon and
-the kernel module being busy flushing the hardware buffers. This
-might considerably impact the workload to be profiled.
-
-
-The unit mask `um' is required to be zero.
-
-
-The opcontrol tool provides a new option specific to System z
-hardware sampling:
-
-
-
-
--s390hwsampbufsize="num": Number of 2MB areas
-used per CPU for storing sample data. The best
-size for the sample memory depends on the particular system and the
-workload to be measured. Providing the sampler with too little memory
-results in lost samples. Reserving too much system memory for the
-sampler impacts the overall performance and, hence, also the workload
-to be measured.
-
-
-
-A special counter /dev/oprofile/timer is provided
-by the kernel module allowing to switch back to timer mode sampling
-dynamically. The TIMER event is limited to be used only with this
-counter. The TIMER event can be specified using the
---event= as with every other event.
-
-
-
-
-
opcontrol --event=TIMER:1
-
-
-
-
-On z10 or later machines the default event is set to TIMER in case the
-hardware sampling facility is not available.
-
-
-Although required, the 'count' parameter of the TIMER event is
-ignored. The value may eventually be used for timer based sampling
-with a configurable sampling frequency, but this is currently not
-supported.
-
-
-
-
-
-
-
6.9. Dangerous counter settings
-
-
-
-
-OProfile is a low-level profiler which allows continuous profiling with a low-overhead cost.
-When using OProfile legacy mode profiling, it may be possible to configure such a low a counter reset value
-(i.e., high sampling rate) that the system can become overloaded with counter interrupts and your
-system's responsiveness may be severely impacted. Whilst some validation is done on the count
-values you pass to opcontrol with your event specification, it is not foolproof.
-
-
-
Note
-
-This can happen as follows: When the profiler count
-reaches zero, an NMI handler is called which stores the sample values in an internal buffer, then resets the counter
-to its original value. If the reset count you specified is very low, a pending NMI can be sent before the NMI handler has
-completed. Due to the priority of the NMI, the pending interrupt is delivered immediately after
-completion of the previous interrupt handler, and control never returns to other parts of the system.
-If all processors are stuck in this mode, the system will appear to be frozen.
-
-
-
If this happens, it will be impossible to bring the system back to a workable state.
-There is no way to provide real security against this happening, other than making sure to use a reasonable value
-for the counter reset. For example, setting CPU_CLK_UNHALTED event type with a ridiculously low reset count (e.g. 500)
-is likely to freeze the system.
+The performance monitoring unit (PMU) for the IBM PowerPC 64-bit processors
+consists of between 4 and 8 counters (depending on the model). Advanced features
+such as instruction matching and thresholding are not supported by OProfile.
-
-In short : Don't try a foolish sample count value. Unfortunately the definition of a foolish value
-is really dependent on the event type. If ever in doubt, post a message to
-The scenario described above cannot occur if you use operf for profiling instead of
-opcontrol, because the perf_events kernel subsystem automatically detects when performance monitor
-interrupts are arriving at a dangerous level and will throttle back the sampling rate.
-
-
+
-
Chapter 4. Obtaining results
+
Chapter 4. Obtaining profiling results
@@ -3157,7 +2094,7 @@ interrupts are arriving at a dangerous level and will throttle back the sampling
@@ -3186,23 +2123,13 @@ interrupts are arriving at a dangerous level and will throttle back the sampling
-OK, so the profiler has been running, but it's not much use unless we can get some data out. Sometimes,
-OProfile does a little too good a job of keeping overhead low, and no data reaches
-the profiler. This can happen on lightly-loaded machines. If you're using OPorifle legacy mode, you can
-force a dump at any time with :
-
-
-
- opcontrol --dump
-
-
-
This ensures that any profile data collected by the oprofiled daemon has been flusehd
-to disk. Remember to do a dump, stop, shutdown, or deinit
-before complaining there is no profiling data!
-
-
-Now that we've got some data, it has to be processed. That's the job of opreport,
-opannotate, or opgprof.
+After collecting profile data, the raw data must undergo special processing in order for you to
+perform your analysis. The analysis tools that perform this special processing are
+opreport, opannotate, and opgprof.
+Additionally, the oparchive tool is used to gather together profile
+data, sampled binary files, etc. for the purpose of off-line analysis. While
+not really an analysis tool, oparchive is put in that category
+for convenience since it takes many of the same options as the other analysis tools.
@@ -3213,11 +2140,11 @@ Now that we've got some data, it has to be processed. That's the job of
-All of the analysis tools take a profile specification.
-This is a set of definitions that describe which actual profiles should be
+All of the analysis tools take a profile specification
+as an input argument.
+This is a set of definitions that describes the specific profile data that should be
examined. The simplest profile specification is empty: this will match all
-the available profile files for the current session (this is what happens
-when you do opreport).
+the available profile files for the current session.
Specification parameters are of the form name:value[,value].
@@ -3225,10 +2152,11 @@ For example, if I wanted to get a combined symbol summary for
/bin/myprog and /bin/myprog2,
I could do opreport -l image:/bin/myprog,/bin/myprog2.
As a special case, you don't actually need to specify the image:
-part here: anything left on the command line is assumed to be an
+part of the specification. Anything left on the command line after all other
+opreport options have been processed is assumed to be an
image: name. Similarly, if no session:
is specified, then session:current is assumed ("current"
-is a special name of the current / last profiling session).
+is a special name of the current (i.e., most recent) profiling session).
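
As a rough illustration of the rules above, the following Python sketch turns command-line arguments into a specification dictionary. It is not OProfile's actual parser: the function name `parse_profile_spec` and the `KNOWN_TAGS` set (limited to tags mentioned in this section) are assumptions, and the brace syntax used for differential profiles is not handled.

```python
# Hypothetical sketch, not OProfile's real parser.
# KNOWN_TAGS is an assumed subset of the tag names mentioned in this section.
KNOWN_TAGS = {"session", "image", "event", "count", "archive"}


def parse_profile_spec(args):
    """Map name:value[,value] arguments to {name: [values]}."""
    spec = {}
    for arg in args:
        name, sep, value = arg.partition(":")
        if not sep or name not in KNOWN_TAGS:
            # Anything without a recognized tag is assumed to be an image name.
            name, value = "image", arg
        spec.setdefault(name, []).extend(value.split(","))
    # If no session is given, session:current is assumed.
    spec.setdefault("session", ["current"])
    return spec


print(parse_profile_spec(["image:/bin/myprog,/bin/myprog2"]))
print(parse_profile_spec(["/bin/myprog", "event:DATA_MEM_REFS"]))
```

Both calls yield a `session` entry of `["current"]`, and the bare `/bin/myprog` argument is filed under `image`.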
In addition to the comma-separated list shown above, some of the
@@ -3414,10 +2342,7 @@ Differential profile of an archived binary with the current session :
Same as image:, but only for images that are for
- a particular primary binary image (namely, an application). This only
- makes sense to use if you're using --separate.
- This includes kernel modules and the kernel when using
- --separate=kernel.
+ a particular primary binary image (namely, an application).
@@ -3446,7 +2371,6 @@ Differential profile of an archived binary with the current session :
The symbolic event name to match on, e.g. event:DATA_MEM_REFS.
You can pass a list of events for side-by-side comparison with opreport.
- When using the timer interrupt, the event is always "TIMER".
@@ -3461,11 +2385,10 @@ Differential profile of an archived binary with the current session :
The event count to match on, e.g. event:DATA_MEM_REFS count:30000.
Note that this value refers to the count value in the event spec you passed
- to opcontrol or operf when setting up to do a
+ to operf when setting up to do a
profile run. It has nothing to do with the sample counts in the profile data
itself.
You can pass a list of events for side-by-side comparison with opreport.
- When using the timer interrupt, the count is always 0 (indicating it cannot be set).
@@ -3543,9 +2466,8 @@ Differential profile of an archived binary with the current session :
-Each session's sample files can be found in the $SESSION_DIR/samples/ directory (default when
-using legacy mode: /var/lib/oprofile/samples/; default when using
-operf: <cur_dir>/oprofile_data/samples/).
+Each session's sample files can be found in the $SESSION_DIR/samples/ directory (default
+for operf is <cur_dir>/oprofile_data/samples/).
These are used, along with the binary image files, to produce human-readable data.
In some circumstances (e.g., kernel modules), OProfile
will not be able to find the binary images. All the tools have an --image-path
@@ -3622,12 +2544,7 @@ taken per second.
Similarly, if the application spends little time in the main binary image
itself, with most of it spent in shared libraries it uses, you might
-not see any samples for the binary image (i.e., executable) itself. If you're
-using OProfile legacy mode profiling, then we recommend using
-opcontrol --separate=lib before the
-profiling session so that opreport and friends show
-the library profiles on a per-application basis. This is done automatically
-when profiling with operf, so no special setup is necessary.
+not see any samples for the binary image (i.e., executable) itself.
@@ -3644,7 +2561,7 @@ but no task with that group ID ever ran the code.
-If you're using a particular event counter, for example counting MMX
+If you're profiling a particular event, for example counting MMX
operations, the code might simply have not generated any events in the
first place. Verify the code you're profiling does what you expect it
to.
@@ -3678,7 +2595,7 @@ The opreport utility is the primar
getting formatted data out of OProfile. It produces two types of data: image summaries
and symbol summaries. An image summary lists the number of samples for individual
binary images such as libraries or applications. Symbol summaries provide per-symbol
-profile data. In the following example, we're getting an image summary for the whole
+profile data. In the following truncated example, we see an image summary for the whole
system:
@@ -3708,31 +2642,28 @@ Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mas
If we had specified --symbols in the previous command, we would have
gotten a symbol summary of all the images across the entire system. We can restrict this to only
part of the system profile; for example,
-below is a symbol summary of the OProfile daemon. Note that as we used
-opcontrol --separate=lib,kernel, symbols from images that oprofiled
-has used are also shown.
+below is a symbol summary for the operf program used to collect the profile.
-$ opreport -l -p /lib/modules/`uname -r` `which oprofiled` 2>/dev/null | more
-CPU: Core 2, speed 2.534e+06 MHz (estimated)
-Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
+$ opreport -l -p /lib/modules/`uname -r` `which operf` 2>/dev/null | more
+CPU: Intel Sandy Bridge microarchitecture, speed 2401 MHz (estimated)
+Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
-1353 24.9447 vmlinux sidtab_context_to_sid
-500 9.2183 vmlinux avtab_hash_eval
-154 2.8392 vmlinux __link_path_walk
-152 2.8024 vmlinux d_prune_aliases
-120 2.2124 vmlinux avtab_search_node
-104 1.9174 vmlinux find_next_bit
-85 1.5671 vmlinux selinux_file_fcntl
-82 1.5118 vmlinux avtab_write
-81 1.4934 oprofiled odb_update_node_with_offset
-73 1.3459 oprofiled opd_process_samples
-72 1.3274 vmlinux avc_has_perm_noaudit
-61 1.1246 libc-2.12.so _IO_vfscanf
-59 1.0878 ext4.ko ext4_mark_iloc_dirty
+860 7.4607 kallsyms avtab_search_node
+474 4.1121 operf OP_perf_utils::op_write_event(event_union*, unsigned long long)
+461 3.9993 kallsyms avc_has_perm_noaudit
+455 3.9473 libstdc++.so.6.0.13 /usr/lib64/libstdc++.so.6.0.13
+412 3.5742 libc-2.12.so _IO_vfscanf
+369 3.2012 kallsyms __d_lookup
+350 3.0363 kallsyms sidtab_context_to_sid
+274 2.3770 operf OP_perf_utils::op_record_process_exec_mmaps(int, int, int, operf_record*)
+232 2.0127 operf operf_process_info::find_mapping_for_sample(unsigned long long, bool)
+222 1.9259 kallsyms __link_path_walk
+191 1.6570 kallsyms pipe_read
+34 0.2950 ext4.ko ext4_mark_iloc_dirty
...
@@ -3748,8 +2679,8 @@ If you have used one of the --separate[*] options
whilst profiling, there can be several separate profiles for
a single binary image within a session. Normally the output
will keep these images separated. So, for example, if you profiled
-with separation on a per-cpu basis (opcontrol --separate=cpu or
-operf --separate-cpu), you would see separate columns in
+with separation on a per-cpu basis (operf --separate-cpu),
+you would see separate columns in
the output of opreport for each CPU where samples
were recorded. But it can be useful to merge these results back together
to make the report more readable. The --merge option allows
@@ -3880,11 +2811,11 @@ as calling strfry(), but it's clear from the sourc
that this doesn't actually happen. See Section 3, “Interpreting call-graph profiles” for an explanation.
-
+
-
2.3.2. Callgraph and JIT support
+
2.3.2. Callgraph is not supported with JIT samples
@@ -3969,13 +2900,12 @@ A typical way to use this feature is with archives created with
-$ ./a
+$ operf ./a
$ oparchive -o orig ./a
-$ opcontrol --reset
# edit and recompile a
-$ ./a
+$ operf ./a
# now compare the current profile of a with the archived profile
-$ opreport -xl ./a { archive:./orig } { }
+$ opreport --session-dir=`pwd`/oprofile_data/ -xl ./a { archive:./orig } { }
CPU: PIII, speed 863.233 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a
unit mask of 0x00 (No unit mask) count 100000
@@ -4059,8 +2989,7 @@ samples % image name symbol name
Note that, since such mappings are dependent upon individual invocations of
-a binary, these mappings are always listed as a dependent image,
-even when using the legacy mode opcontrol --separate=none command.
+a binary, these mappings are always listed as a dependent image.
Equally, the results are not affected by the --merge
option.
@@ -4071,7 +3000,7 @@ Enhanced support for JITed code is now available for some virtual machines;
e.g., the Java Virtual Machine. For details about OProfile output for
JITed code, see Section 4, “OProfile results with JIT samples”.
-
@@ -4172,8 +3101,7 @@ offsets for the image binary.
Do not include application-specific images for libraries, kernel modules
-and the kernel. This option only makes sense if the profile session
-used --separate.
+and the kernel.
@@ -4288,12 +3216,18 @@ Reverse the sort from the default.
-Use sample database out of directory dir_path
-instead of the default location (/var/lib/oprofile).
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then opreport will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
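
The lookup order described above can be sketched in Python as follows. This only illustrates the documented behaviour; the function name `resolve_session_dir` and its parameters are hypothetical and do not correspond to any OProfile API.

```python
import os


def resolve_session_dir(explicit=None, cwd=None, fallback="/var/lib/oprofile"):
    """Illustrative only: mimic the documented --session-dir search order."""
    if explicit:
        # A directory given via --session-dir takes precedence.
        return explicit
    cwd = cwd or os.getcwd()
    local = os.path.join(cwd, "oprofile_data")
    # Prefer <cur_dir>/oprofile_data if it exists, else the standard location.
    return local if os.path.isdir(local) else fallback


print(resolve_session_dir())
```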
@@ -4336,7 +3270,8 @@ List per-symbol information instead of a binary image summary.
Only output data for symbols that have more than the given percentage
-of total samples.
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the symbol is shown.
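
A minimal Python sketch of the multi-event threshold rule, assuming per-symbol percentages are already available. The function name, the data layout, and the sample figures are all hypothetical, and "reached" is interpreted here as strictly greater than the threshold.

```python
def symbols_to_show(percent_by_symbol, threshold):
    """Keep a symbol if ANY of its per-event percentages exceeds the threshold."""
    return [
        symbol
        for symbol, percents in percent_by_symbol.items()
        if any(p > threshold for p in percents)
    ]


# Made-up percentages for two events, for illustration only.
profile = {
    "func_a": [7.5, 0.2],   # above threshold for the first event only
    "func_b": [0.4, 0.9],   # below threshold for both events
    "func_c": [0.1, 3.0],   # above threshold for the second event only
}
print(symbols_to_show(profile, 1.0))
```

With a threshold of 1.0, `func_a` and `func_c` are kept along with all of their per-event data, while `func_b` is dropped.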
@@ -4392,7 +3327,7 @@ use opannotate --assembly
-Note that for the reason explained in Section 6.1, “Hardware performance counters” the results can be
+Note that for the reason explained in Section 3.1, “Hardware performance counters” the results can be
inaccurate. The debug information itself can add other problems; for example, the line number for a symbol can be
incorrect. Assembly instructions can be re-ordered and moved by the compiler, and this can lead to
crediting source lines with samples not really "owned" by this line. Also see
@@ -4545,8 +3480,7 @@ pattern-matching to make C++ symbol demangling more readable.
Do not include application-specific images for libraries, kernel modules
-and the kernel. This option only makes sense if the profile session
-used --separate.
+and the kernel.
@@ -4557,6 +3491,19 @@ used --separate.
Exclude all files in the given comma-separated list of glob patterns.
+This option is supported solely with the --source
+option. It can be used to filter out source files in the output using the
+following types of specifications:
+
+
+
+
filenames (basename -- i.e., no path)
+
filename glob specifications (all files whose base filename matches the given pattern)
+
directory segments (all source files located in the specified directory; e.g. "libio")
@@ -4608,6 +3555,7 @@ A path to a filesystem to search for additional binaries.
Only include files in the given comma-separated list of glob patterns.
+The same rules apply for this option as for the --exclude-file option.
@@ -4694,8 +3642,23 @@ source files when the debug information only contains relative paths.
-Output annotated source. This requires debugging information to be available
-for the binaries.
+Output annotated source. This requires debugging information to be available
+for the binaries.
+
+
+
+
+ --session-dir=dir_path
+
+
+
+
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then opannotate will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
@@ -4705,8 +3668,14 @@ for the binaries.
-Only output data for symbols that have more than the given percentage
-of total samples.
+For annotated assembly, only output data for symbols that have more than the given percentage
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the symbol is shown.
+
+
+For annotated source, only output data for source files that have more than the given percentage
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the source file is shown.
@@ -4912,6 +3881,14 @@ of total samples.
Give verbose debugging output.
+
--session-dir=dir_path
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then opgprof will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
+
@@ -4928,24 +3905,24 @@ Show version.
-
+
-
6. Archiving measurements (oparchive)
+
6. Analyzing profile data on another system (oparchive)
The oparchive utility generates a directory populated
with executable, debug, and oprofile sample files. This directory can be
- moved to another machine via tar and analyzed without
- further use of the data collection machine.
+ copied to another (host) machine and analyzed offline, with no further need to
+ access the data collection machine (target).
- The following command would collect the sample files, the executables
- associated with the sample files, and the debuginfo files associated
- with the executables and copy them into
+ The following command, executed on the target system, will collect the
+ sample files, the executables associated with the sample files, and the
+ debuginfo files associated with the executables and copy them into
/tmp/current_data:
@@ -4957,6 +3934,66 @@ Show version.
+
+ When transferring archived profile data to a host machine for offline analysis,
+ you need to determine if the oprofile ABI format is compatible between the
+ target system and the host system; if it isn't, you must run the opimport
+ command to convert the target's sample data files to the format of your host system.
+ See Section 7, “Converting sample database files (opimport)” for more details.
+
+
+ After your profile data is transferred to the host system and (if necessary)
+ you have run the opimport command to convert the file
+ format, you can now run the opreport and
+ opannotate commands. However, you must provide an
+ "archive specification" to let these post-processing tools know where to find
+ the profile data (sample files, executables, etc.); for example:
+
+ Furthermore, if your profile was collected on your target system into a session-dir
+ other than /var/lib/oprofile, the oparchive
+ command will display a message similar to the following:
+
+
+
+
+
+# NOTE: The sample data in this archive is located at /home/user1/test-stuff/oprofile_data
+instead of the standard location of /var/lib/oprofile. Hence, when using opreport
+and other post-processing tools on this archive, you must pass the following option:
+ --session-dir=/home/user1/test-stuff/oprofile_data
+
+
+
+
+
+ Then the above opreport example would have to include that
+ --session-dir option.
+
+
+
+
Note
+ In some host/target development environments, all target executables, libraries, and
+ debuginfo files are stored in a root directory on the host to facilitate offline
+ analysis. In such cases, the oparchive command collects more data
+ than is necessary; so, when copying the resulting output of oparchive,
+ you can skip all of the executables, etc., and just archive the $SESSION_DIR
+ tree located within the output directory you specified in your oparchive
+ command. Then, when running the opreport or opannotate
+ commands on your host system, pass the --root option to point to the
+ location of your target's executables, etc.
+
+
+
@@ -4985,8 +4022,7 @@ Show help message.
Do not include application-specific images for libraries, kernel modules
-and the kernel. This option only makes sense if the profile session
-used --separate.
+and the kernel.
@@ -5038,6 +4074,21 @@ Only list the files that would be archived, don't copy them.
Give verbose debugging output.
+
+
+
+
+ --session-dir=dir_path
+
+
+
+
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then oparchive will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
@@ -5064,15 +4115,20 @@ Show version.
This utility converts sample database files from a foreign binary format (abi) to
- the native format. This is useful only when moving sample files between systems
- for analysis on platforms other than the one used for collection. The
- oparchive should be used on the machine where the profile was taken (target)
- in order to collect sample files and all other necessary information. The archive
- directory that is the output from oparchive should be copied
- to the system where you wish to perform your performance analysis (host). If the
- When the architecture of your target and host systems differ, then you'll need to
- use the opimport command. The abi format of the sample files
- to be imported is described in a text file located in $SESSION_DIR/abi.
+ the native format. This is required when sample files are moved to a (host) system
+ other than the one used for collection (target system) and the host and target systems have different
+ architectures. The abi format of the sample files to be imported is described in a
+ text file located in $SESSION_DIR/abi. If you are unsure if
+ your target and host systems have compatible architectures (in regard to the OProfile
+ ABI), simply diff a $SESSION_DIR/abi file from the target system
+ with one from the host system. If any differences show up at all, you must run the
+ opimport command.
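
The comparison described above amounts to a byte-for-byte check of the two abi files. A small Python sketch follows, assuming both files have already been copied to the same machine; the function name `needs_opimport` and the paths passed to it are hypothetical.

```python
import filecmp


def needs_opimport(target_abi_path, host_abi_path):
    """Return True if the two abi description files differ in any way."""
    # shallow=False forces a comparison of file contents, not just metadata.
    return not filecmp.cmp(target_abi_path, host_abi_path, shallow=False)
```

If this returns `True` for the target's and host's `$SESSION_DIR/abi` files, the sample files should be converted with opimport before analysis.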
+
+
+ The oparchive command should be used on the machine where
+ the profile was taken (target) in order to collect sample files and all other necessary
+ information. The archive directory that is the output from oparchive
+ should be copied to the system where you wish to perform your performance analysis (host).
The following command converts an input sample file to the specified
@@ -5426,10 +4482,7 @@ problem and OProfile can do nothing about it.
OProfile uses non-maskable interrupts (NMI) on the P6 generation, Pentium 4,
Athlon, Opteron, Phenom, and Turion processors. These interrupts can occur even in sections of the
kernel where interrupts are disabled, allowing collection of samples in virtually
-all executable code. The timer interrupt mode and Itanium 2 collection mechanisms
-use maskable interrupts; therefore, these profiling mechanisms have "sample
-shadows", or blind spots: regions where no samples will be collected. Typically, the samples
-will be attributed to the code immediately after the interrupts are re-enabled.
+all executable code.
@@ -5461,7 +4514,7 @@ will appear as poll_idle() in your kernel profile.
OProfile profiles kernel modules by default. However, there are a couple of problems
you may have when trying to get results. First, you may have booted via an initrd;
this means that the actual path for the module binaries cannot be determined automatically.
-To get around this, you can use the -p option to the profiling tools
+To get around this, you can use the -p option to the analysis tools
to specify where to look for the kernel modules.
@@ -5491,7 +4544,7 @@ information for OProfile to get this information.
-Sometimes the results from call-graph profiles may be different to what
+Sometimes the results from call-graph profiles may be different from what
you expect to see. The first thing to check is whether the target
binaries were compiled with frame pointers enabled (if the binary was
compiled using gcc's
@@ -5954,11 +5007,301 @@ and http://devel
+This section describes in detail how ocount is used.
+Unless the --events option is specified, ocount will use
+the default event for your system. For most systems, the default event is some
+cycles-based event, assuming your processor type supports hardware performance
+counters. The event specification used for ocount is slightly
+different from that required for profiling -- a count value
+is not needed. You can see the event information for your CPU using ophelp.
+More information on event specification can be found at Section 3, “Specifying performance counter events”.
+
+One and only one of these 5 run modes must be specified when you run ocount.
+If you run ocount using a run mode other than command [args], press Ctrl-c
+to stop it when finished counting (e.g., when the monitored process ends). If you background ocount
+(i.e., with ’&’) while using one of these run modes, you must stop it in a controlled manner so that
+the data collection process can be shut down cleanly and final results can be displayed.
+Use kill -SIGINT <ocount-PID> for this purpose.
+
+
+Following is a description of the ocount options.
+
+
+
+
+
+ command [args]
+
+
+
+
+ The command or application whose events are to be counted. The [args] are the input arguments
+ that the command or application requires. The command and its arguments must be positioned at the
+ end of the command line, after all other ocount options.
+
+
+
+
+ --process-list / -p [PIDs]
+
+
+
+
+ Use this option to count events for one or more already-running applications, specified via
+ a comma-separated list (PIDs). Event counts will be collected for all children of the
+ passed process(es) as well.
+
+
+
+
+ --thread-list / -r [TIDs]
+
+
+
+
+ Use this option to count events for one or more already-running threads, specified via
+ a comma-separated list (TIDs). Event counts will not be collected
+ for any children of the passed thread(s).
+
+
+
+
+ --system-wide / -s
+
+
+
+
+ This option is for counting events for all processes running on your system. You must have
+ root authority to run ocount in this mode.
+
+
+
+
+ --cpu-list / -C [CPUs]
+
+
+
+
+ This option is for counting events on a subset of processors on your system. You must have
+ root authority to run ocount in this mode. This is a comma-separated list,
+ where each element in the list may be either a single processor number or a range of processor
+ numbers; for example: ’-C 2,3,4-11,15’.
+
+
+
+
+ --events / -e [event1[,event2[,...]]]
+
+
+
+
+ This option is for passing a comma-separated list of event specifications
+ for counting. Each event spec is of the form:
+
+
+
+
+
name[:unitmask[:kernel[:user]]]
+
+
+
+
+ When no event specification is given, the default event for the running
+ processor type will be used for counting. Use ophelp
+ to list the available events for your processor type.
+
+
+
+
+ --separate-thread / -t
+
+
+
+
+ This option can be used in conjunction with either the --process-list or
+ --thread-list option to display event counts on a per-thread (per-process) basis.
+ Without this option, all counts are aggregated.
+
+
+
+
+ --separate-cpu / -c
+
+
+
+
+ This option can be used in conjunction with either the --system-wide or
+ --cpu-list option to display event counts on a per-cpu basis. Without this option,
+ all counts are aggregated.
+
+ Note: The interval_length is given in milliseconds.
+ However, the current implementation only supports 100 ms
+ granularity, so the given interval_length will be rounded
+ to the nearest 100 ms. Results collected for each time
+ interval are printed immediately instead of the default
+ of one dump of cumulative event counts at the end of the
+ run. Counters are reset to zero at the start of each
+ interval.
+
+
+ If num_intervals is specified, ocount exits after the
+ specified number of intervals occur.
+
+
+
+
+ --brief-format / -b
+
+
+
+
+ Use this option to print results in the following brief format:
+
+
+
+
+
+ [optional cpu or thread,]<event_name>,<count>,<percent_time_enabled>
+ [ <int> ,]< string >,< u64 >,< double >
+
+
+
+
+
+ If --time-interval is specified, a separate line formatted as
+
+
+
+
+
+ timestamp,<num_seconds_since_epoch>[.n]
+
+
+
+
+
+ is printed ahead of each dump of event counts. If the time interval specified is
+ less than one second, the timestamp will have 1/10 second precision.
+
+
+
+
+ --output-file / -f outfile_name
+
+
+
+
+ Results are written to outfile_name instead of interactively to the terminal.
+
+
+
+
+ --verbose / -V
+
+
+
+
+ Use this option to increase the verbosity of the output.
+
+
+
+
+ --version / -v
+
+
+
+
+ Show ocount version.
+
+
+
+
+ --help / -h
+
+
+
+
+ Show a help message.
+
+
+
+
+
+
+
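
The brief format described under --brief-format lends itself to simple scripted parsing. The Python sketch below is one possible reader, assuming event names contain no commas; the function name and the sample lines are made up for illustration and are not real ocount output.

```python
def parse_brief_line(line):
    """Parse one line of ocount brief-format output into a dict."""
    fields = line.strip().split(",")
    if fields[0] == "timestamp":
        # timestamp,<num_seconds_since_epoch>[.n]
        return {"timestamp": float(fields[1])}
    if len(fields) == 4:
        # Leading field is the optional cpu or thread number.
        unit, event, count, percent = fields
        unit = int(unit)
    else:
        event, count, percent = fields
        unit = None
    return {
        "unit": unit,
        "event": event,
        "count": int(count),
        "percent_time_enabled": float(percent),
    }


# Hypothetical sample lines, not captured from a real run.
for sample in ["timestamp,1369000000.5",
               "CPU_CLK_UNHALTED,123456,100.00",
               "3,CPU_CLK_UNHALTED,4242,50.00"]:
    print(parse_brief_line(sample))
```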
-
Chapter 6. Acknowledgments
+
Chapter 7. Acknowledgments
diff --git a/doc/oprofile.xml b/doc/oprofile.xml
index 6bbab72..01cd309 100644
--- a/doc/oprofile.xml
+++ b/doc/oprofile.xml
@@ -3,7 +3,7 @@
OProfile manual
-
+
John
@@ -27,56 +27,72 @@
This manual applies to OProfile version .
-OProfile is a profiling system for Linux 2.6 and higher systems on a number of architectures. It is capable of profiling
-all parts of a running system, from the kernel (including modules and interrupt handlers) to shared libraries
-to binaries. OProfile can profile the whole system in the background, collecting information at a low overhead. These
-features make it ideal for profiling entire systems to determine bottle necks in real-world systems.
+OProfile is a set of performance monitoring tools for Linux 2.6 and higher systems, available on a number of architectures.
+OProfile provides the following features:
+
+Profiler
+Post-processing tools for analyzing profile data
+Event counter
+
+
+OProfile is capable of monitoring native hardware events occurring in all parts of a running system, from the kernel
+(including modules and interrupt handlers) to shared libraries
+to binaries. OProfile can collect event information for the whole system in the background with very little overhead. These
+features make it ideal for monitoring entire systems to determine bottlenecks in real-world systems.
+
+
Many CPUs provide "performance counters", hardware registers that can count "events"; for example,
-cache misses, or CPU cycles. OProfile provides profiles of code based on the number of these occurring events:
+cache misses, or CPU cycles. OProfile can collect profiles of code based on the number of these occurring events:
repeatedly, every time a certain (configurable) number of events has occurred, the PC value is recorded.
-This information is aggregated into profiles for each binary image.
-
-Some hardware setups do not allow OProfile to use performance counters: in these cases, no
-events are available so OProfile operates in timer mode, as described in later chapters. Timer
-mode is only available in "legacy mode" (see ).
-
+This information is aggregated into profiles for each binary image. Alternatively, OProfile's event counting
+tool can collect simple raw event counts.
-OProfile legacy mode
-"Legacy" OProfile consists of the opcontrol shell script, the oprofiled daemon, and several post-processing tools (e.g.,
-opreport). The opcontrol script is used for configuring, starting, and stopping a profiling session. An OProfile
-kernel driver (usually built as a kernel module) is used for collecting samples, which are then recorded into sample files by
-oprofiled. Using OProfile in "legacy mode" requires root user authority since the profiling is done on a system-wide basis, which may
-(if misused) cause adverse effects to the system.
-
-Profiling setup parameters that you specify using opcontrol are cached in /root/.oprofile/daemonrc.
-Subsequent runs of opcontrol --start will continue to use these cached values until you
-override them with new values.
-
+OProfile legacy profiling mode
+Prior to release 1.0, OProfile included a profiling tool consisting of the opcontrol shell script, the oprofiled daemon,
+and the attendant oprofile kernel driver. This "legacy profiler" was deprecated in release 0.9.8 with the introduction of
+the operf profiling tool (see ). Some older architectures/platforms
+do not support the use of operf. For those cases, oprofile users should install release 0.9.9, which is the
+last release to include the legacy profiler.
-OProfile perf_events mode
-As of release 0.9.8, OProfile now includes the ability to profile a single process versus the system-wide technique
-of legacy OProfile. With this new technique, the operf program is used to control profiling instead of the
-opcontrol script and oprofiled daemon of legacy mode. Also, operf does not require the
-special OProfile kernel driver that legacy mode does; instead, it interfaces with the kernel to collect samples via the Linux Kernel
-Performance Events Subsystem (hereafter referred to as "perf_events"). Using operf to profile a single
-process can be done as a normal user; however, root authority is required to run operf in system-wide
-profiling mode.
-
-Note 1
-The same OProfile post-processing tools are used whether you collect your profile with operf or opcontrol.
-
+OProfile perf_events profiling mode
+
+OProfile has the ability to profile a single process or every currently running process (i.e., system-wide)
+via the operf program. operf interfaces with the
+kernel to collect samples via the Linux Kernel Performance Events Subsystem (hereafter
+referred to as "perf_events"). OProfile can co-exist with other tools on your system that
+may also be using the perf_events kernel subsystem.
+
+
+Using operf to profile a single
+process can be done as a normal user; however, root authority is required to run
+operf in system-wide profiling mode.
-Note 2
+Note
Some older processor models are not supported by the underlying perf_events kernel and, thus, are not supported by operf.
If you receive the message
Your kernel's Performance Events Subsystem does not support your processor type
-when attempting to use operf, try profiling with opcontrol
+when attempting to use operf, install OProfile 0.9.9 and try profiling with opcontrol
to see if your processor type may be supported by OProfile's legacy mode.
+
+
+
+
+OProfile event counting mode
+OProfile provides the ocount tool for
+collecting raw event counts on a per-application, per-process, per-cpu, or system-wide basis. Unlike the
+profiling tools, post-processing of the data collected is not necessary -- the data is displayed in the
+output of ocount. A common use case for event counting tools is when performance analysts
+want to determine the CPI (cycles per instruction) for an application. High CPI implies possible stalls,
+and many architectures provide events that give detailed information about the different types of stalls.
+The events provided are architecture-specific, so we refer the reader to the hardware manuals available for
+the processor type being used.
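
For the CPI use case mentioned above, the arithmetic is a single division of two raw counts. The Python sketch below uses made-up counts; which events supply the cycle and instruction counts is architecture-specific and is deliberately left open here.

```python
def cycles_per_instruction(cycles, instructions):
    """CPI = cycle count divided by retired-instruction count."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return cycles / instructions


# Hypothetical counts, as might be reported for a cycles event and an
# instructions-retired event over the same run.
print(cycles_per_instruction(2_000_000, 1_000_000))
```

A higher result means more cycles are spent per instruction on average, which is the signal that prompts a closer look at stall-related events.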
+
+
Applications of OProfile
@@ -107,33 +123,30 @@ OProfile is not a panacea. OProfile might not be a complete solution when you :
Support for dynamically compiled (JIT) code
-Older versions of OProfile were not capable of attributing samples to symbols from dynamically
-compiled code, i.e. "just-in-time (JIT) code". Typical JIT compilers load the JIT code into
-anonymous memory regions. OProfile reported the samples from such code, but the attribution
-provided was simply:
- anon: <tgid><address range>
-Due to this limitation, it wasn't possible to profile applications executed by virtual machines (VMs)
-like the Java Virtual Machine. OProfile now contains an infrastructure to support JITed code.
+OProfile provides a framework to support JITed code ("just-in-time (JIT) compiled code").
A development library is provided to allow developers
-to add support for any VM that produces dynamically compiled code (see the OProfile JIT agent
+to add support for any VM (virtual machine) that produces dynamically compiled code (see the OProfile JIT agent
developer guide).
In addition, built-in support is included for the following:
JVMTI agent library for Java (1.5 and higher)
JVMPI agent library for Java (1.5 and lower)
-
+These libraries make it possible for OProfile to attribute profile samples
+to Java methods. Without a VM-specific agent library, OProfile will typically report
+samples from JITed code similar to the following example:
+ anon: <tgid><address range>
For information on how to use OProfile's JIT support, see .
-No support for virtual machine guests
OProfile currently does not support event-based profiling (i.e., using hardware events like cache misses,
-branch mispredicts) on virtual machine guests running under systems such as VMware. The list of
-supported events displayed by ophelp or 'opcontrol --list-events' is based on CPU type and does
+branch mispredicts) on virtual machine guests running under systems such as VMware.
+(Note: KVM guests are supported.) The list of
+supported events displayed by ophelp is based on CPU type and does
not take into account whether the running system is a guest system or real system. To use
-OProfile on such guest systems, you can use timer mode (see ).
+OProfile on such guest systems, you must use the legacy profiler's timer mode (see ).
@@ -147,47 +160,13 @@ OProfile on such guest systems, you can use timer mode (see Required kernel headers
- In order to build the perf_events-enabled operf program, you need to either
- install the kernel-headers package for your system or use the --with-kernel
+ Either install the kernel-headers package for your system or use the --with-kernel
configure option.
@@ -219,13 +197,6 @@ OProfile on such guest systems, you can use timer mode (see Bug tracker
There is a bug tracker for OProfile at SourceForge,
- http://sf.net/tracker/?group_id=16191&atid=116191.
+ http://sourceforge.net/p/oprofile/bugs/.
@@ -383,11 +354,6 @@ time by providing the "lapic" option to the kernel.
If you use the NMI watchdog, be aware that the watchdog is disabled when profiling starts
and not re-enabled until the profiling is stopped.
-
-Please note that you must save or have available the vmlinux file
-generated during a kernel compile, as OProfile needs it (you can use
-, but this will prevent kernel profiling).
-
@@ -406,13 +372,8 @@ remove all installed files except your configuration file in the directory Getting started with OProfile using operf
-Profiling with operf is the recommended profiling mode with OProfile. Using
-this mode not only allows you to target your profiling more precisely (i.e., single process
-or system-wide), it also allows OProfile to co-exist better with other tools on your system that
-may also be using the perf_events kernel subsystem.
-
-
-With operf, there is no initial setup needed -- simply invoke operf with
+Profiling with operf allows you to precisely target your profiling (i.e., single process
+or system-wide). With operf, there is no initial setup needed -- simply invoke operf with
the options you need; then run the OProfile post-processing tool(s). The operf syntax
is as follows:
@@ -430,61 +391,113 @@ unless you pass the --session-dir option.
-
-Getting started with OProfile using legacy mode
+
+
+Getting started with OProfile using ocount
-Before you can use OProfile's legacy mode, you must set it up. The minimum setup required for this
-is to tell OProfile where the vmlinux file corresponding to the
-running kernel is, for example :
-
-opcontrol --vmlinux=/boot/vmlinux-`uname -r`
+ocount is an OProfile tool that can be used to count native hardware events occurring in either
+a specific application, a set of processes or threads, a set of active system processors, or the
+entire system. The data collected during a counting session is displayed to stdout by default, but may
+also be saved to a file. The ocount syntax is as follows:
-If you don't want to profile the kernel itself,
-you can tell OProfile you don't have a vmlinux file :
+ocount [ options ] [ --system-wide | --process-list <pids> | --thread-list <tids> | --cpu-list <cpus> | [ command [ args ] ] ]
+
-opcontrol --no-vmlinux
+A typical usage might look like this:
-Now we are ready to start the daemon (oprofiled) which collects
-the profile data :
+ocount --events=CPU_CLK_UNHALTED,INST_RETIRED /home/user1/my_test_program my_arg
-opcontrol --start
+When my_test_program completes (or when you press Ctrl-C), counting
+stops and the results are displayed to the screen (as shown below).
-When you want to stop profiling, you can do so with :
+
+Events were actively counted for 2.8 seconds.
+Event counts (actual) for /home/user1/my_test_program:
+ Event Count % time counted
+ CPU_CLK_UNHALTED 9,408,018,070 100.00
+ INST_RETIRED 16,719,918,108 100.00
+
-opcontrol --shutdown
+
+
+
+
+Specifying performance counter events
-Note that unlike gprof, no instrumentation (
-and options to gcc)
-is necessary.
+Whether profiling with operf or doing simple event counting with ocount,
+you can collect information about one or more native hardware events using the --events
+option -- a comma-separated list of event specifications. The event specification is the means to provide details
+of how each hardware performance counter should be set up.
+For profiling, the event specification is a colon-separated string of the form
+
+as described in the table below. For ocount, the specification is of the form
+.
+Note the presence of the count field for profiling. The count field tells the profiler
+how many events should occur between successive profile snapshots (each usually referred to as a "sample"). Since
+ocount does not do sampling, the count field is not needed.
-Periodically (or on opcontrol --shutdown or opcontrol --dump)
-the profile data is written out into the $SESSION_DIR/samples directory (by default at /var/lib/oprofile/samples).
-These profile files cover shared libraries, applications, the kernel (vmlinux), and kernel modules.
-You can clear the profile data (at any time) with opcontrol --reset.
+If no event specs are passed to operf or ocount,
+the default event will be used.
-To place these sample database files in a specific directory instead of the default location
-(/var/lib/oprofile) use the option.
-You must also specify the to tell the tools to continue using this directory.
+The perf_events kernel subsystem allocates hardware counters as necessary, but some processor
+types have restrictions as to what hardware events may be counted simultaneously.
+The kernel employs a multiplexing technique when such
+hardware restrictions are encountered, such that events are monitored on a rotating basis.
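The effect of multiplexing on a reported count can be sketched as follows. This is a minimal Python illustration, assuming that a count observed while an event was scheduled for only part of the run is scaled linearly up to a 100% estimate. The function name scale_count is hypothetical, and the exact arithmetic used by the OProfile tools is not taken from this document.

```python
def scale_count(raw_count, percent_time_counted):
    """Estimate a full-run count from a count gathered during part of the run.

    Assumption: linear scaling, i.e. the event occurs at the same average
    rate while it is not being monitored as while it is.
    """
    if not 0 < percent_time_counted <= 100:
        raise ValueError("percent_time_counted must be in the range (0, 100]")
    return round(raw_count * 100.0 / percent_time_counted)


# An event that was counted for half of the run is doubled.
print(scale_count(500, 50.0))
```

When an event was counted for 100% of the run, the estimate is simply the raw count.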
+
-opcontrol --no-vmlinux --session-dir=/home/me/tmpsession
-opcontrol --start --session-dir=/home/me/tmpsession
+
+
+
+The symbolic event name, e.g. CPU_CLK_UNHALTED
+The counter reset value, e.g. 100000; use only for profiling
+The unit mask, as given in the events list: e.g. 0x0f; or a symbolic name
+if a name=<um_name> field is present
+Enable profiling of kernel code
+Enable profiling of userspace code
+
+
+
-You can get summaries of this data in a number of ways at any time. To get a summary of
-data across the entire system for all of these profiles, you can do :
+The last three values are optional; if you omit them (e.g. ),
+they will be set to the default values (i.e., the default unit mask value for the given event, and profiling (or counting)
+both kernel and userspace code will be enabled). Note that on some architectures, some events may
+require a unit mask be specified.
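As a rough illustration of the colon-separated profiling form (for example, CPU_CLK_UNHALTED:100000:0:1:1), the following Python sketch splits an event specification into its fields and fills in the defaults described above. The function name parse_event_spec and the dictionary keys are hypothetical and are not part of OProfile. The default unit mask is left as None because it depends on the event and would have to be looked up separately (for example, in ophelp output).

```python
def parse_event_spec(spec):
    """Split 'name:count[:unitmask[:kernel[:user]]]' into a dict (illustrative only)."""
    fields = spec.split(":")
    if not 2 <= len(fields) <= 5 or not fields[0]:
        raise ValueError("expected name:count[:unitmask[:kernel[:user]]]: %r" % spec)

    name = fields[0]
    count = int(fields[1])
    if count <= 0:
        raise ValueError("count must be a positive integer")

    # Unit mask: numeric (hex values start with "0x") or a symbolic name.
    unit_mask = None  # None stands for "use the event's default unit mask"
    if len(fields) > 2 and fields[2]:
        try:
            unit_mask = int(fields[2], 0)
        except ValueError:
            unit_mask = fields[2]  # keep a symbolic name as a string

    # Kernel and userspace flags both default to enabled.
    kernel = fields[3] != "0" if len(fields) > 3 else True
    user = fields[4] != "0" if len(fields) > 4 else True

    return {"name": name, "count": count, "unit_mask": unit_mask,
            "kernel": kernel, "user": user}


print(parse_event_spec("CPU_CLK_UNHALTED:100000"))
```

For ocount, which does not use the count field, a similar split would simply omit the second field.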
-opreport [--session-dir=dir]
-Or to get a more detailed summary, for a particular image, you can do something like :
+You can specify unit mask values using either a numerical value (hex values
+must begin with "0x") or a symbolic name (if the name=<um_name>
+field is shown in the ophelp output). For some named unit masks, the hex value is not unique; thus, OProfile
+tools require that such unit mask values be specified by name.
-opreport -l /boot/vmlinux-`uname -r`
-There are also a number of other ways of presenting the data, as described later in this manual.
-Note that OProfile will choose a default profiling setup for you. However, there are a number
-of options you can pass to opcontrol if you need to change something,
-also detailed later.
+The table below lists the default profiling event for various processor types. The same events
+can be used for ocount, minus the count field.
+
+
+
+Processorcpu_typeDefault event
+Alpha EV67alpha/ev67CYCLES:100000:0:1:1
+ARM/XScale PMU1arm/xscale1CPU_CYCLES:100000:0:1:1
+ARM/XScale PMU2arm/xscale2CPU_CYCLES:100000:0:1:1
+ARM/MPCorearm/mpcoreCPU_CYCLES:100000:0:1:1
+Athloni386/athlonCPU_CLK_UNHALTED:100000:0:1:1
+Pentium Proi386/pproCPU_CLK_UNHALTED:100000:0:1:1
+Pentium IIi386/piiCPU_CLK_UNHALTED:100000:0:1:1
+Pentium IIIi386/piiiCPU_CLK_UNHALTED:100000:0:1:1
+Pentium M (P6 core)i386/p6_mobileCPU_CLK_UNHALTED:100000:0:1:1
+Pentium 4 (non-HT)i386/p4GLOBAL_POWER_EVENTS:100000:1:1:1
+Pentium 4 (HT)i386/p4-htGLOBAL_POWER_EVENTS:100000:1:1:1
+Hammerx86-64/hammerCPU_CLK_UNHALTED:100000:0:1:1
+Family10hx86-64/family10CPU_CLK_UNHALTED:100000:0:1:1
+Family11hx86-64/family11hCPU_CLK_UNHALTED:100000:0:1:1
+IBM pseriesppc64/power{ 4|5|6|7|8|970 }CYCLES:100000:0:1:1
+IBM s390s390/{ z10|z196|zEC12 }HWSAMPLING:4127518:0:1:1
+
+
+
@@ -504,14 +517,14 @@ This section gives a brief description of the available OProfile utilities and t
operf
- This is the recommended program for collecting profile data.
+ This is the program for collecting profile data, discussed in .
- opcontrol
+ ocount
- Used for controlling OProfile data collection in legacy mode, discussed in .
+ This tool is used for simple event counting, as described in .
@@ -562,7 +575,7 @@ This section gives a brief description of the available OProfile utilities and t
opimport
This utility converts sample database files from a foreign binary format (abi) to
- the native format. This is useful only when moving sample files between hosts,
+ the native format. This is useful only when moving sample files between hosts
for analysis on platforms other than the one used for collection.
See .
@@ -572,8 +585,8 @@ This section gives a brief description of the available OProfile utilities and t
-
-
+
+Controlling the profiler
@@ -596,7 +609,7 @@ Additionally, each counter is programmed with a "count" value, which corresponds
detailed the profile is. The lower the value, the more frequently profile
samples are taken. You can choose to sample only kernel code, user-space code,
or both (both is the default). Finally, some events have a "unit mask"
--- this is a value that further restricts the types of event that are counted.
+-- this is a value that further restricts the type of event being counted.
You can see the event types and unit masks for your CPU using ophelp.
More information on event specification can be found at .
@@ -615,9 +628,9 @@ Following is a description of the operf options.
-
+
- The command or application to be profiled. args are the input arguments
+ The command or application to be profiled. The [args] are the input arguments
that the command or application requires. Either command, --pid or
--system-wide is required, but cannot be used simultaneously.
@@ -651,8 +664,11 @@ Following is a description of the operf options.
A vmlinux file that matches the running kernel that has symbol and/or debuginfo.
Kernel samples will be attributed to this binary, allowing post-processing tools
(like opreport) to attribute samples to the appropriate kernel symbols.
- If this option is not specified, all kernel samples will be attributed to a pseudo
- binary named "no-vmlinux".
+ If this option is not specified, the file /proc/kallsyms is used to obtain
+ kernel symbol addresses corresponding to sample addresses. However, the setting of
+ /proc/sys/kernel/kptr_restrict may restrict a non-root user's access to
+ /proc/kallsyms, in which case,
+ all kernel samples are attributed to a pseudo binary named "no-vmlinux".
@@ -738,419 +754,19 @@ Following is a description of the operf options.
- Show operf version.
-
-
-
-
-
- Show a help message.
-
-
-
-
-
-
-Using opcontrol
-
-In this section we describe the configuration and control of the profiling system
-with opcontrol in more depth. See for a description
-of the preferred profiling method.
-
-
-The opcontrol script has a default setup, but you
-can alter this with the options given below. In particular, you can select
-specific hardware events on which to base your profile. See for an
-introduction to hardware events and performance counter configuration.
-The event types and unit masks for your CPU are listed by opcontrol
---list-events or ophelp.
-
-
-The opcontrol script provides the following actions :
-
-
-
-
-
- Loads the OProfile module if required and makes the OProfile driver
- interface available.
-
-
-
-
-
- Followed by list arguments for profiling set up. List of arguments
- saved in /root/.oprofile/daemonrc.
- Giving this option is not necessary; you can just directly pass one
- of the setup options, e.g. opcontrol --no-vmlinux.
-
-
-
-
-
- Show configuration information.
-
-
-
-
-
- Start the oprofile daemon without starting actual profiling. The profiling
- can then be started using . This is useful for avoiding
- measuring the cost of daemon startup, as is a simple
- write to a file in oprofilefs.
-
-
-
-
-
- Start data collection with either arguments provided by
- or information saved in /root/.oprofile/daemonrc. Specifying
- the addition makes the daemon generate lots of debug data
- whilst it is running.
-
-
-
-
-
- Force a flush of the collected profiling data to the daemon.
-
-
-
-
-
- Stop data collection.
-
-
-
-
-
- Stop data collection and kill the daemon.
-
-
-
-
-
- Clears out data from current session, but leaves saved sessions.
-
-
-
- session_name
-
- Save data from current session to session_name.
-
-
-
-
-
- Shuts down daemon. Unload the OProfile module and oprofilefs.
-
-
-
-
-
- List event types and unit masks.
-
-
-
-
-
- Generate usage messages.
-
-
-
-
-
-There are a number of possible settings, of which, only
- (or )
-is required. These settings are stored in ~/.oprofile/daemonrc.
-
-
-
- num
-
- Number of samples in kernel buffer.
- Buffer watershed needs to be tweaked when changing this value.
-
-
-
- num
-
- Set kernel buffer watershed to num samples. When remain only
- buffer-size - buffer-watershed free entries remain in the kernel buffer, data will be
- flushed to the daemon. Most useful values are in the range [0.25 - 0.5] * buffer-size.
-
-
-
- num
-
- Number of samples in kernel per-cpu buffer. If you
- profile at high rate, it can help to increase this if the log
- file show excessive count of samples lost due to cpu buffer overflow.
-
-
-
- [eventspec]
-
- Use the given performance counter event to profile.
- See below.
-
-
-
- dir_path
-
- Create/use sample database out of directory dir_path instead of
- the default location (/var/lib/oprofile).
-
-
-
- [none,lib,kernel,thread,cpu,all]
-
- By default, every profile is stored in a single file. Thus, for example,
- samples in the C library are all accredited to the /lib/libc.o
- profile. However, you choose to create separate sample files by specifying
- one of the below options.
-
-
-
-
- No profile separation (default)
- Create per-application profiles for libraries
- Create per-application profiles for the kernel and kernel modules
- Create profiles for each thread and each task
- Create profiles for each CPU
- All of the above options
-
-
-
-
- Note that also turns on .
-
- When using , samples in hardware interrupts, soft-irqs, or other
- asynchronous kernel contexts are credited to the task currently running. This means you will see
- seemingly nonsense profiles such as /bin/bash showing samples for the PPP modules,
- etc.
-
-
- Using creates a lot
- of sample files if you leave OProfile running for a while; it's most
- useful when used for short sessions, or when using image filtering.
-
-
-
-
- #depth
-
- Enable call-graph sample collection with a maximum depth. Use 0 to disable
- callgraph profiling. NOTE: Callgraph support is available on a limited
- number of platforms at this time; for example:
-
-
- x86 with 2.6 or higher kernel
- ARM with 2.6 or higher kernel
- PowerPC with 2.6.17 or higher kernel
-
-
-
-
-
- image,[images]|"all"
-
- Image filtering. If you specify one or more absolute
- paths to binaries, OProfile will only produce profile results for those
- binary images. This is useful for restricting the sometimes voluminous
- output you may get otherwise, especially with
- . Note that if you are using
- or
- , then if you specification an
- application binary, the shared libraries and kernel code
- are included. Specify the value
- "all" to profile everything (the default).
-
-
-
- file
-
- vmlinux kernel image.
-
-
-
-
-
- Use this when you don't have a kernel vmlinux file, and you don't want
- to profile the kernel. This still counts the total number of kernel samples,
- but can't give symbol-based results for the kernel or any modules.
-
-
-
-
-
-Examples
-
-
-Intel performance counter setup
-
-Here, we have a Pentium III running at 800MHz, and we want to look at where data memory
-references are happening most, and also get results for CPU time.
-
-
-# opcontrol --event=CPU_CLK_UNHALTED:400000 --event=DATA_MEM_REFS:10000
-# opcontrol --vmlinux=/boot/2.6.0/vmlinux
-# opcontrol --start
-
-
-
-
-Starting the daemon separately
-
-Use to avoid
-the profiler startup affecting results.
-
-
-# opcontrol --vmlinux=/boot/2.6.0/vmlinux
-# opcontrol --start-daemon
-# my_favourite_benchmark --init
-# opcontrol --start ; my_favourite_benchmark --run ; opcontrol --stop
-
-
-
-
-Separate profiles for libraries and the kernel
-
-Here, we want to see a profile of the OProfile daemon itself, including when
-it was running inside the kernel driver, and its use of shared libraries.
-
-
-# opcontrol --separate=kernel --vmlinux=/boot/2.6.0/vmlinux
-# opcontrol --start
-# my_favourite_stress_test --run
-# opreport -l -p /lib/modules/2.6.0/kernel /usr/local/bin/oprofiled
-
-
-
-
-Profiling sessions
-
-It can often be useful to split up profiling data into several different
-time periods. For example, you may want to collect data on an application's
-startup separately from the normal runtime data. You can use the simple
-command opcontrol --save to do this. For example :
-
-
-# opcontrol --save=blah
-
-
-will create a sub-directory in $SESSION_DIR/samples containing the samples
-up to that point (the current session's sample files are moved into this
-directory). You can then pass this session name as a parameter to the post-profiling
-analysis tools, to only get data up to the point you named the
-session. If you do not want to save a session, you can do
-rm -rf $SESSION_DIR/samples/sessionname or, for the
-current session, opcontrol --reset.
-
-
-
-
-
-
-Specifying performance counter events
-
-Both methods of profiling (operf and opcontrol)
-allow you to give one or more event specifications to provide details of how each
-hardware performance counter should be setup. With operf, you
-can provide a comma-separated list of event specfications using the --events
-option. With opcontrol, you use the --event option
-for each desired event specification.
-The event specification is a colon-separated string of the form
-
-as described in the table below.
-
-
-If no event specs are passed to operf or opcontrol,
-the default event will be used for profiling. With opcontrol, if you have
-previously specified some non-default event but want to revert to the default event, use
-. Use of this option overrides all previous event selections
-that have been cached.
-
-
-OProfile will allocate hardware counters as necessary, but some processor
-types have restrictions as to what hardware events may be counted simultaneously.
-The operf program uses a multiplexing technique when such
-hardware restrictions are encountered, but opcontrol does
-not have this capability; instead, opcontrol will display an
-error message if you select an incompatible set of events.
-
-
-
-
-
-The symbolic event name, e.g. CPU_CLK_UNHALTED
-The counter reset value, e.g. 100000
-The unit mask, as given in the events list: e.g. 0x0f; or a symbolic name as
-given by the first word of the description (only valid for unit masks having an "extra:" parameter)
-Whether to profile kernel code
-Whether to profile userspace code
-
-
-
-
-The last three values are optional, if you omit them (e.g. ),
-they will be set to the default values (a unit mask of 0, and profiling both kernel and
-userspace code). Note that some events require a unit mask.
-
-
-When specifying a unit mask value, it may be either a hexadecimal value (which
-must begin with "0x") or a string (i.e, symbolic name) which matches
-the first word in the unit mask description. Specifying a symbolic name for
-the unit mask is valid only for unit masks having "extra:" parameters, as
-shown by the output of ophelp. Unit masks with "extra:" parameters must be
-specified using the symbolic name.
-
-
-When using legacy mode opcontrol on PowerPC platforms, all events specified must be in the same group;
-i.e., the group number appended to the event name (e.g. <some-event-name>_GRP9
-) must be the same.
-
-
-If OProfile is using timer-interrupt mode, there is no event configuration possible.
-
-
-The table below lists the default event for various processor types:
-
-
-
-
-Processorcpu_typeDefault event
-Alpha EV4alpha/ev4CYCLES:100000:0:1:1
-Alpha EV5alpha/ev5CYCLES:100000:0:1:1
-Alpha PCA56alpha/pca56CYCLES:100000:0:1:1
-Alpha EV6alpha/ev6CYCLES:100000:0:1:1
-Alpha EV67alpha/ev67CYCLES:100000:0:1:1
-ARM/XScale PMU1arm/xscale1CPU_CYCLES:100000:0:1:1
-ARM/XScale PMU2arm/xscale2CPU_CYCLES:100000:0:1:1
-ARM/MPCorearm/mpcoreCPU_CYCLES:100000:0:1:1
-AVR32avr32CPU_CYCLES:100000:0:1:1
-Athloni386/athlonCPU_CLK_UNHALTED:100000:0:1:1
-Pentium Proi386/pproCPU_CLK_UNHALTED:100000:0:1:1
-Pentium IIi386/piiCPU_CLK_UNHALTED:100000:0:1:1
-Pentium IIIi386/piiiCPU_CLK_UNHALTED:100000:0:1:1
-Pentium M (P6 core)i386/p6_mobileCPU_CLK_UNHALTED:100000:0:1:1
-Pentium 4 (non-HT)i386/p4GLOBAL_POWER_EVENTS:100000:1:1:1
-Pentium 4 (HT)i386/p4-htGLOBAL_POWER_EVENTS:100000:1:1:1
-Hammerx86-64/hammerCPU_CLK_UNHALTED:100000:0:1:1
-Family10hx86-64/family10CPU_CLK_UNHALTED:100000:0:1:1
-Family11hx86-64/family11hCPU_CLK_UNHALTED:100000:0:1:1
-Itaniumia64/itaniumCPU_CYCLES:100000:0:1:1
-Itanium 2ia64/itanium2CPU_CYCLES:100000:0:1:1
-TIMER_INTtimerNone selectable
-IBM pseriesPowerPC 4/5/6/7/970/CellCYCLES:100000:0:1:1
-IBM s390timerNone selectable
-IBM s390xtimerNone selectable
-
-
-
-
+ Show operf version.
+
+
+
+
+
+ Show a help message.
+
+
+
-
+
+
Setting up the JIT profiling feature
@@ -1196,33 +812,6 @@ The table below lists the default event for various processor types:
-
-Using oprof_start
-
-The oprof_start application provides a convenient way to start the profiler.
-Note that oprof_start is just a wrapper around the opcontrol script,
-so it does not provide more services than the script itself.
-
-
-After oprof_start is started you can select the event type for each counter;
-the sampling rate and other related parameters are explained in .
-The "Configuration" section allows you to set general parameters such as the buffer size, kernel filename
-etc. The counter setup interface should be self-explanatory; and related
-links contain information on using unit masks.
-
-
-A status line shows the current status of the profiler: how long it has been running, and the average
-number of interrupts received per second and the total, over all processors.
-Note that quitting oprof_start does not stop the profiler.
-
-
-Your configuration is saved in the same file as opcontrol uses; that is,
-~/.oprofile/daemonrc.
-
-
-oprof_start does not currently support operf.
-
-Configuration details
@@ -1237,7 +826,8 @@ events other than the default event chosen by OProfile.
Your CPU type may not include the requisite support for hardware performance counters, in which case
-you must use OProfile in timer mode (see ).
+you must use OProfile in timer mode (see ), which is only available in
+OProfile releases prior to 1.0.
@@ -1252,69 +842,73 @@ https://www.power.org/events/Power7 contains specific information on the
monitor unit for the IBM POWER7.
-These processors are capable of delivering an interrupt when a counter overflows.
+A physical performance monitor counter (PMC) is configured by a profiling tool to count a particular
+type of event. When the counter overflows, an interrupt is delivered to the processor.
This is the basic mechanism on which OProfile is based. The delivery mode is NMI,
so blocking interrupts in the kernel does not prevent profiling. When the interrupt handler is called,
-the current PC value and the current task are recorded into the profiling structure.
-This allows the overflow event to be attached to a specific assembly instruction in a binary image.
-OProfile receives this data from the kernel and writes it to the sample files.
+the current PC (program counter) value and the current task are recorded into the profiling structure.
+This allows the overflow event to be attributed to a specific assembly instruction in a specific binary image.
+OProfile receives this data (commonly referred to as a "sample") from the kernel and writes it to the sample files.
If we use an event such as CPU_CLK_UNHALTED or INST_RETIRED
(GLOBAL_POWER_EVENTS or INSTR_RETIRED, respectively, on the Pentium 4), we can
-use the overflow counts as an estimate of actual time spent in each part of code. Alternatively we can profile interesting
+use the overflow counts (samples) as an estimate of actual time spent in each part of code. Alternatively we can profile interesting
data such as the cache behaviour of routines with the other available counters.
However there are several caveats. First, there are those issues listed in the Intel manual. There is a delay
between the counter overflow and the interrupt delivery that can skew results on a small scale - this means
you cannot rely on the profiles at the instruction level as being perfectly accurate.
-If you are using an "event-mode" counter such as the cache counters, a count registered against it doesn't mean
-that it is responsible for that event. However, it implies that the counter overflowed in the dynamic
-vicinity of that instruction, to within a few instructions. Further details on this problem can be found in
+For example, if you are profiling an application with an event that counts L1 cache misses, a sample attributed
+to a particular instruction in the application doesn't necessarily mean that exact instruction is responsible
+for that event; instead, it means the sample was taken in the dynamic vicinity of that instruction,
+usually with a margin of error of a few instructions. Further details on this problem can be found in
and also in the Digital paper "ProfileMe: A Hardware Performance Counter".
-Each counter has several configuration parameters.
-First, there is the unit mask: this simply further specifies what to count.
-Second, there is the counter value, discussed below. Third, there is a parameter whether to increment counts
+Each counter has several configuration parameters besides the type of event to count.
+First, there is the unit mask, which is used to further qualify exactly what to count.
+Second, there is the count field, discussed below. Third, there are parameters
+to specify whether to increment counts
whilst in kernel or user space. You can configure these separately for each counter.
-After each overflow event, the counter will be re-initialized
-such that another overflow will occur after this many events have been counted. Thus, higher
-values mean less-detailed profiling, and lower values mean more detail, but higher overhead.
-Picking a good value for this
-parameter is, unfortunately, somewhat of a black art. It is of course dependent on the event
-you have chosen.
+When the profiler is initially set up, a performance monitor counter is chosen for counting the
+event, and it is initialized using the count value.
+Once profiling begins, the counter increments with each event detected, and the counter
+overflows when the count value is reached.
+As described above, the counter overflow generates an interrupt, and the sample is recorded.
+After each overflow event, the counter is re-initialized using the count value,
+and counting begins anew for the next sample. Higher values for count
+result in samples being taken less frequently, and therefore less-detailed (and, potentially,
+less accurate) profiling. Lower values mean more detail, but higher overhead.
+Picking a good value for this parameter is, unfortunately, somewhat of a black art. It is
+of course dependent on the event you have chosen.
Specifying too large a value will mean not enough interrupts are generated
-to give a realistic profile (though this problem can be ameliorated by profiling for longer).
-Specifying too small a value can lead to higher performance overhead.
+to give a realistic profile (though this problem can be ameliorated by profiling for
+longer time periods). Specifying too small a value can lead to higher performance overhead.
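To get a feel for this trade-off, the expected sample rate can be estimated from the event rate and the count value. The sketch below is a back-of-the-envelope Python calculation. The helper name samples_per_second is hypothetical, and the 800 MHz clock rate and count of 400000 are illustrative values only, assuming a cycle-counting event that advances once per clock cycle while the CPU is busy.

```python
def samples_per_second(events_per_second, count):
    """Estimate how many samples per second a given count value produces.

    One sample is taken each time the counter has seen `count` events,
    so the sample rate is the event rate divided by the count.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return events_per_second / count


# A busy 800 MHz CPU sampled every 400000 cycles.
print(samples_per_second(800_000_000, 400_000))  # 2000.0
```

Lowering the count to 100000 on the same machine would quadruple the sample rate, and with it the interrupt overhead.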
-OProfile in timer interrupt mode
-
-Some CPU types do not provide the needed hardware support to use the hardware performance counters. This includes
-some laptops, classic Pentiums, and other CPU types not yet supported by OProfile (such as Cyrix).
-On these machines, OProfile falls back to using the timer interrupt for profiling,
-back to using the real-time clock interrupt to collect samples. In timer mode, OProfile
-is not able to profile code that has interrupts disabled.
-
+OProfile timer interrupt mode
-You can force use of the timer interrupt by using the module
-parameter (or on the boot command line if OProfile is
-built-in). If OProfile was built as a kernel module, then you must pass the 'timer=1'
-parameter with the modprobe command. Do this before executing 'opcontrol --init' or
-edit the opcontrol command's invocation of modprobe to pass the 'timer=1' parameter.
-
-Timer mode is only available using the legacy opcontrol command.
+Some CPU types do not provide the needed hardware support for hardware performance counters.
+Additionally, some older architectures are not supported by the perf_events kernel subsystem.
+On such machines, the operf and ocount commands will exit with a message indicating the
+processor type is not supported. However, you can install OProfile 0.9.9 and use the legacy
+opcontrol-based profiler, which will fall back to using timer interrupts for profiling.
+Note that in timer mode, OProfile is not able to profile code that has interrupts disabled.
+Timer mode is only available using the legacy opcontrol command,
+available in releases prior to 1.0.
-
+
+Architecture-specific configuration notes
+Pentium 4 support
The Pentium 4 / Xeon performance counters are organized around 3 types of model specific registers (MSRs): 45 event
@@ -1330,338 +924,46 @@ There is currently no support for Precision Event-Based Sampling (PEBS), nor any
(DS). Current support is limited to the conservative extension of OProfile's existing interrupt-based model described
above.
-
-
-
-Intel Itanium 2 support
-
-The Itanium 2 performance monitoring unit (PMU) organizes the counters as four
-pairs of performance event monitoring registers. Each pair is composed of a
-Performance Monitoring Configuration (PMC) register and Performance Monitoring
-Data (PMD) register. The PMC selects the performance event being monitored and
-the PMD determines the sampling interval. The IA64 Performance Monitoring Unit
-(PMU) triggers sampling with maskable interrupts. Thus, samples will not occur
-in sections of the IA64 kernel where interrupts are disabled.
-
-
-None of the advance features of the Itanium 2 performance monitoring unit
-such as opcode matching, address range matching, or precise event sampling are
-supported by this version of OProfile. The Itanium 2 support only maps OProfile's
-existing interrupt-based model to the PMU hardware.
-
-
+
-
+PowerPC64 support
The performance monitoring unit (PMU) for the IBM PowerPC 64-bit processors
-consists of between 4 and 8 counters (depending on the model), plus three
-special purpose registers used for programming the counters -- MMCR0, MMCR1,
-and MMCRA. Advanced features such as instruction matching and thresholding are
-not supported by this version of OProfile.
-Later versions of the IBM POWER5+ processor (beginning with revision 3.0)
-run the performance monitor unit in POWER6 mode, effectively removing OProfile's
-access to counters 5 and 6. These two counters are dedicated to counting
-instructions completed and cycles, respectively. In POWER6 mode, however, the
-counters do not generate an interrupt on overflow and so are unusable by
-OProfile. Kernel versions 2.6.23 and higher will recognize this mode
-and export "ppc64/power5++" as the cpu_type to the oprofilefs pseudo filesystem.
-OProfile userspace responds to this cpu_type by removing these counters from
-the list of potential events to count. Without this kernel support, attempts
-to profile using an event from one of these counters will yield incorrect
-results -- typically, zero (or near zero) samples in the generated report.
-
-
-
-
-
-
-Cell Broadband Engine support
-
-The Cell Broadband Engine (CBE) processor core consists of a PowerPC Processing
-Element (PPE) and 8 Synergistic Processing Elements (SPE). PPEs and SPEs each
-consist of a processing unit (PPU and SPU, respectively) and other hardware
-components, such as memory controllers.
-
-
-A PPU has two hardware threads (aka "virtual CPUs"). The performance monitor
-unit of the CBE collects event information on one hardware thread at a time.
-Therefore, when profiling PPE events,
-OProfile collects the profile based on the selected events by time slicing the
-performance counter hardware between the two threads. The user must ensure the
-collection interval is long enough so that the time spent collecting data for
-each PPU is sufficient to obtain a good profile.
-
-
-To profile an SPU application, the user should specify the SPU_CYCLES event.
-When starting OProfile with SPU_CYCLES, the opcontrol script enforces certain
-separation parameters (separate=cpu,lib) to ensure that sufficient information
-is collected in the sample data in order to generate a complete report. The
---merge=cpu option can be used to obtain a more readable report if analyzing
-the performance of each separate SPU is not necessary.
-
-
-Profiling with an SPU event (events 4100 through 4163) is not compatible with any other
-event. Further more, only one SPU event can be specified at a time. The hardware only
-supports profiling on one SPU per node at a time. The OProfile kernel code time slices
-between the eight SPUs to collect data on all SPUs.
-
-
-SPU profile reports have some unique characteristics compared to reports for
-standard architectures:
-
-
-Typically no "app name" column. This is really standard OProfile behavior
-when the report contains samples for just a single application, which is
-commonly the case when profiling SPUs.
-"CPU" equates to "SPU"
-Specifying '--long-filenames' on the opreport command does not always result
-in long filenames. This happens when the SPU application code is embedded in
-the PPE executable or shared library. The embedded SPU ELF data contains only the
-short filename (i.e., no path information) for the SPU binary file that was used as
-the source for embedding. The reason that just the short filename is used is because
-the original SPU binary file may not exist or be accessible at runtime. The performance
-analyst must have sufficient knowledge of the application to be able to correlate the
-SPU binary image names found in the report to the application's source files.
-
-Compile the application with -g and generate the OProfile report
-with -g to facilitate finding the right source file(s) on which to focus.
-
-
-
-
-
-
-
-AMD64 (x86_64) Instruction-Based Sampling (IBS) support
-
-
-Instruction-Based Sampling (IBS) is a new performance measurement technique
-available on AMD Family 10h processors. Traditional performance counter
-sampling is not precise enough to isolate performance issues to individual
-instructions. IBS, however, precisely identifies instructions which are not
-making the best use of the processor pipeline and memory hierarchy.
-For more information, please refer to the "Instruction-Based Sampling:
-A New Performance Analysis Technique for AMD Family 10h Processors" (
-
-http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf).
-There are two types of IBS profile types, described in the following sections.
-Profiling on IBS events is only supported with legacy mode profiling
-(i.e., with opcontrol).
-
-
-
-IBS Fetch
-
-
-IBS fetch sampling is a statistical sampling method which counts completed
-fetch operations. When the number of completed fetch operations reaches the
-maximum fetch count (the sampling period), IBS tags the fetch operation and
-monitors that operation until it either completes or aborts. When a tagged
-fetch completes or aborts, a sampling interrupt is generated and an IBS fetch
-sample is taken. An IBS fetch sample contains a timestamp, the identifier of
-the interrupted process, the virtual fetch address, and several event flags
-and values that describe what happened during the fetch operation.
-
-
-
-
-
-IBS Op
-
-
-IBS op sampling selects, tags, and monitors macro-ops as issued from AMD64
-instructions. Two options are available for selecting ops for sampling:
-
-
-
-
-Cycles-based selection counts CPU clock cycles. The op is tagged and monitored
-when the count reaches a threshold (the sampling period) and a valid op is
-available.
-
-
-
-Dispatched op-based selection counts dispatched macro-ops.
-When the count reaches a threshold, the next valid op is tagged and monitored.
-
-
-
-
-In both cases, an IBS sample is generated only if the tagged op retires.
-Thus, IBS op event information does not measure speculative execution activity.
-The execution stages of the pipeline monitor the tagged macro-op. When the
-tagged macro-op retires, a sampling interrupt is generated and an IBS op
-sample is taken. An IBS op sample contains a timestamp, the identifier of
-the interrupted process, the virtual address of the AMD64 instruction from
-which the op was issued, and several event flags and values that describe
-what happened when the macro-op executed.
+consists of between 4 and 8 counters (depending on the model). Advanced features
+such as instruction matching and thresholding are not supported by OProfile.
-
-
-Enabling IBS profiling is done simply by specifying IBS performance events
-through the "--event=" options. These events are listed in the
-opcontrol --list-events.
-
-
-
-opcontrol --event=IBS_FETCH_XXX:<count>:<um>:<kernel>:<user>
-opcontrol --event=IBS_OP_XXX:<count>:<um>:<kernel>:<user>
-
-Note: * All IBS fetch event must have the same event count and unitmask,
- as do those for IBS op.
-
-
-
-IBM System z hardware sampling support
-
-IBM System z provides a facility which does instruction sampling as
-part of the CPU. This has great advantages over the timer based
-sampling approach like better sampling resolution with less overhead
-and the possibility to get samples within code sections where
-interrupts are disabled (useful especially for Linux kernel code).
-
-Profiling with the instruction sampling facility is currently only supported
-with legacy mode profiling (i.e., with opcontrol).
-
-A public description of the System z CPU-Measurement Facilities can be
-found here:
-The Load-Program-Parameter and CPU-Measurement Facilities
-
-
-System z hardware sampling can be used for Linux instances in LPAR
-mode. The hardware sampling support used by OProfile was introduced
-for System z10 in October 2008.
-
-
-To enable hardware sampling for an LPAR you must activate the LPAR
-with authorization for basic sampling control. See the "Support
-Element Operations Guide" for your mainframe system for more
-information.
-
-
-The hardware sampling facility can be enabled and disabled using the
-event interface. A `virtual' counter 0 has been defined that only supports
-a single event, HWSAMPLING. By default the HWSAMPLING event is
-enabled on machines providing the facility. For both events only the
-`count', `kernel' and `user' options are evaluated by the kernel
-module.
-
-
-The `count' value is the sampling rate as it is passed to the CPU
-measurement facility. A sample will be taken by the hardware every
-`count' cycles. Using low values here will quickly fill up the
-sampling buffers and will generate CPU load on the OProfile daemon and
-the kernel module being busy flushing the hardware buffers. This
-might considerably impact the workload to be profiled.
-
-
-The unit mask `um' is required to be zero.
-
-
-The opcontrol tool provides a new option specific to System z
-hardware sampling:
-
-
-
---s390hwsampbufsize="num": Number of 2MB areas
-used per CPU for storing sample data. The best
-size for the sample memory depends on the particular system and the
-workload to be measured. Providing the sampler with too little memory
-results in lost samples. Reserving too much system memory for the
-sampler impacts the overall performance and, hence, also the workload
-to be measured.
-
-
-
-A special counter /dev/oprofile/timer is provided
-by the kernel module allowing to switch back to timer mode sampling
-dynamically. The TIMER event is limited to be used only with this
-counter. The TIMER event can be specified using the
- as with every other event.
-
-opcontrol --event=TIMER:1
-
-On z10 or later machines the default event is set to TIMER in case the
-hardware sampling facility is not available.
-
-
-Although required, the 'count' parameter of the TIMER event is
-ignored. The value may eventually be used for timer based sampling
-with a configurable sampling frequency, but this is currently not
-supported.
-
-
-
-
-Dangerous counter settings
-
-OProfile is a low-level profiler which allows continuous profiling with a low-overhead cost.
-When using OProfile legacy mode profiling, it may be possible to configure such a low a counter reset value
-(i.e., high sampling rate) that the system can become overloaded with counter interrupts and your
-system's responsiveness may be severely impacted. Whilst some validation is done on the count
-values you pass to opcontrol with your event specification, it is not foolproof.
-
-
-This can happen as follows: When the profiler count
-reaches zero, an NMI handler is called which stores the sample values in an internal buffer, then resets the counter
-to its original value. If the reset count you specified is very low, a pending NMI can be sent before the NMI handler has
-completed. Due to the priority of the NMI, the pending interrupt is delivered immediately after
-completion of the previous interrupt handler, and control never returns to other parts of the system.
-If all processors are stuck in this mode, the system will appear to be frozen.
-
-If this happens, it will be impossible to bring the system back to a workable state.
-There is no way to provide real security against this happening, other than making sure to use a reasonable value
-for the counter reset. For example, setting CPU_CLK_UNHALTED event type with a ridiculously low reset count (e.g. 500)
-is likely to freeze the system.
-
-
-In short : Don't try a foolish sample count value. Unfortunately the definition of a foolish value
-is really dependent on the event type. If ever in doubt, post a message to oprofile-list@lists.sf.net.
-
-
-The scenario described above cannot occur if you use operf for profiling instead of
-opcontrol, because the perf_events kernel subsystem automatically detects when performance monitor
-interrupts are arriving at a dangerous level and will throttle back the sampling rate.
-
-
-
+
-Obtaining results
-
-OK, so the profiler has been running, but it's not much use unless we can get some data out. Sometimes,
-OProfile does a little too good a job of keeping overhead low, and no data reaches
-the profiler. This can happen on lightly-loaded machines. If you're using OPorifle legacy mode, you can
-force a dump at any time with :
-
-opcontrol --dump
-This ensures that any profile data collected by the oprofiled daemon has been flusehd
-to disk. Remember to do a dump, stop, shutdown, or deinit
-before complaining there is no profiling data!
-
+Obtaining profiling results
-Now that we've got some data, it has to be processed. That's the job of opreport,
-opannotate, or opgprof.
+After collecting profile data, the raw data must undergo special processing in order for you to
+perform your analysis. The analysis tools that perform this special processing are
+opreport, opannotate, and opgprof.
+Additionally, the oparchive command is used to gather together profile
+data, sampled binary files, etc. for the purpose of off-line analysis. While
+not really an analysis tool, oparchive is put in that category
+for convenience since it takes many of the same options as the other analysis tools.
Profile specifications
-All of the analysis tools take a profile specification.
-This is a set of definitions that describe which actual profiles should be
+All of the analysis tools take a profile specification
+as an input argument.
+This is a set of definitions that describes the specific profile data that should be
examined. The simplest profile specification is empty: this will match all
-the available profile files for the current session (this is what happens
-when you do opreport).
+the available profile files for the current session.
Specification parameters are of the form .
@@ -1669,10 +971,11 @@ For example, if I wanted to get a combined symbol summary for
/bin/myprog and /bin/myprog2,
I could do opreport -l image:/bin/myprog,/bin/myprog2.
As a special case, you don't actually need to specify the
-part here: anything left on the command line is assumed to be an
+part of the specification. Anything left on the command line after all other
+opreport options have been processed is assumed to be an
name. Similarly, if no
is specified, then is assumed ("current"
-is a special name of the current / last profiling session).
+is a special name of the current (i.e., most recent) profiling session).
In addition to the comma-separated list shown above, some of the
@@ -1779,10 +1082,7 @@ Differential profile of an archived binary with the current session :
imagelist
Same as , but only for images that are for
- a particular primary binary image (namely, an application). This only
- makes sense to use if you're using .
- This includes kernel modules and the kernel when using
- .
+ a particular primary binary image (namely, an application).
@@ -1799,7 +1099,6 @@ Differential profile of an archived binary with the current session :
The symbolic event name to match on, e.g. .
You can pass a list of events for side-by-side comparison with opreport.
- When using the timer interrupt, the event is always "TIMER".
@@ -1808,11 +1107,10 @@ Differential profile of an archived binary with the current session :
The event count to match on, e.g. .
Note that this value refers to the count value in the event spec you passed
- to opcontrol or operf when setting up to do a
+ to operf when setting up to do a
profile run. It has nothing to do with the sample counts in the profile data
itself.
You can pass a list of events for side-by-side comparison with opreport.
- When using the timer interrupt, the count is always 0 (indicating it cannot be set).
@@ -1861,9 +1159,8 @@ Differential profile of an archived binary with the current session :
Locating and managing binary images
-Each session's sample files can be found in the $SESSION_DIR/samples/ directory (default when
-using legacy mode: /var/lib/oprofile/samples/; default when using
-operf: <cur_dir>/oprofile_data/samples/).
+Each session's sample files can be found in the $SESSION_DIR/samples/ directory (default
+for operf is <cur_dir>/oprofile_data/samples/).
These are used, along with the binary image files, to produce human-readable data.
In some circumstances (e.g., kernel modules), OProfile
will not be able to find the binary images. All the tools have an
@@ -1910,19 +1207,14 @@ taken per second.
application spent most of its time in libraries
Similarly, if the application spends little time in the main binary image
itself, with most of it spent in shared libraries it uses, you might
-not see any samples for the binary image (i.e., executable) itself. If you're
-using OProfile legacy mode profiling, then we recommend using
-opcontrol --separate=lib before the
-profiling session so that opreport and friends show
-the library profiles on a per-application basis. This is done automatically
-when profiling with operf, so no special setup is necessary.
+not see any samples for the binary image (i.e., executable) itself.
specification was really too strict
For example, you specified something like ,
but no task with that group ID ever ran the code.
application didn't generate any events
-If you're using a particular event counter, for example counting MMX
+If you're profiling a particular event, for example counting MMX
operations, the code might simply have not generated any events in the
first place. Verify the code you're profiling does what you expect it
to.
@@ -1946,52 +1238,66 @@ The opreport utility is the primary utility you will use for
getting formatted data out of OProfile. It produces two types of data: image summaries
and symbol summaries. An image summary lists the number of samples for individual
binary images such as libraries or applications. Symbol summaries provide per-symbol
-profile data. In the following example, we're getting an image summary for the whole
+profile data. In the following truncated example, we see an image summary for the whole
system:
$ opreport --long-filenames
-CPU: PIII, speed 863.195 MHz (estimated)
-Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 23150
- 905898 59.7415 /usr/lib/gcc-lib/i386-redhat-linux/3.2/cc1plus
- 214320 14.1338 /boot/2.6.0/vmlinux
- 103450 6.8222 /lib/i686/libc-2.3.2.so
- 60160 3.9674 /usr/local/bin/madplay
- 31769 2.0951 /usr/local/oprofile-pp/bin/oprofiled
- 26550 1.7509 /usr/lib/libartsflow.so.1.0.0
- 23906 1.5765 /usr/bin/as
- 18770 1.2378 /oprofile
- 15528 1.0240 /usr/lib/qt-3.0.5/lib/libqt-mt.so.3.0.5
- 11979 0.7900 /usr/X11R6/bin/XFree86
- 11328 0.7471 /bin/bash
+CPU: Intel Sandy Bridge microarchitecture, speed 2401 MHz (estimated)
+Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
+CPU_CLK_UNHALT...|
+ samples| %|
+------------------
+ 22577 28.9011 /usr/bin/Xorg
+ CPU_CLK_UNHALT...|
+ samples| %|
+ ------------------
+ 16846 74.6158 /proc/kallsyms
+ 2126 9.4167 /usr/bin/Xorg
+ 763 3.3795 /usr/lib64/libpixman-1.so.0.26.2
+ ...
+ 17402 22.2766 /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.55.x86_64/jre/bin/java
+ CPU_CLK_UNHALT...|
+ samples| %|
+ ------------------
+ 5666 32.5595 anon (tgid:29664 range:0x7f3475000000-0x7f347616ffff)
+ 2312 13.2858 /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.55.x86_64/jre/lib/amd64/server/libjvm.so
+ ...
+ 11554 14.7904 /home/user1/oprof-install/bin/operf
+ CPU_CLK_UNHALT...|
+ samples| %|
+ ------------------
+ 7467 64.6270 /proc/kallsyms
+ 1691 14.6356 /usr/bin/operf
+ 1324 11.4592 /lib64/libc-2.12.so
+ 455 3.9380 /usr/lib64/libstdc++.so.6.0.13
+ 315 2.7263 /ext4
+ ...
...
If we had specified in the previous command, we would have
gotten a symbol summary of all the images across the entire system. We can restrict this to only
part of the system profile; for example,
-below is a symbol summary of the OProfile daemon. Note that as we used
-opcontrol --separate=lib,kernel, symbols from images that oprofiled
-has used are also shown.
+below is a symbol summary for the operf program used to collect the profile.
-$ opreport -l -p /lib/modules/`uname -r` `which oprofiled` 2>/dev/null | more
-CPU: Core 2, speed 2.534e+06 MHz (estimated)
-Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
+$ opreport -l -p /lib/modules/`uname -r` `which operf` 2>/dev/null | more
+CPU: Intel Sandy Bridge microarchitecture, speed 2401 MHz (estimated)
+Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name symbol name
-1353 24.9447 vmlinux sidtab_context_to_sid
-500 9.2183 vmlinux avtab_hash_eval
-154 2.8392 vmlinux __link_path_walk
-152 2.8024 vmlinux d_prune_aliases
-120 2.2124 vmlinux avtab_search_node
-104 1.9174 vmlinux find_next_bit
-85 1.5671 vmlinux selinux_file_fcntl
-82 1.5118 vmlinux avtab_write
-81 1.4934 oprofiled odb_update_node_with_offset
-73 1.3459 oprofiled opd_process_samples
-72 1.3274 vmlinux avc_has_perm_noaudit
-61 1.1246 libc-2.12.so _IO_vfscanf
-59 1.0878 ext4.ko ext4_mark_iloc_dirty
+860 7.4607 kallsyms avtab_search_node
+474 4.1121 operf OP_perf_utils::op_write_event(event_union*, unsigned long long)
+461 3.9993 kallsyms avc_has_perm_noaudit
+455 3.9473 libstdc++.so.6.0.13 /usr/lib64/libstdc++.so.6.0.13
+412 3.5742 libc-2.12.so _IO_vfscanf
+369 3.2012 kallsyms __d_lookup
+350 3.0363 kallsyms sidtab_context_to_sid
+274 2.3770 operf OP_perf_utils::op_record_process_exec_mmaps(int, int, int, operf_record*)
+232 2.0127 operf operf_process_info::find_mapping_for_sample(unsigned long long, bool)
+222 1.9259 kallsyms __link_path_walk
+191 1.6570 kallsyms pipe_read
+34 0.2950 ext4.ko ext4_mark_iloc_dirty
...
@@ -2007,8 +1313,8 @@ If you have used one of the options
whilst profiling, there can be several separate profiles for
a single binary image within a session. Normally the output
will keep these images separated. So, for example, if you profiled
-with separation on a per-cpu basis (opcontrol --separate=cpu or
-operf --separate-cpu), you would see separate columns in
+with separation on a per-cpu basis (operf --separate-cpu),
+you would see separate columns in
the output of opreport for each CPU where samples
were recorded. But it can be useful to merge these results back together
to make the report more readable. The option allows
@@ -2120,7 +1426,7 @@ linkend="interpreting-callgraph" /> for an explanation.
-Callgraph and JIT support
+Callgraph is not supported with JIT samples
Callgraph output where anonymously mapped code is in the callstack can sometimes be misleading.
For all such code, the samples for the anonymously mapped code are stored in a samples subdirectory
@@ -2182,13 +1488,12 @@ A typical way to use this feature is with archives created with
oparchive. Let's look at an example:
-$ ./a
+$ operf ./a
$ oparchive -o orig ./a
-$ opcontrol --reset
# edit and recompile a
-$ ./a
+$ operf ./a
# now compare the current profile of a with the archived profile
-$ opreport -xl ./a { archive:./orig } { }
+$ opreport --session-dir=`pwd`/oprofile_data/ -xl ./a { archive:./orig } { }
CPU: PIII, speed 863.233 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a
unit mask of 0x00 (No unit mask) count 100000
@@ -2257,8 +1562,7 @@ samples % image name symbol name
Note that, since such mappings are dependent upon individual invocations of
-a binary, these mappings are always listed as a dependent image,
-even when using the legacy mode command.
+a binary, these mappings are always listed as a dependent image.
Equally, the results are not affected by the
option.
@@ -2319,8 +1623,7 @@ offsets for the image binary.
Do not include application-specific images for libraries, kernel modules
-and the kernel. This option only makes sense if the profile session
-used --separate.
+and the kernel.
Exclude all the symbols in the given comma-separated list.
@@ -2356,9 +1659,13 @@ Output to the given file instead of stdout.
Reverse the sort from the default.
-dir_path
-Use sample database out of directory dir_path
-instead of the default location (/var/lib/oprofile).
+
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then opreport will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
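
The search order described above can be sketched as a small shell function. This is only an illustration of the documented lookup order, not the tools' actual implementation; the function name resolve_session_dir and its two arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper mirroring the documented lookup order; not part of OProfile.
# $1: value given to --session-dir (empty if the option was not passed)
# $2: directory the analysis tool was started from
resolve_session_dir() {
    if [ -n "$1" ]; then
        # An explicit --session-dir always wins.
        printf '%s\n' "$1"
    elif [ -d "$2/oprofile_data" ]; then
        # Otherwise prefer <cur_dir>/oprofile_data when it exists.
        printf '%s\n' "$2/oprofile_data"
    else
        # Fall back to the standard session-dir.
        printf '%s\n' "/var/lib/oprofile"
    fi
}
```

For example, calling resolve_session_dir "" "$PWD" in a directory without an oprofile_data subdirectory prints /var/lib/oprofile.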
Show the VMA address of each symbol (off by default).
@@ -2373,7 +1680,8 @@ List per-symbol information instead of a binary image summary.
Only output data for symbols that have more than the given percentage
-of total samples.
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the symbol is shown.
Give verbose debugging output.
@@ -2502,11 +1810,19 @@ pattern-matching to make C++ symbol demangling more readable.
Do not include application-specific images for libraries, kernel modules
-and the kernel. This option only makes sense if the profile session
-used --separate.
+and the kernel.
Exclude all files in the given comma-separated list of glob patterns.
+This option is supported solely with the --source
+option. It can be used to filter out source files in the output using the
+following types of specifications:
+
+filenames (basename -- i.e., no path)
+filename glob specifications (all files whose base filename matches the given pattern)
+directory segments (all source files located in the specified directory; e.g. "libio")
+directory segment glob specifications (e.g., "libi*")
+
Exclude all the symbols in the given comma-separated list.
@@ -2523,6 +1839,7 @@ A path to a filesystem to search for additional binaries.
Only include files in the given comma-separated list of glob patterns.
+The same rules apply for this option as for the --exclude-file option.
Only include symbols in the given comma-separated list.
@@ -2561,9 +1878,23 @@ source files when the debug information only contains relative paths.
Output annotated source. This requires debugging information to be available
for the binaries.
+
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then opannotate will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
+
-Only output data for symbols that have more than the given percentage
-of total samples.
+For annotated assembly, only output data for symbols that have more than the given percentage
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the symbol is shown.
+
+
+For annotated source, only output data for source files that have more than the given percentage
+of total samples. For profiles using multiple events, if the threshold is reached
+for any event, then all sample data for the source file is shown.
Give verbose debugging output.
@@ -2683,6 +2014,14 @@ of total samples.
Give verbose debugging output.
+
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then opgprof will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
+
Show version.
@@ -2694,18 +2033,18 @@ Show version.
-Archiving measurements (oparchive)
+Analyzing profile data on another system (oparchive)
The oparchive utility generates a directory populated
with executable, debug, and oprofile sample files. This directory can be
- moved to another machine via tar and analyzed without
- further use of the data collection machine.
+ copied to another (host) machine and analyzed offline, with no further need to
+ access the data collection machine (target).
- The following command would collect the sample files, the executables
- associated with the sample files, and the debuginfo files associated
- with the executables and copy them into
+ The following command, executed on the target system, will collect the
+ sample files, the executables associated with the sample files, and the
+ debuginfo files associated with the executables and copy them into
/tmp/current_data:
@@ -2713,6 +2052,59 @@ Show version.
# oparchive -o /tmp/current_data
+
+ When transferring archived profile data to a host machine for offline analysis,
+ you need to determine if the oprofile ABI format is compatible between the
+ target system and the host system; if it isn't, you must run the opimport
+ command to convert the target's sample data files to the format of your host system.
+ See for more details.
+
+
+
+ After your profile data is transferred to the host system and (if necessary)
+ you have run the opimport command to convert the file
+ format, you can now run the opreport and
+ opannotate commands. However, you must provide an
+ "archive specification" to let these post-processing tools know where to find
+   the profile data (sample files, executables, etc.); for example:
+
+
+
+# opreport archive:/home/user1/my_oprofile_archive --symbols
+
+
+
+ Furthermore, if your profile was collected on your target system into a session-dir
+ other than /var/lib/oprofile, the oparchive
+ command will display a message similar to the following:
+
+
+
+# NOTE: The sample data in this archive is located at /home/user1/test-stuff/oprofile_data
+instead of the standard location of /var/lib/oprofile. Hence, when using opreport
+and other post-processing tools on this archive, you must pass the following option:
+ --session-dir=/home/user1/test-stuff/oprofile_data
+
+
+
+ Then the above opreport example would have to include that
+ option.
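
Combining the archive specification from the earlier example with the option named in that message gives a command along the following lines. The paths are the illustrative ones used above. The guard around the final call is an addition, so that the snippet is harmless on a machine where OProfile or the archive is not present.

```shell
#!/bin/sh
# Illustrative paths taken from the examples above; adjust to your own archive.
ARCHIVE=/home/user1/my_oprofile_archive
SESSION_DIR=/home/user1/test-stuff/oprofile_data

# Archive specification plus the --session-dir value reported by oparchive.
CMD="opreport archive:${ARCHIVE} --session-dir=${SESSION_DIR} --symbols"
echo "$CMD"

# Only run the report when opreport and the archive directory actually exist.
if command -v opreport >/dev/null 2>&1 && [ -d "$ARCHIVE" ]; then
    $CMD
fi
```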
+
+
+
+
+ In some host/target development environments, all target executables, libraries, and
+ debuginfo files are stored in a root directory on the host to facilitate offline
+ analysis. In such cases, the oparchive command collects more data
+ than is necessary; so, when copying the resulting output of oparchive,
+   you can skip all of the executables, etc., and just archive the $SESSION_DIR
+ tree located within the output directory you specified in your oparchive
+ command. Then, when running the opreport or opannotate
+ commands on your host system, pass the option to point to the
+ location of your target's executables, etc.
+
+
+
Usage of oparchive
@@ -2722,8 +2114,7 @@ Show help message.
Do not include application-specific images for libraries, kernel modules
-and the kernel. This option only makes sense if the profile session
-used --separate.
+and the kernel.
Comma-separated list of additional paths to search for binaries.
@@ -2741,6 +2132,14 @@ Only list the files that would be archived, don't copy them.
Give verbose debugging output.
+
+Use sample database from the specified directory dir_path instead
+of the default location. If this option is not specified, then oparchive will search for
+samples in <cur_dir>/oprofile_data
+first. If that directory does not exist, the standard session-dir of
+/var/lib/oprofile is used
+as the session directory.
+
Show version.
@@ -2754,15 +2153,21 @@ Show version.
Converting sample database files (opimport)
This utility converts sample database files from a foreign binary format (abi) to
- the native format. This is useful only when moving sample files between systems
- for analysis on platforms other than the one used for collection. The
- oparchive should be used on the machine where the profile was taken (target)
- in order to collect sample files and all other necessary information. The archive
- directory that is the output from oparchive should be copied
- to the system where you wish to perform your performance analysis (host). If the
- When the architecture of your target and host systems differ, then you'll need to
- use the opimport command. The abi format of the sample files
- to be imported is described in a text file located in $SESSION_DIR/abi.
+   the native format. This is required when moving sample files to a (host) system
+   other than the one used for collection (target system) when the host and target systems have
+   different architectures. The abi format of the sample files to be imported is described in a
+   text file located in $SESSION_DIR/abi. If you are unsure whether
+ your target and host systems have compatible architectures (in regard to the OProfile
+ ABI), simply diff a $SESSION_DIR/abi file from the target system
+ with one from the host system. If any differences show up at all, you must run the
+ opimport command.
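
The abi comparison described above can be scripted. The following is a minimal Python sketch, not part of OProfile: the function names abi_diff and needs_opimport are hypothetical, and the key names in the usage note are made up for illustration, since the actual contents of the abi file are not shown here.

```python
import difflib
from typing import List


def abi_diff(target_abi: str, host_abi: str) -> List[str]:
    """Return a unified diff of two abi file contents (empty list if identical)."""
    return list(
        difflib.unified_diff(
            target_abi.splitlines(),
            host_abi.splitlines(),
            fromfile="target abi",
            tofile="host abi",
            lineterm="",
        )
    )


def needs_opimport(target_abi: str, host_abi: str) -> bool:
    """Any difference at all means the sample files must be converted."""
    return bool(abi_diff(target_abi, host_abi))
```

In practice the two strings would be read from the $SESSION_DIR/abi file on each system, for example needs_opimport(open(target_path).read(), open(host_path).read()).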
+
+
+
+ The oparchive command should be used on the machine where
+ the profile was taken (target) in order to collect sample files and all other necessary
+ information. The archive directory that is the output from oparchive
+ should be copied to the system where you wish to perform your performance analysis (host).
@@ -2919,10 +2324,7 @@ problem and OProfile can do nothing about it.
OProfile uses non-maskable interrupts (NMI) on the P6 generation, Pentium 4,
Athlon, Opteron, Phenom, and Turion processors. These interrupts can occur even in sections of the
kernel where interrupts are disabled, allowing collection of samples in virtually
-all executable code. The timer interrupt mode and Itanium 2 collection mechanisms
-use maskable interrupts; therefore, these profiling mechanisms have "sample
-shadows", or blind spots: regions where no samples will be collected. Typically, the samples
-will be attributed to the code immediately after the interrupts are re-enabled.
+all executable code.
@@ -2942,7 +2344,7 @@ will appear as poll_idle() in your kernel profile.
OProfile profiles kernel modules by default. However, there are a couple of problems
you may have when trying to get results. First, you may have booted via an initrd;
this means that the actual path for the module binaries cannot be determined automatically.
-To get around this, you can use the option to the profiling tools
+To get around this, you can use the option to the analysis tools
to specify where to look for the kernel modules.
@@ -2967,7 +2369,7 @@ information for OProfile to get this information.
Interpreting call-graph profiles
-Sometimes the results from call-graph profiles may be different to what
+Sometimes the results from call-graph profiles may be different from what
you expect to see. The first thing to check is whether the target
binaries were compiled with frame pointers enabled (if the binary was
compiled using gcc's
@@ -3332,6 +2734,183 @@ and http://developer.amd.co
+
+
+Controlling the event counter
+
+Using ocount
+
+This section describes in detail how ocount is used.
+Unless the --events option is specified, ocount will use
+the default event for your system. For most systems, the default event is some
+cycles-based event, assuming your processor type supports hardware performance
+counters. The event specification used for ocount is slightly
+different from that required for profiling -- a count value
+is not needed. You can see the event information for your CPU using ophelp.
+More information on event specification can be found at .
+
+
+The ocount command syntax is:
+
+ocount [ options ] [ --system-wide | --process-list <pids> | --thread-list <tids> | --cpu-list <cpus> | [ command [ args ] ] ]
+
+
+
+ocount has 5 run modes:
+
+
+system-wide
+process-list
+thread-list
+cpu-list
+command
+
+
+
+One and only one of these 5 run modes must be specified when you run ocount.
+If you run ocount using a run mode other than command [args], press Ctrl-c
+to stop it when finished counting (e.g., when the monitored process ends). If you background ocount
+(i.e., with ’&’) while using one of these run modes, you must stop it in a controlled manner so that
+the data collection process can be shut down cleanly and final results can be displayed.
+Use kill -SIGINT <ocount-PID> for this purpose.
+
+
+Following is a description of the ocount options.
+
+
+
+
+command [ args ]
+ The command or application to be profiled. The [args] are the input arguments
+ that the command or application requires. The command and its arguments must be positioned at the
+ end of the command line, after all other ocount options.
+
+
+
+
+--process-list <pids>
+ Use this option to count events for one or more already-running applications, specified via
+ a comma-separated list (PIDs). Event counts will be collected for all children of the
+ passed process(es) as well.
+
+
+
+
+--thread-list <tids>
+ Use this option to count events for one or more already-running threads, specified via
+ a comma-separated list (TIDs). Event counts will not be collected
+ for any children of the passed thread(s).
+
+
+
+
+--system-wide
+ This option is for counting events for all processes running on your system. You must have
+ root authority to run ocount in this mode.
+
+
+
+
+--cpu-list <cpus>
+ This option is for counting events on a subset of processors on your system. You must have
+ root authority to run ocount in this mode. This is a comma-separated list,
+ where each element in the list may be either a single processor number or a range of processor
+ numbers; for example: ’-C 2,3,4-11,15’.
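
To make the list syntax concrete, the following Python sketch expands a processor list such as the one in the example above into individual processor numbers. It is an illustration only, not ocount's own parser; the function name expand_cpu_list is hypothetical, and rejecting a descending range is an assumption.

```python
from typing import List


def expand_cpu_list(spec: str) -> List[int]:
    """Expand a list such as '2,3,4-11,15' into sorted processor numbers."""
    cpus = set()
    for element in spec.split(","):
        element = element.strip()
        if "-" in element:
            # A range of processor numbers, inclusive at both ends.
            first, last = element.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError("descending range: " + element)
            cpus.update(range(low, high + 1))
        else:
            # A single processor number.
            cpus.add(int(element))
    return sorted(cpus)
```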
+
+
+
+
+
+ This option is for passing a comma-separated list of event specifications
+ for counting. Each event spec is of the form:
+
+ name[:unitmask[:kernel[:user]]]
+
+ When no event specification is given, the default event for the running
+ processor type will be used for counting. Use ophelp
+ to list the available events for your processor type.
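
The nesting of the optional fields in name[:unitmask[:kernel[:user]]] can be illustrated with a short Python sketch. This is not ocount's parser: the names EventSpec, parse_event_spec and parse_event_list are hypothetical, omitted fields are simply reported as None rather than replaced by whatever defaults ocount applies, and the event names in the usage note are examples only (use ophelp for the real list).

```python
from typing import List, NamedTuple, Optional


class EventSpec(NamedTuple):
    name: str
    unitmask: Optional[str]
    kernel: Optional[str]
    user: Optional[str]


def parse_event_spec(spec: str) -> EventSpec:
    """Split one name[:unitmask[:kernel[:user]]] specification."""
    fields = spec.split(":")
    if not fields[0] or len(fields) > 4:
        raise ValueError("malformed event specification: " + spec)
    # Trailing fields are optional; pad the ones that were left out.
    padded = fields + [None] * (4 - len(fields))
    return EventSpec(*padded)


def parse_event_list(arg: str) -> List[EventSpec]:
    """Split the comma-separated list passed to the events option."""
    return [parse_event_spec(s) for s in arg.split(",")]
```

For example, parse_event_list("CPU_CLK_UNHALTED:0x0:1:0,INST_RETIRED") yields one fully specified event and one with only a name.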
+
+
+
+
+
+ This option can be used in conjunction with either the --process-list or
+ --thread-list option to display event counts on a per-thread (per-process) basis.
+ Without this option, all counts are aggregated.
+
+
+
+
+
+ This option can be used in conjunction with either the --system-wide or
+ --cpu-list option to display event counts on a per-cpu basis. Without this option,
+ all counts are aggregated.
+
+
+
+
+
+ Note: The interval_length is given in milliseconds.
+ However, the current implementation only supports 100 ms
+ granularity, so the given interval_length will be rounded
+ to the nearest 100 ms. Results collected for each time
+ interval are printed immediately instead of the default
+ of one dump of cumulative event counts at the end of the
+ run. Counters are reset to zero at the start of each
+ interval.
+
+
+ If num_intervals is specified, ocount exits after the
+ specified number of intervals has elapsed.
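
The effect of the 100 ms granularity can be sketched as follows. This is an arithmetic illustration only: the function name round_interval_ms is hypothetical, rounding an exact tie (for example 150 ms) upward is an assumption, and how ocount treats values below 50 ms is not covered here.

```python
def round_interval_ms(interval_length: int, granularity: int = 100) -> int:
    """Round an interval in milliseconds to the nearest multiple of granularity.

    Ties are rounded up (assumption); integer arithmetic avoids float surprises.
    """
    return (interval_length + granularity // 2) // granularity * granularity
```

Under these assumptions, a requested interval of 1234 ms is treated as 1200 ms, and 151 ms as 200 ms.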
+
+
+
+
+
+ Use this option to print results in the following brief format:
+
+ [optional cpu or thread,]<event_name>,<count>,<percent_time_enabled>
+ [ <int> ,]< string >,< u64 >,< double >
+
+ If --timer-interval is specified, a separate line formatted as
+
+ timestamp,<num_seconds_since_epoch>[.n]
+
+ is printed ahead of each dump of event counts. If the time interval specified is
+ less than one second, the timestamp will have 1/10 second precision.
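
Because the brief format is intended for post-processing, a small Python sketch of a reader may help. It assumes only the line layout shown above; the names Count and parse_brief are hypothetical, event names are assumed not to contain commas, and the sample values in the usage note are made up.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Count(NamedTuple):
    unit: Optional[int]      # cpu or thread number, if present
    event: str
    count: int
    percent_enabled: float


def parse_brief(lines: Iterable[str]) -> List[Tuple[Optional[float], List[Count]]]:
    """Group brief-format lines into (timestamp or None, counts) dumps."""
    dumps: List[Tuple[Optional[float], List[Count]]] = []
    for raw in lines:
        fields = [f.strip() for f in raw.strip().split(",")]
        if fields == [""]:
            continue
        if fields[0] == "timestamp":
            # A timestamp line starts a new dump of event counts.
            dumps.append((float(fields[1]), []))
            continue
        if not dumps:
            # No interval printing: a single dump without a timestamp.
            dumps.append((None, []))
        if len(fields) == 4:
            unit: Optional[int] = int(fields[0])
            fields = fields[1:]
        elif len(fields) == 3:
            unit = None
        else:
            raise ValueError("unexpected line: " + raw)
        dumps[-1][1].append(Count(unit, fields[0], int(fields[1]), float(fields[2])))
    return dumps
```

For example, parse_brief(open("counts.txt")) returns one entry per dump, where counts.txt is a hypothetical file of saved brief-format output.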
+
+
+
+
+
+
+ Results are written to outfile_name instead of interactively to the terminal.
+
+
+
+
+
+ Use this option to increase the verbosity of the output.
+
+
+
+
+
+ Show ocount version.
+
+
+
+
+
+ Show a help message.
+
+
+
+
+
+
+
+
Acknowledgments
diff --git a/events/Makefile.am b/events/Makefile.am
index 7c14713..d68f0e8 100644
--- a/events/Makefile.am
+++ b/events/Makefile.am
@@ -1,9 +1,5 @@
event_files = \
- alpha/ev4/events alpha/ev4/unit_masks \
- alpha/ev5/events alpha/ev5/unit_masks \
alpha/ev67/events alpha/ev67/unit_masks \
- alpha/ev6/events alpha/ev6/unit_masks \
- alpha/pca56/events alpha/pca56/unit_masks \
i386/athlon/events i386/athlon/unit_masks \
i386/core_2/events i386/core_2/unit_masks \
i386/p4/events i386/p4-ht/events \
@@ -20,27 +16,26 @@ event_files = \
i386/westmere/events i386/westmere/unit_masks \
i386/sandybridge/events i386/sandybridge/unit_masks \
i386/ivybridge/events i386/ivybridge/unit_masks \
- ia64/ia64/events ia64/ia64/unit_masks \
- ia64/itanium2/events ia64/itanium2/unit_masks \
- ia64/itanium/events ia64/itanium/unit_masks \
+ i386/haswell/events i386/haswell/unit_masks \
+ i386/broadwell/events i386/broadwell/unit_masks \
+ i386/silvermont/events i386/silvermont/unit_masks \
+ ppc64/architected_events_v1/events ppc64/architected_events_v1/unit_masks \
ppc64/power4/events ppc64/power4/event_mappings ppc64/power4/unit_masks \
ppc64/power5/events ppc64/power5/event_mappings ppc64/power5/unit_masks \
ppc64/power5+/events ppc64/power5+/event_mappings ppc64/power5+/unit_masks \
ppc64/power5++/events ppc64/power5++/event_mappings ppc64/power5++/unit_masks \
ppc64/power6/events ppc64/power6/event_mappings ppc64/power6/unit_masks \
ppc64/power7/events ppc64/power7/event_mappings ppc64/power7/unit_masks \
+ ppc64/power8/events ppc64/power8/unit_masks \
ppc64/970/events ppc64/970/event_mappings ppc64/970/unit_masks \
ppc64/970MP/events ppc64/970MP/event_mappings ppc64/970MP/unit_masks \
- ppc64/ibm-compat-v1/events ppc64/ibm-compat-v1/event_mappings ppc64/ibm-compat-v1/unit_masks \
- ppc64/pa6t/events ppc64/pa6t/event_mappings ppc64/pa6t/unit_masks \
- ppc64/cell-be/events ppc64/cell-be/unit_masks \
- rtc/events rtc/unit_masks \
x86-64/hammer/events x86-64/hammer/unit_masks \
x86-64/family10/events x86-64/family10/unit_masks \
x86-64/family11h/events x86-64/family11h/unit_masks \
x86-64/family12h/events x86-64/family12h/unit_masks \
x86-64/family14h/events x86-64/family14h/unit_masks \
x86-64/family15h/events x86-64/family15h/unit_masks \
+ x86-64/generic/events x86-64/generic/unit_masks \
arm/xscale1/events arm/xscale1/unit_masks \
arm/xscale2/events arm/xscale2/unit_masks \
arm/armv6/events arm/armv6/unit_masks \
@@ -48,12 +43,16 @@ event_files = \
arm/armv7/events arm/armv7/unit_masks \
arm/armv7-scorpion/events arm/armv7-scorpion/unit_masks \
arm/armv7-scorpionmp/events arm/armv7-scorpionmp/unit_masks \
+ arm/armv7-krait/events arm/armv7-krait/unit_masks \
arm/armv7-ca9/events arm/armv7-ca9/unit_masks \
arm/armv7-ca5/events arm/armv7-ca5/unit_masks \
arm/armv7-ca7/events arm/armv7-ca7/unit_masks \
arm/armv7-ca15/events arm/armv7-ca15/unit_masks \
arm/mpcore/events arm/mpcore/unit_masks \
- avr32/events avr32/unit_masks \
+ arm/armv8-pmuv3-common/events arm/armv8-pmuv3-common/unit_masks \
+ arm/armv8-xgene/events arm/armv8-xgene/unit_masks \
+ arm/armv8-ca57/events arm/armv8-ca57/unit_masks \
+ arm/armv8-ca53/events arm/armv8-ca53/unit_masks \
mips/20K/events mips/20K/unit_masks \
mips/24K/events mips/24K/unit_masks \
mips/25K/events mips/25K/unit_masks \
@@ -72,12 +71,15 @@ event_files = \
ppc/7450/events ppc/7450/unit_masks \
ppc/e500/events ppc/e500/unit_masks \
ppc/e500v2/events ppc/e500v2/unit_masks \
+ ppc/e500mc/events ppc/e500mc/unit_masks \
+ ppc/e6500/events ppc/e6500/unit_masks \
ppc/e300/events ppc/e300/unit_masks \
tile/tile64/events tile/tile64/unit_masks \
tile/tilepro/events tile/tilepro/unit_masks \
tile/tilegx/events tile/tilegx/unit_masks \
s390/z10/events s390/z10/unit_masks \
- s390/z196/events s390/z196/unit_masks
+ s390/z196/events s390/z196/unit_masks \
+ s390/zEC12/events s390/zEC12/unit_masks
install-data-local:
for i in ${event_files} ; do \
diff --git a/events/Makefile.in b/events/Makefile.in
index 7fac3d0..3108c6e 100644
--- a/events/Makefile.in
+++ b/events/Makefile.in
@@ -38,7 +38,6 @@ DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/m4/binutils.m4 \
$(top_srcdir)/m4/builtinexpect.m4 \
- $(top_srcdir)/m4/cellspubfdsupport.m4 \
$(top_srcdir)/m4/compileroption.m4 \
$(top_srcdir)/m4/copyifchange.m4 $(top_srcdir)/m4/docbook.m4 \
$(top_srcdir)/m4/extradirs.m4 \
@@ -47,7 +46,7 @@ am__aclocal_m4_deps = $(top_srcdir)/m4/binutils.m4 \
$(top_srcdir)/m4/ltversion.m4 $(top_srcdir)/m4/lt~obsolete.m4 \
$(top_srcdir)/m4/mallocattribute.m4 \
$(top_srcdir)/m4/poptconst.m4 \
- $(top_srcdir)/m4/precompiledheader.m4 $(top_srcdir)/m4/qt.m4 \
+ $(top_srcdir)/m4/precompiledheader.m4 \
$(top_srcdir)/m4/sstream.m4 $(top_srcdir)/m4/typedef.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
@@ -110,7 +109,6 @@ LN_S = @LN_S@
LTLIBOBJS = @LTLIBOBJS@
MAKEINFO = @MAKEINFO@
MKDIR_P = @MKDIR_P@
-MOC = @MOC@
NM = @NM@
NMEDIT = @NMEDIT@
OBJDUMP = @OBJDUMP@
@@ -134,20 +132,13 @@ PFM_LIB = @PFM_LIB@
PKG_CONFIG = @PKG_CONFIG@
POPT_LIBS = @POPT_LIBS@
PTRDIFF_T_TYPE = @PTRDIFF_T_TYPE@
-QT_CFLAGS = @QT_CFLAGS@
-QT_INCLUDES = @QT_INCLUDES@
-QT_LDFLAGS = @QT_LDFLAGS@
-QT_LIB = @QT_LIB@
-QT_LIBS = @QT_LIBS@
-QT_VERSION = @QT_VERSION@
RANLIB = @RANLIB@
+RT_LIB = @RT_LIB@
SED = @SED@
SET_MAKE = @SET_MAKE@
SHELL = @SHELL@
SIZE_T_TYPE = @SIZE_T_TYPE@
STRIP = @STRIP@
-UIC = @UIC@
-UIChelp = @UIChelp@
VERSION = @VERSION@
XMKMF = @XMKMF@
XML_CATALOG = @XML_CATALOG@
@@ -212,11 +203,7 @@ top_builddir = @top_builddir@
top_srcdir = @top_srcdir@
topdir = @topdir@
event_files = \
- alpha/ev4/events alpha/ev4/unit_masks \
- alpha/ev5/events alpha/ev5/unit_masks \
alpha/ev67/events alpha/ev67/unit_masks \
- alpha/ev6/events alpha/ev6/unit_masks \
- alpha/pca56/events alpha/pca56/unit_masks \
i386/athlon/events i386/athlon/unit_masks \
i386/core_2/events i386/core_2/unit_masks \
i386/p4/events i386/p4-ht/events \
@@ -233,27 +220,26 @@ event_files = \
i386/westmere/events i386/westmere/unit_masks \
i386/sandybridge/events i386/sandybridge/unit_masks \
i386/ivybridge/events i386/ivybridge/unit_masks \
- ia64/ia64/events ia64/ia64/unit_masks \
- ia64/itanium2/events ia64/itanium2/unit_masks \
- ia64/itanium/events ia64/itanium/unit_masks \
+ i386/haswell/events i386/haswell/unit_masks \
+ i386/broadwell/events i386/broadwell/unit_masks \
+ i386/silvermont/events i386/silvermont/unit_masks \
+ ppc64/architected_events_v1/events ppc64/architected_events_v1/unit_masks \
ppc64/power4/events ppc64/power4/event_mappings ppc64/power4/unit_masks \
ppc64/power5/events ppc64/power5/event_mappings ppc64/power5/unit_masks \
ppc64/power5+/events ppc64/power5+/event_mappings ppc64/power5+/unit_masks \
ppc64/power5++/events ppc64/power5++/event_mappings ppc64/power5++/unit_masks \
ppc64/power6/events ppc64/power6/event_mappings ppc64/power6/unit_masks \
ppc64/power7/events ppc64/power7/event_mappings ppc64/power7/unit_masks \
+ ppc64/power8/events ppc64/power8/unit_masks \
ppc64/970/events ppc64/970/event_mappings ppc64/970/unit_masks \
ppc64/970MP/events ppc64/970MP/event_mappings ppc64/970MP/unit_masks \
- ppc64/ibm-compat-v1/events ppc64/ibm-compat-v1/event_mappings ppc64/ibm-compat-v1/unit_masks \
- ppc64/pa6t/events ppc64/pa6t/event_mappings ppc64/pa6t/unit_masks \
- ppc64/cell-be/events ppc64/cell-be/unit_masks \
- rtc/events rtc/unit_masks \
x86-64/hammer/events x86-64/hammer/unit_masks \
x86-64/family10/events x86-64/family10/unit_masks \
x86-64/family11h/events x86-64/family11h/unit_masks \
x86-64/family12h/events x86-64/family12h/unit_masks \
x86-64/family14h/events x86-64/family14h/unit_masks \
x86-64/family15h/events x86-64/family15h/unit_masks \
+ x86-64/generic/events x86-64/generic/unit_masks \
arm/xscale1/events arm/xscale1/unit_masks \
arm/xscale2/events arm/xscale2/unit_masks \
arm/armv6/events arm/armv6/unit_masks \
@@ -261,12 +247,16 @@ event_files = \
arm/armv7/events arm/armv7/unit_masks \
arm/armv7-scorpion/events arm/armv7-scorpion/unit_masks \
arm/armv7-scorpionmp/events arm/armv7-scorpionmp/unit_masks \
+ arm/armv7-krait/events arm/armv7-krait/unit_masks \
arm/armv7-ca9/events arm/armv7-ca9/unit_masks \
arm/armv7-ca5/events arm/armv7-ca5/unit_masks \
arm/armv7-ca7/events arm/armv7-ca7/unit_masks \
arm/armv7-ca15/events arm/armv7-ca15/unit_masks \
arm/mpcore/events arm/mpcore/unit_masks \
- avr32/events avr32/unit_masks \
+ arm/armv8-pmuv3-common/events arm/armv8-pmuv3-common/unit_masks \
+ arm/armv8-xgene/events arm/armv8-xgene/unit_masks \
+ arm/armv8-ca57/events arm/armv8-ca57/unit_masks \
+ arm/armv8-ca53/events arm/armv8-ca53/unit_masks \
mips/20K/events mips/20K/unit_masks \
mips/24K/events mips/24K/unit_masks \
mips/25K/events mips/25K/unit_masks \
@@ -285,12 +275,15 @@ event_files = \
ppc/7450/events ppc/7450/unit_masks \
ppc/e500/events ppc/e500/unit_masks \
ppc/e500v2/events ppc/e500v2/unit_masks \
+ ppc/e500mc/events ppc/e500mc/unit_masks \
+ ppc/e6500/events ppc/e6500/unit_masks \
ppc/e300/events ppc/e300/unit_masks \
tile/tile64/events tile/tile64/unit_masks \
tile/tilepro/events tile/tilepro/unit_masks \
tile/tilegx/events tile/tilegx/unit_masks \
s390/z10/events s390/z10/unit_masks \
- s390/z196/events s390/z196/unit_masks
+ s390/z196/events s390/z196/unit_masks \
+ s390/zEC12/events s390/zEC12/unit_masks
EXTRA_DIST = $(event_files)
all: all-am
diff --git a/events/alpha/ev4/events b/events/alpha/ev4/events
deleted file mode 100644
index 8b193d1..0000000
--- a/events/alpha/ev4/events
+++ /dev/null
@@ -1,18 +0,0 @@
-# Alpha EV4 events.
-#
-event:0x00 counters:0 um:zero minimum:4096 name:ISSUES : Total issues divided by 2
-event:0x02 counters:0 um:zero minimum:4096 name:PIPELINE_DRY : Nothing issued, no valid I-stream data
-event:0x04 counters:0 um:zero minimum:4096 name:LOAD_INSNS : All load instructions
-event:0x06 counters:0 um:zero minimum:4096 name:PIPELINE_FROZEN : Nothing issued, resource conflict
-event:0x08 counters:0 um:zero minimum:4096 name:BRANCH_INSNS : All branches (conditional, unconditional, jsr, hw_rei)
-event:0x0a counters:0 um:zero minimum:4096 name:CYCLES : Total cycles
-event:0x0b counters:0 um:zero minimum:4096 name:PAL_MODE : Cycles while in PALcode environment
-event:0x0c counters:0 um:zero minimum:4096 name:NON_ISSUES : Total nonissues divided by 2
-event:0x10 counters:0 um:zero minimum:256 name:DCACHE_MISSES : Total D-cache misses
-event:0x11 counters:0 um:zero minimum:256 name:ICACHE_MISSES : Total I-cache misses
-event:0x12 counters:0 um:zero minimum:256 name:DUAL_ISSUE_CYCLES : Cycles of dual issue
-event:0x13 counters:0 um:zero minimum:256 name:BRANCH_MISPREDICTS : Branch mispredicts (conditional, jsr, hw_rei)
-event:0x14 counters:0 um:zero minimum:256 name:FP_INSNS : FP operate instructions (not br, load, store)
-event:0x15 counters:0 um:zero minimum:256 name:INTEGER_OPERATE : Integer operate instructions
-event:0x16 counters:0 um:zero minimum:256 name:STORE_INSNS : Store instructions
-# There's also EXTERNAL, by which we could monitor the 21066/21068 bus controller.
diff --git a/events/alpha/ev4/unit_masks b/events/alpha/ev4/unit_masks
deleted file mode 100644
index bc77cc8..0000000
--- a/events/alpha/ev4/unit_masks
+++ /dev/null
@@ -1,4 +0,0 @@
-# Alpha EV4 possible unit masks
-#
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
diff --git a/events/alpha/ev5/events b/events/alpha/ev5/events
deleted file mode 100644
index 709e06a..0000000
--- a/events/alpha/ev5/events
+++ /dev/null
@@ -1,49 +0,0 @@
-# Alpha EV5 events
-#
-event:0x00 counters:0,2 um:zero minimum:256 name:CYCLES : Total cycles
-event:0x01 counters:0 um:zero minimum:256 name:ISSUES : Total issues
-event:0x02 counters:1 um:zero minimum:256 name:NON_ISSUE_CYCLES : Nothing issued, pipeline frozen
-event:0x03 counters:1 um:zero minimum:256 name:SPLIT_ISSUE_CYCLES : Some but not all issuable instructions issued
-event:0x04 counters:1 um:zero minimum:256 name:PIPELINE_DRY : Nothing issued, pipeline dry
-event:0x05 counters:1 um:zero minimum:256 name:REPLAY_TRAP : Replay traps (ldu, wb/maf, litmus test)
-event:0x06 counters:1 um:zero minimum:256 name:SINGLE_ISSUE_CYCLES : Single issue cycles
-event:0x07 counters:1 um:zero minimum:256 name:DUAL_ISSUE_CYCLES : Dual issue cycles
-event:0x08 counters:1 um:zero minimum:256 name:TRIPLE_ISSUE_CYCLES : Triple issue cycles
-event:0x09 counters:1 um:zero minimum:256 name:QUAD_ISSUE_CYCLES : Quad issue cycles
-event:0x0a counters:1 um:zero minimum:256 name:FLOW_CHANGE : Flow change (meaning depends on counter 2)
-# ??? This one's dependent on the value in PCSEL2: If measuring PC_MISPR,
-# this is jsr-ret instructions, if measuring BRANCH_MISPREDICTS, this is
-# conditional branches, otherwise this is all branch insns, including hw_rei.
-event:0x0b counters:1 um:zero minimum:256 name:INTEGER_OPERATE : Integer operate instructions
-event:0x0c counters:1 um:zero minimum:256 name:FP_INSNS : FP operate instructions (not br, load, store)
-# FIXME: Bug carried over
-event:0x0c counters:1 um:zero minimum:256 name:LOAD_INSNS : Load instructions
-event:0x0d counters:1 um:zero minimum:256 name:STORE_INSNS : Store instructions
-event:0x0e counters:1 um:zero minimum:256 name:ICACHE_ACCESS : Instruction cache access
-event:0x0f um:zero minimum:256 name:DCACHE_ACCESS : Data cache access
-event:0x10 counters:2 um:zero minimum:256 name:LONG_STALLS : Stalls longer than 15 cycles
-event:0x11 counters:2 um:zero minimum:256 name:PC_MISPR : PC mispredicts
-event:0x12 counters:2 um:zero minimum:256 name:BRANCH_MISPREDICTS : Branch mispredicts
-event:0x13 counters:2 um:zero minimum:256 name:ICACHE_MISSES : Instruction cache misses
-event:0x14 counters:2 um:zero minimum:256 name:ITB_MISS : Instruction TLB miss
-event:0x15 counters:2 um:zero minimum:256 name:DCACHE_MISSES : Data cache misses
-event:0x16 counters:2 um:zero minimum:256 name:DTB_MISS : Data TLB miss
-event:0x17 counters:2 um:zero minimum:256 name:LOADS_MERGED : Loads merged in MAF
-event:0x18 counters:2 um:zero minimum:256 name:LDU_REPLAYS : LDU replay traps
-event:0x19 counters:2 um:zero minimum:256 name:WB_MAF_FULL_REPLAYS : WB/MAF full replay traps
-event:0x1a counters:2 um:zero minimum:256 name:MEM_BARRIER : Memory barrier instructions
-event:0x1b counters:2 um:zero minimum:256 name:LOAD_LOCKED : LDx/L instructions
-event:0x1c counters:1 um:zero minimum:256 name:SCACHE_ACCESS : S-cache access
-event:0x1d counters:1 um:zero minimum:256 name:SCACHE_READ : S-cache read
-event:0x1e counters:1,2 um:zero minimum:256 name:SCACHE_WRITE : S-cache write
-event:0x1f counters:1 um:zero minimum:256 name:SCACHE_VICTIM : S-cache victim
-event:0x20 counters:2 um:zero minimum:256 name:SCACHE_MISS : S-cache miss
-event:0x21 counters:2 um:zero minimum:256 name:SCACHE_READ_MISS : S-cache read miss
-event:0x22 counters:2 um:zero minimum:256 name:SCACHE_WRITE_MISS : S-cache write miss
-event:0x23 counters:2 um:zero minimum:256 name:SCACHE_SH_WRITE : S-cache shared writes
-event:0x24 counters:1 um:zero minimum:256 name:BCACHE_HIT : B-cache hit
-event:0x25 counters:1 um:zero minimum:256 name:BCACHE_VICTIM : B-cache victim
-event:0x26 counters:2 um:zero minimum:256 name:BCACHE_MISS : B-cache miss
-event:0x27 counters:1 um:zero minimum:256 name:SYS_REQ : System requests
-event:0x28 counters:2 um:zero minimum:256 name:SYS_INV : System invalidates
-event:0x29 counters:2 um:zero minimum:256 name:SYS_READ_REQ : System read requests
diff --git a/events/alpha/ev5/unit_masks b/events/alpha/ev5/unit_masks
deleted file mode 100644
index 4f24fa9..0000000
--- a/events/alpha/ev5/unit_masks
+++ /dev/null
@@ -1,4 +0,0 @@
-# Alpha EV-5 possible unit masks
-#
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
diff --git a/events/alpha/ev6/events b/events/alpha/ev6/events
deleted file mode 100644
index 2039cef..0000000
--- a/events/alpha/ev6/events
+++ /dev/null
@@ -1,11 +0,0 @@
-# Alpha EV6 events
-#
-event:0x00 counters:0,1 um:zero minimum:500 name:CYCLES : Total cycles
-event:0x01 counters:1 um:zero minimum:500 name:RETIRED : Retired instructions
-event:0x02 counters:1 um:zero minimum:500 name:COND_BRANCHES : Retired conditional branches
-event:0x03 counters:1 um:zero minimum:500 name:BRANCH_MISPREDICTS : Retired branch mispredicts
-event:0x04 counters:1 um:zero minimum:500 name:DTB_MISS : Retired DTB single misses * 2
-event:0x05 counters:1 um:zero minimum:500 name:DTB_DD_MISS : Retired DTB double double misses
-event:0x06 counters:1 um:zero minimum:500 name:ITB_MISS : Retired ITB misses
-event:0x07 counters:1 um:zero minimum:500 name:UNALIGNED_TRAP : Retired unaligned traps
-event:0x08 counters:1 um:zero minimum:500 name:REPLAY_TRAP : Replay traps
diff --git a/events/alpha/ev6/unit_masks b/events/alpha/ev6/unit_masks
deleted file mode 100644
index bbe38c6..0000000
--- a/events/alpha/ev6/unit_masks
+++ /dev/null
@@ -1,4 +0,0 @@
-# Alpha EV-6 possible unit masks
-#
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
diff --git a/events/alpha/ev67/events b/events/alpha/ev67/events
index b603871..6e62383 100644
--- a/events/alpha/ev67/events
+++ b/events/alpha/ev67/events
@@ -1,27 +1,6 @@
# Alpha EV-67 Events
#
-event:0x00 counters:0 um:zero minimum:500 name:CYCLES : Total cycles
-event:0x01 counters:1 um:zero minimum:500 name:DELAYED_CYCLES : Cycles of delayed retire pointer advance
-# FIXME: bug carried over
-event:0x00 counters:0,1 um:zero minimum:500 name:RETIRED : Retired instructions
-event:0x02 counters:1 um:zero minimum:500 name:BCACHE_MISS : Bcache misses/long probe latency
-event:0x03 counters:1 um:zero minimum:500 name:MBOX_REPLAY : Mbox replay traps
-# FIXME: all the below used PM_CTR
-event:0x04 counters:0 um:zero minimum:500 name:STALLED_0 : PCTR0 triggered; stalled between fetch and map stages
-event:0x05 counters:0 um:zero minimum:500 name:TAKEN_0 : PCTR0 triggered; branch was not mispredicted and taken
-event:0x06 counters:0 um:zero minimum:500 name:MISPREDICT_0 : PCTR0 triggered; branch was mispredicted
-event:0x07 counters:0 um:zero minimum:500 name:ITB_MISS_0 : PCTR0 triggered; ITB miss
-event:0x08 counters:0 um:zero minimum:500 name:DTB_MISS_0 : PCTR0 triggered; DTB miss
-event:0x09 counters:0 um:zero minimum:500 name:REPLAY_0 : PCTR0 triggered; replay trap
-event:0x0a counters:0 um:zero minimum:500 name:LOAD_STORE_0 : PCTR0 triggered; load-store order replay trap
-event:0x0b counters:0 um:zero minimum:500 name:ICACHE_MISS_0 : PCTR0 triggered; Icache miss
-event:0x0c counters:0 um:zero minimum:500 name:UNALIGNED_0 : PCTR0 triggered; unaligned load/store trap
-event:0x0d counters:0 um:zero minimum:500 name:STALLED_1 : PCTR1 triggered; stalled between fetch and map stages
-event:0x0e counters:0 um:zero minimum:500 name:TAKEN_1 : PCTR1 triggered; branch was not mispredicted and taken
-event:0x0f counters:0 um:zero minimum:500 name:MISPREDICT_1 : PCTR1 triggered; branch was mispredicted
-event:0x10 counters:0 um:zero minimum:500 name:ITB_MISS_1 : PCTR1 triggered; ITB miss
-event:0x11 counters:0 um:zero minimum:500 name:DTB_MISS_1 : PCTR1 triggered; DTB miss
-event:0x12 counters:0 um:zero minimum:500 name:REPLAY_1 : PCTR1 triggered; replay trap
-event:0x13 counters:0 um:zero minimum:500 name:LOAD_STORE_1 : PCTR1 triggered; load-store order replay trap
-event:0x14 counters:0 um:zero minimum:500 name:ICACHE_MISS_1 : PCTR1 triggered; Icache miss
-event:0x15 counters:0 um:zero minimum:500 name:UNALIGNED_1 : PCTR1 triggered; unaligned load/store trap
+event:0x01 counters:0,1 um:zero minimum:500 name:CYCLES : Total cycles
+event:0x02 counters:0 um:zero minimum:500 name:INSTRUCTIONS : Retired instructions
+event:0x03 counters:1 um:zero minimum:500 name:BCACHE_MISS : Bcache misses/long probe latency
+event:0x04 counters:1 um:zero minimum:500 name:MBOX_REPLAY : Mbox replay traps
diff --git a/events/alpha/pca56/events b/events/alpha/pca56/events
deleted file mode 100644
index 334babe..0000000
--- a/events/alpha/pca56/events
+++ /dev/null
@@ -1,2 +0,0 @@
-# PCA-56
-# FIXME: no events ? What's going on here Falk ?
diff --git a/events/alpha/pca56/unit_masks b/events/alpha/pca56/unit_masks
deleted file mode 100644
index 2b807b7..0000000
--- a/events/alpha/pca56/unit_masks
+++ /dev/null
@@ -1,3 +0,0 @@
-# Alpha PCA-56 possible unit masks
-#
-# FIXME: any events ...?
diff --git a/events/arm/armv7-common/events b/events/arm/armv7-common/events
index 0b6ed45..c83b2b7 100644
--- a/events/arm/armv7-common/events
+++ b/events/arm/armv7-common/events
@@ -33,4 +33,4 @@ event:0x1B counters:1,2,3,4,5,6 um:zero minimum:500 name:INST_SPEC : Instruction
event:0x1C counters:1,2,3,4,5,6 um:zero minimum:500 name:TTBR_WRITE_RETIRED : Write to TTBR architecturally executed, condition code pass
event:0x1D counters:1,2,3,4,5,6 um:zero minimum:500 name:BUS_CYCLES : Bus cycle
-event:0xFF counters:0 um:zero minimum:500 name:CPU_CYCLES : CPU cycle
+event:0xFF counters:0 um:zero minimum:100000 name:CPU_CYCLES : CPU cycle
diff --git a/events/arm/armv7-krait/events b/events/arm/armv7-krait/events
new file mode 100644
index 0000000..ec838c7
--- /dev/null
+++ b/events/arm/armv7-krait/events
@@ -0,0 +1,3 @@
+# ARM V7 events
+# WARNING: just re-uses common ARM PMU codes as Stephen Boyd advised
+include:arm/armv7-common
diff --git a/events/avr32/unit_masks b/events/arm/armv7-krait/unit_masks
similarity index 54%
rename from events/avr32/unit_masks
rename to events/arm/armv7-krait/unit_masks
index 37d9839..4027469 100644
--- a/events/avr32/unit_masks
+++ b/events/arm/armv7-krait/unit_masks
@@ -1,4 +1,4 @@
-# AVR32 performance counters possible unit masks
+# ARM V7 PMNC possible unit masks
#
name:zero type:mandatory default:0x00
0x00 No unit mask
diff --git a/events/arm/armv8-ca53/events b/events/arm/armv8-ca53/events
new file mode 100644
index 0000000..5e1b4d8
--- /dev/null
+++ b/events/arm/armv8-ca53/events
@@ -0,0 +1,38 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen
+#
+# ARM Cortex A53 events
+# From Cortex A53 TRM
+#
+include:arm/armv8-pmuv3-common
+event:0x60 um:zero minimum:10007 name:BUS_ACCESS_LD : Bus access - Read
+event:0x61 um:zero minimum:10007 name:BUS_ACCESS_ST : Bus access - Write
+event:0x7A um:zero minimum:10007 name:BR_INDIRECT_SPEC : Branch speculatively executed - Indirect branch
+event:0x86 um:zero minimum:10007 name:EXC_IRQ : Exception taken, IRQ
+event:0x87 um:zero minimum:10007 name:EXC_FIQ : Exception taken, FIQ
+event:0xC0 um:zero minimum:10007 name:EXT_MEM_REQ : External memory request
+event:0xC1 um:zero minimum:10007 name:EXT_MEM_REQ_NC : Non-cacheable external memory request
+event:0xC2 um:zero minimum:10007 name:PREFETCH_LINEFILL : Linefill because of prefetch
+event:0xC3 um:zero minimum:10007 name:PREFETCH_LINEFILL_DROP : Instruction Cache Throttle occurred
+event:0xC4 um:zero minimum:10007 name:READ_ALLOC_ENTER : Entering read allocate mode
+event:0xC5 um:zero minimum:10007 name:READ_ALLOC : Read allocate mode
+event:0xC6 um:zero minimum:10007 name:PRE_DECODE_ERR : Pre-decode error
+event:0xC7 um:zero minimum:10007 name:STALL_SB_FULL : Data Write operation that stalls the pipeline because the store buffer is full
+event:0xC8 um:zero minimum:10007 name:EXT_SNOOP : SCU Snooped data from another CPU for this CPU
+event:0xC9 um:zero minimum:10007 name:BR_COND : Conditional branch executed
+event:0xCA um:zero minimum:10007 name:BR_INDIRECT_MISPRED : Indirect branch mispredicted
+event:0xCB um:zero minimum:10007 name:BR_INDIRECT_MISPRED_ADDR : Indirect branch mispredicted because of address miscompare
+event:0xCC um:zero minimum:10007 name:BR_COND_MISPRED : Conditional branch mispredicted
+event:0xD0 um:zero minimum:10007 name:L1I_CACHE_ERR : L1 Instruction Cache (data or tag) memory error
+event:0xD1 um:zero minimum:10007 name:L1D_CACHE_ERR : L1 Data Cache (data, tag or dirty) memory error, correctable or non-correctable
+event:0xD2 um:zero minimum:10007 name:TLB_ERR : TLB memory error
+event:0xE0 um:zero minimum:10007 name:OTHER_IQ_DEP_STALL : Cycles that the DPU IQ is empty and that is not because of a recent micro-TLB miss, instruction cache miss or pre-decode error
+event:0xE1 um:zero minimum:10007 name:IC_DEP_STALL : Cycles the DPU IQ is empty and there is an instruction cache miss being processed
+event:0xE2 um:zero minimum:10007 name:IUTLB_DEP_STALL : Cycles the DPU IQ is empty and there is an instruction micro-TLB miss being processed
+event:0xE3 um:zero minimum:10007 name:DECODE_DEP_STALL : Cycles the DPU IQ is empty and there is a pre-decode error being processed
+event:0xE4 um:zero minimum:10007 name:OTHER_INTERLOCK_STALL : Cycles there is an interlock other than Advanced SIMD/Floating-point instructions or load/store instruction
+event:0xE5 um:zero minimum:10007 name:AGU_DEP_STALL : Cycles there is an interlock for a load/store instruction waiting for data to calculate the address in the AGU
+event:0xE6 um:zero minimum:10007 name:SIMD_DEP_STALL : Cycles there is an interlock for an Advanced SIMD/Floating-point operation.
+event:0xE7 um:zero minimum:10007 name:LD_DEP_STALL : Cycles there is a stall in the Wr stage because of a load miss
+event:0xE8 um:zero minimum:10007 name:ST_DEP_STALL : Cycles there is a stall in the Wr stage because of a store
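
The event lines in these files follow a simple key:value layout, with the description after the first " : ". The following Python sketch reads one such line. It is inferred only from the lines shown in this patch, not from the OProfile parser; the function name parse_event_line is hypothetical, and comment and include: lines are simply skipped.

```python
from typing import Any, Dict, Optional


def parse_event_line(line: str) -> Optional[Dict[str, Any]]:
    """Parse one line of an events file; return None for non-event lines."""
    line = line.strip()
    if not line or line.startswith("#") or line.startswith("include:"):
        return None
    # Everything after the first ' : ' is the free-form description.
    head, _, description = line.partition(" : ")
    fields = {}
    for token in head.split():
        key, _, value = token.partition(":")
        fields[key] = value
    counters = None
    if "counters" in fields:
        counters = [int(c) for c in fields["counters"].split(",")]
    return {
        "event": int(fields["event"], 16),
        "counters": counters,          # absent on some architectures
        "um": fields.get("um"),
        "minimum": int(fields["minimum"]),
        "name": fields["name"],
        "description": description.strip(),
    }
```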
diff --git a/events/arm/armv8-ca53/unit_masks b/events/arm/armv8-ca53/unit_masks
new file mode 100644
index 0000000..42b12b4
--- /dev/null
+++ b/events/arm/armv8-ca53/unit_masks
@@ -0,0 +1,3 @@
+# ARMv8 Cortex A53 unit masks
+#
+include:arm/armv8-pmuv3-common
diff --git a/events/arm/armv8-ca57/events b/events/arm/armv8-ca57/events
new file mode 100644
index 0000000..62974c1
--- /dev/null
+++ b/events/arm/armv8-ca57/events
@@ -0,0 +1,67 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen
+#
+# ARM Cortex A57 events
+# From Cortex A57 TRM
+#
+include:arm/armv8-pmuv3-common
+event:0x40 um:zero minimum:10007 name:L1D_CACHE_LD : Level 1 data cache access - Read
+event:0x41 um:zero minimum:10007 name:L1D_CACHE_ST : Level 1 data cache access - Write
+event:0x42 um:zero minimum:10007 name:L1D_CACHE_REFILL_LD : Level 1 data cache refill - Read
+event:0x43 um:zero minimum:10007 name:L1D_CACHE_REFILL_ST : Level 1 data cache refill - Write
+event:0x46 um:zero minimum:10007 name:L1D_CACHE_WB_VICTIM : Level 1 data cache Write-back - Victim
+event:0x47 um:zero minimum:10007 name:L1D_CACHE_WB_CLEAN : Level 1 data cache Write-back - Cleaning and coherency
+event:0x48 um:zero minimum:10007 name:L1D_CACHE_INVAL : Level 1 data cache invalidate
+event:0x4C um:zero minimum:10007 name:L1D_TLB_REFILL_LD : Level 1 data TLB refill - Read
+event:0x4D um:zero minimum:10007 name:L1D_TLB_REFILL_ST : Level 1 data TLB refill - Write
+event:0x50 um:zero minimum:10007 name:L2D_CACHE_LD : Level 2 data cache access - Read
+event:0x51 um:zero minimum:10007 name:L2D_CACHE_ST : Level 2 data cache access - Write
+event:0x52 um:zero minimum:10007 name:L2D_CACHE_REFILL_LD : Level 2 data cache refill - Read
+event:0x53 um:zero minimum:10007 name:L2D_CACHE_REFILL_ST : Level 2 data cache refill - Write
+event:0x56 um:zero minimum:10007 name:L2D_CACHE_WB_VICTIM : Level 2 data cache Write-back - Victim
+event:0x57 um:zero minimum:10007 name:L2D_CACHE_WB_CLEAN : Level 2 data cache Write-back - Cleaning and coherency
+event:0x58 um:zero minimum:10007 name:L2D_CACHE_INVAL : Level 2 data cache invalidate
+event:0x60 um:zero minimum:10007 name:BUS_ACCESS_LD : Bus access - Read
+event:0x61 um:zero minimum:10007 name:BUS_ACCESS_ST : Bus access - Write
+event:0x62 um:zero minimum:10007 name:BUS_ACCESS_SHARED : Bus access - Normal, Cacheable, Shareable
+event:0x63 um:zero minimum:10007 name:BUS_ACCESS_NOT_SHARED : Bus access - Not Normal, Cacheable, Shareable
+event:0x64 um:zero minimum:10007 name:BUS_ACCESS_NORMAL : Bus access - Normal
+event:0x65 um:zero minimum:10007 name:BUS_ACCESS_PERIPH : Bus access - Peripheral
+event:0x66 um:zero minimum:10007 name:MEM_ACCESS_LD : Data memory access - Read
+event:0x67 um:zero minimum:10007 name:MEM_ACCESS_ST : Data memory access - Write
+event:0x68 um:zero minimum:10007 name:UNALIGNED_LD_SPEC : Unaligned access - Read
+event:0x69 um:zero minimum:10007 name:UNALIGNED_ST_SPEC : Unaligned access - Write
+event:0x6A um:zero minimum:10007 name:UNALIGNED_LDST_SPEC : Unaligned access
+event:0x6C um:zero minimum:10007 name:LDREX_SPEC : Exclusive operation speculatively executed - LDREX
+event:0x6D um:zero minimum:10007 name:STREX_PASS_SPEC : Exclusive instruction speculatively executed - STREX pass
+event:0x6E um:zero minimum:10007 name:STREX_FAIL_SPEC : Exclusive operation speculatively executed - STREX fail
+event:0x70 um:zero minimum:10007 name:LD_SPEC : Operation speculatively executed - Load
+event:0x71 um:zero minimum:10007 name:ST_SPEC : Operation speculatively executed - Store
+event:0x72 um:zero minimum:10007 name:LDST_SPEC : Operation speculatively executed - Load or store
+event:0x73 um:zero minimum:10007 name:DP_SPEC : Operation speculatively executed - Integer data processing
+event:0x74 um:zero minimum:10007 name:ASE_SPEC : Operation speculatively executed - Advanced SIMD
+event:0x75 um:zero minimum:10007 name:VFP_SPEC : Operation speculatively executed - VFP
+event:0x76 um:zero minimum:10007 name:PC_WRITE_SPEC : Operation speculatively executed - Software change of the PC
+event:0x77 um:zero minimum:10007 name:CRYPTO_SPEC : Operation speculatively executed, crypto data processing
+event:0x78 um:zero minimum:10007 name:BR_IMMED_SPEC : Branch speculatively executed - Immediate branch
+event:0x79 um:zero minimum:10007 name:BR_RETURN_SPEC : Branch speculatively executed - Procedure return
+event:0x7A um:zero minimum:10007 name:BR_INDIRECT_SPEC : Branch speculatively executed - Indirect branch
+event:0x7C um:zero minimum:10007 name:ISB_SPEC : Barrier speculatively executed - ISB
+event:0x7D um:zero minimum:10007 name:DSB_SPEC : Barrier speculatively executed - DSB
+event:0x7E um:zero minimum:10007 name:DMB_SPEC : Barrier speculatively executed - DMB
+event:0x81 um:zero minimum:10007 name:EXC_UNDEF : Exception taken, other synchronous
+event:0x82 um:zero minimum:10007 name:EXC_SVC : Exception taken, Supervisor Call
+event:0x83 um:zero minimum:10007 name:EXC_PABORT : Exception taken, Instruction Abort
+event:0x84 um:zero minimum:10007 name:EXC_DABORT : Exception taken, Data Abort or SError
+event:0x86 um:zero minimum:10007 name:EXC_IRQ : Exception taken, IRQ
+event:0x87 um:zero minimum:10007 name:EXC_FIQ : Exception taken, FIQ
+event:0x88 um:zero minimum:10007 name:EXC_SMC : Exception taken, Secure Monitor Call
+event:0x8A um:zero minimum:10007 name:EXC_HVC : Exception taken, Hypervisor Call
+event:0x8B um:zero minimum:10007 name:EXC_TRAP_PABORT : Exception taken, Instruction Abort not taken locally
+event:0x8C um:zero minimum:10007 name:EXC_TRAP_DABORT : Exception taken, Data Abort, or SError not taken locally
+event:0x8D um:zero minimum:10007 name:EXC_TRAP_OTHER : Exception taken, Other traps not taken locally
+event:0x8E um:zero minimum:10007 name:EXC_TRAP_IRQ : Exception taken, IRQ not taken locally
+event:0x8F um:zero minimum:10007 name:EXC_TRAP_FIQ : Exception taken, FIQ not taken locally
+event:0x90 um:zero minimum:10007 name:RC_LD_SPEC : Release consistency instruction speculatively executed - Load-Acquire
+event:0x91 um:zero minimum:10007 name:RC_ST_SPEC : Release consistency instruction speculatively executed - Store-Release
diff --git a/events/arm/armv8-ca57/unit_masks b/events/arm/armv8-ca57/unit_masks
new file mode 100644
index 0000000..5d69263
--- /dev/null
+++ b/events/arm/armv8-ca57/unit_masks
@@ -0,0 +1,3 @@
+# ARMv8 Cortex A57 unit masks
+#
+include:arm/armv8-pmuv3-common
diff --git a/events/arm/armv8-pmuv3-common/events b/events/arm/armv8-pmuv3-common/events
new file mode 100644
index 0000000..3cdff03
--- /dev/null
+++ b/events/arm/armv8-pmuv3-common/events
@@ -0,0 +1,38 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen
+#
+# ARMv8 pmu v3 architected events
+
+event:0x00 um:zero minimum:500 name:SW_INCR : Instruction architecturally executed, condition code check pass, software increment
+event:0x01 um:zero minimum:5000 name:L1I_CACHE_REFILL : Level 1 instruction cache refill
+event:0x02 um:zero minimum:5000 name:L1I_TLB_REFILL : Level 1 instruction TLB refill
+event:0x03 um:zero minimum:5000 name:L1D_CACHE_REFILL : Level 1 data cache refill
+event:0x04 um:zero minimum:5000 name:L1D_CACHE : Level 1 data cache access
+event:0x05 um:zero minimum:5000 name:L1D_TLB_REFILL : Level 1 data TLB refill
+event:0x06 um:zero minimum:100000 name:LD_RETIRED : Instruction architecturally executed, condition code check pass, load
+event:0x07 um:zero minimum:100000 name:ST_RETIRED : Instruction architecturally executed, condition code check pass, store
+event:0x08 um:zero minimum:100000 name:INST_RETIRED : Instruction architecturally executed
+event:0x09 um:zero minimum:500 name:EXC_TAKEN : Exception taken
+event:0x0A um:zero minimum:500 name:EXC_RETURN : Instruction architecturally executed, condition code check pass, exception return
+event:0x0B um:zero minimum:500 name:CID_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to CONTEXTIDR
+event:0x0C um:zero minimum:5000 name:PC_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, software change of the PC
+event:0x0D um:zero minimum:5000 name:BR_IMMED_RETIRED : Instruction architecturally executed, immediate branch
+event:0x0E um:zero minimum:5000 name:BR_RETURN_RETIRED : Instruction architecturally executed, condition code check pass, procedure return
+event:0x0F um:zero minimum:500 name:UNALIGNED_LDST_RETIRED : Instruction architecturally executed, condition code check pass, unaligned load or store
+event:0x10 um:zero minimum:5000 name:BR_MIS_PRED : Mispredicted or not predicted branch speculatively executed
+event:0x11 um:zero minimum:100000 name:CPU_CYCLES : Cycle
+event:0x12 um:zero minimum:5000 name:BR_PRED : Predictable branch speculatively executed
+event:0x13 um:zero minimum:100000 name:MEM_ACCESS : Data memory access
+event:0x14 um:zero minimum:5000 name:L1I_CACHE : Level 1 instruction cache access
+event:0x15 um:zero minimum:5000 name:L1D_CACHE_WB : Level 1 data cache write-back
+event:0x16 um:zero minimum:5000 name:L2D_CACHE : Level 2 data cache access
+event:0x17 um:zero minimum:5000 name:L2D_CACHE_REFILL : Level 2 data cache refill
+event:0x18 um:zero minimum:5000 name:L2D_CACHE_WB : Level 2 data cache write-back
+event:0x19 um:zero minimum:5000 name:BUS_ACCESS : Bus access
+event:0x1A um:zero minimum:500 name:MEMORY_ERROR : Local memory error
+event:0x1B um:zero minimum:100000 name:INST_SPEC : Operation speculatively executed
+event:0x1C um:zero minimum:5000 name:TTBR_WRITE_RETIRED : Instruction architecturally executed, condition code check pass, write to TTBR
+event:0x1D um:zero minimum:5000 name:BUS_CYCLES : Bus cycle
+event:0x1F um:zero minimum:5000 name:L1D_CACHE_ALLOCATE : Level 1 data cache allocation without refill
+event:0x20 um:zero minimum:5000 name:L2D_CACHE_ALLOCATE : Level 2 data cache allocation without refill
diff --git a/events/arm/armv8-pmuv3-common/unit_masks b/events/arm/armv8-pmuv3-common/unit_masks
new file mode 100644
index 0000000..7666c35
--- /dev/null
+++ b/events/arm/armv8-pmuv3-common/unit_masks
@@ -0,0 +1,4 @@
+# ARMv8 architected events unit masks
+#
+name:zero type:mandatory default:0x00
+ 0x00 No unit mask
diff --git a/events/arm/armv8-xgene/events b/events/arm/armv8-xgene/events
new file mode 100644
index 0000000..3e28463
--- /dev/null
+++ b/events/arm/armv8-xgene/events
@@ -0,0 +1,7 @@
+#
+# Copyright (c) Red Hat, 2014.
+# Contributed by William Cohen
+#
+# Basic ARM V8 events
+#
+include:arm/armv8-pmuv3-common
diff --git a/events/arm/armv8-xgene/unit_masks b/events/arm/armv8-xgene/unit_masks
new file mode 100644
index 0000000..9ace2eb
--- /dev/null
+++ b/events/arm/armv8-xgene/unit_masks
@@ -0,0 +1,3 @@
+# ARMv8 architected events unit masks
+#
+include:arm/armv8-pmuv3-common
diff --git a/events/avr32/events b/events/avr32/events
deleted file mode 100644
index 489d914..0000000
--- a/events/avr32/events
+++ /dev/null
@@ -1,27 +0,0 @@
-# AVR32 events
-#
-event:0x00 counters:1,2 um:zero minimum:500 name:IFU_IFETCH_MISS : number of instruction fetch misses
-event:0x01 counters:1,2 um:zero minimum:500 name:CYCLES_IFU_MEM_STALL : cycles instruction fetch pipe is stalled
-event:0x02 counters:1,2 um:zero minimum:500 name:CYCLES_DATA_STALL : cycles stall due to data dependency
-event:0x03 counters:1,2 um:zero minimum:500 name:ITLB_MISS : number of Instruction TLB misses
-event:0x04 counters:1,2 um:zero minimum:500 name:DTLB_MISS : number of Data TLB misses
-event:0x05 counters:1,2 um:zero minimum:500 name:BR_INST_EXECUTED : branch instruction executed w/ or w/o program flow change
-event:0x06 counters:1,2 um:zero minimum:500 name:BR_INST_MISS_PRED : branch mispredicted
-event:0x07 counters:1,2 um:zero minimum:500 name:INSN_EXECUTED : instructions executed
-event:0x08 counters:1,2 um:zero minimum:500 name:DCACHE_WBUF_FULL : data cache write buffers full
-event:0x09 counters:1,2 um:zero minimum:500 name:CYCLES_DCACHE_WBUF_FULL : cycles stalled due to data cache write buffers full
-event:0x0a counters:1,2 um:zero minimum:500 name:DCACHE_READ_MISS : data cache read miss
-event:0x0b counters:1,2 um:zero minimum:500 name:CYCLES_DCACHE_READ_MISS : cycles stalled due to data cache read miss
-event:0x0c counters:1,2 um:zero minimum:500 name:WRITE_ACCESS : write access
-event:0x0d counters:1,2 um:zero minimum:500 name:CYCLES_WRITE_ACCESS : cycles when write access is ongoing
-event:0x0e counters:1,2 um:zero minimum:500 name:READ_ACCESS : read access
-event:0x0f counters:1,2 um:zero minimum:500 name:CYCLES_READ_ACCESS : cycles when read access is ongoing
-event:0x10 counters:1,2 um:zero minimum:500 name:CACHE_STALL : read or write access that stalled
-event:0x11 counters:1,2 um:zero minimum:500 name:CYCLES_CACHE_STALL : cycles stalled doing read or write access
-event:0x12 counters:1,2 um:zero minimum:500 name:DCACHE_ACCESS : data cache access
-event:0x13 counters:1,2 um:zero minimum:500 name:CYCLES_DCACHE_ACCESS : cycles when data cache access is ongoing
-event:0x14 counters:1,2 um:zero minimum:500 name:DCACHE_WB : data cache line writeback
-event:0x15 counters:1,2 um:zero minimum:500 name:ACCUMULATOR_HIT : accumulator cache hit
-event:0x16 counters:1,2 um:zero minimum:500 name:ACCUMULATOR_MISS : accumulator cache miss
-event:0x17 counters:1,2 um:zero minimum:500 name:BTB_HIT : branch target buffer hit
-event:0xff counters:0 um:zero minimum:500 name:CPU_CYCLES : clock cycles counter
diff --git a/events/i386/atom/unit_masks b/events/i386/atom/unit_masks
index acaec23..4802ddb 100644
--- a/events/i386/atom/unit_masks
+++ b/events/i386/atom/unit_masks
@@ -3,118 +3,118 @@
#
include:i386/arch_perfmon
name:store_forwards type:mandatory default:0x81
- 0x81 good Good store forwards
+ 0x81 extra: good Good store forwards
name:segment_reg_loads type:mandatory default:0x00
- 0x00 any Number of segment register loads
+ 0x00 extra: any Number of segment register loads
name:simd_prefetch type:bitmask default:0x01
- 0x01 prefetcht0 Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed
- 0x06 sw_l2 Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed
- 0x08 prefetchnta Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed
+ 0x01 extra: prefetcht0 Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed
+ 0x06 extra: sw_l2 Streaming SIMD Extensions (SSE) PrefetchT1 and PrefetchT2 instructions executed
+ 0x08 extra: prefetchnta Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed
name:data_tlb_misses type:bitmask default:0x07
- 0x07 dtlb_miss Memory accesses that missed the DTLB
- 0x05 dtlb_miss_ld DTLB misses due to load operations
- 0x09 l0_dtlb_miss_ld L0_DTLB misses due to load operations
- 0x06 dtlb_miss_st DTLB misses due to store operations
+ 0x07 extra: dtlb_miss Memory accesses that missed the DTLB
+ 0x05 extra: dtlb_miss_ld DTLB misses due to load operations
+ 0x09 extra: l0_dtlb_miss_ld L0_DTLB misses due to load operations
+ 0x06 extra: dtlb_miss_st DTLB misses due to store operations
name:page_walks type:bitmask default:0x03
- 0x03 walks Number of page-walks executed
- 0x03 cycles Duration of page-walks in core cycles
+ 0x03 extra: walks Number of page-walks executed
+ 0x03 extra: cycles Duration of page-walks in core cycles
name:x87_comp_ops_exe type:bitmask default:0x81
- 0x01 s Floating point computational micro-ops executed
- 0x81 ar Floating point computational micro-ops retired
+ 0x01 extra: s Floating point computational micro-ops executed
+ 0x81 extra: ar Floating point computational micro-ops retired
name:fp_assist type:mandatory default:0x81
- 0x81 ar Floating point assists
+ 0x81 extra: ar Floating point assists
name:mul type:bitmask default:0x01
- 0x01 s Multiply operations executed
- 0x81 ar Multiply operations retired
+ 0x01 extra: s Multiply operations executed
+ 0x81 extra: ar Multiply operations retired
name:div type:bitmask default:0x01
- 0x01 s Divide operations executed
- 0x81 ar Divide operations retired
+ 0x01 extra: s Divide operations executed
+ 0x81 extra: ar Divide operations retired
name:l2_rqsts type:bitmask default:0x41
- 0x41 i_state L2 cache demand requests from this core that missed the L2
+ 0x41 extra: i_state L2 cache demand requests from this core that missed the L2
0x4F mesi L2 cache demand requests from this core
name:cpu_clk_unhalted type:bitmask default:0x00
- 0x00 core_p Core cycles when core is not halted
- 0x01 bus Bus cycles when core is not halted
- 0x02 no_other Bus cycles when core is active and the other is halted
+ 0x00 extra: core_p Core cycles when core is not halted
+ 0x01 extra: bus Bus cycles when core is not halted
+ 0x02 extra: no_other Bus cycles when core is active and the other is halted
name:l1d_cache type:bitmask default:0x21
- 0x21 ld L1 Cacheable Data Reads
- 0x22 st L1 Cacheable Data Writes
+ 0x21 extra: ld L1 Cacheable Data Reads
+ 0x22 extra: st L1 Cacheable Data Writes
name:icache type:bitmask default:0x03
- 0x03 accesses Instruction fetches
- 0x02 misses Icache miss
+ 0x03 extra: accesses Instruction fetches
+ 0x02 extra: misses Icache miss
name:itlb type:bitmask default:0x04
- 0x04 flush ITLB flushes
- 0x02 misses ITLB misses
+ 0x04 extra: flush ITLB flushes
+ 0x02 extra: misses ITLB misses
name:macro_insts type:exclusive default:0x03
- 0x02 cisc_decoded CISC macro instructions decoded
- 0x03 all_decoded All Instructions decoded
+ 0x02 extra: cisc_decoded CISC macro instructions decoded
+ 0x03 extra: all_decoded All Instructions decoded
name:simd_uops_exec type:exclusive default:0x80
- 0x00 s SIMD micro-ops executed (excluding stores)
- 0x80 ar SIMD micro-ops retired (excluding stores)
+ 0x00 extra: s SIMD micro-ops executed (excluding stores)
+ 0x80 extra: ar SIMD micro-ops retired (excluding stores)
name:simd_sat_uop_exec type:bitmask default:0x00
- 0x00 s SIMD saturated arithmetic micro-ops executed
- 0x80 ar SIMD saturated arithmetic micro-ops retired
+ 0x00 extra: s SIMD saturated arithmetic micro-ops executed
+ 0x80 extra: ar SIMD saturated arithmetic micro-ops retired
name:simd_uop_type_exec type:bitmask default:0x01
- 0x01 s SIMD packed multiply microops executed
- 0x81 ar SIMD packed multiply microops retired
- 0x02 s SIMD packed shift micro-ops executed
- 0x82 ar SIMD packed shift micro-ops retired
- 0x04 s SIMD pack micro-ops executed
- 0x84 ar SIMD pack micro-ops retired
- 0x08 s SIMD unpack micro-ops executed
- 0x88 ar SIMD unpack micro-ops retired
- 0x10 s SIMD packed logical microops executed
- 0x90 ar SIMD packed logical microops retired
- 0x20 s SIMD packed arithmetic micro-ops executed
+ 0x01 extra: s SIMD packed multiply microops executed
+ 0x81 extra: ar SIMD packed multiply microops retired
+ 0x02 extra: s SIMD packed shift micro-ops executed
+ 0x82 extra: ar SIMD packed shift micro-ops retired
+ 0x04 extra: s SIMD pack micro-ops executed
+ 0x84 extra: ar SIMD pack micro-ops retired
+ 0x08 extra: s SIMD unpack micro-ops executed
+ 0x88 extra: ar SIMD unpack micro-ops retired
+ 0x10 extra: s SIMD packed logical microops executed
+ 0x90 extra: ar SIMD packed logical microops retired
+ 0x20 extra: s SIMD packed arithmetic micro-ops executed
0xA0 ar SIMD packed arithmetic micro-ops retired
name:uops_retired type:mandatory default:0x10
- 0x10 any Micro-ops retired
+ 0x10 extra: any Micro-ops retired
name:br_inst_retired type:bitmask default:0x00
- 0x00 any Retired branch instructions
- 0x01 pred_not_taken Retired branch instructions that were predicted not-taken
- 0x02 mispred_not_taken Retired branch instructions that were mispredicted not-taken
- 0x04 pred_taken Retired branch instructions that were predicted taken
- 0x08 mispred_taken Retired branch instructions that were mispredicted taken
+ 0x00 extra: any Retired branch instructions
+ 0x01 extra: pred_not_taken Retired branch instructions that were predicted not-taken
+ 0x02 extra: mispred_not_taken Retired branch instructions that were mispredicted not-taken
+ 0x04 extra: pred_taken Retired branch instructions that were predicted taken
+ 0x08 extra: mispred_taken Retired branch instructions that were mispredicted taken
0x0A mispred Retired mispredicted branch instructions (precise event)
0x0C taken Retired taken branch instructions
0x0F any1 Retired branch instructions
name:cycles_int_masked type:bitmask default:0x01
- 0x01 cycles_int_masked Cycles during which interrupts are disabled
- 0x02 cycles_int_pending_and_masked Cycles during which interrupts are pending and disabled
+ 0x01 extra: cycles_int_masked Cycles during which interrupts are disabled
+ 0x02 extra: cycles_int_pending_and_masked Cycles during which interrupts are pending and disabled
name:simd_inst_retired type:bitmask default:0x01
- 0x01 packed_single Retired Streaming SIMD Extensions (SSE) packed-single instructions
- 0x02 scalar_single Retired Streaming SIMD Extensions (SSE) scalar-single instructions
- 0x04 packed_double Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions
- 0x08 scalar_double Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions
- 0x10 vector Retired Streaming SIMD Extensions 2 (SSE2) vector instructions
+ 0x01 extra: packed_single Retired Streaming SIMD Extensions (SSE) packed-single instructions
+ 0x02 extra: scalar_single Retired Streaming SIMD Extensions (SSE) scalar-single instructions
+ 0x04 extra: packed_double Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions
+ 0x08 extra: scalar_double Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions
+ 0x10 extra: vector Retired Streaming SIMD Extensions 2 (SSE2) vector instructions
0x1F any Retired Streaming SIMD instructions
name:simd_comp_inst_retired type:bitmask default:0x01
- 0x01 packed_single Retired computational Streaming SIMD Extensions (SSE) packed-single instructions
- 0x02 scalar_single Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions
- 0x04 packed_double Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions
- 0x08 scalar_double Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions
+ 0x01 extra: packed_single Retired computational Streaming SIMD Extensions (SSE) packed-single instructions
+ 0x02 extra: scalar_single Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions
+ 0x04 extra: packed_double Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions
+ 0x08 extra: scalar_double Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions
name:mem_load_retired type:bitmask default:0x01
- 0x01 l2_hit Retired loads that hit the L2 cache (precise event)
- 0x02 l2_miss Retired loads that miss the L2 cache (precise event)
- 0x04 dtlb_miss Retired loads that miss the DTLB (precise event)
+ 0x01 extra: l2_hit Retired loads that hit the L2 cache (precise event)
+ 0x02 extra: l2_miss Retired loads that miss the L2 cache (precise event)
+ 0x04 extra: dtlb_miss Retired loads that miss the DTLB (precise event)
name:thermal_trip type:mandatory default:0xc0
- 0xc0 thermal_trip Number of thermal trips.
+ 0xc0 extra: thermal_trip Number of thermal trips.
# 18-11
name:core type:bitmask default:0x180
- 0x180 all All cores.
- 0x080 this This Core.
+ 0x180 extra: all All cores.
+ 0x080 extra: this This Core.
# 18-12
name:agent type:bitmask default:0x00
- 0x00 this This agent
- 0x40 any Include any agents
+ 0x00 extra: this This agent
+ 0x40 extra: any Include any agents
# 18-13
name:prefetch type:bitmask default:0x60
- 0x60 all All inclusive
- 0x20 hw Hardware prefetch only
- 0x00 exclude_hw Exclude hardware prefetch
+ 0x60 extra: all All inclusive
+ 0x20 extra: hw Hardware prefetch only
+ 0x00 extra: exclude_hw Exclude hardware prefetch
# 18-14
name:mesi type:bitmask default:0x0f
- 0x08 modified Counts modified state
- 0x04 exclusive Counts exclusive state
- 0x02 shared Counts shared state
- 0x01 invalid Counts invalid state
+ 0x08 extra: modified Counts modified state
+ 0x04 extra: exclusive Counts exclusive state
+ 0x02 extra: shared Counts shared state
+ 0x01 extra: invalid Counts invalid state
diff --git a/events/i386/broadwell/events b/events/i386/broadwell/events
new file mode 100644
index 0000000..ec55836
--- /dev/null
+++ b/events/i386/broadwell/events
@@ -0,0 +1,65 @@
+#
+# Intel "Broadwell" microarchitecture core events.
+#
+# See http://ark.intel.com/ for help in identifying Broadwell based CPUs
+#
+# Note that the minimum counts were not determined experimentally and could
+# likely be lowered in many cases without ill effect.
+#
+include:i386/arch_perfmon
+event:0x03 counters:cpuid um:ld_blocks minimum:100003 name:ld_blocks :
+event:0x05 counters:cpuid um:misalign_mem_ref minimum:2000003 name:misalign_mem_ref :
+event:0x07 counters:cpuid um:one minimum:100003 name:ld_blocks_partial_address_alias :
+event:0x08 counters:cpuid um:dtlb_load_misses minimum:2000003 name:dtlb_load_misses :
+event:0x0d counters:cpuid um:x03 minimum:2000003 name:int_misc_recovery_cycles :
+event:0x0e counters:cpuid um:uops_issued minimum:2000003 name:uops_issued :
+event:0x14 counters:cpuid um:one minimum:2000003 name:arith_fpu_div_active :
+event:0x24 counters:cpuid um:l2_rqsts minimum:200003 name:l2_rqsts :
+event:0x27 counters:cpuid um:x50 minimum:200003 name:l2_demand_rqsts_wb_hit :
+event:0x48 counters:2 um:l1d_pend_miss minimum:2000003 name:l1d_pend_miss :
+event:0x49 counters:cpuid um:dtlb_store_misses minimum:100003 name:dtlb_store_misses :
+event:0x4c counters:cpuid um:x02 minimum:100003 name:load_hit_pre_hw_pf :
+event:0x4f counters:cpuid um:x10 minimum:2000003 name:ept_walk_cycles :
+event:0x51 counters:cpuid um:one minimum:2000003 name:l1d_replacement :
+event:0x54 counters:cpuid um:tx_mem minimum:2000003 name:tx_mem :
+event:0x58 counters:cpuid um:move_elimination minimum:1000003 name:move_elimination :
+event:0x5c counters:cpuid um:cpl_cycles minimum:2000003 name:cpl_cycles :
+event:0x5d counters:cpuid um:tx_exec minimum:2000003 name:tx_exec :
+event:0x5e counters:cpuid um:rs_events minimum:2000003 name:rs_events :
+event:0x60 counters:cpuid um:offcore_requests_outstanding minimum:2000003 name:offcore_requests_outstanding :
+event:0x63 counters:cpuid um:lock_cycles minimum:2000003 name:lock_cycles :
+event:0x79 counters:0,1,2,3 um:idq minimum:2000003 name:idq :
+event:0x80 counters:cpuid um:x02 minimum:200003 name:icache_misses :
+event:0x85 counters:cpuid um:itlb_misses minimum:100003 name:itlb_misses :
+event:0x87 counters:cpuid um:one minimum:2000003 name:ild_stall_lcp :
+event:0x88 counters:cpuid um:br_inst_exec minimum:200003 name:br_inst_exec :
+event:0x89 counters:cpuid um:br_misp_exec minimum:200003 name:br_misp_exec :
+event:0x9c counters:0,1,2,3 um:idq_uops_not_delivered minimum:2000003 name:idq_uops_not_delivered :
+event:0xa1 counters:cpuid um:uops_executed_port minimum:2000003 name:uops_executed_port :
+event:0xa1 counters:cpuid um:uops_dispatched_port minimum:2000003 name:uops_dispatched_port :
+event:0xa2 counters:cpuid um:resource_stalls minimum:2000003 name:resource_stalls :
+event:0xa3 counters:2 um:cycle_activity minimum:2000003 name:cycle_activity :
+event:0xa8 counters:cpuid um:lsd minimum:2000003 name:lsd :
+event:0xab counters:cpuid um:x02 minimum:2000003 name:dsb2mite_switches_penalty_cycles :
+event:0xae counters:cpuid um:one minimum:100007 name:itlb_itlb_flush :
+event:0xb0 counters:cpuid um:offcore_requests minimum:100003 name:offcore_requests :
+event:0xb1 counters:cpuid um:uops_executed minimum:2000003 name:uops_executed :
+event:0xbc counters:0,1,2,3 um:page_walker_loads minimum:2000003 name:page_walker_loads :
+event:0xc0 counters:1 um:inst_retired minimum:2000003 name:inst_retired :
+event:0xc1 counters:cpuid um:other_assists minimum:100003 name:other_assists :
+event:0xc2 counters:cpuid um:uops_retired minimum:2000003 name:uops_retired :
+event:0xc3 counters:cpuid um:machine_clears minimum:2000003 name:machine_clears :
+event:0xc4 counters:cpuid um:br_inst_retired minimum:400009 name:br_inst_retired :
+event:0xc5 counters:cpuid um:br_misp_retired minimum:400009 name:br_misp_retired :
+event:0xc8 counters:cpuid um:hle_retired minimum:2000003 name:hle_retired :
+event:0xc9 counters:0,1,2,3 um:rtm_retired minimum:2000003 name:rtm_retired :
+event:0xca counters:cpuid um:fp_assist minimum:100003 name:fp_assist :
+event:0xcc counters:cpuid um:x20 minimum:2000003 name:rob_misc_events_lbr_inserts :
+event:0xd0 counters:0,1,2,3 um:mem_uops_retired minimum:2000003 name:mem_uops_retired :
+event:0xd1 counters:0,1,2,3 um:mem_load_uops_retired minimum:2000003 name:mem_load_uops_retired :
+event:0xd2 counters:0,1,2,3 um:mem_load_uops_l3_hit_retired minimum:100003 name:mem_load_uops_l3_hit_retired :
+event:0xd3 counters:0,1,2,3 um:mem_load_uops_l3_miss_retired minimum:100007 name:mem_load_uops_l3_miss_retired :
+event:0xe6 counters:cpuid um:x1f minimum:100003 name:baclears_any :
+event:0xf0 counters:cpuid um:l2_trans minimum:200003 name:l2_trans :
+event:0xf1 counters:cpuid um:l2_lines_in minimum:100003 name:l2_lines_in :
+event:0xf2 counters:cpuid um:x05 minimum:100003 name:l2_lines_out_demand_clean :
diff --git a/events/i386/broadwell/unit_masks b/events/i386/broadwell/unit_masks
new file mode 100644
index 0000000..0d6ccd5
--- /dev/null
+++ b/events/i386/broadwell/unit_masks
@@ -0,0 +1,347 @@
+#
+# Unit masks for the Intel "Broadwell" microarchitecture
+#
+# See http://ark.intel.com/ for help in identifying Broadwell based CPUs
+#
+include:i386/arch_perfmon
+name:x02 type:mandatory default:0x2
+ 0x2 No unit mask
+name:x03 type:mandatory default:0x3
+ 0x3 No unit mask
+name:x05 type:mandatory default:0x5
+ 0x5 No unit mask
+name:x10 type:mandatory default:0x10
+ 0x10 No unit mask
+name:x1f type:mandatory default:0x1f
+ 0x1f No unit mask
+name:x20 type:mandatory default:0x20
+ 0x20 No unit mask
+name:x50 type:mandatory default:0x50
+ 0x50 No unit mask
+name:ld_blocks type:exclusive default:0x2
+ 0x2 extra: store_forward This event counts how many times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when: - preceding store conflicts with the load (incomplete overlap); - store forwarding is impossible due to u-arch limitations; - preceding lock RMW operations are not forwarded; - store has the no-forward bit set (uncacheable/page-split/masked stores); - all-blocking stores are used (mostly, fences and port I/O); and others. The most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events. See the table of not supported store forwards in the Optimization Guide.
+ 0x8 extra: no_sr This event counts the number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.
+name:misalign_mem_ref type:exclusive default:0x1
+ 0x1 extra: loads This event counts speculative cache-line split load uops dispatched to the L1 cache.
+ 0x2 extra: stores This event counts speculative cache line split store-address (STA) uops dispatched to the L1 cache.
+name:dtlb_load_misses type:exclusive default:0x1
+ 0x1 extra: miss_causes_a_walk This event counts load misses in all DTLB levels that cause page walks of any page size (4K/2M/4M/1G).
+ 0x2 extra: walk_completed_4k This event counts load misses in all DTLB levels that cause a completed page walk (4K page size). The page walk can end with or without a fault.
+ 0x10 extra: walk_duration This event counts the number of cycles while PMH is busy with the page walk.
+ 0x20 extra: stlb_hit_4k Load misses that miss the DTLB and hit the STLB (4K)
+ 0xe extra: walk_completed Demand load misses in all translation lookaside buffer (TLB) levels that cause a completed page walk of any page size.
+ 0x60 extra: stlb_hit Load operations that miss the first DTLB level but hit the second and do not cause page walks
+name:uops_issued type:exclusive default:0x1
+ 0x1 extra: any This event counts the number of Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS).
+ 0x10 extra: flags_merge Number of flags-merge uops being allocated. Such uops are considered performance sensitive; added by GSR u-arch.
+ 0x20 extra: slow_lea Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate), regardless of whether it results from an LEA instruction.
+ 0x40 extra: single_mul Number of Multiply packed/scalar single precision uops allocated
+ 0x1 extra:cmask=1,inv stall_cycles This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.
+name:l2_rqsts type:exclusive default:0x21
+ 0x21 extra: demand_data_rd_miss This event counts the number of demand Data Read requests that miss L2 cache. Only loads that are not rejected are counted.
+ 0x41 extra: demand_data_rd_hit This event counts the number of demand Data Read requests that hit L2 cache. Only loads that are not rejected are counted.
+ 0x30 extra: l2_pf_miss This event counts the number of requests from the L2 hardware prefetchers that miss L2 cache.
+ 0x50 extra: l2_pf_hit This event counts the number of requests from the L2 hardware prefetchers that hit L2 cache. L3 prefetch new types
+ 0xe1 extra: all_demand_data_rd This event counts the number of demand Data Read requests (including requests from L1D hardware prefetchers). These loads may hit or miss L2 cache. Only loads that are not rejected are counted.
+ 0xe2 extra: all_rfo This event counts the total number of RFO (read for ownership) requests to L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.
+ 0xe4 extra: all_code_rd This event counts the total number of L2 code requests.
+ 0xf8 extra: all_pf This event counts the total number of requests from the L2 hardware prefetchers.
+ 0x42 extra: rfo_hit RFO requests that hit L2 cache
+ 0x22 extra: rfo_miss RFO requests that miss L2 cache
+ 0x44 extra: code_rd_hit L2 cache hits when fetching instructions, code reads.
+ 0x24 extra: code_rd_miss L2 cache misses when fetching instructions
+ 0x27 extra: all_demand_miss Demand requests that miss L2 cache
+ 0xe7 extra: all_demand_references Demand requests to L2 cache
+ 0x3f extra: miss All requests that miss L2 cache
+ 0xff extra: references All L2 requests
+name:l1d_pend_miss type:exclusive default:0x1
+ 0x1 extra: pending This event counts the duration of outstanding L1D misses, that is, in each cycle the number of Fill Buffers (FB) outstanding that are required by Demand Reads. An FB is either held by demand loads, or held by non-demand loads and hit at least once by a demand. The valid outstanding interval runs until FB deallocation and starts in one of the following ways: from FB allocation, if the FB is allocated by a demand; from the demand hit on the FB, if it is allocated by a hardware or software prefetch. Note: In the L1D, a Demand Read contains cacheable or noncacheable demand loads, including ones causing cache-line splits and reads due to page walks resulting from any request type.
+ 0x1 extra:cmask=1 pending_cycles This event counts duration of L1D miss outstanding in cycles.
+name:dtlb_store_misses type:exclusive default:0x1
+ 0x1 extra: miss_causes_a_walk This event counts store misses in all DTLB levels that cause page walks of any page size (4K/2M/4M/1G).
+ 0x2 extra: walk_completed_4k This event counts store misses in all DTLB levels that cause a completed page walk (4K page size). The page walk can end with or without a fault.
+ 0x10 extra: walk_duration This event counts the number of cycles while PMH is busy with the page walk.
+ 0x20 extra: stlb_hit_4k Store misses that miss the DTLB and hit the STLB (4K)
+ 0xe extra: walk_completed Store misses in all DTLB levels that cause completed page walks
+ 0x60 extra: stlb_hit Store operations that miss the first TLB level but hit the second and do not cause page walks
+name:tx_mem type:exclusive default:0x1
+ 0x1 extra: abort_conflict Number of times a TSX line had a cache conflict
+ 0x2 extra: abort_capacity_write Number of times a TSX Abort was triggered due to an evicted line caused by a transaction overflow
+ 0x4 extra: abort_hle_store_to_elided_lock Number of times a TSX Abort was triggered due to a non-release/commit store to lock
+ 0x8 extra: abort_hle_elision_buffer_not_empty Number of times a TSX Abort was triggered due to commit but Lock Buffer not empty
+ 0x10 extra: abort_hle_elision_buffer_mismatch Number of times a TSX Abort was triggered due to release/commit but data and address mismatch
+ 0x20 extra: abort_hle_elision_buffer_unsupported_alignment Number of times a TSX Abort was triggered due to attempting an unsupported alignment from Lock Buffer
+ 0x40 extra: hle_elision_buffer_full Number of times we could not allocate Lock Buffer
+name:move_elimination type:exclusive default:0x1
+ 0x1 extra: int_eliminated Number of integer Move Elimination candidate uops that were eliminated.
+ 0x2 extra: simd_eliminated Number of SIMD Move Elimination candidate uops that were eliminated.
+ 0x4 extra: int_not_eliminated Number of integer Move Elimination candidate uops that were not eliminated.
+ 0x8 extra: simd_not_eliminated Number of SIMD Move Elimination candidate uops that were not eliminated.
+name:cpl_cycles type:exclusive default:0x1
+ 0x1 extra: ring0 This event counts the unhalted core cycles during which the thread is in the ring 0 privileged mode.
+ 0x2 extra: ring123 This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.
+ 0x1 extra:cmask=1,edge ring0_trans This event counts transitions from ring 1, 2, or 3 to ring 0.
+name:tx_exec type:exclusive default:0x1
+ 0x1 extra: misc1 Unfriendly TSX abort triggered by a flowmarker
+ 0x2 extra: misc2 Unfriendly TSX abort triggered by a vzeroupper instruction
+ 0x4 extra: misc3 Unfriendly TSX abort triggered by a nest count that is too deep
+ 0x8 extra: misc4 RTM region detected inside HLE
+ 0x10 extra: misc5 # HLE inside HLE+
+name:rs_events type:exclusive default:0x1
+ 0x1 extra: empty_cycles This event counts cycles during which the reservation station (RS) is empty for the thread. Note: In ST mode, the inactive thread should drive 0. This is usually caused by severely costly branch mispredictions, or allocator/FE issues.
+ 0x1 extra:cmask=1,inv,edge empty_end Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.
+name:offcore_requests_outstanding type:exclusive default:0x1
+ 0x1 extra: demand_data_rd This event counts the number of offcore outstanding Demand Data Read transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor. See the corresponding Umask under OFFCORE_REQUESTS. Note: A prefetch promoted to Demand is counted from the promotion point.
+ 0x2 extra: demand_code_rd This event counts the number of offcore outstanding Code Read transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See the corresponding Umask under OFFCORE_REQUESTS.
+ 0x4 extra: demand_rfo This event counts the number of offcore outstanding RFO (store) transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.
+ 0x8 extra: all_data_rd This event counts the number of offcore outstanding cacheable Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.
+ 0x1 extra:cmask=1 cycles_with_demand_data_rd This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).
+ 0x8 extra:cmask=1 cycles_with_data_rd This event counts cycles when offcore outstanding cacheable Core Data Read transactions are present in the super queue. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.
+name:lock_cycles type:exclusive default:0x1
+ 0x1 extra: split_lock_uc_lock_duration This event counts cycles in which the L1 and L2 are locked due to a UC lock or split lock. A lock is asserted in case of locked memory access, due to noncacheable memory, locked operation that spans two cache lines, or a page walk from the noncacheable page table. L1D and L2 locks have a very high performance penalty and it is highly recommended to avoid such access.
+ 0x2 extra: cache_lock_duration This event counts the number of cycles when the L1D is locked. It is a superset of the 0x1 mask (BUS_LOCK_CLOCKS.BUS_LOCK_DURATION).
+name:idq type:exclusive default:0x2
+ 0x2 extra: empty This counts the number of cycles that the instruction decoder queue is empty and can indicate that the application may be bound in the front end. It does not determine whether there are uops being delivered to the Alloc stage since uops can be delivered by bypass skipping the Instruction Decode Queue (IDQ) when it is empty.
+ 0x4 extra: mite_uops This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may "bypass" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).
+ 0x8 extra: dsb_uops This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may "bypass" the IDQ.
+ 0x10 extra: ms_dsb_uops This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may "bypass" the IDQ.
+ 0x20 extra: ms_mite_uops This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may "bypass" the IDQ.
+ 0x30 extra: ms_uops This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may "bypass" the IDQ. Uops may be initiated by Decode Stream Buffer (DSB) or MITE.
+ 0x30 extra:cmask=1 ms_cycles This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may "bypass" the IDQ. Uops may be initiated by Decode Stream Buffer (DSB) or MITE.
+ 0x4 extra:cmask=1 mite_cycles This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may "bypass" the IDQ.
+ 0x8 extra:cmask=1 dsb_cycles This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may "bypass" the IDQ.
+ 0x10 extra:cmask=1 ms_dsb_cycles This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may "bypass" the IDQ.
+ 0x10 extra:cmask=1,edge ms_dsb_occur This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may "bypass" the IDQ.
+ 0x18 extra:cmask=4 all_dsb_cycles_4_uops This event counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may "bypass" the IDQ.
+ 0x18 extra:cmask=1 all_dsb_cycles_any_uops This event counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may "bypass" the IDQ.
+ 0x24 extra:cmask=4 all_mite_cycles_4_uops This event counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may "bypass" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).
+ 0x24 extra:cmask=1 all_mite_cycles_any_uops This event counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may "bypass" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).
+ 0x3c extra: mite_all_uops This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may "bypass" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).
+ 0x30 extra:cmask=1,edge ms_switches Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer
+name:itlb_misses type:exclusive default:0x1
+ 0x1 extra: miss_causes_a_walk This event counts misses in all ITLB levels that cause page walks of any page size (4K/2M/4M/1G).
+ 0x2 extra: walk_completed_4k This event counts misses in all ITLB levels that cause a completed page walk (4K page size). The page walk can end with or without a fault.
+ 0x10 extra: walk_duration This event counts the number of cycles while PMH is busy with the page walk.
+ 0x20 extra: stlb_hit_4k Code misses that miss the ITLB and hit the STLB (4K)
+ 0xe extra: walk_completed Misses in all ITLB levels that cause completed page walks
+ 0x60 extra: stlb_hit Operations that miss the first ITLB level but hit the second and do not cause any page walks
+name:br_inst_exec type:exclusive default:0xff
+ 0xff extra: all_branches This event counts both taken and not taken speculative and retired branch instructions.
+ 0x41 extra: nontaken_conditional This event counts not taken macro-conditional branch instructions.
+ 0x81 extra: taken_conditional This event counts taken speculative and retired macro-conditional branch instructions.
+ 0x82 extra: taken_direct_jump This event counts taken speculative and retired macro-unconditional branch instructions excluding calls and indirect branches.
+ 0x84 extra: taken_indirect_jump_non_call_ret This event counts taken speculative and retired indirect branches excluding calls and return branches.
+ 0x88 extra: taken_indirect_near_return This event counts taken speculative and retired indirect branches that have a return mnemonic.
+ 0x90 extra: taken_direct_near_call This event counts taken speculative and retired direct near calls.
+ 0xa0 extra: taken_indirect_near_call This event counts taken speculative and retired indirect calls including both register and memory indirect.
+ 0xc1 extra: all_conditional This event counts both taken and not taken speculative and retired macro-conditional branch instructions.
+ 0xc2 extra: all_direct_jmp This event counts both taken and not taken speculative and retired macro-unconditional branch instructions, excluding calls and indirects.
+ 0xc4 extra: all_indirect_jump_non_call_ret This event counts both taken and not taken speculative and retired indirect branches excluding calls and return branches.
+ 0xc8 extra: all_indirect_near_return This event counts both taken and not taken speculative and retired indirect branches that have a return mnemonic.
+ 0xd0 extra: all_direct_near_call This event counts both taken and not taken speculative and retired direct near calls.
+name:br_misp_exec type:exclusive default:0xff
+ 0xff extra: all_branches This event counts both taken and not taken speculative and retired mispredicted branch instructions.
+ 0x41 extra: nontaken_conditional This event counts not taken speculative and retired mispredicted macro conditional branch instructions.
+ 0x81 extra: taken_conditional This event counts taken speculative and retired mispredicted macro conditional branch instructions.
+ 0x84 extra: taken_indirect_jump_non_call_ret This event counts taken speculative and retired mispredicted indirect branches excluding calls and returns.
+ 0xc1 extra: all_conditional This event counts both taken and not taken speculative and retired mispredicted macro conditional branch instructions.
+ 0xc4 extra: all_indirect_jump_non_call_ret This event counts both taken and not taken mispredicted indirect branches excluding calls and returns.
+ 0xa0 extra: taken_indirect_near_call Taken speculative and retired mispredicted indirect calls
+name:idq_uops_not_delivered type:exclusive default:0x1
+ 0x1 extra: core This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread, adding "4 - x" when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when: a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread; b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); c. Instruction Decode Queue (IDQ) delivers four uops.
+ 0x1 extra:cmask=4 cycles_0_uops_deliv_core This event counts, on the per-thread basis, cycles when no uops are delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core =4.
+ 0x1 extra:cmask=3 cycles_le_1_uop_deliv_core This event counts, on the per-thread basis, cycles when at most 1 uop is delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core >= 3.
+ 0x1 extra:cmask=2 cycles_le_2_uop_deliv_core Cycles with at most 2 uops delivered by the front end
+ 0x1 extra:cmask=1 cycles_le_3_uop_deliv_core Cycles with at most 3 uops delivered by the front end
+ 0x1 extra:cmask=1,inv cycles_fe_was_ok Counts cycles FE delivered 4 uops or Resource Allocation Table (RAT) was stalling FE.
+name:uops_executed_port type:exclusive default:0x1
+ 0x1 extra:any port_0_core Cycles per core when uops are executed in port 0
+ 0x2 extra:any port_1_core Cycles per core when uops are executed in port 1
+ 0x4 extra:any port_2_core Cycles per core when uops are dispatched to port 2
+ 0x8 extra:any port_3_core Cycles per core when uops are dispatched to port 3
+ 0x10 extra:any port_4_core Cycles per core when uops are executed in port 4
+ 0x20 extra:any port_5_core Cycles per core when uops are executed in port 5
+ 0x40 extra:any port_6_core Cycles per core when uops are executed in port 6
+ 0x80 extra:any port_7_core Cycles per core when uops are dispatched to port 7
+ 0x1 extra: port_0 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.
+ 0x2 extra: port_1 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.
+ 0x4 extra: port_2 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.
+ 0x8 extra: port_3 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.
+ 0x10 extra: port_4 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.
+ 0x20 extra: port_5 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.
+ 0x40 extra: port_6 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.
+ 0x80 extra: port_7 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.
+name:uops_dispatched_port type:exclusive default:0x1
+ 0x1 extra: port_0 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.
+ 0x2 extra: port_1 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.
+ 0x4 extra: port_2 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.
+ 0x8 extra: port_3 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.
+ 0x10 extra: port_4 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.
+ 0x20 extra: port_5 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.
+ 0x40 extra: port_6 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.
+ 0x80 extra: port_7 This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.
+name:resource_stalls type:exclusive default:0x1
+ 0x1 extra: any This event counts resource-related stall cycles. Reasons for stalls can be as follows: - *any* u-arch structure got full (LB, SB, RS, ROB, BOB, LM, Physical Register Reclaim Table (PRRT), or Physical History Table (PHT) slots) - *any* u-arch structure got empty (like INT/SIMD FreeLists) - FPU control word (FPCW), MXCSR and others. This counts cycles that the pipeline backend blocked uop delivery from the front end.
+ 0x4 extra: rs This event counts stall cycles caused by absence of eligible entries in the reservation station (RS). This may result from RS overflow, or from RS deallocation because of the RS array Write Port allocation scheme (each RS entry has two write ports instead of four; as a result, empty entries could not be used, although the RS is not really full). This counts cycles that the pipeline backend blocked uop delivery from the front end.
+ 0x8 extra: sb This event counts stall cycles caused by the store buffer (SB) overflow (excluding draining from synch). This counts cycles that the pipeline backend blocked uop delivery from the front end.
+ 0x10 extra: rob This event counts ROB full stall cycles. This counts cycles that the pipeline backend blocked uop delivery from the front end.
+name:cycle_activity type:exclusive default:0x1
+ 0x1 extra:cmask=1 cycles_l2_pending Counts number of cycles the CPU has at least one pending demand* load request missing the L2 cache.
+ 0x8 extra:cmask=8 cycles_l1d_pending Counts number of cycles the CPU has at least one pending demand load request missing the L1 data cache.
+ 0x2 extra:cmask=2 cycles_ldm_pending Counts number of cycles the CPU has at least one pending demand load request (that is cycles with non-completed load waiting for its data from memory subsystem)
+ 0x4 extra:cmask=4 cycles_no_execute Counts number of cycles nothing is executed on any execution port.
+ 0x5 extra:cmask=5 stalls_l2_pending Counts number of cycles nothing is executed on any execution port, while there was at least one pending demand* load request missing the L2 cache. (as a footprint) * includes also L1 HW prefetch requests that may or may not be required by demands
+ 0x6 extra:cmask=6 stalls_ldm_pending Counts number of cycles nothing is executed on any execution port, while there was at least one pending demand load request.
+ 0xc extra:cmask=c stalls_l1d_pending Counts number of cycles nothing is executed on any execution port, while there was at least one pending demand load request missing the L1 data cache.
+ 0x8 extra:cmask=8 cycles_l1d_miss Cycles while L1 cache miss demand load is outstanding.
+ 0x1 extra:cmask=1 cycles_l2_miss Cycles while L2 cache miss demand load is outstanding.
+ 0x2 extra:cmask=2 cycles_mem_any Cycles while memory subsystem has an outstanding load.
+ 0x4 extra:cmask=4 stalls_total Total execution stalls.
+ 0xc extra:cmask=c stalls_l1d_miss Execution stalls while L1 cache miss demand load is outstanding.
+ 0x5 extra:cmask=5 stalls_l2_miss Execution stalls while L2 cache miss demand load is outstanding.
+ 0x6 extra:cmask=6 stalls_mem_any Execution stalls while memory subsystem has an outstanding load.
+name:lsd type:exclusive default:0x1
+ 0x1 extra: uops Number of Uops delivered by the LSD. Read more on LSD under LSD_REPLAY.REPLAY
+ 0x1 extra:cmask=4 cycles_4_uops Cycles in which 4 uops were delivered by the LSD rather than by the decoder
+ 0x1 extra:cmask=1 cycles_active Cycles in which uops were delivered by the LSD rather than by the decoder
+name:offcore_requests type:exclusive default:0x1
+ 0x1 extra: demand_data_rd This event counts the Demand Data Read requests sent to uncore. Use it in conjunction with OFFCORE_REQUESTS_OUTSTANDING to determine average latency in the uncore.
+ 0x2 extra: demand_code_rd This event counts both cacheable and noncacheable code read requests.
+ 0x4 extra: demand_rfo This event counts the demand RFO (read for ownership) requests including regular RFOs, locks, ItoM.
+ 0x8 extra: all_data_rd This event counts the demand and prefetch data reads. All Core Data Reads include cacheable "Demands" and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulting from any request type.
+name:uops_executed type:exclusive default:0x1
+ 0x1 extra: thread Number of uops to be executed per-thread each cycle.
+ 0x2 extra: core Number of uops executed from any thread
+ 0x1 extra:cmask=1,inv stall_cycles This event counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread.
+ 0x1 extra:cmask=1 cycles_ge_1_uop_exec Cycles where at least 1 uop was executed per-thread
+ 0x1 extra:cmask=2 cycles_ge_2_uops_exec Cycles where at least 2 uops were executed per-thread
+ 0x1 extra:cmask=3 cycles_ge_3_uops_exec Cycles where at least 3 uops were executed per-thread
+ 0x1 extra:cmask=4 cycles_ge_4_uops_exec Cycles where at least 4 uops were executed per-thread
+name:page_walker_loads type:exclusive default:0x11
+ 0x11 extra: dtlb_l1 Number of DTLB page walker hits in the L1+FB
+ 0x21 extra: itlb_l1 Number of ITLB page walker hits in the L1+FB
+ 0x12 extra: dtlb_l2 Number of DTLB page walker hits in the L2
+ 0x22 extra: itlb_l2 Number of ITLB page walker hits in the L2
+ 0x14 extra: dtlb_l3 Number of DTLB page walker hits in the L3 + XSNP
+ 0x24 extra: itlb_l3 Number of ITLB page walker hits in the L3 + XSNP
+ 0x18 extra: dtlb_memory Number of DTLB page walker hits in Memory
+name:inst_retired type:exclusive default:0x2
+ 0x2 extra: x87 This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions, counting also includes flows that have several X87 uops, or flows that use X87 uops in the exception handling.
+ 0x1 extra:pebs prec_dist This is a precise version (that is, uses PEBS) of the event that counts instructions retired.
+name:other_assists type:exclusive default:0x8
+ 0x8 extra: avx_to_sse This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.
+ 0x10 extra: sse_to_avx This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.
+ 0x40 extra: any_wb_assist Number of times any microcode assist is invoked by HW upon uop writeback.
+name:uops_retired type:exclusive default:0x1
+ 0x1 extra: all This is a non-precise version (that is, does not use PEBS) of the event that counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.
+ 0x1 extra: all_pebs Counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.
+ 0x2 extra: retire_slots This is a non-precise version (that is, does not use PEBS) of the event that counts the number of retirement slots used.
+ 0x2 extra: retire_slots_pebs Counts the number of retirement slots used.
+ 0x1 extra:cmask=1,inv stall_cycles This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.
+ 0x1 extra:cmask=a,inv total_cycles Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.
+name:machine_clears type:exclusive default:0x1
+ 0x1 extra: cycles This event counts both thread-specific (TS) and all-thread (AT) nukes.
+ 0x2 extra: memory_ordering This event counts the number of memory ordering Machine Clears detected. Memory Ordering Machine Clears can result from one of the following: 1. memory disambiguation, 2. external snoop, or 3. cross SMT-HW-thread snoop (stores) hitting load buffer.
+ 0x4 extra: smc This event counts self-modifying code (SMC) detected, which causes a machine clear.
+ 0x20 extra: maskmov Maskmov false fault - counts the number of times ucode passes through the Maskmov flow due to the instruction's mask being 0 while the flow was completed without raising a fault.
+ 0x1 extra:cmask=1,edge count Number of machine clears (nukes) of any type.
+name:br_inst_retired type:exclusive default:0x1
+ 0x1 extra: conditional This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.
+ 0x1 extra: conditional_pebs Counts conditional branch instructions retired.
+ 0x2 extra: near_call This is a non-precise version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.
+ 0x2 extra: near_call_pebs Counts both direct and indirect near call instructions retired.
+ 0x8 extra: near_return This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.
+ 0x8 extra: near_return_pebs Counts return instructions retired.
+ 0x10 extra: not_taken This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.
+ 0x20 extra: near_taken This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.
+ 0x20 extra: near_taken_pebs Counts taken branch instructions retired.
+ 0x40 extra: far_branch This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.
+ 0x4 extra:pebs all_branches_pebs This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.
+name:br_misp_retired type:exclusive default:0x1
+ 0x1 extra: conditional This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.
+ 0x1 extra: conditional_pebs Counts mispredicted conditional branch instructions retired.
+ 0x4 extra:pebs all_branches_pebs This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.
+ 0x20 extra: near_taken Number of near branch instructions retired that were mispredicted and taken.
+ 0x20 extra: near_taken_pebs Number of near branch instructions retired that were mispredicted and taken.
+name:hle_retired type:exclusive default:0x1
+ 0x1 extra: start Number of times we entered an HLE region; does not count nested transactions
+ 0x2 extra: commit Number of times HLE commit succeeded
+ 0x4 extra: aborted Number of times HLE abort was triggered
+ 0x4 extra: aborted_pebs Number of times HLE abort was triggered
+ 0x8 extra: aborted_misc1 Number of times an HLE abort was attributed to a Memory condition (See TSX_Memory event for additional details)
+ 0x10 extra: aborted_misc2 Number of times the TSX watchdog signaled an HLE abort
+ 0x20 extra: aborted_misc3 Number of times a disallowed operation caused an HLE abort
+ 0x40 extra: aborted_misc4 Number of times HLE caused a fault
+ 0x80 extra: aborted_misc5 Number of times HLE aborted and was not due to the abort conditions in subevents 3-6
+name:rtm_retired type:exclusive default:0x1
+ 0x1 extra: start Number of times we entered an RTM region; does not count nested transactions
+ 0x2 extra: commit Number of times RTM commit succeeded
+ 0x4 extra: aborted Number of times RTM abort was triggered
+ 0x4 extra: aborted_pebs Number of times RTM abort was triggered
+ 0x8 extra: aborted_misc1 Number of times an RTM abort was attributed to a Memory condition (See TSX_Memory event for additional details)
+ 0x10 extra: aborted_misc2 Number of times the TSX watchdog signaled an RTM abort
+ 0x20 extra: aborted_misc3 Number of times a disallowed operation caused an RTM abort
+ 0x40 extra: aborted_misc4 Number of times RTM caused a fault
+ 0x80 extra: aborted_misc5 Number of times RTM aborted and was not due to the abort conditions in subevents 3-6
+name:fp_assist type:exclusive default:0x1e
+ 0x1e extra:cmask=1 any This event counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.
+ 0x2 extra: x87_output This is a non-precise version (that is, does not use PEBS) of the event that counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.
+ 0x4 extra: x87_input This is a non-precise version (that is, does not use PEBS) of the event that counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.
+ 0x8 extra: simd_output This is a non-precise version (that is, does not use PEBS) of the event that counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.
+ 0x10 extra: simd_input This is a non-precise version (that is, does not use PEBS) of the event that counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.
+name:mem_uops_retired type:exclusive default:0x11
+ 0x11 extra: stlb_miss_loads This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.
+ 0x11 extra: stlb_miss_loads_pebs Counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.
+ 0x12 extra: stlb_miss_stores This is a non-precise version (that is, does not use PEBS) of the event that counts store uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.
+ 0x12 extra: stlb_miss_stores_pebs Counts store uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.
+ 0x21 extra: lock_loads This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with locked access retired to the architected path.
+ 0x21 extra: lock_loads_pebs Counts load uops with locked access retired to the architected path.
+ 0x41 extra: split_loads This is a non-precise version (that is, does not use PEBS) of the event that counts line-split load uops retired to the architected path. A line split is across a 64B cache line, which includes a page split (4K).
+ 0x41 extra: split_loads_pebs Counts line-split load uops retired to the architected path. A line split is across a 64B cache line, which includes a page split (4K).
+ 0x42 extra: split_stores This is a non-precise version (that is, does not use PEBS) of the event that counts line-split store uops retired to the architected path. A line split is across a 64B cache line, which includes a page split (4K).
+ 0x42 extra: split_stores_pebs Counts line-split store uops retired to the architected path. A line split is across a 64B cache line, which includes a page split (4K).
+ 0x81 extra: all_loads This is a non-precise version (that is, does not use PEBS) of the event that counts load uops retired to the architected path with a filter on bits 0 and 1 applied. Note: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.
+ 0x81 extra: all_loads_pebs Counts load uops retired to the architected path with a filter on bits 0 and 1 applied. Note: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.
+ 0x82 extra: all_stores This is a non-precise version (that is, does not use PEBS) of the event that counts store uops retired to the architected path with a filter on bits 0 and 1 applied. Note: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement.
+ 0x82 extra: all_stores_pebs Counts store uops retired to the architected path with a filter on bits 0 and 1 applied. Note: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement.
+name:mem_load_uops_retired type:exclusive default:0x1
+ 0x1 extra: l1_hit This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the nearest-level (L1) cache. Note: Only two data-sources of L1/FB are applicable for AVX-256bit even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source
+ 0x1 extra: l1_hit_pebs Counts retired load uops which data sources were hits in the nearest-level (L1) cache. Note: Only two data-sources of L1/FB are applicable for AVX-256bit even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source
+ 0x2 extra: l2_hit This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.
+ 0x2 extra: l2_hit_pebs Counts retired load uops which data sources were hits in the mid-level (L2) cache.
+ 0x4 extra: l3_hit This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.
+ 0x4 extra: l3_hit_pebs Counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.
+ 0x8 extra: l1_miss This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.
+ 0x8 extra: l1_miss_pebs Counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.
+ 0x10 extra: l2_miss This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.
+ 0x10 extra: l2_miss_pebs Counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.
+ 0x20 extra: l3_miss Miss in last-level (L3) cache. Excludes Unknown data-source.
+ 0x20 extra: l3_miss_pebs Miss in last-level (L3) cache. Excludes Unknown data-source.
+ 0x40 extra: hit_lfb This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops that missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready. Note: Only two data-sources of L1/FB are applicable for AVX-256bit even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.
+ 0x40 extra: hit_lfb_pebs Counts retired load uops that missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready. Note: Only two data-sources of L1/FB are applicable for AVX-256bit even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.
+name:mem_load_uops_l3_hit_retired type:exclusive default:0x1
+ 0x1 extra: xsnp_miss This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.
+ 0x1 extra: xsnp_miss_pebs Counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.
+ 0x2 extra: xsnp_hit This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.
+ 0x2 extra: xsnp_hit_pebs Counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.
+ 0x4 extra: xsnp_hitm This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).
+ 0x4 extra: xsnp_hitm_pebs Counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).
+ 0x8 extra: xsnp_none This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.
+ 0x8 extra: xsnp_none_pebs Counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.
+name:mem_load_uops_l3_miss_retired type:exclusive default:0x1
+ 0x1 extra: local_dram Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI)
+ 0x1 extra: local_dram_pebs Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI)
+name:l2_trans type:exclusive default:0x80
+ 0x80 extra: all_requests This event counts transactions that access the L2 pipe including snoops, pagewalks, and so on.
+ 0x1 extra: demand_data_rd This event counts Demand Data Read requests that access L2 cache, including rejects.
+ 0x2 extra: rfo This event counts Read for Ownership (RFO) requests that access L2 cache.
+ 0x4 extra: code_rd This event counts the number of L2 cache accesses when fetching instructions.
+ 0x8 extra: all_pf This event counts L2 or L3 HW prefetches that access L2 cache including rejects.
+ 0x10 extra: l1d_wb This event counts L1D writebacks that access L2 cache.
+ 0x20 extra: l2_fill This event counts L2 fill requests that access L2 cache.
+ 0x40 extra: l2_wb This event counts L2 writebacks that access L2 cache.
+name:l2_lines_in type:exclusive default:0x7
+ 0x7 extra: all This event counts the number of L2 cache lines filling the L2. Counting does not cover rejects.
+ 0x1 extra: i This event counts the number of L2 cache lines in the Invalidate state filling the L2. Counting does not cover rejects.
+ 0x2 extra: s This event counts the number of L2 cache lines in the Shared state filling the L2. Counting does not cover rejects.
+ 0x4 extra: e This event counts the number of L2 cache lines in the Exclusive state filling the L2. Counting does not cover rejects.
diff --git a/events/i386/core_2/unit_masks b/events/i386/core_2/unit_masks
index d528f17..6bc0960 100644
--- a/events/i386/core_2/unit_masks
+++ b/events/i386/core_2/unit_masks
@@ -50,30 +50,30 @@ name:sse_miss type:exclusive default:0x0
0x01 PREFETCHT0
0x02 PREFETCHT1/PREFETCHT2
name:load_block type:bitmask default:0x3e
- 0x02 STA Loads blocked by a preceding store with unknown address.
- 0x04 STD Loads blocked by a preceding store with unknown data.
- 0x08 OVERLAP_STORE Loads that partially overlap an earlier store, or 4K aliased with a previous store.
- 0x10 UNTIL_RETIRE Loads blocked until retirement.
- 0x20 L1D Loads blocked by the L1 data cache.
+ 0x02 extra: STA Loads blocked by a preceding store with unknown address.
+ 0x04 extra: STD Loads blocked by a preceding store with unknown data.
+ 0x08 extra: OVERLAP_STORE Loads that partially overlap an earlier store, or 4K aliased with a previous store.
+ 0x10 extra: UNTIL_RETIRE Loads blocked until retirement.
+ 0x20 extra: L1D Loads blocked by the L1 data cache.
name:store_block type:bitmask default:0x0b
- 0x01 SB_DRAIN_CYCLES Cycles while stores are blocked due to store buffer drain.
- 0x02 ORDER Cycles while store is waiting for a preceding store to be globally observed.
- 0x08 NOOP A store is blocked due to a conflict with an external or internal snoop.
+ 0x01 extra: SB_DRAIN_CYCLES Cycles while stores are blocked due to store buffer drain.
+ 0x02 extra: ORDER Cycles while store is waiting for a preceding store to be globally observed.
+ 0x08 extra: NOOP A store is blocked due to a conflict with an external or internal snoop.
name:dtlb_miss type:bitmask default:0x0f
- 0x01 ANY Memory accesses that missed the DTLB.
- 0x02 MISS_LD DTLB misses due to load operations.
- 0x04 L0_MISS_LD L0 DTLB misses due to load operations.
- 0x08 MISS_ST TLB misses due to store operations.
+ 0x01 extra: ANY Memory accesses that missed the DTLB.
+ 0x02 extra: MISS_LD DTLB misses due to load operations.
+ 0x04 extra: L0_MISS_LD L0 DTLB misses due to load operations.
+ 0x08 extra: MISS_ST TLB misses due to store operations.
name:memory_dis type:exclusive default:0x01
- 0x01 RESET Memory disambiguation reset cycles.
- 0x02 SUCCESS Number of loads that were successfully disambiguated.
+ 0x01 extra: RESET Memory disambiguation reset cycles.
+ 0x02 extra: SUCCESS Number of loads that were successfully disambiguated.
name:page_walks type:exclusive default:0x02
- 0x01 COUNT Number of page-walks executed.
- 0x02 CYCLES Duration of page-walks in core cycles.
+ 0x01 extra: COUNT Number of page-walks executed.
+ 0x02 extra: CYCLES Duration of page-walks in core cycles.
name:delayed_bypass type:exclusive default:0x00
- 0x00 FP Delayed bypass to FP operation.
- 0x01 SIMD Delayed bypass to SIMD operation.
- 0x02 LOAD Delayed bypass to load operation.
+ 0x00 extra: FP Delayed bypass to FP operation.
+ 0x01 extra: SIMD Delayed bypass to SIMD operation.
+ 0x02 extra: LOAD Delayed bypass to load operation.
name:core type:exclusive default:0x40
0xc0 All cores
0x40 This core
@@ -133,10 +133,10 @@ name:esp type:bitmask default:0x01
0x01 ESP register content synchronizations
0x02 ESP register automatic additions
name:inst_retired type:bitmask default:0x00
- 0x00 Any
- 0x01 Loads
- 0x02 Stores
- 0x04 Other
+ 0x00 extra: Any
+ 0x01 extra: Loads
+ 0x02 extra: Stores
+ 0x04 extra: Other
name:x87_ops_retired type:exclusive default:0xfe
0x01 FXCH instructions retired
0xfe Retired floating-point computational operations (precise)
@@ -183,10 +183,10 @@ name:rat_stalls type:bitmask default:0xf
0x08 FPU status word
0x0f All RAT
name:seg_regs type:bitmask default:0x0f
- 0x01 ES
- 0x02 DS
- 0x04 FS
- 0x08 GS
+ 0x01 extra: ES
+ 0x02 extra: DS
+ 0x04 extra: FS
+ 0x08 extra: GS
name:resource_stalls type:bitmask default:0x0f
0x01 when the ROB is full
0x02 during which the RS is full
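The core_2 hunks above move each unit-mask entry from `VALUE NAME DESCRIPTION` to `VALUE extra:FLAGS NAME DESCRIPTION`, where FLAGS may be empty. The following is a minimal sketch of how such an entry could be split into fields. It reflects only the layout visible in this diff, not OProfile's own parser; the function name `parse_unit_mask_entry` and the returned dictionary keys are hypothetical, and the input is the file line without the leading diff marker.

```python
import re

# Assumed layout, inferred from the entries shown in this diff:
#   <hex value> extra:<comma-separated flags, possibly empty> <name> <description>
ENTRY_RE = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s+extra:(\S*)\s+(\S+)\s*(.*)$")


def parse_unit_mask_entry(line):
    """Split one 'extra:' style unit-mask entry into its fields.

    Returns None for lines that do not use the 'extra:' layout
    (for example the older 'VALUE DESCRIPTION' entries).
    """
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    value, flags, name, description = match.groups()
    return {
        "value": int(value, 16),
        # An empty 'extra:' field yields an empty flag list.
        "flags": [flag for flag in flags.split(",") if flag],
        "name": name,
        "description": description.strip(),
    }
```

An entry with an empty `extra:` field yields an empty flag list, and a line in the older layout yields `None`, so a caller can tell converted entries from unconverted ones.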
diff --git a/events/i386/haswell/events b/events/i386/haswell/events
new file mode 100644
index 0000000..5aa5eb5
--- /dev/null
+++ b/events/i386/haswell/events
@@ -0,0 +1,64 @@
+#
+# Intel "Haswell" microarchitecture core events.
+#
+# See http://ark.intel.com/ for help in identifying Haswell based CPUs
+#
+# Note the minimum counts were not determined experimentally and could likely
+# be lowered in many cases without ill effect.
+#
+include:i386/arch_perfmon
+event:0x03 counters:cpuid um:ld_blocks minimum:100003 name:ld_blocks :
+event:0x05 counters:cpuid um:misalign_mem_ref minimum:2000003 name:misalign_mem_ref :
+event:0x07 counters:cpuid um:one minimum:100003 name:ld_blocks_partial_address_alias :
+event:0x08 counters:cpuid um:dtlb_load_misses minimum:2000003 name:dtlb_load_misses :
+event:0x0d counters:cpuid um:x03 minimum:2000003 name:int_misc_recovery_cycles :
+event:0x0e counters:cpuid um:uops_issued minimum:2000003 name:uops_issued :
+event:0x24 counters:cpuid um:l2_rqsts minimum:200003 name:l2_rqsts :
+event:0x27 counters:cpuid um:x50 minimum:200003 name:l2_demand_rqsts_wb_hit :
+event:0x48 counters:2 um:l1d_pend_miss minimum:2000003 name:l1d_pend_miss :
+event:0x49 counters:cpuid um:dtlb_store_misses minimum:100003 name:dtlb_store_misses :
+event:0x4c counters:cpuid um:load_hit_pre minimum:100003 name:load_hit_pre :
+event:0x4f counters:cpuid um:x10 minimum:2000003 name:ept_walk_cycles :
+event:0x51 counters:cpuid um:one minimum:2000003 name:l1d_replacement :
+event:0x54 counters:cpuid um:tx_mem minimum:2000003 name:tx_mem :
+event:0x58 counters:cpuid um:move_elimination minimum:1000003 name:move_elimination :
+event:0x5c counters:cpuid um:cpl_cycles minimum:2000003 name:cpl_cycles :
+event:0x5d counters:cpuid um:tx_exec minimum:2000003 name:tx_exec :
+event:0x5e counters:cpuid um:rs_events minimum:2000003 name:rs_events :
+event:0x60 counters:cpuid um:offcore_requests_outstanding minimum:2000003 name:offcore_requests_outstanding :
+event:0x63 counters:cpuid um:lock_cycles minimum:2000003 name:lock_cycles :
+event:0x79 counters:0,1,2,3 um:idq minimum:2000003 name:idq :
+event:0x80 counters:cpuid um:icache minimum:2000003 name:icache :
+event:0x85 counters:cpuid um:itlb_misses minimum:100003 name:itlb_misses :
+event:0x87 counters:cpuid um:ild_stall minimum:2000003 name:ild_stall :
+event:0x88 counters:cpuid um:br_inst_exec minimum:200003 name:br_inst_exec :
+event:0x89 counters:cpuid um:br_misp_exec minimum:200003 name:br_misp_exec :
+event:0x9c counters:0,1,2,3 um:idq_uops_not_delivered minimum:2000003 name:idq_uops_not_delivered :
+event:0xa1 counters:cpuid um:uops_executed_port minimum:2000003 name:uops_executed_port :
+event:0xa2 counters:cpuid um:resource_stalls minimum:2000003 name:resource_stalls :
+event:0xa3 counters:2 um:cycle_activity minimum:2000003 name:cycle_activity :
+event:0xa8 counters:cpuid um:one minimum:2000003 name:lsd_uops :
+event:0xab counters:cpuid um:x02 minimum:2000003 name:dsb2mite_switches_penalty_cycles :
+event:0xae counters:cpuid um:one minimum:100007 name:itlb_itlb_flush :
+event:0xb0 counters:cpuid um:offcore_requests minimum:100003 name:offcore_requests :
+event:0xb1 counters:cpuid um:uops_executed minimum:2000003 name:uops_executed :
+event:0xbc counters:0,1,2,3 um:page_walker_loads minimum:2000003 name:page_walker_loads :
+event:0xbd counters:cpuid um:tlb_flush minimum:100007 name:tlb_flush :
+event:0xc0 counters:1 um:one minimum:2000003 name:inst_retired_prec_dist :
+event:0xc1 counters:cpuid um:other_assists minimum:100003 name:other_assists :
+event:0xc2 counters:cpuid um:uops_retired minimum:2000003 name:uops_retired :
+event:0xc3 counters:cpuid um:machine_clears minimum:2000003 name:machine_clears :
+event:0xc4 counters:cpuid um:br_inst_retired minimum:400009 name:br_inst_retired :
+event:0xc5 counters:cpuid um:br_misp_retired minimum:400009 name:br_misp_retired :
+event:0xc8 counters:cpuid um:hle_retired minimum:2000003 name:hle_retired :
+event:0xc9 counters:0,1,2,3 um:rtm_retired minimum:2000003 name:rtm_retired :
+event:0xca counters:cpuid um:fp_assist minimum:100003 name:fp_assist :
+event:0xcc counters:cpuid um:x20 minimum:2000003 name:rob_misc_events_lbr_inserts :
+event:0xd0 counters:0,1,2,3 um:mem_uops_retired minimum:2000003 name:mem_uops_retired :
+event:0xd1 counters:0,1,2,3 um:mem_load_uops_retired minimum:2000003 name:mem_load_uops_retired :
+event:0xd2 counters:0,1,2,3 um:mem_load_uops_l3_hit_retired minimum:100003 name:mem_load_uops_l3_hit_retired :
+event:0xd3 counters:0,1,2,3 um:mem_load_uops_l3_miss_retired minimum:100007 name:mem_load_uops_l3_miss_retired :
+event:0xe6 counters:cpuid um:x1f minimum:100003 name:baclears_any :
+event:0xf0 counters:cpuid um:l2_trans minimum:200003 name:l2_trans :
+event:0xf1 counters:cpuid um:l2_lines_in minimum:100003 name:l2_lines_in :
+event:0xf2 counters:cpuid um:l2_lines_out minimum:100003 name:l2_lines_out :
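Each entry in the new haswell events file is a run of `key:value` tokens followed by a trailing ` :` separator. As an illustration of that layout only, here is a minimal sketch that turns one such line into a dictionary. It is not OProfile's parser; the function name `parse_event_line` is hypothetical, treating anything after the separator as a description is an assumption, and the input is the file line without the leading diff marker.

```python
def parse_event_line(line):
    """Parse one 'event:... counters:... um:... minimum:... name:... :' line.

    Returns None for blank lines, comments, and 'include:' directives.
    """
    line = line.strip()
    if not line or line.startswith("#") or line.startswith("include:"):
        return None
    # The key:value tokens contain no ' :' sequence, so the first ' :'
    # is the separator that precedes the (here empty) description.
    fields_part, _, description = line.partition(" :")
    fields = dict(token.split(":", 1) for token in fields_part.split())
    counters = fields["counters"]
    return {
        "event": int(fields["event"], 16),
        # 'cpuid' is kept as a string; explicit lists become integers.
        "counters": counters if counters == "cpuid"
        else [int(c) for c in counters.split(",")],
        "um": fields["um"],
        "minimum": int(fields["minimum"]),
        "name": fields["name"],
        "description": description.strip(),
    }
```

The `counters` value stays the string `cpuid` when that keyword is used and becomes a list of integers otherwise, so callers can distinguish the two cases seen in this file.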
diff --git a/events/i386/haswell/unit_masks b/events/i386/haswell/unit_masks
new file mode 100644
index 0000000..60c2a61
--- /dev/null
+++ b/events/i386/haswell/unit_masks
@@ -0,0 +1,355 @@
+#
+# Unit masks for the Intel "Haswell" microarchitecture
+#
+# See http://ark.intel.com/ for help in identifying Haswell based CPUs
+#
+include:i386/arch_perfmon
+name:x02 type:mandatory default:0x2
+ 0x2 No unit mask
+name:x03 type:mandatory default:0x3
+ 0x3 No unit mask
+name:x10 type:mandatory default:0x10
+ 0x10 No unit mask
+name:x1f type:mandatory default:0x1f
+ 0x1f No unit mask
+name:x20 type:mandatory default:0x20
+ 0x20 No unit mask
+name:x50 type:mandatory default:0x50
+ 0x50 No unit mask
+name:ld_blocks type:exclusive default:0x2
+ 0x2 extra: store_forward This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load. The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceding smaller uncompleted store. The penalty for blocked store forwarding is that the load must wait for the store to write its value to the cache before it can be issued.
+ 0x8 extra: no_sr The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use
+name:misalign_mem_ref type:exclusive default:0x1
+ 0x1 extra: loads Speculative cache line split load uops dispatched to L1 cache
+ 0x2 extra: stores Speculative cache line split STA uops dispatched to L1 cache
+name:dtlb_load_misses type:exclusive default:0x1
+ 0x1 extra: miss_causes_a_walk Load misses in all DTLB levels that cause page walks
+ 0x2 extra: walk_completed_4k Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (4K).
+ 0x4 extra: walk_completed_2m_4m Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (2M/4M).
+ 0x10 extra: walk_duration This event counts cycles when the page miss handler (PMH) is servicing page walks caused by DTLB load misses.
+ 0x20 extra: stlb_hit_4k This event counts load operations from a 4K page that miss the first DTLB level but hit the second and do not cause page walks.
+ 0x40 extra: stlb_hit_2m This event counts load operations from a 2M page that miss the first DTLB level but hit the second and do not cause page walks.
+ 0x80 extra: pde_cache_miss DTLB demand load misses with low part of linear-to-physical address translation missed
+ 0xe extra: walk_completed Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.
+ 0x60 extra: stlb_hit Load operations that miss the first DTLB level but hit the second and do not cause page walks
+name:uops_issued type:exclusive default:0x1
+ 0x1 extra: any This event counts the number of uops issued by the Front-end of the pipeline to the Back-end. This event is counted at the allocation stage and will count both retired and non-retired uops.
+ 0x10 extra: flags_merge Number of flags-merge uops being allocated. Such uops are considered performance sensitive; added by GSR u-arch.
+ 0x20 extra: slow_lea Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate), regardless of whether it results from an LEA instruction.
+ 0x40 extra: single_mul Number of Multiply packed/scalar single precision uops allocated
+ 0x1 extra:cmask=1,inv stall_cycles Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread
+ 0x1 extra:cmask=1,inv,any core_stall_cycles Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads
+name:l2_rqsts type:exclusive default:0x21
+ 0x21 extra: demand_data_rd_miss Demand Data Read miss L2, no rejects
+ 0x41 extra: demand_data_rd_hit Demand Data Read requests that hit L2 cache
+ 0x30 extra: l2_pf_miss L2 prefetch requests that miss L2 cache
+ 0x50 extra: l2_pf_hit L2 prefetch requests that hit L2 cache
+ 0xe1 extra: all_demand_data_rd Demand Data Read requests
+ 0xe2 extra: all_rfo RFO requests to L2 cache
+ 0xe4 extra: all_code_rd L2 code requests
+ 0xf8 extra: all_pf Requests from L2 hardware prefetchers
+ 0x42 extra: rfo_hit RFO requests that hit L2 cache
+ 0x22 extra: rfo_miss RFO requests that miss L2 cache
+ 0x44 extra: code_rd_hit L2 cache hits when fetching instructions, code reads.
+ 0x24 extra: code_rd_miss L2 cache misses when fetching instructions
+ 0x27 extra: all_demand_miss Demand requests that miss L2 cache
+ 0xe7 extra: all_demand_references Demand requests to L2 cache
+ 0x3f extra: miss All requests that miss L2 cache
+ 0xff extra: references All L2 requests
+name:l1d_pend_miss type:exclusive default:0x1
+ 0x1 extra: pending L1D miss outstanding duration in cycles
+ 0x1 extra:cmask=1 pending_cycles Cycles with L1D load Misses outstanding.
+name:dtlb_store_misses type:exclusive default:0x1
+ 0x1 extra: miss_causes_a_walk Store misses in all DTLB levels that cause page walks
+ 0x2 extra: walk_completed_4k Store miss in all TLB levels causes a page walk that completes. (4K)
+ 0x4 extra: walk_completed_2m_4m Store misses in all DTLB levels that cause completed page walks (2M/4M)
+ 0x10 extra: walk_duration This event counts cycles when the page miss handler (PMH) is servicing page walks caused by DTLB store misses.
+ 0x20 extra: stlb_hit_4k This event counts store operations from a 4K page that miss the first DTLB level but hit the second and do not cause page walks.
+ 0x40 extra: stlb_hit_2m This event counts store operations from a 2M page that miss the first DTLB level but hit the second and do not cause page walks.
+ 0x80 extra: pde_cache_miss DTLB store misses with low part of linear-to-physical address translation missed
+ 0xe extra: walk_completed Store misses in all DTLB levels that cause completed page walks
+ 0x60 extra: stlb_hit Store operations that miss the first TLB level but hit the second and do not cause page walks
+name:load_hit_pre type:exclusive default:0x1
+ 0x1 extra: sw_pf Non-software-prefetch load dispatches that hit FB allocated for software prefetch
+ 0x2 extra: hw_pf Non-software-prefetch load dispatches that hit FB allocated for hardware prefetch
+name:tx_mem type:exclusive default:0x1
+ 0x1 extra: abort_conflict Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address
+ 0x2 extra: abort_capacity_write Number of times a transactional abort was signaled due to a data capacity limitation for transactional writes.
+ 0x4 extra: abort_hle_store_to_elided_lock Number of times an HLE transactional region aborted due to a non-XRELEASE-prefixed instruction writing to an elided lock in the elision buffer
+ 0x8 extra: abort_hle_elision_buffer_not_empty Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero.
+ 0x10 extra: abort_hle_elision_buffer_mismatch Number of times an HLE transactional execution aborted due to XRELEASE lock not satisfying the address and value requirements in the elision buffer
+ 0x20 extra: abort_hle_elision_buffer_unsupported_alignment Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer.
+ 0x40 extra: hle_elision_buffer_full Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero.
+name:move_elimination type:exclusive default:0x1
+ 0x1 extra: int_eliminated Number of integer Move Elimination candidate uops that were eliminated.
+ 0x2 extra: simd_eliminated Number of SIMD Move Elimination candidate uops that were eliminated.
+ 0x4 extra: int_not_eliminated Number of integer Move Elimination candidate uops that were not eliminated.
+ 0x8 extra: simd_not_eliminated Number of SIMD Move Elimination candidate uops that were not eliminated.
+name:cpl_cycles type:exclusive default:0x1
+ 0x1 extra: ring0 Unhalted core cycles when the thread is in ring 0
+ 0x2 extra: ring123 Unhalted core cycles when thread is in rings 1, 2, or 3
+ 0x1 extra:cmask=1,edge ring0_trans Number of intervals between processor halts while thread is in ring 0
+name:tx_exec type:exclusive default:0x1
+ 0x1 extra: misc1 Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort.
+ 0x2 extra: misc2 Counts the number of times a class of instructions (e.g., vzeroupper) that may cause a transactional abort was executed inside a transactional region
+ 0x4 extra: misc3 Counts the number of times an instruction execution caused the transactional nest count supported to be exceeded
+ 0x8 extra: misc4 Counts the number of times an XBEGIN instruction was executed inside an HLE transactional region.
+ 0x10 extra: misc5 Counts the number of times an HLE XACQUIRE instruction was executed inside an RTM transactional region
+name:rs_events type:exclusive default:0x1
+ 0x1 extra: empty_cycles This event counts cycles when the Reservation Station (RS) is empty for the thread. The RS is a structure that buffers allocated micro-ops from the Front-end. If there are many cycles when the RS is empty, it may represent an underflow of instructions delivered from the Front-end.
+ 0x1 extra:cmask=1,inv,edge empty_end Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.
+name:offcore_requests_outstanding type:exclusive default:0x1
+ 0x1 extra: demand_data_rd Offcore outstanding Demand Data Read transactions in uncore queue.
+ 0x2 extra: demand_code_rd Offcore outstanding code read transactions in SuperQueue (SQ), queue to uncore, every cycle
+ 0x4 extra: demand_rfo Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore
+ 0x8 extra: all_data_rd Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore
+ 0x1 extra:cmask=1 cycles_with_demand_data_rd Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore
+ 0x8 extra:cmask=1 cycles_with_data_rd Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore
+name:lock_cycles type:exclusive default:0x1
+ 0x1 extra: split_lock_uc_lock_duration Cycles when L1 and L2 are locked due to UC or split lock
+ 0x2 extra: cache_lock_duration Cycles when L1D is locked
+name:idq type:exclusive default:0x2
+ 0x2 extra: empty Instruction Decode Queue (IDQ) empty cycles
+ 0x4 extra: mite_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
+ 0x8 extra: dsb_uops Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path
+ 0x10 extra: ms_dsb_uops Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
+ 0x20 extra: ms_mite_uops Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
+ 0x30 extra: ms_uops This event counts uops delivered by the Front-end with the assistance of the microcode sequencer. Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder. Using other instructions, if possible, will usually improve performance.
+ 0x30 extra:cmask=1 ms_cycles This event counts cycles during which the microcode sequencer assisted the Front-end in delivering uops. Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder. Using other instructions, if possible, will usually improve performance.
+ 0x4 extra:cmask=1 mite_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path
+ 0x8 extra:cmask=1 dsb_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path
+ 0x10 extra:cmask=1 ms_dsb_cycles Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
+ 0x10 extra:cmask=1,edge ms_dsb_occur Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequencer (MS) is busy
+ 0x18 extra:cmask=4 all_dsb_cycles_4_uops Cycles Decode Stream Buffer (DSB) is delivering 4 Uops
+ 0x18 extra:cmask=1 all_dsb_cycles_any_uops Cycles Decode Stream Buffer (DSB) is delivering any Uop
+ 0x24 extra:cmask=4 all_mite_cycles_4_uops Cycles MITE is delivering 4 Uops
+ 0x24 extra:cmask=1 all_mite_cycles_any_uops Cycles MITE is delivering any Uop
+ 0x3c extra: mite_all_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
+ 0x30 extra:cmask=1,edge ms_switches Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer
+name:icache type:exclusive default:0x2
+ 0x2 extra: misses This event counts Instruction Cache (ICACHE) misses.
+ 0x4 extra: ifetch_stall Cycles where a code-fetch stalled due to L1 instruction-cache miss or an iTLB miss
+name:itlb_misses type:exclusive default:0x1
+ 0x1 extra: miss_causes_a_walk Misses at all ITLB levels that cause page walks
+ 0x2 extra: walk_completed_4k Code miss in all TLB levels causes a page walk that completes. (4K)
+ 0x4 extra: walk_completed_2m_4m Code miss in all TLB levels causes a page walk that completes. (2M/4M)
+ 0x10 extra: walk_duration This event counts cycles when the page miss handler (PMH) is servicing page walks caused by ITLB misses.
+ 0x20 extra: stlb_hit_4k Code misses that miss the DTLB and hit the STLB (4K)
+ 0x40 extra: stlb_hit_2m Code misses that miss the DTLB and hit the STLB (2M)
+ 0xe extra: walk_completed Misses in all ITLB levels that cause completed page walks
+ 0x60 extra: stlb_hit Operations that miss the first ITLB level but hit the second and do not cause any page walks
+name:ild_stall type:exclusive default:0x1
+ 0x1 extra: lcp This event counts cycles where the decoder is stalled on an instruction with a length changing prefix (LCP).
+ 0x4 extra: iq_full Stall cycles because IQ is full
+name:br_inst_exec type:exclusive default:0xff
+ 0xff extra: all_branches Speculative and retired branches
+ 0x41 extra: nontaken_conditional Not taken macro-conditional branches
+ 0x81 extra: taken_conditional Taken speculative and retired macro-conditional branches
+ 0x82 extra: taken_direct_jump Taken speculative and retired macro-unconditional branch instructions excluding calls and indirects
+ 0x84 extra: taken_indirect_jump_non_call_ret Taken speculative and retired indirect branches excluding calls and returns
+ 0x88 extra: taken_indirect_near_return Taken speculative and retired indirect branches with return mnemonic
+ 0x90 extra: taken_direct_near_call Taken speculative and retired direct near calls
+ 0xa0 extra: taken_indirect_near_call Taken speculative and retired indirect calls
+ 0xc1 extra: all_conditional Speculative and retired macro-conditional branches
+ 0xc2 extra: all_direct_jmp Speculative and retired macro-unconditional branches excluding calls and indirects
+ 0xc4 extra: all_indirect_jump_non_call_ret Speculative and retired indirect branches excluding calls and returns
+ 0xc8 extra: all_indirect_near_return Speculative and retired indirect return branches.
+ 0xd0 extra: all_direct_near_call Speculative and retired direct near calls
+name:br_misp_exec type:exclusive default:0xff
+ 0xff extra: all_branches Speculative and retired mispredicted branches
+ 0x41 extra: nontaken_conditional Not taken speculative and retired mispredicted macro conditional branches
+ 0x81 extra: taken_conditional Taken speculative and retired mispredicted macro conditional branches
+ 0x84 extra: taken_indirect_jump_non_call_ret Taken speculative and retired mispredicted indirect branches excluding calls and returns
+ 0x88 extra: taken_return_near Taken speculative and retired mispredicted indirect branches with return mnemonic
+ 0xc1 extra: all_conditional Speculative and retired mispredicted macro conditional branches
+ 0xc4 extra: all_indirect_jump_non_call_ret Mispredicted indirect branches excluding calls and returns
+ 0xa0 extra: taken_indirect_near_call Taken speculative and retired mispredicted indirect calls
+name:idq_uops_not_delivered type:exclusive default:0x1
+ 0x1 extra: core This event counts the number of undelivered (unallocated) uops from the Front-end to the Resource Allocation Table (RAT) while the Back-end of the processor is not stalled. The Front-end can allocate up to 4 uops per cycle so this event can increment 0-4 times per cycle depending on the number of unallocated uops. This event is counted on a per-core basis.
+ 0x1 extra:cmask=4 cycles_0_uops_deliv_core This event counts the number of cycles during which the Front-end allocated exactly zero uops to the Resource Allocation Table (RAT) while the Back-end of the processor is not stalled. This event is counted on a per-core basis.
+ 0x1 extra:cmask=3 cycles_le_1_uop_deliv_core Cycles per thread when 3 or more uops are not delivered to Resource Allocation Table (RAT) when backend of the machine is not stalled
+ 0x1 extra:cmask=2 cycles_le_2_uop_deliv_core Cycles with at most 2 uops delivered by the front end.
+ 0x1 extra:cmask=1 cycles_le_3_uop_deliv_core Cycles with at most 3 uops delivered by the front end.
+ 0x1 extra:cmask=1,inv cycles_fe_was_ok Counts cycles FE delivered 4 uops or Resource Allocation Table (RAT) was stalling FE.
+name:uops_executed_port type:exclusive default:0x1
+ 0x1 extra: port_0 Cycles per thread when uops are executed in port 0
+ 0x2 extra: port_1 Cycles per thread when uops are executed in port 1
+ 0x4 extra: port_2 Cycles per thread when uops are executed in port 2
+ 0x8 extra: port_3 Cycles per thread when uops are executed in port 3
+ 0x10 extra: port_4 Cycles per thread when uops are executed in port 4
+ 0x20 extra: port_5 Cycles per thread when uops are executed in port 5
+ 0x40 extra: port_6 Cycles per thread when uops are executed in port 6
+ 0x80 extra: port_7 Cycles per thread when uops are executed in port 7
+ 0x1 extra:any port_0_core Cycles per core when uops are executed in port 0
+ 0x2 extra:any port_1_core Cycles per core when uops are executed in port 1
+ 0x4 extra:any port_2_core Cycles per core when uops are dispatched to port 2
+ 0x8 extra:any port_3_core Cycles per core when uops are dispatched to port 3
+ 0x10 extra:any port_4_core Cycles per core when uops are executed in port 4
+ 0x20 extra:any port_5_core Cycles per core when uops are executed in port 5
+ 0x40 extra:any port_6_core Cycles per core when uops are executed in port 6
+ 0x80 extra:any port_7_core Cycles per core when uops are dispatched to port 7
+name:resource_stalls type:exclusive default:0x1
+ 0x1 extra: any Resource-related stall cycles
+ 0x4 extra: rs Cycles stalled due to no eligible RS entry available.
+ 0x8 extra: sb This event counts cycles during which no instructions were allocated because no Store Buffers (SB) were available.
+ 0x10 extra: rob Cycles stalled due to re-order buffer full.
+name:cycle_activity type:exclusive default:0x1
+ 0x1 extra:cmask=1 cycles_l2_pending Cycles with pending L2 cache miss loads.
+ 0x8 extra:cmask=8 cycles_l1d_pending Cycles with pending L1 cache miss loads.
+ 0x2 extra:cmask=2 cycles_ldm_pending Cycles with pending memory loads.
+ 0x4 extra:cmask=4 cycles_no_execute This event counts cycles during which no instructions were executed in the execution stage of the pipeline.
+ 0x5 extra:cmask=5 stalls_l2_pending Execution stalls due to L2 cache misses.
+ 0x6 extra:cmask=6 stalls_ldm_pending This event counts cycles during which no instructions were executed in the execution stage of the pipeline and there were memory instructions pending (waiting for data).
+ 0xc extra:cmask=c stalls_l1d_pending Execution stalls due to L1 data cache misses
+name:offcore_requests type:exclusive default:0x1
+ 0x1 extra: demand_data_rd Demand Data Read requests sent to uncore
+ 0x2 extra: demand_code_rd Cacheable and noncacheable code read requests
+ 0x4 extra: demand_rfo Demand RFO requests including regular RFOs, locks, ItoM
+ 0x8 extra: all_data_rd Demand and prefetch data reads
+name:uops_executed type:exclusive default:0x2
+ 0x2 extra: core Number of uops executed on the core. Errata: HSM31
+ 0x1 extra:cmask=1,inv stall_cycles Counts number of cycles no uops were dispatched to be executed on this thread.
+ 0x1 extra:cmask=1 cycles_ge_1_uops_exec This event counts the cycles where at least one uop was executed. It is counted per thread. Errata: HSM31
+ 0x1 extra:cmask=2 cycles_ge_2_uops_exec This event counts the cycles where at least two uops were executed. It is counted per thread. Errata: HSM31
+ 0x1 extra:cmask=3 cycles_ge_3_uops_exec This event counts the cycles where at least three uops were executed. It is counted per thread. Errata: HSM31
+ 0x1 extra:cmask=4 cycles_ge_4_uops_exec Cycles where at least 4 uops were executed per thread. Errata: HSM31
+name:page_walker_loads type:exclusive default:0x11
+ 0x11 extra: dtlb_l1 Number of DTLB page walker hits in the L1+FB
+ 0x21 extra: itlb_l1 Number of ITLB page walker hits in the L1+FB
+ 0x41 extra: ept_dtlb_l1 Counts the number of Extended Page Table walks from the DTLB that hit in the L1 and FB.
+ 0x81 extra: ept_itlb_l1 Counts the number of Extended Page Table walks from the ITLB that hit in the L1 and FB.
+ 0x12 extra: dtlb_l2 Number of DTLB page walker hits in the L2
+ 0x22 extra: itlb_l2 Number of ITLB page walker hits in the L2
+ 0x42 extra: ept_dtlb_l2 Counts the number of Extended Page Table walks from the DTLB that hit in the L2.
+ 0x82 extra: ept_itlb_l2 Counts the number of Extended Page Table walks from the ITLB that hit in the L2.
+ 0x14 extra: dtlb_l3 Number of DTLB page walker hits in the L3 + XSNP
+ 0x24 extra: itlb_l3 Number of ITLB page walker hits in the L3 + XSNP
+ 0x44 extra: ept_dtlb_l3 Counts the number of Extended Page Table walks from the DTLB that hit in the L3.
+ 0x84 extra: ept_itlb_l3 Counts the number of Extended Page Table walks from the ITLB that hit in the L3.
+ 0x18 extra: dtlb_memory Number of DTLB page walker hits in Memory
+ 0x48 extra: ept_dtlb_memory Counts the number of Extended Page Table walks from the DTLB that hit in memory.
+ 0x88 extra: ept_itlb_memory Counts the number of Extended Page Table walks from the ITLB that hit in memory.
+name:tlb_flush type:exclusive default:0x1
+ 0x1 extra: dtlb_thread DTLB flush attempts of the thread-specific entries
+ 0x20 extra: stlb_any STLB flush attempts
+name:other_assists type:exclusive default:0x8
+ 0x8 extra: avx_to_sse Number of transitions from AVX-256 to legacy SSE when penalty applicable. Errata: HSM57
+ 0x10 extra: sse_to_avx Number of transitions from SSE to AVX-256 when penalty applicable. Errata: HSM57
+ 0x40 extra: any_wb_assist Number of times any microcode assist is invoked by HW upon uop writeback.
+name:uops_retired type:exclusive default:0x1
+ 0x1 extra: all Actually retired uops.
+ 0x1 extra: all_pebs Actually retired uops.
+ 0x2 extra: retire_slots This event counts the number of retirement slots used each cycle. There are potentially 4 slots that can be used each cycle - meaning, 4 uops or 4 instructions could retire each cycle.
+ 0x2 extra: retire_slots_pebs This event counts the number of retirement slots used each cycle. There are potentially 4 slots that can be used each cycle - meaning, 4 uops or 4 instructions could retire each cycle.
+ 0x1 extra:cmask=1,inv stall_cycles Cycles without actually retired uops.
+ 0x1 extra:cmask=a,inv total_cycles Cycles with less than 10 actually retired uops.
+ 0x1 extra:cmask=1,inv core_stall_cycles Cycles without actually retired uops.
+name:machine_clears type:exclusive default:0x1
+ 0x1 extra: cycles Cycles in which there was a Nuke. Accounts for both thread-specific and all-thread Nukes.
+ 0x2 extra: memory_ordering This event counts the number of memory ordering machine clears detected. Memory ordering machine clears can result from memory address aliasing or snoops from another hardware thread or core to data inflight in the pipeline. Machine clears can have a significant performance impact if they are happening frequently.
+ 0x4 extra: smc This event is incremented when self-modifying code (SMC) is detected, which causes a machine clear. Machine clears can have a significant performance impact if they are happening frequently.
+ 0x20 extra: maskmov This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.
+ 0x1 extra:cmask=1,edge count Number of machine clears (nukes) of any type.
+name:br_inst_retired type:exclusive default:0x1
+ 0x1 extra: conditional Conditional branch instructions retired.
+ 0x1 extra: conditional_pebs Conditional branch instructions retired.
+ 0x2 extra: near_call Direct and indirect near call instructions retired.
+ 0x2 extra: near_call_pebs Direct and indirect near call instructions retired.
+ 0x8 extra: near_return Return instructions retired.
+ 0x8 extra: near_return_pebs Return instructions retired.
+ 0x10 extra: not_taken Not taken branch instructions retired.
+ 0x20 extra: near_taken Taken branch instructions retired.
+ 0x20 extra: near_taken_pebs Taken branch instructions retired.
+ 0x40 extra: far_branch Far branch instructions retired.
+ 0x4 extra:pebs all_branches_pebs All (macro) branch instructions retired.
+name:br_misp_retired type:exclusive default:0x1
+ 0x1 extra: conditional Mispredicted conditional branch instructions retired.
+ 0x1 extra: conditional_pebs Mispredicted conditional branch instructions retired.
+ 0x4 extra:pebs all_branches_pebs This event counts all mispredicted branch instructions retired. This is a precise event.
+ 0x20 extra: near_taken Number of near branch instructions retired that were mispredicted and taken.
+ 0x20 extra: near_taken_pebs Number of near branch instructions retired that were mispredicted and taken.
+name:hle_retired type:exclusive default:0x1
+ 0x1 extra: start Number of times an HLE execution started.
+ 0x2 extra: commit Number of times an HLE execution successfully committed
+ 0x4 extra: aborted Number of times an HLE execution aborted due to any reasons (multiple categories may count as one).
+ 0x4 extra: aborted_pebs Number of times an HLE execution aborted due to any reasons (multiple categories may count as one).
+ 0x8 extra: aborted_misc1 Number of times an HLE execution aborted due to various memory events (e.g., read/write capacity and conflicts).
+ 0x10 extra: aborted_misc2 Number of times an HLE execution aborted due to uncommon conditions
+ 0x20 extra: aborted_misc3 Number of times an HLE execution aborted due to HLE-unfriendly instructions
+ 0x40 extra: aborted_misc4 Number of times an HLE execution aborted due to incompatible memory type
+ 0x80 extra: aborted_misc5 Number of times an HLE execution aborted due to none of the previous 4 categories (e.g. interrupts)
+name:rtm_retired type:exclusive default:0x1
+ 0x1 extra: start Number of times an RTM execution started.
+ 0x2 extra: commit Number of times an RTM execution successfully committed
+ 0x4 extra: aborted Number of times an RTM execution aborted due to any reasons (multiple categories may count as one).
+ 0x4 extra: aborted_pebs Number of times an RTM execution aborted due to any reasons (multiple categories may count as one).
+ 0x8 extra: aborted_misc1 Number of times an RTM execution aborted due to various memory events (e.g. read/write capacity and conflicts)
+ 0x10 extra: aborted_misc2 Number of times an RTM execution aborted due to uncommon conditions
+ 0x20 extra: aborted_misc3 Number of times an RTM execution aborted due to HLE-unfriendly instructions
+ 0x40 extra: aborted_misc4 Number of times an RTM execution aborted due to incompatible memory type
+ 0x80 extra: aborted_misc5 Number of times an RTM execution aborted due to none of the previous 4 categories (e.g. interrupt)
+name:fp_assist type:exclusive default:0x1e
+ 0x1e extra:cmask=1 any Cycles with any input/output SSE or FP assist
+ 0x2 extra: x87_output Number of X87 assists due to output value.
+ 0x4 extra: x87_input Number of X87 assists due to input value.
+ 0x8 extra: simd_output Number of SIMD FP assists due to Output values
+ 0x10 extra: simd_input Number of SIMD FP assists due to input values
+name:mem_uops_retired type:exclusive default:0x11
+ 0x11 extra: stlb_miss_loads Load uops with true STLB miss retired to architected path. Errata: HSM30
+ 0x11 extra: stlb_miss_loads_pebs Load uops with true STLB miss retired to architected path. Errata: HSM30
+ 0x12 extra: stlb_miss_stores Store uops with true STLB miss retired to architected path. Errata: HSM30
+ 0x12 extra: stlb_miss_stores_pebs Store uops with true STLB miss retired to architected path. Errata: HSM30
+ 0x21 extra: lock_loads Load uops with locked access retired to architected path. Errata: HSM30
+ 0x21 extra: lock_loads_pebs Load uops with locked access retired to architected path. Errata: HSM30
+ 0x41 extra: split_loads Line-split load uops retired to architected path. Errata: HSM30
+ 0x41 extra: split_loads_pebs Line-split load uops retired to architected path. Errata: HSM30
+ 0x42 extra: split_stores Line-split store uops retired to architected path. Errata: HSM30
+ 0x42 extra: split_stores_pebs Line-split store uops retired to architected path. Errata: HSM30
+ 0x81 extra: all_loads Load uops retired to architected path with filter on bits 0 and 1 applied. Errata: HSM30
+ 0x81 extra: all_loads_pebs Load uops retired to architected path with filter on bits 0 and 1 applied. Errata: HSM30
+ 0x82 extra: all_stores Store uops retired to architected path with filter on bits 0 and 1 applied. Errata: HSM30
+ 0x82 extra: all_stores_pebs Store uops retired to architected path with filter on bits 0 and 1 applied. Errata: HSM30
+name:mem_load_uops_retired type:exclusive default:0x1
+ 0x1 extra: l1_hit Retired load uops with L1 cache hits as data sources. Errata: HSM30
+ 0x1 extra: l1_hit_pebs Retired load uops with L1 cache hits as data sources. Errata: HSM30
+ 0x2 extra: l2_hit Retired load uops with L2 cache hits as data sources. Errata: HSM30
+ 0x2 extra: l2_hit_pebs Retired load uops with L2 cache hits as data sources. Errata: HSM30
+ 0x4 extra: l3_hit Retired load uops which data sources were data hits in L3 without snoops required. Errata: HSM26, HSM30
+ 0x4 extra: l3_hit_pebs Retired load uops which data sources were data hits in L3 without snoops required. Errata: HSM26, HSM30
+ 0x8 extra: l1_miss Retired load uops that missed in L1 cache. Errata: HSM30
+ 0x8 extra: l1_miss_pebs Retired load uops that missed in L1 cache. Errata: HSM30
+ 0x10 extra: l2_miss Miss in mid-level (L2) cache. Excludes Unknown data-source. Errata: HSM30
+ 0x10 extra: l2_miss_pebs Miss in mid-level (L2) cache. Excludes Unknown data-source. Errata: HSM30
+ 0x20 extra: l3_miss Miss in last-level (L3) cache. Excludes Unknown data-source. Errata: HSM26, HSM30
+ 0x20 extra: l3_miss_pebs Miss in last-level (L3) cache. Excludes Unknown data-source. Errata: HSM26, HSM30
+ 0x40 extra: hit_lfb Retired load uops that missed L1 but hit the fill buffer (FB) due to a preceding miss to the same cache line with data not ready. Errata: HSM30
+ 0x40 extra: hit_lfb_pebs Retired load uops that missed L1 but hit the fill buffer (FB) due to a preceding miss to the same cache line with data not ready. Errata: HSM30
+name:mem_load_uops_l3_hit_retired type:exclusive default:0x1
+ 0x1 extra: xsnp_miss Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache. Errata: HSM26, HSM30
+ 0x1 extra: xsnp_miss_pebs Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache. Errata: HSM26, HSM30
+ 0x2 extra: xsnp_hit Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. Errata: HSM26, HSM30
+ 0x2 extra: xsnp_hit_pebs Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. Errata: HSM26, HSM30
+ 0x4 extra: xsnp_hitm Retired load uops which data sources were HitM responses from shared L3. Errata: HSM26, HSM30
+ 0x4 extra: xsnp_hitm_pebs Retired load uops which data sources were HitM responses from shared L3. Errata: HSM26, HSM30
+ 0x8 extra: xsnp_none Retired load uops which data sources were hits in L3 without snoops required. Errata: HSM26, HSM30
+ 0x8 extra: xsnp_none_pebs Retired load uops which data sources were hits in L3 without snoops required. Errata: HSM26, HSM30
+name:mem_load_uops_l3_miss_retired type:exclusive default:0x1
+ 0x1 extra: local_dram This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. Errata: HSM30
+ 0x1 extra: local_dram_pebs This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. Errata: HSM30
+name:l2_trans type:exclusive default:0x80
+ 0x80 extra: all_requests Transactions accessing L2 pipe
+ 0x1 extra: demand_data_rd Demand Data Read requests that access L2 cache
+ 0x2 extra: rfo RFO requests that access L2 cache
+ 0x4 extra: code_rd L2 cache accesses when fetching instructions
+ 0x8 extra: all_pf L2 or L3 HW prefetches that access L2 cache
+ 0x10 extra: l1d_wb L1D writebacks that access L2 cache
+ 0x20 extra: l2_fill L2 fill requests that access L2 cache
+ 0x40 extra: l2_wb L2 writebacks that access L2 cache
+name:l2_lines_in type:exclusive default:0x7
+ 0x7 extra: all This event counts the number of L2 cache lines brought into the L2 cache. Lines are filled into the L2 cache when there was an L2 miss.
+ 0x1 extra: i L2 cache lines in I state filling L2
+ 0x2 extra: s L2 cache lines in S state filling L2
+ 0x4 extra: e L2 cache lines in E state filling L2
+name:l2_lines_out type:exclusive default:0x5
+ 0x5 extra: demand_clean Clean L2 cache lines evicted by demand
+ 0x6 extra: demand_dirty Dirty L2 cache lines evicted by demand
diff --git a/events/i386/ivybridge/unit_masks b/events/i386/ivybridge/unit_masks
index ddb59a0..7786904 100644
--- a/events/i386/ivybridge/unit_masks
+++ b/events/i386/ivybridge/unit_masks
@@ -5,163 +5,163 @@
#
include:i386/arch_perfmon
name:ld_blocks type:mandatory default:0x2
- 0x2 store_forward loads blocked by overlapping with store buffer that cannot be forwarded
+ 0x2 extra: store_forward loads blocked by overlapping with store buffer that cannot be forwarded
name:misalign_mem_ref type:bitmask default:0x1
- 0x1 loads Speculative cache line split load uops dispatched to L1 cache
- 0x2 stores Speculative cache line split STA uops dispatched to L1 cache
+ 0x1 extra: loads Speculative cache line split load uops dispatched to L1 cache
+ 0x2 extra: stores Speculative cache line split STA uops dispatched to L1 cache
name:ld_blocks_partial type:mandatory default:0x1
- 0x1 address_alias False dependencies in MOB due to partial compare on address
+ 0x1 extra: address_alias False dependencies in MOB due to partial compare on address
name:dtlb_load_misses type:exclusive default:0x81
- 0x81 demand_ld_miss_causes_a_walk Demand load Miss in all translation lookaside buffer (TLB) levels causes an page walk of any page size.
- 0x82 demand_ld_walk_completed Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.
- 0x84 demand_ld_walk_duration Demand load cycles page miss handler (PMH) is busy with this walk.
-name:int_misc type:exclusive default:0x3
+ 0x81 extra: demand_ld_miss_causes_a_walk Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk of any page size.
+ 0x82 extra: demand_ld_walk_completed Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.
+ 0x84 extra: demand_ld_walk_duration Demand load cycles page miss handler (PMH) is busy with this walk.
+name:int_misc type:exclusive default:recovery_cycles
0x3 extra:cmask=1 recovery_cycles Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)
0x3 extra:cmask=1,edge recovery_stalls_count Number of occurences waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)
-name:uops_issued type:exclusive default:0x1
- 0x1 any Uops that Resource Allocation Table (RAT) issues to Reservation Station (RS)
+name:uops_issued type:exclusive default:any
+ 0x1 extra: any Uops that Resource Allocation Table (RAT) issues to Reservation Station (RS)
0x1 extra:cmask=1,inv stall_cycles Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread
0x1 extra:cmask=1,inv,any core_stall_cycles Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads
- 0x10 flags_merge Number of flags-merge uops being allocated.
- 0x20 slow_lea Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not.
- 0x40 single_mul Number of Multiply packed/scalar single precision uops allocated
-name:arith type:bitmask default:0x1
- 0x1 fpu_div_active Cycles when divider is busy executing divide operations
+ 0x10 extra: flags_merge Number of flags-merge uops being allocated.
+ 0x20 extra: slow_lea Number of slow LEA uops being allocated. A uop is generally considered SlowLea if it has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not.
+ 0x40 extra: single_mul Number of Multiply packed/scalar single precision uops allocated
+name:arith type:bitmask default:fpu_div_active
+ 0x1 extra: fpu_div_active Cycles when divider is busy executing divide operations
0x4 extra:cmask=1,edge fpu_div Divide operations executed
name:l2_rqsts type:exclusive default:0x1
- 0x1 demand_data_rd_hit Demand Data Read requests that hit L2 cache
- 0x3 all_demand_data_rd Demand Data Read requests
- 0x4 rfo_hit RFO requests that hit L2 cache
- 0x8 rfo_miss RFO requests that miss L2 cache
- 0xc all_rfo RFO requests to L2 cache
- 0x10 code_rd_hit L2 cache hits when fetching instructions, code reads.
- 0x20 code_rd_miss L2 cache misses when fetching instructions
- 0x30 all_code_rd L2 code requests
- 0x40 pf_hit Requests from the L2 hardware prefetchers that hit L2 cache
- 0x80 pf_miss Requests from the L2 hardware prefetchers that miss L2 cache
- 0xc0 all_pf Requests from L2 hardware prefetchers
+ 0x1 extra: demand_data_rd_hit Demand Data Read requests that hit L2 cache
+ 0x3 extra: all_demand_data_rd Demand Data Read requests
+ 0x4 extra: rfo_hit RFO requests that hit L2 cache
+ 0x8 extra: rfo_miss RFO requests that miss L2 cache
+ 0xc extra: all_rfo RFO requests to L2 cache
+ 0x10 extra: code_rd_hit L2 cache hits when fetching instructions, code reads.
+ 0x20 extra: code_rd_miss L2 cache misses when fetching instructions
+ 0x30 extra: all_code_rd L2 code requests
+ 0x40 extra: pf_hit Requests from the L2 hardware prefetchers that hit L2 cache
+ 0x80 extra: pf_miss Requests from the L2 hardware prefetchers that miss L2 cache
+ 0xc0 extra: all_pf Requests from L2 hardware prefetchers
name:l2_store_lock_rqsts type:exclusive default:0x1
- 0x1 miss RFOs that miss cache lines
- 0x8 hit_m RFOs that hit cache lines in M state
- 0xf all RFOs that access cache lines in any state
+ 0x1 extra: miss RFOs that miss cache lines
+ 0x8 extra: hit_m RFOs that hit cache lines in M state
+ 0xf extra: all RFOs that access cache lines in any state
name:l2_l1d_wb_rqsts type:exclusive default:0x1
- 0x1 miss Count the number of modified Lines evicted from L1 and missed L2. (Non-rejected WBs from the DCU.)
- 0x4 hit_e Not rejected writebacks from L1D to L2 cache lines in E state
- 0x8 hit_m Not rejected writebacks from L1D to L2 cache lines in M state
- 0xf all Not rejected writebacks from L1D to L2 cache lines in any state.
-name:l1d_pend_miss type:exclusive default:0x1
- 0x1 pending L1D miss oustandings duration in cycles
+ 0x1 extra: miss Count the number of modified Lines evicted from L1 and missed L2. (Non-rejected WBs from the DCU.)
+ 0x4 extra: hit_e Not rejected writebacks from L1D to L2 cache lines in E state
+ 0x8 extra: hit_m Not rejected writebacks from L1D to L2 cache lines in M state
+ 0xf extra: all Not rejected writebacks from L1D to L2 cache lines in any state.
+name:l1d_pend_miss type:exclusive default:pending_cycles
+ 0x1 extra: pending L1D miss outstanding duration in cycles
0x1 extra:cmask=1 pending_cycles Cycles with L1D load Misses outstanding.
0x1 extra:cmask=1,edge occurences This event counts the number of L1D misses outstanding, using an edge detect to count transitions.
name:dtlb_store_misses type:bitmask default:0x1
- 0x1 miss_causes_a_walk Store misses in all DTLB levels that cause page walks
- 0x2 walk_completed Store misses in all DTLB levels that cause completed page walks
- 0x4 walk_duration Cycles when PMH is busy with page walks
- 0x10 stlb_hit Store operations that miss the first TLB level but hit the second and do not cause page walks
+ 0x1 extra: miss_causes_a_walk Store misses in all DTLB levels that cause page walks
+ 0x2 extra: walk_completed Store misses in all DTLB levels that cause completed page walks
+ 0x4 extra: walk_duration Cycles when PMH is busy with page walks
+ 0x10 extra: stlb_hit Store operations that miss the first TLB level but hit the second and do not cause page walks
name:load_hit_pre type:bitmask default:0x1
- 0x1 sw_pf Not software-prefetch load dispatches that hit forward buffer allocated for software prefetch
- 0x2 hw_pf Not software-prefetch load dispatches that hit forward buffer allocated for hardware prefetch
+ 0x1 extra: sw_pf Not software-prefetch load dispatches that hit forward buffer allocated for software prefetch
+ 0x2 extra: hw_pf Not software-prefetch load dispatches that hit forward buffer allocated for hardware prefetch
name:l1d type:mandatory default:0x1
- 0x1 replacement L1D data line replacements
+ 0x1 extra: replacement L1D data line replacements
name:move_elimination type:bitmask default:0x1
- 0x1 int_not_eliminated Number of integer Move Elimination candidate uops that were not eliminated.
- 0x2 simd_not_eliminated Number of SIMD Move Elimination candidate uops that were not eliminated.
- 0x4 int_eliminated Number of integer Move Elimination candidate uops that were eliminated.
- 0x8 simd_eliminated Number of SIMD Move Elimination candidate uops that were eliminated.
-name:cpl_cycles type:exclusive default:0x1
- 0x1 ring0 Unhalted core cycles when the thread is in ring 0
+ 0x1 extra: int_not_eliminated Number of integer Move Elimination candidate uops that were not eliminated.
+ 0x2 extra: simd_not_eliminated Number of SIMD Move Elimination candidate uops that were not eliminated.
+ 0x4 extra: int_eliminated Number of integer Move Elimination candidate uops that were eliminated.
+ 0x8 extra: simd_eliminated Number of SIMD Move Elimination candidate uops that were eliminated.
+name:cpl_cycles type:exclusive default:ring0
+ 0x1 extra: ring0 Unhalted core cycles when the thread is in ring 0
0x1 extra:cmask=1,edge ring0_trans Number of intervals between processor halts while thread is in ring 0
- 0x2 ring123 Unhalted core cycles when thread is in rings 1, 2, or 3
+ 0x2 extra: ring123 Unhalted core cycles when thread is in rings 1, 2, or 3
name:rs_events type:mandatory default:0x1
- 0x1 empty_cycles Cycles when Reservation Station (RS) is empty for the thread
+ 0x1 extra: empty_cycles Cycles when Reservation Station (RS) is empty for the thread
name:tlb_access type:mandatory default:0x4
- 0x4 load_stlb_hit Load operations that miss the first DTLB level but hit the second and do not cause page walks
-name:offcore_requests_outstanding type:exclusive default:0x1
- 0x1 demand_data_rd Offcore outstanding Demand Data Read transactions in uncore queue.
+ 0x4 extra: load_stlb_hit Load operations that miss the first DTLB level but hit the second and do not cause page walks
+name:offcore_requests_outstanding type:exclusive default:cycles_with_demand_data_rd
+ 0x1 extra: demand_data_rd Offcore outstanding Demand Data Read transactions in uncore queue.
0x1 extra:cmask=1 cycles_with_demand_data_rd Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore
- 0x2 demand_code_rd Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle
- 0x4 demand_rfo Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore
+ 0x2 extra: demand_code_rd Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle
+ 0x4 extra: demand_rfo Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore
0x4 extra:cmask=1 cycles_with_demand_rfo Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle
- 0x8 all_data_rd Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore
+ 0x8 extra: all_data_rd Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore
0x8 extra:cmask=1 cycles_with_data_rd Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore
name:lock_cycles type:bitmask default:0x1
- 0x1 split_lock_uc_lock_duration Cycles when L1 and L2 are locked due to UC or split lock
- 0x2 cache_lock_duration Cycles when L1D is locked
-name:idq type:exclusive default:0x2
- 0x2 empty Instruction Decode Queue (IDQ) empty cycles
- 0x4 mite_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
+ 0x1 extra: split_lock_uc_lock_duration Cycles when L1 and L2 are locked due to UC or split lock
+ 0x2 extra: cache_lock_duration Cycles when L1D is locked
+name:idq type:exclusive default:empty
+ 0x2 extra: empty Instruction Decode Queue (IDQ) empty cycles
+ 0x4 extra: mite_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
0x4 extra:cmask=1 mite_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path
- 0x8 dsb_uops Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path
+ 0x8 extra: dsb_uops Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path
0x8 extra:cmask=1 dsb_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path
- 0x10 ms_dsb_uops Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy
+ 0x10 extra: ms_dsb_uops Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x10 extra:cmask=1 ms_dsb_cycles Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy
0x10 extra:cmask=1,edge ms_dsb_occur Deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while Microcode Sequenser (MS) is busy
0x18 extra:cmask=1 all_dsb_cycles_any_uops Cycles Decode Stream Buffer (DSB) is delivering any Uop
0x18 extra:cmask=4 all_dsb_cycles_4_uops Cycles Decode Stream Buffer (DSB) is delivering 4 Uops
- 0x20 ms_mite_uops Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy
+ 0x20 extra: ms_mite_uops Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x24 extra:cmask=1 all_mite_cycles_any_uops Cycles MITE is delivering any Uop
0x24 extra:cmask=4 all_mite_cycles_4_uops Cycles MITE is delivering 4 Uops
- 0x30 ms_uops Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy
+ 0x30 extra: ms_uops Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy
0x30 extra:cmask=1 ms_cycles Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy
- 0x3c mite_all_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
+ 0x3c extra: mite_all_uops Uops delivered to Instruction Decode Queue (IDQ) from MITE path
name:icache type:mandatory default:0x2
- 0x2 misses Instruction cache, streaming buffer and victim cache misses
+ 0x2 extra: misses Instruction cache, streaming buffer and victim cache misses
name:itlb_misses type:bitmask default:0x1
- 0x1 miss_causes_a_walk Misses at all ITLB levels that cause page walks
- 0x2 walk_completed Misses in all ITLB levels that cause completed page walks
- 0x4 walk_duration Cycles when PMH is busy with page walks
- 0x10 stlb_hit Operations that miss the first ITLB level but hit the second and do not cause any page walks
+ 0x1 extra: miss_causes_a_walk Misses at all ITLB levels that cause page walks
+ 0x2 extra: walk_completed Misses in all ITLB levels that cause completed page walks
+ 0x4 extra: walk_duration Cycles when PMH is busy with page walks
+ 0x10 extra: stlb_hit Operations that miss the first ITLB level but hit the second and do not cause any page walks
name:ild_stall type:bitmask default:0x1
- 0x1 lcp Stalls caused by changing prefix length of the instruction.
- 0x4 iq_full Stall cycles because IQ is full
+ 0x1 extra: lcp Stalls caused by changing prefix length of the instruction.
+ 0x4 extra: iq_full Stall cycles because IQ is full
name:br_inst_exec type:exclusive default:0x41
- 0x41 nontaken_conditional Not taken macro-conditional branches
- 0x81 taken_conditional Taken speculative and retired macro-conditional branches
- 0x82 taken_direct_jump Taken speculative and retired macro-conditional branch instructions excluding calls and indirects
- 0x84 taken_indirect_jump_non_call_ret Taken speculative and retired indirect branches excluding calls and returns
- 0x88 taken_indirect_near_return Taken speculative and retired indirect branches with return mnemonic
- 0x90 taken_direct_near_call Taken speculative and retired direct near calls
- 0xa0 taken_indirect_near_call Taken speculative and retired indirect calls
- 0xc1 all_conditional Speculative and retired macro-conditional branches
- 0xc2 all_direct_jmp Speculative and retired macro-unconditional branches excluding calls and indirects
- 0xc4 all_indirect_jump_non_call_ret Speculative and retired indirect branches excluding calls and returns
- 0xc8 all_indirect_near_return Speculative and retired indirect return branches.
- 0xd0 all_direct_near_call Speculative and retired direct near calls
- 0xff all_branches Speculative and retired branches
+ 0x41 extra: nontaken_conditional Not taken macro-conditional branches
+ 0x81 extra: taken_conditional Taken speculative and retired macro-conditional branches
+ 0x82 extra: taken_direct_jump Taken speculative and retired macro-conditional branch instructions excluding calls and indirects
+ 0x84 extra: taken_indirect_jump_non_call_ret Taken speculative and retired indirect branches excluding calls and returns
+ 0x88 extra: taken_indirect_near_return Taken speculative and retired indirect branches with return mnemonic
+ 0x90 extra: taken_direct_near_call Taken speculative and retired direct near calls
+ 0xa0 extra: taken_indirect_near_call Taken speculative and retired indirect calls
+ 0xc1 extra: all_conditional Speculative and retired macro-conditional branches
+ 0xc2 extra: all_direct_jmp Speculative and retired macro-unconditional branches excluding calls and indirects
+ 0xc4 extra: all_indirect_jump_non_call_ret Speculative and retired indirect branches excluding calls and returns
+ 0xc8 extra: all_indirect_near_return Speculative and retired indirect return branches.
+ 0xd0 extra: all_direct_near_call Speculative and retired direct near calls
+ 0xff extra: all_branches Speculative and retired branches
name:br_misp_exec type:exclusive default:0x41
- 0x41 nontaken_conditional Not taken speculative and retired mispredicted macro conditional branches
- 0x81 taken_conditional Taken speculative and retired mispredicted macro conditional branches
- 0x84 taken_indirect_jump_non_call_ret Taken speculative and retired mispredicted indirect branches excluding calls and returns
- 0x88 taken_return_near Taken speculative and retired mispredicted indirect branches with return mnemonic
- 0xa0 taken_indirect_near_call Taken speculative and retired mispredicted indirect calls
- 0xc1 all_conditional Speculative and retired mispredicted macro conditional branches
- 0xc4 all_indirect_jump_non_call_ret Mispredicted indirect branches excluding calls and returns
- 0xff all_branches Speculative and retired mispredicted macro conditional branches
-name:idq_uops_not_delivered type:exclusive default:0x1
- 0x1 core Uops not delivered by the Frontend to the Backend of the machine, while there is no Backend stall
+ 0x41 extra: nontaken_conditional Not taken speculative and retired mispredicted macro conditional branches
+ 0x81 extra: taken_conditional Taken speculative and retired mispredicted macro conditional branches
+ 0x84 extra: taken_indirect_jump_non_call_ret Taken speculative and retired mispredicted indirect branches excluding calls and returns
+ 0x88 extra: taken_return_near Taken speculative and retired mispredicted indirect branches with return mnemonic
+ 0xa0 extra: taken_indirect_near_call Taken speculative and retired mispredicted indirect calls
+ 0xc1 extra: all_conditional Speculative and retired mispredicted macro conditional branches
+ 0xc4 extra: all_indirect_jump_non_call_ret Mispredicted indirect branches excluding calls and returns
+ 0xff extra: all_branches Speculative and retired mispredicted macro conditional branches
+name:idq_uops_not_delivered type:exclusive default:core
+ 0x1 extra: core Uops not delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=1 cycles_le_3_uop_deliv.core Cycles with 3 or less uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=1,inv cycles_fe_was_ok Cycles with 4 uops delivered by the Frontend to the Backend of the machine, or the Backend was stalling
0x1 extra:cmask=2 cycles_le_2_uop_deliv.core Cycles with 2 or less uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=3 cycles_le_1_uop_deliv.core Cycles with 1 or less uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
0x1 extra:cmask=4 cycles_0_uops_deliv.core Cycles with no uops delivered by the Frontend to the Backend of the machine, while there is no Backend stall
-name:uops_dispatched_port type:exclusive default:0x1
- 0x1 port_0 Cycles per thread when uops are dispatched to port 0
+name:uops_dispatched_port type:exclusive default:port_0
+ 0x1 extra: port_0 Cycles per thread when uops are dispatched to port 0
0x1 extra:any port_0_core Cycles per core when uops are dispatched to port 0
- 0x2 port_1 Cycles per thread when uops are dispatched to port 1
+ 0x2 extra: port_1 Cycles per thread when uops are dispatched to port 1
0x2 extra:any port_1_core Cycles per core when uops are dispatched to port 1
- 0xc port_2 Cycles per thread when load or STA uops are dispatched to port 2
+ 0xc extra: port_2 Cycles per thread when load or STA uops are dispatched to port 2
0xc extra:any port_2_core Cycles per core when load or STA uops are dispatched to port 2
- 0x30 port_3 Cycles per thread when load or STA uops are dispatched to port 3
+ 0x30 extra: port_3 Cycles per thread when load or STA uops are dispatched to port 3
0x30 extra:any port_3_core Cycles per core when load or STA uops are dispatched to port 3
- 0x40 port_4 Cycles per thread when uops are dispatched to port 4
+ 0x40 extra: port_4 Cycles per thread when uops are dispatched to port 4
0x40 extra:any port_4_core Cycles per core when uops are dispatched to port 4
- 0x80 port_5 Cycles per thread when uops are dispatched to port 5
+ 0x80 extra: port_5 Cycles per thread when uops are dispatched to port 5
0x80 extra:any port_5_core Cycles per core when uops are dispatched to port 5
name:resource_stalls type:bitmask default:0x1
- 0x1 any Resource-related stall cycles
- 0x4 rs Cycles stalled due to no eligible RS entry available.
- 0x8 sb Cycles stalled due to no store buffers available. (not including draining form sync).
- 0x10 rob Cycles stalled due to re-order buffer full.
+ 0x1 extra: any Resource-related stall cycles
+ 0x4 extra: rs Cycles stalled due to no eligible RS entry available.
+ 0x8 extra: sb Cycles stalled due to no store buffers available. (not including draining from sync).
+ 0x10 extra: rob Cycles stalled due to re-order buffer full.
name:cycle_activity type:exclusive default:0x1
0x1 extra:cmask=1 cycles_l2_pending Cycles with pending L2 cache miss loads.
0x2 extra:cmask=2 cycles_ldm_pending Cycles with pending memory loads.
@@ -171,99 +171,99 @@ name:cycle_activity type:exclusive default:0x1
0x8 extra:cmask=8 cycles_l1d_pending Cycles with pending L1 cache miss loads.
0xc extra:cmask=c stalls_l1d_pending Execution stalls due to L1 data cache misses
name:dsb2mite_switches type:mandatory default:0x1
- 0x1 count Decode Stream Buffer (DSB)-to-MITE switches
+ 0x1 extra: count Decode Stream Buffer (DSB)-to-MITE switches
name:dsb_fill type:mandatory default:0x8
- 0x8 exceed_dsb_lines Cycles when Decode Stream Buffer (DSB) fill encounter more than 3 Decode Stream Buffer (DSB) lines
+ 0x8 extra: exceed_dsb_lines Cycles when Decode Stream Buffer (DSB) fill encounter more than 3 Decode Stream Buffer (DSB) lines
name:itlb type:mandatory default:0x1
- 0x1 itlb_flush Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.
+ 0x1 extra: itlb_flush Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.
name:offcore_requests type:bitmask default:0x1
- 0x1 demand_data_rd Demand Data Read requests sent to uncore
- 0x2 demand_code_rd Cacheable and noncachaeble code read requests
- 0x4 demand_rfo Demand RFO requests including regular RFOs, locks, ItoM
- 0x8 all_data_rd Demand and prefetch data reads
-name:uops_executed type:exclusive default:0x1
- 0x1 thread Counts the number of uops to be executed per-thread each cycle.
+ 0x1 extra: demand_data_rd Demand Data Read requests sent to uncore
+ 0x2 extra: demand_code_rd Cacheable and noncacheable code read requests
+ 0x4 extra: demand_rfo Demand RFO requests including regular RFOs, locks, ItoM
+ 0x8 extra: all_data_rd Demand and prefetch data reads
+name:uops_executed type:exclusive default:thread
+ 0x1 extra: thread Counts the number of uops to be executed per-thread each cycle.
0x1 extra:cmask=1 cycles_ge_1_uop_exec Cycles where at least 1 uop was executed per-thread
0x1 extra:cmask=1,inv stall_cycles Counts number of cycles no uops were dispatched to be executed on this thread.
0x1 extra:cmask=2 cycles_ge_2_uops_exec Cycles where at least 2 uops were executed per-thread
0x1 extra:cmask=3 cycles_ge_3_uops_exec Cycles where at least 3 uops were executed per-thread
0x1 extra:cmask=4 cycles_ge_4_uops_exec Cycles where at least 4 uops were executed per-thread
- 0x2 core Number of uops executed on the core.
+ 0x2 extra: core Number of uops executed on the core.
name:tlb_flush type:bitmask default:0x1
- 0x1 dtlb_thread DTLB flush attempts of the thread-specific entries
- 0x20 stlb_any STLB flush attempts
+ 0x1 extra: dtlb_thread DTLB flush attempts of the thread-specific entries
+ 0x20 extra: stlb_any STLB flush attempts
name:other_assists type:bitmask default:0x8
- 0x8 avx_store Number of AVX memory assist for stores. AVX microcode assist is being invoked whenever the hardware is unable to properly handle AVX-256b operations.
- 0x10 avx_to_sse Number of transitions from AVX-256 to legacy SSE when penalty applicable.
- 0x20 sse_to_avx Number of transitions from SSE to AVX-256 when penalty applicable.
-name:uops_retired type:exclusive default:0x1
- 0x1 all Actually retired uops.
+ 0x8 extra: avx_store Number of AVX memory assist for stores. AVX microcode assist is being invoked whenever the hardware is unable to properly handle AVX-256b operations.
+ 0x10 extra: avx_to_sse Number of transitions from AVX-256 to legacy SSE when penalty applicable.
+ 0x20 extra: sse_to_avx Number of transitions from SSE to AVX-256 when penalty applicable.
+name:uops_retired type:exclusive default:all
+ 0x1 extra: all Actually retired uops.
0x1 extra:cmask=1,inv stall_cycles Cycles without actually retired uops.
0x1 extra:cmask=1,inv,any core_stall_cycles Cycles without actually retired uops.
0x1 extra:cmask=10,inv total_cycles Cycles with less than 10 actually retired uops.
- 0x2 retire_slots Retirement slots used.
+ 0x2 extra: retire_slots Retirement slots used.
name:machine_clears type:bitmask default:0x2
- 0x2 memory_ordering Counts the number of machine clears due to memory order conflicts.
- 0x4 smc Self-modifying code (SMC) detected.
- 0x20 maskmov This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.
+ 0x2 extra: memory_ordering Counts the number of machine clears due to memory order conflicts.
+ 0x4 extra: smc Self-modifying code (SMC) detected.
+ 0x20 extra: maskmov This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.
name:br_inst_retired type:exclusive default:0x1
- 0x1 conditional Conditional branch instructions retired.
- 0x2 near_call_r3 Direct and indirect macro near call instructions retired (captured in ring 3).
- 0x2 near_call Direct and indirect near call instructions retired.
- 0x8 near_return Return instructions retired.
- 0x10 not_taken Not taken branch instructions retired.
- 0x20 near_taken Taken branch instructions retired.
- 0x40 far_branch Far branch instructions retired.
+ 0x1 extra: conditional Conditional branch instructions retired.
+ 0x2 extra: near_call_r3 Direct and indirect macro near call instructions retired (captured in ring 3).
+ 0x2 extra: near_call Direct and indirect near call instructions retired.
+ 0x8 extra: near_return Return instructions retired.
+ 0x10 extra: not_taken Not taken branch instructions retired.
+ 0x20 extra: near_taken Taken branch instructions retired.
+ 0x40 extra: far_branch Far branch instructions retired.
name:br_misp_retired type:bitmask default:0x1
- 0x1 conditional Mispredicted conditional branch instructions retired.
- 0x20 near_taken number of near branch instructions retired that were mispredicted and taken.
+ 0x1 extra: conditional Mispredicted conditional branch instructions retired.
+ 0x20 extra: near_taken number of near branch instructions retired that were mispredicted and taken.
name:fp_assist type:exclusive default:0x1e
- 0x2 x87_output Number of X87 assists due to output value.
- 0x4 x87_input Number of X87 assists due to input value.
- 0x8 simd_output Number of SIMD FP assists due to Output values
- 0x10 simd_input Number of SIMD FP assists due to input values
+ 0x2 extra: x87_output Number of X87 assists due to output value.
+ 0x4 extra: x87_input Number of X87 assists due to input value.
+ 0x8 extra: simd_output Number of SIMD FP assists due to Output values
+ 0x10 extra: simd_input Number of SIMD FP assists due to input values
0x1e extra:cmask=1 any Cycles with any input/output SSE or FP assist
name:rob_misc_events type:mandatory default:0x20
- 0x20 lbr_inserts Count cases of saving new LBR
+ 0x20 extra: lbr_inserts Count cases of saving new LBR
name:mem_uops_retired type:exclusive default:0x81
- 0x11 stlb_miss_loads Load uops with true STLB miss retired to architected path.
- 0x12 stlb_miss_stores Store uops with true STLB miss retired to architected path.
- 0x21 lock_loads Load uops with locked access retired to architected path.
- 0x41 split_loads Line-splitted load uops retired to architected path.
- 0x42 split_stores Line-splitted store uops retired to architected path.
- 0x81 all_loads Load uops retired to architected path with filter on bits 0 and 1 applied.
- 0x82 all_stores Store uops retired to architected path with filter on bits 0 and 1 applied.
+ 0x11 extra: stlb_miss_loads Load uops with true STLB miss retired to architected path.
+ 0x12 extra: stlb_miss_stores Store uops with true STLB miss retired to architected path.
+ 0x21 extra: lock_loads Load uops with locked access retired to architected path.
+ 0x41 extra: split_loads Line-splitted load uops retired to architected path.
+ 0x42 extra: split_stores Line-splitted store uops retired to architected path.
+ 0x81 extra: all_loads Load uops retired to architected path with filter on bits 0 and 1 applied.
+ 0x82 extra: all_stores Store uops retired to architected path with filter on bits 0 and 1 applied.
name:mem_load_uops_retired type:bitmask default:0x1
- 0x1 l1_hit Retired load uops with L1 cache hits as data sources.
- 0x2 l2_hit Retired load uops with L2 cache hits as data sources.
- 0x4 llc_hit Retired load uops which data sources were data hits in LLC without snoops required.
- 0x40 hit_lfb Retired load uops which data sources were load uops missed L1 but hit forward buffer due to preceding miss to the same cache line with data not ready.
+ 0x1 extra: l1_hit Retired load uops with L1 cache hits as data sources.
+ 0x2 extra: l2_hit Retired load uops with L2 cache hits as data sources.
+ 0x4 extra: llc_hit Retired load uops which data sources were data hits in LLC without snoops required.
+ 0x40 extra: hit_lfb Retired load uops which data sources were load uops missed L1 but hit forward buffer due to preceding miss to the same cache line with data not ready.
name:mem_load_uops_llc_hit_retired type:bitmask default:0x1
- 0x1 xsnp_miss Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache.
- 0x2 xsnp_hit Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache.
- 0x4 xsnp_hitm Retired load uops which data sources were HitM responses from shared LLC.
- 0x8 xsnp_none Retired load uops which data sources were hits in LLC without snoops required.
+ 0x1 extra: xsnp_miss Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache.
+ 0x2 extra: xsnp_hit Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache.
+ 0x4 extra: xsnp_hitm Retired load uops which data sources were HitM responses from shared LLC.
+ 0x8 extra: xsnp_none Retired load uops which data sources were hits in LLC without snoops required.
name:mem_load_uops_llc_miss_retired type:mandatory default:0x1
- 0x1 local_dram Data from local DRAM either Snoop not needed or Snoop Miss (RspI)
+ 0x1 extra: local_dram Data from local DRAM either Snoop not needed or Snoop Miss (RspI)
name:baclears type:mandatory default:0x1f
- 0x1f any Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.
+ 0x1f extra: any Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.
name:l2_trans type:bitmask default:0x80
- 0x1 demand_data_rd Demand Data Read requests that access L2 cache
- 0x2 rfo RFO requests that access L2 cache
- 0x4 code_rd L2 cache accesses when fetching instructions
- 0x8 all_pf L2 or LLC HW prefetches that access L2 cache
- 0x10 l1d_wb L1D writebacks that access L2 cache
- 0x20 l2_fill L2 fill requests that access L2 cache
- 0x40 l2_wb L2 writebacks that access L2 cache
- 0x80 all_requests Transactions accessing L2 pipe
+ 0x1 extra: demand_data_rd Demand Data Read requests that access L2 cache
+ 0x2 extra: rfo RFO requests that access L2 cache
+ 0x4 extra: code_rd L2 cache accesses when fetching instructions
+ 0x8 extra: all_pf L2 or LLC HW prefetches that access L2 cache
+ 0x10 extra: l1d_wb L1D writebacks that access L2 cache
+ 0x20 extra: l2_fill L2 fill requests that access L2 cache
+ 0x40 extra: l2_wb L2 writebacks that access L2 cache
+ 0x80 extra: all_requests Transactions accessing L2 pipe
name:l2_lines_in type:exclusive default:0x7
- 0x1 i L2 cache lines in I state filling L2
- 0x2 s L2 cache lines in S state filling L2
- 0x4 e L2 cache lines in E state filling L2
- 0x7 all L2 cache lines filling L2
+ 0x1 extra: i L2 cache lines in I state filling L2
+ 0x2 extra: s L2 cache lines in S state filling L2
+ 0x4 extra: e L2 cache lines in E state filling L2
+ 0x7 extra: all L2 cache lines filling L2
name:l2_lines_out type:exclusive default:0x1
- 0x1 demand_clean Clean L2 cache lines evicted by demand
- 0x2 demand_dirty Dirty L2 cache lines evicted by demand
- 0x4 pf_clean Clean L2 cache lines evicted by L2 prefetch
- 0x8 pf_dirty Dirty L2 cache lines evicted by L2 prefetch
- 0xa dirty_all Dirty L2 cache lines filling the L2
+ 0x1 extra: demand_clean Clean L2 cache lines evicted by demand
+ 0x2 extra: demand_dirty Dirty L2 cache lines evicted by demand
+ 0x4 extra: pf_clean Clean L2 cache lines evicted by L2 prefetch
+ 0x8 extra: pf_dirty Dirty L2 cache lines evicted by L2 prefetch
+ 0xa extra: dirty_all Dirty L2 cache lines filling the L2
diff --git a/events/i386/nehalem/unit_masks b/events/i386/nehalem/unit_masks
index d800e5d..8f60292 100644
--- a/events/i386/nehalem/unit_masks
+++ b/events/i386/nehalem/unit_masks
@@ -4,369 +4,369 @@
#
include:i386/arch_perfmon
name:sb_forward type:mandatory default:0x01
- 0x01 any Counts the number of store forwards
+ 0x01 extra: any Counts the number of store forwards
name:load_block type:bitmask default:0x01
- 0x01 std Counts the number of loads blocked by a preceding store with unknown data
- 0x04 address_offset Counts the number of loads blocked by a preceding store address
+ 0x01 extra: std Counts the number of loads blocked by a preceding store with unknown data
+ 0x04 extra: address_offset Counts the number of loads blocked by a preceding store address
name:sb_drain type:mandatory default:0x01
- 0x01 cycles Counts the cycles of store buffer drains
+ 0x01 extra: cycles Counts the cycles of store buffer drains
name:misalign_mem_ref type:bitmask default:0x03
- 0x01 load Counts the number of misaligned load references
- 0x02 store Counts the number of misaligned store references
- 0x03 any Counts the number of misaligned memory references
+ 0x01 extra: load Counts the number of misaligned load references
+ 0x02 extra: store Counts the number of misaligned store references
+ 0x03 extra: any Counts the number of misaligned memory references
name:store_blocks type:bitmask default:0x0f
- 0x01 not_sta This event counts the number of load operations delayed caused by preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load
- 0x02 sta This event counts load operations delayed caused by preceding stores whose addresses are unknown (STA block)
- 0x04 at_ret Counts number of loads delayed with at-Retirement block code
- 0x08 l1d_block Cacheable loads delayed with L1D block code
+ 0x01 extra: not_sta This event counts the number of load operations delayed caused by preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load
+ 0x02 extra: sta This event counts load operations delayed caused by preceding stores whose addresses are unknown (STA block)
+ 0x04 extra: at_ret Counts number of loads delayed with at-Retirement block code
+ 0x08 extra: l1d_block Cacheable loads delayed with L1D block code
0x0F any All loads delayed due to store blocks
name:dtlb_load_misses type:bitmask default:0x01
- 0x01 any Counts all load misses that cause a page walk
- 0x02 walk_completed Counts number of completed page walks due to load miss in the STLB
- 0x10 stlb_hit Number of cache load STLB hits
- 0x20 pde_miss Number of DTLB cache load misses where the low part of the linear to physical address translation was missed
- 0x40 pdp_miss Number of DTLB cache load misses where the high part of the linear to physical address translation was missed
- 0x80 large_walk_completed Counts number of completed large page walks due to load miss in the STLB
+ 0x01 extra: any Counts all load misses that cause a page walk
+ 0x02 extra: walk_completed Counts number of completed page walks due to load miss in the STLB
+ 0x10 extra: stlb_hit Number of cache load STLB hits
+ 0x20 extra: pde_miss Number of DTLB cache load misses where the low part of the linear to physical address translation was missed
+ 0x40 extra: pdp_miss Number of DTLB cache load misses where the high part of the linear to physical address translation was missed
+ 0x80 extra: large_walk_completed Counts number of completed large page walks due to load miss in the STLB
name:memory_disambiguration type:bitmask default:0x01
- 0x01 reset Counts memory disambiguration reset cycles
- 0x02 success Counts the number of loads that memory disambiguration succeeded
- 0x04 watchdog Counts the number of times the memory disambiguration watchdog kicked in
- 0x08 watch_cycles Counts the cycles that the memory disambiguration watchdog is active
+ 0x01 extra: reset Counts memory disambiguation reset cycles
+ 0x02 extra: success Counts the number of loads that memory disambiguation succeeded
+ 0x04 extra: watchdog Counts the number of times the memory disambiguation watchdog kicked in
+ 0x08 extra: watch_cycles Counts the cycles that the memory disambiguation watchdog is active
name:mem_inst_retired type:bitmask default:0x01
- 0x01 loads Counts the number of instructions with an architecturally-visible store retired on the architected path
- 0x02 stores Counts the number of instructions with an architecturally-visible store retired on the architected path
+ 0x01 extra: loads Counts the number of instructions with an architecturally-visible load retired on the architected path
+ 0x02 extra: stores Counts the number of instructions with an architecturally-visible store retired on the architected path
name:mem_store_retired type:mandatory default:0x01
- 0x01 dtlb_miss The event counts the number of retired stores that missed the DTLB
+ 0x01 extra: dtlb_miss The event counts the number of retired stores that missed the DTLB
name:uops_issued type:bitmask default:0x01
- 0x01 any Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i
- 0x01 stalled_cycles Counts the number of cycles no Uops issued by the Register Allocation Table to the Reservation Station, i
- 0x02 fused Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station
+ 0x01 extra: any Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i
+ 0x01 extra: stalled_cycles Counts the number of cycles no Uops issued by the Register Allocation Table to the Reservation Station, i
+ 0x02 extra: fused Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station
name:mem_uncore_retired type:bitmask default:0x02
- 0x02 other_core_l2_hitm Counts number of memory load instructions retired where the memory reference hit modified data in a sibling core residing on the same socket
- 0x08 remote_cache_local_home_hit Counts number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and HIT in a remote socket's cache
- 0x10 remote_dram Counts number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and was remotely homed
- 0x20 local_dram Counts number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and required a local socket memory reference
+ 0x02 extra: other_core_l2_hitm Counts number of memory load instructions retired where the memory reference hit modified data in a sibling core residing on the same socket
+ 0x08 extra: remote_cache_local_home_hit Counts number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and HIT in a remote socket's cache
+ 0x10 extra: remote_dram Counts number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and was remotely homed
+ 0x20 extra: local_dram Counts number of memory load instructions retired where the memory reference missed the L1, L2 and LLC caches and required a local socket memory reference
name:fp_comp_ops_exe type:bitmask default:0x01
- 0x01 x87 Counts the number of FP Computational Uops Executed
- 0x02 mmx Counts number of MMX Uops executed
- 0x04 sse_fp Counts number of SSE and SSE2 FP uops executed
- 0x08 sse2_integer Counts number of SSE2 integer uops executed
- 0x10 sse_fp_packed Counts number of SSE FP packed uops executed
- 0x20 sse_fp_scalar Counts number of SSE FP scalar uops executed
- 0x40 sse_single_precision Counts number of SSE* FP single precision uops executed
- 0x80 sse_double_precision Counts number of SSE* FP double precision uops executed
+ 0x01 extra: x87 Counts the number of FP Computational Uops Executed
+ 0x02 extra: mmx Counts number of MMX Uops executed
+ 0x04 extra: sse_fp Counts number of SSE and SSE2 FP uops executed
+ 0x08 extra: sse2_integer Counts number of SSE2 integer uops executed
+ 0x10 extra: sse_fp_packed Counts number of SSE FP packed uops executed
+ 0x20 extra: sse_fp_scalar Counts number of SSE FP scalar uops executed
+ 0x40 extra: sse_single_precision Counts number of SSE* FP single precision uops executed
+ 0x80 extra: sse_double_precision Counts number of SSE* FP double precision uops executed
name:simd_int_128 type:bitmask default:0x01
- 0x01 packed_mpy Counts number of 128 bit SIMD integer multiply operations
- 0x02 packed_shift Counts number of 128 bit SIMD integer shift operations
- 0x04 pack Counts number of 128 bit SIMD integer pack operations
- 0x08 unpack Counts number of 128 bit SIMD integer unpack operations
- 0x10 packed_logical Counts number of 128 bit SIMD integer logical operations
- 0x20 packed_arith Counts number of 128 bit SIMD integer arithmetic operations
- 0x40 shuffle_move Counts number of 128 bit SIMD integer shuffle and move operations
+ 0x01 extra: packed_mpy Counts number of 128 bit SIMD integer multiply operations
+ 0x02 extra: packed_shift Counts number of 128 bit SIMD integer shift operations
+ 0x04 extra: pack Counts number of 128 bit SIMD integer pack operations
+ 0x08 extra: unpack Counts number of 128 bit SIMD integer unpack operations
+ 0x10 extra: packed_logical Counts number of 128 bit SIMD integer logical operations
+ 0x20 extra: packed_arith Counts number of 128 bit SIMD integer arithmetic operations
+ 0x40 extra: shuffle_move Counts number of 128 bit SIMD integer shuffle and move operations
name:load_dispatch type:bitmask default:0x07
- 0x01 rs Counts number of loads dispatched from the Reservation Station that bypass the Memory Order Buffer
- 0x02 rs_delayed Counts the number of delayed RS dispatches at the stage latch
- 0x04 mob Counts the number of loads dispatched from the Reservation Station to the Memory Order Buffer
- 0x07 any Counts all loads dispatched from the Reservation Station
+ 0x01 extra: rs Counts number of loads dispatched from the Reservation Station that bypass the Memory Order Buffer
+ 0x02 extra: rs_delayed Counts the number of delayed RS dispatches at the stage latch
+ 0x04 extra: mob Counts the number of loads dispatched from the Reservation Station to the Memory Order Buffer
+ 0x07 extra: any Counts all loads dispatched from the Reservation Station
name:arith type:bitmask default:0x01
- 0x01 cycles_div_busy Counts the number of cycles the divider is busy executing divide or square root operations
- 0x02 mul Counts the number of multiply operations executed
+ 0x01 extra: cycles_div_busy Counts the number of cycles the divider is busy executing divide or square root operations
+ 0x02 extra: mul Counts the number of multiply operations executed
name:inst_decoded type:mandatory default:0x01
- 0x01 dec0 Counts number of instructions that require decoder 0 to be decoded
+ 0x01 extra: dec0 Counts number of instructions that require decoder 0 to be decoded
name:hw_int type:bitmask default:0x01
- 0x01 rcv Number of interrupt received
- 0x02 cycles_masked Number of cycles interrupt are masked
- 0x04 cycles_pending_and_masked Number of cycles interrupts are pending and masked
+ 0x01 extra: rcv Number of interrupts received
+ 0x02 extra: cycles_masked Number of cycles interrupts are masked
+ 0x04 extra: cycles_pending_and_masked Number of cycles interrupts are pending and masked
name:l2_rqsts type:bitmask default:0x01
- 0x01 ld_hit Counts number of loads that hit the L2 cache
- 0x02 ld_miss Counts the number of loads that miss the L2 cache
- 0x03 loads Counts all L2 load requests
- 0x04 rfo_hit Counts the number of store RFO requests that hit the L2 cache
- 0x08 rfo_miss Counts the number of store RFO requests that miss the L2 cache
+ 0x01 extra: ld_hit Counts number of loads that hit the L2 cache
+ 0x02 extra: ld_miss Counts the number of loads that miss the L2 cache
+ 0x03 extra: loads Counts all L2 load requests
+ 0x04 extra: rfo_hit Counts the number of store RFO requests that hit the L2 cache
+ 0x08 extra: rfo_miss Counts the number of store RFO requests that miss the L2 cache
0x0C rfos Counts all L2 store RFO requests
- 0x10 ifetch_hit Counts number of instruction fetches that hit the L2 cache
- 0x20 ifetch_miss Counts number of instruction fetches that miss the L2 cache
- 0x30 ifetches Counts all instruction fetches
- 0x40 prefetch_hit Counts L2 prefetch hits for both code and data
- 0x80 prefetch_miss Counts L2 prefetch misses for both code and data
+ 0x10 extra: ifetch_hit Counts number of instruction fetches that hit the L2 cache
+ 0x20 extra: ifetch_miss Counts number of instruction fetches that miss the L2 cache
+ 0x30 extra: ifetches Counts all instruction fetches
+ 0x40 extra: prefetch_hit Counts L2 prefetch hits for both code and data
+ 0x80 extra: prefetch_miss Counts L2 prefetch misses for both code and data
0xC0 prefetches Counts all L2 prefetches for both code and data
0xAA miss Counts all L2 misses for both code and data
0xFF references Counts all L2 requests for both code and data
name:l2_data_rqsts type:bitmask default:0xff
- 0x01 i_state Counts number of L2 data demand loads where the cache line to be loaded is in the I (invalid) state, i
- 0x02 s_state Counts number of L2 data demand loads where the cache line to be loaded is in the S (shared) state
- 0x04 e_state Counts number of L2 data demand loads where the cache line to be loaded is in the E (exclusive) state
- 0x08 m_state Counts number of L2 data demand loads where the cache line to be loaded is in the M (modified) state
+ 0x01 extra: i_state Counts number of L2 data demand loads where the cache line to be loaded is in the I (invalid) state, i
+ 0x02 extra: s_state Counts number of L2 data demand loads where the cache line to be loaded is in the S (shared) state
+ 0x04 extra: e_state Counts number of L2 data demand loads where the cache line to be loaded is in the E (exclusive) state
+ 0x08 extra: m_state Counts number of L2 data demand loads where the cache line to be loaded is in the M (modified) state
0x0F mesi Counts all L2 data demand requests
- 0x10 i_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the I (invalid) state, i
- 0x20 s_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the S (shared) state
- 0x40 e_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the E (exclusive) state
- 0x80 m_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the M (modified) state
+ 0x10 extra: i_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the I (invalid) state, i
+ 0x20 extra: s_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the S (shared) state
+ 0x40 extra: e_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the E (exclusive) state
+ 0x80 extra: m_state Counts number of L2 prefetch data loads where the cache line to be loaded is in the M (modified) state
0xF0 mesi Counts all L2 prefetch requests
0xFF any Counts all L2 data requests
name:l2_write type:bitmask default:0x01
- 0x01 i_state Counts number of L2 demand store RFO requests where the cache line to be loaded is in the I (invalid) state, i
- 0x02 s_state Counts number of L2 store RFO requests where the cache line to be loaded is in the S (shared) state
- 0x04 e_state Counts number of L2 store RFO requests where the cache line to be loaded is in the E (exclusive) state
- 0x08 m_state Counts number of L2 store RFO requests where the cache line to be loaded is in the M (modified) state
+ 0x01 extra: i_state Counts number of L2 demand store RFO requests where the cache line to be loaded is in the I (invalid) state, i
+ 0x02 extra: s_state Counts number of L2 store RFO requests where the cache line to be loaded is in the S (shared) state
+ 0x04 extra: e_state Counts number of L2 store RFO requests where the cache line to be loaded is in the E (exclusive) state
+ 0x08 extra: m_state Counts number of L2 store RFO requests where the cache line to be loaded is in the M (modified) state
0x0E hit Counts number of L2 store RFO requests where the cache line to be loaded is in either the S, E or M states
0x0F mesi Counts all L2 store RFO requests
- 0x10 i_state Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the I (invalid) state, i
- 0x20 s_state Counts number of L2 lock RFO requests where the cache line to be loaded is in the S (shared) state
- 0x40 e_state Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the E (exclusive) state
- 0x80 m_state Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the M (modified) state
+ 0x10 extra: i_state Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the I (invalid) state, i
+ 0x20 extra: s_state Counts number of L2 lock RFO requests where the cache line to be loaded is in the S (shared) state
+ 0x40 extra: e_state Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the E (exclusive) state
+ 0x80 extra: m_state Counts number of L2 demand lock RFO requests where the cache line to be loaded is in the M (modified) state
0xE0 hit Counts number of L2 demand lock RFO requests where the cache line to be loaded is in either the S, E, or M state
0xF0 mesi Counts all L2 demand lock RFO requests
name:l1d_wb_l2 type:bitmask default:0x01
- 0x01 i_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the I (invalid) state, i
- 0x02 s_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the S state
- 0x04 e_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the E (exclusive) state
- 0x08 m_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the M (modified) state
+ 0x01 extra: i_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the I (invalid) state, i
+ 0x02 extra: s_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the S state
+ 0x04 extra: e_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the E (exclusive) state
+ 0x08 extra: m_state Counts number of L1 writebacks to the L2 where the cache line to be written is in the M (modified) state
0x0F mesi Counts all L1 writebacks to the L2
name:longest_lat_cache type:bitmask default:0x4F
0x4F reference This event counts requests originating from the core that reference a cache line in the last level cache
- 0x41 miss This event counts each cache miss condition for references to the last level cache
+ 0x41 extra: miss This event counts each cache miss condition for references to the last level cache
name:cpu_clk_unhalted type:bitmask default:0x00
- 0x00 thread_p Counts the number of thread cycles while the thread is not in a halt state
- 0x01 ref_p Increments at the frequency of a slower reference clock when not halted
+ 0x00 extra: thread_p Counts the number of thread cycles while the thread is not in a halt state
+ 0x01 extra: ref_p Increments at the frequency of a slower reference clock when not halted
name:l1d_cache_ld type:bitmask default:0x01
- 0x01 i_state Counts L1 data cache read requests where the cache line to be loaded is in the I (invalid) state, i
- 0x02 s_state Counts L1 data cache read requests where the cache line to be loaded is in the S (shared) state
- 0x04 e_state Counts L1 data cache read requests where the cache line to be loaded is in the E (exclusive) state
- 0x08 m_state Counts L1 data cache read requests where the cache line to be loaded is in the M (modified) state
+ 0x01 extra: i_state Counts L1 data cache read requests where the cache line to be loaded is in the I (invalid) state, i
+ 0x02 extra: s_state Counts L1 data cache read requests where the cache line to be loaded is in the S (shared) state
+ 0x04 extra: e_state Counts L1 data cache read requests where the cache line to be loaded is in the E (exclusive) state
+ 0x08 extra: m_state Counts L1 data cache read requests where the cache line to be loaded is in the M (modified) state
0x0F mesi Counts L1 data cache read requests
name:l1d_cache_st type:bitmask default:0x01
- 0x01 i_state Counts L1 data cache store RFO requests where the cache line to be loaded is in the I state
- 0x02 s_state Counts L1 data cache store RFO requests where the cache line to be loaded is in the S (shared) state
- 0x04 e_state Counts L1 data cache store RFO requests where the cache line to be loaded is in the E (exclusive) state
- 0x08 m_state Counts L1 data cache store RFO requests where cache line to be loaded is in the M (modified) state
+ 0x01 extra: i_state Counts L1 data cache store RFO requests where the cache line to be loaded is in the I state
+ 0x02 extra: s_state Counts L1 data cache store RFO requests where the cache line to be loaded is in the S (shared) state
+ 0x04 extra: e_state Counts L1 data cache store RFO requests where the cache line to be loaded is in the E (exclusive) state
+ 0x08 extra: m_state Counts L1 data cache store RFO requests where cache line to be loaded is in the M (modified) state
0x0F mesi Counts L1 data cache store RFO requests
name:l1d_cache_lock type:bitmask default:0x01
- 0x01 hit Counts retired load locks that hit in the L1 data cache or hit in an already allocated fill buffer
- 0x02 s_state Counts L1 data cache retired load locks that hit the target cache line in the shared state
- 0x04 e_state Counts L1 data cache retired load locks that hit the target cache line in the exclusive state
- 0x08 m_state Counts L1 data cache retired load locks that hit the target cache line in the modified state
+ 0x01 extra: hit Counts retired load locks that hit in the L1 data cache or hit in an already allocated fill buffer
+ 0x02 extra: s_state Counts L1 data cache retired load locks that hit the target cache line in the shared state
+ 0x04 extra: e_state Counts L1 data cache retired load locks that hit the target cache line in the exclusive state
+ 0x08 extra: m_state Counts L1 data cache retired load locks that hit the target cache line in the modified state
name:l1d_all_ref type:bitmask default:0x01
- 0x01 any Counts all references (uncached, speculated and retired) to the L1 data cache, including all loads and stores with any memory types
- 0x02 cacheable Counts all data reads and writes (speculated and retired) from cacheable memory, including locked operations
+ 0x01 extra: any Counts all references (uncached, speculated and retired) to the L1 data cache, including all loads and stores with any memory types
+ 0x02 extra: cacheable Counts all data reads and writes (speculated and retired) from cacheable memory, including locked operations
#name:l1d_pend_miss type:mandatory default:0x02
-# 0x02 load_buffers_full Counts cycles of L1 data cache load fill buffers full
+# 0x02 extra: load_buffers_full Counts cycles of L1 data cache load fill buffers full
name:dtlb_misses type:bitmask default:0x01
- 0x01 any Counts the number of misses in the STLB which causes a page walk
- 0x02 walk_completed Counts number of misses in the STLB which resulted in a completed page walk
- 0x10 stlb_hit Counts the number of DTLB first level misses that hit in the second level TLB
- 0x20 pde_miss Number of DTLB cache misses where the low part of the linear to physical address translation was missed
- 0x40 pdp_miss Number of DTLB misses where the high part of the linear to physical address translation was missed
- 0x80 large_walk_completed Counts number of completed large page walks due to misses in the STLB
+ 0x01 extra: any Counts the number of misses in the STLB which causes a page walk
+ 0x02 extra: walk_completed Counts number of misses in the STLB which resulted in a completed page walk
+ 0x10 extra: stlb_hit Counts the number of DTLB first level misses that hit in the second level TLB
+ 0x20 extra: pde_miss Number of DTLB cache misses where the low part of the linear to physical address translation was missed
+ 0x40 extra: pdp_miss Number of DTLB misses where the high part of the linear to physical address translation was missed
+ 0x80 extra: large_walk_completed Counts number of completed large page walks due to misses in the STLB
name:sse_mem_exec type:bitmask default:0x01
- 0x01 nta Counts number of SSE NTA prefetch/weakly-ordered instructions which missed the L1 data cache
- 0x08 streaming_stores Counts number of SSE nontemporal stores
+ 0x01 extra: nta Counts number of SSE NTA prefetch/weakly-ordered instructions which missed the L1 data cache
+ 0x08 extra: streaming_stores Counts number of SSE nontemporal stores
name:l1d_prefetch type:bitmask default:0x01
- 0x01 requests Counts number of hardware prefetch requests dispatched out of the prefetch FIFO
- 0x02 miss Counts number of hardware prefetch requests that miss the L1D
- 0x04 triggers Counts number of prefetch requests triggered by the Finite State Machine and pushed into the prefetch FIFO
+ 0x01 extra: requests Counts number of hardware prefetch requests dispatched out of the prefetch FIFO
+ 0x02 extra: miss Counts number of hardware prefetch requests that miss the L1D
+ 0x04 extra: triggers Counts number of prefetch requests triggered by the Finite State Machine and pushed into the prefetch FIFO
name:ept type:bitmask default:0x02
- 0x02 epde_miss Counts Extended Page Directory Entry misses
- 0x04 epdpe_hit Counts Extended Page Directory Pointer Entry hits
- 0x08 epdpe_miss Counts Extended Page Directory Pointer Entry misses
+ 0x02 extra: epde_miss Counts Extended Page Directory Entry misses
+ 0x04 extra: epdpe_hit Counts Extended Page Directory Pointer Entry hits
+ 0x08 extra: epdpe_miss Counts Extended Page Directory Pointer Entry misses
name:l1d type:bitmask default:0x01
- 0x01 repl Counts the number of lines brought into the L1 data cache
- 0x02 m_repl Counts the number of modified lines brought into the L1 data cache
- 0x04 m_evict Counts the number of modified lines evicted from the L1 data cache due to replacement
- 0x08 m_snoop_evict Counts the number of modified lines evicted from the L1 data cache due to snoop HITM intervention
+ 0x01 extra: repl Counts the number of lines brought into the L1 data cache
+ 0x02 extra: m_repl Counts the number of modified lines brought into the L1 data cache
+ 0x04 extra: m_evict Counts the number of modified lines evicted from the L1 data cache due to replacement
+ 0x08 extra: m_snoop_evict Counts the number of modified lines evicted from the L1 data cache due to snoop HITM intervention
name:offcore_requests_outstanding type:bitmask default:0x01
- 0x01 read_data Counts weighted cycles of offcore demand data read requests
- 0x02 read_code Counts weighted cycles of offcore demand code read requests
- 0x04 rfo Counts weighted cycles of offcore demand RFO requests
- 0x08 read Counts weighted cycles of offcore read requests of any kind
+ 0x01 extra: read_data Counts weighted cycles of offcore demand data read requests
+ 0x02 extra: read_code Counts weighted cycles of offcore demand code read requests
+ 0x04 extra: rfo Counts weighted cycles of offcore demand RFO requests
+ 0x08 extra: read Counts weighted cycles of offcore read requests of any kind
name:cache_lock_cycles type:bitmask default:0x01
- 0x01 l1d_l2 Cycle count during which the L1D and L2 are locked
- 0x02 l1d Counts the number of cycles that cacheline in the L1 data cache unit is locked
+ 0x01 extra: l1d_l2 Cycle count during which the L1D and L2 are locked
+ 0x02 extra: l1d Counts the number of cycles that cacheline in the L1 data cache unit is locked
name:l1i type:bitmask default:0x01
- 0x01 hits Counts all instruction fetches that hit the L1 instruction cache
- 0x02 misses Counts all instruction fetches that miss the L1I cache
- 0x03 reads Counts all instruction fetches, including uncacheable fetches that bypass the L1I
- 0x04 cycles_stalled Cycle counts for which an instruction fetch stalls due to a L1I cache miss, ITLB miss or ITLB fault
+ 0x01 extra: hits Counts all instruction fetches that hit the L1 instruction cache
+ 0x02 extra: misses Counts all instruction fetches that miss the L1I cache
+ 0x03 extra: reads Counts all instruction fetches, including uncacheable fetches that bypass the L1I
+ 0x04 extra: cycles_stalled Cycle counts for which an instruction fetch stalls due to a L1I cache miss, ITLB miss or ITLB fault
name:ifu_ivc type:bitmask default:0x01
- 0x01 full Instruction Fetche unit victim cache full
- 0x02 l1i_eviction L1 Instruction cache evictions
+ 0x01 extra: full Instruction Fetch unit victim cache full
+ 0x02 extra: l1i_eviction L1 Instruction cache evictions
name:large_itlb type:mandatory default:0x01
- 0x01 hit Counts number of large ITLB hits
+ 0x01 extra: hit Counts number of large ITLB hits
name:itlb_misses type:bitmask default:0x01
- 0x01 any Counts the number of misses in all levels of the ITLB which causes a page walk
- 0x02 walk_completed Counts number of misses in all levels of the ITLB which resulted in a completed page walk
- 0x04 walk_cycles Counts ITLB miss page walk cycles
- 0x04 pmh_busy_cycles Counts PMH busy cycles
- 0x10 stlb_hit Counts the number of ITLB misses that hit in the second level TLB
- 0x20 pde_miss Number of ITLB misses where the low part of the linear to physical address translation was missed
- 0x40 pdp_miss Number of ITLB misses where the high part of the linear to physical address translation was missed
- 0x80 large_walk_completed Counts number of completed large page walks due to misses in the STLB
+ 0x01 extra: any Counts the number of misses in all levels of the ITLB which causes a page walk
+ 0x02 extra: walk_completed Counts number of misses in all levels of the ITLB which resulted in a completed page walk
+ 0x04 extra: walk_cycles Counts ITLB miss page walk cycles
+ 0x04 extra: pmh_busy_cycles Counts PMH busy cycles
+ 0x10 extra: stlb_hit Counts the number of ITLB misses that hit in the second level TLB
+ 0x20 extra: pde_miss Number of ITLB misses where the low part of the linear to physical address translation was missed
+ 0x40 extra: pdp_miss Number of ITLB misses where the high part of the linear to physical address translation was missed
+ 0x80 extra: large_walk_completed Counts number of completed large page walks due to misses in the STLB
name:ild_stall type:bitmask default:0x0f
- 0x01 lcp Cycles Instruction Length Decoder stalls due to length changing prefixes: 66, 67 or REX
- 0x02 mru Instruction Length Decoder stall cycles due to Brand Prediction Unit (PBU) Most Recently Used (MRU) bypass
- 0x04 iq_full Stall cycles due to a full instruction queue
- 0x08 regen Counts the number of regen stalls
+ 0x01 extra: lcp Cycles Instruction Length Decoder stalls due to length changing prefixes: 66, 67 or REX
+ 0x02 extra: mru Instruction Length Decoder stall cycles due to Branch Prediction Unit (BPU) Most Recently Used (MRU) bypass
+ 0x04 extra: iq_full Stall cycles due to a full instruction queue
+ 0x08 extra: regen Counts the number of regen stalls
0x0F any Counts any cycles the Instruction Length Decoder is stalled
name:br_inst_exec type:bitmask default:0x7f
- 0x01 cond Counts the number of conditional near branch instructions executed, but not necessarily retired
- 0x02 direct Counts all unconditional near branch instructions excluding calls and indirect branches
- 0x04 indirect_non_call Counts the number of executed indirect near branch instructions that are not calls
- 0x07 non_calls Counts all non call near branch instructions executed, but not necessarily retired
- 0x08 return_near Counts indirect near branches that have a return mnemonic
- 0x10 direct_near_call Counts unconditional near call branch instructions, excluding non call branch, executed
- 0x20 indirect_near_call Counts indirect near calls, including both register and memory indirect, executed
- 0x30 near_calls Counts all near call branches executed, but not necessarily retired
- 0x40 taken Counts taken near branches executed, but not necessarily retired
+ 0x01 extra: cond Counts the number of conditional near branch instructions executed, but not necessarily retired
+ 0x02 extra: direct Counts all unconditional near branch instructions excluding calls and indirect branches
+ 0x04 extra: indirect_non_call Counts the number of executed indirect near branch instructions that are not calls
+ 0x07 extra: non_calls Counts all non call near branch instructions executed, but not necessarily retired
+ 0x08 extra: return_near Counts indirect near branches that have a return mnemonic
+ 0x10 extra: direct_near_call Counts unconditional near call branch instructions, excluding non call branch, executed
+ 0x20 extra: indirect_near_call Counts indirect near calls, including both register and memory indirect, executed
+ 0x30 extra: near_calls Counts all near call branches executed, but not necessarily retired
+ 0x40 extra: taken Counts taken near branches executed, but not necessarily retired
0x7F any Counts all near executed branches (not necessarily retired)
name:br_misp_exec type:bitmask default:0x7f
- 0x01 cond Counts the number of mispredicted conditional near branch instructions executed, but not necessarily retired
- 0x02 direct Counts mispredicted macro unconditional near branch instructions, excluding calls and indirect branches (should always be 0)
- 0x04 indirect_non_call Counts the number of executed mispredicted indirect near branch instructions that are not calls
- 0x07 non_calls Counts mispredicted non call near branches executed, but not necessarily retired
- 0x08 return_near Counts mispredicted indirect branches that have a rear return mnemonic
- 0x10 direct_near_call Counts mispredicted non-indirect near calls executed, (should always be 0)
- 0x20 indirect_near_call Counts mispredicted indirect near calls exeucted, including both register and memory indirect
- 0x30 near_calls Counts all mispredicted near call branches executed, but not necessarily retired
- 0x40 taken Counts executed mispredicted near branches that are taken, but not necessarily retired
+ 0x01 extra: cond Counts the number of mispredicted conditional near branch instructions executed, but not necessarily retired
+ 0x02 extra: direct Counts mispredicted macro unconditional near branch instructions, excluding calls and indirect branches (should always be 0)
+ 0x04 extra: indirect_non_call Counts the number of executed mispredicted indirect near branch instructions that are not calls
+ 0x07 extra: non_calls Counts mispredicted non call near branches executed, but not necessarily retired
+ 0x08 extra: return_near Counts mispredicted indirect branches that have a near return mnemonic
+ 0x10 extra: direct_near_call Counts mispredicted non-indirect near calls executed, (should always be 0)
+ 0x20 extra: indirect_near_call Counts mispredicted indirect near calls executed, including both register and memory indirect
+ 0x30 extra: near_calls Counts all mispredicted near call branches executed, but not necessarily retired
+ 0x40 extra: taken Counts executed mispredicted near branches that are taken, but not necessarily retired
0x7F any Counts the number of mispredicted near branch instructions that were executed, but not necessarily retired
name:resource_stalls type:bitmask default:0x01
- 0x01 any Counts the number of Allocator resource related stalls
- 0x02 load Counts the cycles of stall due to lack of load buffer for load operation
- 0x04 rs_full This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle
- 0x08 store This event counts the number of cycles that a resource related stall will occur due to the number of store instructions reaching the limit of the pipeline, (i
- 0x10 rob_full Counts the cycles of stall due to reorder buffer full
- 0x20 fpcw Counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word
- 0x40 mxcsr Stalls due to the MXCSR register rename occurring to close to a previous MXCSR rename
- 0x80 other Counts the number of cycles while execution was stalled due to other resource issues
+ 0x01 extra: any Counts the number of Allocator resource related stalls
+ 0x02 extra: load Counts the cycles of stall due to lack of load buffer for load operation
+ 0x04 extra: rs_full This event counts the number of cycles when the number of instructions in the pipeline waiting for execution reaches the limit the processor can handle
+ 0x08 extra: store This event counts the number of cycles that a resource related stall will occur due to the number of store instructions reaching the limit of the pipeline, (i
+ 0x10 extra: rob_full Counts the cycles of stall due to reorder buffer full
+ 0x20 extra: fpcw Counts the number of cycles while execution was stalled due to writing the floating-point unit (FPU) control word
+ 0x40 extra: mxcsr Stalls due to the MXCSR register rename occurring too close to a previous MXCSR rename
+ 0x80 extra: other Counts the number of cycles while execution was stalled due to other resource issues
name:offcore_requests type:bitmask default:0x80
- 0x01 demand_read_data Counts number of offcore demand data read requests
- 0x02 demand_read_code Counts number of offcore demand code read requests
- 0x04 demand_rfo Counts number of offcore demand RFO requests
- 0x08 any_read Counts number of offcore read requests
- 0x10 any_rfo Counts number of offcore RFO requests
- 0x20 uncached_mem Counts number of offcore uncached memory requests
- 0x40 l1d_writeback Counts number of L1D writebacks to the uncore
- 0x80 any Counts all offcore requests
+ 0x01 extra: demand_read_data Counts number of offcore demand data read requests
+ 0x02 extra: demand_read_code Counts number of offcore demand code read requests
+ 0x04 extra: demand_rfo Counts number of offcore demand RFO requests
+ 0x08 extra: any_read Counts number of offcore read requests
+ 0x10 extra: any_rfo Counts number of offcore RFO requests
+ 0x20 extra: uncached_mem Counts number of offcore uncached memory requests
+ 0x40 extra: l1d_writeback Counts number of L1D writebacks to the uncore
+ 0x80 extra: any Counts all offcore requests
name:uops_executed type:bitmask default:0x3f
- 0x01 port0 Counts number of Uops executed that were issued on port 0
- 0x02 port1 Counts number of Uops executed that were issued on port 1
- 0x04 port2_core Counts number of Uops executed that were issued on port 2
- 0x08 port3_core Counts number of Uops executed that were issued on port 3
- 0x10 port4_core Counts number of Uops executed that where issued on port 4
- 0x20 port5 Counts number of Uops executed that where issued on port 5
- 0x40 port015 Counts number of Uops executed that where issued on port 0, 1, or 5
- 0x80 port234 Counts number of Uops executed that where issued on port 2, 3, or 4
+ 0x01 extra: port0 Counts number of Uops executed that were issued on port 0
+ 0x02 extra: port1 Counts number of Uops executed that were issued on port 1
+ 0x04 extra: port2_core Counts number of Uops executed that were issued on port 2
+ 0x08 extra: port3_core Counts number of Uops executed that were issued on port 3
+ 0x10 extra: port4_core Counts number of Uops executed that were issued on port 4
+ 0x20 extra: port5 Counts number of Uops executed that were issued on port 5
+ 0x40 extra: port015 Counts number of Uops executed that were issued on port 0, 1, or 5
+ 0x80 extra: port234 Counts number of Uops executed that were issued on port 2, 3, or 4
name:snoopq_requests_outstanding type:bitmask default:0x01
- 0x01 data Counts weighted cycles of snoopq requests for data
- 0x02 invalidate Counts weighted cycles of snoopq invalidate requests
- 0x04 code Counts weighted cycles of snoopq requests for code
+ 0x01 extra: data Counts weighted cycles of snoopq requests for data
+ 0x02 extra: invalidate Counts weighted cycles of snoopq invalidate requests
+ 0x04 extra: code Counts weighted cycles of snoopq requests for code
name:snoop_response type:bitmask default:0x01
- 0x01 hit Counts HIT snoop response sent by this thread in response to a snoop request
- 0x02 hite Counts HIT E snoop response sent by this thread in response to a snoop request
- 0x04 hitm Counts HIT M snoop response sent by this thread in response to a snoop request
+ 0x01 extra: hit Counts HIT snoop response sent by this thread in response to a snoop request
+ 0x02 extra: hite Counts HIT E snoop response sent by this thread in response to a snoop request
+ 0x04 extra: hitm Counts HIT M snoop response sent by this thread in response to a snoop request
name:pic_accesses type:bitmask default:0x01
- 0x01 tpr_reads Counts number of TPR reads
- 0x02 tpr_writes Counts number of TPR writes
+ 0x01 extra: tpr_reads Counts number of TPR reads
+ 0x02 extra: tpr_writes Counts number of TPR writes
name:inst_retired type:bitmask default:0x01
- 0x01 any_p instructions retired
- 0x02 x87 Counts the number of floating point computational operations retired: floating point computational operations executed by the assist handler and sub-operations of complex floating point instructions like transcendental instructions
+ 0x01 extra: any_p instructions retired
+ 0x02 extra: x87 Counts the number of floating point computational operations retired: floating point computational operations executed by the assist handler and sub-operations of complex floating point instructions like transcendental instructions
name:uops_retired type:bitmask default:0x01
- 0x01 any Counts the number of micro-ops retired, (macro-fused=1, micro-fused=2, others=1; maximum count of 8 per cycle)
- 0x02 retire_slots Counts the number of retirement slots used each cycle
- 0x04 macro_fused Counts number of macro-fused uops retired
+ 0x01 extra: any Counts the number of micro-ops retired, (macro-fused=1, micro-fused=2, others=1; maximum count of 8 per cycle)
+ 0x02 extra: retire_slots Counts the number of retirement slots used each cycle
+ 0x04 extra: macro_fused Counts number of macro-fused uops retired
name:machine_clears type:bitmask default:0x01
- 0x01 cycles Counts the cycles machine clear is asserted
- 0x02 mem_order Counts the number of machine clears due to memory order conflicts
- 0x04 smc Counts the number of times that a program writes to a code section
- 0x10 fusion_assist Counts the number of macro-fusion assists
+ 0x01 extra: cycles Counts the cycles machine clear is asserted
+ 0x02 extra: mem_order Counts the number of machine clears due to memory order conflicts
+ 0x04 extra: smc Counts the number of times that a program writes to a code section
+ 0x10 extra: fusion_assist Counts the number of macro-fusion assists
name:br_inst_retired type:bitmask default:0x00
- 0x00 all_branches See Table A-1
- 0x01 conditional Counts the number of conditional branch instructions retired
- 0x02 near_call Counts the number of direct & indirect near unconditional calls retired
- 0x04 all_branches Counts the number of branch instructions retired
+ 0x00 extra: all_branches See Table A-1
+ 0x01 extra: conditional Counts the number of conditional branch instructions retired
+ 0x02 extra: near_call Counts the number of direct & indirect near unconditional calls retired
+ 0x04 extra: all_branches Counts the number of branch instructions retired
name:br_misp_retired type:bitmask default:0x00
- 0x00 all_branches See Table A-1
- 0x02 near_call Counts mispredicted direct & indirect near unconditional retired calls
+ 0x00 extra: all_branches See Table A-1
+ 0x02 extra: near_call Counts mispredicted direct & indirect near unconditional retired calls
name:ssex_uops_retired type:bitmask default:0x01
- 0x01 packed_single Counts SIMD packed single-precision floating point Uops retired
- 0x02 scalar_single Counts SIMD calar single-precision floating point Uops retired
- 0x04 packed_double Counts SIMD packed double-precision floating point Uops retired
- 0x08 scalar_double Counts SIMD scalar double-precision floating point Uops retired
- 0x10 vector_integer Counts 128-bit SIMD vector integer Uops retired
+ 0x01 extra: packed_single Counts SIMD packed single-precision floating point Uops retired
+ 0x02 extra: scalar_single Counts SIMD scalar single-precision floating point Uops retired
+ 0x04 extra: packed_double Counts SIMD packed double-precision floating point Uops retired
+ 0x08 extra: scalar_double Counts SIMD scalar double-precision floating point Uops retired
+ 0x10 extra: vector_integer Counts 128-bit SIMD vector integer Uops retired
name:mem_load_retired type:bitmask default:0x01
- 0x01 l1d_hit Counts number of retired loads that hit the L1 data cache
- 0x02 l2_hit Counts number of retired loads that hit the L2 data cache
- 0x04 llc_unshared_hit Counts number of retired loads that hit their own, unshared lines in the LLC cache
- 0x08 other_core_l2_hit_hitm Counts number of retired loads that hit in a sibling core's L2 (on die core)
- 0x10 llc_miss Counts number of retired loads that miss the LLC cache
- 0x40 hit_lfb Counts number of retired loads that miss the L1D and the address is located in an allocated line fill buffer and will soon be committed to cache
- 0x80 dtlb_miss Counts the number of retired loads that missed the DTLB
+ 0x01 extra: l1d_hit Counts number of retired loads that hit the L1 data cache
+ 0x02 extra: l2_hit Counts number of retired loads that hit the L2 data cache
+ 0x04 extra: llc_unshared_hit Counts number of retired loads that hit their own, unshared lines in the LLC cache
+ 0x08 extra: other_core_l2_hit_hitm Counts number of retired loads that hit in a sibling core's L2 (on die core)
+ 0x10 extra: llc_miss Counts number of retired loads that miss the LLC cache
+ 0x40 extra: hit_lfb Counts number of retired loads that miss the L1D and the address is located in an allocated line fill buffer and will soon be committed to cache
+ 0x80 extra: dtlb_miss Counts the number of retired loads that missed the DTLB
name:fp_mmx_trans type:bitmask default:0x03
- 0x01 to_fp Counts the first floating-point instruction following any MMX instruction
- 0x02 to_mmx Counts the first MMX instruction following a floating-point instruction
- 0x03 any Counts all transitions from floating point to MMX instructions and from MMX instructions to floating point instructions
+ 0x01 extra: to_fp Counts the first floating-point instruction following any MMX instruction
+ 0x02 extra: to_mmx Counts the first MMX instruction following a floating-point instruction
+ 0x03 extra: any Counts all transitions from floating point to MMX instructions and from MMX instructions to floating point instructions
name:macro_insts type:mandatory default:0x01
- 0x01 decoded Counts the number of instructions decoded, (but not necessarily executed or retired)
+ 0x01 extra: decoded Counts the number of instructions decoded, (but not necessarily executed or retired)
name:uops_decoded type:bitmask default:0x0e
- 0x02 ms Counts the number of Uops decoded by the Microcode Sequencer, MS
- 0x04 esp_folding Counts number of stack pointer (ESP) instructions decoded: push , pop , call , ret, etc
- 0x08 esp_sync Counts number of stack pointer (ESP) sync operations where an ESP instruction is corrected by adding the ESP offset register to the current value of the ESP register
+ 0x02 extra: ms Counts the number of Uops decoded by the Microcode Sequencer, MS
+ 0x04 extra: esp_folding Counts number of stack pointer (ESP) instructions decoded: push , pop , call , ret, etc
+ 0x08 extra: esp_sync Counts number of stack pointer (ESP) sync operations where an ESP instruction is corrected by adding the ESP offset register to the current value of the ESP register
name:rat_stalls type:bitmask default:0x0f
- 0x01 flags Counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall
- 0x02 registers This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction used a register that was partially written by previous instruction
- 0x04 rob_read_port Counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline
- 0x08 scoreboard Counts the cycles where we stall due to microarchitecturally required serialization
+ 0x01 extra: flags Counts the number of cycles during which execution stalled due to several reasons, one of which is a partial flag register stall
+ 0x02 extra: registers This event counts the number of cycles instruction execution latency became longer than the defined latency because the instruction used a register that was partially written by previous instruction
+ 0x04 extra: rob_read_port Counts the number of cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the out-of-order pipeline
+ 0x08 extra: scoreboard Counts the cycles where we stall due to microarchitecturally required serialization
0x0F any Counts all Register Allocation Table stall cycles due to: Cycles when ROB read port stalls occurred, which did not allow new micro-ops to enter the execution pipe
name:baclear type:bitmask default:0x01
- 0x01 clear Counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end
- 0x02 bad_target Counts number of Branch Address Calculator clears (BACLEAR) asserted due to conditional branch instructions in which there was a target hit but the direction was wrong
+ 0x01 extra: clear Counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end
+ 0x02 extra: bad_target Counts number of Branch Address Calculator clears (BACLEAR) asserted due to conditional branch instructions in which there was a target hit but the direction was wrong
name:bpu_clears type:bitmask default:0x03
- 0x01 early Counts early (normal) Branch Prediction Unit clears: BPU predicted a taken branch after incorrectly assuming that it was not taken
- 0x02 late Counts late Branch Prediction Unit clears due to Most Recently Used conflicts
- 0x03 any Counts all BPU clears
+ 0x01 extra: early Counts early (normal) Branch Prediction Unit clears: BPU predicted a taken branch after incorrectly assuming that it was not taken
+ 0x02 extra: late Counts late Branch Prediction Unit clears due to Most Recently Used conflicts
+ 0x03 extra: any Counts all BPU clears
name:l2_transactions type:bitmask default:0x80
- 0x01 load Counts L2 load operations due to HW prefetch or demand loads
- 0x02 rfo Counts L2 RFO operations due to HW prefetch or demand RFOs
- 0x04 ifetch Counts L2 instruction fetch operations due to HW prefetch or demand ifetch
- 0x08 prefetch Counts L2 prefetch operations
- 0x10 l1d_wb Counts L1D writeback operations to the L2
- 0x20 fill Counts L2 cache line fill operations due to load, RFO, L1D writeback or prefetch
- 0x40 wb Counts L2 writeback operations to the LLC
- 0x80 any Counts all L2 cache operations
+ 0x01 extra: load Counts L2 load operations due to HW prefetch or demand loads
+ 0x02 extra: rfo Counts L2 RFO operations due to HW prefetch or demand RFOs
+ 0x04 extra: ifetch Counts L2 instruction fetch operations due to HW prefetch or demand ifetch
+ 0x08 extra: prefetch Counts L2 prefetch operations
+ 0x10 extra: l1d_wb Counts L1D writeback operations to the L2
+ 0x20 extra: fill Counts L2 cache line fill operations due to load, RFO, L1D writeback or prefetch
+ 0x40 extra: wb Counts L2 writeback operations to the LLC
+ 0x80 extra: any Counts all L2 cache operations
name:l2_lines_in type:bitmask default:0x07
- 0x02 s_state Counts the number of cache lines allocated in the L2 cache in the S (shared) state
- 0x04 e_state Counts the number of cache lines allocated in the L2 cache in the E (exclusive) state
- 0x07 any Counts the number of cache lines allocated in the L2 cache
+ 0x02 extra: s_state Counts the number of cache lines allocated in the L2 cache in the S (shared) state
+ 0x04 extra: e_state Counts the number of cache lines allocated in the L2 cache in the E (exclusive) state
+ 0x07 extra: any Counts the number of cache lines allocated in the L2 cache
name:l2_lines_out type:bitmask default:0x0f
- 0x01 demand_clean Counts L2 clean cache lines evicted by a demand request
- 0x02 demand_dirty Counts L2 dirty (modified) cache lines evicted by a demand request
- 0x04 prefetch_clean Counts L2 clean cache line evicted by a prefetch request
- 0x08 prefetch_dirty Counts L2 modified cache line evicted by a prefetch request
+ 0x01 extra: demand_clean Counts L2 clean cache lines evicted by a demand request
+ 0x02 extra: demand_dirty Counts L2 dirty (modified) cache lines evicted by a demand request
+ 0x04 extra: prefetch_clean Counts L2 clean cache line evicted by a prefetch request
+ 0x08 extra: prefetch_dirty Counts L2 modified cache line evicted by a prefetch request
0x0F any Counts all L2 cache lines evicted for any reason
name:l2_hw_prefetch type:bitmask default:0x01
- 0x01 hit Count L2 HW prefetcher detector hits
- 0x02 alloc Count L2 HW prefetcher allocations
- 0x04 data_trigger Count L2 HW data prefetcher triggered
- 0x08 code_trigger Count L2 HW code prefetcher triggered
- 0x10 dca_trigger Count L2 HW DCA prefetcher triggered
- 0x20 kick_start Count L2 HW prefetcher kick started
+ 0x01 extra: hit Count L2 HW prefetcher detector hits
+ 0x02 extra: alloc Count L2 HW prefetcher allocations
+ 0x04 extra: data_trigger Count L2 HW data prefetcher triggered
+ 0x08 extra: code_trigger Count L2 HW code prefetcher triggered
+ 0x10 extra: dca_trigger Count L2 HW DCA prefetcher triggered
+ 0x20 extra: kick_start Count L2 HW prefetcher kick started
name:sq_misc type:bitmask default:0x01
- 0x01 promotion Counts the number of L2 secondary misses that hit the Super Queue
- 0x02 promotion_post_go Counts the number of L2 secondary misses during the Super Queue filling L2
- 0x04 lru_hints Counts number of Super Queue LRU hints sent to L3
- 0x08 fill_dropped Counts the number of SQ L2 fills dropped due to L2 busy
- 0x10 split_lock Counts the number of SQ lock splits across a cache line
+ 0x01 extra: promotion Counts the number of L2 secondary misses that hit the Super Queue
+ 0x02 extra: promotion_post_go Counts the number of L2 secondary misses during the Super Queue filling L2
+ 0x04 extra: lru_hints Counts number of Super Queue LRU hints sent to L3
+ 0x08 extra: fill_dropped Counts the number of SQ L2 fills dropped due to L2 busy
+ 0x10 extra: split_lock Counts the number of SQ lock splits across a cache line
name:fp_assist type:bitmask default:0x01
- 0x01 all Counts the number of floating point operations executed that required micro-code assist intervention
- 0x02 output Counts number of floating point micro-code assist when the output value (destination register) is invalid
- 0x04 input Counts number of floating point micro-code assist when the input value (one of the source operands to an FP instruction) is invalid
+ 0x01 extra: all Counts the number of floating point operations executed that required micro-code assist intervention
+ 0x02 extra: output Counts number of floating point micro-code assist when the output value (destination register) is invalid
+ 0x04 extra: input Counts number of floating point micro-code assist when the input value (one of the source operands to an FP instruction) is invalid
name:simd_int_64 type:bitmask default:0x01
- 0x01 packed_mpy Counts number of SID integer 64 bit packed multiply operations
- 0x02 packed_shift Counts number of SID integer 64 bit packed shift operations
- 0x04 pack Counts number of SID integer 64 bit pack operations
- 0x08 unpack Counts number of SID integer 64 bit unpack operations
- 0x10 packed_logical Counts number of SID integer 64 bit logical operations
- 0x20 packed_arith Counts number of SID integer 64 bit arithmetic operations
- 0x40 shuffle_move Counts number of SID integer 64 bit shift or move operations
+ 0x01 extra: packed_mpy Counts number of SIMD integer 64 bit packed multiply operations
+ 0x02 extra: packed_shift Counts number of SIMD integer 64 bit packed shift operations
+ 0x04 extra: pack Counts number of SIMD integer 64 bit pack operations
+ 0x08 extra: unpack Counts number of SIMD integer 64 bit unpack operations
+ 0x10 extra: packed_logical Counts number of SIMD integer 64 bit logical operations
+ 0x20 extra: packed_arith Counts number of SIMD integer 64 bit arithmetic operations
+ 0x40 extra: shuffle_move Counts number of SIMD integer 64 bit shift or move operations
name:x20 type:mandatory default:0x20
0x20 No unit mask
diff --git a/events/i386/sandybridge/unit_masks b/events/i386/sandybridge/unit_masks
index e02bb33..f35f32d 100644
--- a/events/i386/sandybridge/unit_masks
+++ b/events/i386/sandybridge/unit_masks
@@ -11,100 +11,100 @@ name:x10 type:mandatory default:0x10
name:x20 type:mandatory default:0x20
0x20 No unit mask
name:ld_blocks type:bitmask default:0x1
- 0x1 data_unknown blocked loads due to store buffer blocks with unknown data.
- 0x2 store_forward loads blocked by overlapping with store buffer that cannot be forwarded
- 0x8 no_sr This event counts the number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.
- 0x10 all_block Number of cases where any load is blocked but has no DCU miss.
+ 0x1 extra: data_unknown blocked loads due to store buffer blocks with unknown data.
+ 0x2 extra: store_forward loads blocked by overlapping with store buffer that cannot be forwarded
+ 0x8 extra: no_sr This event counts the number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.
+ 0x10 extra: all_block Number of cases where any load is blocked but has no DCU miss.
name:misalign_mem_ref type:bitmask default:0x1
- 0x1 loads Speculative cache-line split load uops dispatched to the L1D.
- 0x2 stores Speculative cache-line split Store-address uops dispatched to L1D
+ 0x1 extra: loads Speculative cache-line split load uops dispatched to the L1D.
+ 0x2 extra: stores Speculative cache-line split Store-address uops dispatched to L1D
name:ld_blocks_partial type:bitmask default:0x1
- 0x1 address_alias False dependencies in MOB due to partial compare on address
- 0x8 all_sta_block This event counts the number of times that load operations are temporarily blocked because of older stores, with addresses that are not yet known. A load operation may incur more than one block of this type.
+ 0x1 extra: address_alias False dependencies in MOB due to partial compare on address
+ 0x8 extra: all_sta_block This event counts the number of times that load operations are temporarily blocked because of older stores, with addresses that are not yet known. A load operation may incur more than one block of this type.
name:dtlb_load_misses type:bitmask default:0x1
- 0x1 miss_causes_a_walk Miss in all TLB levels causes an page walk of any page size (4K/2M/4M/1G)
- 0x2 walk_completed Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M/1G)
- 0x4 walk_duration Cycles PMH is busy with this walk
- 0x10 stlb_hit First level miss but second level hit; no page walk.
+ 0x1 extra: miss_causes_a_walk Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G)
+ 0x2 extra: walk_completed Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M/1G)
+ 0x4 extra: walk_duration Cycles PMH is busy with this walk
+ 0x10 extra: stlb_hit First level miss but second level hit; no page walk.
name:int_misc type:bitmask default:0x40
- 0x40 rat_stall_cycles Cycles Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for this thread.
+ 0x40 extra: rat_stall_cycles Cycles Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for this thread.
0x3 extra:cmask=1 recovery_cycles Number of cycles waiting to be recover after Nuke due to all other cases except JEClear.
0x3 extra:cmask=1,edge recovery_stalls_count Edge applied to recovery_cycles, thus counts occurrences.
-name:uops_issued type:bitmask default:0x1
- 0x1 any Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)
+name:uops_issued type:bitmask default:any
+ 0x1 extra: any Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)
0x1 extra:cmask=1,inv stall_cycles cycles no uops issued by this thread.
-name:arith type:bitmask default:0x1
- 0x1 fpu_div_active Cycles that the divider is busy with any divide or sqrt operation.
+name:arith type:bitmask default:fpu_div_active
+ 0x1 extra: fpu_div_active Cycles that the divider is busy with any divide or sqrt operation.
0x1 extra:cmask=1,edge fpu_div Number of times that the divider is actived, includes INT, SIMD and FP.
name:l2_rqsts type:bitmask default:0x1
- 0x1 demand_data_rd_hit Demand Data Read hit L2, no rejects
- 0x4 rfo_hit RFO requests that hit L2 cache
- 0x8 rfo_miss RFO requests that miss L2 cache
- 0x10 code_rd_hit L2 cache hits when fetching instructions, code reads.
- 0x20 code_rd_miss L2 cache misses when fetching instructions
- 0x40 pf_hit Requests from the L2 hardware prefetchers that hit L2 cache
- 0x80 pf_miss Requests from the L2 hardware prefetchers that miss L2 cache
- 0x3 all_demand_data_rd Any data read request to L2 cache
- 0xc all_rfo Any data RFO request to L2 cache
- 0x30 all_code_rd Any code read request to L2 cache
- 0xc0 all_pf Any L2 HW prefetch request to L2 cache
+ 0x1 extra: demand_data_rd_hit Demand Data Read hit L2, no rejects
+ 0x4 extra: rfo_hit RFO requests that hit L2 cache
+ 0x8 extra: rfo_miss RFO requests that miss L2 cache
+ 0x10 extra: code_rd_hit L2 cache hits when fetching instructions, code reads.
+ 0x20 extra: code_rd_miss L2 cache misses when fetching instructions
+ 0x40 extra: pf_hit Requests from the L2 hardware prefetchers that hit L2 cache
+ 0x80 extra: pf_miss Requests from the L2 hardware prefetchers that miss L2 cache
+ 0x3 extra: all_demand_data_rd Any data read request to L2 cache
+ 0xc extra: all_rfo Any data RFO request to L2 cache
+ 0x30 extra: all_code_rd Any code read request to L2 cache
+ 0xc0 extra: all_pf Any L2 HW prefetch request to L2 cache
name:l2_store_lock_rqsts type:bitmask default:0xf
- 0xf all RFOs that access cache lines in any state
- 0x1 miss RFO (as a result of regular RFO or Lock request) miss cache - I state
- 0x4 hit_e RFO (as a result of regular RFO or Lock request) hits cache in E state
- 0x8 hit_m RFO (as a result of regular RFO or Lock request) hits cache in M state
+ 0xf extra: all RFOs that access cache lines in any state
+ 0x1 extra: miss RFO (as a result of regular RFO or Lock request) miss cache - I state
+ 0x4 extra: hit_e RFO (as a result of regular RFO or Lock request) hits cache in E state
+ 0x8 extra: hit_m RFO (as a result of regular RFO or Lock request) hits cache in M state
name:l2_l1d_wb_rqsts type:bitmask default:0x4
- 0x4 hit_e writebacks from L1D to L2 cache lines in E state
- 0x8 hit_m writebacks from L1D to L2 cache lines in M state
-name:l1d_pend_miss type:bitmask default:0x1
- 0x1 pending Cycles with L1D load Misses outstanding.
+ 0x4 extra: hit_e writebacks from L1D to L2 cache lines in E state
+ 0x8 extra: hit_m writebacks from L1D to L2 cache lines in M state
+name:l1d_pend_miss type:bitmask default:pending
+ 0x1 extra: pending Cycles with L1D load Misses outstanding.
0x1 extra:cmask=1,edge occurences This event counts the number of L1D misses outstanding occurences.
name:dtlb_store_misses type:bitmask default:0x1
- 0x1 miss_causes_a_walk Miss in all TLB levels causes an page walk of any page size (4K/2M/4M/1G)
- 0x2 walk_completed Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M/1G)
- 0x4 walk_duration Cycles PMH is busy with this walk
- 0x10 stlb_hit First level miss but second level hit; no page walk. Only relevant if multiple levels.
+ 0x1 extra: miss_causes_a_walk Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G)
+ 0x2 extra: walk_completed Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M/1G)
+ 0x4 extra: walk_duration Cycles PMH is busy with this walk
+ 0x10 extra: stlb_hit First level miss but second level hit; no page walk. Only relevant if multiple levels.
name:load_hit_pre type:bitmask default:0x1
- 0x1 sw_pf Load dispatches that hit fill buffer allocated for S/W prefetch.
- 0x2 hw_pf Load dispatches that hit fill buffer allocated for HW prefetch.
+ 0x1 extra: sw_pf Load dispatches that hit fill buffer allocated for S/W prefetch.
+ 0x2 extra: hw_pf Load dispatches that hit fill buffer allocated for HW prefetch.
name:l1d type:bitmask default:0x1
- 0x1 replacement L1D Data line replacements.
- 0x2 allocated_in_m L1D M-state Data Cache Lines Allocated
- 0x4 eviction L1D M-state Data Cache Lines Evicted due to replacement (only)
- 0x8 all_m_replacement All Modified lines evicted out of L1D
-name:partial_rat_stalls type:bitmask default:0x20
- 0x20 flags_merge_uop Number of perf sensitive flags-merge uops added by Sandy Bridge u-arch.
- 0x40 slow_lea_window Number of cycles with at least 1 slow Load Effective Address (LEA) uop being allocated.
- 0x80 mul_single_uop Number of Multiply packed/scalar single precision uops allocated
+ 0x1 extra: replacement L1D Data line replacements.
+ 0x2 extra: allocated_in_m L1D M-state Data Cache Lines Allocated
+ 0x4 extra: eviction L1D M-state Data Cache Lines Evicted due to replacement (only)
+ 0x8 extra: all_m_replacement All Modified lines evicted out of L1D
+name:partial_rat_stalls type:bitmask default:flags_merge_uop
+ 0x20 extra: flags_merge_uop Number of perf sensitive flags-merge uops added by Sandy Bridge u-arch.
+ 0x40 extra: slow_lea_window Number of cycles with at least 1 slow Load Effective Address (LEA) uop being allocated.
+ 0x80 extra: mul_single_uop Number of Multiply packed/scalar single precision uops allocated
0x20 extra:cmask=1 flags_merge_uop_cycles Cycles with perf sensitive flags-merge uops added by SandyBridge u-arch.
name:resource_stalls2 type:bitmask default:0x40
- 0x40 bob_full Cycles Allocator is stalled due Branch Order Buffer (BOB).
- 0xf all_prf_control Resource stalls2 control structures full for physical registers
- 0xc all_fl_empty Cycles with either free list is empty
- 0x4f ooo_rsrc Resource stalls2 control structures full Physical Register Reclaim Table (PRRT), Physical History Table (PHT), INT or SIMD Free List (FL), Branch Order Buffer (BOB)
-name:cpl_cycles type:bitmask default:0x1
- 0x1 ring0 Unhalted core cycles the Thread was in Rings 0.
+ 0x40 extra: bob_full Cycles Allocator is stalled due to Branch Order Buffer (BOB).
+ 0xf extra: all_prf_control Resource stalls2 control structures full for physical registers
+ 0xc extra: all_fl_empty Cycles with either free list is empty
+ 0x4f extra: ooo_rsrc Resource stalls2 control structures full Physical Register Reclaim Table (PRRT), Physical History Table (PHT), INT or SIMD Free List (FL), Branch Order Buffer (BOB)
+name:cpl_cycles type:bitmask default:ring0
+ 0x1 extra: ring0 Unhalted core cycles the Thread was in Rings 0.
0x1 extra:cmask=1,edge ring0_trans Transitions from ring123 to Ring0.
- 0x2 ring123 Unhalted core cycles the Thread was in Rings 1/2/3.
-name:offcore_requests_outstanding type:bitmask default:0x1
- 0x1 demand_data_rd Offcore outstanding Demand Data Read transactions in the SuperQueue (SQ), queue to uncore, every cycle. Includes L1D data hardware prefetches.
+ 0x2 extra: ring123 Unhalted core cycles the Thread was in Rings 1/2/3.
+name:offcore_requests_outstanding type:bitmask default:cycles_with_demand_data_rd
+ 0x1 extra: demand_data_rd Offcore outstanding Demand Data Read transactions in the SuperQueue (SQ), queue to uncore, every cycle. Includes L1D data hardware prefetches.
0x1 extra:cmask=1 cycles_with_demand_data_rd cycles there are Offcore outstanding RD data transactions in the SuperQueue (SQ), queue to uncore.
- 0x2 demand_code_rd Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
- 0x4 demand_rfo Offcore outstanding RFO (store) transactions in the SuperQueue (SQ), queue to uncore, every cycle.
- 0x8 all_data_rd Offcore outstanding all cacheable Core Data Read transactions in the SuperQueue (SQ), queue to uncore, every cycle.
+ 0x2 extra: demand_code_rd Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
+ 0x4 extra: demand_rfo Offcore outstanding RFO (store) transactions in the SuperQueue (SQ), queue to uncore, every cycle.
+ 0x8 extra: all_data_rd Offcore outstanding all cacheable Core Data Read transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x8 extra:cmask=1 cycles_with_data_rd Cycles there are Offcore outstanding all Data read transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x2 extra:cmask=1 cycles_with_demand_code_rd Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x4 extra:cmask=1 cycles_with_demand_rfo Cycles with offcore outstanding demand RFO Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
name:lock_cycles type:bitmask default:0x1
- 0x1 split_lock_uc_lock_duration Cycles in which the L1D and L2 are locked, due to a UC lock or split lock
- 0x2 cache_lock_duration cycles that theL1D is locked
+ 0x1 extra: split_lock_uc_lock_duration Cycles in which the L1D and L2 are locked, due to a UC lock or split lock
+ 0x2 extra: cache_lock_duration Cycles that the L1D is locked
name:idq type:bitmask default:0x2
- 0x2 empty Cycles the Instruction Decode Queue (IDQ) is empty.
- 0x4 mite_uops Number of uops delivered to Instruction Decode Queue (IDQ) from MITE path.
- 0x8 dsb_uops Number of uops delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.
- 0x10 ms_dsb_uops Number of Uops delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB).
- 0x20 ms_mite_uops Number of Uops delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by MITE.
- 0x30 ms_uops Number of Uops were delivered into Instruction Decode Queue (IDQ) from MS, initiated by Decode Stream Buffer (DSB) or MITE.
+ 0x2 extra: empty Cycles the Instruction Decode Queue (IDQ) is empty.
+ 0x4 extra: mite_uops Number of uops delivered to Instruction Decode Queue (IDQ) from MITE path.
+ 0x8 extra: dsb_uops Number of uops delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.
+ 0x10 extra: ms_dsb_uops Number of Uops delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB).
+ 0x20 extra: ms_mite_uops Number of Uops delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by MITE.
+ 0x30 extra: ms_uops Number of Uops delivered into Instruction Decode Queue (IDQ) from MS, initiated by Decode Stream Buffer (DSB) or MITE.
0x30 extra:cmask=1 ms_cycles Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE.
0x4 extra:cmask=1 mite_cycles Cycles MITE is active
0x8 extra:cmask=1 dsb_cycles Cycles Decode Stream Buffer (DSB) is active
@@ -114,42 +114,42 @@ name:idq type:bitmask default:0x2
0x18 extra:cmask=4 all_dsb_cycles_4_uops Cycles Decode Stream Buffer (DSB) is delivering 4 Uops
0x24 extra:cmask=1 all_mite_cycles_any_uops Cycles MITE is delivering anything
0x24 extra:cmask=4 all_mite_cycles_4_uops Cycles MITE is delivering 4 Uops
- 0x3c mite_all_uops Number of uops delivered to Instruction Decode Queue (IDQ) from any path.
+ 0x3c extra: mite_all_uops Number of uops delivered to Instruction Decode Queue (IDQ) from any path.
name:itlb_misses type:bitmask default:0x1
- 0x1 miss_causes_a_walk Miss in all TLB levels causes an page walk of any page size (4K/2M/4M)
- 0x2 walk_completed Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M)
- 0x4 walk_duration Cycles PMH is busy with this walk.
- 0x10 stlb_hit First level miss but second level hit; no page walk.
+ 0x1 extra: miss_causes_a_walk Miss in all TLB levels causes a page walk of any page size (4K/2M/4M)
+ 0x2 extra: walk_completed Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M)
+ 0x4 extra: walk_duration Cycles PMH is busy with this walk.
+ 0x10 extra: stlb_hit First level miss but second level hit; no page walk.
name:ild_stall type:bitmask default:0x1
- 0x1 lcp Stall "occurrences" due to length changing prefixes (LCP).
- 0x4 iq_full Stall cycles when instructions cannot be written because the Instruction Queue (IQ) is full.
+ 0x1 extra: lcp Stall "occurrences" due to length changing prefixes (LCP).
+ 0x4 extra: iq_full Stall cycles when instructions cannot be written because the Instruction Queue (IQ) is full.
name:br_inst_exec type:bitmask default:0xff
- 0xff all_branches All branch instructions executed.
- 0x41 nontaken_conditional All macro conditional nontaken branch instructions.
- 0x81 taken_conditional All macro conditional taken branch instructions.
- 0x82 taken_direct_jump All macro unconditional taken branch instructions, excluding calls and indirects.
- 0x84 taken_indirect_jump_non_call_ret All taken indirect branches that are not calls nor returns.
- 0x88 taken_indirect_near_return All taken indirect branches that have a return mnemonic.
- 0x90 taken_direct_near_call All taken non-indirect calls.
- 0xa0 taken_indirect_near_call All taken indirect calls, including both register and memory indirect.
- 0xc1 all_conditional All macro conditional branch instructions.
- 0xc2 all_direct_jmp All macro unconditional branch instructions, excluding calls and indirects
- 0xc4 all_indirect_jump_non_call_ret All indirect branches that are not calls nor returns.
- 0xc8 all_indirect_near_return All indirect return branches.
- 0xd0 all_direct_near_call All non-indirect calls executed.
+ 0xff extra: all_branches All branch instructions executed.
+ 0x41 extra: nontaken_conditional All macro conditional nontaken branch instructions.
+ 0x81 extra: taken_conditional All macro conditional taken branch instructions.
+ 0x82 extra: taken_direct_jump All macro unconditional taken branch instructions, excluding calls and indirects.
+ 0x84 extra: taken_indirect_jump_non_call_ret All taken indirect branches that are not calls nor returns.
+ 0x88 extra: taken_indirect_near_return All taken indirect branches that have a return mnemonic.
+ 0x90 extra: taken_direct_near_call All taken non-indirect calls.
+ 0xa0 extra: taken_indirect_near_call All taken indirect calls, including both register and memory indirect.
+ 0xc1 extra: all_conditional All macro conditional branch instructions.
+ 0xc2 extra: all_direct_jmp All macro unconditional branch instructions, excluding calls and indirects
+ 0xc4 extra: all_indirect_jump_non_call_ret All indirect branches that are not calls nor returns.
+ 0xc8 extra: all_indirect_near_return All indirect return branches.
+ 0xd0 extra: all_direct_near_call All non-indirect calls executed.
name:br_misp_exec type:bitmask default:0xff
- 0xff all_branches All mispredicted branch instructions executed.
- 0x41 nontaken_conditional All nontaken mispredicted macro conditional branch instructions.
- 0x81 taken_conditional All taken mispredicted macro conditional branch instructions.
- 0x84 taken_indirect_jump_non_call_ret All taken mispredicted indirect branches that are not calls nor returns.
- 0x88 taken_return_near All taken mispredicted indirect branches that have a return mnemonic.
- 0x90 taken_direct_near_call All taken mispredicted non-indirect calls.
- 0xa0 taken_indirect_near_call All taken mispredicted indirect calls, including both register and memory indirect.
- 0xc1 all_conditional All mispredicted macro conditional branch instructions.
- 0xc4 all_indirect_jump_non_call_ret All mispredicted indirect branches that are not calls nor returns.
- 0xd0 all_direct_near_call All mispredicted non-indirect calls
-name:idq_uops_not_delivered type:bitmask default:0x1
- 0x1 core Count number of non-delivered uops to Resource Allocation Table (RAT).
+ 0xff extra: all_branches All mispredicted branch instructions executed.
+ 0x41 extra: nontaken_conditional All nontaken mispredicted macro conditional branch instructions.
+ 0x81 extra: taken_conditional All taken mispredicted macro conditional branch instructions.
+ 0x84 extra: taken_indirect_jump_non_call_ret All taken mispredicted indirect branches that are not calls nor returns.
+ 0x88 extra: taken_return_near All taken mispredicted indirect branches that have a return mnemonic.
+ 0x90 extra: taken_direct_near_call All taken mispredicted non-indirect calls.
+ 0xa0 extra: taken_indirect_near_call All taken mispredicted indirect calls, including both register and memory indirect.
+ 0xc1 extra: all_conditional All mispredicted macro conditional branch instructions.
+ 0xc4 extra: all_indirect_jump_non_call_ret All mispredicted indirect branches that are not calls nor returns.
+ 0xd0 extra: all_direct_near_call All mispredicted non-indirect calls
+name:idq_uops_not_delivered type:bitmask default:core
+ 0x1 extra: core Count number of non-delivered uops to Resource Allocation Table (RAT).
0x1 extra:cmask=4 cycles_0_uops_deliv.core Counts the cycles no uops were delivered
0x1 extra:cmask=3 cycles_le_1_uop_deliv.core Counts the cycles less than 1 uops were delivered
0x1 extra:cmask=2 cycles_le_2_uop_deliv.core Counts the cycles less than 2 uops were delivered
@@ -157,119 +157,119 @@ name:idq_uops_not_delivered type:bitmask default:0x1
0x1 extra:cmask=4,inv cycles_ge_1_uop_deliv.core Cycles when 1 or more uops were delivered to the by the front end.
0x1 extra:cmask=1,inv cycles_fe_was_ok Counts cycles FE delivered 4 uops or Resource Allocation Table (RAT) was stalling FE.
name:uops_dispatched_port type:bitmask default:0x1
- 0x1 port_0 Cycles which a Uop is dispatched on port 0
- 0x2 port_1 Cycles which a Uop is dispatched on port 1
- 0x4 port_2_ld Cycles which a load Uop is dispatched on port 2
- 0x8 port_2_sta Cycles which a STA Uop is dispatched on port 2
- 0x10 port_3_ld Cycles which a load Uop is dispatched on port 3
- 0x20 port_3_sta Cycles which a STA Uop is dispatched on port 3
- 0x40 port_4 Cycles which a Uop is dispatched on port 4
- 0x80 port_5 Cycles which a Uop is dispatched on port 5
- 0xc port_2 Uops disptached to port 2, loads and stores (speculative and retired)
- 0x30 port_3 Uops disptached to port 3, loads and stores (speculative and retired)
- 0xc port_2_core Uops disptached to port 2, loads and stores per core (speculative and retired)
- 0x30 port_3_core Uops disptached to port 3, loads and stores per core (speculative and retired)
+ 0x1 extra: port_0 Cycles which a Uop is dispatched on port 0
+ 0x2 extra: port_1 Cycles which a Uop is dispatched on port 1
+ 0x4 extra: port_2_ld Cycles which a load Uop is dispatched on port 2
+ 0x8 extra: port_2_sta Cycles which a STA Uop is dispatched on port 2
+ 0x10 extra: port_3_ld Cycles which a load Uop is dispatched on port 3
+ 0x20 extra: port_3_sta Cycles which a STA Uop is dispatched on port 3
+ 0x40 extra: port_4 Cycles which a Uop is dispatched on port 4
+ 0x80 extra: port_5 Cycles which a Uop is dispatched on port 5
+ 0xc extra: port_2 Uops dispatched to port 2, loads and stores (speculative and retired)
+ 0x30 extra: port_3 Uops dispatched to port 3, loads and stores (speculative and retired)
+ 0xc extra: port_2_core Uops dispatched to port 2, loads and stores per core (speculative and retired)
+ 0x30 extra: port_3_core Uops dispatched to port 3, loads and stores per core (speculative and retired)
name:resource_stalls type:bitmask default:0x1
- 0x1 any Cycles Allocation is stalled due to Resource Related reason.
- 0x2 lb Cycles Allocator is stalled due to Load Buffer full
- 0x4 rs Stall due to no eligible Reservation Station (RS) entry available.
- 0x8 sb Cycles Allocator is stalled due to Store Buffer full (not including draining from synch).
- 0x10 rob ROB full cycles.
- 0xe mem_rs Resource stalls due to LB, SB or Reservation Station (RS) being completely in use
- 0xf0 ooo_rsrc Resource stalls due to Rob being full, FCSW, MXCSR and OTHER
- 0xa lb_sb Resource stalls due to load or store buffers
+ 0x1 extra: any Cycles Allocation is stalled due to Resource Related reason.
+ 0x2 extra: lb Cycles Allocator is stalled due to Load Buffer full
+ 0x4 extra: rs Stall due to no eligible Reservation Station (RS) entry available.
+ 0x8 extra: sb Cycles Allocator is stalled due to Store Buffer full (not including draining from synch).
+ 0x10 extra: rob ROB full cycles.
+ 0xe extra: mem_rs Resource stalls due to LB, SB or Reservation Station (RS) being completely in use
+ 0xf0 extra: ooo_rsrc Resource stalls due to Rob being full, FCSW, MXCSR and OTHER
+ 0xa extra: lb_sb Resource stalls due to load or store buffers
name:dsb2mite_switches type:bitmask default:0x1
- 0x1 count Number of Decode Stream Buffer (DSB) to MITE switches
- 0x2 penalty_cycles Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.
+ 0x1 extra: count Number of Decode Stream Buffer (DSB) to MITE switches
+ 0x2 extra: penalty_cycles Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.
name:dsb_fill type:bitmask default:0x2
- 0x2 other_cancel Count number of times a valid DSB fill has been actually cancelled for any reason.
- 0x8 exceed_dsb_lines Decode Stream Buffer (DSB) Fill encountered > 3 Decode Stream Buffer (DSB) lines.
- 0xa all_cancel Count number of times a valid Decode Stream Buffer (DSB) fill has been actually cancelled for any reason.
+ 0x2 extra: other_cancel Count number of times a valid DSB fill has been actually cancelled for any reason.
+ 0x8 extra: exceed_dsb_lines Decode Stream Buffer (DSB) Fill encountered > 3 Decode Stream Buffer (DSB) lines.
+ 0xa extra: all_cancel Count number of times a valid Decode Stream Buffer (DSB) fill has been actually cancelled for any reason.
name:offcore_requests type:bitmask default:0x1
- 0x1 demand_data_rd Demand Data Read requests sent to uncore
- 0x2 demand_code_rd Offcore Code read requests. Includes Cacheable and Un-cacheables.
- 0x4 demand_rfo Offcore Demand RFOs. Includes regular RFO, Locks, ItoM.
- 0x8 all_data_rd Offcore Demand and prefetch data reads returned to the core.
-name:uops_dispatched type:bitmask default:0x1
- 0x1 thread Counts total number of uops to be dispatched per-thread each cycle.
+ 0x1 extra: demand_data_rd Demand Data Read requests sent to uncore
+ 0x2 extra: demand_code_rd Offcore Code read requests. Includes Cacheable and Un-cacheables.
+ 0x4 extra: demand_rfo Offcore Demand RFOs. Includes regular RFO, Locks, ItoM.
+ 0x8 extra: all_data_rd Offcore Demand and prefetch data reads returned to the core.
+name:uops_dispatched type:bitmask default:thread
+ 0x1 extra: thread Counts total number of uops to be dispatched per-thread each cycle.
0x1 extra:cmask=1,inv stall_cycles Counts number of cycles no uops were dispatced to be executed on this thread.
- 0x2 core Counts total number of uops dispatched from any thread
+ 0x2 extra: core Counts total number of uops dispatched from any thread
name:tlb_flush type:bitmask default:0x1
- 0x1 dtlb_thread Count number of DTLB flushes of thread-specific entries.
- 0x20 stlb_any Count number of any STLB flushes
-name:l1d_blocks type:bitmask default:0x1
- 0x1 ld_bank_conflict Any dispatched loads cancelled due to DCU bank conflict
+ 0x1 extra: dtlb_thread Count number of DTLB flushes of thread-specific entries.
+ 0x20 extra: stlb_any Count number of any STLB flushes
+name:l1d_blocks type:bitmask default:bank_conflict_cycles
+ 0x1 extra: ld_bank_conflict Any dispatched loads cancelled due to DCU bank conflict
0x5 extra:cmask=1 bank_conflict_cycles Cycles with l1d blocks due to bank conflicts
name:other_assists type:bitmask default:0x2
- 0x2 itlb_miss_retired Instructions that experienced an ITLB miss. Non Pebs
- 0x10 avx_to_sse Number of transitions from AVX-256 to legacy SSE when penalty applicable Non Pebs
- 0x20 sse_to_avx Number of transitions from legacy SSE to AVX-256 when penalty applicable Non Pebs
-name:uops_retired type:bitmask default:0x1
- 0x1 all All uops that actually retired.
- 0x2 retire_slots number of retirement slots used non PEBS
+ 0x2 extra: itlb_miss_retired Instructions that experienced an ITLB miss. Non Pebs
+ 0x10 extra: avx_to_sse Number of transitions from AVX-256 to legacy SSE when penalty applicable Non Pebs
+ 0x20 extra: sse_to_avx Number of transitions from legacy SSE to AVX-256 when penalty applicable Non Pebs
+name:uops_retired type:bitmask default:all
+ 0x1 extra: all All uops that actually retired.
+ 0x2 extra: retire_slots number of retirement slots used non PEBS
0x1 extra:cmask=1,inv stall_cycles Cycles no executable uops retired
0x1 extra:cmask=10,inv total_cycles Number of cycles using always true condition applied to non PEBS uops retired event.
name:machine_clears type:bitmask default:0x2
- 0x2 memory_ordering Number of Memory Ordering Machine Clears detected.
- 0x4 smc Number of Self-modifying code (SMC) Machine Clears detected.
- 0x20 maskmov Number of AVX masked mov Machine Clears detected.
+ 0x2 extra: memory_ordering Number of Memory Ordering Machine Clears detected.
+ 0x4 extra: smc Number of Self-modifying code (SMC) Machine Clears detected.
+ 0x20 extra: maskmov Number of AVX masked mov Machine Clears detected.
name:br_inst_retired type:bitmask default:0x1
- 0x1 conditional Counts all taken and not taken macro conditional branch instructions.
- 0x2 near_call Counts all macro direct and indirect near calls. non PEBS
- 0x8 near_return This event counts the number of near ret instructions retired.
- 0x10 not_taken Counts all not taken macro branch instructions retired.
- 0x20 near_taken Counts the number of near branch taken instructions retired.
- 0x40 far_branch Counts the number of far branch instructions retired.
- 0x4 all_branches_ps Counts all taken and not taken macro branches including far branches.(Precise Event)
- 0x2 near_call_r3 Ring123 only near calls (non precise)
- 0x2 near_call_r3_ps Ring123 only near calls (precise event)
+ 0x1 extra: conditional Counts all taken and not taken macro conditional branch instructions.
+ 0x2 extra: near_call Counts all macro direct and indirect near calls. non PEBS
+ 0x8 extra: near_return This event counts the number of near ret instructions retired.
+ 0x10 extra: not_taken Counts all not taken macro branch instructions retired.
+ 0x20 extra: near_taken Counts the number of near branch taken instructions retired.
+ 0x40 extra: far_branch Counts the number of far branch instructions retired.
+ 0x4 extra: all_branches_ps Counts all taken and not taken macro branches including far branches.(Precise Event)
+ 0x2 extra: near_call_r3 Ring123 only near calls (non precise)
+ 0x2 extra: near_call_r3_ps Ring123 only near calls (precise event)
name:br_misp_retired type:bitmask default:0x1
- 0x1 conditional All mispredicted macro conditional branch instructions.
- 0x2 near_call All macro direct and indirect near calls
- 0x10 not_taken number of branch instructions retired that were mispredicted and not-taken.
- 0x20 taken number of branch instructions retired that were mispredicted and taken.
- 0x4 all_branches_ps all macro branches (Precise Event)
+ 0x1 extra: conditional All mispredicted macro conditional branch instructions.
+ 0x2 extra: near_call All macro direct and indirect near calls
+ 0x10 extra: not_taken number of branch instructions retired that were mispredicted and not-taken.
+ 0x20 extra: taken number of branch instructions retired that were mispredicted and taken.
+ 0x4 extra: all_branches_ps all macro branches (Precise Event)
name:fp_assist type:bitmask default:0x1e
0x1e extra:cmask=1 any Counts any FP_ASSIST umask was incrementing.
- 0x2 x87_output output - Numeric Overflow, Numeric Underflow, Inexact Result
- 0x4 x87_input input - Invalid Operation, Denormal Operand, SNaN Operand
- 0x8 simd_output Any output SSE* FP Assist - Numeric Overflow, Numeric Underflow.
- 0x10 simd_input Any input SSE* FP Assist
+ 0x2 extra: x87_output output - Numeric Overflow, Numeric Underflow, Inexact Result
+ 0x4 extra: x87_input input - Invalid Operation, Denormal Operand, SNaN Operand
+ 0x8 extra: simd_output Any output SSE* FP Assist - Numeric Overflow, Numeric Underflow.
+ 0x10 extra: simd_input Any input SSE* FP Assist
name:mem_uops_retired type:bitmask default:0x11
- 0x11 stlb_miss_loads STLB misses dues to retired loads
- 0x12 stlb_miss_stores STLB misses dues to retired stores
- 0x21 lock_loads Locked retired loads
- 0x41 split_loads Retired loads causing cacheline splits
- 0x42 split_stores Retired stores causing cacheline splits
- 0x81 all_loads Any retired loads
- 0x82 all_stores Any retired stores
+ 0x11 extra: stlb_miss_loads STLB misses due to retired loads
+ 0x12 extra: stlb_miss_stores STLB misses due to retired stores
+ 0x21 extra: lock_loads Locked retired loads
+ 0x41 extra: split_loads Retired loads causing cacheline splits
+ 0x42 extra: split_stores Retired stores causing cacheline splits
+ 0x81 extra: all_loads Any retired loads
+ 0x82 extra: all_stores Any retired stores
name:mem_load_uops_retired type:bitmask default:0x1
- 0x1 l1_hit Load hit in nearest-level (L1D) cache
- 0x2 l2_hit Load hit in mid-level (L2) cache
- 0x4 llc_hit Load hit in last-level (L3) cache with no snoop needed
- 0x40 hit_lfb A load missed L1D but hit the Fill Buffer
+ 0x1 extra: l1_hit Load hit in nearest-level (L1D) cache
+ 0x2 extra: l2_hit Load hit in mid-level (L2) cache
+ 0x4 extra: llc_hit Load hit in last-level (L3) cache with no snoop needed
+ 0x40 extra: hit_lfb A load missed L1D but hit the Fill Buffer
name:mem_load_uops_llc_hit_retired type:bitmask default:0x1
- 0x1 xsnp_miss Load LLC Hit and a cross-core Snoop missed in on-pkg core cache
- 0x2 xsnp_hit Load LLC Hit and a cross-core Snoop hits in on-pkg core cache
- 0x4 xsnp_hitm Load had HitM Response from a core on same socket (shared LLC).
- 0x8 xsnp_none Load hit in last-level (L3) cache with no snoop needed.
+ 0x1 extra: xsnp_miss Load LLC Hit and a cross-core Snoop missed in on-pkg core cache
+ 0x2 extra: xsnp_hit Load LLC Hit and a cross-core Snoop hits in on-pkg core cache
+ 0x4 extra: xsnp_hitm Load had HitM Response from a core on same socket (shared LLC).
+ 0x8 extra: xsnp_none Load hit in last-level (L3) cache with no snoop needed.
name:l2_trans type:bitmask default:0x80
- 0x80 all_requests Transactions accessing L2 pipe
- 0x1 demand_data_rd Demand Data Read requests that access L2 cache, includes L1D prefetches.
- 0x2 rfo RFO requests that access L2 cache
- 0x4 code_rd L2 cache accesses when fetching instructions including L1D code prefetches
- 0x8 all_pf L2 or LLC HW prefetches that access L2 cache
- 0x10 l1d_wb L1D writebacks that access L2 cache
- 0x20 l2_fill L2 fill requests that access L2 cache
- 0x40 l2_wb L2 writebacks that access L2 cache
+ 0x80 extra: all_requests Transactions accessing L2 pipe
+ 0x1 extra: demand_data_rd Demand Data Read requests that access L2 cache, includes L1D prefetches.
+ 0x2 extra: rfo RFO requests that access L2 cache
+ 0x4 extra: code_rd L2 cache accesses when fetching instructions including L1D code prefetches
+ 0x8 extra: all_pf L2 or LLC HW prefetches that access L2 cache
+ 0x10 extra: l1d_wb L1D writebacks that access L2 cache
+ 0x20 extra: l2_fill L2 fill requests that access L2 cache
+ 0x40 extra: l2_wb L2 writebacks that access L2 cache
name:l2_lines_in type:bitmask default:0x7
- 0x7 all L2 cache lines filling L2
- 0x1 i L2 cache lines in I state filling L2
- 0x2 s L2 cache lines in S state filling L2
- 0x4 e L2 cache lines in E state filling L2
+ 0x7 extra: all L2 cache lines filling L2
+ 0x1 extra: i L2 cache lines in I state filling L2
+ 0x2 extra: s L2 cache lines in S state filling L2
+ 0x4 extra: e L2 cache lines in E state filling L2
name:l2_lines_out type:bitmask default:0x1
- 0x1 demand_clean Clean line evicted by a demand
- 0x2 demand_dirty Dirty line evicted by a demand
- 0x4 pf_clean Clean line evicted by an L2 Prefetch
- 0x8 pf_dirty Dirty line evicted by an L2 Prefetch
- 0xa dirty_all Any Dirty line evicted
+ 0x1 extra: demand_clean Clean line evicted by a demand
+ 0x2 extra: demand_dirty Dirty line evicted by a demand
+ 0x4 extra: pf_clean Clean line evicted by an L2 Prefetch
+ 0x8 extra: pf_dirty Dirty line evicted by an L2 Prefetch
+ 0xa extra: dirty_all Any Dirty line evicted
diff --git a/events/i386/silvermont/events b/events/i386/silvermont/events
new file mode 100644
index 0000000..434538f
--- /dev/null
+++ b/events/i386/silvermont/events
@@ -0,0 +1,24 @@
+#
+# Intel "Silvermont" microarchitecture core events.
+#
+# See http://ark.intel.com/ for help in identifying Silvermont based CPUs
+#
+# Note the minimum counts were not determined experimentally and could likely be
+# lowered in many cases without ill effect.
+#
+include:i386/arch_perfmon
+event:0x03 counters:0,1 um:rehabq minimum:200003 name:rehabq :
+event:0x04 counters:0,1 um:mem_uops_retired minimum:200003 name:mem_uops_retired :
+event:0x05 counters:0,1 um:page_walks minimum:200003 name:page_walks :
+event:0x30 counters:0,1 um:zero minimum:200003 name:l2_reject_xq_all :
+event:0x31 counters:0,1 um:zero minimum:200003 name:core_reject_l2q_all :
+event:0x80 counters:0,1 um:icache minimum:200003 name:icache :
+event:0xc2 counters:0,1 um:uops_retired minimum:2000003 name:uops_retired :
+event:0xc3 counters:0,1 um:machine_clears minimum:200003 name:machine_clears :
+event:0xc4 counters:0,1 um:br_inst_retired minimum:200003 name:br_inst_retired :
+event:0xc5 counters:0,1 um:br_misp_retired minimum:200003 name:br_misp_retired :
+event:0xca counters:0,1 um:no_alloc_cycles minimum:200003 name:no_alloc_cycles :
+event:0xcb counters:0,1 um:rs_full_stall minimum:200003 name:rs_full_stall :
+event:0xcd counters:0,1 um:one minimum:2000003 name:cycles_div_busy_all :
+event:0xe6 counters:0,1 um:baclears minimum:200003 name:baclears :
+event:0xe7 counters:0,1 um:one minimum:200003 name:ms_decoded_ms_entry :
diff --git a/events/i386/silvermont/unit_masks b/events/i386/silvermont/unit_masks
new file mode 100644
index 0000000..c0dac26
--- /dev/null
+++ b/events/i386/silvermont/unit_masks
@@ -0,0 +1,89 @@
+#
+# Unit masks for the Intel "Silvermont" micro architecture
+#
+# See http://ark.intel.com/ for help in identifying Silvermont based CPUs
+#
+include:i386/arch_perfmon
+name:rehabq type:exclusive default:0x1
+ 0x1 extra: ld_block_st_forward This event counts the number of retired loads that were prohibited from receiving forwarded data from the store because of address mismatch.
+ 0x1 extra:pebs ld_block_st_forward_pebs This event counts the number of retired loads that were prohibited from receiving forwarded data from the store because of address mismatch.
+ 0x2 extra: ld_block_std_notready This event counts the cases where a forward was technically possible, but did not occur because the store data was not available at the right time
+ 0x4 extra: st_splits This event counts the number of retired stores that experienced cache line boundary splits
+ 0x8 extra: ld_splits This event counts the number of retired loads that experienced cache line boundary splits
+ 0x8 extra:pebs ld_splits_pebs This event counts the number of retired loads that experienced cache line boundary splits
+ 0x10 extra: lock This event counts the number of retired memory operations with lock semantics. These are either implicit locked instructions such as the XCHG instruction or instructions with an explicit LOCK prefix (0xF0).
+ 0x20 extra: sta_full This event counts the number of retired stores that are delayed because there is not a store address buffer available.
+ 0x40 extra: any_ld This event counts the number of load uops reissued from Rehabq
+ 0x80 extra: any_st This event counts the number of store uops reissued from Rehabq
+name:mem_uops_retired type:exclusive default:0x1
+ 0x1 extra: l1_miss_loads This event counts the number of load ops retired that miss in L1 Data cache. Note that prefetch misses will not be counted.
+ 0x2 extra: l2_hit_loads This event counts the number of load ops retired that hit in the L2
+ 0x2 extra:pebs l2_hit_loads_pebs This event counts the number of load ops retired that hit in the L2
+ 0x4 extra: l2_miss_loads This event counts the number of load ops retired that miss in the L2
+ 0x4 extra:pebs l2_miss_loads_pebs This event counts the number of load ops retired that miss in the L2
+ 0x8 extra: dtlb_miss_loads This event counts the number of load ops retired that had DTLB miss.
+ 0x8 extra:pebs dtlb_miss_loads_pebs This event counts the number of load ops retired that had DTLB miss.
+ 0x10 extra: utlb_miss This event counts the number of load ops retired that had UTLB miss.
+ 0x20 extra: hitm This event counts the number of load ops retired that got data from the other core or from the other module.
+ 0x20 extra:pebs hitm_pebs This event counts the number of load ops retired that got data from the other core or from the other module.
+ 0x40 extra: all_loads This event counts the number of load ops retired
+ 0x80 extra: all_stores This event counts the number of store ops retired
+name:page_walks type:exclusive default:0x1
+ 0x1 extra:edge d_side_walks This event counts when a data (D) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks.
+ 0x1 extra: d_side_cycles This event counts every cycle when a D-side (walks due to a load) page walk is in progress. Page walk duration divided by number of page walks is the average duration of page-walks.
+ 0x2 extra:edge i_side_walks This event counts when an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks.
+ 0x2 extra: i_side_cycles This event counts every cycle when an I-side (walks due to an instruction fetch) page walk is in progress. Page walk duration divided by number of page walks is the average duration of page-walks.
+ 0x3 extra:edge walks This event counts when a data (D) page walk or an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks.
+ 0x3 extra: cycles This event counts every cycle when a data (D) page walk or instruction (I) page walk is in progress. Since a pagewalk implies a TLB miss, the approximate cost of a TLB miss can be determined from this event.
+name:icache type:exclusive default:0x3
+ 0x3 extra: accesses This event counts all instruction fetches, including uncacheable fetches.
+ 0x1 extra: hit This event counts all instruction fetches from the instruction cache.
+ 0x2 extra: misses This event counts all instruction fetches that miss the Instruction cache or produce memory requests. This includes uncacheable fetches. An instruction fetch miss is counted only once and not once for every cycle it is outstanding.
+name:uops_retired type:exclusive default:0x10
+ 0x10 extra: all This event counts the number of micro-ops retired. The processor decodes complex macro instructions into a sequence of simpler micro-ops. Most instructions are composed of one or two micro-ops. Some instructions are decoded into longer sequences such as repeat instructions, floating point transcendental instructions, and assists. In some cases micro-op sequences are fused or whole instructions are fused into one micro-op. See other UOPS_RETIRED events for differentiating retired fused and non-fused micro-ops.
+ 0x1 extra: ms This event counts the number of micro-ops retired that were supplied from MSROM.
+name:machine_clears type:exclusive default:0x8
+ 0x8 extra: all Machine clears happen when something happens in the machine that causes the hardware to need to take special care to get the right answer. When such a condition is signaled on an instruction, the front end of the machine is notified that it must restart, so no more instructions will be decoded from the current path. All instructions "older" than this one will be allowed to finish. This instruction and all "younger" instructions must be cleared, since they must not be allowed to complete. Essentially, the hardware waits until the problematic instruction is the oldest instruction in the machine. This means all older instructions are retired, and all pending stores (from older instructions) are completed. Then the new path of instructions from the front end are allowed to start into the machine. There are many conditions that might cause a machine clear (including the receipt of an interrupt, or a trap or a fault). All those conditions (including but not limited to MACHINE_CLEARS.MEMORY_ORDERING, MACHINE_CLEARS.SMC, and MACHINE_CLEARS.FP_ASSIST) are captured in the ANY event. In addition, some conditions can be specifically counted (i.e. SMC, MEMORY_ORDERING, FP_ASSIST). However, the sum of SMC, MEMORY_ORDERING, and FP_ASSIST machine clears will not necessarily equal the number of ANY.
+ 0x1 extra: smc This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel(R) architecture processors.
+ 0x2 extra: memory_ordering This event counts the number of times that the pipeline was cleared due to memory ordering issues.
+ 0x4 extra: fp_assist This event counts the number of times that the pipeline stalled due to FP operations needing assists.
+name:br_inst_retired type:exclusive default:0x7e
+ 0x7e extra: jcc JCC counts the number of conditional branch (JCC) instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0x7e extra:pebs jcc_pebs JCC counts the number of conditional branch (JCC) instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xfe extra: taken_jcc TAKEN_JCC counts the number of taken conditional branch (JCC) instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xfe extra:pebs taken_jcc_pebs TAKEN_JCC counts the number of taken conditional branch (JCC) instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xf9 extra: call CALL counts the number of near CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xf9 extra:pebs call_pebs CALL counts the number of near CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xfd extra: rel_call REL_CALL counts the number of near relative CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xfd extra:pebs rel_call_pebs REL_CALL counts the number of near relative CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xfb extra: ind_call IND_CALL counts the number of near indirect CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xfb extra:pebs ind_call_pebs IND_CALL counts the number of near indirect CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xf7 extra: return RETURN counts the number of near RET branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xf7 extra:pebs return_pebs RETURN counts the number of near RET branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xeb extra: non_return_ind NON_RETURN_IND counts the number of near indirect JMP and near indirect CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xeb extra:pebs non_return_ind_pebs NON_RETURN_IND counts the number of near indirect JMP and near indirect CALL branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xbf extra: far_branch FAR counts the number of far branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+ 0xbf extra:pebs far_branch_pebs FAR counts the number of far branch instructions retired. Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: conditional branches, direct calls and jumps, indirect calls and jumps, returns.
+name:br_misp_retired type:exclusive default:0x7e
+ 0x7e extra: jcc JCC counts the number of mispredicted conditional branch (JCC) instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0x7e extra:pebs jcc_pebs JCC counts the number of mispredicted conditional branch (JCC) instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xfe extra: taken_jcc TAKEN_JCC counts the number of mispredicted taken conditional branch (JCC) instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xfe extra:pebs taken_jcc_pebs TAKEN_JCC counts the number of mispredicted taken conditional branch (JCC) instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xfb extra: ind_call IND_CALL counts the number of mispredicted near indirect CALL branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xfb extra:pebs ind_call_pebs IND_CALL counts the number of mispredicted near indirect CALL branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xf7 extra: return RETURN counts the number of mispredicted near RET branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xf7 extra:pebs return_pebs RETURN counts the number of mispredicted near RET branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xeb extra: non_return_ind NON_RETURN_IND counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+ 0xeb extra:pebs non_return_ind_pebs NON_RETURN_IND counts the number of mispredicted near indirect JMP and near indirect CALL branch instructions retired. This event counts the number of retired branch instructions that were mispredicted by the processor, categorized by type. A branch misprediction occurs when the processor predicts that the branch would be taken, but it is not, or vice-versa. When the misprediction is discovered, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.
+name:no_alloc_cycles type:exclusive default:0x3f
+ 0x3f extra: all The NO_ALLOC_CYCLES.ALL event counts the number of cycles when the front-end does not provide any instructions to be allocated for any reason. This event indicates the cycles where an allocation stall occurs, and no UOPS are allocated in that cycle.
+ 0x1 extra: rob_full Counts the number of cycles when no uops are allocated and the ROB is full (less than 2 entries available)
+ 0x20 extra: rat_stall Counts the number of cycles when no uops are allocated and a RAT stall is asserted.
+ 0x50 extra: not_delivered The NO_ALLOC_CYCLES.NOT_DELIVERED event is used to measure front-end inefficiencies, i.e. when the front-end of the machine is not delivering micro-ops to the back-end and the back-end is not stalled. This event can be used to identify if the machine is truly front-end bound. When this event occurs, it is an indication that the front-end of the machine is operating at less than its theoretical peak performance. Background: We can think of the processor pipeline as being divided into 2 broader parts: Front-end and Back-end. The front-end is responsible for fetching instructions, decoding them into micro-ops (uops) in a machine-understandable format, and putting them into a micro-op queue to be consumed by the back-end. The back-end then takes these micro-ops and allocates the required resources. When all resources are ready, micro-ops are executed. If the back-end is not ready to accept micro-ops from the front-end, then we do not want to count these as front-end bottlenecks. However, whenever we have bottlenecks in the back-end, we will have allocation unit stalls, eventually forcing the front-end to wait until the back-end is ready to receive more UOPS. This event counts the cycles only when the back-end is requesting more uops and the front-end is not able to provide them. Some examples of conditions that cause front-end inefficiencies are: Icache misses, ITLB misses, and decoder restrictions that limit the front-end bandwidth.
+name:rs_full_stall type:exclusive default:0x1f
+ 0x1f extra: all Counts the number of cycles the Alloc pipeline is stalled when any one of the RSs (IEC, FPC and MEC) is full. This event is a superset of all the individual RS stall event counts.
+ 0x1 extra: mec Counts the number of cycles the allocation pipeline is stalled and is waiting for a free MEC reservation station entry. The cycles should be appropriately counted in case of the cracked ops e.g. In case of a cracked load-op, the load portion is sent to M
+name:baclears type:exclusive default:0x1
+ 0x1 extra: all The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.ANY event counts the number of baclears for any type of branch.
+ 0x8 extra: return The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.RETURN event counts the number of RETURN baclears.
+ 0x10 extra: cond The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.COND event counts the number of JCC (Jump on Conditional Code) baclears.
diff --git a/events/i386/westmere/unit_masks b/events/i386/westmere/unit_masks
index c98d81a..56206ce 100644
--- a/events/i386/westmere/unit_masks
+++ b/events/i386/westmere/unit_masks
@@ -16,291 +16,291 @@ name:x10 type:mandatory default:0x10
name:x20 type:mandatory default:0x20
0x20 No unit mask
name:arith type:bitmask default:0x01
- 0x01 cycles_div_busy Cycles the divider is busy
- 0x02 mul Multiply operations executed
+ 0x01 extra: cycles_div_busy Cycles the divider is busy
+ 0x02 extra: mul Multiply operations executed
name:baclear type:bitmask default:0x01
- 0x01 clear BACLEAR asserted, regardless of cause
- 0x02 bad_target BACLEAR asserted with bad target address
+ 0x01 extra: clear BACLEAR asserted, regardless of cause
+ 0x02 extra: bad_target BACLEAR asserted with bad target address
name:bpu_clears type:bitmask default:0x01
- 0x01 early Early Branch Prediction Unit clears
- 0x02 late Late Branch Prediction Unit clears
+ 0x01 extra: early Early Branch Prediction Unit clears
+ 0x02 extra: late Late Branch Prediction Unit clears
name:br_inst_exec type:bitmask default:0x7f
- 0x01 cond Conditional branch instructions executed
- 0x02 direct Unconditional branches executed
- 0x04 indirect_non_call Indirect non call branches executed
- 0x07 non_calls All non call branches executed
- 0x08 return_near Indirect return branches executed
- 0x10 direct_near_call Unconditional call branches executed
- 0x20 indirect_near_call Indirect call branches executed
- 0x30 near_calls Call branches executed
- 0x40 taken Taken branches executed
- 0x7f any Branch instructions executed
+ 0x01 extra: cond Conditional branch instructions executed
+ 0x02 extra: direct Unconditional branches executed
+ 0x04 extra: indirect_non_call Indirect non call branches executed
+ 0x07 extra: non_calls All non call branches executed
+ 0x08 extra: return_near Indirect return branches executed
+ 0x10 extra: direct_near_call Unconditional call branches executed
+ 0x20 extra: indirect_near_call Indirect call branches executed
+ 0x30 extra: near_calls Call branches executed
+ 0x40 extra: taken Taken branches executed
+ 0x7f extra: any Branch instructions executed
name:br_inst_retired type:bitmask default:0x04
- 0x01 conditional Retired conditional branch instructions (Precise Event)
- 0x02 near_call Retired near call instructions (Precise Event)
- 0x04 all_branches Retired branch instructions (Precise Event)
+ 0x01 extra: conditional Retired conditional branch instructions (Precise Event)
+ 0x02 extra: near_call Retired near call instructions (Precise Event)
+ 0x04 extra: all_branches Retired branch instructions (Precise Event)
name:br_misp_exec type:bitmask default:0x7f
- 0x01 cond Mispredicted conditional branches executed
- 0x02 direct Mispredicted unconditional branches executed
- 0x04 indirect_non_call Mispredicted indirect non call branches executed
- 0x07 non_calls Mispredicted non call branches executed
- 0x08 return_near Mispredicted return branches executed
- 0x10 direct_near_call Mispredicted non call branches executed
- 0x20 indirect_near_call Mispredicted indirect call branches executed
- 0x30 near_calls Mispredicted call branches executed
- 0x40 taken Mispredicted taken branches executed
- 0x7f any Mispredicted branches executed
+ 0x01 extra: cond Mispredicted conditional branches executed
+ 0x02 extra: direct Mispredicted unconditional branches executed
+ 0x04 extra: indirect_non_call Mispredicted indirect non call branches executed
+ 0x07 extra: non_calls Mispredicted non call branches executed
+ 0x08 extra: return_near Mispredicted return branches executed
+ 0x10 extra: direct_near_call Mispredicted unconditional call branches executed
+ 0x20 extra: indirect_near_call Mispredicted indirect call branches executed
+ 0x30 extra: near_calls Mispredicted call branches executed
+ 0x40 extra: taken Mispredicted taken branches executed
+ 0x7f extra: any Mispredicted branches executed
name:br_misp_retired type:bitmask default:0x04
- 0x01 conditional Mispredicted conditional retired branches (Precise Event)
- 0x02 near_call Mispredicted near retired calls (Precise Event)
- 0x04 all_branches Mispredicted retired branch instructions (Precise Event)
+ 0x01 extra: conditional Mispredicted conditional retired branches (Precise Event)
+ 0x02 extra: near_call Mispredicted near retired calls (Precise Event)
+ 0x04 extra: all_branches Mispredicted retired branch instructions (Precise Event)
name:cache_lock_cycles type:bitmask default:0x01
- 0x01 l1d_l2 Cycles L1D and L2 locked
- 0x02 l1d Cycles L1D locked
+ 0x01 extra: l1d_l2 Cycles L1D and L2 locked
+ 0x02 extra: l1d Cycles L1D locked
name:cpu_clk_unhalted type:bitmask default:0x00
- 0x00 thread_p Cycles when thread is not halted (programmable counter)
- 0x01 ref_p Reference base clock (133 Mhz) cycles when thread is not halted (programmable counter)
+ 0x00 extra: thread_p Cycles when thread is not halted (programmable counter)
+ 0x01 extra: ref_p Reference base clock (133 MHz) cycles when thread is not halted (programmable counter)
name:dtlb_load_misses type:bitmask default:0x01
- 0x01 any DTLB load misses
- 0x02 walk_completed DTLB load miss page walks complete
- 0x04 walk_cycles DTLB load miss page walk cycles
- 0x10 stlb_hit DTLB second level hit
- 0x20 pde_miss DTLB load miss caused by low part of address
- 0x80 large_walk_completed DTLB load miss large page walks
+ 0x01 extra: any DTLB load misses
+ 0x02 extra: walk_completed DTLB load miss page walks complete
+ 0x04 extra: walk_cycles DTLB load miss page walk cycles
+ 0x10 extra: stlb_hit DTLB second level hit
+ 0x20 extra: pde_miss DTLB load miss caused by low part of address
+ 0x80 extra: large_walk_completed DTLB load miss large page walks
name:dtlb_misses type:bitmask default:0x01
- 0x01 any DTLB misses
- 0x02 walk_completed DTLB miss page walks
- 0x04 walk_cycles DTLB miss page walk cycles
- 0x10 stlb_hit DTLB first level misses but second level hit
- 0x20 pde_miss DTLB misses casued by low part of address
- 0x80 large_walk_completed DTLB miss large page walks
+ 0x01 extra: any DTLB misses
+ 0x02 extra: walk_completed DTLB miss page walks
+ 0x04 extra: walk_cycles DTLB miss page walk cycles
+ 0x10 extra: stlb_hit DTLB first level misses but second level hit
+ 0x20 extra: pde_miss DTLB misses caused by low part of address
+ 0x80 extra: large_walk_completed DTLB miss large page walks
name:fp_assist type:bitmask default:0x01
- 0x01 all X87 Floating point assists (Precise Event)
- 0x02 output X87 Floating point assists for invalid output value (Precise Event)
- 0x04 input X87 Floating poiint assists for invalid input value (Precise Event)
+ 0x01 extra: all X87 Floating point assists (Precise Event)
+ 0x02 extra: output X87 Floating point assists for invalid output value (Precise Event)
+ 0x04 extra: input X87 Floating point assists for invalid input value (Precise Event)
name:fp_comp_ops_exe type:bitmask default:0x01
- 0x01 x87 Computational floating-point operations executed
- 0x02 mmx MMX Uops
- 0x04 sse_fp SSE and SSE2 FP Uops
- 0x08 sse2_integer SSE2 integer Uops
- 0x10 sse_fp_packed SSE FP packed Uops
- 0x20 sse_fp_scalar SSE FP scalar Uops
- 0x40 sse_single_precision SSE* FP single precision Uops
- 0x80 sse_double_precision SSE* FP double precision Uops
+ 0x01 extra: x87 Computational floating-point operations executed
+ 0x02 extra: mmx MMX Uops
+ 0x04 extra: sse_fp SSE and SSE2 FP Uops
+ 0x08 extra: sse2_integer SSE2 integer Uops
+ 0x10 extra: sse_fp_packed SSE FP packed Uops
+ 0x20 extra: sse_fp_scalar SSE FP scalar Uops
+ 0x40 extra: sse_single_precision SSE* FP single precision Uops
+ 0x80 extra: sse_double_precision SSE* FP double precision Uops
name:fp_mmx_trans type:bitmask default:0x03
- 0x01 to_fp Transitions from MMX to Floating Point instructions
- 0x02 to_mmx Transitions from Floating Point to MMX instructions
- 0x03 any All Floating Point to and from MMX transitions
+ 0x01 extra: to_fp Transitions from MMX to Floating Point instructions
+ 0x02 extra: to_mmx Transitions from Floating Point to MMX instructions
+ 0x03 extra: any All Floating Point to and from MMX transitions
name:ild_stall type:bitmask default:0x0f
- 0x01 lcp Length Change Prefix stall cycles
- 0x02 mru Stall cycles due to BPU MRU bypass
- 0x04 iq_full Instruction Queue full stall cycles
- 0x08 regen Regen stall cycles
- 0x0f any Any Instruction Length Decoder stall cycles
+ 0x01 extra: lcp Length Change Prefix stall cycles
+ 0x02 extra: mru Stall cycles due to BPU MRU bypass
+ 0x04 extra: iq_full Instruction Queue full stall cycles
+ 0x08 extra: regen Regen stall cycles
+ 0x0f extra: any Any Instruction Length Decoder stall cycles
name:inst_retired type:bitmask default:0x01
- 0x01 any_p Instructions retired (Programmable counter and Precise Event)
- 0x02 x87 Retired floating-point operations (Precise Event)
- 0x04 mmx Retired MMX instructions (Precise Event)
+ 0x01 extra: any_p Instructions retired (Programmable counter and Precise Event)
+ 0x02 extra: x87 Retired floating-point operations (Precise Event)
+ 0x04 extra: mmx Retired MMX instructions (Precise Event)
name:itlb_misses type:bitmask default:0x01
- 0x01 any ITLB miss
- 0x02 walk_completed ITLB miss page walks
- 0x04 walk_cycles ITLB miss page walk cycles
- 0x80 large_walk_completed ITLB miss large page walks
+ 0x01 extra: any ITLB miss
+ 0x02 extra: walk_completed ITLB miss page walks
+ 0x04 extra: walk_cycles ITLB miss page walk cycles
+ 0x80 extra: large_walk_completed ITLB miss large page walks
name:l1d type:bitmask default:0x01
- 0x01 repl L1 data cache lines allocated
- 0x02 m_repl L1D cache lines allocated in the M state
- 0x04 m_evict L1D cache lines replaced in M state
- 0x08 m_snoop_evict L1D snoop eviction of cache lines in M state
+ 0x01 extra: repl L1 data cache lines allocated
+ 0x02 extra: m_repl L1D cache lines allocated in the M state
+ 0x04 extra: m_evict L1D cache lines replaced in M state
+ 0x08 extra: m_snoop_evict L1D snoop eviction of cache lines in M state
name:l1d_prefetch type:bitmask default:0x01
- 0x01 requests L1D hardware prefetch requests
- 0x02 miss L1D hardware prefetch misses
- 0x04 triggers L1D hardware prefetch requests triggered
+ 0x01 extra: requests L1D hardware prefetch requests
+ 0x02 extra: miss L1D hardware prefetch misses
+ 0x04 extra: triggers L1D hardware prefetch requests triggered
name:l1d_wb_l2 type:bitmask default:0x0f
- 0x01 i_state L1 writebacks to L2 in I state (misses)
- 0x02 s_state L1 writebacks to L2 in S state
- 0x04 e_state L1 writebacks to L2 in E state
- 0x08 m_state L1 writebacks to L2 in M state
- 0x0f mesi All L1 writebacks to L2
+ 0x01 extra: i_state L1 writebacks to L2 in I state (misses)
+ 0x02 extra: s_state L1 writebacks to L2 in S state
+ 0x04 extra: e_state L1 writebacks to L2 in E state
+ 0x08 extra: m_state L1 writebacks to L2 in M state
+ 0x0f extra: mesi All L1 writebacks to L2
name:l1i type:bitmask default:0x01
- 0x01 hits L1I instruction fetch hits
- 0x02 misses L1I instruction fetch misses
- 0x03 reads L1I Instruction fetches
- 0x04 cycles_stalled L1I instruction fetch stall cycles
+ 0x01 extra: hits L1I instruction fetch hits
+ 0x02 extra: misses L1I instruction fetch misses
+ 0x03 extra: reads L1I Instruction fetches
+ 0x04 extra: cycles_stalled L1I instruction fetch stall cycles
name:l2_data_rqsts type:bitmask default:0xff
- 0x01 demand_i_state L2 data demand loads in I state (misses)
- 0x02 demand_s_state L2 data demand loads in S state
- 0x04 demand_e_state L2 data demand loads in E state
- 0x08 demand_m_state L2 data demand loads in M state
- 0x0f demand_mesi L2 data demand requests
- 0x10 prefetch_i_state L2 data prefetches in the I state (misses)
- 0x20 prefetch_s_state L2 data prefetches in the S state
- 0x40 prefetch_e_state L2 data prefetches in E state
- 0x80 prefetch_m_state L2 data prefetches in M state
- 0xf0 prefetch_mesi All L2 data prefetches
- 0xff any All L2 data requests
+ 0x01 extra: demand_i_state L2 data demand loads in I state (misses)
+ 0x02 extra: demand_s_state L2 data demand loads in S state
+ 0x04 extra: demand_e_state L2 data demand loads in E state
+ 0x08 extra: demand_m_state L2 data demand loads in M state
+ 0x0f extra: demand_mesi L2 data demand requests
+ 0x10 extra: prefetch_i_state L2 data prefetches in the I state (misses)
+ 0x20 extra: prefetch_s_state L2 data prefetches in the S state
+ 0x40 extra: prefetch_e_state L2 data prefetches in E state
+ 0x80 extra: prefetch_m_state L2 data prefetches in M state
+ 0xf0 extra: prefetch_mesi All L2 data prefetches
+ 0xff extra: any All L2 data requests
name:l2_lines_in type:bitmask default:0x07
- 0x02 s_state L2 lines allocated in the S state
- 0x04 e_state L2 lines allocated in the E state
- 0x07 any L2 lines alloacated
+ 0x02 extra: s_state L2 lines allocated in the S state
+ 0x04 extra: e_state L2 lines allocated in the E state
+ 0x07 extra: any L2 lines allocated
name:l2_lines_out type:bitmask default:0x0f
- 0x01 demand_clean L2 lines evicted by a demand request
- 0x02 demand_dirty L2 modified lines evicted by a demand request
- 0x04 prefetch_clean L2 lines evicted by a prefetch request
- 0x08 prefetch_dirty L2 modified lines evicted by a prefetch request
- 0x0f any L2 lines evicted
+ 0x01 extra: demand_clean L2 lines evicted by a demand request
+ 0x02 extra: demand_dirty L2 modified lines evicted by a demand request
+ 0x04 extra: prefetch_clean L2 lines evicted by a prefetch request
+ 0x08 extra: prefetch_dirty L2 modified lines evicted by a prefetch request
+ 0x0f extra: any L2 lines evicted
name:l2_rqsts type:bitmask default:0x01
- 0x01 ld_hit L2 load hits
- 0x02 ld_miss L2 load misses
- 0x03 loads L2 requests
- 0x04 rfo_hit L2 RFO hits
- 0x08 rfo_miss L2 RFO misses
- 0x0c rfos L2 RFO requests
- 0x10 ifetch_hit L2 instruction fetch hits
- 0x20 ifetch_miss L2 instruction fetch misses
- 0x30 ifetches L2 instruction fetches
- 0x40 prefetch_hit L2 prefetch hits
- 0x80 prefetch_miss L2 prefetch misses
- 0xaa miss All L2 misses
- 0xc0 prefetches All L2 prefetches
- 0xff references All L2 requests
+ 0x01 extra: ld_hit L2 load hits
+ 0x02 extra: ld_miss L2 load misses
+ 0x03 extra: loads L2 requests
+ 0x04 extra: rfo_hit L2 RFO hits
+ 0x08 extra: rfo_miss L2 RFO misses
+ 0x0c extra: rfos L2 RFO requests
+ 0x10 extra: ifetch_hit L2 instruction fetch hits
+ 0x20 extra: ifetch_miss L2 instruction fetch misses
+ 0x30 extra: ifetches L2 instruction fetches
+ 0x40 extra: prefetch_hit L2 prefetch hits
+ 0x80 extra: prefetch_miss L2 prefetch misses
+ 0xaa extra: miss All L2 misses
+ 0xc0 extra: prefetches All L2 prefetches
+ 0xff extra: references All L2 requests
name:l2_transactions type:bitmask default:0x80
- 0x01 load L2 Load transactions
- 0x02 rfo L2 RFO transactions
- 0x04 ifetch L2 instruction fetch transactions
- 0x08 prefetch L2 prefetch transactions
- 0x10 l1d_wb L1D writeback to L2 transactions
- 0x20 fill L2 fill transactions
- 0x40 wb L2 writeback to LLC transactions
- 0x80 any All L2 transactions
+ 0x01 extra: load L2 Load transactions
+ 0x02 extra: rfo L2 RFO transactions
+ 0x04 extra: ifetch L2 instruction fetch transactions
+ 0x08 extra: prefetch L2 prefetch transactions
+ 0x10 extra: l1d_wb L1D writeback to L2 transactions
+ 0x20 extra: fill L2 fill transactions
+ 0x40 extra: wb L2 writeback to LLC transactions
+ 0x80 extra: any All L2 transactions
name:l2_write type:bitmask default:0x01
- 0x01 rfo_i_state L2 demand store RFOs in I state (misses)
- 0x02 rfo_s_state L2 demand store RFOs in S state
- 0x08 rfo_m_state L2 demand store RFOs in M state
- 0x0e rfo_hit All L2 demand store RFOs that hit the cache
- 0x0f rfo_mesi All L2 demand store RFOs
- 0x10 lock_i_state L2 demand lock RFOs in I state (misses)
- 0x20 lock_s_state L2 demand lock RFOs in S state
- 0x40 lock_e_state L2 demand lock RFOs in E state
- 0x80 lock_m_state L2 demand lock RFOs in M state
- 0xe0 lock_hit All demand L2 lock RFOs that hit the cache
- 0xf0 lock_mesi All demand L2 lock RFOs
+ 0x01 extra: rfo_i_state L2 demand store RFOs in I state (misses)
+ 0x02 extra: rfo_s_state L2 demand store RFOs in S state
+ 0x08 extra: rfo_m_state L2 demand store RFOs in M state
+ 0x0e extra: rfo_hit All L2 demand store RFOs that hit the cache
+ 0x0f extra: rfo_mesi All L2 demand store RFOs
+ 0x10 extra: lock_i_state L2 demand lock RFOs in I state (misses)
+ 0x20 extra: lock_s_state L2 demand lock RFOs in S state
+ 0x40 extra: lock_e_state L2 demand lock RFOs in E state
+ 0x80 extra: lock_m_state L2 demand lock RFOs in M state
+ 0xe0 extra: lock_hit All demand L2 lock RFOs that hit the cache
+ 0xf0 extra: lock_mesi All demand L2 lock RFOs
name:load_dispatch type:bitmask default:0x07
- 0x01 rs Loads dispatched that bypass the MOB
- 0x02 rs_delayed Loads dispatched from stage 305
- 0x04 mob Loads dispatched from the MOB
- 0x07 any All loads dispatched
+ 0x01 extra: rs Loads dispatched that bypass the MOB
+ 0x02 extra: rs_delayed Loads dispatched from stage 305
+ 0x04 extra: mob Loads dispatched from the MOB
+ 0x07 extra: any All loads dispatched
name:longest_lat_cache type:bitmask default:0x01
- 0x01 miss Longest latency cache miss
- 0x02 reference Longest latency cache reference
+ 0x01 extra: miss Longest latency cache miss
+ 0x02 extra: reference Longest latency cache reference
name:machine_clears type:bitmask default:0x01
- 0x01 cycles Cycles machine clear asserted
- 0x02 mem_order Execution pipeline restart due to Memory ordering conflicts
- 0x04 smc Self-Modifying Code detected
+ 0x01 extra: cycles Cycles machine clear asserted
+ 0x02 extra: mem_order Execution pipeline restart due to Memory ordering conflicts
+ 0x04 extra: smc Self-Modifying Code detected
name:mem_inst_retired type:bitmask default:0x01
- 0x01 loads Instructions retired which contains a load (Precise Event)
- 0x02 stores Instructions retired which contains a store (Precise Event)
+ 0x01 extra: loads Instructions retired which contains a load (Precise Event)
+ 0x02 extra: stores Instructions retired which contains a store (Precise Event)
name:mem_load_retired type:bitmask default:0x01
- 0x01 l1d_hit Retired loads that hit the L1 data cache (Precise Event)
- 0x02 l2_hit Retired loads that hit the L2 cache (Precise Event)
- 0x04 llc_unshared_hit Retired loads that hit valid versions in the LLC cache (Precise Event)
- 0x08 other_core_l2_hit_hitm Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)
- 0x10 llc_miss Retired loads that miss the LLC cache (Precise Event)
- 0x40 hit_lfb Retired loads that miss L1D and hit an previously allocated LFB (Precise Event)
- 0x80 dtlb_miss Retired loads that miss the DTLB (Precise Event)
+ 0x01 extra: l1d_hit Retired loads that hit the L1 data cache (Precise Event)
+ 0x02 extra: l2_hit Retired loads that hit the L2 cache (Precise Event)
+ 0x04 extra: llc_unshared_hit Retired loads that hit valid versions in the LLC cache (Precise Event)
+ 0x08 extra: other_core_l2_hit_hitm Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)
+ 0x10 extra: llc_miss Retired loads that miss the LLC cache (Precise Event)
+ 0x40 extra: hit_lfb Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)
+ 0x80 extra: dtlb_miss Retired loads that miss the DTLB (Precise Event)
name:mem_uncore_retired type:bitmask default:0x02
- 0x02 local_hitm Load instructions retired that HIT modified data in sibling core (Precise Event)
- 0x04 remote_hitm Retired loads that hit remote socket in modified state (Precise Event)
- 0x08 local_dram_and_remote_cache_hit Load instructions retired local dram and remote cache HIT data sources (Precise Event)
- 0x10 remote_dram Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)
- 0x80 uncacheable Load instructions retired IO (Precise Event)
+ 0x02 extra: local_hitm Load instructions retired that HIT modified data in sibling core (Precise Event)
+ 0x04 extra: remote_hitm Retired loads that hit remote socket in modified state (Precise Event)
+ 0x08 extra: local_dram_and_remote_cache_hit Load instructions retired local dram and remote cache HIT data sources (Precise Event)
+ 0x10 extra: remote_dram Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)
+ 0x80 extra: uncacheable Load instructions retired IO (Precise Event)
name:offcore_requests type:bitmask default:0x80
- 0x01 demand_read_data Offcore demand data read requests
- 0x02 demand_read_code Offcore demand code read requests
- 0x04 demand_rfo Offcore demand RFO requests
- 0x08 any_read Offcore read requests
- 0x10 any_rfo Offcore RFO requests
- 0x40 l1d_writeback Offcore L1 data cache writebacks
- 0x80 any All offcore requests
+ 0x01 extra: demand_read_data Offcore demand data read requests
+ 0x02 extra: demand_read_code Offcore demand code read requests
+ 0x04 extra: demand_rfo Offcore demand RFO requests
+ 0x08 extra: any_read Offcore read requests
+ 0x10 extra: any_rfo Offcore RFO requests
+ 0x40 extra: l1d_writeback Offcore L1 data cache writebacks
+ 0x80 extra: any All offcore requests
name:offcore_requests_outstanding type:bitmask default:0x08
- 0x01 demand_read_data Outstanding offcore demand data reads
- 0x02 demand_read_code Outstanding offcore demand code reads
- 0x04 demand_rfo Outstanding offcore demand RFOs
- 0x08 any_read Outstanding offcore reads
+ 0x01 extra: demand_read_data Outstanding offcore demand data reads
+ 0x02 extra: demand_read_code Outstanding offcore demand code reads
+ 0x04 extra: demand_rfo Outstanding offcore demand RFOs
+ 0x08 extra: any_read Outstanding offcore reads
name:rat_stalls type:bitmask default:0x0f
- 0x01 flags Flag stall cycles
- 0x02 registers Partial register stall cycles
- 0x04 rob_read_port ROB read port stalls cycles
- 0x08 scoreboard Scoreboard stall cycles
- 0x0f any All RAT stall cycles
+ 0x01 extra: flags Flag stall cycles
+ 0x02 extra: registers Partial register stall cycles
+ 0x04 extra: rob_read_port ROB read port stall cycles
+ 0x08 extra: scoreboard Scoreboard stall cycles
+ 0x0f extra: any All RAT stall cycles
name:resource_stalls type:bitmask default:0x01
- 0x01 any Resource related stall cycles
- 0x02 load Load buffer stall cycles
- 0x04 rs_full Reservation Station full stall cycles
- 0x08 store Store buffer stall cycles
- 0x10 rob_full ROB full stall cycles
- 0x20 fpcw FPU control word write stall cycles
- 0x40 mxcsr MXCSR rename stall cycles
- 0x80 other Other Resource related stall cycles
+ 0x01 extra: any Resource related stall cycles
+ 0x02 extra: load Load buffer stall cycles
+ 0x04 extra: rs_full Reservation Station full stall cycles
+ 0x08 extra: store Store buffer stall cycles
+ 0x10 extra: rob_full ROB full stall cycles
+ 0x20 extra: fpcw FPU control word write stall cycles
+ 0x40 extra: mxcsr MXCSR rename stall cycles
+ 0x80 extra: other Other Resource related stall cycles
name:simd_int_128 type:bitmask default:0x01
- 0x01 packed_mpy 128 bit SIMD integer multiply operations
- 0x02 packed_shift 128 bit SIMD integer shift operations
- 0x04 pack 128 bit SIMD integer pack operations
- 0x08 unpack 128 bit SIMD integer unpack operations
- 0x10 packed_logical 128 bit SIMD integer logical operations
- 0x20 packed_arith 128 bit SIMD integer arithmetic operations
- 0x40 shuffle_move 128 bit SIMD integer shuffle/move operations
+ 0x01 extra: packed_mpy 128 bit SIMD integer multiply operations
+ 0x02 extra: packed_shift 128 bit SIMD integer shift operations
+ 0x04 extra: pack 128 bit SIMD integer pack operations
+ 0x08 extra: unpack 128 bit SIMD integer unpack operations
+ 0x10 extra: packed_logical 128 bit SIMD integer logical operations
+ 0x20 extra: packed_arith 128 bit SIMD integer arithmetic operations
+ 0x40 extra: shuffle_move 128 bit SIMD integer shuffle/move operations
name:simd_int_64 type:bitmask default:0x01
- 0x01 packed_mpy SIMD integer 64 bit packed multiply operations
- 0x02 packed_shift SIMD integer 64 bit shift operations
- 0x04 pack SIMD integer 64 bit pack operations
- 0x08 unpack SIMD integer 64 bit unpack operations
- 0x10 packed_logical SIMD integer 64 bit logical operations
- 0x20 packed_arith SIMD integer 64 bit arithmetic operations
- 0x40 shuffle_move SIMD integer 64 bit shuffle/move operations
+ 0x01 extra: packed_mpy SIMD integer 64 bit packed multiply operations
+ 0x02 extra: packed_shift SIMD integer 64 bit shift operations
+ 0x04 extra: pack SIMD integer 64 bit pack operations
+ 0x08 extra: unpack SIMD integer 64 bit unpack operations
+ 0x10 extra: packed_logical SIMD integer 64 bit logical operations
+ 0x20 extra: packed_arith SIMD integer 64 bit arithmetic operations
+ 0x40 extra: shuffle_move SIMD integer 64 bit shuffle/move operations
name:snoopq_requests type:bitmask default:0x01
- 0x01 data Snoop data requests
- 0x02 invalidate Snoop invalidate requests
- 0x04 code Snoop code requests
+ 0x01 extra: data Snoop data requests
+ 0x02 extra: invalidate Snoop invalidate requests
+ 0x04 extra: code Snoop code requests
name:snoopq_requests_outstanding type:bitmask default:0x01
- 0x01 data Outstanding snoop data requests
- 0x02 invalidate Outstanding snoop invalidate requests
- 0x04 code Outstanding snoop code requests
+ 0x01 extra: data Outstanding snoop data requests
+ 0x02 extra: invalidate Outstanding snoop invalidate requests
+ 0x04 extra: code Outstanding snoop code requests
name:snoop_response type:bitmask default:0x01
- 0x01 hit Thread responded HIT to snoop
- 0x02 hite Thread responded HITE to snoop
- 0x04 hitm Thread responded HITM to snoop
+ 0x01 extra: hit Thread responded HIT to snoop
+ 0x02 extra: hite Thread responded HITE to snoop
+ 0x04 extra: hitm Thread responded HITM to snoop
name:sq_misc type:bitmask default:0x04
- 0x04 lru_hints Super Queue LRU hints sent to LLC
- 0x10 split_lock Super Queue lock splits across a cache line
+ 0x04 extra: lru_hints Super Queue LRU hints sent to LLC
+ 0x10 extra: split_lock Super Queue lock splits across a cache line
name:ssex_uops_retired type:bitmask default:0x01
- 0x01 packed_single SIMD Packed-Single Uops retired (Precise Event)
- 0x02 scalar_single SIMD Scalar-Single Uops retired (Precise Event)
- 0x04 packed_double SIMD Packed-Double Uops retired (Precise Event)
- 0x08 scalar_double SIMD Scalar-Double Uops retired (Precise Event)
- 0x10 vector_integer SIMD Vector Integer Uops retired (Precise Event)
+ 0x01 extra: packed_single SIMD Packed-Single Uops retired (Precise Event)
+ 0x02 extra: scalar_single SIMD Scalar-Single Uops retired (Precise Event)
+ 0x04 extra: packed_double SIMD Packed-Double Uops retired (Precise Event)
+ 0x08 extra: scalar_double SIMD Scalar-Double Uops retired (Precise Event)
+ 0x10 extra: vector_integer SIMD Vector Integer Uops retired (Precise Event)
name:store_blocks type:bitmask default:0x04
- 0x04 at_ret Loads delayed with at-Retirement block code
- 0x08 l1d_block Cacheable loads delayed with L1D block code
+ 0x04 extra: at_ret Loads delayed with at-Retirement block code
+ 0x08 extra: l1d_block Cacheable loads delayed with L1D block code
name:uops_decoded type:bitmask default:0x01
- 0x01 stall_cycles Cycles no Uops are decoded
- 0x02 ms_cycles_active Uops decoded by Microcode Sequencer
- 0x04 esp_folding Stack pointer instructions decoded
- 0x08 esp_sync Stack pointer sync operations
+ 0x01 extra: stall_cycles Cycles no Uops are decoded
+ 0x02 extra: ms_cycles_active Uops decoded by Microcode Sequencer
+ 0x04 extra: esp_folding Stack pointer instructions decoded
+ 0x08 extra: esp_sync Stack pointer sync operations
name:uops_executed type:bitmask default:0x3f
- 0x01 port0 Uops executed on port 0
- 0x02 port1 Uops executed on port 1
- 0x04 port2_core Uops executed on port 2 (core count)
- 0x08 port3_core Uops executed on port 3 (core count)
- 0x10 port4_core Uops executed on port 4 (core count)
- 0x1f core_active_cycles_no_port5 Cycles Uops executed on ports 0-4 (core count)
- 0x20 port5 Uops executed on port 5
- 0x3f core_active_cycles Cycles Uops executed on any port (core count)
- 0x40 port015 Uops issued on ports 0, 1 or 5
- 0x80 port234_core Uops issued on ports 2, 3 or 4
+ 0x01 extra: port0 Uops executed on port 0
+ 0x02 extra: port1 Uops executed on port 1
+ 0x04 extra: port2_core Uops executed on port 2 (core count)
+ 0x08 extra: port3_core Uops executed on port 3 (core count)
+ 0x10 extra: port4_core Uops executed on port 4 (core count)
+ 0x1f extra: core_active_cycles_no_port5 Cycles Uops executed on ports 0-4 (core count)
+ 0x20 extra: port5 Uops executed on port 5
+ 0x3f extra: core_active_cycles Cycles Uops executed on any port (core count)
+ 0x40 extra: port015 Uops issued on ports 0, 1 or 5
+ 0x80 extra: port234_core Uops issued on ports 2, 3 or 4
name:uops_issued type:bitmask default:0x01
- 0x01 any Uops issued
- 0x02 fused Fused Uops issued
+ 0x01 extra: any Uops issued
+ 0x02 extra: fused Fused Uops issued
name:uops_retired type:bitmask default:0x01
- 0x01 active_cycles Cycles Uops are being retired
- 0x02 retire_slots Retirement slots used (Precise Event)
- 0x04 macro_fused Macro-fused Uops retired (Precise Event)
+ 0x01 extra: active_cycles Cycles Uops are being retired
+ 0x02 extra: retire_slots Retirement slots used (Precise Event)
+ 0x04 extra: macro_fused Macro-fused Uops retired (Precise Event)
diff --git a/events/ia64/ia64/events b/events/ia64/ia64/events
deleted file mode 100644
index 8ae41dd..0000000
--- a/events/ia64/ia64/events
+++ /dev/null
@@ -1,3 +0,0 @@
-# IA-64 events
-event:0x12 counters:0,1,2,3 um:zero minimum:500 name:CPU_CYCLES : CPU Cycles
-event:0x08 counters:0,1,2,3 um:zero minimum:500 name:IA64_INST_RETIRED : IA-64 Instructions Retired
diff --git a/events/ia64/ia64/unit_masks b/events/ia64/ia64/unit_masks
deleted file mode 100644
index 7dd854a..0000000
--- a/events/ia64/ia64/unit_masks
+++ /dev/null
@@ -1,4 +0,0 @@
-# IA-64 possible unit masks
-#
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
diff --git a/events/ia64/itanium/events b/events/ia64/itanium/events
deleted file mode 100644
index b0ce10f..0000000
--- a/events/ia64/itanium/events
+++ /dev/null
@@ -1,5 +0,0 @@
-# IA-64 Itanium 1 events
-event:0x12 counters:0,1,2,3 um:zero minimum:500 name:CPU_CYCLES : CPU Cycles
-event:0x08 counters:0,1 um:zero minimum:500 name:IA64_INST_RETIRED : IA-64 Instructions Retired
-event:0x15 counters:0,1,2,3 um:zero minimum:500 name:IA32_INST_RETIRED : IA-32 Instructions Retired
-# FIXME: itanium doc describe a lot of other events, should we add them w/o any testing ?
diff --git a/events/ia64/itanium/unit_masks b/events/ia64/itanium/unit_masks
deleted file mode 100644
index 6a9f77b..0000000
--- a/events/ia64/itanium/unit_masks
+++ /dev/null
@@ -1,4 +0,0 @@
-# IA-64 Itanium 1 possible unit masks
-#
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
diff --git a/events/ia64/itanium2/events b/events/ia64/itanium2/events
deleted file mode 100644
index c979022..0000000
--- a/events/ia64/itanium2/events
+++ /dev/null
@@ -1,267 +0,0 @@
-# IA-64 Itanium 2 events
-
-# IA64_2 Basic Events, Table 11-1
-event:0x12 counters:0,1,2,3 um:zero minimum:500 name:CPU_CYCLES : CPU Cycles
-event:0x08 counters:0,1,2,3 um:zero minimum:500 name:IA64_INST_RETIRED : IA-64 Instructions Retired
-event:0x59 counters:0,1,2,3 um:zero minimum:5000 name:IA32_INST_RETIRED : IA-32 Instructions Retired
-event:0x07 counters:0,1,2,3 um:zero minimum:500 name:IA32_ISA_TRANSITIONS : Itanium to/from IA-32 ISA Transitions
-
-# IA64_2 Instruction Disperal Events, Table 11-3
-event:0x49 counters:0,1,2,3 um:zero minimum:5000 name:DISP_STALLED : Number of cycles dispersal stalled
-event:0x4d counters:0,1,2,3 um:zero minimum:5000 name:INST_DISPERSED : Syllables Dispersed from REN to REG stage
-event:0x4e counters:0,1,2,3 um:syll_not_dispersed minimum:5000 name:SYLL_NOT_DISPERSED : Syllables not dispersed
-event:0x4f counters:0,1,2,3 um:syll_overcount minimum:5000 name:SYLL_OVERCOUNT : Syllables overcounted
-
-# IA64_2 Instruction Execution Events, Table 11-4
-event:0x58 counters:0,1,2,3 um:alat_capacity_miss minimum:5000 name:ALAT_CAPACITY_MISS : ALAT Entry Replaced
-event:0x06 counters:0,1,2,3 um:zero minimum:5000 name:FP_FAILED_FCHKF : Failed fchkf
-event:0x05 counters:0,1,2,3 um:zero minimum:5000 name:FP_FALSE_SIRSTALL : SIR stall without a trap
-event:0x0b counters:0,1,2,3 um:zero minimum:5000 name:FP_FLUSH_TO_ZERO : Result Flushed to Zero
-event:0x09 counters:0,1,2,3 um:zero minimum:5000 name:FP_OPS_RETIRED : Retired FP operations
-event:0x03 counters:0,1,2,3 um:zero minimum:5000 name:FP_TRUE_SIRSTALL : SIR stall asserted and leads to a trap
-event:0x08 counters:0,1,2,3 um:tagged_inst_retired minimum:5000 name:IA64_TAGGED_INST_RETIRED : Retired Tagged Instructions
-event:0x56 counters:0,1,2,3 um:alat_capacity_miss minimum:5000 name:INST_CHKA_LDC_ALAT : Advanced Check Loads
-event:0x57 counters:0,1,2,3 um:alat_capacity_miss minimum:5000 name:INST_FAILED_CHKA_LDC_ALAT : Failed Advanced Check Loads
-event:0x55 counters:0,1,2,3 um:alat_capacity_miss minimum:5000 name:INST_FAILED_CHKS_RETIRED : Failed Speculative Check Loads
-# To avoid duplication from other tables the following events commented out
-#event:0xcd counters:0,1,2,3 um:zero minimum:5000 name:LOADS_RETIRED : Retired Loads
-#event:0xce counters:0,1,2,3 um:zero minimum:5000 name:MISALIGNED_LOADS_RETIRED : Retired Misaligned Load Instructions
-#event:0xcf counters:0,1,2,3 um:zero minimum:5000 name:UC_LOADS_RETIRED : Retired Uncacheable Loads
-#event:0xd1 counters:0,1,2,3 um:zero minimum:5000 name:STORES_RETIRED : Retired Stores
-#event:0xd2 counters:0,1,2,3 um:zero minimum:5000 name:MISALIGNED_STORES_RETIRED : Retired Misaligned Store Instructions
-#event:0xd0 counters:0,1,2,3 um:zero minimum:5000 name:UC_STORES_RETIRED : Retired Uncacheable Stores
-event:0x50 counters:0,1,2,3 um:zero minimum:5000 name:NOPS_RETIRED : Retired NOP Instructions
-event:0x51 counters:0,1,2,3 um:zero minimum:5000 name:PREDICATE_SQUASHED_RETIRED : Instructions Squashed Due to Predicate Off`
-
-# IA64_2 Stall Events, Table 11-6
-event:0x00 counters:0,1,2,3 um:back_end_bubble minimum:5000 name:BACK_END_BUBBLE : Full pipe bubbles in main pipe
-event:0x02 counters:0,1,2,3 um:be_exe_bubble minimum:5000 name:BE_EXE_BUBBLE : Full pipe bubbles in main pipe due to Execution unit stalls
-event:0x04 counters:0,1,2,3 um:be_flush_bubble minimum:5000 name:BE_FLUSH_BUBBLE : Full pipe bubbles in main pipe due to flushes
-event:0xca counters:0,1,2,3 um:be_l1d_fpu_bubble minimum:5000 name:BE_L1D_FPU_BUBBLE : Full pipe bubbles in main pipe due to FP or L1 dcache
-# To avoid duplication from other tables the following events commented out
-#event:0x72 counters:0,1,2,3 um:be_lost_bw_due_to_fe minimum:5000 name:BE_LOST_BW_DUE_TO_FE : Invalid bundles if BE not stalled for other reasons
-event:0x01 counters:0,1,2,3 um:be_rse_bubble minimum:5000 name:BE_RSE_BUBBLE : Full pipe bubbles in main pipe due to RSE stalls
-event:0x71 counters:0,1,2,3 um:fe_bubble minimum:5000 name:FE_BUBBLE : Bubbles seen by FE
-event:0x70 counters:0,1,2,3 um:fe_lost minimum:5000 name:FE_LOST_BW : Invalid bundles at the entrance to IB
-event:0x73 counters:0,1,2,3 um:fe_lost minimum:5000 name:IDEAL_BE_LOST_BW_DUE_TO_FE : Invalid bundles at the exit from IB
-
-# IA64_2 Branch Events, Table 11-7
-event:0x61 counters:0,1,2,3 um:be_br_mispredict_detail minimum:5000 name:BE_BR_MISPRED_DETAIL : BE branch misprediction detail
-event:0x11 counters:0,1,2,3 um:zero minimum:5000 name:BRANCH_EVENT : Branch Event Captured
-event:0x5b counters:0,1,2,3 um:br_mispred_detail minimum:5000 name:BR_MISPRED_DETAIL : Branch Mispredict Detail
-event:0x68 counters:0,1,2,3 um:br_mispredict_detail2 minimum:5000 name:BR_MISPRED_DETAIL2 : FE Branch Mispredict Detail (Unknown path component)
-event:0x54 counters:0,1,2,3 um:br_path_pred minimum:5000 name:BR_PATH_PRED : FE Branch Path Prediction Detail
-event:0x6a counters:0,1,2,3 um:br_path_pred2 minimum:5000 name:BR_PATH_PRED2 : FE Branch Path Prediction Detail (Unknown prediction component)
-event:0x63 counters:0,1,2,3 um:encbr_mispred_detail minimum:5000 name:ENCBR_MISPRED_DETAIL : Number of encoded branches retired
-
-# IA64_2 L1 Instruction Cache and Prefetch Events, Table 11-8
-event:0x46 counters:0,1,2,3 um:zero minimum:5000 name:ISB_BUNPAIRS_IN : Bundle pairs written from L2 into FE
-event:0x43 counters:0,1,2,3 um:zero minimum:5000 name:L1I_EAR_EVENTS : Instruction EAR Events
-event:0x66 counters:0,1,2,3 um:zero minimum:5000 name:L1I_FETCH_ISB_HIT : "\"Just-in-time\" instruction fetch hitting in and being bypassed from ISB
-event:0x65 counters:0,1,2,3 um:zero minimum:5000 name:L1I_FETCH_RAB_HIT : Instruction fetch hitting in RAB
-event:0x41 counters:0,1,2,3 um:zero minimum:5000 name:L1I_FILLS : L1 Instruction Cache Fills
-event:0x44 counters:0,1,2,3 um:zero minimum:5000 name:L1I_PREFETCHES : Instruction Prefetch Requests
-event:0x42 counters:0,1,2,3 um:zero minimum:5000 name:L2_INST_DEMAND_READS : L1 Instruction Cache and ISB Misses
-event:0x67 counters:0,1,2,3 um:l1i_prefetch_stall minimum:5000 name:L1I_PREFETCH_STALL : Why prefetch pipeline is stalled?
-event:0x4b counters:0,1,2,3 um:zero minimum:5000 name:L1I_PURGE : L1ITLB purges handled by L1I
-event:0x69 counters:0,1,2,3 um:zero minimum:5000 name:L1I_PVAB_OVERFLOW : PVAB overflow
-event:0x64 counters:0,1,2,3 um:zero minimum:5000 name:L1I_RAB_ALMOST_FULL : Is RAB almost full?
-event:0x60 counters:0,1,2,3 um:zero minimum:500 name:L1I_RAB_FULL : Is RAB full?
-event:0x40 counters:0,1,2,3 um:zero minimum:5000 name:L1I_READS : L1 Instruction Cache Read
-event:0x4a counters:0,1,2,3 um:zero minimum:5000 name:L1I_SNOOP : Snoop requests handled by L1I
-event:0x5f counters:0,1,2,3 um:zero minimum:5000 name:L1I_STRM_PREFETCHES : L1 Instruction Cache line prefetch requests
-event:0x45 counters:0,1,2,3 um:zero minimum:5000 name:L2_INST_PREFETCHES : Instruction Prefetch Requests
-
-# IA64_2 L1 Data Cache Events, Table 11-10
-event:0xc8 counters:0,1,2,3 um:zero minimum:5000 name:DATA_EAR_EVENTS : Data Cache EAR Events
-# To avoid duplication from other tables the following events commented out
-#event:0xc2 counters:0,1,2,3 um:zero minimum:5000 name:L1D_READS_SET0 : L1 Data Cache Reads
-#event:0xc3 counters:0,1,2,3 um:zero minimum:5000 name:DATA_REFERENCES_SET0 : Data memory references issued to memory pipeline
-#event:0xc4 counters:0,1,2,3 um:zero minimum:5000 name:L1D_READS_SET1 : L1 Data Cache Reads
-#event:0xc5 counters:0,1,2,3 um:zero minimum:5000 name:DATA_REFERENCES_SET1 : Data memory references issued to memory pipeline
-#event:0xc7 counters:0,1,2,3 um:l1d_read_misses minimum:5000 name:L1D_READ_MISSES : L1 Data Cache Read Misses
-
-# IA64_2 L1 Data Cache Set 0 Events, Table 11-11
-event:0xc0 counters:1 um:zero minimum:5000 name:L1DTLB_TRANSFER : L1DTLB misses that hit in the L2DTLB for accesses counted in L1D_READS
-event:0xc1 counters:1 um:zero minimum:5000 name:L2DTLB_MISSES : L2DTLB Misses
-event:0xc2 counters:1 um:zero minimum:5000 name:L1D_READS_SET0 : L1 Data Cache Reads
-event:0xc3 counters:1 um:zero minimum:5000 name:DATA_REFERENCES_SET0 : Data memory references issued to memory pipeline
-
-# IA64_2 L1 Data Cache Set 1 Events, Table 11-12
-event:0xc4 counters:1 um:zero minimum:5000 name:L1D_READS_SET1 : L1 Data Cache Reads
-event:0xc5 counters:1 um:zero minimum:5000 name:DATA_REFERENCES_SET1 : Data memory references issued to memory pipeline
-event:0xc7 counters:1 um:l1d_read_misses minimum:5000 name:L1D_READ_MISSES : L1 Data Cache Read Misses
-
-# IA64_2 L1 Data Cache Set 2 Events, Table 11-13
-event:0xca counters:1 um:be_l1d_fpu_bubble minimum:5000 name:BE_L1D_FPU_BUBBLE : Full pipe bubbles in main pipe due to FP or L1 dcache
-
-# IA64_2 L1 Data Cache Set 3 Events, Table 11-14
-event:0xcd counters:1 um:zero minimum:5000 name:LOADS_RETIRED : Retired Loads
-event:0xce counters:1 um:zero minimum:5000 name:MISALIGNED_LOADS_RETIRED : Retired Misaligned Load Instructions
-event:0xcf counters:1 um:zero minimum:5000 name:UC_LOADS_RETIRED : Retired Uncacheable Loads
-
-# IA64_2 L1 Data Cache Set 4 Events, Table 11-15
-event:0xd1 counters:1 um:zero minimum:5000 name:STORES_RETIRED : Retired Stores
-event:0xd2 counters:1 um:zero minimum:5000 name:MISALIGNED_STORES_RETIRED : Retired Misaligned Store Instructions
-event:0xd0 counters:1 um:zero minimum:5000 name:UC_STORES_RETIRED : Retired Uncacheable Stores
-
-# IA64_2 L2 Unified Cache Events, Table 11-16
-# To avoid duplication from other tables the following events commented out
-#event:0xb9 counters:0,1,2,3 um:zero minimum:5000 name:L2_BAD_LINES_SELECTED : Valid line replaced when invalid line is available
-#event:0xb8 counters:0,1,2,3 um:l2_bypass minimum:5000 name:L2_BYPASS : Count bypass
-#event:0xb2 counters:0,1,2,3 um:l2_data_references minimum:5000 name:L2_DATA_REFERENCES : Data read/write access to L2
-event:0xbf counters:0,1,2,3 um:zero minimum:5000 name:L2_FILLB_FULL : L2D Fill buffer is full
-#event:0xb4 counters:0,1,2,3 um:l2_force_recirc minimum:5000 name:L2_FORCE_RECIRC : Forced recirculates
-event:0xba counters:0,1,2,3 um:recirc_ifetch minimum:5000 name:L2_GOT_RECIRC_IFETCH : Instruction fetch recirculates received by L2D
-#event:0xb6 counters:0,1,2,3 um:zero minimum:5000 name:L2_GOT_RECIRC_OZQ_ACC : Counts number of OZQ accesses recirculated back to L1D
-#event:0xa1 counters:0,1,2,3 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-#event:0xa5 counters:0,1,2,3 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-#event:0xa9 counters:0,1,2,3 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-#event:0xad counters:0,1,2,3 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-event:0xb9 counters:0,1,2,3 um:recirc_ifetch minimum:5000 name:L2_ISSUED_RECIRC_IFETCH : Instruction fetch recirculates issued by L2D
-#event:0xb5 counters:0,1,2,3 um:zero minimum:5000 name:L2_ISSUED_RECIRC_OZQ_ACC : Count number of times a recirculate issue was attempted and not preempted
-#event:0xb0 counters:0,1,2,3 um:l2_l3_access_cancel minimum:5000 name:L2_L3ACCESS_CANCEL : Canceled L3 accesses
-event:0xcb counters:0,1,2,3 um:zero minimum:5000 name:L2_MISSES : L2 Misses
-event:0xb8 counters:0,1,2,3 um:l2_ops_issued minimum:5000 name:L2_OPS_ISSUED : Different operations issued by L2D
-#event:0xbd counters:0,1,2,3 um:zero minimum:5000 name:L2_OZDB_FULL : L2D OZQ is full
-#event:0xa2 counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-#event:0xa6 counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-#event:0xaa counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-#event:0xae counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-#event:0xa0 counters:0,1,2,3 um:l2_ozq_cancels0 minimum:5000 name:L2_OZQ_CANCELS0 : L2 OZQ cancels
-#event:0xac counters:0,1,2,3 um:l2_ozq_cancels1 minimum:5000 name:L2_OZQ_CANCELS1 : L2 OZQ cancels
-#event:0xa8 counters:0,1,2,3 um:l2_ozq_cancels2 minimum:5000 name:L2_OZQ_CANCELS2 : L2 OZQ cancels
-#event:0xbc counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_FULL : L2D OZQ is full
-#event:0xa3 counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-#event:0xa7 counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-#event:0xab counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-#event:0xaf counters:0,1,2,3 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-#event:0xb1 counters:0,1,2,3 um:zero minimum:5000 name:L2_REFERENCES : Requests made from L2
-#event:0xba counters:0,1,2,3 um:zero minimum:5000 name:L2_STORE_HIT_SHARED : Store hit a shared line
-#event:0xb7 counters:0,1,2,3 um:zero minimum:5000 name:L2_SYNTH_PROBE : Synthesized Probe
-#event:0xbe counters:0,1,2,3 um:zero minimum:5000 name:L2_VICTIMB_FULL : L2D victim buffer is full
-
-# IA64_2 L2 Cache Events Set 0, Table 11-18
-# FIXME all sorts of restrictions on how these can be combined
-event:0xa1 counters:0 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-event:0xa5 counters:0 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-event:0xa9 counters:0 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-event:0xad counters:0 um:l2_ifet_cancels minimum:5000 name:L2_IFET_CANCELS : Instruction fetch cancels by the L2.
-event:0xa2 counters:0 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-event:0xa6 counters:0 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-event:0xaa counters:0 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-event:0xae counters:0 um:zero minimum:5000 name:L2_OZQ_ACQUIRE : Clocks with acquire ordering attribute existed in L2 OZQ
-event:0xa0 counters:0 um:l2_ozq_cancels0 minimum:5000 name:L2_OZQ_CANCELS0 : L2 OZQ cancels
-event:0xac counters:0 um:l2_ozq_cancels1 minimum:5000 name:L2_OZQ_CANCELS1 : L2 OZQ cancels
-event:0xa8 counters:0 um:l2_ozq_cancels2 minimum:5000 name:L2_OZQ_CANCELS2 : L2 OZQ cancels
-event:0xa3 counters:0 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-event:0xa7 counters:0 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-event:0xab counters:0 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-event:0xaf counters:0 um:zero minimum:5000 name:L2_OZQ_RELEASE : Clocks with release ordering attribute existed in L2 OZQ
-
-# IA64_2 L2 Cache Events Set 1, Table 11-19
-# manual states that L2_L3ACCESS_CANCEL must be measured in PMD4.
-# FIXME Don't have any way of enforcing the constraints
-# so only l2_l3_access_cancel allowed.
-event:0xb0 counters:0 um:l2_l3_access_cancel minimum:5000 name:L2_L3ACCESS_CANCEL : Canceled L3 accesses
-#event:0xb2 counters:0,1,2,3 um:l2_data_references minimum:5000 name:L2_DATA_REFERENCES : Data read/write access to L2
-#event:0xb1 counters:0,1,2,3 um:zero minimum:5000 name:L2_REFERENCES : Requests made from L2
-
-# IA64_2 L2 Cache Events Set 2, Table 11-20
-# manual states that L2_FORCE_RECIRC must be measured in PMD4.
-# FIXME Don't have anyway of enforcing thes constraint
-# so only L2_FORCE_RECIRC allowed.
-event:0xb4 counters:0 um:l2_force_recirc minimum:5000 name:L2_FORCE_RECIRC : Forced recirculates
-#event:0xb5 counters:0,1,2,3 um:zero minimum:5000 name:L2_ISSUED_RECIRC_OZQ_ACC : Count number of times a recirculate issue was attempted and not preempted
-#event:0xb6 counters:0,1,2,3 um:zero minimum:5000 name:L2_GOT_RECIRC_OZQ_ACC : Counts number of OZQ accesses recirculated back to L1D
-#event:0xb7 counters:0,1,2,3 um:zero minimum:5000 name:L2_SYNTH_PROBE : Synthesized Probe
-
-# IA64_2 L2 Cache Events Set 3, Table 11-21
-# The manual states that all events in this set share the same umask.
-event:0xb9 counters:0 um:zero minimum:5000 name:L2_BAD_LINES_SELECTED : Valid line replaced when invalid line is available
-event:0xb8 counters:0 um:l2_bypass minimum:5000 name:L2_BYPASS : Count bypass
-event:0xba counters:0 um:zero minimum:5000 name:L2_STORE_HIT_SHARED : Store hit a shared line
-
-# IA64_2 L2 Cache Events Set 4, Table 11-22
-# The manual states one of the following needs to be in pmd4 and these events
-# share the same umask.
-event:0xba counters:0 um:recirc_ifetch minimum:5000 name:L2_GOT_RECIRC_IFETCH : Instruction fetch recirculates received by L2D
-event:0xb9 counters:0 um:recirc_ifetch minimum:5000 name:L2_ISSUED_RECIRC_IFETCH : Instruction fetch recirculates issued by L2D
-event:0xb8 counters:0 um:l2_ops_issued minimum:5000 name:L2_OPS_ISSUED : Different operations issued by L2D
-
-# IA64_2 L2 Cache Events Set 5, Table 11-23
-# manual states one of the following needs to be in pmd4 and
-# these events share the same umask
-event:0xbc counters:0 um:zero minimum:5000 name:L2_OZQ_FULL : L2D OZQ is full
-event:0xbd counters:0 um:zero minimum:5000 name:L2_OZDB_FULL : L2D OZQ is full
-event:0xbe counters:0 um:zero minimum:5000 name:L2_VICTIMB_FULL : L2D victim buffer is full
-event:0xbf counters:0 um:zero minimum:5000 name:L2_FILLB_FULL : L2D Fill buffer is full
-
-# IA64_2 L3 Cache Events, Table 11-24
-event:0xdf counters:0,1,2,3 um:zero minimum:5000 name:L3_LINES_REPLACED : Cache Lines Replaced
-event:0xdc counters:0,1,2,3 um:zero minimum:5000 name:L3_MISSES : L3 Misses
-event:0xdb counters:0,1,2,3 um:zero minimum:5000 name:L3_REFERENCES : L3 References
-event:0xdd counters:0,1,2,3 um:l3_reads minimum:5000 name:L3_READS : L3 Reads
-event:0xde counters:0,1,2,3 um:l3_writes minimum:5000 name:L3_WRITES : L3 Writes
-
-# IA64_2 System Events, Table 11-26
-event:0x13 counters:0,1,2,3 um:zero minimum:5000 name:CPU_CPL_CHANGES : Privilege Level Changes
-event:0x52 counters:0,1,2,3 um:zero minimum:5000 name:DATA_DEBUG_REGISTER_FAULT : Fault due to data debug reg. Match to load/store instruction
-event:0xc6 counters:0,1,2,3 um:zero minimum:5000 name:DATA_DEBUG_REGISTER_MATCHES : Data debug register matches data address of memory reference
-event:0x9e counters:0,1,2,3 um:extern_dp_pins_0_to_3 minimum:5000 name:EXTERN_DP_PINS_0_TO_3 : DP pins 0-3 asserted
-event:0x9f counters:0,1,2,3 um:extern_dp_pins_4_to_5 minimum:5000 name:EXTERN_DP_PINS_4_TO_5 : DP pins 4-5 asserted
-event:0x53 counters:0,1,2,3 um:zero minimum:5000 name:SERIALIZATION_EVENTS : Number of srlz.I instructions
-
-# IA64_2 TLB Events, Table 11-28
-event:0xc9 counters:0,1,2,3 um:zero minimum:5000 name:DTLB_INSERTS_HPW : Hardware Page Walker Installs to DTLB"
-event:0x2c counters:0,1,2,3 um:zero minimum:500 name:DTLB_INSERTS_HPW_RETIRED : VHPT entries inserted into DTLB by HW PW
-event:0x2d counters:0,1,2,3 um:zero minimum:500 name:HPW_DATA_REFERENCES : Data memory references to VHPT
-#event:0xc1 counters:1 um:zero minimum:5000 name:L2DTLB_MISSES : L2DTLB Misses
-event:0x48 counters:0,1,2,3 um:zero minimum:5000 name:L1ITLB_INSERTS_HPW : L1ITLB Hardware Page Walker Inserts
-event:0x47 counters:0,1,2,3 um:itlb_misses_fetch minimum:5000 name:ITLB_MISSES_FETCH : ITLB Misses Demand Fetch
-#event:0xc0 counters:1 um:zero minimum:5000 name:L1DTLB_TRANSFER : L1DTLB misses that hit in the L2DTLB for accesses counted in L1D_READS
-
-# IA64_2 System Bus Events, Table 11-30
-event:0x87 counters:0,1,2,3 um:bus minimum:5000 name:BUS_ALL : Bus Transactions
-event:0x9c counters:0,1,2,3 um:zero minimum:5000 name:BUS_BRQ_LIVE_REQ_HI : BRQ Live Requests (two most-significant-bit of the 5-bit outstanding BRQ request count)
-event:0x9b counters:0,1,2,3 um:zero minimum:5000 name:BUS_BRQ_LIVE_REQ_LO : BRQ Live Requests (three least-significant-bit of the 5-bit outstanding BRQ request count
-event:0x9d counters:0,1,2,3 um:zero minimum:5000 name:BUS_BRQ_REQ_INSERTED : BRQ Requests Inserted
-event:0x88 counters:0,1,2,3 um:zero minimum:5000 name:BUS_DATA_CYCLE : Valid data cycle on the Bus
-event:0x84 counters:0,1,2,3 um:zero minimum:5000 name:BUS_HITM : Bus Hit Modified Line Transactions
-event:0x90 counters:0,1,2,3 um:bus minimum:5000 name:BUS_IO : IA-32 Compatible IO Bus Transactions
-event:0x98 counters:0,1,2,3 um:zero minimum:5000 name:BUS_IOQ_LIVE_REQ_HI : Inorder Bus Queue Requests (two most-significant-bit of the 4-bit outstanding IOQ request count)
-event:0x97 counters:0,1,2,3 um:zero minimum:5000 name:BUS_IOQ_LIVE_REQ_LO : Inorder Bus Queue Requests (two least-significant-bit of the 4-bit outstanding IOQ request count)
-event:0x93 counters:0,1,2,3 um:bus_lock minimum:5000 name:BUS_LOCK : IA-32 Compatible Bus Lock Transactions
-event:0x8e counters:0,1,2,3 um:bus_backsnp_req minimum:5000 name:BUS_BACKSNP_REQ : Bus Back Snoop Requests
-event:0x8a counters:0,1,2,3 um:bus_memory minimum:5000 name:BUS_MEMORY : Bus Memory Transactions
-event:0x8b counters:0,1,2,3 um:bus_mem_read minimum:5000 name:BUS_MEM_READ : Full Cache line D/I memory RD, RD invalidate, and BRIL
-event:0x94 counters:0,1,2,3 um:zero minimum:5000 name:BUS_MEM_READ_OUT_HI : Outstanding memory RD transactions
-event:0x95 counters:0,1,2,3 um:zero minimum:5000 name:BUS_MEM_READ_OUT_LO : Outstanding memory RD transactions
-event:0x9a counters:0,1,2,3 um:zero minimum:5000 name:BUS_OOQ_LIVE_REQ_HI : Out-of-order Bus Queue Requests (two most-significant-bit of the 4-bit outstanding OOQ request count)
-event:0x99 counters:0,1,2,3 um:zero minimum:5000 name:BUS_OOQ_LIVE_REQ_LO : Out-of-order Bus Queue Requests (three least-significant-bit of the 4-bit outstanding OOQ request count)
-event:0x8c counters:0,1,2,3 um:bus minimum:5000 name:BUS_RD_DATA : Bus Read Data Transactions
-event:0x80 counters:0,1,2,3 um:zero minimum:5000 name:BUS_RD_HIT : Bus Read Hit Clean Non-local Cache Transactions
-event:0x81 counters:0,1,2,3 um:zero minimum:5000 name:BUS_RD_HITM : Bus Read Hit Modified Non-local Cache Transactions
-event:0x83 counters:0,1,2,3 um:zero minimum:5000 name:BUS_RD_INVAL_ALL_HITM : Bus BIL or BRIL Transaction Results in HITM
-event:0x82 counters:0,1,2,3 um:zero minimum:5000 name:BUS_RD_INVAL_HITM : Bus BIL Transaction Results in HITM
-event:0x91 counters:0,1,2,3 um:bus minimum:5000 name:BUS_RD_IO : IA-32 Compatible IO Read Transactions
-event:0x8d counters:0,1,2,3 um:bus minimum:5000 name:BUS_RD_PRTL : Bus Read Partial Transactions
-event:0x96 counters:0,1,2,3 um:zero minimum:5000 name:BUS_SNOOPQ_REQ : Bus Snoop Queue Requests
-event:0x86 counters:0,1,2,3 um:bus minimum:5000 name:BUS_SNOOPS : Bus Snoops Total
-event:0x85 counters:0,1,2,3 um:bus_snoop minimum:5000 name:BUS_SNOOPS_HITM : Bus Snoops HIT Modified Cache Line
-event:0x8f counters:0,1,2,3 um:bus_snoop minimum:5000 name:BUS_SNOOP_STALL_CYCLES : Bus Snoop Stall Cycles (from any agent)
-event:0x92 counters:0,1,2,3 um:bus_wr_wb minimum:5000 name:BUS_WR_WB : Bus Write Back Transactions
-event:0x89 counters:0,1,2,3 um:mem_read_current minimum:5000 name:MEM_READ_CURRENT : Current Mem Read Transactions On Bus
-
-# RSE Events, Table 11-34
-event:0x2b counters:0,1,2,3 um:zero minimum:500 name:RSE_CURRENT_REGS_2_TO_0 : Current RSE registers
-event:0x2a counters:0,1,2,3 um:zero minimum:500 name:RSE_CURRENT_REGS_5_TO_3 : Current RSE registers
-event:0x26 counters:0,1,2,3 um:zero minimum:500 name:RSE_CURRENT_REGS_6 : Current RSE registers
-event:0x29 counters:0,1,2,3 um:zero minimum:500 name:RSE_DIRTY_REGS_2_TO_0 : Dirty RSE registers
-event:0x28 counters:0,1,2,3 um:zero minimum:500 name:RSE_DIRTY_REGS_5_TO_3 : Dirty RSE registers
-event:0x24 counters:0,1,2,3 um:zero minimum:500 name:RSE_DIRTY_REGS_6 : Dirty RSE registers
-event:0x32 counters:0,1,2,3 um:zero minimum:500 name:RSE_EVENT_RETIRED : Retired RSE operations
-event:0x20 counters:0,1,2,3 um:rse_references_retired minimum:500 name:RSE_REFERENCES_RETIRED : RSE Accesses
-
-# IA64 Performance Monitors Ordered by Code, Table 11-36
-event:0xbb counters:0,1,2,3 um:zero minimum:5000 name:TAGGED_L2_DATA_RETURN_POR : Tagged L2 Data Return Ports 0/1
diff --git a/events/ia64/itanium2/unit_masks b/events/ia64/itanium2/unit_masks
deleted file mode 100644
index bc74f5d..0000000
--- a/events/ia64/itanium2/unit_masks
+++ /dev/null
@@ -1,465 +0,0 @@
-# IA-64 Itanium 2 possible unit masks
-#
-# The information for the following entries for the Itanium 2
-# came from Intel Itanium 2 Processor Reference Manual For
-# Software Development and Optimization, June 2002, Document
-# number 251110-001.
-
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
-
-# CPU_IA64_2 Table 11-37, 11-72
-name:alat_capacity_miss type:bitmask default:0x03
- 0x1 INT
- 0x2 FP
- 0x3 ALL
-
-# CPU_IA64_2 Table 11-38
-name:back_end_bubble type:exclusive default:0x00
- 0x0 ALL
- 0x1 FE
- 0x2 L1D_FPU_RSE
-
-# CPU_IA64_2 Table 11-39
-name:be_br_mispredict_detail type:exclusive default:0x00
- 0x0 ANY
- 0x1 STG
- 0x2 ROT
- 0x3 PFS
-
-# CPU_IA64_2 Table 11-40
-name:be_exe_bubble type:exclusive default:0x00
- 0x0 ALL
- 0x1 GRALL
- 0x2 FRALL
- 0x3 PR
- 0x4 ARCR
- 0x5 GRCR
- 0x6 CANCEL
- 0x7 BANK_SWITCH
- 0x8 ARCR_PR_CANCEL_BANK
-
-# CPU_IA64_2 Table 11-41
-name:be_flush_bubble type:exclusive default:0x00
- 0x0 ALL
- 0x1 BRU
- 0x2 XPN
-
-# CPU_IA64_2 Table 11-42
-name:be_l1d_fpu_bubble type:exclusive default:0x00
- 0x0 ALL
- 0x1 FPU
- 0x2 L1D
- 0x3 L1D_FULLSTBUF
- 0x4 L1D_DCURECIR
- 0x5 L1D_HPW
- 0x7 L1D_FILLCONF
- 0x8 L1D_DCS
- 0x9 L1D_L2BPRESS
- 0xa L1D_TLB
- 0xb L1D_LDCONF
- 0xc L1D_LDCHK
- 0xd L1D_NAT
- 0xe L1D_STBUFRECIR
- 0xf L1D_NATCONF
-
-# CPU_IA64_2 Table 11-43
-# FIXME: events using this is commented out in events
-#name:be_lost_bw_due_to_fe type:exclusive default:0x00
-# 0x0 ALL
-# 0x1 FEFLUSH
-# 0x4 UNREACHED
-# 0x5 IBFULL
-# 0x6 IMISS
-# 0x7 TLBMISS
-# 0x8 FILL_RECIRC
-# 0x9 BI
-# 0xa BRQ
-# 0xb PLP
-# 0xc BR_ILOCK
-# 0xd BUBBLE
-
-# CPU_IA64_2 Table 11-44
-name:be_rse_bubble type:exclusive default:0x00
- 0x0 ALL
- 0x1 BANK_SWITCH
- 0x2 AR_DEP
- 0x3 OVERFLOW
- 0x4 UNDERFLOW
- 0x5 LOADRS
-
-# CPU_IA64_2 Table 11-45
-name:br_mispred_detail type:exclusive default:0x00
- 0x0 ALL.ALL_PRED
- 0x1 ALL.CORRECT_PRED
- 0x2 ALL.WRONG_PATH
- 0x3 ALL.WRONG_TARGET
- 0x4 IPREL.ALL_PRED
- 0x5 IPREL.CORRECT_PRED
- 0x6 IPREL.WRONG_PATH
- 0x7 IPREL.WRONG_TARGET
- 0x8 RETURN.ALL_PRED
- 0x9 RETURN.CORRECT_PRED
- 0xa RETURN.WRONG_PATH
- 0xb RETURN.WRONG_TARGET
- 0xc NRETIND.ALL_PRED
- 0xd NRETIND.CORRECT_PRED
- 0xe NRETIND.WRONG_PATH
- 0xf NRETIND.WRONG_TARGET
-
-# CPU_IA64_2 Table 11-46
-name:br_mispredict_detail2 type:exclusive default:0x00
- 0x0 ALL.ALL_UNKNOWN_PRED
- 0x1 ALL.UKNOWN_PATH_CORRECT_PRED
- 0x2 ALL.UKNOWN_PATH_WRONG_PATH
- 0x4 IPREL.ALL_UNKNOWN_PRED
- 0x5 IPREL.UNKNOWN_PATH_CORRECT_PRED
- 0x6 IPREL.UNKNOWN_PATH_WRONG_PATH
- 0x8 RETURN.ALL_UNKNOWN_PRED
- 0x9 RETURN.UNKNOWN_PATH_CORRECT_PRED
- 0xa RETURN.UNKNOWN_PATH_WRONG_PATH
- 0xc NRETIND.ALL_UNKNOWN_PRED
- 0xd NRETIND.UNKNOWN_PATH_CORRECT_PRED
- 0xe NRETIND.UNKNOWN_PATH_WRONG_PATH
-
-# CPU_IA64_2 Table 11-47
-name:br_path_pred type:exclusive default:0x00
- 0x0 ALL.MISPRED_NOTTAKEN
- 0x1 ALL.MISPRED_TAKEN
- 0x2 ALL.OKPRED_NOTTAKEN
- 0x3 ALL.OKPRED_TAKEN
- 0x4 IPREL.MISPRED_NOTTAKEN
- 0x5 IPREL.MISPRED_TAKEN
- 0x6 IPREL.OKPRED_NOTTAKEN
- 0x7 IPREL.OKPRED_TAKEN
- 0x8 RETURN.MISPRED_NOTTAKEN
- 0x9 RETURN.MISPRED_TAKEN
- 0xa RETURN.OKPRED_NOTTAKEN
- 0xb RETURN.OKPRED_TAKEN
- 0xc NRETIND.MISPRED_NOTTAKEN
- 0xd NRETIND.MISPRED_TAKEN
- 0xe NRETIND.OKPRED_NOTTAKEN
- 0xf NRETIND.OKPRED_TAKEN
-
-# CPU_IA64_2 Table 11-48
-name:br_path_pred2 type:exclusive default:0x00
- 0x0 ALL.UNKNOWNPRED_NOTTAKEN
- 0x1 ALL.UNKNOWNPRED_TAKEN
- 0x4 IPREL.UNKNOWNPRED_NOTTAKEN
- 0x5 IPREL.UNKNOWNPRED__TAKEN
- 0x8 RETURN.UNKNOWNPRED_NOTTAKEN
- 0x9 RETURN.UNKNOWNPRED_TAKEN
- 0xc NRETIND.UNKNOWNPRED_NOTTAKEN
- 0xd NRETIND.UNKNOWNPRED_TAKEN
-
-# CPU_IA64_2 Table 11-49, 11-51, 11-55, 11-56, 11-57, 11-58
-name:bus type:exclusive default:0x03
- 0x1 IO
- 0x2 SELF
- 0x3 ANY
-
-# CPU_IA64_2 Table 11-50 b0001
-name:bus_backsnp_req type:mandatory default:0x01
- 0x1 0x0
-
-# CPU_IA64_2 Table 11-52
-name:bus_lock type:exclusive default:0x03
- 0x2 SELF
- 0x3 ANY
-
-# CPU_IA64_2 Table 11-53
-name:bus_memory type:exclusive default:0x0f
- 0x5 EQ_128BYTEIO
- 0x6 EQ_128BYTE_SELF
- 0x7 EQ_128BYTE_ANY
- 0x9 LT_128BYTEIO
- 0xa LT_128BYTE_SELF
- 0xb LT_128BYTE_ANY
- 0xd ALL IO
- 0xe ALL SELF
- 0xf ALL ANY
-
-# CPU_IA64_2 Table 11-54
-name:bus_mem_read type:exclusive default:0x0f
- 0x1 BIL IO
- 0x2 BIL SELF
- 0x3 BIL ANY
- 0x5 BRL IO
- 0x6 BRL SELF
- 0x7 BRL_ANY
- 0x9 BRIL IO
- 0xa BRIL SELF
- 0xb BRIL ANY
- 0xd ALL IO
- 0xe ALL SELF
- 0xf ALL ANY
-
-# CPU_IA64_2 Table 11-59, 11-60
-name:bus_snoop type:exclusive default:0x03
- 0x2 SELF
- 0x3 ANY
-
-# CPU_IA64_2 Table 11-61
-name:bus_wr_wb type:exclusive default:0x0f
- 0x5 EQ_128BYTE IO
- 0x6 EQ_128BYTE SELF
- 0x7 EQ_128BYTE ANY
- 0xa CCASTOUT SELF
- 0xb CCASTOUT ANY
- 0xd ALL IO
- 0xe ALL SELF
- 0xf ALL ANY
-
-# CPU_IA64_2 Table 11-62
-name:encbr_mispred_detail type:exclusive default:0x0
- 0x0 ALL.ALL_PRED
- 0x1 ALL.CORRECT_PRED
- 0x2 ALL.WRONG_PATH
- 0x3 ALL.WRONG_TARGET
- 0x8 OVERSUB.ALL_PRED
- 0x9 OVERSUB.CORRECT_PRED
- 0xa OVERSUB.CORRECT_PRED
- 0xb OVERSUB.WRONGPATH
- 0xc ALL2.ALL_PRED
- 0xd ALL2.CORRECT_PRED
- 0xe ALL2.WRONG_PATH
- 0xf ALL2.WRONG_TARGET
-
-# CPU_IA64_2 Table 11-63
-name:extern_dp_pins_0_to_3 type:bitmask default:0xf
- 0x1 PIN0
- 0x2 PIN1
- 0x4 PIN2
- 0x8 PIN3
- 0xf ALL
-
-# CPU_IA64_2 Table 11-64
-name:extern_dp_pins_4_to_5 type:bitmask default:0x03
- 0x1 PIN4
- 0x2 PIN5
- 0xf ALL
-
-# CPU_IA64_2 Table 11-65
-name:fe_bubble type:exclusive default:0x0
- 0x0 ALL
- 0x1 FEFLUSH
- 0x3 GROUP1
- 0x4 GROUP2
- 0x5 IBFULL
- 0x6 IMISS
- 0x7 TLBMISS
- 0x8 FILL_RECIRC
- 0x9 BRANCH
- 0xa GROUP3
- 0xb ALLBUT_FEFLUSH_BUBBLE
- 0xc ALLBUT_IBFULL
- 0xd BUBBLE
-
-# CPU_IA64_2 Table 11-66, 11-69*/
-name:fe_lost type:exclusive default:0x0
- 0x0 ALL
- 0x1 FEFLUSH
- 0x4 UNREACHED
- 0x5 IBFULL
- 0x6 IMISS
- 0x7 TLBMISS
- 0x8 FILL_RECIRC
- 0x9 BI
- 0xa BRQ
- 0xb PLP
- 0xc BR_ILOCK
- 0xd BUBBLE
-
-# CPU_IA64_2 Table 11-67, 11-79, 11-86, 11-90, 11-92 b0000
-# FIXME: events using this is commented out in events
-#name:this type:exclusive default:0x0
-# 0x0 THIS
-
-# CPU_IA64_2 Table 11-68
-name:tagged_inst_retired type:exclusive default:0x0
- 0x0 IBRP0_PMB8
- 0x1 IBRP1_PMB9
- 0x2 IBRP2_PMC8
- 0x3 IBRP3_PMC9
-
-# CPU_IA64_2 Table 11-73
-name:itlb_misses_fetch type:exclusive default:0x3
- 0x1 L1ITLB
- 0x2 L2ITLB
- 0x3 ALL
-
-# CPU_IA64_2 Table 11-74
-name:l1d_read_misses type:exclusive default:0x0
- 0x0 ALL
- 0x1 RSE_FILL
-
-# CPU_IA64_2 Table 11-75
-name:l1i_prefetch_stall type:exclusive default:0x3
- 0x2 FLOW
- 0x3 ALL
-
-# CPU_IA64_2 Table 11-76, 11-91 b0000
-# FIXME: events using this is commented out in events
-#name:l2_lines type:exclusive default:0x0
-# 0x0 ANY
-
-# CPU_IA64_2 Table 11-77
-name:l2_bypass type:exclusive default:0x0
- 0x0 L2_DATA1
- 0x1 L2_DATA2
- 0x2 L3_DATA1
- 0x4 L2_INST1
- 0x5 L2_INST2
- 0x6 L3_INST1
-
-# CPU_IA64_2 Table 11-78
-# FIXME: events using this is commented out in events
-#name:l2_data_references type:bitmask default:0x3
-# 0x1 L2_DATA_READS
-# 0x2 L2_DATA_WRITES
-# 0x3 L2_ALL
-
-# CPU_IA64_2 Table 11-80
-name:l2_force_recirc type:exclusive default:0x0
- 0x0 ANY
- 0x1 SMC_HIT
- 0x2 L1W
- 0x4 TAG_NOTOK
- 0x5 TRAN_PREF
- 0x6 SNP_OR_L3
- 0x8 VIC_PEND
- 0x9 FILL_HIT
- 0xa IPF_MISS
- 0xb VIC_BUF_FULL
- 0xc OZQ_MISS
- 0xd SAME_INDEX
- 0xe FRC_RECIRC
-
-# CPU_IA64_2 Table 11-81, 11-83 b1000
-name:recirc_ifetch type:mandatory default:0x8
- 0x8 default:0x0} } };
-
-# CPU_IA64_2 Table 11-82
-name:l2_ifet_cancels type:exclusive default:0x0
- 0x0 ANY
- 0x2 BYPASS
- 0x4 DIDNT_RECIR
- 0x5 RECIRC_OVER_SUB
- 0x6 ST_FILL_WB
- 0x7 DATA_RD
- 0x8 PREEMPT
- 0xc CHG_PRIO
- 0xd IFETCH_BYP
-
-# CPU_IA64_2 Table 11-84
-name:l2_l3_access_cancel type:exclusive default:0x9
- 0x1 SPEC_L3_BYP
- 0x2 FILLD_FULL
- 0x5 UC_BLOCKED
- 0x6 INV_L3_BYP
- 0x8 EBL_REJECT
- 0x9 ANY
- 0xa DFETCH
- 0xb IFETCH
-
-# CPU_IA64_2 Table 11-85
-name:l2_ops_issued type:exclusive default:0x8
- 0x8 INT_LOAD
- 0x9 FP_LOAD
- 0xa RMW
- 0xb STORE
- 0xc NST_NLD
-
-# CPU_IA64_2 Table 11-87
-name:l2_ozq_cancels0 type:exclusive default:0x0
- 0x0 ANY
- 0x1 LATE_SPEC_BYP
- 0x2 LATE_RELEASE
- 0x3 LATE_ACQUIRE
- 0x4 LATE_BYP_EFFRELEASE
-
-# CPU_IA64_2 Table 11-88
-name:l2_ozq_cancels1 type:exclusive default:0x1
- 0x0 REL
- 0x1 BANK_CONF
- 0x2 L2D_ST_MAT
- 0x4 SYNC
- 0x5 HPW_IFETCH_CONF
- 0x6 CANC_L2M_ST
- 0x7 L1_FILL_CONF
- 0x8 ST_FILL_CONF
- 0x9 CCV
- 0xa SEM
- 0xb L2M_ST_MAT
- 0xc MFA
- 0xd L2A_ST_MAT
- 0xe L1DF_L2M
- 0xf ECC
-
-# CPU_IA64_2 Table 11-89
-name:l2_ozq_cancels2 type:exclusive default:0x0
- 0x0 RECIRC_OVER_SUB
- 0x1 CANC_L2C_ST
- 0x2 L2C_ST_MAT
- 0x3 SCRUB
- 0x4 ACQ
- 0x5 READ_WB_CONF
- 0x6 OZ_DATA_CONF
- 0x8 L2FILL_ST_CONF
- 0x9 DIDNT_RECIRC
- 0xa WEIRD
- 0xc OVER_SUB
- 0xd CANC_L2D_ST
- 0xf D_IFET
-
-# CPU_IA64_2 Table 11-93
-name:l3_reads type:exclusive default:0x3
- 0x1 DINST_FETCH.HIT
- 0x2 DINST_FETCH.MISS
- 0x3 DINST_FETCH.ALL
- 0x5 INST_FETCH.HIT
- 0x6 INST_FETCH.MISS
- 0x7 INST_FETCH.ALL
- 0x9 DATA_READ.HIT
- 0xa DATA_READ.MISS
- 0xb DATA_READ.ALL
- 0xd ALL.HIT
- 0xe ALL.MISS
- 0xf ALL.ALL
-
-# CPU_IA64_2 Table 11-94
-name:l3_writes type:exclusive default:0x7
- 0x5 DATA_WRITE.HIT
- 0x6 DATA_WRITE.MISS
- 0x7 DATA_WRITE.ALL
- 0x9 L2_WB.HIT
- 0xa L2_WB.MISS
- 0xb L2_WB.ALL
- 0xd ALL.HIT
- 0xe ALL.MISS
- 0xf ALL.ALL
-
-# CPU_IA64_2 Table 11-95
-name:mem_read_current type:exclusive default:0x3
- 0x1 IO
- 0x3 ANY
-
-# CPU_IA64_2 Table 11-96
-name:rse_references_retired type:bitmask default:0x3
- 0x1 LOAD
- 0x2 STORE
- 0x3 ALL
-
-# CPU_IA64_2 Table 11-97 bitmask
-name:syll_not_dispersed type:bitmask default:0xf
- 0x1 EXPL
- 0x2 IMPL
- 0x4 FE
- 0x8 MLI
- 0xf ALL
-
-# CPU_IA64_2 Table 11-98
-name:syll_overcount type:exclusive default:0x3
- 0x1 EXPL
- 0x2 IMPL
- 0x3 ALL
diff --git a/events/ppc/e500mc/events b/events/ppc/e500mc/events
new file mode 100644
index 0000000..8197a7d
--- /dev/null
+++ b/events/ppc/e500mc/events
@@ -0,0 +1,120 @@
+# e500mc Events
+#
+# Copyright (C) 2010 Freescale Semiconductor, Inc.
+#
+event:0x1 counters:0,1,2,3 um:zero minimum:100 name:CPU_CLK : Cycles
+event:0x2 counters:0,1,2,3 um:zero minimum:500 name:COMPLETED_INSNS : Completed Instructions (0, 1, or 2 per cycle)
+event:0x3 counters:0,1,2,3 um:zero minimum:500 name:COMPLETED_OPS : Completed Micro-ops (counts 2 for load/store w/update)
+event:0x4 counters:0,1,2,3 um:zero minimum:500 name:INSTRUCTION_FETCHES : Instruction fetches
+event:0x5 counters:0,1,2,3 um:zero minimum:500 name:DECODED_OPS : Micro-ops decoded
+event:0x8 counters:0,1,2,3 um:zero minimum:500 name:COMPLETED_BRANCHES : Branch Instructions completed
+event:0x9 counters:0,1,2,3 um:zero minimum:500 name:COMPLETED_LOAD_OPS : Load micro-ops completed
+event:0xa counters:0,1,2,3 um:zero minimum:500 name:COMPLETED_STORE_OPS : Store micro-ops completed
+event:0xb counters:0,1,2,3 um:zero minimum:500 name:COMPLETION_REDIRECTS : Number of completion buffer redirects
+event:0xc counters:0,1,2,3 um:zero minimum:500 name:BRANCHES_FINISHED : Branches finished
+event:0xd counters:0,1,2,3 um:zero minimum:500 name:TAKEN_BRANCHES_FINISHED : Taken branches finished
+event:0xe counters:0,1,2,3 um:zero minimum:500 name:BIFFED_BRANCHES_FINISHED : Biffed branches finished
+event:0xf counters:0,1,2,3 um:zero minimum:500 name:BRANCHES_MISPREDICTED : Branch instructions mispredicted due to direction, target, or IAB prediction
+event:0x10 counters:0,1,2,3 um:zero minimum:500 name:BRANCHES_MISPREDICTED_DIRECTION : Branches mispredicted due to direction prediction
+event:0x11 counters:0,1,2,3 um:zero minimum:500 name:BTB_HITS : Branches that hit in the BTB, or missed but are not taken
+event:0x12 counters:0,1,2,3 um:zero minimum:500 name:DECODE_STALLED : Cycles the instruction buffer was not empty, but 0 instructions decoded
+event:0x13 counters:0,1,2,3 um:zero minimum:500 name:ISSUE_STALLED : Cycles the issue buffer is not empty but 0 instructions issued
+event:0x14 counters:0,1,2,3 um:zero minimum:500 name:BRANCH_ISSUE_STALLED : Cycles the branch buffer is not empty but 0 instructions issued
+event:0x15 counters:0,1,2,3 um:zero minimum:500 name:SRS0_SCHEDULE_STALLED : Cycles SRS0 is not empty but 0 instructions scheduled
+event:0x16 counters:0,1,2,3 um:zero minimum:500 name:SRS1_SCHEDULE_STALLED : Cycles SRS1 is not empty but 0 instructions scheduled
+event:0x17 counters:0,1,2,3 um:zero minimum:500 name:VRS_SCHEDULE_STALLED : Cycles VRS is not empty but 0 instructions scheduled
+event:0x18 counters:0,1,2,3 um:zero minimum:500 name:LRS_SCHEDULE_STALLED : Cycles LRS is not empty but 0 instructions scheduled
+event:0x19 counters:0,1,2,3 um:zero minimum:500 name:BRS_SCHEDULE_STALLED : Cycles BRS is not empty but 0 instructions scheduled
+event:0x1a counters:0,1,2,3 um:zero minimum:500 name:TOTAL_TRANSLATED : Total Ldst microops translated.
+event:0x1b counters:0,1,2,3 um:zero minimum:500 name:LOADS_TRANSLATED : Number of cacheable L* or EVL* microops translated. (This includes microops from load-multiple, load-update, and load-context instructions.)
+event:0x1c counters:0,1,2,3 um:zero minimum:500 name:STORES_TRANSLATED : Number of cacheable ST* or EVST* microops translated. (This includes microops from store-multiple, store-update, and save-context instructions.)
+event:0x1d counters:0,1,2,3 um:zero minimum:500 name:TOUCHES_TRANSLATED : Number of cacheable DCBT and DCBTST instructions translated (L1 only) (Does not count touches that are converted to nops i.e. exceptions, noncacheable, hid0[nopti] bit is set.)
+event:0x1e counters:0,1,2,3 um:zero minimum:500 name:CACHEOPS_TRANSLATED : Number of dcba, dcbf, dcbst, and dcbz instructions translated (e500 traps on dcbi)
+event:0x1f counters:0,1,2,3 um:zero minimum:500 name:CACHEINHIBITED_ACCESSES_TRANSLATED : Number of cache inhibited accesses translated
+event:0x20 counters:0,1,2,3 um:zero minimum:500 name:GUARDED_LOADS_TRANSLATED : Number of guarded loads translated
+event:0x21 counters:0,1,2,3 um:zero minimum:500 name:WRITETHROUGH_STORES_TRANSLATED : Number of write-through stores translated
+event:0x22 counters:0,1,2,3 um:zero minimum:500 name:MISALIGNED_ACCESSES_TRANSLATED : Number of misaligned load or store accesses translated.
+event:0x23 counters:0,1,2,3 um:zero minimum:500 name:TOTAL_ALLOCATED_DLFB : Total allocated to dLFB
+event:0x24 counters:0,1,2,3 um:zero minimum:500 name:LOADS_TRANSLATED_ALLOCATED_DLFB : Loads translated and allocated to dLFB (Applies to same class of instructions as loads translated.)
+event:0x25 counters:0,1,2,3 um:zero minimum:500 name:STORES_COMPLETED_ALLOCATED_DLFB : Stores completed and allocated to dLFB (Applies to same class of instructions as stores translated.)
+event:0x26 counters:0,1,2,3 um:zero minimum:500 name:TOUCHES_TRANSLATED_ALLOCATED_DLFB : Touches translated and allocated to dLFB (Applies to same class of instructions as touches translated.)
+event:0x27 counters:0,1,2,3 um:zero minimum:500 name:STORES_COMPLETED : Number of cacheable ST* or EVST* microops completed. (Applies to the same class of instructions as stores translated.)
+event:0x28 counters:0,1,2,3 um:zero minimum:500 name:DL1_LOCKS : Number of cache lines locked in the dL1. (Counts a lock even if an overlock condition is encountered.)
+event:0x29 counters:0,1,2,3 um:zero minimum:500 name:DL1_RELOADS : This is historically used to determine dcache miss rate (along with loads/stores completed). This counts dL1 reloads for any reason.
+event:0x2a counters:0,1,2,3 um:zero minimum:500 name:DL1_CASTOUTS : dL1 castouts. Does not count castouts due to DCBF.
+event:0x2b counters:0,1,2,3 um:zero minimum:500 name:DETECTED_REPLAYS : Times detected replay condition - Load miss with dLFB full.
+event:0x2c counters:0,1,2,3 um:zero minimum:500 name:LOAD_MISS_QUEUE_FULL_REPLAYS : Load miss with load queue full.
+event:0x2d counters:0,1,2,3 um:zero minimum:500 name:LOAD_GUARDED_MISS_NOT_LAST_REPLAYS : Load guarded miss when the load is not yet at the bottom of the completion buffer.
+event:0x2e counters:0,1,2,3 um:zero minimum:500 name:STORE_TRANSLATED_QUEUE_FULL_REPLAYS : Translate a store when the StQ is full.
+event:0x2f counters:0,1,2,3 um:zero minimum:500 name:ADDRESS_COLLISION_REPLAYS : Address collision.
+event:0x30 counters:0,1,2,3 um:zero minimum:500 name:DMMU_MISS_REPLAYS : DMMU miss.
+event:0x31 counters:0,1,2,3 um:zero minimum:500 name:DMMU_BUSY_REPLAYS : DMMU busy.
+event:0x32 counters:0,1,2,3 um:zero minimum:500 name:SECOND_PART_MISALIGNED_AFTER_MISS_REPLAYS : Second part of misaligned access when first part missed in cache.
+event:0x33 counters:0,1,2,3 um:zero minimum:500 name:LOAD_MISS_DLFB_FULL_CYCLES : Cycles stalled on replay condition - Load miss with dLFB full.
+event:0x34 counters:0,1,2,3 um:zero minimum:500 name:LOAD_MISS_QUEUE_FULL_CYCLES : Cycles stalled on replay condition - Load miss with load queue full.
+event:0x35 counters:0,1,2,3 um:zero minimum:500 name:LOAD_GUARDED_MISS_NOT_LAST_CYCLES : Cycles stalled on replay condition - Load guarded miss when the load is not yet at the bottom of the completion buffer.
+event:0x36 counters:0,1,2,3 um:zero minimum:500 name:STORE_TRANSLATED_QUEUE_FULL_CYCLES : Cycles stalled on replay condition - Translate a store when the StQ is full.
+event:0x37 counters:0,1,2,3 um:zero minimum:500 name:ADDRESS_COLLISION_CYCLES : Cycles stalled on replay condition - Address collision.
+event:0x38 counters:0,1,2,3 um:zero minimum:500 name:DMMU_MISS_CYCLES : Cycles stalled on replay condition - DMMU miss.
+event:0x39 counters:0,1,2,3 um:zero minimum:500 name:DMMU_BUSY_CYCLES : Cycles stalled on replay condition - DMMU busy.
+event:0x3a counters:0,1,2,3 um:zero minimum:500 name:SECOND_PART_MISALIGNED_AFTER_MISS_CYCLES : Cycles stalled on replay condition - Second part of misaligned access when first part missed in cache.
+event:0x3b counters:0,1,2,3 um:zero minimum:500 name:IL1_LOCKS : Number of cache lines locked in the iL1. (Counts a lock even if an overlock condition is encountered.)
+event:0x3c counters:0,1,2,3 um:zero minimum:500 name:IL1_FETCH_RELOADS : This is historically used to determine icache miss rate (along with instructions completed). Reloads due to demand fetch.
+event:0x3d counters:0,1,2,3 um:zero minimum:500 name:FETCHES : Counts the number of fetches that write at least one instruction to the instruction buffer. (With instruction fetched, can be used to compute instructions-per-fetch)
+event:0x3e counters:0,1,2,3 um:zero minimum:500 name:IMMU_TLB4K_RELOADS : iMMU TLB4K reloads
+event:0x3f counters:0,1,2,3 um:zero minimum:500 name:IMMU_VSP_RELOADS : iMMU VSP reloads
+event:0x40 counters:0,1,2,3 um:zero minimum:500 name:DMMU_TLB4K_RELOADS : dMMU TLB4K reloads
+event:0x41 counters:0,1,2,3 um:zero minimum:500 name:DMMU_VSP_RELOADS : dMMU VSP reloads
+event:0x42 counters:0,1,2,3 um:zero minimum:500 name:L2MMU_MISSES : Counts iTLB/dTLB error interrupt
+event:0x43 counters:0,1,2,3 um:zero minimum:500 name:BIU_MASTER_REQUESTS : Number of master transactions. (Number of master TSs.)
+event:0x44 counters:0,1,2,3 um:zero minimum:500 name:BIU_MASTER_I_REQUESTS : Number of master I-Side transactions. (Number of master I-Side TSs.)
+event:0x45 counters:0,1,2,3 um:zero minimum:500 name:BIU_MASTER_D_REQUESTS : Number of master D-Side transactions. (Number of master D-Side TSs.)
+event:0x46 counters:0,1,2,3 um:zero minimum:500 name:BIU_MASTER_D_CASTOUT_REQUESTS : Number of master D-Side non-program-demand castout transactions. This counts replacement pushes and snoop pushes. This does not count DCBF castouts. (Number of master D-side non-program-demand castout TSs.)
+event:0x48 counters:0,1,2,3 um:zero minimum:500 name:SNOOP_REQUESTS : Number of externally generated snoop requests. (Counts snoop TSs.)
+event:0x49 counters:0,1,2,3 um:zero minimum:500 name:SNOOP_HITS : Number of snoop hits on all D-side resources regardless of the cache state (modified, exclusive, or shared)
+event:0x4a counters:0,1,2,3 um:zero minimum:500 name:SNOOP_PUSHES : Number of snoop pushes from all D-side resources. (Counts snoop ARTRY/WOPs.)
+event:0x52 counters:0,1,2,3 um:zero minimum:500 name:PMC0_OVERFLOW : Counts the number of times PMC0[32] transitioned from 1 to 0.
+event:0x53 counters:0,1,2,3 um:zero minimum:500 name:PMC1_OVERFLOW : Counts the number of times PMC1[32] transitioned from 1 to 0.
+event:0x54 counters:0,1,2,3 um:zero minimum:500 name:PMC2_OVERFLOW : Counts the number of times PMC2[32] transitioned from 1 to 0.
+event:0x55 counters:0,1,2,3 um:zero minimum:500 name:PMC3_OVERFLOW : Counts the number of times PMC3[32] transitioned from 1 to 0.
+event:0x56 counters:0,1,2,3 um:zero minimum:500 name:INTERRUPTS : Number of interrupts taken
+event:0x57 counters:0,1,2,3 um:zero minimum:500 name:EXTERNAL_INTERRUPTS : Number of external input interrupts taken
+event:0x58 counters:0,1,2,3 um:zero minimum:500 name:CRITICAL_INTERRUPTS : Number of critical input interrupts taken
+event:0x59 counters:0,1,2,3 um:zero minimum:500 name:SC_TRAP_INTERRUPTS : Number of system call and trap interrupts
+event:0x5b counters:0,1,2,3 um:zero minimum:500 name:L2_LINEFILL_REQ : Number of L2 linefill requests
+event:0x5c counters:0,1,2,3 um:zero minimum:500 name:L2_VICTIM_SELECT : Number of L2 victim selects
+event:0x6e counters:0,1,2,3 um:zero minimum:500 name:L2_ACCESS : Number of L2 cache accesses
+event:0x6f counters:0,1,2,3 um:zero minimum:500 name:L2_HIT_ACCESS : Number of L2 cache accesses that hit
+event:0x70 counters:0,1,2,3 um:zero minimum:500 name:L2_DATA_ACCESS : Number of L2 data cache accesses
+event:0x71 counters:0,1,2,3 um:zero minimum:500 name:L2_HIT_DATA_ACCESS : Number of L2 data cache accesses that hit
+event:0x72 counters:0,1,2,3 um:zero minimum:500 name:L2_INST_ACCESS : Number of L2 instruction cache accesses
+event:0x73 counters:0,1,2,3 um:zero minimum:500 name:L2_HIT_INST_ACCESS : Number of L2 instruction cache accesses that hit
+event:0x74 counters:0,1,2,3 um:zero minimum:500 name:L2_ALLOC : Number of L2 cache allocations
+event:0x75 counters:0,1,2,3 um:zero minimum:500 name:L2_DATA_ALLOC : Number of L2 data cache allocations
+event:0x76 counters:0,1,2,3 um:zero minimum:500 name:L2_DIRTY_DATA_ALLOC : Number of L2 dirty data cache allocations
+event:0x77 counters:0,1,2,3 um:zero minimum:500 name:L2_INST_ALLOC : Number of L2 instruction cache allocations
+event:0x78 counters:0,1,2,3 um:zero minimum:500 name:L2_UPDATE : Number of L2 cache updates
+event:0x79 counters:0,1,2,3 um:zero minimum:500 name:L2_CLEAN_UPDATE : Number of L2 cache clean updates
+event:0x7a counters:0,1,2,3 um:zero minimum:500 name:L2_DIRTY_UPDATE : Number of L2 cache dirty updates
+event:0x7b counters:0,1,2,3 um:zero minimum:500 name:L2_CLEAN_REDU_UPDATE : Number of L2 cache clean redundant updates
+event:0x7c counters:0,1,2,3 um:zero minimum:500 name:L2_DIRTY_REDU_UPDATE : Number of L2 cache dirty redundant updates
+event:0x7d counters:0,1,2,3 um:zero minimum:500 name:L2_LOCKS : Number of L2 cache locks
+event:0x7e counters:0,1,2,3 um:zero minimum:500 name:L2_CASTOUT : Number of L2 cache castouts
+event:0x7f counters:0,1,2,3 um:zero minimum:500 name:L2_HIT_DATA_DIRTY : Number of L2 cache data dirty hits
+event:0x82 counters:0,1,2,3 um:zero minimum:500 name:L2_INV_CLEAN : Number of L2 cache invalidations of clean lines
+event:0x83 counters:0,1,2,3 um:zero minimum:500 name:L2_INV_INCOHER : Number of L2 cache invalidations of incoherent lines
+event:0x84 counters:0,1,2,3 um:zero minimum:500 name:L2_INV_COHER : Number of L2 cache invalidations of coherent lines
+event:0x94 counters:0,1,2,3 um:zero minimum:500 name:DVT0 : Detection of write to DEVENT with DVT0 set
+event:0x95 counters:0,1,2,3 um:zero minimum:500 name:DVT1 : Detection of write to DEVENT with DVT1 set
+event:0x96 counters:0,1,2,3 um:zero minimum:500 name:DVT2 : Detection of write to DEVENT with DVT2 set
+event:0x97 counters:0,1,2,3 um:zero minimum:500 name:DVT3 : Detection of write to DEVENT with DVT3 set
+event:0x98 counters:0,1,2,3 um:zero minimum:500 name:DVT4 : Detection of write to DEVENT with DVT4 set
+event:0x99 counters:0,1,2,3 um:zero minimum:500 name:DVT5 : Detection of write to DEVENT with DVT5 set
+event:0x9a counters:0,1,2,3 um:zero minimum:500 name:DVT6 : Detection of write to DEVENT with DVT6 set
+event:0x9b counters:0,1,2,3 um:zero minimum:500 name:DVT7 : Detection of write to DEVENT with DVT7 set
+event:0x9c counters:0,1,2,3 um:zero minimum:500 name:CYCLES_NEXUS_STALLED : Number of completion cycles stalled due to Nexus FIFO full
+event:0xb0 counters:0,1,2,3 um:zero minimum:500 name:DECORATED_LOAD : Number of decorated loads.
+event:0xb1 counters:0,1,2,3 um:zero minimum:500 name:DECORATED_STORE : Number of decorated stores
+event:0xb2 counters:0,1,2,3 um:zero minimum:500 name:LOAD_RETRY : Number of load retries
+event:0xb3 counters:0,1,2,3 um:zero minimum:500 name:STWCX_SUCCESS : Number of successful stwcx. instructions
+event:0xb4 counters:0,1,2,3 um:zero minimum:500 name:STWCX_UNSUCCESS : Number of unsuccessful stwcx. instructions
diff --git a/events/rtc/unit_masks b/events/ppc/e500mc/unit_masks
similarity index 67%
rename from events/rtc/unit_masks
rename to events/ppc/e500mc/unit_masks
index 6984b62..395c653 100644
--- a/events/rtc/unit_masks
+++ b/events/ppc/e500mc/unit_masks
@@ -1,4 +1,4 @@
-# RTC possible unit masks
+# e500 possible unit masks
#
name:zero type:mandatory default:0x0
0x0 No unit mask
diff --git a/events/ppc/e6500/events b/events/ppc/e6500/events
new file mode 100644
index 0000000..f34f82d
--- /dev/null
+++ b/events/ppc/e6500/events
@@ -0,0 +1,266 @@
+# e6500 Events
+#
+# Copyright (C) 2012 Freescale Semiconductor, Inc.
+#
+event:0x1 counters:0,1,2,3,4,5 um:zero minimum:100 name:CPU_CLK : Cycles
+event:0x2 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_INSNS : Completed Instructions (0, 1, or 2 per cycle)
+event:0x3 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_OPS : Completed Micro-ops
+event:0x5 counters:0,1,2,3,4,5 um:zero minimum:500 name:DECODED_OPS : Micro-ops decoded
+event:0x6 counters:0,1,2,3,4,5 um:zero minimum:500 name:TRANSITIONS_PM_EVENT : 0 to 1 transitions on the pm_event input
+event:0x7 counters:0,1,2,3,4,5 um:zero minimum:500 name:CPU_CLK_PM_EVENT : Processor cycles that occur when the pm_event input is asserted
+event:0x8 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_BRANCHES : Branch Instructions completed
+event:0x9 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_LOAD_OPS : Load micro-ops completed
+event:0xa counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_STORE_OPS : Store micro-ops completed
+event:0xb counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETION_REDIRECTS : Number of completion buffer redirects
+event:0xc counters:0,1,2,3,4,5 um:zero minimum:500 name:BRANCHES_FINISHED : Branches finished
+event:0xd counters:0,1,2,3,4,5 um:zero minimum:500 name:TAKEN_BRANCHES_FINISHED : Taken branches finished
+event:0xe counters:0,1,2,3,4,5 um:zero minimum:500 name:TAKEN_BRANCHES_FINISHED_NOT_BTB : Finished unconditional branches that miss the BTB
+event:0xf counters:0,1,2,3,4,5 um:zero minimum:500 name:BRANCHES_MISPREDICTED : Branch instructions mispredicted due to direction, target, or IAB prediction
+event:0x10 counters:0,1,2,3,4,5 um:zero minimum:500 name:BRANCHES_MISPREDICTED_DIRECTION : Branches mispredicted due to direction prediction
+event:0x11 counters:0,1,2,3,4,5 um:zero minimum:500 name:BTB_HITS : Branches that hit in the BTB, or missed but are not taken
+event:0x12 counters:0,1,2,3,4,5 um:zero minimum:500 name:DECODE_STALLED : Cycles the instruction buffer was not empty, but 0 instructions decoded
+event:0x13 counters:0,1,2,3,4,5 um:zero minimum:500 name:ISSUE_STALLED : Cycles the SFX/CFX issue queue is not empty but 0 instructions issued
+event:0x14 counters:0,1,2,3,4,5 um:zero minimum:500 name:BRANCH_ISSUE_STALLED : Cycles the branch buffer is not empty but 0 instructions issued
+event:0x15 counters:0,1,2,3,4,5 um:zero minimum:500 name:SFX0_SCHEDULE_STALLED : Cycles SFX0 is not empty but 0 instructions scheduled
+event:0x16 counters:0,1,2,3,4,5 um:zero minimum:500 name:SFX1_SCHEDULE_STALLED : Cycles SFX1 is not empty but 0 instructions scheduled
+event:0x17 counters:0,1,2,3,4,5 um:zero minimum:500 name:CFX_SCHEDULE_STALLED : Cycles CFX is not empty but 0 instructions scheduled
+event:0x18 counters:0,1,2,3,4,5 um:zero minimum:500 name:LSU_SCHEDULE_STALLED : Cycles LSU is not empty but 0 instructions scheduled
+event:0x19 counters:0,1,2,3,4,5 um:zero minimum:500 name:BU_SCHEDULE_STALLED : Cycles BU is not empty but 0 instructions scheduled
+event:0x1a counters:0,1,2,3,4,5 um:zero minimum:500 name:TOTAL_TRANSLATED : Total LSU micro-ops that reach the second stage of the LSU
+event:0x1b counters:0,1,2,3,4,5 um:zero minimum:500 name:LOADS_TRANSLATED : Cacheable load micro-ops translated. (Does not include WT)
+event:0x1c counters:0,1,2,3,4,5 um:zero minimum:500 name:STORES_TRANSLATED : Cacheable store micro-ops translated. (Does not include WT)
+event:0x1d counters:0,1,2,3,4,5 um:zero minimum:500 name:TOUCHES_TRANSLATED : Cacheable touch instructions translated. Includes: dcbt / dcbtep, dcbtst / dcbtstep, icbt ct=2
+event:0x1e counters:0,1,2,3,4,5 um:zero minimum:500 name:CACHEOPS_TRANSLATED : Number of dcba, dcbf, dcbst, and dcbz instructions translated (e500 traps on dcbi)
+event:0x1f counters:0,1,2,3,4,5 um:zero minimum:500 name:CACHEINHIBITED_ACCESSES_TRANSLATED : Number of cache inhibited accesses translated
+event:0x20 counters:0,1,2,3,4,5 um:zero minimum:500 name:GUARDED_LOADS_TRANSLATED : Number of guarded loads translated
+event:0x21 counters:0,1,2,3,4,5 um:zero minimum:500 name:WRITETHROUGH_STORES_TRANSLATED : Number of write-through stores translated
+event:0x22 counters:0,1,2,3,4,5 um:zero minimum:500 name:MISALIGNED_ACCESSES_TRANSLATED : Number of misaligned load or store accesses translated.
+event:0x23 counters:0,1,2,3,4,5 um:zero minimum:500 name:FETCH_2X4_HITS : Each fetch retrieves up to 8 instructions, but only the first 4 are required. This event increments if at least one instruction of the second 4 is actually used.
+event:0x24 counters:0,1,2,3,4,5 um:zero minimum:500 name:FETCH_HITS_ON_PREFETCHES : Fetch hits on instruction prefetch when the data is still in the ILFB.
+event:0x25 counters:0,1,2,3,4,5 um:zero minimum:500 name:GENERATED_FETCH_PREFETCHES : Number of prefetches generated.
+event:0x29 counters:0,1,2,3,4,5 um:zero minimum:500 name:DL1_RELOADS : This is historically used to determine dcache miss rate (along with loads/stores completed). This counts dL1 reloads for any reason.
+event:0x2c counters:0,1,2,3,4,5 um:zero minimum:500 name:LOAD_MISS_WITH_LOAD_QUEUE_FULL : Counts number of stalls; Com:52 counts cycles stalled. Includes: cacheable loads, CI loads, loadec, larx, touches, ibll, ibsl, ibllsl
+event:0x2d counters:0,1,2,3,4,5 um:zero minimum:500 name:LOAD_GUARDED_MISS_NOT_LAST_REPLAYS : Load guarded miss when the load is not yet at the bottom of the completion buffer.
+event:0x2e counters:0,1,2,3,4,5 um:zero minimum:500 name:STORE_TRANSLATED_QUEUE_FULL_REPLAYS : Translate a store when the StQ is full.
+event:0x2f counters:0,1,2,3,4,5 um:zero minimum:500 name:ADDRESS_COLLISION_REPLAYS : Address collision.
+event:0x30 counters:0,1,2,3,4,5 um:zero minimum:500 name:DTLB_MISS_REPLAYS : Counts number of stalls; Com:56 counts cycles stalled.
+event:0x31 counters:0,1,2,3,4,5 um:zero minimum:500 name:DTLB_BUSY_REPLAYS : Counts number of stalls; Com:57 counts cycles stalled.
+event:0x32 counters:0,1,2,3,4,5 um:zero minimum:500 name:SECOND_PART_MISALIGNED_AFTER_MISS_REPLAYS : Second part of misaligned access when first part missed in cache.
+event:0x34 counters:0,1,2,3,4,5 um:zero minimum:500 name:LOAD_MISS_QUEUE_FULL_CYCLES : Cycles stalled on replay condition - Load miss with load queue full.
+event:0x35 counters:0,1,2,3,4,5 um:zero minimum:500 name:LOAD_GUARDED_MISS_NOT_LAST_CYCLES : Cycles stalled on replay condition - Load guarded miss when the load is not yet at the bottom of the completion buffer.
+event:0x36 counters:0,1,2,3,4,5 um:zero minimum:500 name:STORE_TRANSLATED_QUEUE_FULL_CYCLES : Cycles stalled on replay condition - Translate a store when the StQ is full.
+event:0x37 counters:0,1,2,3,4,5 um:zero minimum:500 name:ADDRESS_COLLISION_CYCLES : Cycles stalled on replay condition - Address collision.
+event:0x38 counters:0,1,2,3,4,5 um:zero minimum:500 name:DTLB_MISS_CYCLES : Cycles stalled on replay condition - DTLB miss.
+event:0x39 counters:0,1,2,3,4,5 um:zero minimum:500 name:DTLB_BUSY_CYCLES : Cycles stalled on replay condition - DTLB busy.
+event:0x3a counters:0,1,2,3,4,5 um:zero minimum:500 name:SECOND_PART_MISALIGNED_AFTER_MISS_CYCLES : Cycles stalled on replay condition - Second part of misaligned access when first part missed in cache.
+event:0x3c counters:0,1,2,3,4,5 um:zero minimum:500 name:IL1_FETCH_RELOADS : This is historically used to determine icache miss rate (along with instructions completed). Counts reloads due to demand fetch.
+event:0x3d counters:0,1,2,3,4,5 um:zero minimum:500 name:FETCHES : Counts fetches that write at least one instruction to the Instruction Buffer.
+event:0x3e counters:0,1,2,3,4,5 um:zero minimum:500 name:IMMU_TLB4K_RELOADS : iMMU TLB4K reloads
+event:0x3f counters:0,1,2,3,4,5 um:zero minimum:500 name:IMMU_VSP_RELOADS : iMMU VSP reloads
+event:0x40 counters:0,1,2,3,4,5 um:zero minimum:500 name:DMMU_TLB4K_RELOADS : dMMU TLB4K reloads
+event:0x41 counters:0,1,2,3,4,5 um:zero minimum:500 name:DMMU_VSP_RELOADS : dMMU VSP reloads
+event:0x42 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2MMU_MISSES : Counts iTLB/dTLB error interrupt
+event:0x43 counters:0,1,2,3,4,5 um:zero minimum:500 name:TAKEN_BRANCHES : Completed branch instructions that were taken.
+event:0x44 counters:0,1,2,3,4,5 um:zero minimum:500 name:TAKEN_BLR : Completed blr instructions that were taken.
+event:0x45 counters:0,1,2,3,4,5 um:zero minimum:500 name:BTB_TARGET_MISPREDICT : Number of target mispredicts (BTB).
+event:0x46 counters:0,1,2,3,4,5 um:zero minimum:500 name:MISPREDICT_TARGET_BLR : Number of link stack mispredicts (LS).
+event:0x47 counters:0,1,2,3,4,5 um:zero minimum:500 name:TAKEN_BTB_BUT_MISS : Number of BTB misses, but taken (BTB allocates).
+event:0x52 counters:0,1,2,3,4,5 um:zero minimum:500 name:PMC0_OVERFLOW : Counts the number of times PMC0[32] transitioned from 1 to 0.
+event:0x53 counters:0,1,2,3,4,5 um:zero minimum:500 name:PMC1_OVERFLOW : Counts the number of times PMC1[32] transitioned from 1 to 0.
+event:0x54 counters:0,1,2,3,4,5 um:zero minimum:500 name:PMC2_OVERFLOW : Counts the number of times PMC2[32] transitioned from 1 to 0.
+event:0x55 counters:0,1,2,3,4,5 um:zero minimum:500 name:PMC3_OVERFLOW : Counts the number of times PMC3[32] transitioned from 1 to 0.
+event:0x56 counters:0,1,2,3,4,5 um:zero minimum:500 name:INTERRUPTS : Number of interrupts taken
+event:0x57 counters:0,1,2,3,4,5 um:zero minimum:500 name:EXTERNAL_INTERRUPTS : Number of external input interrupts taken
+event:0x58 counters:0,1,2,3,4,5 um:zero minimum:500 name:CRITICAL_INTERRUPTS : Number of critical input interrupts taken
+event:0x59 counters:0,1,2,3,4,5 um:zero minimum:500 name:SC_TRAP_INTERRUPTS : Number of system call and trap interrupts
+event:0x5a counters:0,1,2,3,4,5 um:zero minimum:500 name:TBL_BIT_TRANS_PMGC0 : Counts transitions of the TBL bit selected by PMGC0[TBSEL].
+event:0x5b counters:0,1,2,3,4,5 um:zero minimum:500 name:PMC4_OVERFLOW : Counts the number of times PMC4[32] transitioned from 1 to 0.
+event:0x5c counters:0,1,2,3,4,5 um:zero minimum:500 name:PMC5_OVERFLOW : Counts the number of times PMC5[32] transitioned from 1 to 0.
+event:0x61 counters:0,1,2,3,4,5 um:zero minimum:500 name:L1_STASH_HIT : Stash hits in L1 Data Cache.
+event:0x63 counters:0,1,2,3,4,5 um:zero minimum:500 name:L1_STASH_REQ : Stash requests to L1 Data Cache.
+event:0x64 counters:0,1,2,3,4,5 um:zero minimum:500 name:TIMES_LSU_THREAD_PRIO_SWTICHED : Number of times the Load Store Unit thread priority switched based on resource collisions.
+event:0x65 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_THREAD_REQ_FPU_DENIED : Number of cycles both threads had Floating Point Unit requests and one was denied.
+event:0x66 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_THREAD_REQ_VPERM_DENIED : Number of cycles both threads had Altivec Permute requests and one was denied.
+event:0x67 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_THREAD_REQ_VGEN_DENIED : Number of cycles both threads had Altivec General requests and one was denied.
+event:0x68 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_THREAD_REQ_CFX_DENIED : Number of cycles both threads had Complex Fixed-Point Unit requests and one was denied.
+event:0x69 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_THREAD_REQ_FETCH_DENIED : Number of cycles both threads made a Fetch request to the L1 Instruction Cache and one thread wins arbitration.
+event:0x6e counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_LSU_ISSUE_STALLED : Cycles the LSU issue queue is not empty but 0 instructions issued.
+event:0x6f counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_FPU_ISSUE_STALLED : Cycles the FPU issue queue is not empty but 0 instructions issued.
+event:0x70 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_ALTIVEC_ISSUE_STALLED : Cycles the AltiVec issue queue is not empty but 0 instructions issued.
+event:0x71 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_FPU_SCHEDULE_STALLED : Cycles FPU is not empty but 0 instructions scheduled.
+event:0x72 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VPERM_SCHEDULE_STALLED : Cycles VPERM is not empty but 0 instructions scheduled.
+event:0x73 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VGEN_SCHEDULE_STALLED : Cycles VGEN is not empty but 0 instructions scheduled.
+event:0x74 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VPU_INSTRUCTION_WAIT_FOR_OPERA : Cycles VPU instruction waits for operands.
+event:0x75 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VFPU_INSTRUCTION_WAIT_FOR_OPERA : Cycles VFPU instruction waits for operands.
+event:0x76 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VSFX_INSTRUCTION_WAIT_FOR_OPERA : Cycles VSFX instruction waits for operands
+event:0x77 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VCFX_INSTRUCTION_WAIT_FOR_OPERA : Cycles VCFX instruction waits for operands.
+event:0x7a counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_IB_EMPT : Number of cycles the Instruction Buffer is empty
+event:0x7b counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_IB_FULL : Number of cycles the Instruction Buffer is full enough such that fetch stops fetching.
+event:0x7c counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_CB_EMPT : Number of cycles the Completion Buffer is empty.
+event:0x7d counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_CB_FULL : Number of cycles the Completion Buffer is full enough such that decode stops.
+event:0x7e counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_PRESYNC_SI_IB : Number of cycles a pre-sync serialized instruction holds in the Instruction Buffer and is not decoded.
+event:0x7f counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_CLK_0_INSTRUCTIONS : Increments if 0 instructions (micro-ops) completed.
+event:0x80 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_CLK_1_INSTRUCTIONS : Increments if 1 instruction (micro-op) completed.
+event:0x81 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_CLK_2_INSTRUCTIONS : Increments if 2 instructions (micro-ops) completed.
+event:0x88 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC5S : Every valid IAC5 detection.
+event:0x89 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC6S : Every valid IAC6 detection.
+event:0x8a counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC7S : Every valid IAC7 detection.
+event:0x8b counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC8S : Every valid IAC8 detection.
+event:0x8c counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC1S : Every valid IAC1 detection.
+event:0x8d counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC2S : Every valid IAC2 detection.
+event:0x8e counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC3S : Every valid IAC3 detection.
+event:0x8f counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_IAC4S : Every valid IAC4 detection.
+event:0x90 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DAC1S : Every valid DAC1 detection.
+event:0x91 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DAC2S : Every valid DAC2 detection.
+event:0x94 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT0 : Detection of a write to DEVENT SPR with DVT0 set.
+event:0x95 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT1 : Detection of a write to DEVENT SPR with DVT1 set.
+event:0x96 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT2 : Detection of a write to DEVENT SPR with DVT2 set.
+event:0x97 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT3 : Detection of a write to DEVENT SPR with DVT3 set.
+event:0x98 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT4 : Detection of a write to DEVENT SPR with DVT4 set.
+event:0x99 counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT5 : Detection of a write to DEVENT SPR with DVT5 set.
+event:0x9a counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT6 : Detection of a write to DEVENT SPR with DVT6 set.
+event:0x9b counters:0,1,2,3,4,5 um:zero minimum:500 name:DETECTED_DVT7 : Detection of a write to DEVENT SPR with DVT7 set.
+event:0x9c counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_COMPLETION_STALLED : Number of completion cycles stalled due to Nexus FIFO full.
+event:0xa1 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_FINISH : FPU finish.
+event:0xa2 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_FPU_DIV : Counts once for every cycle of divide execution. (fdivs and fdiv).
+event:0xa3 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_DENORM_INPUT : Counts extra cycles of delay due to denormalized inputs. If there is one denormalized operand, this is incremented 4 times; two operands increment it 5 times. This shows the real penalty due to denorms, not just how often they occur.
+event:0xa4 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_DENORM_OUTPUT : FPU denorm output.
+event:0xa5 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_FPSCR_FULL_STALL : FPU FPSCR stall.
+event:0xa6 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_PIPE_SYNC_STALL : Synchronization-op stalls: count once for each cycle that a "break-before" FPU op is in the RS/issue stage but cannot issue. Also count once for each cycle that an FPU op is in the RS/issue stage but cannot issue due to "break-after" of an FPU op currently in progress.
+event:0xa7 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_INPUT_DATA_STALL : FPU data-ready stall: cycles in which there is an op in the RS/issue stage that cannot issue because one or more of its operands is not yet available.
+event:0xa8 counters:0,1,2,3,4,5 um:zero minimum:500 name:FPU_INSTRUCTIONS_GEN_FLAG : FPU instruction sets FPSCR[FEX].
+event:0xac counters:0,1,2,3,4,5 um:zero minimum:500 name:PW20_CNT : Number of times the core enters the PW20 power management state.
+event:0xb0 counters:0,1,2,3,4,5 um:zero minimum:500 name:DECORATED_LOADS : Number of decorated loads to cache inhibited memory performed.
+event:0xb1 counters:0,1,2,3,4,5 um:zero minimum:500 name:DECORATED_STORES : Number of decorated stores to cache inhibited memory performed.
+event:0xb3 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_INSTRUCTIONS_SUCC : Number of successful stbcx., sthcx., stwcx., or stdcx. instructions.
+event:0xb4 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_INSTRUCTIONS_UNSUCC : Number of unsuccessful stbcx., sthcx., stwcx., or stdcx. instructions.
+event:0xb5 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_LSU_MICROOPS : Completed Load Store Unit micro-ops. Every micro-op that goes down the LSU pipe. Includes: GPR loads / GPR stores, FPR loads / FPR stores, VR loads / VR stores, cache ops, memory barriers, other LSU ops (dsn, msgsnd, mvidsplt, mviwsplt, tlbilx, tlbivax, tlbsync).
+event:0xb6 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_GPR_LOADS : GPR load micro-ops completed. This event only counts once for misaligns. Note that lmw that causes a fault may end up double-counting micro-ops -- once for first pass, once for second pass.
+event:0xb7 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_GPR_STORES : GPR store micro-ops completed. This event only counts once for misaligns. Note that stmw that causes a fault may end up double-counting micro-ops -- once for first pass, once for second pass.
+event:0xb8 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_CACHEOPS : Cache ops completed. Includes: dcba / dcbal, dcbf / dcbfep, dcbi, dcblc / dcblq, dcbst / dcbstep, dcbt / dcbtep / dcbtls, dcbtst / dcbtstep / dcbtstls, dcbz / dcbzep / dcbzl / dcbzlep, icbi / icbiep, icblc / icblq., icbt / icbtls
+event:0xb9 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_MEM_BARRIERS : Memory barriers completed. Includes: msync (sync, lwsync, elemental barriers), mbar (eieio), miso.
+event:0xba counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_SFX_MICROOPS : SFX micro-ops completed.
+event:0xbb counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_SINCLK_SFX_MICROOPS : SFX single-cycle micro-ops completed.
+event:0xbc counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_DBLCLK_SFX_MICROOPS : SFX double-cycle micro-ops completed.
+event:0xbe counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_CFX_INSTRUCTIONS : CFX instructions completed.
+event:0xbf counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_SFX_CFX_INSTRUCTIONS : SFX or CFX instructions completed.
+event:0xc0 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_FPU_INSTRUCTIONS : FPU instructions completed.
+event:0xc1 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_FPR_MICROOPS_LOADS : FPR load micro-ops completed.
+event:0xc2 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_FPR_MICROOPS_STORES : FPR store micro-ops completed.
+event:0xc3 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_FPR_MICROOPS_LOADS_STORES : FPR load and store micro-ops completed.
+event:0xc4 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_FPR_SINPRECISE_LOADS_STORES : FPR single-precision load and store micro-ops completed.
+event:0xc5 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_FPR_DBLPRECISE_LOADS_STORES : FPR double-precision load and store micro-ops completed.
+event:0xc6 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_ALTIVEC_INSTRUCTIONS : AltiVec instructions completed. (non-LSU).
+event:0xc7 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_ALTIVEC_VSFX_INSTRUCTIONS : AltiVec VSFX instructions completed.
+event:0xc8 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_ALTIVEC_VCFX_INSTRUCTIONS : AltiVec VCFX instructions completed.
+event:0xc9 counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_ALTIVEC_VPU_INSTRUCTIONS : AltiVec VPU instructions completed.
+event:0xca counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_ALTIVEC_VFPU_INSTRUCTIONS : AltiVec VFPU instructions completed.
+event:0xcb counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_VR_LOADS_MICROOPS : VR load micro-ops completed.
+event:0xcc counters:0,1,2,3,4,5 um:zero minimum:500 name:COMPLETED_VR_STORES_MICROOPS : VR store micro-ops completed.
+event:0xcd counters:0,1,2,3,4,5 um:zero minimum:500 name:VSCR_SAT_SET : Number of times the saturate bit flips from 0 to 1.
+event:0xd2 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_SFX0_IDLE : Cycles Simple Fixed Point Unit 0 is idle.
+event:0xd3 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_SFX1_IDLE : Cycles Simple Fixed Point Unit 1 is idle.
+event:0xd4 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_CFX_IDLE : Cycles Complex Fixed Point Unit is idle.
+event:0xd5 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_LSU_IDLE : Cycles Load Store Unit is idle.
+event:0xd6 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_BU_IDLE : Cycles Branch Unit is idle.
+event:0xd7 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_FPU_IDLE : Cycles Floating Point Unit is idle.
+event:0xd8 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VPU_IDLE : Cycles AltiVec Permute Unit is idle.
+event:0xd9 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VFPU_IDLE : Cycles AltiVec Floating Point Unit is idle.
+event:0xda counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VSFX_IDLE : Cycles AltiVec Simple Fixed Point Unit is idle.
+event:0xdb counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_VCFX_IDLE : Cycles AltiVec Complex Fixed Point Unit is idle.
+event:0xdd counters:0,1,2,3,4,5 um:zero minimum:500 name:L1_CACHE_MISSES : Data L1 cache misses. (Includes load, store, cache ops).
+event:0xde counters:0,1,2,3,4,5 um:zero minimum:500 name:L1_CACHE_LOAD_MISSES : Data L1 cache load misses.
+event:0xdf counters:0,1,2,3,4,5 um:zero minimum:500 name:L1_CACHE_STORE_MISSES : Data L1 cache store misses.
+event:0xe0 counters:0,1,2,3,4,5 um:zero minimum:500 name:LMQ_ALLOCATED_LOADS : Loads that allocate into Load Miss Queue. (Data L1 cache misses, but may not be to different cache lines).
+event:0xe1 counters:0,1,2,3,4,5 um:zero minimum:500 name:LOAD_THREAD_MISS_COLLISION : Number of times that this thread's load hits a line that is valid for the other thread but not this thread.
+event:0xe2 counters:0,1,2,3,4,5 um:zero minimum:500 name:INTERTHREAD_STATUS_ARRAY_COLLISION : Number of times that two threads collide on status array access.
+event:0xe3 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_ALLOC : Number of Store Gather Buffer allocates.
+event:0xe4 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_GATHERS : Number of Store Gather Buffer gathers.
+event:0xe5 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_OVERFLOWS : Number of Store Gather Buffer overflows. (Causes SGB full condition when additional store request is made).
+event:0xe6 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_PROMOTIONS : Number of Store Gather Buffer promotions.
+event:0xe7 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_INORDER_PROMOTIONS : Number of Store Gather Buffer in-order promotions. (Also includes oldest-entry timeout condition).
+event:0xe8 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_OUTOFORDER_PROMOTIONS : Number of Store Gather Buffer out-of-order promotions.
+event:0xe9 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_HP_PROMOTIONS : Number of Store Gather Buffer high-priority promotions. (Load hits on pending store).
+event:0xea counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_MISO_PROMOTIONS : Number of Store Gather Buffer miso promotions. (Load hits on pending store).
+event:0xeb counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_WATERMARK_PROMOTIONS : Number of Store Gather Buffer watermark promotions.
+event:0xec counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_SGB_OVERFLOW_PROMOTIONS : Number of Store Gather Buffer overflow promotions.
+event:0xed counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_DLAQ_FULL : Number of cycles the DLink Age Queue is full.
+event:0xee counters:0,1,2,3,4,5 um:zero minimum:500 name:TIMES_DLAQ_FULL : Number of times the DLink Age Queue is full.
+event:0xef counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_LRSAQ_FULL : Number of cycles the Load Reservation Set Age Queue is full.
+event:0xf0 counters:0,1,2,3,4,5 um:zero minimum:500 name:TIMES_LRSAQ_FULL : Number of times the Load Reservation Set Age Queue is full.
+event:0xf1 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_FWDAQ_FULL : Number of cycles the Forward Age Queue is full.
+event:0xf2 counters:0,1,2,3,4,5 um:zero minimum:500 name:TIMES_FWDAQ_FULL : Number of times the Forward Age Queue is full.
+event:0xf3 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FWD_STQ_COLLISION_TIMES : Number of times a Store Queue collision is forwardable. The following cases are not forwardable: store address + size does not contain the load, cache-inhibited store, denormalized floating point store, stcx, guarded load.
+event:0xf4 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FWD_STQ_COLLISION_TIMES_DATA_RDY : Number of times a Store Queue collision is forwardable and is ready with data to forward.
+event:0xf5 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FWD_STQ_COLLISION_TIMES_DATA_NORDY : Number of times a Store Queue collision is forwardable but is not ready with data to forward.
+event:0xf6 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_NOFWD_STQ_COLLISION_TIMES : Number of times a Store Queue collision is not forwardable and must wait until the store leaves the Store Queue.
+event:0xf7 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FWD_STQ_COLLISION_CLK : Number of cycles a Store Queue collision is forwardable. (Number of cycles from the detection of a forwardable Store Queue entry until the load is replayed in stg1).
+event:0xf8 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FWD_STQ_COLLISION_CLK_DATA_RDY : Number of cycles a Store Queue collision is forwardable and is ready with data to forward. (Number of cycles from the detection of a forwardable Store Queue entry with valid data until the load is replayed in stg1).
+event:0xf9 counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FWD_STQ_COLLISION_CLK_DATA_NORDY : Number of cycles a Store Queue collision is forwardable but is not ready with data to forward. (Number of cycles from the detection of a forwardable Store Queue entry without valid data until the load is replayed in stg1).
+event:0xfa counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_NOFWD_STQ_COLLISION_CLK : Number of cycles a Store Queue collision is not forwardable and has to wait until the store leaves the Store Queue. (Number of cycles from the detection of a non-forwardable Store Queue entry until the load is replayed in stg1).
+event:0xfb counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_FALSE_EA_COLLISION : Number of times the lower 12 bits of the EA matched but the upper bits did not, leading to a false load-on-store replay. Cycle penalty is 4x the number of times.
+event:0xfc counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_LSO_BUS_COLLISION : Number of LS0 result bus collisions. Cycle penalty is 3x this measurement.
+event:0xfd counters:0,1,2,3,4,5 um:zero minimum:500 name:NUM_INTERTHREAD_DBLWORKD_BANK_COLLISION : Number of inter-thread double-word bank collisions. Measures when both threads attempt to access the same double-word bank. Cycle penalty is 3x this measurement.
+event:0xfe counters:0,1,2,3,4,5 um:zero minimum:500 name:L1_CACHE_IM : Instruction L1 cache demand fetch misses. (Includes icbtls. Does not include prefetch).
+event:0x100 counters:0,1,2,3,4,5 um:zero minimum:500 name:IMMU_MISSES : Counts misses in the level 1 Instruction MMU.
+event:0x101 counters:0,1,2,3,4,5 um:zero minimum:500 name:IMMU_TLB4K_HITS : Counts hits in the level 1 Instruction MMU TLB-4K.
+event:0x102 counters:0,1,2,3,4,5 um:zero minimum:500 name:IMMU_VSP_HITS : Counts hits in the level 1 Instruction MMU VSP.
+event:0x103 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_IMMU_HW_TABLEWALK : Counts IMMU cycles spent in hardware tablewalk. This represents the cycles from the point where the L2 MMU miss occurs to when the page table walk completes with a valid translation or exception.
+event:0x104 counters:0,1,2,3,4,5 um:zero minimum:500 name:DMMU_MISSES : Counts misses in the level 1 Data MMU. (Does not count replayed operations).
+event:0x105 counters:0,1,2,3,4,5 um:zero minimum:500 name:DMMU_TLB4K_HITS : Counts hits in the level 1 Data MMU TLB-4K. (Does not count replayed operations).
+event:0x106 counters:0,1,2,3,4,5 um:zero minimum:500 name:DMMU_VSP_HITS : Counts hits in the level 1 Data MMU VSP. (Does not count replayed operations).
+event:0x107 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_DMMU_HW_TABLEWALK : Counts DMMU cycles spent in hardware tablewalk. This represents the cycles from the point where the L2 MMU miss occurs to when the page table walk completes with a valid translation or exception.
+event:0x108 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2MMU_MISSES : Counts level 2 MMU misses. (Does not count misses that occur due to dcbt / dcbtst / dcba / dcbal instructions that fail translation and are no-oped. Does not count misses in L2MMU-VSP when looking up an indirect entry).
+event:0x109 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2MMU_4K_HITS : Counts level 2 MMU hits in L2MMU-4K.
+event:0x10a counters:0,1,2,3,4,5 um:zero minimum:500 name:L2MMU_VSP_HITS : Counts level 2 MMU hits in L2MMU-VSP. (Does not count indirect lookups).
+event:0x10b counters:0,1,2,3,4,5 um:zero minimum:500 name:L2MMU_INDIRECT_MISSES : Counts level 2 MMU indirect misses. This represents indirect entry lookups that do not have a matching indirect entry.
+event:0x10c counters:0,1,2,3,4,5 um:zero minimum:500 name:L2MMU_INDIRECT_VALID_MISSES : Counts level 2 MMU indirect valid misses. This occurs when the indirect entry is valid, but the corresponding PTE[V] = 0 or the permissions in the PTE are not sufficient for the requested access.
+event:0x10d counters:0,1,2,3,4,5 um:zero minimum:500 name:LRAT_MISSES : Counts Logical to Real Address Translation misses. This includes LRAT misses from tlbwe instructions or from page table translations.
+event:0x110 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_LMQ_LOSE_DLINK_DUE_SGB : Cycles the Load Miss Queue loses DLINK arbitration due to the Store Gather Buffer.
+event:0x111 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_SGB_LOSE_DLINK_DUE_LMQ : Cycles the Store Gather Buffer loses DLINK arbitration due to the Load Miss Queue.
+event:0x112 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_THREAD_LOSE_DLINK_DUE_OTHER_THREAD : Cycles thread loses DLINK arbitration due to other thread.
+event:0x116 counters:0,1,2,3,4,5 um:zero minimum:500 name:DECODE_MASK_VALUE : One mask/value pair that allows instructions to be counted in Decode.
+event:0x1bb counters:0,1,2,3,4,5 um:zero minimum:500 name:SHR_L2_DLINK_REQ : Number of DLINK requests made from core to Shared L2.
+event:0x1bc counters:0,1,2,3,4,5 um:zero minimum:500 name:SHR_L2_ILINK_REQ : Number of ILINK requests made from core to Shared L2. (Includes instruction fetches and L2MMU hardware tablewalk requests).
+event:0x1bd counters:0,1,2,3,4,5 um:zero minimum:500 name:SHR_L2_RLINK_REQ : Number of RLINK requests made from Shared L2 to core. (back invalidates, stashes, barriers).
+event:0x1be counters:0,1,2,3,4,5 um:zero minimum:500 name:SHR_L2_BLINK_REQ : Number of BLINK requests made from Shared L2 to core. (back invalidates, stashes, barriers).
+event:0x1bf counters:0,1,2,3,4,5 um:zero minimum:500 name:SHR_L2_CLINK_REQ : Number of CLINK requests made from Shared L2 to core. (back invalidates, stashes, barriers).
+event:0x1c8 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_HITS : Number of L2 Cache hits. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1c9 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_MISSES : Number of L2 Cache misses. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1ca counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_DEMAND_ACCESS : Number of L2 Cache demand accesses. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1cb counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_ACCESSES : Number of L2 Cache accesses from all sources (demand, reload, snoop, etc). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1cc counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_STORE_ALLOCATE : Number of L2 Cache store allocates. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1cd counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_INSTRUCTIONS_ACCESS : Number of L2 Cache instruction accesses. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1ce counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_DATA_ACCESS : Number of L2 Cache data accesses. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1cf counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_INSTRUCTIONS_MISSES : Number of L2 Cache instruction misses. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d0 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_DATA_MISSES : Number of L2 Cache data misses. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d1 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_HITS_PER_THREAD : Number of times this core/thread hits in the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d2 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_MISSES_PER_THREAD : Number of times this core/thread misses in the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d3 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_DEMAND_ACCESS_PER_THREAD : Number of times this core/thread makes a demand access to the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d4 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_STORE_ALLOC_PER_THREAD : Number of times a store from this core/thread allocates in the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d5 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_INSTRUCTIONS_ACCESS_PER_THREAD : Number of times an instruction from this core/thread accesses the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d6 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_DATA_ACCESS_PER_THREAD : Number of times a data operation from this core/thread accesses the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d7 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_INSTRUCTION_MISSES_PER_THREAD : Number of times an instruction from this core/thread misses in the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d8 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_DATA_MISSES_PER_THREAD : Number of times a data operation from this core/thread misses in the L2 Cache. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1d9 counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_RELOAD_FROM_CORENET : Number of L2 Cache reloads from CoreNet. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1da counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_IN_STASH_REQ : Number of incoming L2 Cache stash requests. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1db counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_STASH_REQ_DOWNGRD_TO_SNOOPS : Number of incoming L2 Cache stash requests downgraded to snoops. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1dc counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_SNOOPS_HITS : Number of L2 Cache snoop hits. Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1dd counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_SNOOPS_MINT : Number of L2 Cache snoops causing MINT.
+event:0x1de counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_SNOOPS_SINT : Number of L2 Cache snoops causing SINT.
+event:0x1df counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_SNOOPS_PUSHES : Number of L2 Cache snoop pushes.
+event:0x1e0 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_BIB_STALL : Stall for Back Invalidate Buffer entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1e2 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_RLT_STALL : Stall for Reload Table entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1e4 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_RLFQ_STALL : Stall for Reload Fold Queue entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1e6 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_DTQ_STALL : Stall for Data Transaction Queue entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1e8 counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_COB_STALL : Stall for Castout Buffer entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1ea counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_WDB_STALL : Stall for Write Data Buffer entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1ec counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_RLDB_STALL : Stall for Reload Data Buffer entry (cycles). Counts 0, 1, 2, 3, or 4 per cycle.
+event:0x1ee counters:0,1,2,3,4,5 um:zero minimum:500 name:CLK_SNPQ_STALL : Stall for Snoop Queue entry (cycles).
+event:0x1fa counters:0,1,2,3,4,5 um:zero minimum:500 name:BIU_MASTER_REQ : Master transaction starts. (Number of AOut sent to CoreNet).
+event:0x1fb counters:0,1,2,3,4,5 um:zero minimum:500 name:BIU_MASTER_GLOBAL_REQ : Master transaction starts that are global. (Number of AOut with M=1 sent to CoreNet).
+event:0x1fc counters:0,1,2,3,4,5 um:zero minimum:500 name:BIU_MASTER_DATA_SIDE_REQ : Master data-side transaction starts. (Number of D-side AOut sent to CoreNet).
+event:0x1fd counters:0,1,2,3,4,5 um:zero minimum:500 name:BIU_MASTER_INSTRUCTION_SIDE_REQ : Master instruction-side transaction starts. (Number of I-side AOut sent to CoreNet).
+event:0x1fe counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_STASH_REQ : Stash request on AIn matches stash IDs for core or L2.
+event:0x1ff counters:0,1,2,3,4,5 um:zero minimum:500 name:L2_SNOOP_REQ : Externally generated snoop requests. (Number of AIn from CoreNet not from self).
+
diff --git a/events/ppc/e6500/unit_masks b/events/ppc/e6500/unit_masks
new file mode 100644
index 0000000..b7e7a23
--- /dev/null
+++ b/events/ppc/e6500/unit_masks
@@ -0,0 +1,4 @@
+# e6500 possible unit masks
+#
+name:zero type:mandatory default:0x0
+ 0x0 no unit mask
diff --git a/events/ppc64/architected_events_v1/events b/events/ppc64/architected_events_v1/events
new file mode 100644
index 0000000..a52d9ee
--- /dev/null
+++ b/events/ppc64/architected_events_v1/events
@@ -0,0 +1,62 @@
+#
+# Copyright OProfile authors
+# Copyright (c) International Business Machines, 2013.
+# Contributed by Maynard Johnson .
+#
+# IBM Power Architected Events -- Version 1: Power ISA 2.07
+
+# Manually add CYCLES for backward compatibility for default event
+event:0x100f0 counters:0 um:zero minimum:100000 name:CYCLES : Cycles
+
+event:0x100f2 counters:0 um:zero minimum:100000 name:PM_1PLUS_PPC_CMPL : 1 or more ppc insts finished (completed).
+event:0x400f2 counters:3 um:zero minimum:100000 name:PM_1PLUS_PPC_DISP : Cycles at least one Instr Dispatched. Could be a group with only microcode. Issue HW016521
+event:0x100fa counters:0 um:zero minimum:100000 name:PM_ANY_THRD_RUN_CYC : Any thread in run_cycles (was one thread in run_cycles).
+event:0x400f6 counters:3 um:zero minimum:10000 name:PM_BR_MPRED_CMPL : Number of Branch Mispredicts.
+event:0x200fa counters:1 um:zero minimum:10000 name:PM_BR_TAKEN_CMPL : Branch Taken.
+event:0x1e counters:0,1,2,3 um:zero minimum:100000 name:PM_CYC : Cycles.
+event:0x200fe counters:1 um:zero minimum:10000 name:PM_DATA_FROM_L2MISS : Demand LD - L2 Miss (not L2 hit).
+event:0x300fe counters:2 um:zero minimum:10000 name:PM_DATA_FROM_L3MISS : Demand LD - L3 Miss (not L2 hit and not L3 hit).
+event:0x400fe counters:3 um:zero minimum:10000 name:PM_DATA_FROM_MEM : Data cache reload from memory (including L4).
+event:0x300fc counters:2 um:zero minimum:10000 name:PM_DTLB_MISS : Data PTEG Reloaded (DTLB Miss).
+event:0x200f8 counters:1 um:zero minimum:10000 name:PM_EXT_INT : external interrupt.
+event:0x100f4 counters:0 um:zero minimum:10000 name:PM_FLOP : Floating Point Operations Finished.
+event:0x400f8 counters:3 um:zero minimum:10000 name:PM_FLUSH : Flush (any type).
+event:0x100f8 counters:0 um:zero minimum:10000 name:PM_GCT_NOSLOT_CYC : Pipeline empty (No itags assigned, no GCT slots used).
+event:0x100f6 counters:0 um:zero minimum:10000 name:PM_IERAT_RELOAD : IERAT Reloaded (Miss).
+event:0x200f2 counters:1 um:zero minimum:100000 name:PM_INST_DISP : PPC Dispatched.
+event:0x300fa counters:2 um:zero minimum:10000 name:PM_INST_FROM_L3MISS : Inst from L3 miss.
+event:0x400fc counters:3 um:zero minimum:10000 name:PM_ITLB_MISS : ITLB Reloaded.
+event:0x300f6 counters:2 um:zero minimum:10000 name:PM_L1_DCACHE_RELOAD_VALID : DL1 reloaded due to Demand Load.
+event:0x200fd counters:1 um:zero minimum:10000 name:PM_L1_ICACHE_MISS : Demand iCache Miss.
+event:0x3e054 counters:2 um:zero minimum:10000 name:PM_LD_MISS_L1 : Load Missed L1.
+event:0x200f6 counters:1 um:zero minimum:10000 name:PM_LSU_DERAT_MISS : DERAT Reloaded (Miss).
+event:0x301e4 counters:2 um:zero minimum:1000 name:PM_MRK_BR_MPRED_CMPL : Marked Branch Mispredicted.
+event:0x101e2 counters:0 um:zero minimum:1000 name:PM_MRK_BR_TAKEN_CMPL : Marked Branch Taken.
+event:0x401e8 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2MISS : Data cache reload L2 miss.
+event:0x201e4 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3MISS : The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load.
+event:0x201e0 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_MEM : The processor's data cache was reloaded from a memory location (including L4) from local, remote, or distant due to a marked load.
+event:0x301e6 counters:2 um:zero minimum:1000 name:PM_MRK_DERAT_MISS : ERAT Miss (TLB Access) All page sizes.
+event:0x401e4 counters:3 um:zero minimum:1000 name:PM_MRK_DTLB_MISS : Marked DTLB miss.
+event:0x401e0 counters:3 um:zero minimum:1000 name:PM_MRK_INST_CMPL : Marked instruction completed.
+event:0x101e0 counters:0 um:zero minimum:1000 name:PM_MRK_INST_DISP : Marked Instruction dispatched.
+event:0x401e6 counters:3 um:zero minimum:1000 name:PM_MRK_INST_FROM_L3MISS : n/a
+event:0x101e4 counters:0 um:zero minimum:1000 name:PM_MRK_L1_ICACHE_MISS : Marked L1 Icache Miss.
+event:0x101ea counters:0 um:zero minimum:1000 name:PM_MRK_L1_RELOAD_VALID : Marked demand reload.
+event:0x201e2 counters:1 um:zero minimum:1000 name:PM_MRK_LD_MISS_L1 : Marked DL1 Demand Miss counted at exec time.
+event:0x10134 counters:0 um:zero minimum:1000 name:PM_MRK_ST_CMPL : Marked store completed.
+event:0x600f4 counters:5 um:zero minimum:100000 name:PM_RUN_CYC : Run_cycles.
+event:0x500fa counters:4 um:zero minimum:100000 name:PM_RUN_INST_CMPL : Run_Instructions.
+event:0x400f4 counters:3 um:zero minimum:10000 name:PM_RUN_PURR : Run_PURR.
+event:0x200f0 counters:1 um:zero minimum:10000 name:PM_ST_FIN : Store Instructions Finished (store sent to nest).
+event:0x300f0 counters:2 um:zero minimum:10000 name:PM_ST_MISS_L1 : Store Missed L1.
+event:0x300f8 counters:2 um:zero minimum:10000 name:PM_TB_BIT_TRANS : timebase event.
+event:0x300f4 counters:2 um:zero minimum:100000 name:PM_THRD_CONC_RUN_INST : Concurrent Run Instructions.
+event:0x301ea counters:2 um:zero minimum:1000 name:PM_THRESH_EXC_1024 : Threshold counter exceeded a value of 1024.
+event:0x401ea counters:3 um:zero minimum:1000 name:PM_THRESH_EXC_128 : Threshold counter exceeded a value of 128.
+event:0x401ec counters:3 um:zero minimum:1000 name:PM_THRESH_EXC_2048 : Threshold counter exceeded a value of 2048.
+event:0x101e8 counters:0 um:zero minimum:1000 name:PM_THRESH_EXC_256 : Threshold counter exceeded a value of 256.
+event:0x201e6 counters:1 um:zero minimum:1000 name:PM_THRESH_EXC_32 : Threshold counter exceeded a value of 32.
+event:0x101e6 counters:0 um:zero minimum:1000 name:PM_THRESH_EXC_4096 : Threshold counter exceeded a value of 4096.
+event:0x201e8 counters:1 um:zero minimum:1000 name:PM_THRESH_EXC_512 : Threshold counter exceeded a value of 512.
+event:0x301e8 counters:2 um:zero minimum:1000 name:PM_THRESH_EXC_64 : Threshold counter exceeded a value of 64.
+event:0x101ec counters:0 um:zero minimum:10000 name:PM_THRESH_MET : threshold exceeded.
diff --git a/events/ppc64/ibm-compat-v1/unit_masks b/events/ppc64/architected_events_v1/unit_masks
similarity index 78%
rename from events/ppc64/ibm-compat-v1/unit_masks
rename to events/ppc64/architected_events_v1/unit_masks
index 170c53b..999ebfe 100644
--- a/events/ppc64/ibm-compat-v1/unit_masks
+++ b/events/ppc64/architected_events_v1/unit_masks
@@ -1,6 +1,6 @@
#
# Copyright OProfile authors
-# Copyright (c) International Business Machines, 2009.
+# Copyright (c) International Business Machines, 2013.
# Contributed by Maynard Johnson .
#
# ppc64 compat mode version 1 possible unit masks
diff --git a/events/ppc64/cell-be/events b/events/ppc64/cell-be/events
deleted file mode 100644
index 3bcb393..0000000
--- a/events/ppc64/cell-be/events
+++ /dev/null
@@ -1,517 +0,0 @@
-#ppc64 Cell Broadband Engine events
-#
-# Copyright OProfile authors
-#
-#(C) COPYRIGHT International Business Machines Corp. 2006
-# Contributed by Maynard Johnson
-#
-#
-# As many as 4 signals may be specified when they are from the same group.
-# In some instances, signals from other groups in the same island or one
-# other island may also be specified.
-#
-# Each signal is assigned to a unique counter. There are 4 32-bit hardware
-# counters. The signals are defined in the Cell Broadband Engine
-# Performance manual.
-#
-# Each event is given a unique event number. The event number is used by the
-# Oprofile code to resolve event names for the postprocessing. This is done
-# to preserve compatibility with the rest of the Oprofile code. The event
-# number format group_num followed by the counter number for the event within
-# the group.
-
-# Signal Default
-event:0x1 counters:0,1,2,3 um:zero minimum:100000 name:CYCLES : Processor Cycles
-event:0x2 counters:0,1,2,3 um:zero minimum:60000 name:SPU_CYCLES : SPU Processor Cycles
-
-
-# Cell BE Island 2 - PowerPC Processing Unit (PPU)
-
-# CBE Signal Group 21 - PPU Instruction Unit - Group 1 (NClk)
-event:0x834 counters:0,1,2,3 um:PPU_01_edges minimum:10000 name:Branch_Commit : Branch instruction committed.
-event:0x835 counters:0,1,2,3 um:PPU_01_edges minimum:10000 name:Branch_Flush : Branch instruction that caused a misprediction flush is committed. Branch misprediction includes: (1) misprediction of taken or not-taken on conditional branch, (2) misprediction of branch target address on bclr[1] and bcctr[1].
-event:0x836 counters:0,1,2,3 um:PPU_01_cycles minimum:10000 name:Ibuf_Empty : Instruction buffer empty.
-event:0x837 counters:0,1,2,3 um:PPU_01_edges minimum:10000 name:IERAT_Miss : Instruction effective-address-to-real-address translation (I-ERAT) miss.
-event:0x838 counters:0,1,2,3 um:PPU_01_cycles_or_edges minimum:10000 name:IL1_Miss_Cycles : L1 Instruction cache miss cycles. Counts the cycles from the miss event until the returned instruction is dispatched or cancelled due to branch misprediction, completion restart, or exceptions (see Note 1).
-event:0x83a counters:0,1,2,3 um:PPU_01_cycles minimum:10000 name:Dispatch_Blocked : Valid instruction available for dispatch, but dispatch is blocked.
-event:0x83d counters:0,1,2,3 um:PPU_01_edges minimum:10000 name:Instr_Flushed : Instruction in pipeline stage EX7 causes a flush.
-event:0x83f counters:0,1,2,3 um:PPU_01_edges minimum:10000 name:PPC_Commit : Two PowerPC instructions committed. For microcode sequences, only the last microcode operation is counted. Committed instructions are counted two at a time. If only one instruction has committed for a given cycle, this event will not be raised until another instruction has been committed in a future cycle.
-
-
-# CBE Signal Group 22 - PPU Execution Unit (NClk)
-event:0x89a counters:0,1,2,3 um:PPU_01_cycles minimum:10000 name:DERAT_Miss : Data effective-address-to-real-address translation (D-ERAT) miss. Not speculative.
-event:0x89b counters:0,1,2,3 um:PPU_01_cycles minimum:10000 name:Store_Request : Store request counted at the L2 interface. Counts microcoded PPE sequences more than once (see Note 1 for exceptions). (Thread 0 and 1)
-event:0x89c counters:0,1,2,3 um:PPU_01_cycles minimum:10000 name:Load_Valid : Load valid at a particular pipe stage. Speculative, since flushed operations are counted as well. Counts microcoded PPE sequences more than once. Misaligned flushes might be counted the first time as well. Load operations include all loads that read data from the cache, dcbt and dcbtst. Does not include load Vector/SIMD multimedia extension pattern instructions.
-event:0x89d counters:0,1,2,3 um:PPU_01_cycles minimum:10000 name:DL1_Miss : L1 D-cache load miss. Pulsed when there is a miss request that has a tag miss but not an ERAT miss. Speculative, since flushed operations are counted as well.
-
-
-# Cell BE Island 3 - PowerPC Storage Subsystem (PPSS)
-
-# CBE Signal Group 31 - PPSS Bus Interface Unit (NClk/2)
-event:0xc1c counters:0,1,2,3 um:PPU_2_edges minimum:10000 name:rcv_mmio_rd_ev : Load from MFC memory-mapped I/O (MMIO) space.
-event:0xc1d counters:0,1,2,3 um:PPU_2_edges minimum:10000 name:rcv_mmio_wr_ev : Stores to MFC MMIO space.
-event:0xc22 counters:0,1,2,3 um:PPU_2_edges minimum:10000 name:even_token_req_ev : Request token for even memory bank numbers 0-14.
-event:0xc2b counters:0,1,2,3 um:PPU_2_edges minimum:10000 name:rcv_data_ev : Receive 8-beat data from the Element Interconnect Bus (EIB).
-event:0xc2c counters:0,1,2,3 um:PPU_2_edges minimum:10000 name:send_data_ev : Send 8-beat data to the EIB.
-event:0xc2d counters:0,1,2,3 um:PPU_2_edges minimum:10000 name:send_cmd_ev : Send a command to the EIB; includes retried commands.
-event:0xc2e counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:dgnt_dly_cy : Cycles between data request and data grant.
-event:0xc33 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:nc_wr_not_emp_cy : The five-entry Non-Cacheable Unit (NCU) Store Command queue not empty.
-
-
-# CBE Signal Group 32 - PPSS L2 Cache Controller - Group 1 (NClk/2)
-event:0xc80 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:cache_hit : Cache hit for core interface unit (CIU) loads and stores.
-event:0xc81 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:cache_miss : Cache miss for CIU loads and stores.
-event:0xc84 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:load_miss : CIU load miss.
-event:0xc85 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:store_miss : CIU store to Invalid state (miss).
-event:0xc87 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:larx_miss_th1 : Load word and reserve indexed (lwarx/ldarx) for Thread 0 hits Invalid cache state
-event:0xc8e counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:stcx_miss_th1 : Store word conditional indexed (stwcx/stdcx) for Thread 0 hits Invalid cache state when reservation is set.
-event:0xc99 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:all_snp_busy : All four snoop state machines busy.
-
-# CBE Signal Group 33 - PPSS L2 Cache Controller - Group 2 (NClk/2)
-event:0xce8 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:dclaim_srt : Data line claim (dclaim) that received good combined response; includes store/stcx/dcbz to Shared (S), Shared Last (SL),or Tagged (T) cache state; does not include dcbz to Invalid (I) cache state (see Note 1).
-event:0xcef counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:dclaim_to_rwitm : Dclaim converted into rwitm; may still not get to the bus if stcx is aborted (see Note 2).
-event:0xcf0 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:store_mxe : Store to modified (M), modified unsolicited (MU), or exclusive (E) cache state.
-event:0xcf1 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:stq_full : 8-entry store queue (STQ) full.
-event:0xcf2 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:store_rc_ack : Store dispatched to RC machine is acknowledged.
-event:0xcf3 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:gather_store : Gatherable store (type = 00000) received from CIU.
-event:0xcf6 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_push : Snoop push.
-event:0xcf7 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:intv_snode_er : Send intervention from (SL | E) cache state to a destination within the same CBE chip.
-event:0xcf8 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:intv_snode_mx : Send intervention from (M | MU) cache state to a destination within the same CBE chip.
-event:0xcfd counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_retry : Respond with Retry to a snooped request due to one of the following conflicts: read-and-claim state machine (RC) full address, castout (CO) congruence class, snoop (SNP) machine full address, all snoop machines busy, directory lockout, or parity error.
-event:0xcfe counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_busy_retry : Respond with Retry to a snooped request because all snoop machines are busy.
-event:0xcff counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_mx_to_est : Snooped response causes a cache state transition from (M | MU) to (E | S | T).
-event:0xd00 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_e_to_s : Snooped response causes a cache state transition from E to S.
-event:0xd01 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_esrt_to_i : Snooped response causes a cache state transition from (E | SL | S | T) to Invalid (I).
-event:0xd02 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:snp_mx_to_i : Snooped response causes a cache state transition from (M | MU) to I.
-
-# CBE Signal Group 34 - PPSS L2 Cache Controller - Group 3 (NClk/2)
-event:0xd54 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:larx_miss : Load and reserve indexed (lwarx/ldarx) for Thread 1 hits Invalid cache state.
-event:0xd5b counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:stcx_miss_th2 : Store conditional indexed (stwcx/stdcx) for Thread 1 hits Invalid cache state.
-
-# CBE Signal Group 35 - PPSS Non-Cacheable Unit (NClk/2)
-event:0xdac counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:st_req_any : Non-cacheable store request received from CIU; includes all synchronization operations such as sync and eieio.
-event:0xdad counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:st_req_sync : sync received from CIU.
-event:0xdb0 counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:st_req_store : Non-cacheable store request received from CIU; includes only stores.
-event:0xdb2 counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:st_req_eieio : eieio received from CIU.
-event:0xdb3 counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:st_req_tlbie : tlbie received from CIU.
-event:0xdb4 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stq_bot_sync : sync at the bottom of the store queue, while waiting on st_done signal from the Bus Interface Unit (BIU) and sync_done signal from L2.
-event:0xdb5 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stq_bot_lsync : lwsync at the bottom of the store queue, while waiting for a sync_done signal from the L2.
-event:0xdb6 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stq_bot_eieio : eieio at the bottom of the store queue, while waiting for a st_done signal from the BIU and a sync_done signal from the L2.
-event:0xdb7 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stq_bot_tlbieg : tlbie at the bottom of the store queue, while waiting for a st_done signal from the BIU.
-event:0xdb8 counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:st_combined : Non-cacheable store combined with the previous non-cacheable store with a contiguous address.
-event:0xdb9 counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:ld_cancel : Load request canceled by CIU due to late detection of load-hit-store condition (128B boundary).
-event:0xdba counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:ld_hit_st : NCU detects a load hitting a previous store to an overlapping address (32B boundary).
-event:0xdbb counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stb_full : All four store-gather buffers full.
-event:0xdbc counters:0,1,2,3 um:PPU_0_edges minimum:10000 name:ld_req : Non-cacheable load request received from CIU; includes instruction and data fetches.
-event:0xdbd counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stq_not_empty : The four-deep store queue not empty.
-event:0xdbe counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stq_full : The four-deep store queue full.
-event:0xdbf counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:stb_not_empty : At least one store gather buffer not empty.
-
-# Cell BE Island 4 - Synergistic Processor Unit (SPU)
-#
-# OPROFILE FOR CELL ONLY SUPPORTS PROFILING ON ONE SPU EVENT AT A TIME
-#
-# CBE Signal Group 41 - SPU (NClk)
-event:0x1004 counters:0 um:SPU_02_cycles minimum:10000 name:dual_instrctn_commit : Dual instruction committed.
-event:0x1005 counters:0 um:SPU_02_cycles minimum:10000 name:sngl_instrctn_commit : Single instruction committed.
-event:0x1006 counters:0 um:SPU_02_cycles minimum:10000 name:ppln0_instrctn_commit : Pipeline 0 instruction committed.
-event:0x1007 counters:0 um:SPU_02_cycles minimum:10000 name:ppln1_instrctn_commit : Pipeline 1 instruction committed.
-event:0x1008 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:instrctn_ftch_stll : Instruction fetch stall.
-event:0x1009 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:lcl_strg_bsy : Local storage busy.
-event:0x100A counters:0 um:SPU_02_cycles minimum:10000 name:dma_cnflct_ld_st : DMA may conflict with load or store.
-event:0x100B counters:0 um:SPU_02_cycles minimum:10000 name:str_to_lcl_strg : Store instruction to local storage issued.
-event:0x100C counters:0 um:SPU_02_cycles minimum:10000 name:ld_frm_lcl_strg : Load intruction from local storage issued.
-event:0x100D counters:0 um:SPU_02_cycles minimum:10000 name:fpu_exctn : Floating-Point Unit (FPU) exception.
-event:0x100E counters:0 um:SPU_02_cycles minimum:10000 name:brnch_instrctn_commit : Branch instruction committed.
-event:0x100F counters:0 um:SPU_02_cycles minimum:10000 name:change_of_flow : Non-sequential change of the SPU program counter, which can be caused by branch, asynchronous interrupt, stalled wait on channel, error correction code (ECC) error, and so forth.
-event:0x1010 counters:0 um:SPU_02_cycles minimum:10000 name:brnch_not_tkn : Branch not taken.
-event:0x1011 counters:0 um:SPU_02_cycles minimum:10000 name:brnch_mss_prdctn : Branch miss prediction; not exact. Certain other code sequences can cause additional pulses on this signal (see Note 2).
-event:0x1012 counters:0 um:SPU_02_cycles minimum:10000 name:brnch_hnt_mss_prdctn : Branch hint miss prediction; not exact. Certain other code sequences can cause additional pulses on this signal (see Note 2).
-event:0x1013 counters:0 um:SPU_02_cycles minimum:10000 name:instrctn_seqnc_err : Instruction sequence error.
-event:0x1015 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl_wrt : Stalled waiting on any blocking channel write (see Note 3).
-event:0x1016 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl0 : Stalled waiting on External Event Status (Channel 0) (see Note 3).
-event:0x1017 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl3 : Stalled waiting on Signal Notification 1 (Channel 3) (see Note 3).
-event:0x1018 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl4 : Stalled waiting on Signal Notification 2 (Channel 4) (see Note 3).
-event:0x1019 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl21 : Stalled waiting on DMA Command Opcode or ClassID Register (Channel 21) (see Note 3).
-event:0x101A counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl24 : Stalled waiting on Tag Group Status (Channel 24) (see Note 3).
-event:0x101B counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl25 : Stalled waiting on List Stall-and-Notify Tag Status (Channel 25) (see Note 3).
-event:0x101C counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl28 : Stalled waiting on PPU Mailbox (Channel 28) (see Note 3).
-event:0x1022 counters:0 um:SPU_02_cycles_or_edges minimum:10000 name:stlld_wait_on_chnl29 : Stalled waiting on SPU Mailbox (Channel 29) (see Note 3).
-
-
-# CBE Signal Group 42 - SPU Trigger (NClk)
-event:0x10A1 counters:0 um:SPU_Trigger_cycles_or_edges minimum:10000 name:stld_wait_chnl_op : Stalled waiting on channel operation (See Note 2).
-
-# CBE Signal Group 43 - SPU Event (NClk)
-event:0x1107 counters:0 um:SPU_Event_cycles_or_edges minimum:10000 name:instrctn_ftch_stll : Instruction fetch stall.
-
-# Cell BE Island 6 - Element Interconnect Bus (EIB)
-
-# CBE Signal Group 61 - EIB Address Concentrator 0 (NClk/2)
-event:0x17d4 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(0) : Number of read and rwitm commands (including atomic) AC1 to AC0. (Group 1)
-event:0x17d5 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(1) : Number of dclaim commands (including atomic) AC1 to AC0. (Group 1)
-event:0x17d6 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(2) : Number of wwk, wwc, and wwf commands from AC1 to AC0. (Group 1)
-event:0x17d7 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(3) : Number of sync, tlbsync, and eieio commands from AC1 to AC0. (Group 1)
-event:0x17d8 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(4) : Number of tlbie commands from AC1 to AC0. (Group 1)
-event:0x17df counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_CAM_PERF(1) : Previous adjacent address match (PAAM) Content Addressable Memory (CAM) hit. (Group 1)
-event:0x17e0 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_CAM_PERF(2) : PAAM CAM miss. (Group 1)
-event:0x17e2 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_CAM_CMD_REFLECTED : Command reflected. (Group 1)
-event:0x17e4 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(0) : Number of read and rwitm commands (including atomic) AC1 to AC0. (Group 2)
-event:0x17e5 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(1) : Number of dclaim commands (including atomic) AC1 to AC0. (Group 2)
-event:0x17e6 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(2) : Number of wwk, wwc, and wwf commands from AC1 to AC0. (Group 2)
-event:0x17e7 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(3) : Number of sync, tlbsync, and eieio commands from AC1 to AC0. (Group 2)
-event:0x17e8 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_ICMD_PERF(4) : Number of tlbie commands from AC1 to AC0. (Group 2)
-event:0x17ef counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_CAM_PERF(1) : PAAM CAM hit. (Group 2)
-event:0x17f0 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_CAM_PERF(2) : PAAM CAM miss. (Group 2)
-event:0x17f2 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC0_W_CAM_CMD_REFLECTED : Command reflected. (Group 2)
-
-# CBE Signal Group 62 - EIB Address Concentrator 1 (NClk/2)
-event:0x1839 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(1) : Local command from SPE 6.
-event:0x183a counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(2) : Local command from SPE 4.
-event:0x183b counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(3) : Local command from SPE 2.
-event:0x183c counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(4) : Local command from SPE 0.
-event:0x183d counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(5) : Local command from PPE.
-event:0x183e counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(6) : Local command from SPE 1.
-event:0x183f counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(7) : Local command from SPE 3.
-event:0x1840 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(8) : Local command from SPE 5.
-event:0x1841 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(9) : Local command from SPE 7.
-event:0x1844 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(12) : AC1-to-AC0 global command from SPE 6.
-event:0x1845 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(13) : AC1-to-AC0 global command from SPE 4.
-event:0x1846 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(14) : AC1-to-AC0 global command from SPE 2.
-event:0x1847 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(15) : AC1-to-AC0 global command from SPE 0.
-event:0x1848 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(16) : AC1-to-AC0 global command from PPE.
-event:0x1849 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(17) : AC1-to-AC0 global command from SPE 1.
-event:0x184a counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(18) : AC1-to-AC0 global command from SPE 3.
-event:0x184b counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(19) : AC1-to-AC0 global command from SPE 5.
-event:0x184c counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(20) : AC1-to-AC0 global command from SPE 7.
-event:0x184f counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(23) : AC1 sends a global command to AC0.
-event:0x1850 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(24) : AC0 reflects a global command back to AC1.
-event:0x1851 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WAC1_WAC1_TRCMUX_W_TRCGRP_ACPERF(25) : AC1 reflects a command back to the bus masters.
-
-# CBE Signal Group 63 - EIB Data Ring Arbitrator - Group 1 (NClk/2)
-event:0x189c counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(0) : Grant on data ring 0.
-event:0x189d counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(1) : Grant on data ring 1.
-event:0x189e counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(2) : Grant on data ring 2.
-event:0x189f counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(3) : Grant on data ring 3.
-event:0x18a0 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(4) : Data ring 0 is in use.
-event:0x18a1 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(5) : Data ring 1 is in use.
-event:0x18a2 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(6) : Data ring 2 is in use.
-event:0x18a3 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(7) : Data ring 3 is in use.
-event:0x18a4 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(8) : All data rings are idle.
-event:0x18a5 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(9) : One data ring is busy.
-event:0x18a6 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(10) : Two or three data rings are busy.
-event:0x18a7 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPE(11) : All data rings are busy.
-event:0x18a8 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(12) : BIC data request pending.
-event:0x18a9 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(13) : SPE 6 data request pending.
-event:0x18aa counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(14) : SPE 4 data request pending.
-event:0x18ab counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(15) : SPE 2 data request pending.
-event:0x18ac counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(16) : SPE 0 data request pending.
-event:0x18ad counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(17) : MIC data request pending.
-event:0x18ae counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(18) : PPE data request pending.
-event:0x18af counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(19) : SPE 1 data request pending.
-event:0x18b0 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(20) : SPE 3 data request pending.
-event:0x18b1 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(21) : SPE 5 data request pending.
-event:0x18b2 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(22) : SPE 7 data request pending.
-event:0x18b3 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPE(23) : IOC data request pending.
-event:0x18b4 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(24) : BIC is data destination.
-event:0x18b5 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(25) : SPE 6 is data destination.
-event:0x18b6 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(26) : SPE 4 is data destination.
-event:0x18b7 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(27) : SPE 2 is data destination.
-event:0x18b8 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(28) : SPE 0 is data destination.
-event:0x18b9 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(29) : MIC is data destination.
-event:0x18ba counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(30) : PPE is data destination.
-event:0x18bb counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPE(31) : SPE 1 is data destination.
-
-# CBE Signal Group 64 - EIB Data Ring Arbitrator - Group 2 (NClk/2)
-event:0x1900 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(0) : BIC data request pending.
-event:0x1901 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(1) : SPE 6 data request pending.
-event:0x1902 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(2) : SPE 4 data request pending.
-event:0x1903 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(3) : SPE 2 data request pending.
-event:0x1904 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(4) : SPE 0 data request pending.
-event:0x1905 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(5) : MIC data request pending.
-event:0x1906 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(6) : PPE data request pending.
-event:0x1907 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(7) : SPE 1 data request pending.
-event:0x1908 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(8) : SPE 3 data request pending.
-event:0x1909 counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(9) : SPE 5 data request pending.
-event:0x190a counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(10) : SPE 7 data request pending.
-event:0x190b counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:WDA_DTRC_TRCGRPF(11) : IOC data request pending.
-event:0x190c counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(12) : BIC is data destination.
-event:0x190d counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(13) : SPE 6 is data destination.
-event:0x190e counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(14) : SPE 4 is data destination.
-event:0x190f counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(15) : SPE 2 is data destination.
-event:0x1910 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(16) : SPE 0 is data destination.
-event:0x1911 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(17) : MIC is data destination.
-event:0x1912 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(18) : PPE is data destination.
-event:0x1913 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(19) : SPE 1 is data destination.
-event:0x1914 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(20) : SPE 3 is data destination.
-event:0x1915 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(21) : SPE 5 is data destination.
-event:0x1916 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(22) : SPE 7 is data destination.
-event:0x1917 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(23) : IOC is data destination.
-event:0x1918 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(24) : Grant on data ring 0.
-event:0x1919 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(25) : Grant on data ring 1.
-event:0x191a counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(26) : Grant on data ring 2.
-event:0x191b counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:WDA_DTRC_TRCGRPF(27) : Grant on data ring 3.
-event:0x191c counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPF(28) : All data rings are idle.
-event:0x191d counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPF(29) : One data ring is busy.
-event:0x191e counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPF(30) : Two or three data rings are busy.
-event:0x191f counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:WDA_DTRC_TRCGRPF(31) : All four data rings are busy.
-
-# CBE Signal Group 651 - EIB Token Manager - Group A0/B0 (NClk/2)
-event:0xfe4c counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_e_unused : Even XIO token unused by RAG 0.
-event:0xfe4d counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_o_unused : Odd XIO token unused by RAG 0.
-event:0xfe4e counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_e_unused : Even bank token unused by RAG 0.
-event:0xfe4f counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_o_unused : Odd bank token unused by RAG 0.
-event:0xfe54 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc0 : Token granted for SPE 0.
-event:0xfe55 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc1 : Token granted for SPE 1.
-event:0xfe56 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc2 : Token granted for SPE 2.
-event:0xfe57 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc3 : Token granted for SPE 3.
-event:0xfe58 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc4 : Token granted for SPE 4.
-event:0xfe59 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc5 : Token granted for SPE 5.
-event:0xfe5a counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc6 : Token granted for SPE 6.
-event:0xfe5b counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:token_granted_spc7 : Token granted for SPE 7.
-
-
-# CBE Signal Group 652 - EIB Token Manager - Group A1/B1 (NClk/2)
-event:0xfeb0 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_e_wasted : Even XIO token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.
-event:0xfeb1 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_o_wasted : Odd XIO token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.
-event:0xfeb2 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_e_wasted : Even bank token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.
-event:0xfeb3 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_o_wasted : Odd bank token wasted by RAG 0; valid only when Unused Enable (UE) = 1 in TKM_CR register.
-event:0xfebc counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_xio_e_wasted : Even XIO token wasted by RAG U.
-event:0xfebd counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_xio_o_wasted : Odd XIO token wasted by RAG U.
-event:0xfebe counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_bank_e_wasted : Even bank token wasted by RAG U.
-event:0xfebf counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_bank_o_wasted : Odd bank token wasted by RAG U.
-
-# CBE Signal Group 653 - EIB Token Manager - Group A2/B2 (NClk/2)
-event:0xff14 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_e_shared_to_rag1 : Even XIO token from RAG 0 shared with RAG 1
-event:0xff15 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_e_shared_to_rag2 : Even XIO token from RAG 0 shared with RAG 2
-event:0xff16 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_e_shared_to_rag3 : Even XIO token from RAG 0 shared with RAG 3
-event:0xff17 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_o_shared_to_rag1 : Odd XIO token from RAG 0 shared with RAG 1
-event:0xff18 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_o_shared_to_rag2 : Odd XIO token from RAG 0 shared with RAG 2
-event:0xff19 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_xio_o_shared_to_rag3 : Odd XIO token from RAG 0 shared with RAG 3
-event:0xff1a counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_e_shared_to_rag1 : Even bank token from RAG 0 shared with RAG 1
-event:0xff1b counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_e_shared_to_rag2 : Even bank token from RAG 0 shared with RAG 2
-event:0xff1c counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_e_shared_to_rag3 : Even bank token from RAG 0 shared with RAG 3
-event:0xff1d counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_o_shared_to_rag1 : Odd bank token from RAG 0 shared with RAG 1
-event:0xff1e counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_o_shared_to_rag2 : Odd bank token from RAG 0 shared with RAG 2
-event:0xff1f counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag0_bank_o_shared_to_rag3 : Odd bank token from RAG 0 shared with RAG 3
-
-
-# CBE Signal Group 654 - EIB Token Manager - Group A0/B0 (NClk/2)
-# Repeat of the 65400, 65401, 65402, 65403, 65416, 65417, 65418, 65419 events
-
-
-# CBE Signal Group 655 - EIB Token Manager - Group A1/B1 (NClk/2)
-#repeat of the 65200 events
-
-
-# CBE Signal Group 656 - EIB Token Manager - Group A2/B2 (NClk/2)
-event:0x1004f counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_bank_o_shared_to_rag0 : Odd bank token from RAG U shared with RAG 0
-event:0x10050 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_xio_e_shared_to_rag0 : Even XIO token from RAG 1 shared with RAG 0
-event:0x10051 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_xio_e_shared_to_rag2 : Even XIO token from RAG 1 shared with RAG 2
-event:0x10052 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_xio_e_shared_to_rag3 : Even XIO token from RAG 1 shared with RAG 3
-event:0x10053 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_xio_o_shared_to_rag0 : Odd XIO token from RAG 1 shared with RAG 0
-event:0x10054 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_xio_o_shared_to_rag2 : Odd XIO token from RAG 1 shared with RAG 2
-event:0x10055 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_xio_o_shared_to_rag3 : Odd XIO token from RAG 1 shared with RAG 3
-event:0x10056 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_bank_e_shared_to_rag0 : Even bank token from RAG 1 shared with RAG 0
-event:0x10057 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_bank_e_shared_to_rag2 : Even bank token from RAG 1 shared with RAG 2
-event:0x10058 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_bank_e_shared_to_rag3 : Even bank token from RAG 1 shared with RAG 3
-event:0x10059 counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_bank_o_shared_to_rag0 : Odd bank token from RAG 1 shared with RAG 0
-event:0x1005a counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_bank_o_shared_to_rag2 : Odd bank token from RAG 1 shared with RAG 2
-event:0x1005b counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:rag1_bank_o_shared_to_rag3 : Odd bank token from RAG 1 shared with RAG 3
-event:0x1005c counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_xio_e_shared_to_rag1 : Even XIO token from RAG U shared with RAG 1
-event:0x1005d counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_xio_o_shared_to_rag1 : Odd XIO token from RAG U shared with RAG 1
-event:0x1005e counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_bank_e_shared_to_rag1 : Even bank token from RAG U shared with RAG 1
-event:0x1005f counters:0,1,2,3 um:PPU_0_cycles minimum:10000 name:ragu_bank_o_shared_to_rag1 : Odd bank token from RAG U shared with RAG 1
-
-# CBE Signal Group 657 - EIB Token Manager - Group C0/D0 (NClk/2)
-event:0x100e4 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_e_unused : Even XIO token unused by RAG 2
-event:0x100e5 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_o_unused : Odd XIO token unused by RAG 2
-event:0x100e6 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_e_unused : Even bank token unused by RAG 2
-event:0x100e7 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_o_unused : Odd bank token unused by RAG 2
-event:0x100e8 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif0_in_unused : IOIF0 In token unused by RAG 0
-event:0x100e9 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif0_out_unused : IOIF0 Out token unused by RAG 0
-event:0x100ea counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif1_in_unused : IOIF1 In token unused by RAG 0
-event:0x100eb counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif1_out_unused : IOIF1 Out token unused by RAG 0
-
-
-# CBE Signal Group 658 - EIB Token Manager - Group C1/D1 (NClk/2)
-event:0x10148 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_e_wasted : Even XIO token wasted by RAG 2
-event:0x10149 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_o_wasted : Odd XIO token wasted by RAG 2
-event:0x1014a counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_e_wasted : Even bank token wasted by RAG 2
-event:0x1014b counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_o_wasted : Odd bank token wasted by RAG 2
-
-
-# CBE Signal Group 659 - EIB Token Manager - Group C2/D2 (NClk/2)
-event:0x101ac counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_e_shared_to_rag0 : Even XIO token from RAG 2 shared with RAG 0
-event:0x101ad counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_e_shared_to_rag1 : Even XIO token from RAG 2 shared with RAG 1
-event:0x101ae counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_e_shared_to_rag3 : Even XIO token from RAG 2 shared with RAG 3
-event:0x101af counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_o_shared_to_rag0 : Odd XIO token from RAG 2 shared with RAG 0
-event:0x101b0 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_o_shared_to_rag1 : Odd XIO token from RAG 2 shared with RAG 1
-event:0x101b1 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_xio_o_shared_to_rag3 : Odd XIO token from RAG 2 shared with RAG 3
-event:0x101b2 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_e_shared_to_rag0 : Even bank token from RAG 2 shared with RAG 0
-event:0x101b3 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_e_shared_to_rag1 : Even bank token from RAG 2 shared with RAG 1
-event:0x101b4 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_e_shared_to_rag3 : Even bank token from RAG 2 shared with RAG 3
-event:0x101b5 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_o_shared_to_rag0 : Odd bank token from RAG 2 shared with RAG 0
-event:0x101b6 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_o_shared_to_rag1 : Odd bank token from RAG 2 shared with RAG 1
-event:0x101b7 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag2_bank_o_shared_to_rag3 : Odd bank token from RAG 2 shared with RAG 3
-
-
-# CBE Signal Group 6510 - EIB Token Manager - Group C3 (NClk/2)
-event:0x9ef38 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif0_in_wasted : IOIF0 In token wasted by RAG 0
-event:0x9ef39 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif0_out_wasted : IOIF0 Out token wasted by RAG 0
-event:0x9ef3a counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif1_in_wasted : IOIF1 In token wasted by RAG 0
-event:0x9ef3b counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag0_ioif1_out_wasted : IOIF1 Out token wasted by RAG 0
-
-
-# CBE Signal Group 6511 - EIB Token Manager - Group C0/D0 (NClk/2)
-# repeat of the events 65764 - 65771
-
-# CBE Signal Group 6512 - EIB Token Manager - Group C1/D1 (NClk/2)
-event:0x9f010 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_e_wasted : Even XIO token wasted by RAG 3
-event:0x9f011 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_o_wasted : Odd XIO token wasted by RAG 3
-event:0x9f012 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_e_wasted : Even bank token wasted by RAG 3
-event:0x9f013 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_o_wasted : Odd bank token wasted by RAG 3
-
-# CBE Signal Group 6513 - EIB Token Manager - Group C2/D2 (NClk/2)
-event:0x9f074 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_e_shared_to_rag0 : Even XIO token from RAG 3 shared with RAG 0
-event:0x9f075 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_e_shared_to_rag1 : Even XIO token from RAG 3 shared with RAG 1
-event:0x9f076 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_e_shared_to_rag2 : Even XIO token from RAG 3 shared with RAG 2
-event:0x9f077 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_o_shared_to_rag0 : Odd XIO token from RAG 3 shared with RAG 0
-event:0x9f078 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_o_shared_to_rag1 : Odd XIO token from RAG 3 shared with RAG 1
-event:0x9f079 counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_xio_o_shared_to_rag2 : Odd XIO token from RAG 3 shared with RAG 2
-event:0x9f07a counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_e_shared_to_rag0 : Even bank token from RAG 3 shared with RAG 0
-event:0x9f07b counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_e_shared_to_rag1 : Even bank token from RAG 3 shared with RAG 1
-event:0x9f07c counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_e_shared_to_rag2 : Even bank token from RAG 3 shared with RAG 2
-event:0x9f07d counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_o_shared_to_rag0 : Odd bank token from RAG 3 shared with RAG 0
-event:0x9f07e counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_o_shared_to_rag1 : Odd bank token from RAG 3 shared with RAG 1
-event:0x9f07f counters:0,1,2,3 um:PPU_2_cycles minimum:10000 name:rag3_bank_o_shared_to_rag2 : Odd bank token from RAG 3 shared with RAG 2
-
-
-# Cell BE Island 7 - Memory Interface Controller (MIC)
-
-# CBE Signal Group 71 - MIC Group 1 (NClk/2)
-event:0x1bc5 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM1(1) : XIO1 - Read command queue is empty.
-event:0x1bc6 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM1(2) : XIO1 - Write command queue is empty.
-event:0x1bc8 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM1(4) : XIO1 - Read command queue is full.
-event:0x1bc9 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM1(5) : XIO1 - MIC responds with a Retry for a read command because the read command queue is full.
-event:0x1bca counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM1(6) : XIO1 - Write command queue is full.
-event:0x1bcb counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM1(7) : XIO1 - MIC responds with a Retry for a write command because the write command queue is full.
-event:0x1bde counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CCS_PERFORM(2) : XIO1 - Read command dispatched; includes high-priority and fast-path reads (see Note 1).
-event:0x1bdf counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CCS_PERFORM(3) : XIO1 - Write command dispatched (see Note 1).
-event:0x1be0 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CCS_PERFORM(4) : XIO1 - Read-Modify-Write command (data size < 16 bytes) dispatched (see Note 1).
-event:0x1be1 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CCS_PERFORM(5) : XIO1 - Refresh dispatched (see Note 1).
-event:0x1be3 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CCS_PERFORM(7) : XIO1 - Byte-masking write command (data size >= 16 bytes) dispatched (see Note 1).
-event:0x1be5 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CRW_PERFORM(1) : XIO1 - Write command dispatched after a read command was previously dispatched (see Note 1).
-event:0x1be6 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL1_YMM_CRW_PERFORM(2) : XIO1 - Read command dispatched after a write command was previously dispatched (see Note 1).
-
-
-# CBE Signal Group 72 - MIC Group 2 (NClk/2)
-event:0x1c29 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM2(1) : XIO0 - Read command queue is empty.
-event:0x1c2a counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM2(2) : XIO0 - Write command queue is empty.
-event:0x1c2c counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM2(4) : XIO0 - Read command queue is full.
-event:0x1c2d counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM2(5) : XIO0 - MIC responds with a Retry for a read command because the read command queue is full.
-event:0x1c2e counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM2(6) : XIO0 - Write command queue is full.
-event:0x1c2f counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_COMMON_YMB_CSR_PERFORM2(7) : XIO0 - MIC responds with a Retry for a write command because the write command queue is full.
-event:0x1c42 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(2) : XIO0 - Read command dispatched; includes high-priority and fast-path reads (see Note 1).
-event:0x1c43 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(3) : XIO0 - Write command dispatched (see Note 1).
-event:0x1c44 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(4) : XIO0 - Read-Modify-Write command (data size < 16 bytes) dispatched (see Note 1).
-event:0x1c45 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(5) : XIO0 - Refresh dispatched (see Note 1).
-event:0x1c49 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CRW_PERFORM(1) : XIO0 - Write command dispatched after a read command was previously dispatched (see Note 1).
-event:0x1c4a counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CRW_PERFORM(2) : XIO0 - Read command dispatched after a write command was previously dispatched (see Note 1).
-
-# CBE Signal Group 73 - MIC Group 3 (NClk/2)
-event:0x1ca7 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(3) : XIO0 - Write command dispatched (see Note 1).
-event:0x1ca8 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(4) : XIO0 - Read-Modify-Write command (data size < 16 bytes) dispatched (see Note 1).
-event:0x1ca9 counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(5) : XIO0 - Refresh dispatched (see Note 1).
-event:0x1cab counters:0,1,2,3 um:PPU_0123_cycles minimum:10000 name:YM_CTL0_YMM_CCS_PERFORM(7) : XIO0 - Byte-masking write command (data size >= 16 bytes) dispatched (see Note 1).
-
-
-# Cell BE Island 8 - Broadband Engine Interface (BEI)
-
-# CBE Signal Group 81 - BIF Controller - IOIF0 Word 0 (NClk/2)
-event:0x1fb0 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:B2F_Type_A_Data : Type A data physical layer group (PLG). Does not include header-only or credit-only data PLGs. In IOIF mode, counts I/O device read data; in BIF mode, counts all outbound data.
-event:0x1fb1 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:B2F_Type_B_Data : Type B data PLG. In IOIF mode, counts I/O device read data; in BIF mode, counts all outbound data.
-event:0x1fb2 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:IOC_Type_A_Data : Type A data PLG. Does not include header-only or credit-only PLGs. In IOIF mode, counts CBE store data to I/O device. Does not apply in BIF mode.
-event:0x1fb3 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:IOC_Type_B_Data : Type B data PLG. In IOIF mode, counts CBE store data to an I/O device. Does not apply in BIF mode.
-event:0x1fb4 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Data_PLG : Data PLG. Does not include header-only or credit-only PLGs.
-event:0x1fb5 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Command_PLG : Command PLG (no credit-only PLG). In IOIF mode, counts I/O command or reply PLGs. In BIF mode, counts command/ reflected command or snoop/combined responses.
-event:0x1fb6 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_A_Transfer : Type A data transfer regardless of length. Can also be used to count Type A data header PLGs (but not credit-only PLGs).
-event:0x1fb7 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_B_Transfer : Type B data transfer.
-event:0x1fb8 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Cmd_Credit_Only_PLG : Command-credit-only command PLG in either IOIF or BIF mode.
-event:0x1fb9 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Data_Credit_Only_PLG : Data-credit-only data PLG sent in either IOIF or BIF mode.
-event:0x1fba counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Non-Null_Envelopes : Non-null envelope sent (does not include long envelopes).
-event:0x1fbc counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:null_env_sent : Null envelope sent (see Note 1).
-event:0x1fbd counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:no_valid_data : No valid data sent this cycle (see Note 1).
-event:0x1fbe counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:norm_env_sent : Normal envelope sent (see Note 1).
-event:0x1fbf counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:lnog_env_sent : Long envelope sent (see Note 1).
-event:0x1fc0 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:per_mon_null_sent : A Null PLG inserted in an outgoing envelope.
-event:0x1fc1 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:per_mon_array_full : Outbound envelope array is full.
-
-# CBE Signal Group 82 - BIF Controller - IOIF1 Word 0 (NClk/2)
-event:0x201b counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_B_Transfer : Type B data transfer.
-
-
-# CBE Signal Group 83 - BIF Controller - IOIF0 Word 2 (NClk/2)
-event:0x206d counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:null_env_rcvd : Null envelope received (see Note 1).
-event:0x207a counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Command_PLG : Command PLG, but not credit-only PLG. In IOIF mode, counts I/O command or reply PLGs. In BIF mode, counts command/reflected command or snoop/combined responses.
-event:0x207b counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Command_Credit_Only_PLG : Command-credit-only command PLG.
-event:0x2080 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:norm_env_rcvd_good : Normal envelope received is good (see Note 1).
-event:0x2081 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:long_env_rcvd_good : Long envelope received is good (see Note 1).
-event:0x2082 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:cmd_credit_only_PLG : Data-credit-only data PLG in either IOIF or BIF mode; will count a maximum of one per envelope (see Note 1).
-event:0x2083 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:non-null_envelope : Non-null envelope; does not include long envelopes; includes retried envelopes (see Note 1).
-event:0x2084 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:data_grnt_rcvd : Data grant received.
-event:0x2088 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Data_PLG : Data PLG. Does not include header-only or credit-only PLGs.
-event:0x2089 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_A_transfer : Type A data transfer regardless of length. Can also be used to count Type A data header PLGs, but not credit-only PLGs.
-event:0x208a counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_B_transfer : Type B data transfer.
-
-# CBE Signal Group 84 - BIF Controller - IOIF1 Word 2 (NClk/2)
-event:0x20d1 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:null_env_rcvd : Null envelope received (see Note 1).
-event:0x20de counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Command_PLG : Command PLG (no credit-only PLG). Counts I/O command or reply PLGs.
-event:0x20df counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Command_Credit_Only_PLG : Command-credit-only command PLG.
-event:0x20e4 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:norm_env_rcvd_good : Normal envelope received is good (see Note 1).
-event:0x20e5 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:long_env_rcvd_good : Long envelope received is good (see Note 1).
-event:0x20e6 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:cmd_credit_only_PLG : Data-credit-only data PLG received; will count a maximum of one per envelope (see Note 1).
-event:0x20e7 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:non-null_envelope : Non-Null envelope received; does not include long envelopes; includes retried envelopes (see Note 1).
-event:0x20e8 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:data_grnt_rcvd : Data grant received.
-event:0x20ec counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Data_PLG : Data PLG received. Does not include header-only or credit-only PLGs.
-event:0x20ed counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_A_transfer : Type I A data transfer regardless of length. Can also be used to count Type A data header PLGs (but not credit-only PLGs).
-event:0x20ee counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:Type_B_transfer : Type B data transfer received.
-
-# CBE Signal Group 85 - I/O Controller Word 0 - Group 1 (NClk/2)
-event:0x213c counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:mmio_rd_to_ioif1 : Received MMIO read targeted to IOIF1.
-event:0x213d counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:mmio_wrt_to_ioif1 : Received MMIO write targeted to IOIF1.
-event:0x213e counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:mmio_rd_to_ioif0 : Received MMIO read targeted to IOIF0.
-event:0x213f counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:mmio_wrt_to_ioif0 : Received MMIO write targeted to IOIF0.
-event:0x2140 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:cmd_to_slice0 : Sent command to IOIF0.
-event:0x2141 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:cmd_to_slice1 : Sent command to IOIF1.
-
-# CBE Signal Group 86 - I/O Controller Word 2 - Group 2 (NClk/2)
-event:0x219d counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:re_dep_dm3 : IOIF0 Dependency Matrix 3 is occupied by a dependent command (see Note 1).
-event:0x219e counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:re_dep_dm4 : IOIF0 Dependency Matrix 4 is occupied by a dependent command (see Note 1).
-event:0x219f counters:0,1,2,3 um:PPU_02_cycles_or_edges minimum:10000 name:re_dep_dm5 : IOIF0 Dependency Matrix 5 is occupied by a dependent command (see Note 1).
-event:0x21a2 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:slice0_ld_rqst : Received read request from IOIF0.
-event:0x21a3 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:slice0_str_rqst : Received write request from IOIF0.
-event:0x21a6 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:intrpt_from_realizer : Received interrupt from the IOIF0.
-
-# CBE Signal Group 87 - I/O Controller - Group 3 (NClk/2)
-event:0x220c counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn_even : IOIF0 request for token for even memory banks 0-14 (see Note 1).
-event:0x220d counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn_odd : IOIF0 request for token for odd memory banks 1-15 (see Note 1).
-event:0x220e counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn1_3_5_7 : IOIF0 request for token type 1, 3, 5, or 7 (see Note 1).
-event:0x220f counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn9_11_13_15 : IOIF0 request for token type 9, 11, 13, or 15 (see Note 1).
-event:0x2214 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn16 : IOIF0 request for token type 16 (see Note 1).
-event:0x2215 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn17 : IOIF0 request for token type 17 (see Note 1).
-event:0x2216 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn18 : IOIF0 request for token type 18 (see Note 1).
-event:0x2217 counters:0,1,2,3 um:PPU_02_cycles minimum:10000 name:slice0_rqst_tkn19 : IOIF0 request for token type 19 (see Note 1).
-
-
-# CBE Signal Group 88 - I/O Controller Word 0 - Group 4 (NClk/2)
-event:0x2260 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:io_pt_hit : I/O page table cache hit for commands from IOIF.
-event:0x2261 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:io_pt_miss : I/O page table cache miss for commands from IOIF.
-event:0x2263 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:io_seg_tbl_hit : I/O segment table cache hit.
-event:0x2264 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:io_seg_tbl_miss : I/O segment table cache miss.
-event:0x2278 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:intrrpt_frm_spu : Interrupt received from any SPU (reflected cmd when IIC has sent ACK response).
-event:0x2279 counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:iic_intrrpt_to_pu_thrd0 : Internal interrupt controller (IIC) generated interrupt to PPU thread 0.
-event:0x227a counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:iic_intrrpt_to_pu_thrd1 : IIC generated interrupt to PPU thread 1.
-event:0x227b counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:pu_intrrpt_to_pu_thrd0 : Received external interrupt (using MMIO) from PPU to PPU thread 0.
-event:0x227c counters:0,1,2,3 um:PPU_02_edges minimum:10000 name:pu_intrrpt_to_pu_thrd1 : Received external interrupt (using MMIO) from PPU to PPU thread 1.
diff --git a/events/ppc64/cell-be/unit_masks b/events/ppc64/cell-be/unit_masks
deleted file mode 100644
index 64a4959..0000000
--- a/events/ppc64/cell-be/unit_masks
+++ /dev/null
@@ -1,137 +0,0 @@
-# Cell Broadband Engine possible unit masks
-#
-# Copyright OProfile authors
-#
-#(C) COPYRIGHT International Business Machines Corp. 2006
-# Contributed by Maynard Johnson
-#
-#
-name:zero type:mandatory default:0x0
- 0x000 Count cycles [mandatory]
-name:PPU_0_cycles type:bitmask default:0x013
- 0x001 Count cycles [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [mandatory]
-name:PPU_0_edges type:bitmask default:0x012
- 0x000 Count edges [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [mandatory]
-name:PPU_2_cycles type:bitmask default:0x043
- 0x001 Count cycles [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x040 PPU Bus Word 2 [mandatory]
-name:PPU_2_edges type:bitmask default:0x042
- 0x000 Count edges [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x040 PPU Bus Word 2 [mandatory]
-name:PPU_01_cycles type:bitmask default:0x023
- 0x001 Count cycles [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [optional ]
- 0x020 PPU Bus Word 1 [default ]
-name:PPU_01_edges type:bitmask default:0x022
- 0x000 Count edges [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [optional ]
- 0x020 PPU Bus Word 1 [default ]
-name:PPU_01_cycles_or_edges type:bitmask default:0x023
- 0x000 Count edges [optional ]
- 0x001 Count cycles [default ]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [optional ]
- 0x020 PPU Bus Word 1 [default ]
-name:PPU_02_cycles type:bitmask default:0x013
- 0x001 Count cycles [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [default ]
- 0x040 PPU Bus Word 2 [optional ]
-name:PPU_02_edges type:bitmask default:0x012
- 0x000 Count edges [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [default ]
- 0x040 PPU Bus Word 2 [optional ]
-name:PPU_02_cycles_or_edges type:bitmask default:0x013
- 0x000 Count edges [optional ]
- 0x001 Count cycles [default ]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x010 PPU Bus Word 0 [default ]
- 0x040 PPU Bus Word 2 [optional ]
-name:PPU_0123_cycles type:bitmask default:0x033
- 0x001 Count cycles [mandatory]
- 0x000 Negative polarity [optional ]
- 0x002 Positive polarity [default ]
- 0x030 PPU Bus Word 0/1 [default ]
- 0x0c0 PPU Bus Word 2/3 [optional ]
-name:SPU_02_cycles type:bitmask default:0x0113
- 0x0001 Count cycles [mandatory]
- 0x0000 Negative polarity [optional ]
- 0x0002 Positive polarity [default ]
- 0x0110 SPU Bus Word 0 [default ]
- 0x0140 SPU Bus Word 2 [optional ]
- 0x0000 SPU 0 [default ]
- 0x1000 SPU 1 [optional ]
- 0x2000 SPU 2 [optional ]
- 0x3000 SPU 3 [optional ]
- 0x4000 SPU 4 [optional ]
- 0x5000 SPU 5 [optional ]
- 0x6000 SPU 6 [optional ]
- 0x7000 SPU 7 [optional ]
-name:SPU_02_cycles_or_edges type:bitmask default:0x0113
- 0x0000 Count edges [optional ]
- 0x0001 Count cycles [default ]
- 0x0000 Negative polarity [optional ]
- 0x0002 Positive polarity [default ]
- 0x0110 SPU Bus Word 0 [default ]
- 0x0140 SPU Bus Word 2 [optional ]
- 0x0000 SPU 0 [default ]
- 0x1000 SPU 1 [optional ]
- 0x2000 SPU 2 [optional ]
- 0x3000 SPU 3 [optional ]
- 0x4000 SPU 4 [optional ]
- 0x5000 SPU 5 [optional ]
- 0x6000 SPU 6 [optional ]
- 0x7000 SPU 7 [optional ]
-name:SPU_Trigger_cycles_or_edges type:bitmask default:0x0107
- 0x0000 Count edges [optional ]
- 0x0001 Count cycles [default ]
- 0x0000 Negative polarity [optional ]
- 0x0002 Positive polarity [default ]
- 0x0104 SPU Trigger 0 [default ]
- 0x0114 SPU Trigger 1 [optional ]
- 0x0124 SPU Trigger 2 [optional ]
- 0x0134 SPU Trigger 3 [optional ]
- 0x0000 SPU 0 [default ]
- 0x1000 SPU 1 [optional ]
- 0x2000 SPU 2 [optional ]
- 0x3000 SPU 3 [optional ]
- 0x4000 SPU 4 [optional ]
- 0x5000 SPU 5 [optional ]
- 0x6000 SPU 6 [optional ]
- 0x7000 SPU 7 [optional ]
-name:SPU_Event_cycles_or_edges type:bitmask default:0x0147
- 0x0000 Count edges [optional ]
- 0x0001 Count cycles [default ]
- 0x0000 Negative polarity [optional ]
- 0x0002 Positive polarity [default ]
- 0x0144 SPU Event 0 [default ]
- 0x0154 SPU Event 1 [optional ]
- 0x0164 SPU Event 2 [optional ]
- 0x0174 SPU Event 3 [optional ]
- 0x0000 SPU 0 [default ]
- 0x1000 SPU 1 [optional ]
- 0x2000 SPU 2 [optional ]
- 0x3000 SPU 3 [optional ]
- 0x4000 SPU 4 [optional ]
- 0x5000 SPU 5 [optional ]
- 0x6000 SPU 6 [optional ]
- 0x7000 SPU 7 [optional ]
diff --git a/events/ppc64/ibm-compat-v1/event_mappings b/events/ppc64/ibm-compat-v1/event_mappings
deleted file mode 100644
index 5805604..0000000
--- a/events/ppc64/ibm-compat-v1/event_mappings
+++ /dev/null
@@ -1,82 +0,0 @@
-#PPC64 pmu-compat event mappings, version 1
-#
-# Copyright OProfile authors
-# Copyright (c) International Business Machines, 2009.
-# Contributed by Maynard Johnson .
-#
-#Mapping of event groups to MMCR values
-
-#Group Default
-event:0X001 mmcr0:0X00000000 mmcr1:0X00000000FAF41EF4 mmcra:0X00000000
-
-#Group 1 pm_compat_utilization1, Basic CPU utilization
-event:0X0010 mmcr0:0X00000000 mmcr1:0X00000000FAF41EF4 mmcra:0X00000000
-event:0X0011 mmcr0:0X00000000 mmcr1:0X00000000FAF41EF4 mmcra:0X00000000
-event:0X0012 mmcr0:0X00000000 mmcr1:0X00000000FAF41EF4 mmcra:0X00000000
-event:0X0013 mmcr0:0X00000000 mmcr1:0X00000000FAF41EF4 mmcra:0X00000000
-
-#Group 2 pm_compat_utilization2, CPI and utilization data
-event:0X0020 mmcr0:0X00000000 mmcr1:0X00000000F4F41EFA mmcra:0X00000000
-event:0X0021 mmcr0:0X00000000 mmcr1:0X00000000F4F41EFA mmcra:0X00000000
-event:0X0022 mmcr0:0X00000000 mmcr1:0X00000000F4F41EFA mmcra:0X00000000
-event:0X0023 mmcr0:0X00000000 mmcr1:0X00000000F4F41EFA mmcra:0X00000000
-
-#Group 3 pm_compat_dsource, Data Access sources
-event:0X0030 mmcr0:0X00000000 mmcr1:0X00000000FEFEFEFA mmcra:0X00000000
-event:0X0031 mmcr0:0X00000000 mmcr1:0X00000000FEFEFEFA mmcra:0X00000000
-event:0X0032 mmcr0:0X00000000 mmcr1:0X00000000FEFEFEFA mmcra:0X00000000
-event:0X0033 mmcr0:0X00000000 mmcr1:0X00000000FEFEFEFA mmcra:0X00000000
-
-#Group 4 pm_compat_l1_dcache_load_store_miss, L1 D-Cache load/store miss
-event:0X0040 mmcr0:0X00000000 mmcr1:0X0000000002F0F0F0 mmcra:0X00000000
-event:0X0041 mmcr0:0X00000000 mmcr1:0X0000000002F0F0F0 mmcra:0X00000000
-event:0X0042 mmcr0:0X00000000 mmcr1:0X0000000002F0F0F0 mmcra:0X00000000
-event:0X0043 mmcr0:0X00000000 mmcr1:0X0000000002F0F0F0 mmcra:0X00000000
-
-#Group 5 pm_compat_l1_cache_load, L1 Cache loads
-event:0X0050 mmcr0:0X00000000 mmcr1:0X0000000002FEF6F0 mmcra:0X00000000
-event:0X0051 mmcr0:0X00000000 mmcr1:0X0000000002FEF6F0 mmcra:0X00000000
-event:0X0052 mmcr0:0X00000000 mmcr1:0X0000000002FEF6F0 mmcra:0X00000000
-event:0X0053 mmcr0:0X00000000 mmcr1:0X0000000002FEF6F0 mmcra:0X00000000
-
-#Group 6 pm_compat_instruction_directory, Instruction Directory
-event:0X0060 mmcr0:0X00000000 mmcr1:0X00000000F6FC02FC mmcra:0X00000000
-event:0X0061 mmcr0:0X00000000 mmcr1:0X00000000F6FC02FC mmcra:0X00000000
-event:0X0062 mmcr0:0X00000000 mmcr1:0X00000000F6FC02FC mmcra:0X00000000
-event:0X0063 mmcr0:0X00000000 mmcr1:0X00000000F6FC02FC mmcra:0X00000000
-
-#Group 7 pm_compat_data_directory, Data Directory
-event:0X0070 mmcr0:0X00000000 mmcr1:0X00000000FCF6FCFA mmcra:0X00000000
-event:0X0071 mmcr0:0X00000000 mmcr1:0X00000000FCF6FCFA mmcra:0X00000000
-event:0X0072 mmcr0:0X00000000 mmcr1:0X00000000FCF6FCFA mmcra:0X00000000
-event:0X0073 mmcr0:0X00000000 mmcr1:0X00000000FCF6FCFA mmcra:0X00000000
-
-#Group 8 pm_compat_cpi_1plus_ppc, Misc CPI and utilization data
-event:0X0080 mmcr0:0X00000000 mmcr1:0X00000000F2F4F2F2 mmcra:0X00000000
-event:0X0081 mmcr0:0X00000000 mmcr1:0X00000000F2F4F2F2 mmcra:0X00000000
-event:0X0082 mmcr0:0X00000000 mmcr1:0X00000000F2F4F2F2 mmcra:0X00000000
-event:0X0083 mmcr0:0X00000000 mmcr1:0X00000000F2F4F2F2 mmcra:0X00000000
-
-#Group 9 pm_compat_misc_events1, Misc Events
-event:0X0090 mmcr0:0X00000000 mmcr1:0X0000000002F8F81E mmcra:0X00000000
-event:0X0091 mmcr0:0X00000000 mmcr1:0X0000000002F8F81E mmcra:0X00000000
-event:0X0092 mmcr0:0X00000000 mmcr1:0X0000000002F8F81E mmcra:0X00000000
-event:0X0093 mmcr0:0X00000000 mmcr1:0X0000000002F8F81E mmcra:0X00000000
-
-#Group 10 pm_compat_misc_events2, Misc Events
-event:0X00A0 mmcr0:0X00000000 mmcr1:0X00000000F0F2F4F8 mmcra:0X00000000
-event:0X00A1 mmcr0:0X00000000 mmcr1:0X00000000F0F2F4F8 mmcra:0X00000000
-event:0X00A2 mmcr0:0X00000000 mmcr1:0X00000000F0F2F4F8 mmcra:0X00000000
-event:0X00A3 mmcr0:0X00000000 mmcr1:0X00000000F0F2F4F8 mmcra:0X00000000
-
-#Group 11 pm_compat_misc_events3, Misc Events
-event:0X00B0 mmcr0:0X00000000 mmcr1:0X00000000F8F2F8F6 mmcra:0X00000000
-event:0X00B1 mmcr0:0X00000000 mmcr1:0X00000000F8F2F8F6 mmcra:0X00000000
-event:0X00B2 mmcr0:0X00000000 mmcr1:0X00000000F8F2F8F6 mmcra:0X00000000
-event:0X00B3 mmcr0:0X00000000 mmcr1:0X00000000F8F2F8F6 mmcra:0X00000000
-
-#Group 12 pm_compat_suspend, Suspend Events
-event:0X00C0 mmcr0:0X00000000 mmcr1:0X0000000000000000 mmcra:0X00000000
-event:0X00C1 mmcr0:0X00000000 mmcr1:0X0000000000000000 mmcra:0X00000000
-event:0X00C2 mmcr0:0X00000000 mmcr1:0X0000000000000000 mmcra:0X00000000
-event:0X00C3 mmcr0:0X00000000 mmcr1:0X0000000000000000 mmcra:0X00000000
diff --git a/events/ppc64/ibm-compat-v1/events b/events/ppc64/ibm-compat-v1/events
deleted file mode 100644
index 9d5e9c6..0000000
--- a/events/ppc64/ibm-compat-v1/events
+++ /dev/null
@@ -1,91 +0,0 @@
-#PPC64 pmu-compat events, version 1
-#
-# Copyright OProfile authors
-# Copyright (c) International Business Machines, 2009.
-# Contributed by Maynard Johnson .
-#
-#
-# Within each group, the event names must be unique. Each event in a group is
-# assigned to a unique counter.
-#
-# Only events within the same group can be selected simultaneously.
-# Each event is given a unique event number. The event number is used by the
-# OProfile code to resolve event names for the post-processing. This is done
-# to preserve compatibility with the rest of the OProfile code. The event
-# numbers are formatted as follows: concat().
-
-#Group Default
-event:0X001 counters:2 um:zero minimum:10000 name:CYCLES : Processor Cycles
-
-
-#Group 1 pm_compat_utilization1, Basic CPU utilization
-event:0X0010 counters:0 um:zero minimum:1000 name:PM_THRD_ONE_RUN_CYC_GRP1 : (Group 1 pm_compat_utilization1) At least one thread in run cycles
-event:0X0011 counters:1 um:zero minimum:10000 name:PM_RUN_CYC_GRP1 : (Group 1 pm_compat_utilization1) Run cycles
-event:0X0012 counters:2 um:zero minimum:10000 name:PM_CYC_GRP1 : (Group 1 pm_compat_utilization1) Processor cycles
-event:0X0013 counters:3 um:zero minimum:1000 name:PM_RUN_PURR_GRP1 : (Group 1 pm_compat_utilization1) Run PURR Even
-
-#Group 2 pm_compat_utilization2, CPI and utilization data
-event:0X0020 counters:0 um:zero minimum:1000 name:PM_FPU_FLOP_GRP2 : (Group 2 pm_compat_utilization2) FPU executed 1FLOP, FMA, FSQRT or FDIV instruction
-event:0X0021 counters:1 um:zero minimum:10000 name:PM_RUN_CYC_GRP2 : (Group 2 pm_compat_utilization2) Run cycles
-event:0X0022 counters:2 um:zero minimum:10000 name:PM_CYC_GRP2 : (Group 2 pm_compat_utilization2) Processor cycles
-event:0X0023 counters:3 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP2 : (Group 2 pm_compat_utilization2) Run instructions completed
-
-#Group 3 pm_compat_dsource, Data Access sources
-event:0X0030 counters:0 um:zero minimum:1000 name:PM_DATA_FROM_L1-5_GRP3 : (Group 3 pm_compat_dsource) Data loaded from L1.5
-event:0X0031 counters:1 um:zero minimum:1000 name:PM_DATA_FROM_L2MISS_GRP3 : (Group 3 pm_compat_dsource) Data loaded missed L2
-event:0X0032 counters:2 um:zero minimum:1000 name:PM_DATA_FROM_L3MISS_GRP3 : (Group 3 pm_compat_dsource) Data loaded from private L3 miss
-event:0X0033 counters:3 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP3 : (Group 3 pm_compat_dsource) Run instructions completed
-
-#Group 4 pm_compat_l1_dcache_load_store_miss, L1 D-Cache load/store miss
-event:0X0040 counters:0 um:zero minimum:10000 name:PM_INST_CMPL_GRP4 : (Group 4 pm_compat_l1_dcache_load_store_miss) Instruction completed
-event:0X0041 counters:1 um:zero minimum:1000 name:PM_ST_FIN_GRP4 : (Group 4 pm_compat_l1_dcache_load_store_miss) Store instructions finished
-event:0X0042 counters:2 um:zero minimum:1000 name:PM_ST_MISS_L1_GRP4 : (Group 4 pm_compat_l1_dcache_load_store_miss) L1 D cache store misses
-event:0X0043 counters:3 um:zero minimum:1000 name:PM_LD_MISS_L1_GRP4 : (Group 4 pm_compat_l1_dcache_load_store_miss) L1 D cache load misses
-
-#Group 5 pm_compat_l1_cache_load, L1 Cache loads
-event:0X0050 counters:0 um:zero minimum:10000 name:PM_INST_CMPL_GRP5 : (Group 5 pm_compat_l1_cache_load) Instruction completed
-event:0X0051 counters:1 um:zero minimum:1000 name:PM_DATA_FROM_L2MISS_GRP5 : (Group 5 pm_compat_l1_cache_load) Data loaded missed L2
-event:0X0052 counters:2 um:zero minimum:1000 name:PM_L1_DCACHE_RELOAD_VALID_GRP5 : (Group 5 pm_compat_l1_cache_load) L1 reload data source valid
-event:0X0053 counters:3 um:zero minimum:1000 name:PM_LD_MISS_L1_GRP5 : (Group 5 pm_compat_l1_cache_load) L1 D cache load misses
-
-#Group 6 pm_compat_instruction_directory, Instruction Directory
-event:0X0060 counters:0 um:zero minimum:1000 name:PM_IERAT_MISS_GRP6 : (Group 6 pm_compat_instruction_directory) IERAT miss coun
-event:0X0061 counters:1 um:zero minimum:1000 name:PM_L1_ICACHE_MISS_GRP6 : (Group 6 pm_compat_instruction_directory) L1 I cache miss coun
-event:0X0062 counters:2 um:zero minimum:10000 name:PM_INST_CMPL_GRP6 : (Group 6 pm_compat_instruction_directory) Instruction completed
-event:0X0063 counters:3 um:zero minimum:1000 name:PM_ITLB_MISS_GRP6 : (Group 6 pm_compat_instruction_directory) Instruction TLB misses
-
-#Group 7 pm_compat_data_directory, Data Directory
-event:0X0070 counters:0 um:zero minimum:1000 name:PM_LSU_DERAT_MISS_CYC_GRP7 : (Group 7 pm_compat_data_directory) DERAT miss latency
-event:0X0071 counters:1 um:zero minimum:1000 name:PM_LSU_DERAT_MISS_GRP7 : (Group 7 pm_compat_data_directory) DERAT misses
-event:0X0072 counters:2 um:zero minimum:1000 name:PM_DTLB_MISS_GRP7 : (Group 7 pm_compat_data_directory) Data TLB misses
-event:0X0073 counters:3 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP7 : (Group 7 pm_compat_data_directory) Run instructions completed
-
-#Group 8 pm_compat_cpi_1plus_ppc, Misc CPI and utilization data
-event:0X0080 counters:0 um:zero minimum:1000 name:PM_1PLUS_PPC_CMPL_GRP8 : (Group 8 pm_compat_cpi_1plus_ppc) One or more PPC instruction completed
-event:0X0081 counters:1 um:zero minimum:10000 name:PM_RUN_CYC_GRP8 : (Group 8 pm_compat_cpi_1plus_ppc) Run cycles
-event:0X0082 counters:2 um:zero minimum:1000 name:PM_INST_DISP_GRP8 : (Group 8 pm_compat_cpi_1plus_ppc) Instructions dispatched
-event:0X0083 counters:3 um:zero minimum:1000 name:PM_1PLUS_PPC_DISP_GRP8 : (Group 8 pm_compat_cpi_1plus_ppc) Cycles at least one instruction dispatched
-
-#Group 9 pm_compat_misc_events1, Misc Events
-event:0X0090 counters:0 um:zero minimum:10000 name:PM_INST_CMPL_GRP9 : (Group 9 pm_compat_misc_events1) Instruction completed
-event:0X0091 counters:1 um:zero minimum:1000 name:PM_EXT_INT_GRP9 : (Group 9 pm_compat_misc_events1) External interrupts
-event:0X0092 counters:2 um:zero minimum:1000 name:PM_TB_BIT_TRANS_GRP9 : (Group 9 pm_compat_misc_events1) Time Base bit transition
-event:0X0093 counters:3 um:zero minimum:10000 name:PM_CYC_GRP9 : (Group 9 pm_compat_misc_events1) Processor cycles
-
-#Group 10 pm_compat_misc_events2, Misc Events
-event:0X00A0 counters:0 um:zero minimum:1000 name:PM_INST_IMC_MATCH_CMPL_GRP10 : (Group 10 pm_compat_misc_events2) IMC matched instructions completed
-event:0X00A1 counters:1 um:zero minimum:1000 name:PM_INST_DISP_GRP10 : (Group 10 pm_compat_misc_events2) Instructions dispatched
-event:0X00A2 counters:2 um:zero minimum:1000 name:PM_THRD_CONC_RUN_INST_GRP10 : (Group 10 pm_compat_misc_events2) Concurrent run instructions
-event:0X00A3 counters:3 um:zero minimum:1000 name:PM_FLUSH_GRP10 : (Group 10 pm_compat_misc_events2) Flushes
-
-#Group 11 pm_compat_misc_events3, Misc Events
-event:0X00B0 counters:0 um:zero minimum:1000 name:PM_GCT_EMPTY_CYC_GRP11 : (Group 11 pm_compat_misc_events3) Cycles GCT empty
-event:0X00B1 counters:1 um:zero minimum:1000 name:PM_INST_DISP_GRP11 : (Group 11 pm_compat_misc_events3) Instructions dispatched
-event:0X00B2 counters:2 um:zero minimum:1000 name:PM_TB_BIT_TRANS_GRP11 : (Group 11 pm_compat_misc_events3) Time Base bit transition
-event:0X00B3 counters:3 um:zero minimum:1000 name:PM_BR_MPRED_GRP11 : (Group 11 pm_compat_misc_events3) Branches incorrectly predicted
-
-#Group 12 pm_compat_suspend, Suspend Events
-event:0X00C0 counters:0 um:zero minimum:1000 name:PM_SUSPENDED_GRP12 : (Group 12 pm_compat_suspend) Suspended
-event:0X00C1 counters:1 um:zero minimum:1000 name:PM_SUSPENDED_GRP12 : (Group 12 pm_compat_suspend) Suspended
-event:0X00C2 counters:2 um:zero minimum:1000 name:PM_SUSPENDED_GRP12 : (Group 12 pm_compat_suspend) Suspended
-event:0X00C3 counters:3 um:zero minimum:1000 name:PM_SUSPENDED_GRP12 : (Group 12 pm_compat_suspend) Suspended
diff --git a/events/ppc64/pa6t/event_mappings b/events/ppc64/pa6t/event_mappings
deleted file mode 100644
index 0bbddcb..0000000
--- a/events/ppc64/pa6t/event_mappings
+++ /dev/null
@@ -1,48 +0,0 @@
-# pa6t does not have an mmcra. mmcr0 has all the enables and config
-# bits. mmcr1 contains the event selectors for the four programmable
-# events
-
-# Group Default
-event:0x1 mmcr0:0x000000000005b81b mmcr1:0x0000000000949f00 mmcra:0x0
-event:0x3 mmcr0:0x000000000005b81b mmcr1:0x0000000000949f00 mmcra:0x0
-event:0x4 mmcr0:0x000000000005b81b mmcr1:0x0000000000949f00 mmcra:0x0
-
-# Group 1, Load/Store
-event:0x10 mmcr0:0x000000000007f83f mmcr1:0x00000000a8c0cab1 mmcra:0x0
-event:0x11 mmcr0:0x000000000007f83f mmcr1:0x00000000a8c0cab1 mmcra:0x0
-event:0x12 mmcr0:0x000000000007f83f mmcr1:0x00000000a8c0cab1 mmcra:0x0
-event:0x13 mmcr0:0x000000000007f83f mmcr1:0x00000000a8c0cab1 mmcra:0x0
-event:0x14 mmcr0:0x000000000007f83f mmcr1:0x00000000a8c0cab1 mmcra:0x0
-event:0x15 mmcr0:0x000000000007f83f mmcr1:0x00000000a8c0cab1 mmcra:0x0
-
-# Group 2, Frontend
-event:0x20 mmcr0:0x000000000007f83f mmcr1:0x0000000002058401 mmcra:0x0
-event:0x21 mmcr0:0x000000000007f83f mmcr1:0x0000000002058401 mmcra:0x0
-event:0x22 mmcr0:0x000000000007f83f mmcr1:0x0000000002058401 mmcra:0x0
-event:0x23 mmcr0:0x000000000007f83f mmcr1:0x0000000002058401 mmcra:0x0
-event:0x24 mmcr0:0x000000000007f83f mmcr1:0x0000000002058401 mmcra:0x0
-event:0x25 mmcr0:0x000000000007f83f mmcr1:0x0000000002058401 mmcra:0x0
-
-# Group 3, Branches
-event:0x30 mmcr0:0x000000000007f83f mmcr1:0x000000008d8b8988 mmcra:0x0
-event:0x31 mmcr0:0x000000000007f83f mmcr1:0x000000008d8b8988 mmcra:0x0
-event:0x32 mmcr0:0x000000000007f83f mmcr1:0x000000008d8b8988 mmcra:0x0
-event:0x33 mmcr0:0x000000000007f83f mmcr1:0x000000008d8b8988 mmcra:0x0
-event:0x34 mmcr0:0x000000000007f83f mmcr1:0x000000008d8b8988 mmcra:0x0
-event:0x35 mmcr0:0x000000000007f83f mmcr1:0x000000008d8b8988 mmcra:0x0
-
-# Group 4, Translation
-event:0x40 mmcr0:0x000000000007f83f mmcr1:0x0000000086baa7a8 mmcra:0x0
-event:0x41 mmcr0:0x000000000007f83f mmcr1:0x0000000086baa7a8 mmcra:0x0
-event:0x42 mmcr0:0x000000000007f83f mmcr1:0x0000000086baa7a8 mmcra:0x0
-event:0x43 mmcr0:0x000000000007f83f mmcr1:0x0000000086baa7a8 mmcra:0x0
-event:0x44 mmcr0:0x000000000007f83f mmcr1:0x0000000086baa7a8 mmcra:0x0
-event:0x45 mmcr0:0x000000000007f83f mmcr1:0x0000000086baa7a8 mmcra:0x0
-
-# Group 5, Memory
-event:0x50 mmcr0:0x000000000007f83f mmcr1:0x00000000c030cab1 mmcra:0x0
-event:0x51 mmcr0:0x000000000007f83f mmcr1:0x00000000c030cab1 mmcra:0x0
-event:0x52 mmcr0:0x000000000007f83f mmcr1:0x00000000c030cab1 mmcra:0x0
-event:0x53 mmcr0:0x000000000007f83f mmcr1:0x00000000c030cab1 mmcra:0x0
-event:0x54 mmcr0:0x000000000007f83f mmcr1:0x00000000c030cab1 mmcra:0x0
-event:0x55 mmcr0:0x000000000007f83f mmcr1:0x00000000c030cab1 mmcra:0x0
diff --git a/events/ppc64/pa6t/events b/events/ppc64/pa6t/events
deleted file mode 100644
index 5e2bc2f..0000000
--- a/events/ppc64/pa6t/events
+++ /dev/null
@@ -1,52 +0,0 @@
-# ppc64 pa6t events
-#
-# Unlike the IBM ppc64 chips, any of pa6t's events can be programmed into any
-# of the counters (pmc2-5). The notion of groups on pa6t is thus
-# artificial. That said, we can still define useful aggregations to guide the
-# user in his choice of group for a profiling session.
-
-# Group Default
-event:0x1 counters:0 um:zero minimum:10000 name:CYCLES : Processor Cycles
-event:0x3 counters:3 um:zero minimum:10000 name:ISS_CYCLES : Processor Cycles with instructions issued
-event:0x4 counters:4 um:zero minimum:10000 name:RET_UOP : Retired Micro-operatioins
-
-# Group 1, Load/Store
-event:0x10 counters:0 um:zero minimum:10000 name:GRP1_CYCLES : Processor Cycles
-event:0x11 counters:1 um:zero minimum:10000 name:GRP1_INST_RETIRED : Instructions retired
-event:0x12 counters:2 um:zero minimum:1000 name:GRP1_DCACHE_RD_MISS__NS : Dcache read misses NS
-event:0x13 counters:3 um:zero minimum:500 name:GRP1_MRB_LD_MISS_L2__NS : Load misses filling from memory
-event:0x14 counters:4 um:zero minimum:500 name:GRP1_MRB_ST_MISS_ALLOC__NS : Store misses in L1D and allocates an MRB entry
-event:0x15 counters:5 um:zero minimum:500 name:GRP1_TLB_MISS_D__NS : TLB misses NS (D- only)
-
-# Group 2, Frontend
-event:0x20 counters:0 um:zero minimum:10000 name:GRP2_CYCLES : Processor Cycles
-event:0x21 counters:1 um:zero minimum:10000 name:GRP2_INST_RETIRED : Instructions retired
-event:0x22 counters:2 um:zero minimum:2000 name:GRP2_FETCH_REQ : Demand fetch requests made to the Icache
-event:0x23 counters:3 um:zero minimum:500 name:GRP2_ICACHE_MISS_DEM__NS : Demand fetch requests missing in the Icache
-event:0x24 counters:4 um:zero minimum:500 name:GRP2_ICACHE_MISS_ALL : Demand and spec fetch requests missing in the Icache
-event:0x25 counters:5 um:zero minimum:2000 name:GRP2_ICACHE_ACC : Icache accesses
-
-# Group 3, Branches
-event:0x30 counters:0 um:zero minimum:10000 name:GRP3_CYCLES : Processor Cycles
-event:0x31 counters:1 um:zero minimum:10000 name:GRP3_INST_RETIRED : Instructions retired
-event:0x32 counters:2 um:zero minimum:500 name:GRP3_NXT_LINE_MISPRED__NS : Next fetch address mispredict
-event:0x33 counters:3 um:zero minimum:500 name:GRP3_DIRN_MISPRED__NS : Branch direction mispredict
-event:0x34 counters:4 um:zero minimum:500 name:GRP3_TGT_ADDR_MISPRED__NS : Branch target address mispredict
-event:0x35 counters:5 um:zero minimum:2000 name:GRP3_BRA_TAKEN__NS : Taken branches
-
-# Group 4, Translation
-event:0x40 counters:0 um:zero minimum:10000 name:GRP4_CYCLES : Processor Cycles
-event:0x41 counters:1 um:zero minimum:10000 name:GRP4_INST_RETIRED : Instructions retired
-event:0x42 counters:2 um:zero minimum:500 name:GRP4_TLB_MISS_D__NS : TLB Misses (D-)
-event:0x43 counters:3 um:zero minimum:500 name:GRP4_TLB_MISS_I__NS : TLB MIsses (I-)
-event:0x44 counters:4 um:zero minimum:500 name:GRP4_DERAT_MISS__NS : DERAT Misses
-event:0x45 counters:5 um:zero minimum:500 name:GRP4_IERAT_MISS__NS : IERAT Misses
-
-# Group 5, Memory
-event:0x50 counters:0 um:zero minimum:10000 name:GRP5_CYCLES : Processor Cycles
-event:0x51 counters:1 um:zero minimum:10000 name:GRP5_INST_RETIRED : Instructions retired
-event:0x52 counters:2 um:zero minimum:500 name:GRP5_DCACHE_RD_MISS__NS : Dcache read misses NS
-event:0x53 counters:3 um:zero minimum:500 name:GRP5_MRB_LD_MISS_L2__NS : Load misses filling from memory
-event:0x54 counters:4 um:zero minimum:500 name:GRP5_DCACHE_VIC : Dcache line evicted (snoops not included)
-event:0x55 counters:5 um:zero minimum:500 name:GRP5_MRB_ST_MISS_ALLOC__NS : Store misses in L1D and allocates an MRB entry
-
diff --git a/events/ppc64/pa6t/unit_masks b/events/ppc64/pa6t/unit_masks
deleted file mode 100644
index ccc3ddd..0000000
--- a/events/ppc64/pa6t/unit_masks
+++ /dev/null
@@ -1,4 +0,0 @@
-# ppc64 pa6t possible unit masks
-#
-name:zero type:mandatory default:0x0
- 0x0 No unit mask
diff --git a/events/ppc64/power5++/event_mappings b/events/ppc64/power5++/event_mappings
index 57ed17b..07ff5b2 100644
--- a/events/ppc64/power5++/event_mappings
+++ b/events/ppc64/power5++/event_mappings
@@ -8,9 +8,6 @@
#Group Default
event:0X001 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
-#Group 0 with random sampling
-event:0X002 mmcr0:0X00000000 mmcr1:0X4000000002341E36 mmcra:0X00000001
-
#Group 1 pm_utilization, CPI and utilization data
event:0X0010 mmcr0:0X00000000 mmcr1:0X000000000A12121E mmcra:0X00000000
event:0X0011 mmcr0:0X00000000 mmcr1:0X000000000A12121E mmcra:0X00000000
diff --git a/events/ppc64/power5++/events b/events/ppc64/power5++/events
index e4d055b..550dbf0 100644
--- a/events/ppc64/power5++/events
+++ b/events/ppc64/power5++/events
@@ -9,7 +9,11 @@
# assigned to a unique counter. The groups are from the groups defined in the
# Performance Monitor Unit user guide for this processor.
#
-# Only events within the same group can be selected simultaneously.
+# Only events within the same group can be selected simultaneously when
+# using legacy opcontrol to do profiling. When profiling with operf,
+# events from different groups may be specified, and the Linux Performance
+# Events Kernel Subsystem code will handle the necessary multiplexing.
+#
# Each event is given a unique event number. The event number is used by the
# OProfile code to resolve event names for the post-processing. This is done
# to preserve compatibility with the rest of the OProfile code. The event
@@ -18,10 +22,6 @@
#Group Default
event:0X001 counters:1 um:zero minimum:10000 name:CYCLES : Processor Cycles
-#Group 0 with random sampling
-event:0X002 counters:2 um:zero minimum:10000 name:CYCLES_RND_SMPL : Processor Cycles with random sampling
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X0010 counters:0 um:zero minimum:10000 name:PM_RUN_CYC_GRP1 : (Group 1 pm_utilization) Run cycles
event:0X0011 counters:1 um:zero minimum:10000 name:PM_INST_CMPL_GRP1 : (Group 1 pm_utilization) Instructions completed
diff --git a/events/ppc64/power5+/event_mappings b/events/ppc64/power5+/event_mappings
index 735d2d1..77e4957 100644
--- a/events/ppc64/power5+/event_mappings
+++ b/events/ppc64/power5+/event_mappings
@@ -3,10 +3,6 @@
#Group Default
event:0X001 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
-#Group 0 with random sampling
-event:0X002 mmcr0:0X00000000 mmcr1:0X4000000002341E36 mmcra:0X00000001
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X010 mmcr0:0X00000000 mmcr1:0X000000000A12121E mmcra:0X00000000
event:0X011 mmcr0:0X00000000 mmcr1:0X000000000A12121E mmcra:0X00000000
diff --git a/events/ppc64/power5+/events b/events/ppc64/power5+/events
index 0624c39..deba0d0 100644
--- a/events/ppc64/power5+/events
+++ b/events/ppc64/power5+/events
@@ -4,7 +4,11 @@
# assigned to a unique counter. The groups are from the groups defined in the
# Performance Monitor Unit user guide for this processor.
#
-# Only events within the same group can be selected simultaneously.
+# Only events within the same group can be selected simultaneously when
+# using legacy opcontrol to do profiling. When profiling with operf,
+# events from different groups may be specified, and the Linux Performance
+# Events Kernel Subsystem code will handle the necessary multiplexing.
+#
# Each event is given a unique event number. The event number is used by the
# OProfile code to resolve event names for the post-processing. This is done
# to preserve compatibility with the rest of the OProfile code. The event
@@ -13,10 +17,6 @@
#Group Default
event:0X001 counters:3 um:zero minimum:10000 name:CYCLES : Processor Cycles using continuous sampling
-#Group 0 with random sampling
-event:0X002 counters:2 um:zero minimum:10000 name:CYCLES_RND_SMPL : Processor Cycles with random sampling
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X010 counters:0 um:zero minimum:10000 name:PM_RUN_CYC_GRP1 : (Group 1 pm_utilization) Run cycles
event:0X011 counters:1 um:zero minimum:10000 name:PM_INST_CMPL_GRP1 : (Group 1 pm_utilization) Instructions completed
diff --git a/events/ppc64/power5/event_mappings b/events/ppc64/power5/event_mappings
index dd3c779..52dd76f 100644
--- a/events/ppc64/power5/event_mappings
+++ b/events/ppc64/power5/event_mappings
@@ -3,10 +3,6 @@
#Group Default
event:0X001 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
-#Group 0 with random sampling
-event:0X002 mmcr0:0X00000000 mmcr1:0X4000000002341E36 mmcra:0X00000001
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X010 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
event:0X011 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
diff --git a/events/ppc64/power5/events b/events/ppc64/power5/events
index 8f438bd..c40f78f 100644
--- a/events/ppc64/power5/events
+++ b/events/ppc64/power5/events
@@ -4,7 +4,11 @@
# assigned to a unique counter. The groups are from the groups defined in the
# Performance Monitor Unit user guide for this processor.
#
-# Only events within the same group can be selected simultaneously.
+# Only events within the same group can be selected simultaneously when
+# using legacy opcontrol to do profiling. When profiling with operf,
+# events from different groups may be specified, and the Linux Performance
+# Events Kernel Subsystem code will handle the necessary multiplexing.
+#
# Each event is given a unique event number. The event number is used by the
# OProfile code to resolve event names for the post-processing. This is done
# to preserve compatibility with the rest of the OProfile code. The event
@@ -13,10 +17,6 @@
#Group Default
event:0X001 counters:3 um:zero minimum:10000 name:CYCLES : Processor Cycles using continuous sampling
-#Group 0 with random sampling
-event:0X002 counters:2 um:zero minimum:10000 name:CYCLES_RND_SMPL : Processor Cycles with random sampling
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X010 counters:0 um:zero minimum:10000 name:PM_RUN_CYC_GRP1 : (Group 1 pm_utilization) Run cycles
event:0X011 counters:1 um:zero minimum:1000 name:PM_IOPS_CMPL_GRP1 : (Group 1 pm_utilization) IOPS instructions completed
diff --git a/events/ppc64/power6/event_mappings b/events/ppc64/power6/event_mappings
index 0d627b3..fdde90b 100644
--- a/events/ppc64/power6/event_mappings
+++ b/events/ppc64/power6/event_mappings
@@ -9,9 +9,6 @@
#Group Default
event:0X001 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
-#Group 0 with random sampling
-event:0X002 mmcr0:0X00000000 mmcr1:0X000000001E1E021A mmcra:0X00000001
-
#Group 1 pm_utilization, CPI and utilization data
event:0X0010 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
event:0X0011 mmcr0:0X00000000 mmcr1:0X000000000A02121E mmcra:0X00000000
diff --git a/events/ppc64/power6/events b/events/ppc64/power6/events
index c1e2c76..df48b86 100644
--- a/events/ppc64/power6/events
+++ b/events/ppc64/power6/events
@@ -9,7 +9,11 @@
# assigned to a unique counter. The groups are from the groups defined in the
# Performance Monitor Unit user guide for this processor.
#
-# Only events within the same group can be selected simultaneously.
+# Only events within the same group can be selected simultaneously when
+# using legacy opcontrol to do profiling. When profiling with operf,
+# events from different groups may be specified, and the Linux Performance
+# Events Kernel Subsystem code will handle the necessary multiplexing.
+#
# Each event is given a unique event number. The event number is used by the
# OProfile code to resolve event names for the post-processing. This is done
# to preserve compatibility with the rest of the OProfile code. The event
@@ -18,10 +22,6 @@
#Group Default
event:0X001 counters:3 um:zero minimum:10000 name:CYCLES : Processor Cycles
-#Group 0 with random sampling
-event:0X002 counters:1 um:zero minimum:10000 name:CYCLES_RND_SMPL : Processor Cycles with random sampling
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X0010 counters:0 um:zero minimum:10000 name:PM_RUN_CYC_GRP1 : (Group 1 pm_utilization) Run cycles
event:0X0011 counters:1 um:zero minimum:10000 name:PM_INST_CMPL_GRP1 : (Group 1 pm_utilization) Instructions completed
diff --git a/events/ppc64/power7/event_mappings b/events/ppc64/power7/event_mappings
index 7de556d..fb752b0 100644
--- a/events/ppc64/power7/event_mappings
+++ b/events/ppc64/power7/event_mappings
@@ -8,9 +8,6 @@
#Group Default
event:0X001 mmcr0:0X00000000 mmcr1:0X000000001EF4F202 mmcra:0X00000000
-#Group 0 with random sampling
-event:0X002 mmcr0:0X00000000 mmcr1:0XDD0000008486021E mmcra:0X00000001
-
#Group 1 pm_utilization, CPI and utilization data
event:0X0010 mmcr0:0X00000000 mmcr1:0X000000001EF4F202 mmcra:0X00000000
event:0X0011 mmcr0:0X00000000 mmcr1:0X000000001EF4F202 mmcra:0X00000000
@@ -2114,3 +2111,49 @@ event:0X1072 mmcr0:0X00000000 mmcr1:0X000000001E1E0232 mmcra:0X00000001
event:0X1073 mmcr0:0X00000000 mmcr1:0X000000001E1E0232 mmcra:0X00000001
event:0X1074 mmcr0:0X00000000 mmcr1:0X000000001E1E0232 mmcra:0X00000001
event:0X1075 mmcr0:0X00000000 mmcr1:0X000000001E1E0232 mmcra:0X00000001
+
+#Group 264 pm_gct_noslot, GCT no slot events
+###### DO NOT REMOVE ######
+# Manually added group
+event:0X1080 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000
+event:0X1081 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000
+event:0X1082 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000
+event:0X1083 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000
+event:0X1084 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000
+event:0X1085 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000
+
+#Group 265 pm_cmplu_stall, CMPLU stall events
+###### DO NOT REMOVE ######
+event:0X1090 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000
+event:0X1091 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000
+event:0X1092 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000
+event:0X1093 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000
+event:0X1094 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000
+event:0X1095 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000
+
+#Group 266 pm_cmplu_stall2, CMPLU stall (with vector)
+###### DO NOT REMOVE ######
+event:0X10A0 mmcr0:0X00000000 mmcr1:0X00000000281D3F0B mmcra:0X00000000
+event:0X10A1 mmcr0:0X00000000 mmcr1:0X00000000281D3F0B mmcra:0X00000000
+event:0X10A2 mmcr0:0X00000000 mmcr1:0X00000000281D3F0B mmcra:0X00000000
+event:0X10A3 mmcr0:0X00000000 mmcr1:0X00000000281D3F0B mmcra:0X00000000
+event:0X10A4 mmcr0:0X00000000 mmcr1:0X00000000281D3F0B mmcra:0X00000000
+event:0X10A5 mmcr0:0X00000000 mmcr1:0X00000000281D3F0B mmcra:0X00000000
+
+#Group 267 pm_cmplu_stall3, CMPLU stall (scalar)
+###### DO NOT REMOVE ######
+event:0X10B0 mmcr0:0X00000000 mmcr1:0X00000000F4183E13 mmcra:0X00000000
+event:0X10B1 mmcr0:0X00000000 mmcr1:0X00000000F4183E13 mmcra:0X00000000
+event:0X10B2 mmcr0:0X00000000 mmcr1:0X00000000F4183E13 mmcra:0X00000000
+event:0X10B3 mmcr0:0X00000000 mmcr1:0X00000000F4183E13 mmcra:0X00000000
+event:0X10B4 mmcr0:0X00000000 mmcr1:0X00000000F4183E13 mmcra:0X00000000
+event:0X10B5 mmcr0:0X00000000 mmcr1:0X00000000F4183E13 mmcra:0X00000000
+
+#Group 268 pm_cmplu_ifu, IFU stall
+###### DO NOT REMOVE ######
+event:0X10C0 mmcr0:0X00000000 mmcr1:0X0CC00000289C9E4D mmcra:0X00000000
+event:0X10C1 mmcr0:0X00000000 mmcr1:0X0CC00000289C9E4D mmcra:0X00000000
+event:0X10C2 mmcr0:0X00000000 mmcr1:0X0CC00000289C9E4D mmcra:0X00000000
+event:0X10C3 mmcr0:0X00000000 mmcr1:0X0CC00000289C9E4D mmcra:0X00000000
+event:0X10C4 mmcr0:0X00000000 mmcr1:0X0CC00000289C9E4D mmcra:0X00000000
+event:0X10C5 mmcr0:0X00000000 mmcr1:0X0CC00000289C9E4D mmcra:0X00000000
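Reviewer note: in every group visible in this patch, all events of the group carry identical mmcr0/mmcr1/mmcra values. The sketch below is one way to check that property mechanically. The helper names are hypothetical and not part of OProfile. Deriving the group from the event code with `code >> 4` is an assumption read off the numbering used here (0X1080..0X1085, 0X1090..0X1095, and so on), not a documented rule. Input lines are expected without the leading diff marker.

```python
from collections import defaultdict


def parse_mapping_line(line):
    """Return (event_code, {register: value}), or None for comments/blank lines."""
    line = line.strip()
    if not line.startswith("event:"):
        return None
    # Every whitespace-separated token has the form key:value.
    fields = dict(tok.split(":", 1) for tok in line.split())
    code = int(fields.pop("event"), 16)  # int() accepts the 0x/0X prefix in base 16
    regs = {name: int(value, 16) for name, value in fields.items()}
    return code, regs


def inconsistent_groups(lines):
    """Return sorted group ids whose events disagree on register values.

    Assumption: the group id is the event code with its low nibble dropped.
    """
    seen = defaultdict(set)
    for line in lines:
        parsed = parse_mapping_line(line)
        if parsed is None:
            continue
        code, regs = parsed
        seen[code >> 4].add(tuple(sorted(regs.items())))
    return sorted(group for group, variants in seen.items() if len(variants) > 1)


sample = [
    "#Group 264 pm_gct_noslot, GCT no slot events",
    "event:0X1080 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000",
    "event:0X1081 mmcr0:0X00000000 mmcr1:0X00400000F908021B mmcra:0X00000000",
    "event:0X1090 mmcr0:0X00000000 mmcr1:0X000000001D3C021C mmcra:0X00000000",
]
bad_groups = inconsistent_groups(sample)
```

A non-empty result would point at a group whose lines were edited inconsistently, which is easy to do by hand in the manually added groups.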
diff --git a/events/ppc64/power7/events b/events/ppc64/power7/events
index 10775a0..851cb93 100644
--- a/events/ppc64/power7/events
+++ b/events/ppc64/power7/events
@@ -5,7 +5,11 @@
# Contributed by Maynard Johnson .
#
#
-# Only events within the same group can be selected simultaneously.
+# Only events within the same group can be selected simultaneously when
+# using legacy opcontrol to do profiling. When profiling with operf,
+# events from different groups may be specified, and the Linux Performance
+# Events Kernel Subsystem code will handle the necessary multiplexing.
+#
# Each event is given a unique event number. The event number is used by the
# OProfile code to resolve event names for the post-processing. This is done
# to preserve compatibility with the rest of the OProfile code. The event
@@ -14,10 +18,6 @@
#Group Default
event:0X001 counters:0 um:zero minimum:10000 name:CYCLES : Processor Cycles
-#Group 0 with random sampling
-event:0X002 counters:3 um:zero minimum:10000 name:CYCLES_RND_SMPL : Processor Cycles with random sampling
-
-
#Group 1 pm_utilization, CPI and utilization data
event:0X0010 counters:0 um:zero minimum:10000 name:PM_CYC_GRP1 : (Group 1 pm_utilization) Processor Cycles
event:0X0011 counters:1 um:zero minimum:10000 name:PM_RUN_CYC_GRP1 : (Group 1 pm_utilization) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
@@ -2121,3 +2121,48 @@ event:0X1072 counters:2 um:zero minimum:10000 name:PM_INST_CMPL_GRP263 : (Group
event:0X1073 counters:3 um:zero minimum:1000 name:PM_MRK_LSU_FIN_GRP263 : (Group 263 pm_mrk_misc8) One of the Load/Store Units finished a marked instruction. Instructions that finish may not necessary complete
event:0X1074 counters:4 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP263 : (Group 263 pm_mrk_misc8) Number of run instructions completed.
event:0X1075 counters:5 um:zero minimum:10000 name:PM_RUN_CYC_GRP263 : (Group 263 pm_mrk_misc8) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
+
+#Group 264 pm_gct_noslot, GCT no slot events
+###### DO NOT REMOVE ######
+# Manually added group
+event:0X1080 counters:0 um:zero minimum:1000 name:PM_GCT_NOSLOT_CYC_EDGE_COUNT_GRP264 : (Group 264 pm_gct_noslot) Number of distinct occurrences when the Global Completion Table has no slots from this thread.
+event:0X1081 counters:1 um:zero minimum:1000 name:PM_GCT_EMPTY_CYC_GRP264 : (Group 264 pm_gct_noslot) Cycles when the Global Completion Table was completely empty. No thread had an entry allocated.
+event:0X1082 counters:2 um:zero minimum:10000 name:PM_INST_CMPL_GRP264 : (Group 264 pm_gct_noslot) Number of PowerPC Instructions that completed.
+event:0X1083 counters:3 um:zero minimum:1000 name:PM_GCT_NOSLOT_BR_MPRED_EDGE_COUNT_GRP264 : (Group 264 pm_gct_noslot) Number of distinct occurrences when the Global Completion Table has no slots from this thread because of a branch misprediction.
+event:0X1084 counters:4 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP264 : (Group 264 pm_gct_noslot) Number of run instructions completed.
+event:0X1085 counters:5 um:zero minimum:10000 name:PM_RUN_CYC_GRP264 : (Group 264 pm_gct_noslot) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
+
+#Group 265 pm_cmplu_stall, CMPLU stall events
+###### DO NOT REMOVE ######
+event:0X1090 counters:0 um:zero minimum:1000 name:PM_CMPLU_STALL_THRD_EDGE_COUNT_GRP265 : (Group 265 pm_cmplu_stall) Number of distinct occurrences when completion stalled due to thread conflict. Group ready to complete but it was another thread's turn
+event:0X1091 counters:1 um:zero minimum:1000 name:PM_CMPLU_STALL_DFU_GRP265 : (Group 265 pm_cmplu_stall) Completion stall caused by Decimal Floating Point Unit
+event:0X1092 counters:2 um:zero minimum:10000 name:PM_INST_CMPL_GRP265 : (Group 265 pm_cmplu_stall) Number of PowerPC Instructions that completed.
+event:0X1093 counters:3 um:zero minimum:1000 name:PM_GCT_NOSLOT_BR_MPRED_IC_MISS_GRP265 : (Group 265 pm_cmplu_stall) No slot in GCT caused by branch mispredict or I cache miss
+event:0X1094 counters:4 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP265 : (Group 265 pm_cmplu_stall) Number of run instructions completed.
+event:0X1095 counters:5 um:zero minimum:10000 name:PM_RUN_CYC_GRP265 : (Group 265 pm_cmplu_stall) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
+
+#Group 266 pm_cmplu_stall2, CMPLU stall (vector)
+###### DO NOT REMOVE ######
+event:0X10A0 counters:0 um:zero minimum:1000 name:PM_CMPLU_STALL_END_GCT_NOSLOT_GRP266 : (Group 266 pm_cmplu_stall2) Count ended because GCT went empty
+event:0X10A1 counters:1 um:zero minimum:1000 name:PM_CMPLU_STALL_VECTOR_EDGE_COUNT_GRP266 : (Group 266 pm_cmplu_stall2) Number of distinct occurrences when completion stalled caused by Vector instruction
+event:0X10A2 counters:2 um:zero minimum:1000 name:PM_MRK_STALL_CMPLU_CYC_COUNT_GRP266 : (Group 266 pm_cmplu_stall2) Marked Group Completion Stall cycles (use edge detect to count #)
+event:0X10A3 counters:3 um:zero minimum:1000 name:PM_CMPLU_STALL_EDGE_COUNT_GRP266 : (Group 266 pm_cmplu_stall2) Number of distinct occurrences when no groups completed, GCT not empty
+event:0X10A4 counters:4 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP266 : (Group 266 pm_cmplu_stall2) Number of run instructions completed.
+event:0X10A5 counters:5 um:zero minimum:10000 name:PM_RUN_CYC_GRP266 : (Group 266 pm_cmplu_stall2) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
+
+#Group 267 pm_cmplu_stall3, CMPLU stall (scalar)
+###### DO NOT REMOVE ######
+event:0X10B0 counters:0 um:zero minimum:1000 name:PM_FLOP_GRP267 : (Group 267 pm_cmplu_stall3) A floating point operation has completed
+event:0X10B1 counters:1 um:zero minimum:1000 name:PM_CMPLU_STALL_SCALAR_LONG_GRP267 : (Group 267 pm_cmplu_stall3) Completion stall caused by long latency scalar instruction
+event:0X10B2 counters:2 um:zero minimum:1000 name:PM_MRK_STALL_CMPLU_CYC_GRP267 : (Group 267 pm_cmplu_stall3) Marked Group Completion Stall cycles
+event:0X10B3 counters:3 um:zero minimum:1000 name:PM_CMPLU_STALL_SCALAR_EDGE_COUNT_GRP267 : (Group 267 pm_cmplu_stall3) Number of distinct occurrences when completion stalled caused by FPU instruction
+event:0X10B4 counters:4 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP267 : (Group 267 pm_cmplu_stall3) Number of run instructions completed.
+event:0X10B5 counters:5 um:zero minimum:10000 name:PM_RUN_CYC_GRP267 : (Group 267 pm_cmplu_stall3) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
+
+#Group 268 pm_cmplu_ifu, IFU stall
+event:0X10C0 counters:0 um:zero minimum:1000 name:PM_CMPLU_STALL_END_GCT_NOSLOT_GRP268 : (Group 268 pm_cmplu_ifu) Count ended because GCT went empty
+event:0X10C1 counters:1 um:zero minimum:1000 name:PM_LSU0_L1_SW_PREF_GRP268 : (Group 268 pm_cmplu_ifu) LSU0 Software L1 Prefetches, including SW Transient Prefetches
+event:0X10C2 counters:2 um:zero minimum:1000 name:PM_LSU1_L1_SW_PREF_GRP268 : (Group 268 pm_cmplu_ifu) LSU1 Software L1 Prefetches, including SW Transient Prefetches
+event:0X10C3 counters:3 um:zero minimum:1000 name:PM_CMPLU_STALL_IFU_EDGE_COUNT_GRP268 : (Group 268 pm_cmplu_ifu) Number of distinct occurrences when completion stalled due to IFU
+event:0X10C4 counters:4 um:zero minimum:1000 name:PM_RUN_INST_CMPL_GRP268 : (Group 268 pm_cmplu_ifu) Number of run instructions completed.
+event:0X10C5 counters:5 um:zero minimum:10000 name:PM_RUN_CYC_GRP268 : (Group 268 pm_cmplu_ifu) Processor Cycles gated by the run latch. Operating systems use the run latch to indicate when they are doing useful work. The run latch is typically cleared in the OS idle loop. Gating by the run latch filters out the idle loop.
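Reviewer note: the `events` files touched above share one line layout: `event:<hex> counters:<list> um:<mask name> minimum:<count> name:<NAME> : <description>`. A minimal Python sketch of a reader for that layout follows. It is an illustration only, with hypothetical names, and is not the parser OProfile itself uses. The field grammar is inferred from the lines in this patch. Lines are expected without the leading diff marker.

```python
import re

# Hypothetical pattern inferred from the event lines shown in this patch.
EVENT_RE = re.compile(
    r"^event:(?P<code>0[xX][0-9a-fA-F]+)\s+"
    r"counters:(?P<counters>[\d,]+)\s+"
    r"um:(?P<um>\S+)\s+"
    r"minimum:(?P<minimum>\d+)\s+"
    r"name:(?P<name>\S+)\s*:\s*(?P<desc>.*)$"
)


def parse_event_line(line):
    """Split one events-file line into a dict; return None if it is not an event line."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None  # comment, blank line, or include: directive
    return {
        "code": int(match.group("code"), 16),
        "counters": [int(c) for c in match.group("counters").split(",")],
        "um": match.group("um"),
        "minimum": int(match.group("minimum")),
        "name": match.group("name"),
        "description": match.group("desc").strip(),
    }


grouped = parse_event_line(
    "event:0X10C3 counters:3 um:zero minimum:1000 "
    "name:PM_CMPLU_STALL_IFU_EDGE_COUNT_GRP268 : (Group 268 pm_cmplu_ifu) "
    "Number of distinct occurrences when completion stalled due to IFU"
)
ungrouped = parse_event_line(
    "event:0x4082 counters:0,1,2,3 um:zero minimum:10000 "
    "name:PM_BANK_CONFLICT : Read blocked due to interleave conflict."
)
```

The second sample uses the POWER8 style, where one event may list several counters instead of being pinned to a single counter within a group.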
diff --git a/events/ppc64/power8/events b/events/ppc64/power8/events
new file mode 100644
index 0000000..6e4e688
--- /dev/null
+++ b/events/ppc64/power8/events
@@ -0,0 +1,1020 @@
+#
+# Copyright OProfile authors
+# Copyright (c) International Business Machines, 2013.
+# Contributed by Maynard Johnson .
+#
+# IBM POWER8 Events
+
+include:ppc64/architected_events_v1
+
+event:0x1f05e counters:0 um:zero minimum:100000 name:PM_1LPAR_CYC : Number of cycles in single lpar mode.
+event:0x2006e counters:1 um:zero minimum:10000 name:PM_2LPAR_CYC : Number of cycles in 2 lpar mode.
+event:0x4e05e counters:3 um:zero minimum:100000 name:PM_4LPAR_CYC : Number of cycles in 4 LPAR mode.
+event:0x610050 counters:0 um:zero minimum:10000 name:PM_ALL_CHIP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for all data types (demand load,data,inst prefetch,inst fetch,xlate (I or d))
+event:0x520050 counters:1 um:zero minimum:10000 name:PM_ALL_GRP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)
+event:0x620052 counters:1 um:zero minimum:10000 name:PM_ALL_GRP_PUMP_MPRED : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro
+event:0x610052 counters:0 um:zero minimum:10000 name:PM_ALL_GRP_PUMP_MPRED_RTY : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)
+event:0x610054 counters:0 um:zero minimum:10000 name:PM_ALL_PUMP_CPRED : Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)
+event:0x640052 counters:3 um:zero minimum:10000 name:PM_ALL_PUMP_MPRED : Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)
+event:0x630050 counters:2 um:zero minimum:10000 name:PM_ALL_SYS_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)
+event:0x630052 counters:2 um:zero minimum:10000 name:PM_ALL_SYS_PUMP_MPRED : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or
+event:0x640050 counters:3 um:zero minimum:10000 name:PM_ALL_SYS_PUMP_MPRED_RTY : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate)
+event:0x2505e counters:1 um:zero minimum:10000 name:PM_BACK_BR_CMPL : Branch instruction completed with a target address less than current instruction address.
+event:0x4082 counters:0,1,2,3 um:zero minimum:10000 name:PM_BANK_CONFLICT : Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle.
+event:0x10068 counters:0 um:zero minimum:10000 name:PM_BRU_FIN : Branch Instruction Finished.
+event:0x20036 counters:1 um:zero minimum:10000 name:PM_BR_2PATH : two path branch.
+event:0x5086 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_BC_8 : Pairable BC+8 branch that has not been converted to a Resolve Finished in the BRU pipeline
+event:0x5084 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_BC_8_CONV : Pairable BC+8 branch that was converted to a Resolve Finished in the BRU pipeline.
+event:0x40060 counters:3 um:zero minimum:10000 name:PM_BR_CMPL : Branch Instruction completed.
+event:0x40ac counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_MPRED_CCACHE : Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction
+event:0x40b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_MPRED_CR : Conditional Branch Completed that was Mispredicted due to the BHT Direction Prediction (taken/not taken).
+event:0x40ae counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_MPRED_LSTACK : Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction
+event:0x40ba counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_MPRED_TA : Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. Only XL-form branches that resolved Taken set this event.
+event:0x10138 counters:0 um:zero minimum:10000 name:PM_BR_MRK_2PATH : marked two path branch.
+event:0x409c counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_BR0 : Conditional Branch Completed on BR0 (1st branch in group) in which the HW predicted the Direction or Target
+event:0x409e counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_BR1 : Conditional Branch Completed on BR1 (2nd branch in group) in which the HW predicted the Direction or Target. Note: BR1 can only be used in Single Thread Mode. In all of the SMT modes, only one branch can complete, thus BR1 is unused.
+event:0x489c counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_BR_CMPL : IFU
+event:0x40a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_CCACHE_BR0 : Conditional Branch Completed on BR0 that used the Count Cache for Target Prediction
+event:0x40a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_CCACHE_BR1 : Conditional Branch Completed on BR1 that used the Count Cache for Target Prediction
+event:0x48a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_CCACHE_CMPL : IFU
+event:0x40b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_CR_BR0 : Conditional Branch Completed on BR0 that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and bra
+event:0x40b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_CR_BR1 : Conditional Branch Completed on BR1 that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and bra
+event:0x48b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_CR_CMPL : IFU
+event:0x40a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_LSTACK_BR0 : Conditional Branch Completed on BR0 that used the Link Stack for Target Prediction
+event:0x40aa counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_LSTACK_BR1 : Conditional Branch Completed on BR1 that used the Link Stack for Target Prediction
+event:0x48a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_LSTACK_CMPL : IFU
+event:0x40b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_TA_BR0 : Conditional Branch Completed on BR0 that had its target address predicted. Only XL-form branches set this event.
+event:0x40b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_TA_BR1 : Conditional Branch Completed on BR1 that had its target address predicted. Only XL-form branches set this event.
+event:0x48b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_PRED_TA_CMPL : IFU
+event:0x40a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_UNCOND_BR0 : Unconditional Branch Completed on BR0. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.
+event:0x40a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_UNCOND_BR1 : Unconditional Branch Completed on BR1. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve.
+event:0x48a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_BR_UNCOND_CMPL : IFU
+event:0x3094 counters:0,1,2,3 um:zero minimum:10000 name:PM_CASTOUT_ISSUED : Castouts issued
+event:0x3096 counters:0,1,2,3 um:zero minimum:10000 name:PM_CASTOUT_ISSUED_GPR : Castouts issued GPR
+event:0x10050 counters:0 um:zero minimum:10000 name:PM_CHIP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for all data types (demand load, data, inst prefetch, inst fetch, xlate (I or d)).
+event:0x2090 counters:0,1,2,3 um:zero minimum:10000 name:PM_CLB_HELD : CLB Hold: Any Reason
+event:0x4000a counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL : Completion stall.
+event:0x4d018 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_BRU : Completion stall due to a Branch Unit.
+event:0x2d018 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_BRU_CRU : Completion stall due to IFU.
+event:0x30026 counters:2 um:zero minimum:10000 name:PM_CMPLU_STALL_COQ_FULL : Completion stall due to CO q full.
+event:0x2c012 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_DCACHE_MISS : Completion stall by Dcache miss.
+event:0x2c018 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_DMISS_L21_L31 : Completion stall by Dcache miss which resolved on chip (excluding local L2/L3).
+event:0x2c016 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_DMISS_L2L3 : Completion stall by Dcache miss which resolved in L2/L3.
+event:0x4c016 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_DMISS_L2L3_CONFLICT : Completion stall due to cache miss resolving in core's L2/L3 with a conflict.
+event:0x4c01a counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_DMISS_L3MISS : Completion stall due to cache miss resolving missed the L3.
+event:0x4c018 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_DMISS_LMEM : Completion stall due to cache miss resolving in core's Local Memory.
+event:0x2c01c counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_DMISS_REMOTE : Completion stall by Dcache miss which resolved on chip (excluding local L2/L3).
+event:0x4c012 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_ERAT_MISS : Completion stall due to LSU reject ERAT miss.
+event:0x30038 counters:2 um:zero minimum:10000 name:PM_CMPLU_STALL_FLUSH : completion stall due to flush by own thread.
+event:0x4d016 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_FXLONG : Completion stall due to a long latency fixed point instruction.
+event:0x2d016 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_FXU : Completion stall due to FXU.
+event:0x30036 counters:2 um:zero minimum:10000 name:PM_CMPLU_STALL_HWSYNC : completion stall due to hwsync.
+event:0x4d014 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_LOAD_FINISH : Completion stall due to a Load finish.
+event:0x2c010 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_LSU : Completion stall by LSU instruction.
+event:0x10036 counters:0 um:zero minimum:10000 name:PM_CMPLU_STALL_LWSYNC : completion stall due to isync/lwsync.
+event:0x30028 counters:2 um:zero minimum:10000 name:PM_CMPLU_STALL_MEM_ECC_DELAY : Completion stall due to mem ECC delay.
+event:0x2e01c counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_NO_NTF : Completion stall due to nop.
+event:0x2e01e counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_NTCG_FLUSH : Completion stall due to reject (load hit store).
+event:0x30006 counters:2 um:zero minimum:10000 name:PM_CMPLU_STALL_OTHER_CMPL : Instructions core completed while this thread was stalled.
+event:0x4c010 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_REJECT : Completion stall due to LSU reject.
+event:0x2c01a counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_REJECT_LHS : Completion stall due to reject (load hit store).
+event:0x4c014 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_REJ_LMQ_FULL : Completion stall due to LSU reject LMQ full.
+event:0x4d010 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_SCALAR : Completion stall due to VSU scalar instruction.
+event:0x2d010 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_SCALAR_LONG : Completion stall due to VSU scalar long latency instruction.
+event:0x2c014 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_STORE : Completion stall by stores.
+event:0x4c01c counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_ST_FWD : Completion stall due to store forward.
+event:0x1001c counters:0 um:zero minimum:10000 name:PM_CMPLU_STALL_THRD : Completion stall due to thread conflict.
+event:0x2d014 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_VECTOR : Completion stall due to VSU vector instruction.
+event:0x4d012 counters:3 um:zero minimum:10000 name:PM_CMPLU_STALL_VECTOR_LONG : Completion stall due to VSU vector long instruction.
+event:0x2d012 counters:1 um:zero minimum:10000 name:PM_CMPLU_STALL_VSU : Completion stall due to VSU instruction.
+event:0x16083 counters:0 um:zero minimum:10000 name:PM_CO0_ALLOC : 0.0
+event:0x16082 counters:0 um:zero minimum:10000 name:PM_CO0_BUSY : CO mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)
+event:0x517082 counters:0 um:zero minimum:10000 name:PM_CO_DISP_FAIL : CO dispatch failed due to all CO machines being busy
+event:0x527084 counters:1 um:zero minimum:10000 name:PM_CO_TM_SC_FOOTPRINT : L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3)
+event:0x3608a counters:2 um:zero minimum:10000 name:PM_CO_USAGE : Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running
+event:0x40066 counters:3 um:zero minimum:10000 name:PM_CRU_FIN : IFU Finished a (non-branch) instruction.
+event:0x61c050 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_CHIP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for a demand load
+event:0x64c048 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_DL2L3_MOD : The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c048 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_DL2L3_SHR : The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c04c counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_DL4 : The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c04c counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_DMEM : The processor's data cache was reloaded from another chip's memory on a different Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c042 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L2 : The processor's data cache was reloaded from local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c046 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L21_MOD : The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c046 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L21_SHR : The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c04e counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L2MISS_MOD : The processor's data cache was reloaded from a location other than the local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c040 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L2_DISP_CONFLICT_LDHITST : The processor's data cache was reloaded from local core's L2 with load hit store conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c040 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L2_DISP_CONFLICT_OTHER : The processor's data cache was reloaded from local core's L2 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c040 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L2_MEPF : The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c040 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L2_NO_CONFLICT : The processor's data cache was reloaded from local core's L2 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c042 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L3 : The processor's data cache was reloaded from local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c044 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L31_ECO_MOD : The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c044 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L31_ECO_SHR : The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c044 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L31_MOD : The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c046 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L31_SHR : The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c04e counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L3MISS_MOD : The processor's data cache was reloaded from a location other than the local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c042 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L3_DISP_CONFLICT : The processor's data cache was reloaded from local core's L3 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c042 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L3_MEPF : The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c044 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_L3_NO_CONFLICT : The processor's data cache was reloaded from local core's L3 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c04c counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_LL4 : The processor's data cache was reloaded from the local chip's L4 cache due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c048 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_LMEM : The processor's data cache was reloaded from the local chip's Memory due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c04c counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_MEMORY : The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x64c04a counters:3 um:zero minimum:10000 name:PM_DATA_ALL_FROM_OFF_CHIP_CACHE : The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c048 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_ON_CHIP_CACHE : The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c046 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_RL2L3_MOD : The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x61c04a counters:0 um:zero minimum:10000 name:PM_DATA_ALL_FROM_RL2L3_SHR : The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c04a counters:1 um:zero minimum:10000 name:PM_DATA_ALL_FROM_RL4 : The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x63c04a counters:2 um:zero minimum:10000 name:PM_DATA_ALL_FROM_RMEM : The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1
+event:0x62c050 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_GRP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was group pump for a demand load
+event:0x62c052 counters:1 um:zero minimum:10000 name:PM_DATA_ALL_GRP_PUMP_MPRED : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro
+event:0x61c052 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_GRP_PUMP_MPRED_RTY : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for a demand load
+event:0x61c054 counters:0 um:zero minimum:10000 name:PM_DATA_ALL_PUMP_CPRED : Pump prediction correct. Counts across all types of pumps for a demand load
+event:0x64c052 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_PUMP_MPRED : Pump misprediction. Counts across all types of pumps for a demand load
+event:0x63c050 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_SYS_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was system pump for a demand load
+event:0x63c052 counters:2 um:zero minimum:10000 name:PM_DATA_ALL_SYS_PUMP_MPRED : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or
+event:0x64c050 counters:3 um:zero minimum:10000 name:PM_DATA_ALL_SYS_PUMP_MPRED_RTY : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for a demand load
+event:0x1c050 counters:0 um:zero minimum:10000 name:PM_DATA_CHIP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for a demand load.
+event:0x4c048 counters:3 um:zero minimum:10000 name:PM_DATA_FROM_DL2L3_MOD : The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c048 counters:2 um:zero minimum:10000 name:PM_DATA_FROM_DL2L3_SHR : The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c04c counters:2 um:zero minimum:10000 name:PM_DATA_FROM_DL4 : The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c04c counters:3 um:zero minimum:10000 name:PM_DATA_FROM_DMEM : The processor's data cache was reloaded from another chip's memory on a different Node or Group (Distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c042 counters:0 um:zero minimum:10000 name:PM_DATA_FROM_L2 : The processor's data cache was reloaded from local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c046 counters:3 um:zero minimum:10000 name:PM_DATA_FROM_L21_MOD : The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c046 counters:2 um:zero minimum:10000 name:PM_DATA_FROM_L21_SHR : The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c04e counters:0 um:zero minimum:10000 name:PM_DATA_FROM_L2MISS_MOD : The processor's data cache was reloaded from a location other than the local core's L2 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c040 counters:2 um:zero minimum:10000 name:PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST : The processor's data cache was reloaded from local core's L2 with load hit store conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c040 counters:3 um:zero minimum:10000 name:PM_DATA_FROM_L2_DISP_CONFLICT_OTHER : The processor's data cache was reloaded from local core's L2 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c040 counters:1 um:zero minimum:10000 name:PM_DATA_FROM_L2_MEPF : The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c040 counters:0 um:zero minimum:10000 name:PM_DATA_FROM_L2_NO_CONFLICT : The processor's data cache was reloaded from local core's L2 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c042 counters:3 um:zero minimum:10000 name:PM_DATA_FROM_L3 : The processor's data cache was reloaded from local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c044 counters:3 um:zero minimum:10000 name:PM_DATA_FROM_L31_ECO_MOD : The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c044 counters:2 um:zero minimum:10000 name:PM_DATA_FROM_L31_ECO_SHR : The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c044 counters:1 um:zero minimum:10000 name:PM_DATA_FROM_L31_MOD : The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c046 counters:0 um:zero minimum:10000 name:PM_DATA_FROM_L31_SHR : The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c04e counters:3 um:zero minimum:10000 name:PM_DATA_FROM_L3MISS_MOD : The processor's data cache was reloaded from a location other than the local core's L3 due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c042 counters:2 um:zero minimum:10000 name:PM_DATA_FROM_L3_DISP_CONFLICT : The processor's data cache was reloaded from local core's L3 with dispatch conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c042 counters:1 um:zero minimum:10000 name:PM_DATA_FROM_L3_MEPF : The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c044 counters:0 um:zero minimum:10000 name:PM_DATA_FROM_L3_NO_CONFLICT : The processor's data cache was reloaded from local core's L3 without conflict due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c04c counters:0 um:zero minimum:10000 name:PM_DATA_FROM_LL4 : The processor's data cache was reloaded from the local chip's L4 cache due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c048 counters:1 um:zero minimum:10000 name:PM_DATA_FROM_LMEM : The processor's data cache was reloaded from the local chip's Memory due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c04c counters:1 um:zero minimum:10000 name:PM_DATA_FROM_MEMORY : The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x4c04a counters:3 um:zero minimum:10000 name:PM_DATA_FROM_OFF_CHIP_CACHE : The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c048 counters:0 um:zero minimum:10000 name:PM_DATA_FROM_ON_CHIP_CACHE : The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c046 counters:1 um:zero minimum:10000 name:PM_DATA_FROM_RL2L3_MOD : The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x1c04a counters:0 um:zero minimum:10000 name:PM_DATA_FROM_RL2L3_SHR : The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c04a counters:1 um:zero minimum:10000 name:PM_DATA_FROM_RL4 : The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x3c04a counters:2 um:zero minimum:10000 name:PM_DATA_FROM_RMEM : The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either only demand loads or demand loads plus prefetches if MMCR1[16] is 1.
+event:0x2c050 counters:1 um:zero minimum:10000 name:PM_DATA_GRP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was group pump for a demand load.
+event:0x2c052 counters:1 um:zero minimum:10000 name:PM_DATA_GRP_PUMP_MPRED : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro
+event:0x1c052 counters:0 um:zero minimum:10000 name:PM_DATA_GRP_PUMP_MPRED_RTY : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for a demand load.
+event:0x1c054 counters:0 um:zero minimum:10000 name:PM_DATA_PUMP_CPRED : Pump prediction correct. Counts across all types of pumps for a demand load.
+event:0x4c052 counters:3 um:zero minimum:10000 name:PM_DATA_PUMP_MPRED : Pump misprediction. Counts across all types of pumps for a demand load.
+event:0x3c050 counters:2 um:zero minimum:10000 name:PM_DATA_SYS_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was system pump for a demand load.
+event:0x3c052 counters:2 um:zero minimum:10000 name:PM_DATA_SYS_PUMP_MPRED : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or
+event:0x4c050 counters:3 um:zero minimum:10000 name:PM_DATA_SYS_PUMP_MPRED_RTY : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for a demand load.
+event:0x3001a counters:2 um:zero minimum:10000 name:PM_DATA_TABLEWALK_CYC : Data Tablewalk Active.
+event:0xe0bc counters:0,1,2,3 um:zero minimum:10000 name:PM_DC_COLLISIONS : DATA Cache collisions
+event:0x1e050 counters:0 um:zero minimum:10000 name:PM_DC_PREF_STREAM_ALLOC : Stream marked valid. The stream could have been allocated through the hardware prefetch mechanism or through software. This is combined ls0 and ls1.
+event:0x2e050 counters:1 um:zero minimum:10000 name:PM_DC_PREF_STREAM_CONF : A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Combine up + down.
+event:0x4e050 counters:3 um:zero minimum:10000 name:PM_DC_PREF_STREAM_FUZZY_CONF : A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up).
+event:0x3e050 counters:2 um:zero minimum:10000 name:PM_DC_PREF_STREAM_STRIDED_CONF : A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.
+event:0x4c054 counters:3 um:zero minimum:10000 name:PM_DERAT_MISS_16G : Data ERAT Miss (Data TLB Access) page size 16G.
+event:0x3c054 counters:2 um:zero minimum:10000 name:PM_DERAT_MISS_16M : Data ERAT Miss (Data TLB Access) page size 16M.
+event:0x1c056 counters:0 um:zero minimum:10000 name:PM_DERAT_MISS_4K : Data ERAT Miss (Data TLB Access) page size 4K.
+event:0x2c054 counters:1 um:zero minimum:10000 name:PM_DERAT_MISS_64K : Data ERAT Miss (Data TLB Access) page size 64K.
+event:0xb0ba counters:0,1,2,3 um:zero minimum:10000 name:PM_DFU : Finish DFU (all finish)
+event:0xb0be counters:0,1,2,3 um:zero minimum:10000 name:PM_DFU_DCFFIX : Convert from fixed opcode finish (dcffix,dcffixq)
+event:0xb0bc counters:0,1,2,3 um:zero minimum:10000 name:PM_DFU_DENBCD : BCD->DPD opcode finish (denbcd, denbcdq)
+event:0xb0b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_DFU_MC : Finish DFU multicycle
+event:0x2092 counters:0,1,2,3 um:zero minimum:10000 name:PM_DISP_CLB_HELD_BAL : Dispatch/CLB Hold: Balance
+event:0x2094 counters:0,1,2,3 um:zero minimum:10000 name:PM_DISP_CLB_HELD_RES : Dispatch/CLB Hold: Resource
+event:0x20a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_DISP_CLB_HELD_SB : Dispatch/CLB Hold: Scoreboard
+event:0x2098 counters:0,1,2,3 um:zero minimum:10000 name:PM_DISP_CLB_HELD_SYNC : Dispatch/CLB Hold: Sync type instruction
+event:0x2096 counters:0,1,2,3 um:zero minimum:10000 name:PM_DISP_CLB_HELD_TLBIE : Dispatch Hold: Due to TLBIE
+event:0x10006 counters:0 um:zero minimum:10000 name:PM_DISP_HELD : Dispatch Held.
+event:0x20006 counters:1 um:zero minimum:10000 name:PM_DISP_HELD_IQ_FULL : Dispatch held due to Issue q full.
+event:0x1002a counters:0 um:zero minimum:10000 name:PM_DISP_HELD_MAP_FULL : Dispatch held due to Mapper full.
+event:0x30018 counters:2 um:zero minimum:10000 name:PM_DISP_HELD_SRQ_FULL : Dispatch held due to SRQ no room.
+event:0x4003c counters:3 um:zero minimum:10000 name:PM_DISP_HELD_SYNC_HOLD : Dispatch held due to SYNC hold.
+event:0x30a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_DISP_HOLD_GCT_FULL : Dispatch Hold Due to no space in the GCT
+event:0x30008 counters:2 um:zero minimum:10000 name:PM_DISP_WT : Dispatched Starved (not held, nothing to dispatch).
+event:0x4e048 counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_DL2L3_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a data side request.
+event:0x3e048 counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_DL2L3_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a data side request.
+event:0x3e04c counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_DL4 : A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request.
+event:0x4e04c counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_DMEM : A Page Table Entry was loaded into the TLB from another chip's memory on a different Node or Group (Distant) due to a data side request.
+event:0x1e042 counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_L2 : A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request.
+event:0x4e046 counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_L21_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request.
+event:0x3e046 counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_L21_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request.
+event:0x1e04e counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_L2MISS : A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request.
+event:0x3e040 counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST : A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to a data side request.
+event:0x4e040 counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_L2_DISP_CONFLICT_OTHER : A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to a data side request.
+event:0x2e040 counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_L2_MEPF : A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to a data side request.
+event:0x1e040 counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_L2_NO_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request.
+event:0x4e042 counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_L3 : A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request.
+event:0x4e044 counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_L31_ECO_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request.
+event:0x3e044 counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_L31_ECO_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request.
+event:0x2e044 counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_L31_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request.
+event:0x1e046 counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_L31_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request.
+event:0x4e04e counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_L3MISS : A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request.
+event:0x3e042 counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_L3_DISP_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request.
+event:0x2e042 counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_L3_MEPF : A Page Table Entry was loaded into the TLB from local core's L3 hit without dispatch conflicts on Mepf state due to a data side request.
+event:0x1e044 counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_L3_NO_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request.
+event:0x1e04c counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_LL4 : A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request.
+event:0x2e048 counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_LMEM : A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request.
+event:0x2e04c counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_MEMORY : A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a data side request.
+event:0x4e04a counters:3 um:zero minimum:10000 name:PM_DPTEG_FROM_OFF_CHIP_CACHE : A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request.
+event:0x1e048 counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_ON_CHIP_CACHE : A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on the same chip due to a data side request.
+event:0x2e046 counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_RL2L3_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group as this chip (Remote) due to a data side request.
+event:0x1e04a counters:0 um:zero minimum:10000 name:PM_DPTEG_FROM_RL2L3_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group as this chip (Remote) due to a data side request.
+event:0x2e04a counters:1 um:zero minimum:10000 name:PM_DPTEG_FROM_RL4 : A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a data side request.
+event:0x3e04a counters:2 um:zero minimum:10000 name:PM_DPTEG_FROM_RMEM : A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a data side request.
+event:0xd094 counters:0,1,2,3 um:zero minimum:10000 name:PM_DSLB_MISS : Data SLB Miss - Total of all segment sizes.
+event:0x1c058 counters:0 um:zero minimum:10000 name:PM_DTLB_MISS_16G : Data TLB Miss page size 16G.
+event:0x4c056 counters:3 um:zero minimum:10000 name:PM_DTLB_MISS_16M : Data TLB Miss page size 16M.
+event:0x2c056 counters:1 um:zero minimum:10000 name:PM_DTLB_MISS_4K : Data TLB Miss page size 4k.
+event:0x3c056 counters:2 um:zero minimum:10000 name:PM_DTLB_MISS_64K : Data TLB Miss page size 64K.
+event:0x50a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_EAT_FORCE_MISPRED : XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicted target to validate. This is a rare case that may occur when the EAT is full and a branch is
+event:0x4084 counters:0,1,2,3 um:zero minimum:10000 name:PM_EAT_FULL_CYC : Cycles No room in EAT. Set on bank conflict and case where no ibuffers available.
+event:0x2080 counters:0,1,2,3 um:zero minimum:10000 name:PM_EE_OFF_EXT_INT : Ee off and external interrupt
+event:0x20b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_FAV_TBEGIN : Dispatch time Favored tbegin
+event:0xa0ae counters:0,1,2,3 um:zero minimum:10000 name:PM_FLOP_SUM_SCALAR : flops summary scalar instructions
+event:0xa0ac counters:0,1,2,3 um:zero minimum:10000 name:PM_FLOP_SUM_VEC : flops summary vector instructions
+event:0x2084 counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_BR_MPRED : Flush caused by branch mispredict
+event:0x30012 counters:2 um:zero minimum:10000 name:PM_FLUSH_COMPLETION : Completion Flush.
+event:0x2082 counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_DISP : Dispatch flush
+event:0x208c counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_DISP_SB : Dispatch Flush: Scoreboard
+event:0x2088 counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_DISP_SYNC : Dispatch Flush: Sync
+event:0x208a counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_DISP_TLBIE : Dispatch Flush: TLBIE
+event:0x208e counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_LSU : Flush initiated by LSU
+event:0x2086 counters:0,1,2,3 um:zero minimum:10000 name:PM_FLUSH_PARTIAL : Partial flush
+event:0xa0b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_FPU0_FCONV : Convert instruction executed
+event:0xa0b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_FPU0_FEST : Estimate instruction executed
+event:0xa0b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_FPU0_FRSP : Round to single precision instruction executed
+event:0xa0b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_FPU1_FCONV : Convert instruction executed
+event:0xa0ba counters:0,1,2,3 um:zero minimum:10000 name:PM_FPU1_FEST : Estimate instruction executed
+event:0xa0b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_FPU1_FRSP : Round to single precision instruction executed
+event:0x3000c counters:2 um:zero minimum:10000 name:PM_FREQ_DOWN : Frequency is being slewed down due to Power Management.
+event:0x4000c counters:3 um:zero minimum:10000 name:PM_FREQ_UP : Frequency is being slewed up due to Power Management.
+event:0x50b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_TOC_GRP0_1 : One pair of instructions fused with TOC in Group0
+event:0x50ae counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_TOC_GRP0_2 : Two pairs of instructions fused with TOC in Group0
+event:0x50ac counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_TOC_GRP0_3 : Three pairs of instructions fused with TOC in Group0
+event:0x50b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_TOC_GRP1_1 : One pair of instructions fused with TOC in Group1
+event:0x50b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_VSX_GRP0_1 : One pair of instructions fused with VSX in Group0
+event:0x50b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_VSX_GRP0_2 : Two pairs of instructions fused with VSX in Group0
+event:0x50b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_VSX_GRP0_3 : Three pairs of instructions fused with VSX in Group0
+event:0x50ba counters:0,1,2,3 um:zero minimum:10000 name:PM_FUSION_VSX_GRP1_1 : One pair of instructions fused with VSX in Group1
+event:0x3000e counters:2 um:zero minimum:10000 name:PM_FXU0_BUSY_FXU1_IDLE : fxu0 busy and fxu1 idle.
+event:0x10004 counters:0 um:zero minimum:10000 name:PM_FXU0_FIN : FXU0 Finished.
+event:0x4000e counters:3 um:zero minimum:10000 name:PM_FXU1_BUSY_FXU0_IDLE : fxu0 idle and fxu1 busy.
+event:0x40004 counters:3 um:zero minimum:10000 name:PM_FXU1_FIN : FXU1 Finished.
+event:0x2000e counters:1 um:zero minimum:10000 name:PM_FXU_BUSY : fxu0 busy and fxu1 busy.
+event:0x1000e counters:0 um:zero minimum:10000 name:PM_FXU_IDLE : fxu0 idle and fxu1 idle.
+event:0x20008 counters:1 um:zero minimum:10000 name:PM_GCT_EMPTY_CYC : No itags assigned either thread (GCT Empty).
+event:0x30a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_MERGE : Group dispatched on a merged GCT empty. GCT entries can be merged only within the same thread
+event:0x4d01e counters:3 um:zero minimum:10000 name:PM_GCT_NOSLOT_BR_MPRED : Gct empty for this thread due to branch mispred.
+event:0x4d01a counters:3 um:zero minimum:10000 name:PM_GCT_NOSLOT_BR_MPRED_ICMISS : Gct empty for this thread due to Icache Miss and branch mispred.
+event:0x2d01e counters:1 um:zero minimum:10000 name:PM_GCT_NOSLOT_DISP_HELD_ISSQ : Gct empty for this thread due to dispatch hold on this thread due to Issue q full.
+event:0x4d01c counters:3 um:zero minimum:10000 name:PM_GCT_NOSLOT_DISP_HELD_MAP : Gct empty for this thread due to dispatch hold on this thread due to Mapper full.
+event:0x2e010 counters:1 um:zero minimum:10000 name:PM_GCT_NOSLOT_DISP_HELD_OTHER : Gct empty for this thread due to dispatch hold on this thread due to sync.
+event:0x2d01c counters:1 um:zero minimum:10000 name:PM_GCT_NOSLOT_DISP_HELD_SRQ : Gct empty for this thread due to dispatch hold on this thread due to SRQ full.
+event:0x4e010 counters:3 um:zero minimum:10000 name:PM_GCT_NOSLOT_IC_L3MISS : Gct empty for this thread due to icache L3 miss.
+event:0x2d01a counters:1 um:zero minimum:10000 name:PM_GCT_NOSLOT_IC_MISS : Gct empty for this thread due to Icache Miss.
+event:0x20a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_UTIL_11_14_ENTRIES : GCT Utilization 11-14 entries
+event:0x20a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_UTIL_15_17_ENTRIES : GCT Utilization 15-17 entries
+event:0x20a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_UTIL_18_ENTRIES : GCT Utilization 18+ entries
+event:0x209c counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_UTIL_1_2_ENTRIES : GCT Utilization 1-2 entries
+event:0x209e counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_UTIL_3_6_ENTRIES : GCT Utilization 3-6 entries
+event:0x20a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_GCT_UTIL_7_10_ENTRIES : GCT Utilization 7-10 entries
+event:0x1000a counters:0 um:zero minimum:10000 name:PM_GRP_BR_MPRED_NONSPEC : Group experienced Non-speculative br mispredict.
+event:0x30004 counters:2 um:zero minimum:100000 name:PM_GRP_CMPL : group completed.
+event:0x3000a counters:2 um:zero minimum:100000 name:PM_GRP_DISP : dispatch_success (Group Dispatched).
+event:0x1000c counters:0 um:zero minimum:10000 name:PM_GRP_IC_MISS_NONSPEC : Group experienced Non-speculative I cache miss.
+event:0x10130 counters:0 um:zero minimum:10000 name:PM_GRP_MRK : Instruction marked in idu.
+event:0x509c counters:0,1,2,3 um:zero minimum:10000 name:PM_GRP_NON_FULL_GROUP : GROUPs where we did not have 6 non branch instructions in the group(ST mode), in SMT mode 3 non branches
+event:0x20050 counters:1 um:zero minimum:10000 name:PM_GRP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).
+event:0x20052 counters:1 um:zero minimum:10000 name:PM_GRP_PUMP_MPRED : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro
+event:0x10052 counters:0 um:zero minimum:10000 name:PM_GRP_PUMP_MPRED_RTY : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).
+event:0x50a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_GRP_TERM_2ND_BRANCH : There were enough instructions in the Ibuffer, but 2nd branch ends group
+event:0x50a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_GRP_TERM_FPU_AFTER_BR : There were enough instructions in the Ibuffer, but FPU OP IN same group after a branch terminates a group, can't do partial flushes
+event:0x509e counters:0,1,2,3 um:zero minimum:10000 name:PM_GRP_TERM_NOINST : Do not fill every slot in the group, Not enough instructions in the Ibuffer. This includes cases where the group started with enough instructions, but some got knocked out by a cache miss or branch redirect (which would also empty the Ibuffer).
+event:0x50a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_GRP_TERM_OTHER : There were enough instructions in the Ibuffer, but the group terminated early for some other reason, most likely due to a First or Last.
+event:0x50a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_GRP_TERM_SLOT_LIMIT : There were enough instructions in the Ibuffer, but 3 src RA/RB/RC , 2 way crack caused a group termination
+event:0x2000a counters:1 um:zero minimum:10000 name:PM_HV_CYC : cycles in hypervisor mode.
+event:0x4086 counters:0,1,2,3 um:zero minimum:10000 name:PM_IBUF_FULL_CYC : Cycles No room in ibuff. Fully qualified transfer (if5 valid).
+event:0x10018 counters:0 um:zero minimum:10000 name:PM_IC_DEMAND_CYC : Demand ifetch pending.
+event:0x4098 counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_DEMAND_L2_BHT_REDIRECT : L2 I cache demand request due to BHT redirect, branch redirect ( 2 bubbles 3 cycles)
+event:0x409a counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_DEMAND_L2_BR_REDIRECT : L2 I cache demand request due to branch Mispredict ( 15 cycle path)
+event:0x4088 counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_DEMAND_REQ : Demand Instruction fetch request
+event:0x508a counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_INVALIDATE : Ic line invalidated
+event:0x4092 counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_PREF_CANCEL_HIT : Prefetch Canceled due to icache hit
+event:0x4094 counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_PREF_CANCEL_L2 : L2 Squashed request
+event:0x4090 counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_PREF_CANCEL_PAGE : Prefetch Canceled due to page boundary
+event:0x408a counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_PREF_REQ : Instruction prefetch requests
+event:0x408e counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_PREF_WRITE : Instruction prefetch written into IL1
+event:0x4096 counters:0,1,2,3 um:zero minimum:10000 name:PM_IC_RELOAD_PRIVATE : Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If RA does not match then invalidates and then brings it shared to other thread. In P7 line brought in private , then line was inv
+event:0x4006a counters:3 um:zero minimum:10000 name:PM_IERAT_RELOAD_16M : IERAT Reloaded (Miss) for a 16M page.
+event:0x20064 counters:1 um:zero minimum:10000 name:PM_IERAT_RELOAD_4K : IERAT Reloaded (Miss) for a 4k page.
+event:0x3006a counters:2 um:zero minimum:10000 name:PM_IERAT_RELOAD_64K : IERAT Reloaded (Miss) for a 64k page.
+event:0x3405e counters:2 um:zero minimum:10000 name:PM_IFETCH_THROTTLE : Cycles instruction fetch was throttled in IFU.
+event:0x5088 counters:0,1,2,3 um:zero minimum:10000 name:PM_IFU_L2_TOUCH : L2 touch to update MRU on a line
+event:0x514050 counters:0 um:zero minimum:10000 name:PM_INST_ALL_CHIP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for an instruction fetch
+event:0x544048 counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_DL2L3_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group than this chip (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x534048 counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_DL2L3_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group than this chip (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x53404c counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_DL4 : The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x54404c counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_DMEM : The processor's Instruction cache was reloaded from another chip's memory on a different Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x514042 counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_L2 : The processor's Instruction cache was reloaded from local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x544046 counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_L21_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x534046 counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_L21_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x51404e counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_L2MISS : The processor's Instruction cache was reloaded from a location other than the local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x534040 counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_L2_DISP_CONFLICT_LDHITST : The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x544040 counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_L2_DISP_CONFLICT_OTHER : The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x524040 counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_L2_MEPF : The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x514040 counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_L2_NO_CONFLICT : The processor's Instruction cache was reloaded from local core's L2 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x544042 counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_L3 : The processor's Instruction cache was reloaded from local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x544044 counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_L31_ECO_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x534044 counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_L31_ECO_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x524044 counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_L31_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x514046 counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_L31_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x54404e counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_L3MISS_MOD : The processor's Instruction cache was reloaded from a location other than the local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x534042 counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_L3_DISP_CONFLICT : The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x524042 counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_L3_MEPF : The processor's Instruction cache was reloaded from local core's L3 hit without dispatch conflicts on Mepf state due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x514044 counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_L3_NO_CONFLICT : The processor's Instruction cache was reloaded from local core's L3 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x51404c counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_LL4 : The processor's Instruction cache was reloaded from the local chip's L4 cache due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x524048 counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_LMEM : The processor's Instruction cache was reloaded from the local chip's Memory due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x52404c counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_MEMORY : The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x54404a counters:3 um:zero minimum:10000 name:PM_INST_ALL_FROM_OFF_CHIP_CACHE : The processor's Instruction cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x514048 counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_ON_CHIP_CACHE : The processor's Instruction cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x524046 counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_RL2L3_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group as this chip (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x51404a counters:0 um:zero minimum:10000 name:PM_INST_ALL_FROM_RL2L3_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group as this chip (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x52404a counters:1 um:zero minimum:10000 name:PM_INST_ALL_FROM_RL4 : The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x53404a counters:2 um:zero minimum:10000 name:PM_INST_ALL_FROM_RMEM : The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1
+event:0x524050 counters:1 um:zero minimum:10000 name:PM_INST_ALL_GRP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was group pump for an instruction fetch
+event:0x524052 counters:1 um:zero minimum:10000 name:PM_INST_ALL_GRP_PUMP_MPRED : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro
+event:0x514052 counters:0 um:zero minimum:10000 name:PM_INST_ALL_GRP_PUMP_MPRED_RTY : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for an instruction fetch
+event:0x514054 counters:0 um:zero minimum:10000 name:PM_INST_ALL_PUMP_CPRED : Pump prediction correct. Counts across all types of pumps for an instruction fetch
+event:0x544052 counters:3 um:zero minimum:10000 name:PM_INST_ALL_PUMP_MPRED : Pump Mis prediction Counts across all types of pumps for an instruction fetch
+event:0x534050 counters:2 um:zero minimum:10000 name:PM_INST_ALL_SYS_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was system pump for an instruction fetch
+event:0x534052 counters:2 um:zero minimum:10000 name:PM_INST_ALL_SYS_PUMP_MPRED : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or
+event:0x544050 counters:3 um:zero minimum:10000 name:PM_INST_ALL_SYS_PUMP_MPRED_RTY : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for an instruction fetch
+event:0x14050 counters:0 um:zero minimum:10000 name:PM_INST_CHIP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was chip pump (prediction=correct) for an instruction fetch.
+event:0x2 counters:0,1,2,3 um:zero minimum:100000 name:PM_INST_CMPL : PPC Instructions Finished (completed).
+event:0x44048 counters:3 um:zero minimum:10000 name:PM_INST_FROM_DL2L3_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group than this chip (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1.
+event:0x34048 counters:2 um:zero minimum:10000 name:PM_INST_FROM_DL2L3_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group than this chip (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1.
+event:0x3404c counters:2 um:zero minimum:10000 name:PM_INST_FROM_DL4 : The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1.
+event:0x4404c counters:3 um:zero minimum:10000 name:PM_INST_FROM_DMEM : The processor's Instruction cache was reloaded from another chip's memory on a different Node or Group (Distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1.
+event:0x4080 counters:0,1,2,3 um:zero minimum:10000 name:PM_INST_FROM_L1 : Instruction fetches from L1
+event:0x14042 counters:0 um:zero minimum:10000 name:PM_INST_FROM_L2 : The processor's Instruction cache was reloaded from local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x44046 counters:3 um:zero minimum:10000 name:PM_INST_FROM_L21_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x34046 counters:2 um:zero minimum:10000 name:PM_INST_FROM_L21_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x1404e counters:0 um:zero minimum:10000 name:PM_INST_FROM_L2MISS : The processor's Instruction cache was reloaded from a localtion other than the local core's L2 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x34040 counters:2 um:zero minimum:10000 name:PM_INST_FROM_L2_DISP_CONFLICT_LDHITST : The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x44040 counters:3 um:zero minimum:10000 name:PM_INST_FROM_L2_DISP_CONFLICT_OTHER : The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x24040 counters:1 um:zero minimum:10000 name:PM_INST_FROM_L2_MEPF : The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state. due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x14040 counters:0 um:zero minimum:10000 name:PM_INST_FROM_L2_NO_CONFLICT : The processor's Instruction cache was reloaded from local core's L2 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x44042 counters:3 um:zero minimum:10000 name:PM_INST_FROM_L3 : The processor's Instruction cache was reloaded from local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x44044 counters:3 um:zero minimum:10000 name:PM_INST_FROM_L31_ECO_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x34044 counters:2 um:zero minimum:10000 name:PM_INST_FROM_L31_ECO_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x24044 counters:1 um:zero minimum:10000 name:PM_INST_FROM_L31_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x14046 counters:0 um:zero minimum:10000 name:PM_INST_FROM_L31_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x4404e counters:3 um:zero minimum:10000 name:PM_INST_FROM_L3MISS_MOD : The processor's Instruction cache was reloaded from a localtion other than the local core's L3 due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x34042 counters:2 um:zero minimum:10000 name:PM_INST_FROM_L3_DISP_CONFLICT : The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x24042 counters:1 um:zero minimum:10000 name:PM_INST_FROM_L3_MEPF : The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x14044 counters:0 um:zero minimum:10000 name:PM_INST_FROM_L3_NO_CONFLICT : The processor's Instruction cache was reloaded from local core's L3 without conflict due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x1404c counters:0 um:zero minimum:10000 name:PM_INST_FROM_LL4 : The processor's Instruction cache was reloaded from the local chip's L4 cache due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x24048 counters:1 um:zero minimum:10000 name:PM_INST_FROM_LMEM : The processor's Instruction cache was reloaded from the local chip's Memory due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x2404c counters:1 um:zero minimum:10000 name:PM_INST_FROM_MEMORY : The processor's Instruction cache was reloaded from a memory location including L4 from local remote or distant due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x4404a counters:3 um:zero minimum:10000 name:PM_INST_FROM_OFF_CHIP_CACHE : The processor's Instruction cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x14048 counters:0 um:zero minimum:10000 name:PM_INST_FROM_ON_CHIP_CACHE : The processor's Instruction cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x24046 counters:1 um:zero minimum:10000 name:PM_INST_FROM_RL2L3_MOD : The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x1404a counters:0 um:zero minimum:10000 name:PM_INST_FROM_RL2L3_SHR : The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x2404a counters:1 um:zero minimum:10000 name:PM_INST_FROM_RL4 : The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x3404a counters:2 um:zero minimum:10000 name:PM_INST_FROM_RMEM : The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to either an instruction fetch or instruction fetch plus prefetch if MMCR1[17] is 1 .
+event:0x24050 counters:1 um:zero minimum:10000 name:PM_INST_GRP_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was group pump for an instruction fetch.
+event:0x24052 counters:1 um:zero minimum:10000 name:PM_INST_GRP_PUMP_MPRED : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope OR Final Pump Scope(Group) got data from source that was at smaller scope(Chip) Final pump was group pump and initial pump was chip or final and initial pump was gro
+event:0x14052 counters:0 um:zero minimum:10000 name:PM_INST_GRP_PUMP_MPRED_RTY : Final Pump Scope(Group) to get data sourced, ended up larger than Initial Pump Scope (Chip) Final pump was group pump and initial pump was chip pump for an instruction fetch.
+event:0x1003a counters:0 um:zero minimum:10000 name:PM_INST_IMC_MATCH_CMPL : IMC Match Count.
+event:0x30016 counters:2 um:zero minimum:10000 name:PM_INST_IMC_MATCH_DISP : IMC Matches dispatched.
+event:0x14054 counters:0 um:zero minimum:10000 name:PM_INST_PUMP_CPRED : Pump prediction correct. Counts across all types of pumps for an instruction fetch.
+event:0x44052 counters:3 um:zero minimum:10000 name:PM_INST_PUMP_MPRED : Pump misprediction. Counts across all types of pumps for an instruction fetch.
+event:0x34050 counters:2 um:zero minimum:10000 name:PM_INST_SYS_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was system pump for an instruction fetch.
+event:0x34052 counters:2 um:zero minimum:10000 name:PM_INST_SYS_PUMP_MPRED : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or
+event:0x44050 counters:3 um:zero minimum:10000 name:PM_INST_SYS_PUMP_MPRED_RTY : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for an instruction fetch.
+event:0x10014 counters:0 um:zero minimum:100000 name:PM_IOPS_CMPL : IOPS Completed.
+event:0x30014 counters:2 um:zero minimum:100000 name:PM_IOPS_DISP : IOPS dispatched.
+event:0x45048 counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_DL2L3_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request.
+event:0x35048 counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_DL2L3_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to an instruction side request.
+event:0x3504c counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_DL4 : A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to an instruction side request.
+event:0x4504c counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_DMEM : A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to an instruction side request.
+event:0x15042 counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_L2 : A Page Table Entry was loaded into the TLB from local core's L2 due to an instruction side request.
+event:0x45046 counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_L21_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to an instruction side request.
+event:0x35046 counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_L21_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to an instruction side request.
+event:0x1504e counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_L2MISS : A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request.
+event:0x35040 counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_L2_DISP_CONFLICT_LDHITST : A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to an instruction side request.
+event:0x45040 counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_L2_DISP_CONFLICT_OTHER : A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to an instruction side request.
+event:0x25040 counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_L2_MEPF : A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to an instruction side request.
+event:0x15040 counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_L2_NO_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to an instruction side request.
+event:0x45042 counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_L3 : A Page Table Entry was loaded into the TLB from local core's L3 due to an instruction side request.
+event:0x45044 counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_L31_ECO_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request.
+event:0x35044 counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_L31_ECO_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request.
+event:0x25044 counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_L31_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to an instruction side request.
+event:0x15046 counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_L31_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request.
+event:0x4504e counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_L3MISS : A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to an instruction side request.
+event:0x35042 counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_L3_DISP_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to an instruction side request.
+event:0x25042 counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_L3_MEPF : A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state due to an instruction side request.
+event:0x15044 counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_L3_NO_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to an instruction side request.
+event:0x1504c counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_LL4 : A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to an instruction side request.
+event:0x25048 counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_LMEM : A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request.
+event:0x2504c counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_MEMORY : A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to an instruction side request.
+event:0x4504a counters:3 um:zero minimum:10000 name:PM_IPTEG_FROM_OFF_CHIP_CACHE : A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request.
+event:0x15048 counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_ON_CHIP_CACHE : A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request.
+event:0x25046 counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_RL2L3_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request.
+event:0x1504a counters:0 um:zero minimum:10000 name:PM_IPTEG_FROM_RL2L3_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to an instruction side request.
+event:0x2504a counters:1 um:zero minimum:10000 name:PM_IPTEG_FROM_RL4 : A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to an instruction side request.
+event:0x3504a counters:2 um:zero minimum:10000 name:PM_IPTEG_FROM_RMEM : A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to an instruction side request.
+event:0x617082 counters:0 um:zero minimum:10000 name:PM_ISIDE_DISP : All i-side dispatch attempts
+event:0x627084 counters:1 um:zero minimum:10000 name:PM_ISIDE_DISP_FAIL : All i-side dispatch attempts that failed due to an addr collision with another machine
+event:0x627086 counters:1 um:zero minimum:10000 name:PM_ISIDE_DISP_FAIL_OTHER : All i-side dispatch attempts that failed due to a reason other than addrs collision
+event:0x4608e counters:3 um:zero minimum:10000 name:PM_ISIDE_L2MEMACC : valid when first beat of data comes in for an i-side fetch where data came from mem(or L4)
+event:0x44608e counters:3 um:zero minimum:10000 name:PM_ISIDE_MRU_TOUCH : Iside L2 MRU touch
+event:0xd096 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISLB_MISS : I SLB Miss.
+event:0x30ac counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_FX0 : FX0 ISU reject
+event:0x30ae counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_FX1 : FX1 ISU reject
+event:0x38ac counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_FXU : ISU
+event:0x30b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_LS0 : LS0 ISU reject
+event:0x30b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_LS1 : LS1 ISU reject
+event:0x30b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_LS2 : LS2 ISU reject
+event:0x30b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REF_LS3 : LS3 ISU reject
+event:0x309c counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJECTS_ALL : All isu rejects could be more than 1 per cycle
+event:0x30a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJECT_RES_NA : ISU reject due to resource not available
+event:0x309e counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJECT_SAR_BYPASS : Reject because of SAR bypass
+event:0x30a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJECT_SRC_NA : ISU reject due to source not available
+event:0x30a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJ_VS0 : VS0 ISU reject
+event:0x30aa counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJ_VS1 : VS1 ISU reject
+event:0x38a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISU_REJ_VSU : ISU
+event:0x30b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_ISYNC : Isync count per thread
+event:0x200301ea counters:2 um:zero minimum:10000 name:PM_L1MISS_LAT_EXC_1024 : Reload latency exceeded 1024 cyc
+event:0x200401ec counters:3 um:zero minimum:10000 name:PM_L1MISS_LAT_EXC_2048 : Reload latency exceeded 2048 cyc
+event:0x200101e8 counters:0 um:zero minimum:10000 name:PM_L1MISS_LAT_EXC_256 : Reload latency exceeded 256 cyc
+event:0x200201e6 counters:1 um:zero minimum:10000 name:PM_L1MISS_LAT_EXC_32 : Reload latency exceeded 32 cyc
+event:0x26086 counters:1 um:zero minimum:10000 name:PM_L1PF_L2MEMACC : valid when first beat of data comes in for an L1pref where data came from mem(or L4)
+event:0x1002c counters:0 um:zero minimum:10000 name:PM_L1_DCACHE_RELOADED_ALL : L1 data cache reloaded for demand or prefetch .
+event:0x408c counters:0,1,2,3 um:zero minimum:10000 name:PM_L1_DEMAND_WRITE : Instruction Demand sectors written into IL1
+event:0x40012 counters:3 um:zero minimum:10000 name:PM_L1_ICACHE_RELOADED_ALL : Counts all Icache reloads, including demand, prefetch, prefetch turned into demand, and demand turned into prefetch.
+event:0x30068 counters:2 um:zero minimum:10000 name:PM_L1_ICACHE_RELOADED_PREF : Counts all Icache prefetch reloads (includes demand turned into prefetch).
+event:0x417080 counters:0 um:zero minimum:10000 name:PM_L2_CASTOUT_MOD : L2 Castouts - Modified (M, Mu, Me)
+event:0x417082 counters:0 um:zero minimum:10000 name:PM_L2_CASTOUT_SHR : L2 Castouts - Shared (T, Te, Si, S)
+event:0x27084 counters:1 um:zero minimum:10000 name:PM_L2_CHIP_PUMP : RC requests that were local on chip pump attempts
+event:0x427086 counters:1 um:zero minimum:10000 name:PM_L2_DC_INV : Dcache invalidates from L2
+event:0x44608c counters:3 um:zero minimum:10000 name:PM_L2_DISP_ALL_L2MISS : All successful Ld/St dispatches for this thread that were an L2miss.
+event:0x64608e counters:3 um:zero minimum:10000 name:PM_L2_GROUP_PUMP : RC requests that were on Node Pump attempts
+event:0x626084 counters:1 um:zero minimum:10000 name:PM_L2_GRP_GUESS_CORRECT : L2 guess grp and guess was correct (data intra-6chip AND ^on-chip)
+event:0x626086 counters:1 um:zero minimum:10000 name:PM_L2_GRP_GUESS_WRONG : L2 guess grp and guess was not correct (ie data on-chip OR beyond-6chip)
+event:0x427084 counters:1 um:zero minimum:10000 name:PM_L2_IC_INV : Icache Invalidates from L2
+event:0x436088 counters:2 um:zero minimum:10000 name:PM_L2_INST : All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)
+event:0x43608a counters:2 um:zero minimum:10000 name:PM_L2_INST_MISS : All successful i-side dispatches that were an L2miss for this thread (excludes i_l2mru_tch reqs)
+event:0x416080 counters:0 um:zero minimum:10000 name:PM_L2_LD : All successful D-side Load dispatches for this thread
+event:0x437088 counters:2 um:zero minimum:10000 name:PM_L2_LD_DISP : All successful load dispatches
+event:0x43708a counters:2 um:zero minimum:10000 name:PM_L2_LD_HIT : All successful load dispatches that were L2 hits
+event:0x426084 counters:1 um:zero minimum:10000 name:PM_L2_LD_MISS : All successful D-Side Load dispatches that were an L2miss for this thread
+event:0x616080 counters:0 um:zero minimum:10000 name:PM_L2_LOC_GUESS_CORRECT : L2 guess loc and guess was correct (ie data local)
+event:0x616082 counters:0 um:zero minimum:10000 name:PM_L2_LOC_GUESS_WRONG : L2 guess loc and guess was not correct (ie data not on chip)
+event:0x516080 counters:0 um:zero minimum:10000 name:PM_L2_RCLD_DISP : L2 RC load dispatch attempt
+event:0x516082 counters:0 um:zero minimum:10000 name:PM_L2_RCLD_DISP_FAIL_ADDR : L2 RC load dispatch attempt failed due to address collision with RC/CO/SN/SQ
+event:0x526084 counters:1 um:zero minimum:10000 name:PM_L2_RCLD_DISP_FAIL_OTHER : L2 RC load dispatch attempt failed due to other reasons
+event:0x536088 counters:2 um:zero minimum:10000 name:PM_L2_RCST_DISP : L2 RC store dispatch attempt
+event:0x53608a counters:2 um:zero minimum:10000 name:PM_L2_RCST_DISP_FAIL_ADDR : L2 RC store dispatch attempt failed due to address collision with RC/CO/SN/SQ
+event:0x54608c counters:3 um:zero minimum:10000 name:PM_L2_RCST_DISP_FAIL_OTHER : L2 RC store dispatch attempt failed due to other reasons
+event:0x537088 counters:2 um:zero minimum:10000 name:PM_L2_RC_ST_DONE : RC did st to line that was Tx or Sx
+event:0x63708a counters:2 um:zero minimum:10000 name:PM_L2_RTY_LD : RC retries on PB for any load from core
+event:0x3708a counters:2 um:zero minimum:10000 name:PM_L2_RTY_ST : RC retries on PB for any store from core
+event:0x54708c counters:3 um:zero minimum:10000 name:PM_L2_SN_M_RD_DONE : SNP dispatched for a read and was M
+event:0x54708e counters:3 um:zero minimum:10000 name:PM_L2_SN_M_WR_DONE : SNP dispatched for a write and was M
+event:0x53708a counters:2 um:zero minimum:10000 name:PM_L2_SN_SX_I_DONE : SNP dispatched and went from Sx or Tx to Ix
+event:0x17080 counters:0 um:zero minimum:10000 name:PM_L2_ST : All successful D-side store dispatches for this thread
+event:0x44708c counters:3 um:zero minimum:10000 name:PM_L2_ST_DISP : All successful store dispatches
+event:0x44708e counters:3 um:zero minimum:10000 name:PM_L2_ST_HIT : All successful store dispatches that were L2Hits
+event:0x17082 counters:0 um:zero minimum:10000 name:PM_L2_ST_MISS : All successful D-side store dispatches for this thread that were L2 Miss
+event:0x636088 counters:2 um:zero minimum:10000 name:PM_L2_SYS_GUESS_CORRECT : L2 guess sys and guess was correct (ie data beyond-6chip)
+event:0x63608a counters:2 um:zero minimum:10000 name:PM_L2_SYS_GUESS_WRONG : L2 guess sys and guess was not correct (ie data ^beyond-6chip)
+event:0x37088 counters:2 um:zero minimum:10000 name:PM_L2_SYS_PUMP : RC requests that were system pump attempts
+event:0x1e05e counters:0 um:zero minimum:10000 name:PM_L2_TM_REQ_ABORT : TM abort.
+event:0x3e05c counters:2 um:zero minimum:10000 name:PM_L2_TM_ST_ABORT_SISTER : TM marked store abort.
+event:0x23808a counters:2 um:zero minimum:10000 name:PM_L3_CINJ : l3 ci of cache inject
+event:0x128084 counters:1 um:zero minimum:10000 name:PM_L3_CI_HIT : L3 Castins Hit (total count)
+event:0x128086 counters:1 um:zero minimum:10000 name:PM_L3_CI_MISS : L3 castins miss (total count)
+event:0x819082 counters:0 um:zero minimum:10000 name:PM_L3_CI_USAGE : rotating sample of 16 CI or CO actives
+event:0x438088 counters:2 um:zero minimum:10000 name:PM_L3_CO : l3 castout occurring (does not include casthrough or log writes (cinj/dmaw))
+event:0x83908b counters:2 um:zero minimum:10000 name:PM_L3_CO0_ALLOC : 0.0
+event:0x83908a counters:2 um:zero minimum:10000 name:PM_L3_CO0_BUSY : lifetime, sample of CO machine 0 valid
+event:0x28086 counters:1 um:zero minimum:10000 name:PM_L3_CO_L31 : L3 CO to L3.1 OR of port 0 and 1 (lossy)
+event:0x238088 counters:2 um:zero minimum:10000 name:PM_L3_CO_LCO : Total L3 castouts occurred on LCO
+event:0x28084 counters:1 um:zero minimum:10000 name:PM_L3_CO_MEM : L3 CO to memory OR of port 0 and 1 (lossy)
+event:0x18082 counters:0 um:zero minimum:10000 name:PM_L3_CO_MEPF : L3 CO of line in Mep state (includes casthrough)
+event:0xb19082 counters:0 um:zero minimum:10000 name:PM_L3_GRP_GUESS_CORRECT : Initial scope=group and data from same group (near) (pred successful)
+event:0xb3908a counters:2 um:zero minimum:10000 name:PM_L3_GRP_GUESS_WRONG_HIGH : Initial scope=group but data from local node. Prediction too high
+event:0xb39088 counters:2 um:zero minimum:10000 name:PM_L3_GRP_GUESS_WRONG_LOW : Initial scope=group but data from outside group (far or rem). Prediction too Low
+event:0x218080 counters:0 um:zero minimum:10000 name:PM_L3_HIT : L3 Hits
+event:0x138088 counters:2 um:zero minimum:10000 name:PM_L3_L2_CO_HIT : L2 castout hits
+event:0x13808a counters:2 um:zero minimum:10000 name:PM_L3_L2_CO_MISS : L2 castout miss
+event:0x14808c counters:3 um:zero minimum:10000 name:PM_L3_LAT_CI_HIT : L3 Lateral Castins Hit
+event:0x14808e counters:3 um:zero minimum:10000 name:PM_L3_LAT_CI_MISS : L3 Lateral Castins Miss
+event:0x228084 counters:1 um:zero minimum:10000 name:PM_L3_LD_HIT : L3 demand LD Hits
+event:0x228086 counters:1 um:zero minimum:10000 name:PM_L3_LD_MISS : L3 demand LD Miss
+event:0x1e052 counters:0 um:zero minimum:10000 name:PM_L3_LD_PREF : L3 Load Prefetches.
+event:0xb19080 counters:0 um:zero minimum:10000 name:PM_L3_LOC_GUESS_CORRECT : initial scope=node/chip and data from local node (local) (pred successful)
+event:0xb29086 counters:1 um:zero minimum:10000 name:PM_L3_LOC_GUESS_WRONG : Initial scope=node but data from outside local node (near or far or rem). Prediction too Low
+event:0x218082 counters:0 um:zero minimum:10000 name:PM_L3_MISS : L3 Misses
+event:0x54808c counters:3 um:zero minimum:10000 name:PM_L3_P0_CO_L31 : l3 CO to L3.1 (lco) port 0
+event:0x538088 counters:2 um:zero minimum:10000 name:PM_L3_P0_CO_MEM : l3 CO to memory port 0
+event:0x929084 counters:1 um:zero minimum:10000 name:PM_L3_P0_CO_RTY : L3 CO received retry port 0
+event:0xa29084 counters:1 um:zero minimum:10000 name:PM_L3_P0_GRP_PUMP : L3 pf sent with grp scope port 0
+event:0x528084 counters:1 um:zero minimum:10000 name:PM_L3_P0_LCO_DATA : lco sent with data port 0
+event:0x518080 counters:0 um:zero minimum:10000 name:PM_L3_P0_LCO_NO_DATA : dataless l3 lco sent port 0
+event:0xa4908c counters:3 um:zero minimum:10000 name:PM_L3_P0_LCO_RTY : L3 LCO received retry port 0
+event:0xa19080 counters:0 um:zero minimum:10000 name:PM_L3_P0_NODE_PUMP : L3 pf sent with nodal scope port 0
+event:0x919080 counters:0 um:zero minimum:10000 name:PM_L3_P0_PF_RTY : L3 PF received retry port 0
+event:0x939088 counters:2 um:zero minimum:10000 name:PM_L3_P0_SN_HIT : L3 snoop hit port 0
+event:0x118080 counters:0 um:zero minimum:10000 name:PM_L3_P0_SN_INV : Port0 snooper detects someone doing a store to a line that's Sx
+event:0x94908c counters:3 um:zero minimum:10000 name:PM_L3_P0_SN_MISS : L3 snoop miss port 0
+event:0xa39088 counters:2 um:zero minimum:10000 name:PM_L3_P0_SYS_PUMP : L3 pf sent with sys scope port 0
+event:0x54808e counters:3 um:zero minimum:10000 name:PM_L3_P1_CO_L31 : l3 CO to L3.1 (lco) port 1
+event:0x53808a counters:2 um:zero minimum:10000 name:PM_L3_P1_CO_MEM : l3 CO to memory port 1
+event:0x929086 counters:1 um:zero minimum:10000 name:PM_L3_P1_CO_RTY : L3 CO received retry port 1
+event:0xa29086 counters:1 um:zero minimum:10000 name:PM_L3_P1_GRP_PUMP : L3 pf sent with grp scope port 1
+event:0x528086 counters:1 um:zero minimum:10000 name:PM_L3_P1_LCO_DATA : lco sent with data port 1
+event:0x518082 counters:0 um:zero minimum:10000 name:PM_L3_P1_LCO_NO_DATA : dataless l3 lco sent port 1
+event:0xa4908e counters:3 um:zero minimum:10000 name:PM_L3_P1_LCO_RTY : L3 LCO received retry port 1
+event:0xa19082 counters:0 um:zero minimum:10000 name:PM_L3_P1_NODE_PUMP : L3 pf sent with nodal scope port 1
+event:0x919082 counters:0 um:zero minimum:10000 name:PM_L3_P1_PF_RTY : L3 PF received retry port 1
+event:0x93908a counters:2 um:zero minimum:10000 name:PM_L3_P1_SN_HIT : L3 snoop hit port 1
+event:0x118082 counters:0 um:zero minimum:10000 name:PM_L3_P1_SN_INV : Port1 snooper detects someone doing a store to a line that's Sx
+event:0x94908e counters:3 um:zero minimum:10000 name:PM_L3_P1_SN_MISS : L3 snoop miss port 1
+event:0xa3908a counters:2 um:zero minimum:10000 name:PM_L3_P1_SYS_PUMP : L3 pf sent with sys scope port 1
+event:0x84908d counters:3 um:zero minimum:10000 name:PM_L3_PF0_ALLOC : 0.0
+event:0x84908c counters:3 um:zero minimum:10000 name:PM_L3_PF0_BUSY : lifetime, sample of PF machine 0 valid
+event:0x428084 counters:1 um:zero minimum:10000 name:PM_L3_PF_HIT_L3 : l3 pf hit in l3
+event:0x18080 counters:0 um:zero minimum:10000 name:PM_L3_PF_MISS_L3 : L3 Prefetch missed in L3
+event:0x3808a counters:2 um:zero minimum:10000 name:PM_L3_PF_OFF_CHIP_CACHE : L3 Prefetch from Off chip cache
+event:0x4808e counters:3 um:zero minimum:10000 name:PM_L3_PF_OFF_CHIP_MEM : L3 Prefetch from Off chip memory
+event:0x38088 counters:2 um:zero minimum:10000 name:PM_L3_PF_ON_CHIP_CACHE : L3 Prefetch from On chip cache
+event:0x4808c counters:3 um:zero minimum:10000 name:PM_L3_PF_ON_CHIP_MEM : L3 Prefetch from On chip memory
+event:0x829084 counters:1 um:zero minimum:10000 name:PM_L3_PF_USAGE : rotating sample of 32 PF actives
+event:0x4e052 counters:3 um:zero minimum:10000 name:PM_L3_PREF_ALL : Total HW L3 prefetches(Load+store).
+event:0x84908f counters:3 um:zero minimum:10000 name:PM_L3_RD0_ALLOC : 0.0
+event:0x84908e counters:3 um:zero minimum:10000 name:PM_L3_RD0_BUSY : lifetime, sample of RD machine 0 valid
+event:0x829086 counters:1 um:zero minimum:10000 name:PM_L3_RD_USAGE : rotating sample of 16 RD actives
+event:0x839089 counters:2 um:zero minimum:10000 name:PM_L3_SN0_ALLOC : 0.0
+event:0x839088 counters:2 um:zero minimum:10000 name:PM_L3_SN0_BUSY : lifetime, sample of snooper machine 0 valid
+event:0x819080 counters:0 um:zero minimum:10000 name:PM_L3_SN_USAGE : rotating sample of 8 snoop valids
+event:0x2e052 counters:1 um:zero minimum:10000 name:PM_L3_ST_PREF : L3 store Prefetches.
+event:0x3e052 counters:2 um:zero minimum:10000 name:PM_L3_SW_PREF : Data stream touch to L3.
+event:0xb29084 counters:1 um:zero minimum:10000 name:PM_L3_SYS_GUESS_CORRECT : Initial scope=system and data from outside group (far or rem)(pred successful)
+event:0xb4908c counters:3 um:zero minimum:10000 name:PM_L3_SYS_GUESS_WRONG : Initial scope=system but data from local or near. Prediction too high
+event:0x24808e counters:3 um:zero minimum:10000 name:PM_L3_TRANS_PF : L3 Transient prefetch
+event:0x18081 counters:0 um:zero minimum:10000 name:PM_L3_WI0_ALLOC : 0.0
+event:0x418080 counters:0 um:zero minimum:10000 name:PM_L3_WI0_BUSY : lifetime, sample of Write Inject machine 0 valid
+event:0x418082 counters:0 um:zero minimum:10000 name:PM_L3_WI_USAGE : rotating sample of 8 WI actives
+event:0x3c058 counters:2 um:zero minimum:10000 name:PM_LARX_FIN : Larx finished .
+event:0x1002e counters:0 um:zero minimum:10000 name:PM_LD_CMPL : count of Loads completed.
+event:0x10062 counters:0 um:zero minimum:10000 name:PM_LD_L3MISS_PEND_CYC : Cycles L3 miss was pending for this thread.
+event:0x100ee counters:0 um:zero minimum:10000 name:PM_LD_REF_L1 : Load Ref count combined for all units.
+event:0xc080 counters:0,1,2,3 um:zero minimum:10000 name:PM_LD_REF_L1_LSU0 : LS0 L1 D cache load references counted at finish, gated by reject. LSU0 L1 D cache load references
+event:0xc082 counters:0,1,2,3 um:zero minimum:10000 name:PM_LD_REF_L1_LSU1 : LS1 L1 D cache load references counted at finish, gated by reject. LSU1 L1 D cache load references
+event:0xc094 counters:0,1,2,3 um:zero minimum:10000 name:PM_LD_REF_L1_LSU2 : LS2 L1 D cache load references counted at finish, gated by reject
+event:0xc096 counters:0,1,2,3 um:zero minimum:10000 name:PM_LD_REF_L1_LSU3 : LS3 L1 D cache load references counted at finish, gated by reject
+event:0x509a counters:0,1,2,3 um:zero minimum:10000 name:PM_LINK_STACK_INVALID_PTR : A flush where LS ptr is invalid, results in a pop. A lot of interrupts between push and pops
+event:0x5098 counters:0,1,2,3 um:zero minimum:10000 name:PM_LINK_STACK_WRONG_ADD_PRED : Link stack predicts wrong address, because of link stack design limitation.
+event:0xe080 counters:0,1,2,3 um:zero minimum:10000 name:PM_LS0_ERAT_MISS_PREF : LS0 Erat miss due to prefetch
+event:0xd0b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_LS0_L1_PREF : LS0 L1 cache data prefetches
+event:0xc098 counters:0,1,2,3 um:zero minimum:10000 name:PM_LS0_L1_SW_PREF : Software L1 Prefetches, including SW Transient Prefetches
+event:0xe082 counters:0,1,2,3 um:zero minimum:10000 name:PM_LS1_ERAT_MISS_PREF : LS1 Erat miss due to prefetch
+event:0xd0ba counters:0,1,2,3 um:zero minimum:10000 name:PM_LS1_L1_PREF : LS1 L1 cache data prefetches
+event:0xc09a counters:0,1,2,3 um:zero minimum:10000 name:PM_LS1_L1_SW_PREF : Software L1 Prefetches, including SW Transient Prefetches
+event:0xc0b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_FLUSH_LRQ : LS0 Flush: LRQ. LSU0 LRQ flushes
+event:0xc0b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_FLUSH_SRQ : LS0 Flush: SRQ. LSU0 SRQ lhs flushes
+event:0xc0a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_FLUSH_ULD : LS0 Flush: Unaligned Load. LSU0 unaligned load flushes
+event:0xc0ac counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_FLUSH_UST : LS0 Flush: Unaligned Store. LSU0 unaligned store flushes
+event:0xf088 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_L1_CAM_CANCEL : ls0 l1 tm cam cancel
+event:0x1e056 counters:0 um:zero minimum:10000 name:PM_LSU0_LARX_FIN : .
+event:0xd08c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_LMQ_LHR_MERGE : LS0 Load Merged with another cacheline request
+event:0xc08c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_NCLD : LS0 Non-cacheable Loads counted at finish. LSU0 non-cacheable loads
+event:0xe090 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_PRIMARY_ERAT_HIT : Primary ERAT hit
+event:0x1e05a counters:0 um:zero minimum:10000 name:PM_LSU0_REJECT : LSU0 reject .
+event:0xc09c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_SRQ_STFWD : LS0 SRQ forwarded data to a load. LSU0 SRQ store forwarded
+event:0xf084 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_STORE_REJECT : ls0 store reject
+event:0xe0a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_TMA_REQ_L2 : addrs only req to L2 only on the first one. Indication that Load footprint is not expanding
+event:0xe098 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_TM_L1_HIT : Load tm hit in L1
+event:0xe0a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU0_TM_L1_MISS : Load tm L1 miss
+event:0xc0b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_FLUSH_LRQ : LS1 Flush: LRQ. LSU1 LRQ flushes
+event:0xc0ba counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_FLUSH_SRQ : LS1 Flush: SRQ. LSU1 SRQ lhs flushes
+event:0xc0a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_FLUSH_ULD : LS1 Flush: Unaligned Load. LSU1 unaligned load flushes
+event:0xc0ae counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_FLUSH_UST : LS1 Flush: Unaligned Store. LSU1 unaligned store flushes
+event:0xf08a counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_L1_CAM_CANCEL : ls1 l1 tm cam cancel
+event:0x2e056 counters:1 um:zero minimum:10000 name:PM_LSU1_LARX_FIN : Larx finished in LSU pipe1.
+event:0xd08e counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_LMQ_LHR_MERGE : LS1 Load Merged with another cacheline request
+event:0xc08e counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_NCLD : LS1 Non-cacheable Loads counted at finish. LSU1 non-cacheable loads
+event:0xe092 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_PRIMARY_ERAT_HIT : Primary ERAT hit
+event:0x2e05a counters:1 um:zero minimum:10000 name:PM_LSU1_REJECT : LSU1 reject.
+event:0xc09e counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_SRQ_STFWD : LS1 SRQ forwarded data to a load. LSU1 SRQ store forwarded
+event:0xf086 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_STORE_REJECT : ls1 store reject
+event:0xe0aa counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_TMA_REQ_L2 : addrs only req to L2 only on the first one, Indication that Load footprint is not expanding
+event:0xe09a counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_TM_L1_HIT : Load tm hit in L1
+event:0xe0a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU1_TM_L1_MISS : Load tm L1 miss
+event:0xc0b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_FLUSH_LRQ : LS2 Flush: LRQ
+event:0xc0bc counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_FLUSH_SRQ : LS2 Flush: SRQ
+event:0xc0a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_FLUSH_ULD : LS2 Flush: Unaligned Load
+event:0xf08c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_L1_CAM_CANCEL : ls2 l1 tm cam cancel
+event:0x3e056 counters:2 um:zero minimum:10000 name:PM_LSU2_LARX_FIN : Larx finished in LSU pipe2.
+event:0xc084 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_LDF : LS2 Scalar Loads
+event:0xc088 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_LDX : LS0 Vector Loads
+event:0xd090 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_LMQ_LHR_MERGE : LS0 Load Merged with another cacheline request
+event:0xe094 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_PRIMARY_ERAT_HIT : Primary ERAT hit
+event:0x3e05a counters:2 um:zero minimum:10000 name:PM_LSU2_REJECT : LSU2 reject.
+event:0xc0a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_SRQ_STFWD : LS2 SRQ forwarded data to a load
+event:0xe0ac counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_TMA_REQ_L2 : addrs only req to L2 only on the first one, Indication that Load footprint is not expanding
+event:0xe09c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_TM_L1_HIT : Load tm hit in L1
+event:0xe0a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU2_TM_L1_MISS : Load tm L1 miss
+event:0xc0b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_FLUSH_LRQ : LS3 Flush: LRQ
+event:0xc0be counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_FLUSH_SRQ : LS3 Flush: SRQ
+event:0xc0aa counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_FLUSH_ULD : LS3 Flush: Unaligned Load
+event:0xf08e counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_L1_CAM_CANCEL : ls3 l1 tm cam cancel
+event:0x4e056 counters:3 um:zero minimum:10000 name:PM_LSU3_LARX_FIN : Larx finished in LSU pipe3.
+event:0xc086 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_LDF : LS3 Scalar Loads
+event:0xc08a counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_LDX : LS1 Vector Loads
+event:0xd092 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_LMQ_LHR_MERGE : LS1 Load Merged with another cacheline request
+event:0xe096 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_PRIMARY_ERAT_HIT : Primary ERAT hit
+event:0x4e05a counters:3 um:zero minimum:10000 name:PM_LSU3_REJECT : LSU3 reject.
+event:0xc0a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_SRQ_STFWD : LS3 SRQ forwarded data to a load
+event:0xe0ae counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_TMA_REQ_L2 : addrs only req to L2 only on the first one, Indication that Load footprint is not expanding
+event:0xe09e counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_TM_L1_HIT : Load tm hit in L1
+event:0xe0a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU3_TM_L1_MISS : Load tm L1 miss
+event:0xe880 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_ERAT_MISS_PREF : LSU
+event:0x30066 counters:2 um:zero minimum:10000 name:PM_LSU_FIN : LSU Finished an instruction (up to 2 per cycle).
+event:0xc8ac counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_FLUSH_UST : LSU
+event:0xd0a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_FOUR_TABLEWALK_CYC : Cycles when four tablewalks pending on this thread
+event:0x10066 counters:0 um:zero minimum:10000 name:PM_LSU_FX_FIN : LSU Finished a FX operation (up to 2 per cycle).
+event:0xd8b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_L1_PREF : LSU
+event:0xc898 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_L1_SW_PREF : LSU
+event:0xc884 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LDF : LSU
+event:0xc888 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LDX : LSU
+event:0xd0a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LMQ_FULL_CYC : LMQ full. Cycles LMQ full
+event:0xd0a1 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LMQ_S0_ALLOC : 0.0
+event:0xd0a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LMQ_S0_VALID : Slot 0 of LMQ valid. LMQ slot 0 valid
+event:0x3001c counters:2 um:zero minimum:10000 name:PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC : ALL threads lsu empty (lmq and srq empty). Issue HW016541
+event:0x2003e counters:1 um:zero minimum:10000 name:PM_LSU_LMQ_SRQ_EMPTY_CYC : LSU empty (lmq and srq empty).
+event:0xd09f counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LRQ_S0_ALLOC : 0.0
+event:0xd09e counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LRQ_S0_VALID : Slot 0 of LRQ valid. LRQ slot 0 valid
+event:0xf091 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LRQ_S43_ALLOC : 0.0
+event:0xf090 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_LRQ_S43_VALID : LRQ slot 43 was busy
+event:0x30162 counters:2 um:zero minimum:10000 name:PM_LSU_MRK_DERAT_MISS : DERAT Reloaded (Miss).
+event:0xc88c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_NCLD : LSU
+event:0xc092 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_NCST : Non-cacheable Stores sent to nest
+event:0x10064 counters:0 um:zero minimum:10000 name:PM_LSU_REJECT : LSU Reject (up to 4 per cycle).
+event:0x2e05c counters:1 um:zero minimum:10000 name:PM_LSU_REJECT_ERAT_MISS : LSU Reject due to ERAT (up to 4 per cycle).
+event:0x4e05c counters:3 um:zero minimum:10000 name:PM_LSU_REJECT_LHS : LSU Reject due to LHS (up to 4 per cycle).
+event:0x1e05c counters:0 um:zero minimum:10000 name:PM_LSU_REJECT_LMQ_FULL : LSU reject due to LMQ full (up to 4 per cycle).
+event:0xd082 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SET_MPRED : Line already in cache at reload time
+event:0x40008 counters:3 um:zero minimum:10000 name:PM_LSU_SRQ_EMPTY_CYC : All threads srq empty.
+event:0x1001a counters:0 um:zero minimum:10000 name:PM_LSU_SRQ_FULL_CYC : SRQ is Full.
+event:0xd09d counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SRQ_S0_ALLOC : 0.0
+event:0xd09c counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SRQ_S0_VALID : Slot 0 of SRQ valid. SRQ slot 0 valid
+event:0xf093 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SRQ_S39_ALLOC : 0.0
+event:0xf092 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SRQ_S39_VALID : SRQ slot 39 was busy
+event:0xd09b counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SRQ_SYNC : 0.0
+event:0xd09a counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_SRQ_SYNC_CYC : A sync is in the SRQ (edge detect to count). SRQ sync duration
+event:0xf084 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_STORE_REJECT : LSU
+event:0xd0a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_LSU_TWO_TABLEWALK_CYC : Cycles when two tablewalks pending on this thread
+event:0x5094 counters:0,1,2,3 um:zero minimum:10000 name:PM_LWSYNC : threaded version, IC Misses where we got EA dir hit but no sector valids were on. ICBI took line out
+event:0x209a counters:0,1,2,3 um:zero minimum:10000 name:PM_LWSYNC_HELD : LWSYNC held at dispatch
+event:0x4c058 counters:3 um:zero minimum:10000 name:PM_MEM_CO : Memory castouts from this lpar.
+event:0x10058 counters:0 um:zero minimum:10000 name:PM_MEM_LOC_THRESH_IFU : Local Memory above threshold for IFU speculation control.
+event:0x40056 counters:3 um:zero minimum:10000 name:PM_MEM_LOC_THRESH_LSU_HIGH : Local memory above threshold for LSU medium.
+event:0x1c05e counters:0 um:zero minimum:10000 name:PM_MEM_LOC_THRESH_LSU_MED : Local memory above threshold for data prefetch.
+event:0x2c058 counters:1 um:zero minimum:10000 name:PM_MEM_PREF : Memory prefetch for this lpar.
+event:0x10056 counters:0 um:zero minimum:10000 name:PM_MEM_READ : Reads from Memory from this lpar (includes data/inst/xlate/l1prefetch/inst prefetch).
+event:0x3c05e counters:2 um:zero minimum:10000 name:PM_MEM_RWITM : Memory rwitm for this lpar.
+event:0x3515e counters:2 um:zero minimum:1000 name:PM_MRK_BACK_BR_CMPL : Marked branch instruction completed with a target address less than current instruction address.
+event:0x2013a counters:1 um:zero minimum:1000 name:PM_MRK_BRU_FIN : bru marked instr finish.
+event:0x1016e counters:0 um:zero minimum:1000 name:PM_MRK_BR_CMPL : Branch Instruction completed.
+event:0x3013a counters:2 um:zero minimum:1000 name:PM_MRK_CRU_FIN : IFU non-branch marked instruction finished.
+event:0x4d148 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DL2L3_MOD : The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a marked load.
+event:0x2d128 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DL2L3_MOD_CYC : Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a marked load.
+event:0x3d148 counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DL2L3_SHR : The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a marked load.
+event:0x2c128 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DL2L3_SHR_CYC : Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a marked load.
+event:0x3d14c counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DL4 : The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load.
+event:0x2c12c counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DL4_CYC : Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load.
+event:0x4d14c counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DMEM : The processor's data cache was reloaded from another chip's memory on a different Node or Group (Distant) due to a marked load.
+event:0x2d12c counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_DMEM_CYC : Duration in cycles to reload from another chip's memory on a different Node or Group (Distant) due to a marked load.
+event:0x1d142 counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2 : The processor's data cache was reloaded from local core's L2 due to a marked load.
+event:0x4d146 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L21_MOD : The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load.
+event:0x2d126 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L21_MOD_CYC : Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load.
+event:0x3d146 counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L21_SHR : The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load.
+event:0x2c126 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L21_SHR_CYC : Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load.
+event:0x4c12e counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2MISS_CYC : Duration in cycles to reload from a location other than the local core's L2 due to a marked load.
+event:0x4c122 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_CYC : Duration in cycles to reload from local core's L2 due to a marked load.
+event:0x3d140 counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST : The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load.
+event:0x2c120 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC : Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load.
+event:0x4d140 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER : The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load.
+event:0x2d120 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC : Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load.
+event:0x2d140 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_MEPF : The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a marked load.
+event:0x4d120 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_MEPF_CYC : Duration in cycles to reload from local core's L2 hit without dispatch conflicts on Mepf state due to a marked load.
+event:0x1d140 counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_NO_CONFLICT : The processor's data cache was reloaded from local core's L2 without conflict due to a marked load.
+event:0x4c120 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC : Duration in cycles to reload from local core's L2 without conflict due to a marked load.
+event:0x4d142 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3 : The processor's data cache was reloaded from local core's L3 due to a marked load.
+event:0x4d144 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_ECO_MOD : The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load.
+event:0x2d124 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_ECO_MOD_CYC : Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load.
+event:0x3d144 counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_ECO_SHR : The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load.
+event:0x2c124 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_ECO_SHR_CYC : Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load.
+event:0x2d144 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_MOD : The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load.
+event:0x4d124 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_MOD_CYC : Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load.
+event:0x1d146 counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_SHR : The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load.
+event:0x4c126 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L31_SHR_CYC : Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load.
+event:0x2d12e counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3MISS_CYC : Duration in cycles to reload from a location other than the local core's L3 due to a marked load.
+event:0x2d122 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_CYC : Duration in cycles to reload from local core's L3 due to a marked load.
+event:0x3d142 counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_DISP_CONFLICT : The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load.
+event:0x2c122 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC : Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load.
+event:0x2d142 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_MEPF : The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a marked load.
+event:0x4d122 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_MEPF_CYC : Duration in cycles to reload from local core's L3 without dispatch conflicts hit on Mepf state due to a marked load.
+event:0x1d144 counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_NO_CONFLICT : The processor's data cache was reloaded from local core's L3 without conflict due to a marked load.
+event:0x4c124 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC : Duration in cycles to reload from local core's L3 without conflict due to a marked load.
+event:0x1d14c counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_LL4 : The processor's data cache was reloaded from the local chip's L4 cache due to a marked load.
+event:0x4c12c counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_LL4_CYC : Duration in cycles to reload from the local chip's L4 cache due to a marked load.
+event:0x2d148 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_LMEM : The processor's data cache was reloaded from the local chip's Memory due to a marked load.
+event:0x4d128 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_LMEM_CYC : Duration in cycles to reload from the local chip's Memory due to a marked load.
+event:0x2d14c counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_MEMORY : The processor's data cache was reloaded from a memory location including L4 from local, remote, or distant due to a marked load.
+event:0x4d12c counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_MEMORY_CYC : Duration in cycles to reload from a memory location including L4 from local, remote, or distant due to a marked load.
+event:0x4d14a counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_OFF_CHIP_CACHE : The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load.
+event:0x2d12a counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC : Duration in cycles to reload with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load.
+event:0x1d148 counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_ON_CHIP_CACHE : The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to a marked load.
+event:0x4c128 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC : Duration in cycles to reload with either shared or modified data from another core's L2/L3 on the same chip due to a marked load.
+event:0x2d146 counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RL2L3_MOD : The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load.
+event:0x4d126 counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RL2L3_MOD_CYC : Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load.
+event:0x1d14a counters:0 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RL2L3_SHR : The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load.
+event:0x4c12a counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RL2L3_SHR_CYC : Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load.
+event:0x2d14a counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RL4 : The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a marked load.
+event:0x4d12a counters:3 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RL4_CYC : Duration in cycles to reload from another chip's L4 on the same Node or Group (Remote) due to a marked load.
+event:0x3d14a counters:2 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RMEM : The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a marked load.
+event:0x2c12a counters:1 um:zero minimum:1000 name:PM_MRK_DATA_FROM_RMEM_CYC : Duration in cycles to reload from another chip's memory on the same Node or Group (Remote) due to a marked load.
+event:0x40118 counters:3 um:zero minimum:1000 name:PM_MRK_DCACHE_RELOAD_INTV : Combined Intervention event.
+event:0x4d154 counters:3 um:zero minimum:1000 name:PM_MRK_DERAT_MISS_16G : Marked Data ERAT Miss (Data TLB Access) page size 16G.
+event:0x3d154 counters:2 um:zero minimum:1000 name:PM_MRK_DERAT_MISS_16M : Marked Data ERAT Miss (Data TLB Access) page size 16M.
+event:0x1d156 counters:0 um:zero minimum:1000 name:PM_MRK_DERAT_MISS_4K : Marked Data ERAT Miss (Data TLB Access) page size 4K.
+event:0x2d154 counters:1 um:zero minimum:1000 name:PM_MRK_DERAT_MISS_64K : Marked Data ERAT Miss (Data TLB Access) page size 64K.
+event:0x20132 counters:1 um:zero minimum:1000 name:PM_MRK_DFU_FIN : Decimal Unit marked Instruction Finish.
+event:0x4f148 counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_DL2L3_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a marked data side request.
+event:0x3f148 counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_DL2L3_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a marked data side request.
+event:0x3f14c counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_DL4 : A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request.
+event:0x4f14c counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_DMEM : A Page Table Entry was loaded into the TLB from another chip's memory on a different Node or Group (Distant) due to a marked data side request.
+event:0x1f142 counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L2 : A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request.
+event:0x4f146 counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L21_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request.
+event:0x3f146 counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L21_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request.
+event:0x1f14e counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L2MISS : A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request.
+event:0x3f140 counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_LDHITST : A Page Table Entry was loaded into the TLB from local core's L2 with load hit store conflict due to a marked data side request.
+event:0x4f140 counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L2_DISP_CONFLICT_OTHER : A Page Table Entry was loaded into the TLB from local core's L2 with dispatch conflict due to a marked data side request.
+event:0x2f140 counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L2_MEPF : A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to a marked data side request.
+event:0x1f140 counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L2_NO_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request.
+event:0x4f142 counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L3 : A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request.
+event:0x4f144 counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L31_ECO_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request.
+event:0x3f144 counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L31_ECO_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request.
+event:0x2f144 counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L31_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request.
+event:0x1f146 counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L31_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request.
+event:0x4f14e counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L3MISS : A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request.
+event:0x3f142 counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request.
+event:0x2f142 counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L3_MEPF : A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state due to a marked data side request.
+event:0x1f144 counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_L3_NO_CONFLICT : A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request.
+event:0x1f14c counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_LL4 : A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request.
+event:0x2f148 counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_LMEM : A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request.
+event:0x2f14c counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_MEMORY : A Page Table Entry was loaded into the TLB from a memory location including L4 from local, remote, or distant due to a marked data side request.
+event:0x4f14a counters:3 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE : A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request.
+event:0x1f148 counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_ON_CHIP_CACHE : A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.
+event:0x2f146 counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_RL2L3_MOD : A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked data side request.
+event:0x1f14a counters:0 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_RL2L3_SHR : A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked data side request.
+event:0x2f14a counters:1 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_RL4 : A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a marked data side request.
+event:0x3f14a counters:2 um:zero minimum:1000 name:PM_MRK_DPTEG_FROM_RMEM : A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a marked data side request.
+event:0x1d158 counters:0 um:zero minimum:1000 name:PM_MRK_DTLB_MISS_16G : Marked Data TLB Miss page size 16G.
+event:0x4d156 counters:3 um:zero minimum:1000 name:PM_MRK_DTLB_MISS_16M : Marked Data TLB Miss page size 16M.
+event:0x2d156 counters:1 um:zero minimum:1000 name:PM_MRK_DTLB_MISS_4K : Marked Data TLB Miss page size 4k.
+event:0x3d156 counters:2 um:zero minimum:1000 name:PM_MRK_DTLB_MISS_64K : Marked Data TLB Miss page size 64K.
+event:0x40154 counters:3 um:zero minimum:1000 name:PM_MRK_FAB_RSP_BKILL : Marked store had to do a bkill.
+event:0x2f150 counters:1 um:zero minimum:1000 name:PM_MRK_FAB_RSP_BKILL_CYC : cycles L2 RC took for a bkill.
+event:0x3015e counters:2 um:zero minimum:1000 name:PM_MRK_FAB_RSP_CLAIM_RTY : Sampled store did a rwitm and got a rty.
+event:0x30154 counters:2 um:zero minimum:1000 name:PM_MRK_FAB_RSP_DCLAIM : Marked store had to do a dclaim.
+event:0x2f152 counters:1 um:zero minimum:1000 name:PM_MRK_FAB_RSP_DCLAIM_CYC : cycles L2 RC took for a dclaim.
+event:0x30156 counters:2 um:zero minimum:1000 name:PM_MRK_FAB_RSP_MATCH : ttype and cresp matched as specified in MMCR1.
+event:0x4f152 counters:3 um:zero minimum:1000 name:PM_MRK_FAB_RSP_MATCH_CYC : cresp/ttype match cycles.
+event:0x4015e counters:3 um:zero minimum:1000 name:PM_MRK_FAB_RSP_RD_RTY : Sampled L2 reads retry count.
+event:0x1015e counters:0 um:zero minimum:1000 name:PM_MRK_FAB_RSP_RD_T_INTV : Sampled Read got a T intervention.
+event:0x4f150 counters:3 um:zero minimum:1000 name:PM_MRK_FAB_RSP_RWITM_CYC : cycles L2 RC took for a rwitm.
+event:0x2015e counters:1 um:zero minimum:1000 name:PM_MRK_FAB_RSP_RWITM_RTY : Sampled store did a rwitm and got a rty.
+event:0x3012e counters:2 um:zero minimum:1000 name:PM_MRK_FILT_MATCH : Marked filter Match.
+event:0x1013c counters:0 um:zero minimum:1000 name:PM_MRK_FIN_STALL_CYC : Marked instruction Finish Stall cycles (marked finish after NTC) (use edge detect to count #).
+event:0x20134 counters:1 um:zero minimum:1000 name:PM_MRK_FXU_FIN : fxu marked instr finish.
+event:0x40130 counters:3 um:zero minimum:1000 name:PM_MRK_GRP_CMPL : marked instruction finished (completed).
+event:0x4013a counters:3 um:zero minimum:1000 name:PM_MRK_GRP_IC_MISS : Marked Group experienced I cache miss.
+event:0x3013c counters:2 um:zero minimum:1000 name:PM_MRK_GRP_NTC : Marked group ntc cycles.
+event:0x20130 counters:1 um:zero minimum:1000 name:PM_MRK_INST_DECODED : marked instruction decoded. Name from ISU?
+event:0x30130 counters:2 um:zero minimum:1000 name:PM_MRK_INST_FIN : marked instr finish any unit.
+event:0x10132 counters:0 um:zero minimum:1000 name:PM_MRK_INST_ISSUED : Marked instruction issued.
+event:0x40134 counters:3 um:zero minimum:1000 name:PM_MRK_INST_TIMEO : marked Instruction finish timeout (instruction lost).
+event:0x20114 counters:1 um:zero minimum:1000 name:PM_MRK_L2_RC_DISP : Marked Instruction RC dispatched in L2.
+event:0x3012a counters:2 um:zero minimum:1000 name:PM_MRK_L2_RC_DONE : Marked RC done.
+event:0x40116 counters:3 um:zero minimum:1000 name:PM_MRK_LARX_FIN : Larx finished.
+event:0x1013f counters:0 um:zero minimum:1000 name:PM_MRK_LD_MISS_EXPOSED : Marked Load exposed Miss (use edge detect to count #)
+event:0x1013e counters:0 um:zero minimum:1000 name:PM_MRK_LD_MISS_EXPOSED_CYC : Marked Load exposed Miss (use edge detect to count #).
+event:0x4013e counters:3 um:zero minimum:1000 name:PM_MRK_LD_MISS_L1_CYC : Marked ld latency.
+event:0x40132 counters:3 um:zero minimum:1000 name:PM_MRK_LSU_FIN : lsu marked instr finish.
+event:0xd180 counters:0,1,2,3 um:zero minimum:1000 name:PM_MRK_LSU_FLUSH : Flush: (marked) : All Cases
+event:0xd188 counters:0,1,2,3 um:zero minimum:1000 name:PM_MRK_LSU_FLUSH_LRQ : Flush: (marked) LRQ. Marked LRQ flushes
+event:0xd18a counters:0,1,2,3 um:zero minimum:1000 name:PM_MRK_LSU_FLUSH_SRQ : Flush: (marked) SRQ. Marked SRQ lhs flushes
+event:0xd184 counters:0,1,2,3 um:zero minimum:1000 name:PM_MRK_LSU_FLUSH_ULD : Flush: (marked) Unaligned Load. Marked unaligned load flushes
+event:0xd186 counters:0,1,2,3 um:zero minimum:1000 name:PM_MRK_LSU_FLUSH_UST : Flush: (marked) Unaligned Store. Marked unaligned store flushes
+event:0x40164 counters:3 um:zero minimum:1000 name:PM_MRK_LSU_REJECT : LSU marked reject (up to 2 per cycle).
+event:0x30164 counters:2 um:zero minimum:1000 name:PM_MRK_LSU_REJECT_ERAT_MISS : LSU marked reject due to ERAT (up to 2 per cycle).
+event:0x20112 counters:1 um:zero minimum:1000 name:PM_MRK_NTF_FIN : Marked next to finish instruction finished.
+event:0x1d15e counters:0 um:zero minimum:10000 name:PM_MRK_RUN_CYC : Marked run cycles.
+event:0x1d15a counters:0 um:zero minimum:1000 name:PM_MRK_SRC_PREF_TRACK_EFF : Marked src pref track was effective.
+event:0x3d15a counters:2 um:zero minimum:1000 name:PM_MRK_SRC_PREF_TRACK_INEFF : Prefetch tracked was ineffective for marked src.
+event:0x4d15c counters:3 um:zero minimum:1000 name:PM_MRK_SRC_PREF_TRACK_MOD : Prefetch tracked was moderate for marked src.
+event:0x1d15c counters:0 um:zero minimum:1000 name:PM_MRK_SRC_PREF_TRACK_MOD_L2 : Marked src Prefetch Tracked was moderate (source L2).
+event:0x3d15c counters:2 um:zero minimum:1000 name:PM_MRK_SRC_PREF_TRACK_MOD_L3 : Prefetch tracked was moderate (L3 hit) for marked src.
+event:0x3013e counters:2 um:zero minimum:1000 name:PM_MRK_STALL_CMPLU_CYC : Marked Group Completion Stall cycles (use edge detect to count #).
+event:0x3e158 counters:2 um:zero minimum:1000 name:PM_MRK_STCX_FAIL : marked stcx failed.
+event:0x30134 counters:2 um:zero minimum:1000 name:PM_MRK_ST_CMPL_INT : marked store complete (data home) with intervention.
+event:0x3f150 counters:2 um:zero minimum:1000 name:PM_MRK_ST_DRAIN_TO_L2DISP_CYC : cycles to drain st from core to L2.
+event:0x3012c counters:2 um:zero minimum:1000 name:PM_MRK_ST_FWD : Marked st forwards.
+event:0x1f150 counters:0 um:zero minimum:1000 name:PM_MRK_ST_L2DISP_TO_CMPL_CYC : cycles from L2 rc disp to l2 rc completion.
+event:0x20138 counters:1 um:zero minimum:1000 name:PM_MRK_ST_NEST : Marked store sent to nest.
+event:0x1c15a counters:0 um:zero minimum:1000 name:PM_MRK_TGT_PREF_TRACK_EFF : Marked target pref track was effective.
+event:0x3c15a counters:2 um:zero minimum:1000 name:PM_MRK_TGT_PREF_TRACK_INEFF : Prefetch tracked was ineffective for marked target.
+event:0x4c15c counters:3 um:zero minimum:1000 name:PM_MRK_TGT_PREF_TRACK_MOD : Prefetch tracked was moderate for marked target.
+event:0x1c15c counters:0 um:zero minimum:1000 name:PM_MRK_TGT_PREF_TRACK_MOD_L2 : Marked target Prefetch Tracked was moderate (source L2).
+event:0x3c15c counters:2 um:zero minimum:1000 name:PM_MRK_TGT_PREF_TRACK_MOD_L3 : Prefetch tracked was moderate (L3 hit) for marked target.
+event:0x30132 counters:2 um:zero minimum:1000 name:PM_MRK_VSU_FIN : vsu (fpu) marked instr finish.
+event:0x3d15e counters:2 um:zero minimum:10000 name:PM_MULT_MRK : mult marked instr.
+event:0x20b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_NESTED_TEND : Completion time nested tend
+event:0x3006e counters:2 um:zero minimum:10000 name:PM_NEST_REF_CLK : Nest reference clocks.
+event:0x20b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_NON_FAV_TBEGIN : Dispatch time non favored tbegin
+event:0x328084 counters:1 um:zero minimum:10000 name:PM_NON_TM_RST_SC : non tm snp rst tm sc
+event:0x2001a counters:1 um:zero minimum:10000 name:PM_NTCG_ALL_FIN : Cycles after all instructions have finished to group completed.
+event:0x20ac counters:0,1,2,3 um:zero minimum:10000 name:PM_OUTER_TBEGIN : Completion time outer tbegin
+event:0x20ae counters:0,1,2,3 um:zero minimum:10000 name:PM_OUTER_TEND : Completion time outer tend
+event:0x20010 counters:1 um:zero minimum:10000 name:PM_PMC1_OVERFLOW : Overflow from counter 1.
+event:0x30010 counters:2 um:zero minimum:10000 name:PM_PMC2_OVERFLOW : Overflow from counter 2.
+event:0x30020 counters:2 um:zero minimum:10000 name:PM_PMC2_REWIND : PMC2 Rewind Event (did not match condition).
+event:0x10022 counters:0 um:zero minimum:10000 name:PM_PMC2_SAVED : PMC2 Rewind Value saved (matched condition).
+event:0x40010 counters:3 um:zero minimum:10000 name:PM_PMC3_OVERFLOW : Overflow from counter 3.
+event:0x10010 counters:0 um:zero minimum:10000 name:PM_PMC4_OVERFLOW : Overflow from counter 4.
+event:0x10020 counters:0 um:zero minimum:10000 name:PM_PMC4_REWIND : PMC4 Rewind Event (did not match condition).
+event:0x30022 counters:2 um:zero minimum:10000 name:PM_PMC4_SAVED : PMC4 Rewind Value saved (matched condition).
+event:0x10024 counters:0 um:zero minimum:10000 name:PM_PMC5_OVERFLOW : Overflow from counter 5.
+event:0x30024 counters:2 um:zero minimum:10000 name:PM_PMC6_OVERFLOW : Overflow from counter 6.
+event:0x2005a counters:1 um:zero minimum:10000 name:PM_PREF_TRACKED : Total number of Prefetch Operations that were tracked.
+event:0x1005a counters:0 um:zero minimum:10000 name:PM_PREF_TRACK_EFF : Prefetch Tracked was effective.
+event:0x3005a counters:2 um:zero minimum:10000 name:PM_PREF_TRACK_INEFF : Prefetch tracked was ineffective.
+event:0x4005a counters:3 um:zero minimum:10000 name:PM_PREF_TRACK_MOD : Prefetch tracked was moderate.
+event:0x1005c counters:0 um:zero minimum:10000 name:PM_PREF_TRACK_MOD_L2 : Prefetch Tracked was moderate (source L2).
+event:0x3005c counters:2 um:zero minimum:10000 name:PM_PREF_TRACK_MOD_L3 : Prefetch tracked was moderate (L3).
+event:0x40014 counters:3 um:zero minimum:10000 name:PM_PROBE_NOP_DISP : ProbeNops dispatched.
+event:0xe084 counters:0,1,2,3 um:zero minimum:10000 name:PM_PTE_PREFETCH : PTE prefetches
+event:0x10054 counters:0 um:zero minimum:10000 name:PM_PUMP_CPRED : Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).
+event:0x40052 counters:3 um:zero minimum:10000 name:PM_PUMP_MPRED : Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).
+event:0x16081 counters:0 um:zero minimum:10000 name:PM_RC0_ALLOC : 0.0
+event:0x16080 counters:0 um:zero minimum:10000 name:PM_RC0_BUSY : RC mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)
+event:0x200301ea counters:2 um:zero minimum:10000 name:PM_RC_LIFETIME_EXC_1024 : Reload latency exceeded 1024 cyc
+event:0x200401ec counters:3 um:zero minimum:10000 name:PM_RC_LIFETIME_EXC_2048 : Threshold counter exceeded a value of 2048
+event:0x200101e8 counters:0 um:zero minimum:10000 name:PM_RC_LIFETIME_EXC_256 : Threshold counter exceed a count of 256
+event:0x200201e6 counters:1 um:zero minimum:10000 name:PM_RC_LIFETIME_EXC_32 : Reload latency exceeded 32 cyc
+event:0x36088 counters:2 um:zero minimum:10000 name:PM_RC_USAGE : Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 RC machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running
+event:0x34808e counters:3 um:zero minimum:10000 name:PM_RD_CLEARING_SC : rd clearing sc
+event:0x34808c counters:3 um:zero minimum:10000 name:PM_RD_FORMING_SC : rd forming sc
+event:0x428086 counters:1 um:zero minimum:10000 name:PM_RD_HIT_PF : rd machine hit l3 pf machine
+event:0x20004 counters:1 um:zero minimum:10000 name:PM_REAL_SRQ_FULL : Out of real srq entries.
+event:0x3006c counters:2 um:zero minimum:10000 name:PM_RUN_CYC_SMT2_MODE : Cycles run latch is set and core is in SMT2 mode.
+event:0x2006a counters:1 um:zero minimum:10000 name:PM_RUN_CYC_SMT2_SHRD_MODE : Cycles run latch is set and core is in SMT2-shared mode.
+event:0x1006a counters:0 um:zero minimum:100000 name:PM_RUN_CYC_SMT2_SPLIT_MODE : Cycles run latch is set and core is in SMT2-split mode.
+event:0x2006c counters:1 um:zero minimum:10000 name:PM_RUN_CYC_SMT4_MODE : Cycles run latch is set and core is in SMT4 mode.
+event:0x4006c counters:3 um:zero minimum:100000 name:PM_RUN_CYC_SMT8_MODE : Cycles run latch is set and core is in SMT8 mode.
+event:0x1006c counters:0 um:zero minimum:100000 name:PM_RUN_CYC_ST_MODE : Cycles run latch is set and core is in ST mode.
+event:0x10008 counters:0 um:zero minimum:10000 name:PM_RUN_SPURR : Run SPURR.
+event:0xf082 counters:0,1,2,3 um:zero minimum:10000 name:PM_SEC_ERAT_HIT : secondary ERAT Hit
+event:0x508c counters:0,1,2,3 um:zero minimum:10000 name:PM_SHL_CREATED : Store-Hit-Load Table Entry Created
+event:0x508e counters:0,1,2,3 um:zero minimum:10000 name:PM_SHL_ST_CONVERT : Store-Hit-Load Table Read Hit with entry Enabled
+event:0x5090 counters:0,1,2,3 um:zero minimum:10000 name:PM_SHL_ST_DISABLE : Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)
+event:0x26085 counters:1 um:zero minimum:10000 name:PM_SN0_ALLOC : 0.0
+event:0x26084 counters:1 um:zero minimum:10000 name:PM_SN0_BUSY : SN mach 0 Busy. Used by PMU to sample ave RC livetime(mach0 used as sample point)
+event:0xd0b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_SNOOP_TLBIE : TLBIE snoop
+event:0x338088 counters:2 um:zero minimum:10000 name:PM_SNP_TM_HIT_M : snp tm st hit m mu
+event:0x33808a counters:2 um:zero minimum:10000 name:PM_SNP_TM_HIT_T : snp tm_st_hit t tn te
+event:0x4608c counters:3 um:zero minimum:10000 name:PM_SN_USAGE : Continuous 16 cycle(2to1) window where this signals rotates thru sampling each L2 SN machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running
+event:0x10028 counters:0 um:zero minimum:10000 name:PM_STALL_END_GCT_EMPTY : Count ended because GCT went empty.
+event:0x1e058 counters:0 um:zero minimum:10000 name:PM_STCX_FAIL : stcx failed .
+event:0xc090 counters:0,1,2,3 um:zero minimum:10000 name:PM_STCX_LSU : STCX executed reported at sent to nest
+event:0x717080 counters:0 um:zero minimum:10000 name:PM_ST_CAUSED_FAIL : Non TM St caused any thread to fail
+event:0x20016 counters:1 um:zero minimum:10000 name:PM_ST_CMPL : Store completion count.
+event:0x20018 counters:1 um:zero minimum:10000 name:PM_ST_FWD : Store forwards that finished.
+event:0x0 counters:0,1,2,3 um:zero minimum:10000 name:PM_SUSPENDED : Counter OFF.
+event:0x3090 counters:0,1,2,3 um:zero minimum:10000 name:PM_SWAP_CANCEL : SWAP cancel , rtag not available
+event:0x3092 counters:0,1,2,3 um:zero minimum:10000 name:PM_SWAP_CANCEL_GPR : SWAP cancel , rtag not available for gpr
+event:0x308c counters:0,1,2,3 um:zero minimum:10000 name:PM_SWAP_COMPLETE : swap cast in completed
+event:0x308e counters:0,1,2,3 um:zero minimum:10000 name:PM_SWAP_COMPLETE_GPR : swap cast in completed fpr gpr
+event:0x15152 counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_BR_LINK : Marked Branch and link branch that can cause a synchronous interrupt.
+event:0x1515c counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_BR_MPRED : Marked Branch mispredict that can cause a synchronous interrupt.
+event:0x15156 counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_FX_DIVIDE : Marked fixed point divide that can cause a synchronous interrupt.
+event:0x15158 counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_L2HIT : Marked L2 Hits that can throw a synchronous interrupt.
+event:0x1515a counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_L2MISS : Marked L2 Miss that can throw a synchronous interrupt.
+event:0x15154 counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_L3MISS : Marked L3 misses that can throw a synchronous interrupt.
+event:0x15150 counters:0 um:zero minimum:10000 name:PM_SYNC_MRK_PROBE_NOP : Marked probeNops which can cause synchronous interrupts.
+event:0x30050 counters:2 um:zero minimum:10000 name:PM_SYS_PUMP_CPRED : Initial and Final Pump Scope and data sourced across this scope was system pump for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).
+event:0x30052 counters:2 um:zero minimum:10000 name:PM_SYS_PUMP_MPRED : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope(Chip/Group) OR Final Pump Scope(system) got data from source that was at smaller scope(Chip/group) Final pump was system pump and initial pump was chip or group or
+event:0x40050 counters:3 um:zero minimum:10000 name:PM_SYS_PUMP_MPRED_RTY : Final Pump Scope(system) to get data sourced, ended up larger than Initial Pump Scope (Chip or Group) for all data types excluding data prefetch (demand load,inst prefetch,inst fetch,xlate).
+event:0x10026 counters:0 um:zero minimum:10000 name:PM_TABLEWALK_CYC : Tablewalk Active.
+event:0xe086 counters:0,1,2,3 um:zero minimum:10000 name:PM_TABLEWALK_CYC_PREF : tablewalk qualified for pte prefetches
+event:0x20b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_TABORT_TRECLAIM : Completion time tabortnoncd, tabortcd, treclaim
+event:0xe0ba counters:0,1,2,3 um:zero minimum:10000 name:PM_TEND_PEND_CYC : TEND latency per thread
+event:0x2000c counters:1 um:zero minimum:100000 name:PM_THRD_ALL_RUN_CYC : All Threads in Run_cycles (was both threads in run_cycles).
+event:0x10012 counters:0 um:zero minimum:10000 name:PM_THRD_GRP_CMPL_BOTH_CYC : Two threads finished same cycle (gated by run latch).
+event:0x40bc counters:0,1,2,3 um:zero minimum:1000 name:PM_THRD_PRIO_0_1_CYC : Cycles thread running at priority level 0 or 1
+event:0x40be counters:0,1,2,3 um:zero minimum:1000 name:PM_THRD_PRIO_2_3_CYC : Cycles thread running at priority level 2 or 3
+event:0x5080 counters:0,1,2,3 um:zero minimum:1000 name:PM_THRD_PRIO_4_5_CYC : Cycles thread running at priority level 4 or 5
+event:0x5082 counters:0,1,2,3 um:zero minimum:1000 name:PM_THRD_PRIO_6_7_CYC : Cycles thread running at priority level 6 or 7
+event:0x3098 counters:0,1,2,3 um:zero minimum:10000 name:PM_THRD_REBAL_CYC : cycles rebalance was active
+event:0x4016e counters:3 um:zero minimum:10000 name:PM_THRESH_NOT_MET : Threshold counter did not meet threshold.
+event:0x30058 counters:2 um:zero minimum:10000 name:PM_TLBIE_FIN : tlbie finished.
+event:0x20066 counters:1 um:zero minimum:10000 name:PM_TLB_MISS : TLB Miss (I + D).
+event:0x20b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_BEGIN_ALL : Tm any tbegin
+event:0x318082 counters:0 um:zero minimum:10000 name:PM_TM_CAM_OVERFLOW : l3 tm cam overflow during L2 co of SC
+event:0x74708c counters:3 um:zero minimum:10000 name:PM_TM_CAP_OVERFLOW : TM Footprint Capacity Overflow
+event:0x20ba counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_END_ALL : Tm any tend
+event:0x3086 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_CONF_NON_TM : TEXAS fail reason @ completion
+event:0x3088 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_CON_TM : TEXAS fail reason @ completion
+event:0xe0b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_DISALLOW : TM fail disallow
+event:0x3084 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_FOOTPRINT_OVERFLOW : TEXAS fail reason @ completion
+event:0xe0b8 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_NON_TX_CONFLICT : Non transactional conflict from LSU, whatever gets reported to texas
+event:0x308a counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_SELF : TEXAS fail reason @ completion
+event:0xe0b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_TLBIE : TLBIE hit bloom filter
+event:0xe0b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_FAIL_TX_CONFLICT : Transactional conflict from LSU, whatever gets reported to texas
+event:0x727086 counters:1 um:zero minimum:10000 name:PM_TM_FAV_CAUSED_FAIL : TM Load (fav) caused another thread to fail
+event:0x717082 counters:0 um:zero minimum:10000 name:PM_TM_LD_CAUSED_FAIL : Non TM Ld caused any thread to fail
+event:0x727084 counters:1 um:zero minimum:10000 name:PM_TM_LD_CONF : TM Load (fav or non-fav) ran into conflict (failed)
+event:0x328086 counters:1 um:zero minimum:10000 name:PM_TM_RST_SC : tm snp rst tm sc
+event:0x318080 counters:0 um:zero minimum:10000 name:PM_TM_SC_CO : l3 castout tm Sc line
+event:0x73708a counters:2 um:zero minimum:10000 name:PM_TM_ST_CAUSED_FAIL : TM Store (fav or non-fav) caused another thread to fail
+event:0x737088 counters:2 um:zero minimum:10000 name:PM_TM_ST_CONF : TM Store (fav or non-fav) ran into conflict (failed)
+event:0x20bc counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_TBEGIN : Tm nested tbegin
+event:0x10060 counters:0 um:zero minimum:10000 name:PM_TM_TRANS_RUN_CYC : run cycles in transactional state.
+event:0x30060 counters:2 um:zero minimum:10000 name:PM_TM_TRANS_RUN_INST : Instructions completed in transactional state.
+event:0x3080 counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_TRESUME : Tm resume
+event:0x20be counters:0,1,2,3 um:zero minimum:10000 name:PM_TM_TSUSPEND : Tm suspend
+event:0x2e012 counters:1 um:zero minimum:10000 name:PM_TM_TX_PASS_RUN_CYC : run cycles spent in successful transactions.
+event:0x4e014 counters:3 um:zero minimum:10000 name:PM_TM_TX_PASS_RUN_INST : run instructions spent in successful transactions.
+event:0xe08c counters:0,1,2,3 um:zero minimum:10000 name:PM_UP_PREF_L3 : Micropartition prefetch
+event:0xe08e counters:0,1,2,3 um:zero minimum:10000 name:PM_UP_PREF_POINTER : Micropartition pointer prefetches
+event:0xa0a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_16FLOP : Sixteen flops operation (SP vector versions of fdiv,fsqrt)
+event:0xa080 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_1FLOP : one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished. Decode into 1,2,4 FLOP according to instr IOP, multiplied by #vector elements according to route (e.g. x1, x2, x4). Only if instr sends finish to ISU
+event:0xa098 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_2FLOP : two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)
+event:0xa09c counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_4FLOP : four flops operation (scalar fdiv, fsqrt, DP vector version of fmadd, fnmadd, fmsub, fnmsub, SP vector versions of single flop instructions)
+event:0xa0a0 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_8FLOP : eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)
+event:0xb0a4 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_COMPLEX_ISSUED : Complex VMX instruction issued
+event:0xb0b4 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_CY_ISSUED : Cryptographic instruction RFC02196 Issued
+event:0xb0a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_DD_ISSUED : 64BIT Decimal Issued
+event:0xa08c counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_DP_2FLOP : DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg
+event:0xa090 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_DP_FMA : DP vector version of fmadd,fnmadd,fmsub,fnmsub
+event:0xa094 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_DP_FSQRT_FDIV : DP vector versions of fdiv,fsqrt
+event:0xb0ac counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_DQ_ISSUED : 128BIT Decimal Issued
+event:0xb0b0 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_EX_ISSUED : Direct move 32/64b VRFtoGPR RFC02206 Issued
+event:0xa0bc counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_FIN : VSU0 Finished an instruction
+event:0xa084 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_FMA : two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!
+event:0xb098 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_FPSCR : Move to/from FPSCR type instruction issued on Pipe 0
+event:0xa088 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_FSQRT_FDIV : four flops operation (fdiv,fsqrt) Scalar Instructions only!
+event:0xb090 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_PERMUTE_ISSUED : Permute VMX Instruction Issued
+event:0xb088 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_SCALAR_DP_ISSUED : Double Precision scalar instruction issued on Pipe0
+event:0xb094 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_SIMPLE_ISSUED : Simple VMX instruction issued
+event:0xa0a8 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_SINGLE : FPU single precision
+event:0xb09c counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_SQ : Store Vector Issued
+event:0xb08c counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_STF : FPU store (SP or DP) issued on Pipe0
+event:0xb080 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_VECTOR_DP_ISSUED : Double Precision vector instruction issued on Pipe0
+event:0xb084 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU0_VECTOR_SP_ISSUED : Single Precision vector instruction issued (executed)
+event:0xa0a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_16FLOP : Sixteen flops operation (SP vector versions of fdiv,fsqrt)
+event:0xa082 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_1FLOP : one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation finished
+event:0xa09a counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_2FLOP : two flops operation (scalar fmadd, fnmadd, fmsub, fnmsub and DP vector versions of single flop instructions)
+event:0xa09e counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_4FLOP : four flops operation (scalar fdiv, fsqrt, DP vector version of fmadd, fnmadd, fmsub, fnmsub, SP vector versions of single flop instructions)
+event:0xa0a2 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_8FLOP : eight flops operation (DP vector versions of fdiv,fsqrt and SP vector versions of fmadd,fnmadd,fmsub,fnmsub)
+event:0xb0a6 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_COMPLEX_ISSUED : Complex VMX instruction issued
+event:0xb0b6 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_CY_ISSUED : Cryptographic instruction RFC02196 Issued
+event:0xb0aa counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_DD_ISSUED : 64BIT Decimal Issued
+event:0xa08e counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_DP_2FLOP : DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres ,fsqrte, fneg
+event:0xa092 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_DP_FMA : DP vector version of fmadd,fnmadd,fmsub,fnmsub
+event:0xa096 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_DP_FSQRT_FDIV : DP vector versions of fdiv,fsqrt
+event:0xb0ae counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_DQ_ISSUED : 128BIT Decimal Issued
+event:0xb0b2 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_EX_ISSUED : Direct move 32/64b VRFtoGPR RFC02206 Issued
+event:0xa0be counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_FIN : VSU1 Finished an instruction
+event:0xa086 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_FMA : two flops operation (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only!
+event:0xb09a counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_FPSCR : Move to/from FPSCR type instruction issued on Pipe 0
+event:0xa08a counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_FSQRT_FDIV : four flops operation (fdiv,fsqrt) Scalar Instructions only!
+event:0xb092 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_PERMUTE_ISSUED : Permute VMX Instruction Issued
+event:0xb08a counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_SCALAR_DP_ISSUED : Double Precision scalar instruction issued on Pipe1
+event:0xb096 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_SIMPLE_ISSUED : Simple VMX instruction issued
+event:0xa0aa counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_SINGLE : FPU single precision
+event:0xb09e counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_SQ : Store Vector Issued
+event:0xb08e counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_STF : FPU store (SP or DP) issued on Pipe1
+event:0xb082 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_VECTOR_DP_ISSUED : Double Precision vector instruction issued on Pipe1
+event:0xb086 counters:0,1,2,3 um:zero minimum:10000 name:PM_VSU1_VECTOR_SP_ISSUED : Single Precision vector instruction issued (executed)
diff --git a/events/ppc64/power8/unit_masks b/events/ppc64/power8/unit_masks
new file mode 100644
index 0000000..988dd41
--- /dev/null
+++ b/events/ppc64/power8/unit_masks
@@ -0,0 +1,9 @@
+#
+# Copyright OProfile authors
+# Copyright (c) International Business Machines, 2013.
+# Contributed by Maynard Johnson .
+#
+# ppc64 POWER8 possible unit masks
+#
+name:zero type:mandatory default:0x0
+ 0x0 No unit mask
diff --git a/events/rtc/events b/events/rtc/events
deleted file mode 100644
index cce44b0..0000000
--- a/events/rtc/events
+++ /dev/null
@@ -1,3 +0,0 @@
-# RTC events
-#
-name:RTC_INTERRUPTS event:0xff counters:0 um:zero minimum:2 : RTC interrupts/sec (rounded up to power of two)
diff --git a/events/s390/z10/events b/events/s390/z10/events
index 08a2e74..9c975ae 100644
--- a/events/s390/z10/events
+++ b/events/s390/z10/events
@@ -2,6 +2,7 @@
# Copyright (c) International Business Machines, 2011.
# Contributed by Andreas Krebbel .
#
-# IBM System z10 Basic Mode Sampling events
+# IBM System z10 events for operf/ocount
#
-event:0x00 counters:0 um:zero minimum:2202 name:HWSAMPLING : Sampling using Basic Mode Hardware Sampling
+event:0x00 counters:0 um:zero minimum:2202 name:CPU_CYCLES : Processor cycles
+event:0x01 counters:0 um:zero minimum:2202 name:INSTRUCTIONS : Instructions completed
diff --git a/events/s390/z196/events b/events/s390/z196/events
index 6c4bd65..c9a7526 100644
--- a/events/s390/z196/events
+++ b/events/s390/z196/events
@@ -2,6 +2,6 @@
# Copyright (c) International Business Machines, 2011.
# Contributed by Andreas Krebbel .
#
-# zEnterprise z196 Basic Mode Sampling events
+# zEnterprise z196 events for operf/ocount
#
include:s390/z10
diff --git a/events/s390/zEC12/events b/events/s390/zEC12/events
new file mode 100644
index 0000000..f2fb415
--- /dev/null
+++ b/events/s390/zEC12/events
@@ -0,0 +1,8 @@
+# Copyright OProfile authors
+# Copyright (c) International Business Machines, 2013.
+# Contributed by Andreas Krebbel .
+#
+# IBM Enterprise EC12 events for operf/ocount
+#
+event:0x00 counters:0 um:zero minimum:19264 name:CPU_CYCLES : Processor cycles
+event:0x01 counters:0 um:zero minimum:19264 name:INSTRUCTIONS : Instructions completed
diff --git a/events/s390/zEC12/unit_masks b/events/s390/zEC12/unit_masks
new file mode 100644
index 0000000..cfc4dc1
--- /dev/null
+++ b/events/s390/zEC12/unit_masks
@@ -0,0 +1,7 @@
+# Copyright OProfile authors
+# Copyright (c) International Business Machines, 2013.
+# Contributed by Andreas Krebbel .
+#
+# S/390 Basic Mode Hardware Sampling unit masks
+#
+include:s390/z10
diff --git a/events/x86-64/family10/events b/events/x86-64/family10/events
index 0213f26..3f6ae1e 100644
--- a/events/x86-64/family10/events
+++ b/events/x86-64/family10/events
@@ -9,13 +9,12 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 10h Processors,
# Publication# 31116, Revision 3.48, April 22, 2010
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.4
+# Revision: 1.5
#
# ChangeLog:
+# 1.5: 11 August 2014
+# - Removal of IBS events due to missing support in Operf
+#
# 1.4: 11 March 2011
# - Update to BKDG revision 3.48
# - Fix typo in the description for event 0xf244
@@ -179,72 +178,3 @@ event:0x4e1 counters:0,1,2,3 um:l3_cache minimum:500 name:L3_CACHE_MISSES : Numb
event:0x4e2 counters:0,1,2,3 um:l3_fill minimum:500 name:L3_FILLS_CAUSED_BY_L2_EVICTIONS : Number of L3 fills caused by L2 evictions per core
event:0x4e3 counters:0,1,2,3 um:l3_evict minimum:500 name:L3_EVICTIONS : Number of L3 cache line evictions by cache state
event:0x4ed counters:0,1,2,3 um:non_cancelled_l3_read_requests minimum:500 name:NON_CANCELLED_L3_READ_REQUESTS : Non-cancelled L3 Read Requests (Rev D)
-
-###############################
-# IBS FETCH EVENTS
-###############################
-event:0xf000 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ALL : All IBS fetch samples
-event:0xf001 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_KILLED : IBS fetch killed
-event:0xf002 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ATTEMPTED : IBS fetch attempted
-event:0xf003 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_COMPLETED : IBS fetch completed
-event:0xf004 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ABORTED : IBS fetch aborted
-event:0xf005 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ITLB_HITS : IBS ITLB hit
-event:0xf006 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_HITS : IBS L1 ITLB misses (and L2 ITLB hits)
-event:0xf007 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_MISSES : IBS L1 L2 ITLB miss
-event:0xf008 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_MISSES : IBS Instruction cache misses
-event:0xf009 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_HITS : IBS Instruction cache hit
-event:0xf00A ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_4K_PAGE : IBS 4K page translation
-event:0xf00B ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_2M_PAGE : IBS 2M page translation
-#
-event:0xf00E ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_LATENCY : IBS fetch latency
-
-###############################
-# IBS OP EVENTS
-###############################
-event:0xf100 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL : All IBS op samples
-event:0xf101 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAG_TO_RETIRE : IBS tag-to-retire cycles
-event:0xf102 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_COMP_TO_RET : IBS completion-to-retire cycles
-event:0xf103 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BRANCH_RETIRED : IBS branch op
-event:0xf104 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH : IBS mispredicted branch op
-event:0xf105 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAKEN_BRANCH : IBS taken branch op
-event:0xf106 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH_TAKEN : IBS mispredicted taken branch op
-event:0xf107 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RETURNS : IBS return op
-event:0xf108 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_RETURNS : IBS mispredicted return op
-event:0xf109 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RESYNC : IBS resync op
-event:0xf200 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL_LOAD_STORE : IBS all load store ops
-event:0xf201 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOAD : IBS load ops
-event:0xf202 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_STORE : IBS store ops
-event:0xf203 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_HITS : IBS L1 DTLB hit
-event:0xf204 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_MISS_L2_DTLB_HIT : IBS L1 DTLB misses L2 hits
-event:0xf205 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_L2_DTLB_MISS : IBS L1 and L2 DTLB misses
-event:0xf206 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_CACHE_MISS : IBS data cache misses
-event:0xf207 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_HITS : IBS data cache hits
-event:0xf208 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISALIGNED_DATA_ACC : IBS misaligned data access
-event:0xf209 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_LOAD : IBS bank conflict on load op
-event:0xf20A ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_STORE : IBS bank conflict on store op
-event:0xf20B ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_FORWARD : IBS store-to-load forwarded
-event:0xf20C ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_CANCELLED : IBS store-to-load cancelled
-event:0xf20D ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCUC_MEM_ACC : IBS UC memory access
-event:0xf20E ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCWC_MEM_ACC : IBS WC memory access
-event:0xf20F ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOCKED : IBS locked operation
-event:0xf210 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MAB_HIT : IBS MAB hit
-event:0xf211 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_4K : IBS L1 DTLB 4K page
-event:0xf212 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_2M : IBS L1 DTLB 2M page
-event:0xf213 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_1G : IBS L1 DTLB 1G page
-event:0xf215 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_4K : IBS L2 DTLB 4K page
-event:0xf216 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_2M : IBS L2 DTLB 2M page
-event:0xf217 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_1G : IBS L2 DTLB 1G page
-event:0xf219 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DC_LOAD_LAT : IBS data cache miss load latency
-event:0xf240 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_ONLY : IBS northbridge local
-event:0xf241 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_ONLY : IBS northbridge remote
-event:0xf242 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_L3 : IBS northbridge local L3
-event:0xf243 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE : IBS northbridge local core L1 or L2 cache
-event:0xf244 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE : IBS northbridge remote core L1, L2, L3 cache
-event:0xf245 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_DRAM : IBS northbridge local DRAM
-event:0xf246 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_DRAM : IBS northbridge remote DRAM
-event:0xf247 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_OTHER : IBS northbridge local APIC MMIO Config PCI
-event:0xf248 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_OTHER : IBS northbridge remote APIC MMIO Config PCI
-event:0xf249 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_MODIFIED : IBS northbridge cache modified state
-event:0xf24A ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_OWNED : IBS northbridge cache owned state
-event:0xf24B ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE_LAT : IBS northbridge local cache latency
-event:0xf24C ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE_LAT : IBS northbridge remote cache latency
diff --git a/events/x86-64/family10/unit_masks b/events/x86-64/family10/unit_masks
index 5c42206..e18504b 100644
--- a/events/x86-64/family10/unit_masks
+++ b/events/x86-64/family10/unit_masks
@@ -13,9 +13,12 @@
# Publication# 40546, Revision 3.13, February 2011
# (Note: For IBS Derived Performance Events)
#
-# Revision: 1.4
+# Revision: 1.5
#
# ChangeLog:
+# 1.5: 11 August 2014
+# - Removal of IBS events due to missing support in Operf
+#
# 1.4: 11 March 2011
# - Update to BKDG revision 3.48
# - Fix typo in the description for event 0xf244
@@ -377,10 +380,6 @@ name:retired_x87_fp type:bitmask default:0x07
0x01 Add/subtract ops
0x02 Multiply ops
0x04 Divide ops
-name:ibs_op type:bitmask default:0x01
- 0x00 Using IBS OP cycle count mode
- 0x01 Using IBS OP dispatch count mode
- 0x02 Enable IBS OP Memory Access Log
name:non_cancelled_l3_read_requests type:bitmask default:0xf7
0x01 RbBlk
0x02 RbBlkS
diff --git a/events/x86-64/family12h/events b/events/x86-64/family12h/events
index eb5ac5c..b99bbe5 100644
--- a/events/x86-64/family12h/events
+++ b/events/x86-64/family12h/events
@@ -10,13 +10,12 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 12h Processors,
# Publication# 41131, Revision 1.13, March 01, 2011
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.2
+# Revision: 1.3
#
# ChangeLog:
+# 1.3: 11 August 2014
+# - Remove IBS events due to missing operf support
+#
# 1.2: 09 March 2011
# - Update with BKDG Rev.1.13 (preliminary)
#
@@ -130,63 +129,3 @@ event:0x0ee counters:0,1,2,3 um:gart minimum:500 name:DEV_EVENTS : DEV Events
event:0x1f0 counters:0,1,2,3 um:mem_control_request minimum:500 name:MEMORY_CONTROLLER_REQUESTS : Memory Controller Requests
event:0x1e9 counters:0,1,2,3 um:sideband_signals minimum:500 name:SIDEBAND_SIGNALS : Sideband Signals and Special Cycles
event:0x1ea counters:0,1,2,3 um:interrupt_events minimum:500 name:INTERRUPT_EVENTS : Interrupt Events
-event:0xf000 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ALL : All IBS fetch samples
-event:0xf001 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_KILLED : IBS fetch killed
-event:0xf002 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ATTEMPTED : IBS fetch attempted
-event:0xf003 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_COMPLETED : IBS fetch completed
-event:0xf004 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ABORTED : IBS fetch aborted
-event:0xf005 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ITLB_HITS : IBS ITLB hit
-event:0xf006 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_HITS : IBS L1 ITLB misses (and L2 ITLB hits)
-event:0xf007 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_MISSES : IBS L1 L2 ITLB miss
-event:0xf008 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_MISSES : IBS instruction cache misses
-event:0xf009 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_HITS : IBS instruction cache hit
-event:0xf00a ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_4K_PAGE : IBS 4K page translation
-event:0xf00b ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_2M_PAGE : IBS 2M page translation
-event:0xf00e ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_LATENCY : IBS fetch latency
-event:0xf100 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL : All IBS op samples
-event:0xf101 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAG_TO_RETIRE : IBS tag-to-retire cycles
-event:0xf102 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_COMP_TO_RET : IBS completion-to-retire cycles
-event:0xf103 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BRANCH_RETIRED : IBS branch op
-event:0xf104 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH : IBS mispredicted branch op
-event:0xf105 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAKEN_BRANCH : IBS taken branch op
-event:0xf106 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH_TAKEN : IBS mispredicted taken branch op
-event:0xf107 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RETURNS : IBS return op
-event:0xf108 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_RETURNS : IBS mispredicted return op
-event:0xf109 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RESYNC : IBS resync op
-event:0xf200 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL_LOAD_STORE : IBS all load store ops
-event:0xf201 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOAD : IBS load ops
-event:0xf202 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_STORE : IBS store ops
-event:0xf203 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_HITS : IBS L1 DTLB hit
-event:0xf204 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_MISS_L2_DTLB_HIT : IBS L1 DTLB misses L2 hits
-event:0xf205 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_L2_DTLB_MISS : IBS L1 and L2 DTLB misses
-event:0xf206 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_CACHE_MISS : IBS data cache misses
-event:0xf207 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_HITS : IBS data cache hits
-event:0xf208 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISALIGNED_DATA_ACC : IBS misaligned data access
-event:0xf209 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_LOAD : IBS bank conflict on load op
-event:0xf20a ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_STORE : IBS bank conflict on store op
-event:0xf20b ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_FORWARD : IBS store-to-load forwarded
-event:0xf20c ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_CANCELLED : IBS store-to-load cancelled
-event:0xf20d ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCUC_MEM_ACC : IBS UC memory access
-event:0xf20e ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCWC_MEM_ACC : IBS WC memory access
-event:0xf20f ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOCKED : IBS locked operation
-event:0xf210 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MAB_HIT : IBS MAB hit
-event:0xf211 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_4K : IBS L1 DTLB 4K page
-event:0xf212 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_2M : IBS L1 DTLB 2M page
-event:0xf213 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_1G : IBS L1 DTLB 1G page
-event:0xf215 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_4K : IBS L2 DTLB 4K page
-event:0xf216 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_2M : IBS L2 DTLB 2M page
-event:0xf217 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_1G : IBS L2 DTLB 1G page
-event:0xf219 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DC_LOAD_LAT : IBS data cache miss load latency
-event:0xf240 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_ONLY : IBS Northbridge local
-event:0xf241 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_ONLY : IBS Northbridge remote
-event:0xf242 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_L3 : IBS Northbridge local L3
-event:0xf243 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE : IBS Northbridge local core L1 or L2 cache
-event:0xf244 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE : IBS Northbridge local core L1, L2, L3 cache
-event:0xf245 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_DRAM : IBS Northbridge local DRAM
-event:0xf246 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_DRAM : IBS Northbridge remote DRAM
-event:0xf247 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_OTHER : IBS Northbridge local APIC MMIO Config PCI
-event:0xf248 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_OTHER : IBS Northbridge remote APIC MMIO Config PCI
-event:0xf249 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_MODIFIED : IBS Northbridge cache modified state
-event:0xf24a ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_OWNED : IBS Northbridge cache owned state
-event:0xf24b ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE_LAT : IBS Northbridge local cache latency
-event:0xf24c ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE_LAT : IBS Northbridge remote cache latency
diff --git a/events/x86-64/family12h/unit_masks b/events/x86-64/family12h/unit_masks
index 4f97c3b..f824490 100644
--- a/events/x86-64/family12h/unit_masks
+++ b/events/x86-64/family12h/unit_masks
@@ -10,13 +10,12 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 12h Processors,
# Publication# 41131, Revision 1.13, March 01, 2011
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.2
+# Revision: 1.3
#
# ChangeLog:
+# 1.3: 11 August 2014
+# - Remove IBS events due to missing operf support
+#
# 1.2: 09 March 2011
# - Update with BKDG Rev.1.13 (preliminary)
#
@@ -266,8 +265,3 @@ name:interrupt_events type:bitmask default:0xff
0x20 STARTUP
0x40 INT
0x80 EOI
-name:ibs_op type:bitmask default:0x01
- 0x00 Using IBS OP cycle count mode
- 0x01 Using IBS OP dispatch count mode
- 0x02 Enable IBS OP Memory Access Log
- 0x04 Enable IBS OP Branch Target Address Log
diff --git a/events/x86-64/family14h/events b/events/x86-64/family14h/events
index cd05d28..956bc24 100644
--- a/events/x86-64/family14h/events
+++ b/events/x86-64/family14h/events
@@ -10,13 +10,12 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 14h Processors,
# Publication# 43170, Revision 3.04, Feb 16, 2011
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.2
+# Revision: 1.3
#
# ChangeLog:
+# 1.3: 11 August 2014
+# - Remove IBS events due to missing support in Operf
+#
# 1.2: 11 March 2011
# - Update to BKDG Rev.3.04
#
@@ -109,63 +108,3 @@ event:0x0ee counters:0,1,2,3 um:gart minimum:500 name:DEV_EVENTS : DEV Events
event:0x1f0 counters:0,1,2,3 um:mem_control_request minimum:500 name:MEMORY_CONTROLLER_REQUESTS : Memory Controller Requests
event:0x1e9 counters:0,1,2,3 um:sideband_signals minimum:500 name:SIDEBAND_SIGNALS : Sideband Signals and Special Cycles
event:0x1ea counters:0,1,2,3 um:interrupt_events minimum:500 name:INTERRUPT_EVENTS : Interrupt Events
-event:0xf000 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ALL : All IBS fetch samples
-event:0xf001 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_KILLED : IBS fetch killed
-event:0xf002 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ATTEMPTED : IBS fetch attempted
-event:0xf003 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_COMPLETED : IBS fetch completed
-event:0xf004 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ABORTED : IBS fetch aborted
-event:0xf005 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ITLB_HITS : IBS ITLB hit
-event:0xf006 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_HITS : IBS L1 ITLB misses (and L2 ITLB hits)
-event:0xf007 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_MISSES : IBS L1 L2 ITLB miss
-event:0xf008 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_MISSES : IBS instruction cache misses
-event:0xf009 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_HITS : IBS instruction cache hit
-event:0xf00a ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_4K_PAGE : IBS 4K page translation
-event:0xf00b ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_2M_PAGE : IBS 2M page translation
-event:0xf00e ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_LATENCY : IBS fetch latency
-event:0xf100 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL : All IBS op samples
-event:0xf101 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAG_TO_RETIRE : IBS tag-to-retire cycles
-event:0xf102 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_COMP_TO_RET : IBS completion-to-retire cycles
-event:0xf103 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BRANCH_RETIRED : IBS branch op
-event:0xf104 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH : IBS mispredicted branch op
-event:0xf105 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAKEN_BRANCH : IBS taken branch op
-event:0xf106 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH_TAKEN : IBS mispredicted taken branch op
-event:0xf107 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RETURNS : IBS return op
-event:0xf108 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_RETURNS : IBS mispredicted return op
-event:0xf109 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RESYNC : IBS resync op
-event:0xf200 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL_LOAD_STORE : IBS all load store ops
-event:0xf201 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOAD : IBS load ops
-event:0xf202 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_STORE : IBS store ops
-event:0xf203 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_HITS : IBS L1 DTLB hit
-event:0xf204 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_MISS_L2_DTLB_HIT : IBS L1 DTLB misses L2 hits
-event:0xf205 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_L2_DTLB_MISS : IBS L1 and L2 DTLB misses
-event:0xf206 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_CACHE_MISS : IBS data cache misses
-event:0xf207 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_HITS : IBS data cache hits
-event:0xf208 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISALIGNED_DATA_ACC : IBS misaligned data access
-event:0xf209 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_LOAD : IBS bank conflict on load op
-event:0xf20a ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_STORE : IBS bank conflict on store op
-event:0xf20b ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_FORWARD : IBS store-to-load forwarded
-event:0xf20c ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_CANCELLED : IBS store-to-load cancelled
-event:0xf20d ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCUC_MEM_ACC : IBS UC memory access
-event:0xf20e ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCWC_MEM_ACC : IBS WC memory access
-event:0xf20f ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOCKED : IBS locked operation
-event:0xf210 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MAB_HIT : IBS MAB hit
-event:0xf211 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_4K : IBS L1 DTLB 4K page
-event:0xf212 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_2M : IBS L1 DTLB 2M page
-event:0xf213 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_1G : IBS L1 DTLB 1G page
-event:0xf215 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_4K : IBS L2 DTLB 4K page
-event:0xf216 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_2M : IBS L2 DTLB 2M page
-event:0xf217 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_1G : IBS L2 DTLB 1G page
-event:0xf219 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DC_LOAD_LAT : IBS data cache miss load latency
-event:0xf240 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_ONLY : IBS Northbridge local
-event:0xf241 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_ONLY : IBS Northbridge remote
-event:0xf242 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_L3 : IBS Northbridge local L3
-event:0xf243 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE : IBS Northbridge local core L1 or L2 cache
-event:0xf244 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE : IBS Northbridge local core L1, L2, L3 cache
-event:0xf245 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_DRAM : IBS Northbridge local DRAM
-event:0xf246 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_DRAM : IBS Northbridge remote DRAM
-event:0xf247 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_OTHER : IBS Northbridge local APIC MMIO Config PCI
-event:0xf248 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_OTHER : IBS Northbridge remote APIC MMIO Config PCI
-event:0xf249 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_MODIFIED : IBS Northbridge cache modified state
-event:0xf24a ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_OWNED : IBS Northbridge cache owned state
-event:0xf24b ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE_LAT : IBS Northbridge local cache latency
-event:0xf24c ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE_LAT : IBS Northbridge remote cache latency
diff --git a/events/x86-64/family14h/unit_masks b/events/x86-64/family14h/unit_masks
index 9e4484e..b722ced 100644
--- a/events/x86-64/family14h/unit_masks
+++ b/events/x86-64/family14h/unit_masks
@@ -10,13 +10,12 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 14h Processors,
# Publication# 43170, Revision 3.04, Feb 16, 2011
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.2
+# Revision: 1.3
#
# ChangeLog:
+# 1.3: 11 August 2014
+# - Remove IBS events due to missing support in Operf
+#
# 1.2: 11 March 2011
# - Update to BKDG Rev.3.04
#
@@ -239,8 +238,3 @@ name:interrupt_events type:bitmask default:0xff
0x20 STARTUP
0x40 INT
0x80 EOI
-name:ibs_op type:bitmask default:0x01
- 0x00 Using IBS OP cycle count mode
- 0x01 Using IBS OP dispatch count mode
- 0x02 Enable IBS OP Memory Access Log
- 0x04 Enable IBS OP Branch Target Address Log
diff --git a/events/x86-64/family15h/events b/events/x86-64/family15h/events
index faa9b90..cc7b49e 100644
--- a/events/x86-64/family15h/events
+++ b/events/x86-64/family15h/events
@@ -10,13 +10,12 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 15h Models 00h-0Fh Processors,
# Publication# 42301, Revision 1.12, February 16, 2011
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.3
+# Revision: 1.4
#
# ChangeLog:
+# 1.4: 11 August 2014
+# - Remove IBS events due to missing support in Operf
+#
# 1.3: 9 March 2011
# - Update to BKDG Rev 1.12 (still preliminary)
#
@@ -111,63 +110,3 @@ event:0x0de counters:0,1,2,3,4,5 um:zero minimum:500 name:DR2_BREAKPOINTS : DR2
event:0x0df counters:0,1,2,3,4,5 um:zero minimum:500 name:DR3_BREAKPOINTS : DR3 Breakpoint Match
event:0x1cf counters:0,1,2,3,4,5 um:ibs_ops_tagged minimum:50000 name:IBS_OPS_TAGGED : Tagged IBS Ops
event:0x1d8 counters:0,1,2,3,4,5 um:zero minimum:500 name:DISPATCH_STALL_FOR_STQ_FULL : Dispatch Stall for STQ Full
-event:0xf000 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ALL : All IBS fetch samples
-event:0xf001 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_KILLED : IBS fetch killed
-event:0xf002 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ATTEMPTED : IBS fetch attempted
-event:0xf003 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_COMPLETED : IBS fetch completed
-event:0xf004 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ABORTED : IBS fetch aborted
-event:0xf005 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ITLB_HITS : IBS ITLB hit
-event:0xf006 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_HITS : IBS L1 ITLB misses (and L2 ITLB hits)
-event:0xf007 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_L1_ITLB_MISSES_L2_ITLB_MISSES : IBS L1 L2 ITLB miss
-event:0xf008 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_MISSES : IBS instruction cache misses
-event:0xf009 ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_ICACHE_HITS : IBS instruction cache hit
-event:0xf00a ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_4K_PAGE : IBS 4K page translation
-event:0xf00b ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_2M_PAGE : IBS 2M page translation
-event:0xf00e ext:ibs_fetch um:zero minimum:50000 name:IBS_FETCH_LATENCY : IBS fetch latency
-event:0xf100 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL : All IBS op samples
-event:0xf101 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAG_TO_RETIRE : IBS tag-to-retire cycles
-event:0xf102 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_COMP_TO_RET : IBS completion-to-retire cycles
-event:0xf103 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BRANCH_RETIRED : IBS branch op
-event:0xf104 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH : IBS mispredicted branch op
-event:0xf105 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_TAKEN_BRANCH : IBS taken branch op
-event:0xf106 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_BRANCH_TAKEN : IBS mispredicted taken branch op
-event:0xf107 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RETURNS : IBS return op
-event:0xf108 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISPREDICTED_RETURNS : IBS mispredicted return op
-event:0xf109 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_RESYNC : IBS resync op
-event:0xf200 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_ALL_LOAD_STORE : IBS all load store ops
-event:0xf201 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOAD : IBS load ops
-event:0xf202 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_STORE : IBS store ops
-event:0xf203 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_HITS : IBS L1 DTLB hit
-event:0xf204 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_MISS_L2_DTLB_HIT : IBS L1 DTLB misses L2 hits
-event:0xf205 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_L2_DTLB_MISS : IBS L1 and L2 DTLB misses
-event:0xf206 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_CACHE_MISS : IBS data cache misses
-event:0xf207 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DATA_HITS : IBS data cache hits
-event:0xf208 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MISALIGNED_DATA_ACC : IBS misaligned data access
-event:0xf209 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_LOAD : IBS bank conflict on load op
-event:0xf20a ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_BANK_CONF_STORE : IBS bank conflict on store op
-event:0xf20b ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_FORWARD : IBS store-to-load forwarded
-event:0xf20c ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_CANCELLED : IBS store-to-load cancelled
-event:0xf20d ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCUC_MEM_ACC : IBS UC memory access
-event:0xf20e ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DCWC_MEM_ACC : IBS WC memory access
-event:0xf20f ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_LOCKED : IBS locked operation
-event:0xf210 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_MAB_HIT : IBS MAB hit
-event:0xf211 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_4K : IBS L1 DTLB 4K page
-event:0xf212 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_2M : IBS L1 DTLB 2M page
-event:0xf213 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L1_DTLB_1G : IBS L1 DTLB 1G page
-event:0xf215 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_4K : IBS L2 DTLB 4K page
-event:0xf216 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_2M : IBS L2 DTLB 2M page
-event:0xf217 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_L2_DTLB_1G : IBS L2 DTLB 1G page
-event:0xf219 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_DC_LOAD_LAT : IBS data cache miss load latency
-event:0xf240 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_ONLY : IBS Northbridge local
-event:0xf241 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_ONLY : IBS Northbridge remote
-event:0xf242 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_L3 : IBS Northbridge local L3
-event:0xf243 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE : IBS Northbridge local core L1 or L2 cache
-event:0xf244 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE : IBS Northbridge local core L1, L2, L3 cache
-event:0xf245 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_DRAM : IBS Northbridge local DRAM
-event:0xf246 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_DRAM : IBS Northbridge remote DRAM
-event:0xf247 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_OTHER : IBS Northbridge local APIC MMIO Config PCI
-event:0xf248 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_OTHER : IBS Northbridge remote APIC MMIO Config PCI
-event:0xf249 ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_MODIFIED : IBS Northbridge cache modified state
-event:0xf24a ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_CACHE_OWNED : IBS Northbridge cache owned state
-event:0xf24b ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_LOCAL_CACHE_LAT : IBS Northbridge local cache latency
-event:0xf24c ext:ibs_op um:ibs_op minimum:50000 name:IBS_OP_NB_REMOTE_CACHE_LAT : IBS Northbridge remote cache latency
diff --git a/events/x86-64/family15h/unit_masks b/events/x86-64/family15h/unit_masks
index 071eb1b..4c207ff 100644
--- a/events/x86-64/family15h/unit_masks
+++ b/events/x86-64/family15h/unit_masks
@@ -10,15 +10,11 @@
# Sources: BIOS and Kernel Developer's Guide for AMD Family 15h Models 00h-0Fh Processors,
# Publication# 42301, Revision 1.12, February 16, 2011
#
-# Software Optimization Guide for AMD Family 10h and Family 12h Processors,
-# Publication# 40546, Revision 3.13, February 2011
-# (Note: For IBS Derived Performance Events)
-#
-# Revision: 1.3
+# Revision: 1.4
#
# ChangeLog:
-# 1.3: 9 March 2011
-# - Update to BKDG Rev 1.12 (still preliminary)
+# 1.4: 11 August 2014
+# - Remove IBS events due to missing support in Operf
#
# 1.2: 25 Januray 2011
# - Updated to BKDG Rev 1.09 (still preliminary)
@@ -185,8 +181,3 @@ name:ls_dispatch type:bitmask default:0x07
name:l2_prefetcher_trigger type:bitmask default:0x03
0x01 Load L1 miss seen by prefetcher
0x02 Store L1 miss seen by prefetcher
-name:ibs_op type:bitmask default:0x01
- 0x00 Using IBS OP cycle count mode
- 0x01 Using IBS OP dispatch count mode
- 0x02 Enable IBS OP Memory Access Log
- 0x04 Enable IBS OP Branch Target Address Log
diff --git a/events/x86-64/generic/events b/events/x86-64/generic/events
new file mode 100644
index 0000000..3edf5ce
--- /dev/null
+++ b/events/x86-64/generic/events
@@ -0,0 +1,40 @@
+# AMD Generic processor performance events
+#
+# Copyright OProfile authors
+# Copyright (c) 2006-2013 Advanced Micro Devices
+# Contributed by Ray Bryant,
+# Jason Yeh
+# Suravee Suthikulpanit
+# Paul Drongowski
+#
+# Sources: BIOS and Kernel Developer's Guide for AMD processors,
+#
+# Revision: 1.0
+#
+# ChangeLog:
+# 1.0: 07 Feb 2013
+# - Preliminary version
+
+# L1 DATA CACHE
+event:0x040 counters:0,1,2,3 um:zero minimum:500 name:DATA_CACHE_ACCESSES : Data Cache Accesses
+event:0x041 counters:0,1,2,3 um:dcache_misses minimum:500 name:DATA_CACHE_MISSES : Data Cache Misses
+event:0x042 counters:0,1,2,3 um:dcache_refills minimum:500 name:DATA_CACHE_REFILLS_FROM_L2_OR_NORTHBRIDGE : Data Cache Refills from L2 or System
+event:0x043 counters:0,1,2,3 um:zero minimum:500 name:DATA_CACHE_REFILLS_FROM_NORTHBRIDGE : Data Cache Refills from System
+
+# CYCLE
+event:0x076 counters:0,1,2,3 um:zero minimum:50000 name:CPU_CLK_UNHALTED : CPU Clocks not Halted
+
+# INSTRUCTION CACHE
+event:0x080 counters:0,1,2,3 um:zero minimum:500 name:INSTRUCTION_CACHE_FETCHES : Instruction Cache Fetches
+event:0x081 counters:0,1,2,3 um:zero minimum:500 name:INSTRUCTION_CACHE_MISSES : Instruction Cache Misses
+event:0x082 counters:0,1,2,3 um:zero minimum:500 name:INSTRUCTION_CACHE_REFILLS_FROM_L2 : Instruction Cache Refills from L2
+event:0x083 counters:0,1,2,3 um:zero minimum:500 name:INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM : Instruction Cache Refills from System
+
+# INSTRUCTIONS
+event:0x0c0 counters:0,1,2,3 um:zero minimum:50000 name:RETIRED_INSTRUCTIONS : Retired Instructions
+event:0x0c1 counters:0,1,2,3 um:zero minimum:50000 name:RETIRED_UOPS : Retired uops
+event:0x0c2 counters:0,1,2,3 um:zero minimum:500 name:RETIRED_BRANCH_INSTRUCTIONS : Retired Branch Instructions
+event:0x0c3 counters:0,1,2,3 um:zero minimum:500 name:RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS : Retired Mispredicted Branch Instructions
+event:0x0c4 counters:0,1,2,3 um:zero minimum:500 name:RETIRED_TAKEN_BRANCH_INSTRUCTIONS : Retired Taken Branch Instructions
+event:0x0c5 counters:0,1,2,3 um:zero minimum:500 name:RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED : Retired Taken Branch Instructions Mispredicted
+event:0x0ca counters:0,1,2,3 um:zero minimum:500 name:RETIRED_INDIRECT_BRANCHES_MISPREDICTED : Retired Indirect Branches Mispredicted
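
Note on the file above: each non-comment line of the new generic events file is a run of `key:value` tokens followed by ` : ` and a free-text description. As a minimal sketch of how such a line could be read, assuming only the layout visible in this patch, the Python below splits one line into a dictionary. The helper name `parse_event_line` is hypothetical and is not part of OProfile, whose real parser lives in its C sources.

```python
# Hypothetical helper, not OProfile code: parse one line of an events file
# laid out as "key:value key:value ... name:NAME : description".
def parse_event_line(line):
    line = line.strip()
    # Blank lines and '#' comments carry no event definition.
    if not line or line.startswith("#"):
        return None
    # The first " : " separates the key:value tokens from the description.
    fields, _, description = line.partition(" : ")
    event = {"description": description.strip()}
    for token in fields.split():
        key, _, value = token.partition(":")
        event[key] = value
    # Event codes are hexadecimal; the minimum sample count is decimal.
    event["event"] = int(event["event"], 16)
    event["minimum"] = int(event["minimum"])
    # Lines using "ext:" (such as the removed IBS entries) have no counters.
    if "counters" in event:
        event["counters"] = [int(c) for c in event["counters"].split(",")]
    return event


sample = ("event:0x076 counters:0,1,2,3 um:zero minimum:50000 "
          "name:CPU_CLK_UNHALTED : CPU Clocks not Halted")
parsed = parse_event_line(sample)
print(parsed["name"], hex(parsed["event"]), parsed["counters"])
```

The `um:` value is kept as a plain string because it names an entry in the matching unit_masks file rather than holding a numeric mask.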
diff --git a/events/x86-64/generic/unit_masks b/events/x86-64/generic/unit_masks
new file mode 100644
index 0000000..b111b82
--- /dev/null
+++ b/events/x86-64/generic/unit_masks
@@ -0,0 +1,26 @@
+# AMD Generic processor performance events
+#
+# Copyright OProfile authors
+# Copyright (c) 2006-2013 Advanced Micro Devices
+# Contributed by Ray Bryant ,
+# Jason Yeh
+# Suravee Suthikulpanit
+# Paul Drongowski
+#
+# Sources: BIOS and Kernel Developer's Guide for AMD processors,
+#
+# Revision: 1.0
+#
+# ChangeLog:
+# 1.0: 07 Feb 2013
+# - Preliminary version
+
+name:zero type:mandatory default:0x00
+ 0x00 No unit mask
+name:dcache_misses type:bitmask default:0x01
+ 0x01 First data cache miss or streaming store to a 64B cache line
+ 0x02 First streaming store to a 64B cache line
+name:dcache_refills type:bitmask default:0x0b
+ 0x01 Fill with good data. (Final valid status is valid)
+ 0x02 Early valid status turned out to be invalid
+ 0x08 Fill with read data error
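
Note on the file above: a unit_masks entry is a `name:... type:... default:...` header followed by indented `value description` lines, and for `type:bitmask` entries the default is the bitwise OR of the selected values. The sketch below, which assumes only the layout shown in this patch, reads such entries and lists which values make up a default. The functions `parse_unit_masks` and `default_bits` are hypothetical names used for illustration, not OProfile APIs.

```python
# Hypothetical helpers, not OProfile code: read unit_masks entries and
# decompose a bitmask default into the listed values it contains.
def parse_unit_masks(text):
    masks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("name:"):
            # Header line: "name:X type:Y default:0xNN"
            header = dict(tok.split(":", 1) for tok in line.split())
            current = {
                "type": header["type"],
                "default": int(header["default"], 16),
                "bits": {},
            }
            masks[header["name"]] = current
        elif current is not None:
            # Value line: hexadecimal value, whitespace, description.
            parts = line.split(None, 1)
            current["bits"][int(parts[0], 16)] = parts[1] if len(parts) > 1 else ""
    return masks


def default_bits(mask):
    # Keep every non-zero listed value whose bits are all set in the default.
    return [v for v in sorted(mask["bits"]) if v and (mask["default"] & v) == v]


sample = """\
name:dcache_misses type:bitmask default:0x01
 0x01 First data cache miss or streaming store to a 64B cache line
 0x02 First streaming store to a 64B cache line
name:dcache_refills type:bitmask default:0x0b
 0x01 Fill with good data. (Final valid status is valid)
 0x02 Early valid status turned out to be invalid
 0x08 Fill with read data error
"""
masks = parse_unit_masks(sample)
print([hex(v) for v in default_bits(masks["dcache_refills"])])
```

For `dcache_refills`, the default `0x0b` is `0x01 | 0x02 | 0x08`, so all three listed values are selected by default, while `dcache_misses` selects only `0x01`.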
diff --git a/gui/Makefile.am b/gui/Makefile.am
deleted file mode 100644
index c079e9b..0000000
--- a/gui/Makefile.am
+++ /dev/null
@@ -1,43 +0,0 @@
-SUBDIRS = ui
-
-dist_sources = \
- oprof_start.cpp \
- oprof_start_config.cpp \
- oprof_start_util.cpp \
- oprof_start_main.cpp \
- oprof_start.h \
- oprof_start_config.h \
- oprof_start_util.h
-
-EXTRA_DIST = $(dist_sources)
-
-if have_qt
-
-AM_CPPFLAGS = \
- @QT_CFLAGS@ \
- -I ${top_srcdir}/libop \
- -I ${top_srcdir}/libutil++ \
- -I ${top_srcdir}/libutil \
- @OP_CPPFLAGS@
-
-AM_CXXFLAGS = @OP_CXXFLAGS@
-
-bin_PROGRAMS = oprof_start
-
-oprof_start_SOURCES = $(dist_sources)
-nodist_oprof_start_SOURCES = oprof_start.moc.cpp
-oprof_start_LDADD = \
- ../libutil++/libutil++.a \
- ../libop/libop.a \
- ../libutil/libutil.a \
- ui/liboprof_start.a \
- @QT_LIBS@ \
- @X_LIBS@
-
-oprof_start.moc.cpp: ${top_srcdir}/gui/oprof_start.h
- $(MOC) -o $@ ${top_srcdir}/gui/oprof_start.h
-
-clean-local:
- rm -f oprof_start.moc.cpp
-
-endif
diff --git a/gui/Makefile.in b/gui/Makefile.in
deleted file mode 100644
index a1d5b12..0000000
--- a/gui/Makefile.in
+++ /dev/null
@@ -1,767 +0,0 @@
-# Makefile.in generated by automake 1.11.1 from Makefile.am.
-# @configure_input@
-
-# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
-# 2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation,
-# Inc.
-# This Makefile.in is free software; the Free Software Foundation
-# gives unlimited permission to copy and/or distribute it,
-# with or without modifications, as long as this notice is preserved.
-
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
-# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
-# PARTICULAR PURPOSE.
-
-@SET_MAKE@
-
-VPATH = @srcdir@
-pkgdatadir = $(datadir)/@PACKAGE@
-pkgincludedir = $(includedir)/@PACKAGE@
-pkglibdir = $(libdir)/@PACKAGE@
-pkglibexecdir = $(libexecdir)/@PACKAGE@
-am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
-install_sh_DATA = $(install_sh) -c -m 644
-install_sh_PROGRAM = $(install_sh) -c
-install_sh_SCRIPT = $(install_sh) -c
-INSTALL_HEADER = $(INSTALL_DATA)
-transform = $(program_transform_name)
-NORMAL_INSTALL = :
-PRE_INSTALL = :
-POST_INSTALL = :
-NORMAL_UNINSTALL = :
-PRE_UNINSTALL = :
-POST_UNINSTALL = :
-build_triplet = @build@
-host_triplet = @host@
-@have_qt_TRUE@bin_PROGRAMS = oprof_start$(EXEEXT)
-subdir = gui
-DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
-ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
-am__aclocal_m4_deps = $(top_srcdir)/m4/binutils.m4 \
- $(top_srcdir)/m4/builtinexpect.m4 \
- $(top_srcdir)/m4/cellspubfdsupport.m4 \
- $(top_srcdir)/m4/compileroption.m4 \
- $(top_srcdir)/m4/copyifchange.m4 $(top_srcdir)/m4/docbook.m4 \
- $(top_srcdir)/m4/extradirs.m4 \
- $(top_srcdir)/m4/kernelversion.m4 $(top_srcdir)/m4/libtool.m4 \
- $(top_srcdir)/m4/ltoptions.m4 $(top_srcdir)/m4/ltsugar.m4 \
- $(top_srcdir)/m4/ltversion.m4 $(top_srcdir)/m4/lt~obsolete.m4 \
- $(top_srcdir)/m4/mallocattribute.m4 \
- $(top_srcdir)/m4/poptconst.m4 \
- $(top_srcdir)/m4/precompiledheader.m4 $(top_srcdir)/m4/qt.m4 \
- $(top_srcdir)/m4/sstream.m4 $(top_srcdir)/m4/typedef.m4 \
- $(top_srcdir)/configure.ac
-am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
- $(ACLOCAL_M4)
-mkinstalldirs = $(install_sh) -d
-CONFIG_HEADER = $(top_builddir)/config.h
-CONFIG_CLEAN_FILES =
-CONFIG_CLEAN_VPATH_FILES =
-am__installdirs = "$(DESTDIR)$(bindir)"
-PROGRAMS = $(bin_PROGRAMS)
-am__oprof_start_SOURCES_DIST = oprof_start.cpp oprof_start_config.cpp \
- oprof_start_util.cpp oprof_start_main.cpp oprof_start.h \
- oprof_start_config.h oprof_start_util.h
-am__objects_1 = oprof_start.$(OBJEXT) oprof_start_config.$(OBJEXT) \
- oprof_start_util.$(OBJEXT) oprof_start_main.$(OBJEXT)
-@have_qt_TRUE@am_oprof_start_OBJECTS = $(am__objects_1)
-@have_qt_TRUE@nodist_oprof_start_OBJECTS = oprof_start.moc.$(OBJEXT)
-oprof_start_OBJECTS = $(am_oprof_start_OBJECTS) \
- $(nodist_oprof_start_OBJECTS)
-@have_qt_TRUE@oprof_start_DEPENDENCIES = ../libutil++/libutil++.a \
-@have_qt_TRUE@ ../libop/libop.a ../libutil/libutil.a \
-@have_qt_TRUE@ ui/liboprof_start.a
-DEFAULT_INCLUDES = -I.@am__isrc@ -I$(top_builddir)
-depcomp = $(SHELL) $(top_srcdir)/depcomp
-am__depfiles_maybe = depfiles
-am__mv = mv -f
-CXXCOMPILE = $(CXX) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
- $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS)
-LTCXXCOMPILE = $(LIBTOOL) --tag=CXX $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
- --mode=compile $(CXX) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
- $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS)
-CXXLD = $(CXX)
-CXXLINK = $(LIBTOOL) --tag=CXX $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
- --mode=link $(CXXLD) $(AM_CXXFLAGS) $(CXXFLAGS) $(AM_LDFLAGS) \
- $(LDFLAGS) -o $@
-COMPILE = $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) \
- $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS)
-LTCOMPILE = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
- --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
- $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS)
-CCLD = $(CC)
-LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
- --mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) $(AM_LDFLAGS) \
- $(LDFLAGS) -o $@
-SOURCES = $(oprof_start_SOURCES) $(nodist_oprof_start_SOURCES)
-DIST_SOURCES = $(am__oprof_start_SOURCES_DIST)
-RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
- html-recursive info-recursive install-data-recursive \
- install-dvi-recursive install-exec-recursive \
- install-html-recursive install-info-recursive \
- install-pdf-recursive install-ps-recursive install-recursive \
- installcheck-recursive installdirs-recursive pdf-recursive \
- ps-recursive uninstall-recursive
-RECURSIVE_CLEAN_TARGETS = mostlyclean-recursive clean-recursive \
- distclean-recursive maintainer-clean-recursive
-AM_RECURSIVE_TARGETS = $(RECURSIVE_TARGETS:-recursive=) \
- $(RECURSIVE_CLEAN_TARGETS:-recursive=) tags TAGS ctags CTAGS \
- distdir
-ETAGS = etags
-CTAGS = ctags
-DIST_SUBDIRS = $(SUBDIRS)
-DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
-am__relativize = \
- dir0=`pwd`; \
- sed_first='s,^\([^/]*\)/.*$$,\1,'; \
- sed_rest='s,^[^/]*/*,,'; \
- sed_last='s,^.*/\([^/]*\)$$,\1,'; \
- sed_butlast='s,/*[^/]*$$,,'; \
- while test -n "$$dir1"; do \
- first=`echo "$$dir1" | sed -e "$$sed_first"`; \
- if test "$$first" != "."; then \
- if test "$$first" = ".."; then \
- dir2=`echo "$$dir0" | sed -e "$$sed_last"`/"$$dir2"; \
- dir0=`echo "$$dir0" | sed -e "$$sed_butlast"`; \
- else \
- first2=`echo "$$dir2" | sed -e "$$sed_first"`; \
- if test "$$first2" = "$$first"; then \
- dir2=`echo "$$dir2" | sed -e "$$sed_rest"`; \
- else \
- dir2="../$$dir2"; \
- fi; \
- dir0="$$dir0"/"$$first"; \
- fi; \
- fi; \
- dir1=`echo "$$dir1" | sed -e "$$sed_rest"`; \
- done; \
- reldir="$$dir2"
-ACLOCAL = @ACLOCAL@
-AMTAR = @AMTAR@
-AR = @AR@
-AUTOCONF = @AUTOCONF@
-AUTOHEADER = @AUTOHEADER@
-AUTOMAKE = @AUTOMAKE@
-AWK = @AWK@
-BFD_LIBS = @BFD_LIBS@
-CAT_ENTRY_END = @CAT_ENTRY_END@
-CAT_ENTRY_START = @CAT_ENTRY_START@
-CC = @CC@
-CCDEPMODE = @CCDEPMODE@
-CFLAGS = @CFLAGS@
-CPP = @CPP@
-CPPFLAGS = @CPPFLAGS@
-CXX = @CXX@
-CXXCPP = @CXXCPP@
-CXXDEPMODE = @CXXDEPMODE@
-CXXFLAGS = @CXXFLAGS@
-CYGPATH_W = @CYGPATH_W@
-DATE = @DATE@
-DEFS = @DEFS@
-DEPDIR = @DEPDIR@
-DOCBOOK_ROOT = @DOCBOOK_ROOT@
-DSYMUTIL = @DSYMUTIL@
-DUMPBIN = @DUMPBIN@
-ECHO_C = @ECHO_C@
-ECHO_N = @ECHO_N@
-ECHO_T = @ECHO_T@
-EGREP = @EGREP@
-EXEEXT = @EXEEXT@
-EXTRA_CFLAGS_MODULE = @EXTRA_CFLAGS_MODULE@
-FGREP = @FGREP@
-GREP = @GREP@
-INSTALL = @INSTALL@
-INSTALL_DATA = @INSTALL_DATA@
-INSTALL_PROGRAM = @INSTALL_PROGRAM@
-INSTALL_SCRIPT = @INSTALL_SCRIPT@
-INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
-JAVA_HOMEDIR = @JAVA_HOMEDIR@
-LD = @LD@
-LDFLAGS = @LDFLAGS@
-LIBERTY_LIBS = @LIBERTY_LIBS@
-LIBOBJS = @LIBOBJS@
-LIBS = @LIBS@
-LIBTOOL = @LIBTOOL@
-LIPO = @LIPO@
-LN_S = @LN_S@
-LTLIBOBJS = @LTLIBOBJS@
-MAKEINFO = @MAKEINFO@
-MKDIR_P = @MKDIR_P@
-MOC = @MOC@
-NM = @NM@
-NMEDIT = @NMEDIT@
-OBJDUMP = @OBJDUMP@
-OBJEXT = @OBJEXT@
-OP_CFLAGS = @OP_CFLAGS@
-OP_CPPFLAGS = @OP_CPPFLAGS@
-OP_CXXFLAGS = @OP_CXXFLAGS@
-OP_DOCDIR = @OP_DOCDIR@
-OP_LDFLAGS = @OP_LDFLAGS@
-OTOOL = @OTOOL@
-OTOOL64 = @OTOOL64@
-PACKAGE = @PACKAGE@
-PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
-PACKAGE_NAME = @PACKAGE_NAME@
-PACKAGE_STRING = @PACKAGE_STRING@
-PACKAGE_TARNAME = @PACKAGE_TARNAME@
-PACKAGE_VERSION = @PACKAGE_VERSION@
-PATH_SEPARATOR = @PATH_SEPARATOR@
-PERF_EVENT_FLAGS = @PERF_EVENT_FLAGS@
-PFM_LIB = @PFM_LIB@
-PKG_CONFIG = @PKG_CONFIG@
-POPT_LIBS = @POPT_LIBS@
-PTRDIFF_T_TYPE = @PTRDIFF_T_TYPE@
-QT_CFLAGS = @QT_CFLAGS@
-QT_INCLUDES = @QT_INCLUDES@
-QT_LDFLAGS = @QT_LDFLAGS@
-QT_LIB = @QT_LIB@
-QT_LIBS = @QT_LIBS@
-QT_VERSION = @QT_VERSION@
-RANLIB = @RANLIB@
-SED = @SED@
-SET_MAKE = @SET_MAKE@
-SHELL = @SHELL@
-SIZE_T_TYPE = @SIZE_T_TYPE@
-STRIP = @STRIP@
-UIC = @UIC@
-UIChelp = @UIChelp@
-VERSION = @VERSION@
-XMKMF = @XMKMF@
-XML_CATALOG = @XML_CATALOG@
-XSLTPROC = @XSLTPROC@
-XSLTPROC_FLAGS = @XSLTPROC_FLAGS@
-X_CFLAGS = @X_CFLAGS@
-X_EXTRA_LIBS = @X_EXTRA_LIBS@
-X_LIBS = @X_LIBS@
-X_PRE_LIBS = @X_PRE_LIBS@
-abs_builddir = @abs_builddir@
-abs_srcdir = @abs_srcdir@
-abs_top_builddir = @abs_top_builddir@
-abs_top_srcdir = @abs_top_srcdir@
-ac_ct_CC = @ac_ct_CC@
-ac_ct_CXX = @ac_ct_CXX@
-ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
-am__include = @am__include@
-am__leading_dot = @am__leading_dot@
-am__quote = @am__quote@
-am__tar = @am__tar@
-am__untar = @am__untar@
-bindir = @bindir@
-build = @build@
-build_alias = @build_alias@
-build_cpu = @build_cpu@
-build_os = @build_os@
-build_vendor = @build_vendor@
-builddir = @builddir@
-datadir = @datadir@
-datarootdir = @datarootdir@
-docdir = @docdir@
-dvidir = @dvidir@
-exec_prefix = @exec_prefix@
-host = @host@
-host_alias = @host_alias@
-host_cpu = @host_cpu@
-host_os = @host_os@
-host_vendor = @host_vendor@
-htmldir = @htmldir@
-includedir = @includedir@
-infodir = @infodir@
-install_sh = @install_sh@
-libdir = @libdir@
-libexecdir = @libexecdir@
-localedir = @localedir@
-localstatedir = @localstatedir@
-lt_ECHO = @lt_ECHO@
-mandir = @mandir@
-mkdir_p = @mkdir_p@
-oldincludedir = @oldincludedir@
-pdfdir = @pdfdir@
-prefix = @prefix@
-program_transform_name = @program_transform_name@
-psdir = @psdir@
-sbindir = @sbindir@
-sharedstatedir = @sharedstatedir@
-srcdir = @srcdir@
-sysconfdir = @sysconfdir@
-target_alias = @target_alias@
-top_build_prefix = @top_build_prefix@
-top_builddir = @top_builddir@
-top_srcdir = @top_srcdir@
-topdir = @topdir@
-SUBDIRS = ui
-dist_sources = \
- oprof_start.cpp \
- oprof_start_config.cpp \
- oprof_start_util.cpp \
- oprof_start_main.cpp \
- oprof_start.h \
- oprof_start_config.h \
- oprof_start_util.h
-
-EXTRA_DIST = $(dist_sources)
-@have_qt_TRUE@AM_CPPFLAGS = \
-@have_qt_TRUE@ @QT_CFLAGS@ \
-@have_qt_TRUE@ -I ${top_srcdir}/libop \
-@have_qt_TRUE@ -I ${top_srcdir}/libutil++ \
-@have_qt_TRUE@ -I ${top_srcdir}/libutil \
-@have_qt_TRUE@ @OP_CPPFLAGS@
-
-@have_qt_TRUE@AM_CXXFLAGS = @OP_CXXFLAGS@
-@have_qt_TRUE@oprof_start_SOURCES = $(dist_sources)
-@have_qt_TRUE@nodist_oprof_start_SOURCES = oprof_start.moc.cpp
-@have_qt_TRUE@oprof_start_LDADD = \
-@have_qt_TRUE@ ../libutil++/libutil++.a \
-@have_qt_TRUE@ ../libop/libop.a \
-@have_qt_TRUE@ ../libutil/libutil.a \
-@have_qt_TRUE@ ui/liboprof_start.a \
-@have_qt_TRUE@ @QT_LIBS@ \
-@have_qt_TRUE@ @X_LIBS@
-
-all: all-recursive
-
-.SUFFIXES:
-.SUFFIXES: .cpp .lo .o .obj
-$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
- @for dep in $?; do \
- case '$(am__configure_deps)' in \
- *$$dep*) \
- ( cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh ) \
- && { if test -f $@; then exit 0; else break; fi; }; \
- exit 1;; \
- esac; \
- done; \
- echo ' cd $(top_srcdir) && $(AUTOMAKE) --foreign gui/Makefile'; \
- $(am__cd) $(top_srcdir) && \
- $(AUTOMAKE) --foreign gui/Makefile
-.PRECIOUS: Makefile
-Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
- @case '$?' in \
- *config.status*) \
- cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
- *) \
- echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
- cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
- esac;
-
-$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
- cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
-
-$(top_srcdir)/configure: $(am__configure_deps)
- cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
-$(ACLOCAL_M4): $(am__aclocal_m4_deps)
- cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
-$(am__aclocal_m4_deps):
-install-binPROGRAMS: $(bin_PROGRAMS)
- @$(NORMAL_INSTALL)
- test -z "$(bindir)" || $(MKDIR_P) "$(DESTDIR)$(bindir)"
- @list='$(bin_PROGRAMS)'; test -n "$(bindir)" || list=; \
- for p in $$list; do echo "$$p $$p"; done | \
- sed 's/$(EXEEXT)$$//' | \
- while read p p1; do if test -f $$p || test -f $$p1; \
- then echo "$$p"; echo "$$p"; else :; fi; \
- done | \
- sed -e 'p;s,.*/,,;n;h' -e 's|.*|.|' \
- -e 'p;x;s,.*/,,;s/$(EXEEXT)$$//;$(transform);s/$$/$(EXEEXT)/' | \
- sed 'N;N;N;s,\n, ,g' | \
- $(AWK) 'BEGIN { files["."] = ""; dirs["."] = 1 } \
- { d=$$3; if (dirs[d] != 1) { print "d", d; dirs[d] = 1 } \
- if ($$2 == $$4) files[d] = files[d] " " $$1; \
- else { print "f", $$3 "/" $$4, $$1; } } \
- END { for (d in files) print "f", d, files[d] }' | \
- while read type dir files; do \
- if test "$$dir" = .; then dir=; else dir=/$$dir; fi; \
- test -z "$$files" || { \
- echo " $(INSTALL_PROGRAM_ENV) $(LIBTOOL) $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=install $(INSTALL_PROGRAM) $$files '$(DESTDIR)$(bindir)$$dir'"; \
- $(INSTALL_PROGRAM_ENV) $(LIBTOOL) $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=install $(INSTALL_PROGRAM) $$files "$(DESTDIR)$(bindir)$$dir" || exit $$?; \
- } \
- ; done
-
-uninstall-binPROGRAMS:
- @$(NORMAL_UNINSTALL)
- @list='$(bin_PROGRAMS)'; test -n "$(bindir)" || list=; \
- files=`for p in $$list; do echo "$$p"; done | \
- sed -e 'h;s,^.*/,,;s/$(EXEEXT)$$//;$(transform)' \
- -e 's/$$/$(EXEEXT)/' `; \
- test -n "$$list" || exit 0; \
- echo " ( cd '$(DESTDIR)$(bindir)' && rm -f" $$files ")"; \
- cd "$(DESTDIR)$(bindir)" && rm -f $$files
-
-clean-binPROGRAMS:
- @list='$(bin_PROGRAMS)'; test -n "$$list" || exit 0; \
- echo " rm -f" $$list; \
- rm -f $$list || exit $$?; \
- test -n "$(EXEEXT)" || exit 0; \
- list=`for p in $$list; do echo "$$p"; done | sed 's/$(EXEEXT)$$//'`; \
- echo " rm -f" $$list; \
- rm -f $$list
-oprof_start$(EXEEXT): $(oprof_start_OBJECTS) $(oprof_start_DEPENDENCIES)
- @rm -f oprof_start$(EXEEXT)
- $(CXXLINK) $(oprof_start_OBJECTS) $(oprof_start_LDADD) $(LIBS)
-
-mostlyclean-compile:
- -rm -f *.$(OBJEXT)
-
-distclean-compile:
- -rm -f *.tab.c
-
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oprof_start.Po@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oprof_start.moc.Po@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oprof_start_config.Po@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oprof_start_main.Po@am__quote@
-@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oprof_start_util.Po@am__quote@
-
-.cpp.o:
-@am__fastdepCXX_TRUE@ $(CXXCOMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -c -o $@ $<
-@am__fastdepCXX_TRUE@ $(am__mv) $(DEPDIR)/$*.Tpo $(DEPDIR)/$*.Po
-@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
-@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
-@am__fastdepCXX_FALSE@ $(CXXCOMPILE) -c -o $@ $<
-
-.cpp.obj:
-@am__fastdepCXX_TRUE@ $(CXXCOMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -c -o $@ `$(CYGPATH_W) '$<'`
-@am__fastdepCXX_TRUE@ $(am__mv) $(DEPDIR)/$*.Tpo $(DEPDIR)/$*.Po
-@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
-@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
-@am__fastdepCXX_FALSE@ $(CXXCOMPILE) -c -o $@ `$(CYGPATH_W) '$<'`
-
-.cpp.lo:
-@am__fastdepCXX_TRUE@ $(LTCXXCOMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -c -o $@ $<
-@am__fastdepCXX_TRUE@ $(am__mv) $(DEPDIR)/$*.Tpo $(DEPDIR)/$*.Plo
-@AMDEP_TRUE@@am__fastdepCXX_FALSE@ source='$<' object='$@' libtool=yes @AMDEPBACKSLASH@
-@AMDEP_TRUE@@am__fastdepCXX_FALSE@ DEPDIR=$(DEPDIR) $(CXXDEPMODE) $(depcomp) @AMDEPBACKSLASH@
-@am__fastdepCXX_FALSE@ $(LTCXXCOMPILE) -c -o $@ $<
-
-mostlyclean-libtool:
- -rm -f *.lo
-
-clean-libtool:
- -rm -rf .libs _libs
-
-# This directory's subdirectories are mostly independent; you can cd
-# into them and run `make' without going through this Makefile.
-# To change the values of `make' variables: instead of editing Makefiles,
-# (1) if the variable is set in `config.status', edit `config.status'
-# (which will cause the Makefiles to be regenerated when you run `make');
-# (2) otherwise, pass the desired values on the `make' command line.
-$(RECURSIVE_TARGETS):
- @fail= failcom='exit 1'; \
- for f in x $$MAKEFLAGS; do \
- case $$f in \
- *=* | --[!k]*);; \
- *k*) failcom='fail=yes';; \
- esac; \
- done; \
- dot_seen=no; \
- target=`echo $@ | sed s/-recursive//`; \
- list='$(SUBDIRS)'; for subdir in $$list; do \
- echo "Making $$target in $$subdir"; \
- if test "$$subdir" = "."; then \
- dot_seen=yes; \
- local_target="$$target-am"; \
- else \
- local_target="$$target"; \
- fi; \
- ($(am__cd) $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
- || eval $$failcom; \
- done; \
- if test "$$dot_seen" = "no"; then \
- $(MAKE) $(AM_MAKEFLAGS) "$$target-am" || exit 1; \
- fi; test -z "$$fail"
-
-$(RECURSIVE_CLEAN_TARGETS):
- @fail= failcom='exit 1'; \
- for f in x $$MAKEFLAGS; do \
- case $$f in \
- *=* | --[!k]*);; \
- *k*) failcom='fail=yes';; \
- esac; \
- done; \
- dot_seen=no; \
- case "$@" in \
- distclean-* | maintainer-clean-*) list='$(DIST_SUBDIRS)' ;; \
- *) list='$(SUBDIRS)' ;; \
- esac; \
- rev=''; for subdir in $$list; do \
- if test "$$subdir" = "."; then :; else \
- rev="$$subdir $$rev"; \
- fi; \
- done; \
- rev="$$rev ."; \
- target=`echo $@ | sed s/-recursive//`; \
- for subdir in $$rev; do \
- echo "Making $$target in $$subdir"; \
- if test "$$subdir" = "."; then \
- local_target="$$target-am"; \
- else \
- local_target="$$target"; \
- fi; \
- ($(am__cd) $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
- || eval $$failcom; \
- done && test -z "$$fail"
-tags-recursive:
- list='$(SUBDIRS)'; for subdir in $$list; do \
- test "$$subdir" = . || ($(am__cd) $$subdir && $(MAKE) $(AM_MAKEFLAGS) tags); \
- done
-ctags-recursive:
- list='$(SUBDIRS)'; for subdir in $$list; do \
- test "$$subdir" = . || ($(am__cd) $$subdir && $(MAKE) $(AM_MAKEFLAGS) ctags); \
- done
-
-ID: $(HEADERS) $(SOURCES) $(LISP) $(TAGS_FILES)
- list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
- unique=`for i in $$list; do \
- if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
- done | \
- $(AWK) '{ files[$$0] = 1; nonempty = 1; } \
- END { if (nonempty) { for (i in files) print i; }; }'`; \
- mkid -fID $$unique
-tags: TAGS
-
-TAGS: tags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
- $(TAGS_FILES) $(LISP)
- set x; \
- here=`pwd`; \
- if ($(ETAGS) --etags-include --version) >/dev/null 2>&1; then \
- include_option=--etags-include; \
- empty_fix=.; \
- else \
- include_option=--include; \
- empty_fix=; \
- fi; \
- list='$(SUBDIRS)'; for subdir in $$list; do \
- if test "$$subdir" = .; then :; else \
- test ! -f $$subdir/TAGS || \
- set "$$@" "$$include_option=$$here/$$subdir/TAGS"; \
- fi; \
- done; \
- list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
- unique=`for i in $$list; do \
- if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
- done | \
- $(AWK) '{ files[$$0] = 1; nonempty = 1; } \
- END { if (nonempty) { for (i in files) print i; }; }'`; \
- shift; \
- if test -z "$(ETAGS_ARGS)$$*$$unique"; then :; else \
- test -n "$$unique" || unique=$$empty_fix; \
- if test $$# -gt 0; then \
- $(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
- "$$@" $$unique; \
- else \
- $(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
- $$unique; \
- fi; \
- fi
-ctags: CTAGS
-CTAGS: ctags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
- $(TAGS_FILES) $(LISP)
- list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
- unique=`for i in $$list; do \
- if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
- done | \
- $(AWK) '{ files[$$0] = 1; nonempty = 1; } \
- END { if (nonempty) { for (i in files) print i; }; }'`; \
- test -z "$(CTAGS_ARGS)$$unique" \
- || $(CTAGS) $(CTAGSFLAGS) $(AM_CTAGSFLAGS) $(CTAGS_ARGS) \
- $$unique
-
-GTAGS:
- here=`$(am__cd) $(top_builddir) && pwd` \
- && $(am__cd) $(top_srcdir) \
- && gtags -i $(GTAGS_ARGS) "$$here"
-
-distclean-tags:
- -rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
-
-distdir: $(DISTFILES)
- @srcdirstrip=`echo "$(srcdir)" | sed 's/[].[^$$\\*]/\\\\&/g'`; \
- topsrcdirstrip=`echo "$(top_srcdir)" | sed 's/[].[^$$\\*]/\\\\&/g'`; \
- list='$(DISTFILES)'; \
- dist_files=`for file in $$list; do echo $$file; done | \
- sed -e "s|^$$srcdirstrip/||;t" \
- -e "s|^$$topsrcdirstrip/|$(top_builddir)/|;t"`; \
- case $$dist_files in \
- */*) $(MKDIR_P) `echo "$$dist_files" | \
- sed '/\//!d;s|^|$(distdir)/|;s,/[^/]*$$,,' | \
- sort -u` ;; \
- esac; \
- for file in $$dist_files; do \
- if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
- if test -d $$d/$$file; then \
- dir=`echo "/$$file" | sed -e 's,/[^/]*$$,,'`; \
- if test -d "$(distdir)/$$file"; then \
- find "$(distdir)/$$file" -type d ! -perm -700 -exec chmod u+rwx {} \;; \
- fi; \
- if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
- cp -fpR $(srcdir)/$$file "$(distdir)$$dir" || exit 1; \
- find "$(distdir)/$$file" -type d ! -perm -700 -exec chmod u+rwx {} \;; \
- fi; \
- cp -fpR $$d/$$file "$(distdir)$$dir" || exit 1; \
- else \
- test -f "$(distdir)/$$file" \
- || cp -p $$d/$$file "$(distdir)/$$file" \
- || exit 1; \
- fi; \
- done
- @list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
- if test "$$subdir" = .; then :; else \
- test -d "$(distdir)/$$subdir" \
- || $(MKDIR_P) "$(distdir)/$$subdir" \
- || exit 1; \
- fi; \
- done
- @list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
- if test "$$subdir" = .; then :; else \
- dir1=$$subdir; dir2="$(distdir)/$$subdir"; \
- $(am__relativize); \
- new_distdir=$$reldir; \
- dir1=$$subdir; dir2="$(top_distdir)"; \
- $(am__relativize); \
- new_top_distdir=$$reldir; \
- echo " (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) top_distdir="$$new_top_distdir" distdir="$$new_distdir" \\"; \
- echo " am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)"; \
- ($(am__cd) $$subdir && \
- $(MAKE) $(AM_MAKEFLAGS) \
- top_distdir="$$new_top_distdir" \
- distdir="$$new_distdir" \
- am__remove_distdir=: \
- am__skip_length_check=: \
- am__skip_mode_fix=: \
- distdir) \
- || exit 1; \
- fi; \
- done
-check-am: all-am
-check: check-recursive
-all-am: Makefile $(PROGRAMS)
-installdirs: installdirs-recursive
-installdirs-am:
- for dir in "$(DESTDIR)$(bindir)"; do \
- test -z "$$dir" || $(MKDIR_P) "$$dir"; \
- done
-install: install-recursive
-install-exec: install-exec-recursive
-install-data: install-data-recursive
-uninstall: uninstall-recursive
-
-install-am: all-am
- @$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
-
-installcheck: installcheck-recursive
-install-strip:
- $(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
- install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
- `test -z '$(STRIP)' || \
- echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
-mostlyclean-generic:
-
-clean-generic:
-
-distclean-generic:
- -test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
- -test . = "$(srcdir)" || test -z "$(CONFIG_CLEAN_VPATH_FILES)" || rm -f $(CONFIG_CLEAN_VPATH_FILES)
-
-maintainer-clean-generic:
- @echo "This command is intended for maintainers to use"
- @echo "it deletes files that may require special tools to rebuild."
-@have_qt_FALSE@clean-local:
-clean: clean-recursive
-
-clean-am: clean-binPROGRAMS clean-generic clean-libtool clean-local \
- mostlyclean-am
-
-distclean: distclean-recursive
- -rm -rf ./$(DEPDIR)
- -rm -f Makefile
-distclean-am: clean-am distclean-compile distclean-generic \
- distclean-tags
-
-dvi: dvi-recursive
-
-dvi-am:
-
-html: html-recursive
-
-html-am:
-
-info: info-recursive
-
-info-am:
-
-install-data-am:
-
-install-dvi: install-dvi-recursive
-
-install-dvi-am:
-
-install-exec-am: install-binPROGRAMS
-
-install-html: install-html-recursive
-
-install-html-am:
-
-install-info: install-info-recursive
-
-install-info-am:
-
-install-man:
-
-install-pdf: install-pdf-recursive
-
-install-pdf-am:
-
-install-ps: install-ps-recursive
-
-install-ps-am:
-
-installcheck-am:
-
-maintainer-clean: maintainer-clean-recursive
- -rm -rf ./$(DEPDIR)
- -rm -f Makefile
-maintainer-clean-am: distclean-am maintainer-clean-generic
-
-mostlyclean: mostlyclean-recursive
-
-mostlyclean-am: mostlyclean-compile mostlyclean-generic \
- mostlyclean-libtool
-
-pdf: pdf-recursive
-
-pdf-am:
-
-ps: ps-recursive
-
-ps-am:
-
-uninstall-am: uninstall-binPROGRAMS
-
-.MAKE: $(RECURSIVE_CLEAN_TARGETS) $(RECURSIVE_TARGETS) ctags-recursive \
- install-am install-strip tags-recursive
-
-.PHONY: $(RECURSIVE_CLEAN_TARGETS) $(RECURSIVE_TARGETS) CTAGS GTAGS \
- all all-am check check-am clean clean-binPROGRAMS \
- clean-generic clean-libtool clean-local ctags ctags-recursive \
- distclean distclean-compile distclean-generic \
- distclean-libtool distclean-tags distdir dvi dvi-am html \
- html-am info info-am install install-am install-binPROGRAMS \
- install-data install-data-am install-dvi install-dvi-am \
- install-exec install-exec-am install-html install-html-am \
- install-info install-info-am install-man install-pdf \
- install-pdf-am install-ps install-ps-am install-strip \
- installcheck installcheck-am installdirs installdirs-am \
- maintainer-clean maintainer-clean-generic mostlyclean \
- mostlyclean-compile mostlyclean-generic mostlyclean-libtool \
- pdf pdf-am ps ps-am tags tags-recursive uninstall uninstall-am \
- uninstall-binPROGRAMS
-
-
-@have_qt_TRUE@oprof_start.moc.cpp: ${top_srcdir}/gui/oprof_start.h
-@have_qt_TRUE@ $(MOC) -o $@ ${top_srcdir}/gui/oprof_start.h
-
-@have_qt_TRUE@clean-local:
-@have_qt_TRUE@ rm -f oprof_start.moc.cpp
-
-# Tell versions [3.59,3.63) of GNU make to not export all variables.
-# Otherwise a system limit (for SysV at least) may be exceeded.
-.NOEXPORT:
diff --git a/gui/oprof_start.cpp b/gui/oprof_start.cpp
deleted file mode 100644
index 725b215..0000000
--- a/gui/oprof_start.cpp
+++ /dev/null
@@ -1,1087 +0,0 @@
-/**
- * @file oprof_start.cpp
- * The GUI start main class
- *
- * @remark Copyright 2002 OProfile authors
- * @remark Read the file COPYING
- *
- * @author Philippe Elie
- * @author John Levon
- */
-
-#include
-#include
-
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
-#if QT3_SUPPORT
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#else
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#define Q3ListView QListView
-#endif
-
-#include "oprof_start.h"
-#include "op_config.h"
-#include "op_config_24.h"
-#include "string_manip.h"
-#include "op_cpufreq.h"
-#include "op_alloc_counter.h"
-#include "oprof_start_util.h"
-#include "file_manip.h"
-
-#include "op_hw_config.h"
-
-using namespace std;
-
-static char const * green_xpm[] = {
-"16 16 2 1",
-" c None",
-". c #00FF00",
-" ....... ",
-" ........... ",
-" ............. ",
-" ............. ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-" ............. ",
-" ............. ",
-" ........... ",
-" ....... ",
-" " };
-
-static char const * red_xpm[] = {
-"16 16 2 1",
-" c None",
-". c #FF0000",
-" ....... ",
-" ........... ",
-" ............. ",
-" ............. ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-"............... ",
-" ............. ",
-" ............. ",
-" ........... ",
-" ....... ",
-" " };
-
-static QPixmap * green_pixmap;
-static QPixmap * red_pixmap;
-
-
-op_event_descr::op_event_descr()
- :
- counter_mask(0),
- val(0),
- unit(0),
- min_count(0)
-{
-}
-
-
-oprof_start::oprof_start()
- :
- oprof_start_base(0, 0, false, 0),
- event_count_validator(new QIntValidator(event_count_edit)),
- current_event(0),
- cpu_speed(op_cpu_frequency()),
- total_nr_interrupts(0)
-{
- green_pixmap = new QPixmap(green_xpm);
- red_pixmap = new QPixmap(red_xpm);
- vector<string> args;
- args.push_back("--init");
-
- if (do_exec_command(OP_BINDIR "/opcontrol", args))
- exit(EXIT_FAILURE);
-
- cpu_type = op_get_cpu_type();
- op_nr_counters = op_get_nr_counters(cpu_type);
-
- if (cpu_type == CPU_TIMER_INT) {
- setup_config_tab->removePage(counter_setup_page);
- } else {
- fill_events();
- }
-
- op_interface interface = op_get_interface();
- if (interface == OP_INTERFACE_NO_GOOD) {
- QMessageBox::warning(this, 0, "Couldn't determine kernel"
- " interface version");
- exit(EXIT_FAILURE);
- }
- bool is_26 = interface == OP_INTERFACE_26;
-
- if (is_26) {
- note_table_size_edit->hide();
- note_table_size_label->hide();
- if (!op_file_readable("/dev/oprofile/backtrace_depth")) {
- callgraph_depth_label->hide();
- callgraph_depth_edit->hide();
- }
- } else {
- callgraph_depth_label->hide();
- callgraph_depth_edit->hide();
- buffer_watershed_label->hide();
- buffer_watershed_edit->hide();
- cpu_buffer_size_label->hide();
- cpu_buffer_size_edit->hide();
- }
-
- // setup the configuration page.
- kernel_filename_edit->setText(config.kernel_filename.c_str());
-
- no_vmlinux->setChecked(config.no_kernel);
-
- buffer_size_edit->setText(QString().setNum(config.buffer_size));
- buffer_watershed_edit->setText(QString().setNum(config.buffer_watershed));
- cpu_buffer_size_edit->setText(QString().setNum(config.cpu_buffer_size));
- note_table_size_edit->setText(QString().setNum(config.note_table_size));
- callgraph_depth_edit->setText(QString().setNum(config.callgraph_depth));
- verbose->setChecked(config.verbose);
- separate_lib_cb->setChecked(config.separate_lib);
- separate_kernel_cb->setChecked(config.separate_kernel);
- separate_cpu_cb->setChecked(config.separate_cpu);
- separate_thread_cb->setChecked(config.separate_thread);
-
- // the unit mask check boxes
- hide_masks();
-
- event_count_edit->setValidator(event_count_validator);
- QIntValidator * iv;
- iv = new QIntValidator(OP_MIN_BUF_SIZE, OP_MAX_BUF_SIZE, buffer_size_edit);
- buffer_size_edit->setValidator(iv);
- iv = new QIntValidator(OP_MIN_NOTE_TABLE_SIZE, OP_MAX_NOTE_TABLE_SIZE, note_table_size_edit);
- note_table_size_edit->setValidator(iv);
- iv = new QIntValidator(0, INT_MAX, callgraph_depth_edit);
- callgraph_depth_edit->setValidator(iv);
- iv = new QIntValidator(0, INT_MAX, buffer_watershed_edit);
- buffer_watershed_edit->setValidator(iv);
- iv = new QIntValidator(0, OP_MAX_CPU_BUF_SIZE, cpu_buffer_size_edit);
- cpu_buffer_size_edit->setValidator(iv);
-
- // daemon status timer
- startTimer(5000);
- timerEvent(0);
-
- resize(minimumSizeHint());
-
- // force the pixmap re-draw
- event_selected();
-}
-
-
-void oprof_start::fill_events()
-{
- // we need to build the event descr stuff before loading the
- // configuration because we use locate_event to get an event descr
- // from its name.
- struct list_head * pos;
- struct list_head * events = op_events(cpu_type);
-
- list_for_each(pos, events) {
- struct op_event * event = list_entry(pos, struct op_event, event_next);
-
- op_event_descr descr;
-
- descr.counter_mask = event->counter_mask;
- descr.val = event->val;
- if (event->unit->num) {
- descr.unit = event->unit;
- } else {
- descr.unit = 0;
- }
-
- descr.name = event->name;
- descr.help_str = event->desc;
- descr.min_count = event->min_count;
-
- for (uint ctr = 0; ctr < op_nr_counters; ++ctr) {
- uint count;
-
- if (!(descr.counter_mask & (1 << ctr)))
- continue;
-
- if (cpu_type == CPU_RTC) {
- count = 1024;
- } else {
- /* setting to cpu Hz / 2000 gives a safe value for
- * all events, and a good one for most.
- */
- if (cpu_speed)
- count = int(cpu_speed * 500);
- else
- count = descr.min_count * 100;
- }
-
- event_cfgs[descr.name].count = count;
- event_cfgs[descr.name].umask = 0;
- if (descr.unit)
- event_cfgs[descr.name].umask = descr.unit->default_mask;
- event_cfgs[descr.name].os_ring_count = 1;
- event_cfgs[descr.name].user_ring_count = 1;
- }
-
- v_events.push_back(descr);
- }
-
- events_list->header()->hide();
- events_list->setSorting(-1);
-
- fill_events_listbox();
-
- read_set_events();
-
- // FIXME: why this ?
- if (cpu_type == CPU_RTC)
- events_list->setCurrentItem(events_list->firstChild());
-
- load_config_file();
-}
-
-
-namespace {
-
-/// find the first item with the given text in column 0 or return NULL
-Q3ListViewItem * findItem(Q3ListView * view, char const * name)
-{
- // Qt 2.3.1 does not have QListView::findItem()
- Q3ListViewItem * item = view->firstChild();
-
- while (item && strcmp(item->text(0).latin1(), name))
- item = item->nextSibling();
-
- return item;
-}
-
-};
-
-
-void oprof_start::setup_default_event()
-{
- struct op_default_event_descr descr;
- op_default_event(cpu_type, &descr);
-
- event_cfgs[descr.name].umask = descr.um;
- event_cfgs[descr.name].count = descr.count;
- event_cfgs[descr.name].user_ring_count = 1;
- event_cfgs[descr.name].os_ring_count = 1;
-
- Q3ListViewItem * item = findItem(events_list, descr.name);
- if (item)
- item->setSelected(true);
-}
-
-
-void oprof_start::read_set_events()
-{
- string name = get_config_filename(".oprofile/daemonrc");
-
- ifstream in(name.c_str());
-
- if (!in) {
- setup_default_event();
- return;
- }
-
- string str;
-
- bool one_enabled = false;
-
- while (getline(in, str)) {
- string const val = split(str, '=');
- string const name = str;
-
- if (!is_prefix(name, "CHOSEN_EVENTS_"))
- continue;
-
- one_enabled = true;
-
- // CHOSEN_EVENTS_#nr=CPU_CLK_UNHALTED:10000:0:1:1
- vector<string> parts = separate_token(val, ':');
-
- if (parts.size() != 5 && parts.size() != 2) {
- cerr << "invalid configuration file\n";
- // FIXME
- exit(EXIT_FAILURE);
- }
-
- string ev_name = parts[0];
- event_cfgs[ev_name].count =
- op_lexical_cast(parts[1]);
-
- // CPU_CLK_UNHALTED:10000 is also valid
- if (parts.size() == 5) {
- event_cfgs[ev_name].umask =
- op_lexical_cast(parts[2]);
- event_cfgs[ev_name].user_ring_count =
- op_lexical_cast(parts[3]);
- event_cfgs[ev_name].os_ring_count =
- op_lexical_cast(parts[4]);
- } else {
- event_cfgs[ev_name].umask = 0;
- event_cfgs[ev_name].user_ring_count = 1;
- event_cfgs[ev_name].os_ring_count = 1;
- }
-
- Q3ListViewItem * item = findItem(events_list, ev_name.c_str());
- if (item)
- item->setSelected(true);
- }
-
- // use default event if none set
- if (!one_enabled)
- setup_default_event();
-}
-
-
-void oprof_start::load_config_file()
-{
- string name = get_config_filename(".oprofile/daemonrc");
-
- ifstream in(name.c_str());
- if (!in) {
- if (!check_and_create_config_dir())
- return;
-
- ofstream out(name.c_str());
- if (!out) {
- QMessageBox::warning(this, 0, "Unable to open configuration "
- "file ~/.oprofile/daemonrc");
- }
- return;
- }
-
- in >> config;
-}
-
-
-// user request a "normal" exit so save the config file.
-void oprof_start::accept()
-{
- // record the previous settings
- record_selected_event_config();
-
- save_config();
-
- QDialog::accept();
-}
-
-
-void oprof_start::closeEvent(QCloseEvent *)
-{
- accept();
-}
-
-
-void oprof_start::timerEvent(QTimerEvent *)
-{
- static time_t last = time(0);
-
- daemon_status dstat;
-
- flush_profiler_data_btn->setEnabled(dstat.running);
- stop_profiler_btn->setEnabled(dstat.running);
- start_profiler_btn->setEnabled(!dstat.running);
- reset_sample_files_btn->setEnabled(!dstat.running);
-
- if (!dstat.running) {
- daemon_label->setText("Profiler is not running.");
- return;
- }
-
- ostringstream ss;
- ss << "Profiler running:";
-
- time_t curr = time(0);
- total_nr_interrupts += dstat.nr_interrupts;
-
- if (curr - last)
- ss << " (" << dstat.nr_interrupts / (curr - last) << " interrupts / second, total " << total_nr_interrupts << ")";
-
- daemon_label->setText(ss.str().c_str());
-
- last = curr;
-}
-
-
-void oprof_start::fill_events_listbox()
-{
- setUpdatesEnabled(false);
-
- for (vector<op_event_descr>::reverse_iterator cit = v_events.rbegin();
- cit != v_events.rend(); ++cit) {
- new Q3ListViewItem(events_list, cit->name.c_str());
- }
-
- setUpdatesEnabled(true);
- update();
-}
-
-
-void oprof_start::display_event(op_event_descr const & descr)
-{
- setUpdatesEnabled(false);
-
- setup_unit_masks(descr);
- os_ring_count_cb->setEnabled(true);
- user_ring_count_cb->setEnabled(true);
- event_count_edit->setEnabled(true);
-
- event_setting & cfg = event_cfgs[descr.name];
-
- os_ring_count_cb->setChecked(cfg.os_ring_count);
- user_ring_count_cb->setChecked(cfg.user_ring_count);
- QString count_text;
- count_text.setNum(cfg.count);
- event_count_edit->setText(count_text);
- event_count_validator->setRange(descr.min_count, max_perf_count());
-
- setUpdatesEnabled(true);
- update();
-}
-
-
-bool oprof_start::is_selectable_event(Q3ListViewItem * item)
-{
- if (item->isSelected())
- return true;
-
- selected_events.insert(item);
-
- bool ret = false;
- if (alloc_selected_events())
- ret = true;
-
- selected_events.erase(item);
-
- return ret;
-}
-
-
-void oprof_start::draw_event_list()
-{
- Q3ListViewItem * cur;
- for (cur = events_list->firstChild(); cur; cur = cur->nextSibling()) {
- if (is_selectable_event(cur))
- cur->setPixmap(0, *green_pixmap);
- else
- cur->setPixmap(0, *red_pixmap);
- }
-}
-
-
-bool oprof_start::alloc_selected_events() const
-{
- vector events;
-
- set<Q3ListViewItem *>::const_iterator it;
- for (it = selected_events.begin(); it != selected_events.end(); ++it)
- events.push_back(find_event_by_name((*it)->text(0).latin1(),0,0));
-
- size_t * map =
- map_event_to_counter(&events[0], events.size(), cpu_type);
-
- if (!map)
- return false;
-
- free(map);
- return true;
-}
-
-void oprof_start::event_selected()
-{
- // The deal is simple: QT lack of a way to know what item was the last
- // (de)selected item so we record a set of selected items and diff
- // it in the appropriate way with the previous list of selected items.
-
- set<Q3ListViewItem *> current_selection;
- Q3ListViewItem * cur;
- for (cur = events_list->firstChild(); cur; cur = cur->nextSibling()) {
- if (cur->isSelected())
- current_selection.insert(cur);
- }
-
- // First remove the deselected item.
- vector<Q3ListViewItem *> new_deselected;
- set_difference(selected_events.begin(), selected_events.end(),
- current_selection.begin(), current_selection.end(),
- back_inserter(new_deselected));
- vector<Q3ListViewItem *>::const_iterator it;
- for (it = new_deselected.begin(); it != new_deselected.end(); ++it)
- selected_events.erase(*it);
-
- // Now try to add the newly selected item if enough HW resource exists
- vector<Q3ListViewItem *> new_selected;
- set_difference(current_selection.begin(), current_selection.end(),
- selected_events.begin(), selected_events.end(),
- back_inserter(new_selected));
- for (it = new_selected.begin(); it != new_selected.end(); ++it) {
- selected_events.insert(*it);
- if (!alloc_selected_events()) {
- (*it)->setSelected(false);
- selected_events.erase(*it);
- } else {
- current_event = *it;
- }
- }
-
- draw_event_list();
-
- if (current_event)
- display_event(locate_event(current_event->text(0).latin1()));
-}
-
-
-void oprof_start::event_over(Q3ListViewItem * item)
-{
- op_event_descr const & descr = locate_event(item->text(0).latin1());
-
- string help_str = descr.help_str.c_str();
- if (!is_selectable_event(item)) {
- help_str += " conflicts with:";
-
- set<Q3ListViewItem *>::const_iterator it;
- for (it = selected_events.begin();
- it != selected_events.end(); ) {
- Q3ListViewItem * temp = *it;
- selected_events.erase(it++);
- if (is_selectable_event(item)) {
- help_str += " ";
- help_str += temp->text(0).latin1();
- }
- selected_events.insert(temp);
- }
- }
-
- event_help_label->setText(help_str.c_str());
-}
-
-
-/// select the kernel image filename
-void oprof_start::choose_kernel_filename()
-{
- string name = kernel_filename_edit->text().latin1();
- string result = do_open_file_or_dir(name, false);
-
- if (!result.empty())
- kernel_filename_edit->setText(result.c_str());
-}
-
-
-// this record the current selected event setting in the event_cfg[] stuff.
-// FIXME: need validation?
-void oprof_start::record_selected_event_config()
-{
- if (!current_event)
- return;
-
- string name(current_event->text(0).latin1());
-
- event_setting & cfg = event_cfgs[name];
- op_event_descr const & curr = locate_event(name);
-
- cfg.count = event_count_edit->text().toUInt();
- cfg.os_ring_count = os_ring_count_cb->isChecked();
- cfg.user_ring_count = user_ring_count_cb->isChecked();
- cfg.umask = get_unit_mask(curr);
-}
-
-
-// validate and save the configuration (The qt validator installed
-// are not sufficient to do the validation)
-bool oprof_start::record_config()
-{
- config.kernel_filename = kernel_filename_edit->text().latin1();
- config.no_kernel = no_vmlinux->isChecked();
-
- uint temp = buffer_size_edit->text().toUInt();
- if (temp < OP_MIN_BUF_SIZE || temp > OP_MAX_BUF_SIZE) {
- ostringstream error;
-
- error << "buffer size out of range: " << temp
- << " valid range is [" << OP_MIN_BUF_SIZE << ", "
- << OP_MAX_BUF_SIZE << "]";
-
- QMessageBox::warning(this, 0, error.str().c_str());
-
- return false;
- }
- config.buffer_size = temp;
-
- temp = buffer_watershed_edit->text().toUInt();
- // watershed above half of buffer size make little sense.
- if (temp > config.buffer_size / 2) {
- ostringstream error;
-
- error << "buffer watershed out of range: " << temp
- << " valid range is [0 (use default), buffer size/2] "
- << "generally 0.25 * buffer size is fine";
-
- QMessageBox::warning(this, 0, error.str().c_str());
-
- return false;
- }
- config.buffer_watershed = temp;
-
- temp = cpu_buffer_size_edit->text().toUInt();
- if ((temp != 0 && temp < OP_MIN_CPU_BUF_SIZE) ||
- temp > OP_MAX_CPU_BUF_SIZE) {
- ostringstream error;
-
- error << "cpu buffer size out of range: " << temp
- << " valid range is [" << OP_MIN_CPU_BUF_SIZE << ", "
- << OP_MAX_CPU_BUF_SIZE << "] (size = 0: use default)";
-
- QMessageBox::warning(this, 0, error.str().c_str());
-
- return false;
- }
- config.cpu_buffer_size = temp;
-
- temp = note_table_size_edit->text().toUInt();
- if (temp < OP_MIN_NOTE_TABLE_SIZE || temp > OP_MAX_NOTE_TABLE_SIZE) {
- ostringstream error;
-
- error << "note table size out of range: " << temp
- << " valid range is [" << OP_MIN_NOTE_TABLE_SIZE << ", "
- << OP_MAX_NOTE_TABLE_SIZE << "]";
-
- QMessageBox::warning(this, 0, error.str().c_str());
-
- return false;
- }
- config.note_table_size = temp;
-
- temp = callgraph_depth_edit->text().toUInt();
- if (temp > INT_MAX) {
- ostringstream error;
-
- error << "callgraph depth out of range: " << temp
- << " valid range is [" << 0 << ", "
- << INT_MAX << "]";
-
- QMessageBox::warning(this, 0, error.str().c_str());
-
- return false;
- }
- config.callgraph_depth = temp;
-
- config.verbose = verbose->isChecked();
- config.separate_lib = separate_lib_cb->isChecked();
- config.separate_kernel = separate_kernel_cb->isChecked();
- config.separate_cpu = separate_cpu_cb->isChecked();
- config.separate_thread = separate_thread_cb->isChecked();
-
- return true;
-}
-
-
-void oprof_start::get_unit_mask_part(op_event_descr const & descr, uint num,
- bool selected, uint & mask)
-{
- if (!selected)
- return;
- if (num >= descr.unit->num)
- return;
-
- if (descr.unit->unit_type_mask == utm_bitmask)
- mask |= descr.unit->um[num].value;
- else
- mask = descr.unit->um[num].value;
-}
-
-
-// return the unit mask selected through the unit mask check box
-uint oprof_start::get_unit_mask(op_event_descr const & descr)
-{
- uint mask = 0;
-
- if (!descr.unit)
- return 0;
-
- // mandatory mask is transparent for user.
- if (descr.unit->unit_type_mask == utm_mandatory) {
- mask = descr.unit->default_mask;
- return mask;
- }
-
- get_unit_mask_part(descr, 0, check0->isChecked(), mask);
- get_unit_mask_part(descr, 1, check1->isChecked(), mask);
- get_unit_mask_part(descr, 2, check2->isChecked(), mask);
- get_unit_mask_part(descr, 3, check3->isChecked(), mask);
- get_unit_mask_part(descr, 4, check4->isChecked(), mask);
- get_unit_mask_part(descr, 5, check5->isChecked(), mask);
- get_unit_mask_part(descr, 6, check6->isChecked(), mask);
- get_unit_mask_part(descr, 7, check7->isChecked(), mask);
- get_unit_mask_part(descr, 8, check8->isChecked(), mask);
- get_unit_mask_part(descr, 9, check9->isChecked(), mask);
- get_unit_mask_part(descr, 10, check10->isChecked(), mask);
- get_unit_mask_part(descr, 11, check11->isChecked(), mask);
- get_unit_mask_part(descr, 12, check12->isChecked(), mask);
- get_unit_mask_part(descr, 13, check13->isChecked(), mask);
- get_unit_mask_part(descr, 14, check14->isChecked(), mask);
- get_unit_mask_part(descr, 15, check15->isChecked(), mask);
- return mask;
-}
-
-
-void oprof_start::hide_masks()
-{
- check0->hide();
- check1->hide();
- check2->hide();
- check3->hide();
- check4->hide();
- check5->hide();
- check6->hide();
- check7->hide();
- check8->hide();
- check9->hide();
- check10->hide();
- check11->hide();
- check12->hide();
- check13->hide();
- check14->hide();
- check15->hide();
-}
-
-
-void oprof_start::setup_unit_masks(op_event_descr const & descr)
-{
- op_unit_mask const * um = descr.unit;
-
- hide_masks();
-
- if (!um || um->unit_type_mask == utm_mandatory)
- return;
-
- event_setting & cfg = event_cfgs[descr.name];
-
- unit_mask_group->setExclusive(um->unit_type_mask == utm_exclusive);
-
- for (size_t i = 0; i < um->num ; ++i) {
- QCheckBox * check = 0;
- switch (i) {
- case 0: check = check0; break;
- case 1: check = check1; break;
- case 2: check = check2; break;
- case 3: check = check3; break;
- case 4: check = check4; break;
- case 5: check = check5; break;
- case 6: check = check6; break;
- case 7: check = check7; break;
- case 8: check = check8; break;
- case 9: check = check9; break;
- case 10: check = check10; break;
- case 11: check = check11; break;
- case 12: check = check12; break;
- case 13: check = check13; break;
- case 14: check = check14; break;
- case 15: check = check15; break;
- }
- check->setText(um->um[i].desc);
- if (um->unit_type_mask == utm_exclusive)
- check->setChecked(cfg.umask == um->um[i].value);
- else
- check->setChecked(cfg.umask & um->um[i].value);
-
- check->show();
- }
- unit_mask_group->setMinimumSize(unit_mask_group->sizeHint());
- setup_config_tab->setMinimumSize(setup_config_tab->sizeHint());
-}
-
-
-uint oprof_start::max_perf_count() const
-{
- return cpu_type == CPU_RTC ? OP_MAX_RTC_COUNT : OP_MAX_PERF_COUNT;
-}
-
-
-void oprof_start::on_flush_profiler_data()
-{
- vector<string> args;
- args.push_back("--dump");
-
- if (daemon_status().running)
- do_exec_command(OP_BINDIR "/opcontrol", args);
- else
- QMessageBox::warning(this, 0, "The profiler is not started.");
-}
-
-
-// user is happy of its setting.
-void oprof_start::on_start_profiler()
-{
- // save the current settings
- record_selected_event_config();
-
- bool one_enable = false;
-
- Q3ListViewItem * cur;
- for (cur = events_list->firstChild(); cur; cur = cur->nextSibling()) {
- if (!cur->isSelected())
- continue;
-
- // the missing reference is intended: gcc 2.91.66 can compile
- // "op_event_descr const & descr = ..." w/o a warning
- op_event_descr const descr =
- locate_event(cur->text(0).latin1());
-
- event_setting & cfg = event_cfgs[cur->text(0).latin1()];
-
- one_enable = true;
-
- if (!cfg.os_ring_count && !cfg.user_ring_count) {
- QMessageBox::warning(this, 0, "You must select to "
- "profile at least one of user binaries/kernel");
- return;
- }
-
- if (cfg.count < descr.min_count ||
- cfg.count > max_perf_count()) {
- ostringstream out;
-
- out << "event " << descr.name << " count of range: "
- << cfg.count << " must be in [ "
- << descr.min_count << ", "
- << max_perf_count()
- << "]";
-
- QMessageBox::warning(this, 0, out.str().c_str());
- return;
- }
-
- if (descr.unit &&
- descr.unit->unit_type_mask == utm_bitmask &&
- cfg.umask == 0) {
- ostringstream out;
-
- out << "event " << descr.name << " invalid unit mask: "
- << cfg.umask << endl;
-
- QMessageBox::warning(this, 0, out.str().c_str());
- return;
- }
- }
-
- if (one_enable == false && cpu_type != CPU_TIMER_INT) {
- QMessageBox::warning(this, 0, "No counters enabled.\n");
- return;
- }
-
- if (daemon_status().running) {
- // gcc 2.91 work around
- int user_choice = 0;
- user_choice =
- QMessageBox::warning(this, 0,
- "Profiler already started:\n\n"
- "stop and restart it?",
- "&Restart", "&Cancel", 0, 0, 1);
-
- if (user_choice == 1)
- return;
-
- // this flush profiler data also.
- on_stop_profiler();
- }
-
- vector<string> args;
-
- // save_config validate and setup the config
- if (save_config()) {
- // now actually start
- args.push_back("--start");
- if (config.verbose)
- args.push_back("--verbose");
- do_exec_command(OP_BINDIR "/opcontrol", args);
- }
-
- total_nr_interrupts = 0;
- timerEvent(0);
-}
-
-
-bool oprof_start::save_config()
-{
- if (!record_config())
- return false;
-
- vector<string> args;
-
- // saving config is done by running opcontrol --setup with appropriate
- // setted parameters so we use the same config file as command line
- // tools
-
- args.push_back("--setup");
-
- bool one_enabled = false;
-
- vector<string> tmpargs;
- tmpargs.push_back("--setup");
-
- Q3ListViewItem * cur;
- for (cur = events_list->firstChild(); cur; cur = cur->nextSibling()) {
- if (!cur->isSelected())
- continue;
-
- event_setting & cfg = event_cfgs[cur->text(0).latin1()];
-
- op_event_descr const & descr =
- locate_event(cur->text(0).latin1());
-
- one_enabled = true;
-
- string arg = "--event=" + descr.name;
- arg += ":" + op_lexical_cast<string>(cfg.count);
- arg += ":" + op_lexical_cast<string>(cfg.umask);
- arg += ":" + op_lexical_cast<string>(cfg.os_ring_count);
- arg += ":" + op_lexical_cast<string>(cfg.user_ring_count);
-
- tmpargs.push_back(arg);
- }
-
- // only set counters if at least one is enabled
- if (one_enabled)
- args = tmpargs;
-
- if (config.no_kernel) {
- args.push_back("--no-vmlinux");
- } else {
- args.push_back("--vmlinux=" + config.kernel_filename);
- }
-
- args.push_back("--buffer-size=" +
- op_lexical_cast<string>(config.buffer_size));
-
- if (op_get_interface() == OP_INTERFACE_24) {
- args.push_back("--note-table-size=" +
- op_lexical_cast<string>(config.note_table_size));
- } else {
- args.push_back("--buffer-watershed=" +
- op_lexical_cast<string>(config.buffer_watershed));
- args.push_back("--cpu-buffer-size=" +
- op_lexical_cast<string>(config.cpu_buffer_size));
- if (op_file_readable("/dev/oprofile/backtrace_depth")) {
- args.push_back("--callgraph=" +
- op_lexical_cast<string>(config.callgraph_depth));
- }
- }
-
- string sep = "--separate=";
-
- if (config.separate_lib)
- sep += "library,";
- if (config.separate_kernel)
- sep += "kernel,";
- if (config.separate_cpu)
- sep += "cpu,";
- if (config.separate_thread)
- sep += "thread,";
-
- if (sep == "--separate=")
- sep += "none";
- args.push_back(sep);
-
- // 2.95 work-around, it didn't like return !do_exec_command()
- bool ret = !do_exec_command(OP_BINDIR "/opcontrol", args);
- return ret;
-}
-
-
-// flush and stop the profiler if it was started.
-void oprof_start::on_stop_profiler()
-{
- vector<string> args;
- args.push_back("--shutdown");
-
- if (daemon_status().running)
- do_exec_command(OP_BINDIR "/opcontrol", args);
- else
- QMessageBox::warning(this, 0, "The profiler is already stopped.");
-
- timerEvent(0);
-}
-
-
-void oprof_start::on_separate_kernel_cb_changed(int state)
-{
- if (state == 2)
- separate_lib_cb->setChecked(true);
-}
-
-void oprof_start::on_reset_sample_files()
-{
- int ret = QMessageBox::warning(this, 0, "Are you sure you want to "
- "reset your last profile session ?", "Yes", "No", 0, 0, 1);
- if (!ret) {
- vector<string> args;
- args.push_back("--reset");
- if (!do_exec_command(OP_BINDIR "/opcontrol", args))
- // the next timer event will overwrite the message
- daemon_label->setText("Last profile session reseted.");
- else
- QMessageBox::warning(this, 0,
- "Can't reset profiling session.");
- }
-}
-
-
-/// function object for matching against name
-class event_name_eq {
- string name_;
-public:
- explicit event_name_eq(string const & s) : name_(s) {}
- bool operator()(op_event_descr const & d) const {
- return d.name == name_;
- }
-};
-
-
-// helper to retrieve an event descr through its name.
-op_event_descr const & oprof_start::locate_event(string const & name) const
-{
- return *(find_if(v_events.begin(), v_events.end(), event_name_eq(name)));
-}
diff --git a/gui/oprof_start.h b/gui/oprof_start.h
deleted file mode 100644
index 477e3f4..0000000
--- a/gui/oprof_start.h
+++ /dev/null
@@ -1,170 +0,0 @@
-/**
- * @file oprof_start.h
- * The GUI start main class
- *
- * @remark Copyright 2002 OProfile authors
- * @remark Read the file COPYING
- *
- * @author Philippe Elie
- * @author John Levon
- */
-
-#ifndef OPROF_START_H
-#define OPROF_START_H
-
-#include
-#include