<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<title>OProfile Internals</title>
- <meta name="generator" content="DocBook XSL Stylesheets V1.75.2" />
+ <meta name="generator" content="DocBook XSL Stylesheets V1.78.1" />
</head>
<body>
- <div class="book" title="OProfile Internals">
+ <div class="book">
<div class="titlepage">
<div>
<div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="chapter">
<a href="#introduction">1. Introduction</a>
</dt>
<dt>
<span class="sect2">
- <a href="#idp4832768">2.2. IA64 and perfmon</a>
+ <a href="#idm229595329680">2.2. IA64 and perfmon</a>
</span>
</dt>
</dl>
</div>
<div class="list-of-figures">
<p>
- <b>List of Figures</b>
+ <strong>List of Figures</strong>
</p>
<dl>
- <dt>3.1. <a href="#idp4848096">The OProfile buffers</a></dt>
+ <dt>3.1. <a href="#idm229590620240">The OProfile buffers</a></dt>
</dl>
</div>
- <div class="chapter" title="Chapter 1. Introduction">
+ <div class="chapter">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="introduction"></a>Chapter 1. Introduction</h2>
+ <h1 class="title"><a id="introduction"></a>Chapter 1. Introduction</h1>
</div>
</div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="sect1">
<a href="#overview">1. Overview</a>
</dl>
</div>
<p>
-This document is current for OProfile version 1.0.0.
+This document is current for OProfile version 1.1.0git.
This document provides some details on the internal workings of OProfile for the
interested hacker. This document assumes strong C, working C++, plus some knowledge of
kernel internals and CPU hardware.
</p>
- <div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;">
+ <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>
Only the "new" implementation associated with kernel 2.6 and above is covered here. 2.4
uses a very different kernel module implementation and daemon to produce the sample files.
</p>
</div>
- <div class="sect1" title="1. Overview">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
interrupt time to producing user-readable profile information.
</p>
</div>
- <div class="sect1" title="2. Components of the OProfile system">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
</div>
</div>
</div>
- <div class="sect2" title="2.1. Architecture-specific components">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
taken at interrupt time is fed into the generic OProfile driver code.
</p>
</div>
- <div class="sect2" title="2.2. oprofilefs">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
a number of useful counters for various OProfile events.
</p>
</div>
- <div class="sect2" title="2.3. Generic kernel driver">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
character device.
</p>
</div>
- <div class="sect2" title="2.4. The OProfile daemon">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
naturally).
</p>
</div>
- <div class="sect2" title="2.5. Post-profiling tools"><div class="titlepage"><div><div><h3 class="title"><a id="post-profiling"></a>2.5. Post-profiling tools</h3></div></div></div>
+ <div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="post-profiling"></a>2.5. Post-profiling tools</h3></div></div></div>
So far, we've collected data, but we've yet to present it in a useful form
to the user. This is the job of the post-profiling tools. In general form,
they collate a subset of the available sample files, load and process each one
</div>
</div>
</div>
- <div class="chapter" title="Chapter 2. Performance counter management">
+ <div class="chapter">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="performance-counters"></a>Chapter 2. Performance counter management</h2>
+ <h1 class="title"><a id="performance-counters"></a>Chapter 2. Performance counter management</h1>
</div>
</div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="sect1">
<a href="#performance-counters-ui">1. Providing a user interface</a>
</dt>
<dt>
<span class="sect2">
- <a href="#idp4832768">2.2. IA64 and perfmon</a>
+ <a href="#idm229595329680">2.2. IA64 and perfmon</a>
</span>
</dt>
</dl>
</dd>
</dl>
</div>
- <div class="sect1" title="1. Providing a user interface">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
differently, as described later.
</p>
</div>
- <div class="sect1" title="2. Programming the performance counter registers">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
profiling) of regions where "normal" interrupts are masked, enabling
more reliable profiles.
</p>
- <div class="sect2" title="2.1. Starting and stopping the counters">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
or enable on a per-counter basis, unlike the PPro models).
</p>
</div>
- <div class="sect2" title="2.2. IA64 and perfmon">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
- <h3 class="title"><a id="idp4832768"></a>2.2. IA64 and perfmon</h3>
+ <h3 class="title"><a id="idm229595329680"></a>2.2. IA64 and perfmon</h3>
</div>
</div>
</div>
</div>
</div>
</div>
- <div class="chapter" title="Chapter 3. Collecting and processing samples">
+ <div class="chapter">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="collecting-samples"></a>Chapter 3. Collecting and processing samples</h2>
+ <h1 class="title"><a id="collecting-samples"></a>Chapter 3. Collecting and processing samples</h1>
</div>
</div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="sect1">
<a href="#receiving-interrupts">1. Receiving interrupts</a>
</dt>
</dl>
</div>
- <div class="sect1" title="1. Receiving interrupts">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
architectures behave in a similar manner.
</p>
</div>
- <div class="sect1" title="2. Core data structures">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
from the CPU buffers. This process is described in detail later in this chapter.
</p>
<div class="figure">
- <a id="idp4848096"></a>
+ <a id="idm229590620240"></a>
<p class="title">
- <b>Figure 3.1. The OProfile buffers</b>
+ <strong>Figure 3.1. The OProfile buffers</strong>
</p>
<div class="figure-contents">
<div>
</div>
<br class="figure-break" />
</div>
- <div class="sect1" title="3. Logging a sample">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
reset when the CPU buffer is read.
</p>
</div>
- <div class="sect1" title="4. Logging stack traces">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
typical programs will have many <code class="function">main()</code> samples.
</p>
</div>
- <div class="sect1" title="5. Synchronising the CPU buffers to the event buffer">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
buffer synchronisation, we will start again from that point.
</p>
</div>
- <div class="sect1" title="6. Identifying binary images">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
cache lasts for as long as the daemon has the event buffer open.
</p>
</div>
- <div class="sect1" title="7. Finding a sample's binary image and offset">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="chapter" title="Chapter 4. Generating sample files">
+ <div class="chapter">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="sample-files"></a>Chapter 4. Generating sample files</h2>
+ <h1 class="title"><a id="sample-files"></a>Chapter 4. Generating sample files</h1>
</div>
</div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="sect1">
<a href="#processing-buffer">1. Processing the buffer</a>
</dt>
</dl>
</div>
- <div class="sect1" title="1. Processing the buffer">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
in the transient structure; we then do a lookup to find the correct
sample file, and log the sample, as described in the next section.
</p>
- <div class="sect2" title="1.1. Handling kernel samples">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="sect1" title="2. Locating and creating sample files">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
the fully-qualified file name to userspace.
</p>
</div>
- <div class="sect1" title="3. Writing data to a sample file">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="chapter" title="Chapter 5. Generating useful output">
+ <div class="chapter">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="output"></a>Chapter 5. Generating useful output</h2>
+ <h1 class="title"><a id="output"></a>Chapter 5. Generating useful output</h1>
</div>
</div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="sect1">
<a href="#profile-specification">1. Handling the profile specification</a>
use them to extract meaningful data, before a final collation and
presentation to the user.
</p>
- <div class="sect1" title="1. Handling the profile specification">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
complicated bit...
</p>
</div>
- <div class="sect1" title="2. Collating the candidate sample files">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
need to classify each sample file, and we may also need to "invert"
the profiles.
</p>
- <div class="sect2" title="2.1. Classifying sample files">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
only differ in one aspect, such as thread ID or event name.
</p>
</div>
- <div class="sect2" title="2.2. Creating inverted profile lists">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="sect1" title="3. Generating profile data">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
each inverted profile and make something of the data. The entry point
for this is <code class="function">populate_for_image()</code>.
</p>
- <div class="sect2" title="3.1. Processing the binary image">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
should be clear from the source.
</p>
</div>
- <div class="sect2" title="3.2. Processing the sample files">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="sect1" title="4. Generating output">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="chapter" title="Chapter 6. Extended Feature Interface">
+ <div class="chapter">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="ext"></a>Chapter 6. Extended Feature Interface</h2>
+ <h1 class="title"><a id="ext"></a>Chapter 6. Extended Feature Interface</h1>
</div>
</div>
</div>
<div class="toc">
<p>
- <b>Table of Contents</b>
+ <strong>Table of Contents</strong>
</p>
- <dl>
+ <dl class="toc">
<dt>
<span class="sect1">
<a href="#ext-intro">1. Introduction</a>
</dd>
</dl>
</div>
- <div class="sect1" title="1. Introduction">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
features or features not commonly used by general OProfile users.
</p>
</div>
- <div class="sect1" title="2. Feature Name and Handlers">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
Only the handlers of the enabled feature will be executed.
</p>
</div>
- <div class="sect1" title="3. Enabling Features">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
enabled at a time.
</p>
</div>
- <div class="sect1" title="4. Type of Handlers">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
Each feature is responsible for providing its own set of handlers.
Types of handler are:
</p>
- <div class="sect2" title="4.1. ext_init Handler">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
<code class="function">opd_options()</code> in the file <code class="filename">daemon/oprofiled.c
</code>.
</p>
- <div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;">
+ <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>
The ext_init handler is required for all features.
</p>
</div>
</div>
- <div class="sect2" title="4.2. ext_print_stats Handler">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
in the file <code class="filename">daemon/opd_stats.c</code>.
</p>
</div>
- <div class="sect2" title="4.3. ext_sfile Handler">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
</p>
</div>
</div>
- <div class="sect1" title="5. Extended Feature Reference Implementation">
+ <div class="sect1">
<div class="titlepage">
<div>
<div>
</div>
</div>
</div>
- <div class="sect2" title="5.1. Instruction-Based Sampling (IBS)">
+ <div class="sect2">
<div class="titlepage">
<div>
<div>
An example of extended feature implementation can be seen by
examining the AMD Instruction-Based Sampling support.
</p>
- <div class="sect3" title="5.1.1. IBS Initialization">
+ <div class="sect3">
<div class="titlepage">
<div>
<div>
as the map key.
</p>
</div>
- <div class="sect3" title="5.1.2. IBS Data Processing">
+ <div class="sect3">
<div class="titlepage">
<div>
<div>
<code class="filename">daemon/opd_ibs_trans.[h,c]</code>.
</p>
</div>
- <div class="sect3" title="5.1.3. IBS Sample File">
+ <div class="sect3">
<div class="titlepage">
<div>
<div>
</div>
</div>
</div>
- <div class="glossary" title="Glossary of OProfile source concepts and types">
+ <div class="glossary">
<div class="titlepage">
<div>
<div>
- <h2 class="title"><a id="glossary"></a>Glossary of OProfile source concepts and types</h2>
+ <h1 class="title"><a id="glossary"></a>Glossary of OProfile source concepts and types</h1>
</div>
</div>
</div>
<dl>
- <dt>application image</dt>
- <dd>
+ <dt>
+ <span class="glossterm">application image</span>
+ </dt>
+ <dd class="glossdef">
<p>
The primary binary image used by an application. This is derived
from the kernel and corresponds to the binary started upon running
an application: for example, <code class="filename">/bin/bash</code>.
</p>
</dd>
- <dt>binary image</dt>
- <dd>
+ <dt>
+ <span class="glossterm">binary image</span>
+ </dt>
+ <dd class="glossdef">
<p>
An ELF file containing executable code: this includes kernel modules,
the kernel itself (a.k.a. <code class="filename">vmlinux</code>), shared libraries,
and application binaries.
</p>
</dd>
- <dt>dcookie</dt>
- <dd>
+ <dt>
+ <span class="glossterm">dcookie</span>
+ </dt>
+ <dd class="glossdef">
<p>
Short for "dentry cookie". A unique ID that can be looked up to provide
the full path name of a binary image.
</p>
</dd>
- <dt>dependent image</dt>
- <dd>
+ <dt>
+ <span class="glossterm">dependent image</span>
+ </dt>
+ <dd class="glossdef">
<p>
A binary image that is dependent upon an application, used with
per-application separation. Most commonly, shared libraries. For example,
would be dependent upon <code class="filename">/bin/bash</code>.
</p>
</dd>
- <dt>merging</dt>
- <dd>
+ <dt>
+ <span class="glossterm">merging</span>
+ </dt>
+ <dd class="glossdef">
<p>
This refers to the ability to merge several distinct sample files
into one set of data at runtime, in the post-profiling tools. For example,
because there would be no useful meaning to the results.
</p>
</dd>
- <dt>profile class</dt>
- <dd>
+ <dt>
+ <span class="glossterm">profile class</span>
+ </dt>
+ <dd class="glossdef">
<p>
A collection of profile data that has been collected under the same
class template. For example, if we're using <span class="command"><strong>opreport</strong></span>
there would be a profile class for each CPU.
</p>
</dd>
- <dt>profile specification</dt>
- <dd>
+ <dt>
+ <span class="glossterm">profile specification</span>
+ </dt>
+ <dd class="glossdef">
<p>
The parameters the user passes to the post-profiling tools that limit
what sample files are used. This specification is matched against
the available sample files to generate a selection of profile data.
</p>
</dd>
- <dt>profile template</dt>
- <dd>
+ <dt>
+ <span class="glossterm">profile template</span>
+ </dt>
+ <dd class="glossdef">
<p>
The parameters that define what goes in a particular profile class.
This includes a symbolic name (e.g. "cpu:1") and the code-usable