perf/x86: Use rdpmc() rather than rdmsr() when possible in the kernel
authorVince Weaver <vweaver1@eecs.utk.edu>
Thu, 1 Mar 2012 22:28:14 +0000 (17:28 -0500)
committerIngo Molnar <mingo@kernel.org>
Wed, 6 Jun 2012 15:23:35 +0000 (17:23 +0200)
commitc48b60538c3ba05a7a2713c4791b25405525431b
tree8482e11e2060831f1ffc5e81006d88e8cd8f5ea2
parent1ff4d58a192aea7f245981e2579765f961f6eb9c
perf/x86: Use rdpmc() rather than rdmsr() when possible in the kernel

The rdpmc instruction is faster than the equivalent rdmsr call,
so use it when possible in the kernel.

The perfctr kernel patches did this, after extensive testing showed
rdpmc to always be faster (one can look at etc/costs in the perfctr-2.6
package for a historical list of the overheads).

I have done some tests on a 3.2 kernel; the kernel module I used
was included in the first posting of this patch:

                   rdmsr           rdpmc
 Core2 T9900:      203.9 cycles     30.9 cycles
 AMD fam0fh:        56.2 cycles      9.8 cycles
 Atom 6/28/2:      129.7 cycles     50.6 cycles

The speedup from using rdpmc is large.

[ It's probably possible (and desirable) to do this without
  requiring a new field in the hw_perf_event structure, but
  the fixed events make this tricky. ]

Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1203011724030.26934@cl320.eecs.utk.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
arch/x86/kernel/cpu/perf_event.c
include/linux/perf_event.h