Arnaldo Carvalho de Melo [Wed, 3 Feb 2010 18:52:01 +0000 (16:52 -0200)]
perf symbols: Fixup vsyscall maps
While debugging a problem reported by Pekka Enberg by printing
the IP and all the maps for a thread when we don't find a map
for an IP I noticed that dso__load_sym needs to fixup these
extra maps it creates to hold symbols in different ELF sections
than the main kernel one.
Now we're back showing things like:
[root@doppio linux-2.6-tip]# perf report | grep vsyscall
0.02% mutt [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.01% named [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.01% NetworkManager [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.01% gconfd-2 [kernel.kallsyms].vsyscall_0 [.] vgettimeofday
0.01% hald-addon-rfki [kernel.kallsyms].vsyscall_fn [.] vread_hpet
0.00% dbus-daemon [kernel.kallsyms].vsyscall_fn [.] vread_hpet
[root@doppio linux-2.6-tip]#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1265223128-11786-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Wed, 3 Feb 2010 18:52:00 +0000 (16:52 -0200)]
perf symbols: Remove perf_session usage in symbols layer
I noticed while writing the first test in 'perf regtest' that to
just test the symbol handling routines one needs to create a
perf session, that is a layer centered on a perf.data file,
events, etc, so I untied these layers.
This reduces the complexity for the users as the number of
parameters to most of the symbols and session APIs now was
reduced while not adding more state to all the map instances by
only having data that is needed to split the kernel (kallsyms
and ELF symtab sections) maps and do vmlinux relocation on the
main kernel map.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1265223128-11786-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Xiao Guangrong [Wed, 3 Feb 2010 03:53:14 +0000 (11:53 +0800)]
perf tools: Use O_LARGEFILE to open perf data file
Open perf data file with O_LARGEFILE flag since its size is
easily larger that 2G.
For example:
# rm -rf perf.data
# ./perf kmem record sleep 300
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 3142.147 MB perf.data
(~
137282513 samples) ]
# ll -h perf.data
-rw------- 1 root root 3.1G .....
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <
4B68F32A.9040203@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 31 Jan 2010 07:27:58 +0000 (08:27 +0100)]
perf lock: Clean up various details
Fix up a few small stylistic details:
- use consistent vertical spacing/alignment
- remove line80 artifacts
- group some global variables better
- remove dead code
Plus rename 'prof' to 'report' to make it more in line with other
tools, and remove the line/file keying as we really want to use
IPs like the other tools do.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264851813-8413-12-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hitoshi Mitake [Sat, 30 Jan 2010 11:43:33 +0000 (20:43 +0900)]
perf lock: Introduce new tool "perf lock", for analyzing lock statistics
Adding new subcommand "perf lock" to perf.
I have a lot of remaining ToDos, but for now perf lock can
already provide minimal functionality for analyzing lock
statistics.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264851813-8413-12-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hitoshi Mitake [Sat, 30 Jan 2010 11:43:32 +0000 (20:43 +0900)]
perf lock: Enhance information of lock trace events
Add wait time and lock identification details.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264851813-8413-11-git-send-email-mitake@dcl.info.waseda.ac.jp>
[ removed the file/line bits as we can do that better via IPs ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hitoshi Mitake [Sat, 30 Jan 2010 11:43:24 +0000 (20:43 +0900)]
perf: Add util/include/linuxhash.h to include hash.h of kernel
linux/hash.h, hash header of kernel, is also useful for perf.
util/include/linuxhash.h includes linux/hash.h, so we can use
hash facilities (e.g. hash_long()) in perf now.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264851813-8413-3-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hitoshi Mitake [Sat, 30 Jan 2010 11:43:23 +0000 (20:43 +0900)]
perf tools: Add __data_loc support
This patch is required to test the next patch for perf lock.
At
064739bc4b3d7f424b2f25547e6611bcf0132415 ,
support for the modifier "__data_loc" of format is added.
But, when I wanted to parse format of lock_acquired (or some
event else), raw_field_ptr() did not returned correct pointer.
So I modified raw_field_ptr() like this patch. Then
raw_field_ptr() works well.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <
1264851813-8413-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
[ v3: fixed minor stylistic detail ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hitoshi Mitake [Sat, 30 Jan 2010 11:55:41 +0000 (20:55 +0900)]
Revert "perf record: Intercept all events"
This reverts commit
f5a2c3dce03621b55f84496f58adc2d1a87ca16f.
This patch is required for making "perf lock rec" work.
The commit
f5a2c3dce0 changes write_event() of builtin-record.c
. And changed write_event() sometimes doesn't stop with perf
lock rec.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
[ that commit also causes perf record to not be Ctrl-C-able,
and it's concetually wrong to parse the data at record time
(unconditionally - even when not needed), as we eventually
want to be able to do zero-copy recording, at least for
non-archive recordings. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
John Kacur [Wed, 27 Jan 2010 23:05:54 +0000 (21:05 -0200)]
perf: Ignore perf-archive temp file
Tell git to ignore perf-archive.
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264633557-17597-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Thiago Farina [Wed, 27 Jan 2010 23:05:55 +0000 (21:05 -0200)]
tools/perf/perf.c: Clean up trivial style issues
Checked with:
./../scripts/checkpatch.pl --terse --file perf.c
perf.c: 51: ERROR: open brace '{' following function declarations go on the next line
perf.c: 73: ERROR: "foo*** bar" should be "foo ***bar"
perf.c:112: ERROR: space prohibited before that close parenthesis ')'
perf.c:127: ERROR: space prohibited before that close parenthesis ')'
perf.c:171: ERROR: "foo** bar" should be "foo **bar"
perf.c:213: ERROR: "(foo*)" should be "(foo *)"
perf.c:216: ERROR: "(foo*)" should be "(foo *)"
perf.c:217: ERROR: space required before that '*' (ctx:OxV)
perf.c:452: ERROR: do not initialise statics to 0 or NULL
perf.c:453: ERROR: do not initialise statics to 0 or NULL
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <
1264633557-17597-7-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 29 Jan 2010 08:24:57 +0000 (09:24 +0100)]
Merge branch 'perf/urgent' into perf/core
Merge reason: We want to queue up a dependent patch. Also update to
later -rc's.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Wed, 27 Jan 2010 23:05:52 +0000 (21:05 -0200)]
perf session: Create kernel maps in the constructor
Removing one extra step needed in the tools that need this,
fixing a bug in 'perf probe' where this was not being done.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264633557-17597-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Wed, 27 Jan 2010 23:05:51 +0000 (21:05 -0200)]
perf symbols: Split helpers used when creating kernel dso object
To make it clear and allow for direct usage by, for instance,
regression test suites.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264633557-17597-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Wed, 27 Jan 2010 23:05:50 +0000 (21:05 -0200)]
perf symbols: Factor out dso__load_vmlinux_path()
So that we can call it directly from regression tests, and also
to reduce the size of dso__load_kernel_sym(), making it more
clear.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264633557-17597-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Wed, 27 Jan 2010 23:05:49 +0000 (21:05 -0200)]
perf top: Exit if specified --vmlinux can't be used
As we do lazy loading of symtabs we only will know if the
specified vmlinux file is invalid when we actually have a hit in
kernel space and then try to load it. So if we get kernel hits
and there are _no_ symbols in the DSO backing the kernel map,
bail out.
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264633557-17597-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 29 Jan 2010 08:04:26 +0000 (09:04 +0100)]
perf_events: Fix sample_period transfer on inherit
One problem with frequency driven counters is that we cannot
predict the rate at which they trigger, therefore we have to
start them at period=1, this causes a ramp up effect. However,
if we fail to propagate the stable state on fork each new child
will have to ramp up again. This can lead to significant
artifacts in sample data.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: eranian@google.com
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264752266.4283.2121.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 27 Jan 2010 22:07:49 +0000 (23:07 +0100)]
perf_events, x86: Remove spurious counter reset from x86_pmu_enable()
At enable time the counter might still have a ->idx pointing to
a previously occupied location that might now be taken by
another event. Resetting the counter at that location with data
from this event will destroy the other counter's count.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100127221122.
261477183@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 27 Jan 2010 22:07:48 +0000 (23:07 +0100)]
perf_events, x86: Implement Intel Westmere support
The new Intel documentation includes Westmere arch specific
event maps that are significantly different from the Nehalem
ones. Add support for this generation.
Found the CPUID model numbers on wikipedia.
Also ammend some Nehalem constraints, spotted those when looking
for the differences between Nehalem and Westmere.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100127221122.
151865645@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 27 Jan 2010 22:07:47 +0000 (23:07 +0100)]
perf_events, x86: Clean up hw_perf_*_all() implementation
Put the recursion avoidance code in the generic hook instead of
replicating it in each implementation.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100127221122.
057507285@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 27 Jan 2010 22:07:46 +0000 (23:07 +0100)]
perf_events, x86: Fix event constraint masks
Since constraints are specified on the event number, not number
and unit mask shorten the constraint masks so that we'll
actually match something.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100127221121.
967610372@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 25 Jan 2010 14:58:43 +0000 (15:58 +0100)]
perf_event: x86: Deduplicate the disable code
Share the meat of the x86_pmu_disable() code with hw_perf_enable().
Also remove the barrier() from that code, since I could not convince
myself we actually need it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 27 Jan 2010 07:39:39 +0000 (08:39 +0100)]
perf, x86: Clean up event constraints code a bit
- Remove stray debug code
- Improve ugly macros a bit
- Remove some whitespace damage
- (Also fix up some accumulated damage in perf_event.h)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Peter Zijlstra [Mon, 25 Jan 2010 10:57:25 +0000 (11:57 +0100)]
perf_event: x86: Optimize x86_pmu_disable()
x86_pmu_disable() removes the event from the cpuc->event_list[], however
since an event can only be on that list once, stop looking after we found
it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 15:40:12 +0000 (16:40 +0100)]
perf_event: x86: Optimize the fast path a little more
Remove num from the fast path and save a few ops.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155536.
056430539@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 15:32:17 +0000 (16:32 +0100)]
perf_event: x86: Optimize constraint weight computation
Add a weight member to the constraint structure and avoid recomputing the
weight at runtime.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155535.
963944926@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 15:32:17 +0000 (16:32 +0100)]
perf_event: x86: Optimize the constraint searching bits
Instead of copying bitmasks around, pass pointers to the constraint
structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155535.
887853503@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 14:59:29 +0000 (15:59 +0100)]
bitops: Provide compile time HWEIGHT{8,16,32,64}
Provide compile time versions of hweight.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <
20100122155535.
797688466@chello.nl>
[ Remove some whitespace damage while we are at it ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 14:38:26 +0000 (15:38 +0100)]
perf_event: x86: Reduce some overly long lines with some MACROs
Introduce INTEL_EVENT_CONSTRAINT and FIXED_EVENT_CONSTRAINT to reduce
some line length and typing work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155535.
688730371@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 14:25:59 +0000 (15:25 +0100)]
perf_event: x86: Clean up some of the u64/long bitmask casting
We need this to be u64 for direct assigment, but the bitmask functions
all work on unsigned long, leading to cast heaven, solve this by using a
union.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155535.
595961269@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 13:55:22 +0000 (14:55 +0100)]
perf_event: x86: Fixup constraints typing issue
Constraints gets defined an u64 but in long quantities and then cast to
long.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155535.
504916780@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Fri, 22 Jan 2010 13:35:46 +0000 (14:35 +0100)]
perf_event: x86: Allocate the fake_cpuc
GCC was complaining the stack usage was too large, so allocate the
structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <
20100122155535.
411197266@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephane Eranian [Thu, 21 Jan 2010 15:39:01 +0000 (17:39 +0200)]
perf_events: Add fast-path to the rescheduling code
Implement correct fastpath scheduling, i.e., reuse previous assignment.
Signed-off-by: Stephane Eranian <eranian@google.com>
[ split from larger patch]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <
4b588464.
1818d00a.4456.383b@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephane Eranian [Mon, 18 Jan 2010 08:58:01 +0000 (10:58 +0200)]
perf_events, x86: Improve x86 event scheduling
This patch improves event scheduling by maximizing the use of PMU
registers regardless of the order in which events are created in a group.
The algorithm takes into account the list of counter constraints for each
event. It assigns events to counters from the most constrained, i.e.,
works on only one counter, to the least constrained, i.e., works on any
counter.
Intel Fixed counter events and the BTS special event are also handled via
this algorithm which is designed to be fairly generic.
The patch also updates the validation of an event to use the scheduling
algorithm. This will cause early failure in perf_event_open().
The 2nd version of this patch follows the model used by PPC, by running
the scheduling algorithm and the actual assignment separately. Actual
assignment takes place in hw_perf_enable() whereas scheduling is
implemented in hw_perf_group_sched_in() and x86_pmu_enable().
Signed-off-by: Stephane Eranian <eranian@google.com>
[ fixup whitespace and style nits as well as adding is_x86_event() ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
4b5430c6.
0f975e0a.1bf9.
ffff85fe@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
K.Prasad [Thu, 28 Jan 2010 11:14:15 +0000 (16:44 +0530)]
x86/hw-breakpoints: Optimize return code from notifier chain in hw_breakpoint_handler
Processing of debug exceptions in do_debug() can stop if it
originated from a hw-breakpoint exception by returning NOTIFY_STOP
in most cases.
But for certain cases such as:
a) user-space breakpoints with pending SIGTRAP signal delivery (as
in the case of ptrace induced breakpoints).
b) exceptions due to other causes than breakpoints
We will continue to process the exception by returning NOTIFY_DONE.
Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
LKML-Reference: <
20100128111415.GC13935@in.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
K.Prasad [Thu, 28 Jan 2010 11:14:01 +0000 (16:44 +0530)]
x86/debug: Clear reserved bits of DR6 in do_debug()
Clear the reserved bits from the stored copy of debug status
register (DR6).
This will help easy bitwise operations such as quick testing
of a debug event origin.
Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <
20100128111401.GB13935@in.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Xiao Guangrong [Thu, 28 Jan 2010 01:34:27 +0000 (09:34 +0800)]
tracing/kprobe: Cleanup unused return value of tracing functions
The return values of the kprobe's tracing functions are meaningless,
lets remove these.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <
4B60E9A3.2040505@cn.fujitsu.com>
[fweisbec@gmail: whitespace fixes, drop useless void returns in end
of functions]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Xiao Guangrong [Thu, 28 Jan 2010 01:32:29 +0000 (09:32 +0800)]
perf: Factorize trace events raw sample buffer operations
Introduce ftrace_perf_buf_prepare() and ftrace_perf_buf_submit() to
gather the common code that operates on raw events sampling buffer.
This cleans up redundant code between regular trace events, syscall
events and kprobe events.
Changelog v1->v2:
- Rename function name as per Masami and Frederic's suggestion
- Add __kprobes for ftrace_perf_buf_prepare() and make
ftrace_perf_buf_submit() inline as per Masami's suggestion
- Export ftrace_perf_buf_prepare since modules will use it
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <
4B60E92D.9000808@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Anton Blanchard [Mon, 18 Jan 2010 05:47:07 +0000 (16:47 +1100)]
perf: Fix inconsistency between IP and callchain sampling
When running perf across all cpus with backtracing (-a -g), sometimes we
get samples without associated backtraces:
23.44% init [kernel] [k] restore
11.46% init eeba0c [k] 0x00000000eeba0c
6.77% swapper [kernel] [k] .perf_ctx_adjust_freq
5.73% init [kernel] [k] .__trace_hcall_entry
4.69% perf libc-2.9.so [.] 0x0000000006bb8c
|
|--11.11%-- 0xfffa941bbbc
It turns out the backtrace code has a check for the idle task and the IP
sampling does not. This creates problems when profiling an interrupt
heavy workload (in my case 10Gbit ethernet) since we get no backtraces
for interrupts received while idle (ie most of the workload).
Right now x86 and sh check that current is not NULL, which should never
happen so remove that too.
Idle task's exclusion must be performed from the core code, on top
of perf_event_attr:exclude_idle.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
LKML-Reference: <
20100118054707.GT12666@kryten>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Mahesh Salgaonkar [Thu, 21 Jan 2010 12:55:16 +0000 (18:25 +0530)]
hw_breakpoints: Release the bp slot if arch_validate_hwbkpt_settings() fails.
On a given architecture, when hardware breakpoint registration fails
due to un-supported access type (read/write/execute), we lose the bp
slot since register_perf_hw_breakpoint() does not release the bp slot
on failure.
Hence, any subsequent hardware breakpoint registration starts failing
with 'no space left on device' error.
This patch introduces error handling in register_perf_hw_breakpoint()
function and releases bp slot on error.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: K. Prasad <prasad@linux.vnet.ibm.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
LKML-Reference: <
20100121125516.GA32521@in.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Hitoshi Mitake [Fri, 22 Jan 2010 13:45:29 +0000 (22:45 +0900)]
perf trace: Add -i option for choosing input file
perf trace lacks -i option for choosing input file.
This patch adds it to perf trace.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1264167929-6741-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Fri, 22 Jan 2010 16:35:02 +0000 (14:35 -0200)]
perf symbols: Use the right variable to check for kallsyms in the cache
Probably this wasn't noticed when testing this on my parisc
machine because I must have copied manually to its cache the
vmlinux file used in the x86_64 machine, now that I tried
looking on a x86-32 machine with a fresh cache, kernel symbols
weren't being resolved even with the right kallsyms copy on its
cache, duh.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264178102-4203-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Fri, 22 Jan 2010 16:35:01 +0000 (14:35 -0200)]
perf symbols: Fix inverted logic for showing kallsyms as the source of symbols
Only if we parsed /proc/kallsyms (or a copy found in the buildid
cache) we should set the dso long name to "[kernel.kallsyms]".
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264178102-4203-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Thu, 21 Jan 2010 15:04:44 +0000 (13:04 -0200)]
perf top: Handle PERF_RECORD_{FORK,EXIT} events
As noticed by Mike, symbols in new tasks were not being
processed as we weren't processing these events.
Reported-by: Mike Galbraith <efault@gmx.de>
Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264086284-1431-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arnaldo Carvalho de Melo [Thu, 21 Jan 2010 15:04:43 +0000 (13:04 -0200)]
perf top: Fix sample counting
Broken since "5b2bb75 perf top: Support userspace symbols too".
Reported-by: Mike Galbraith <efault@gmx.de>
Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <
1264086284-1431-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Amerigo Wang [Mon, 25 Jan 2010 05:07:30 +0000 (00:07 -0500)]
perf: Ignore perf.data.old
Tell git to ignore this file.
Signed-off-by: WANG Cong <amwang@redhat.com>
LKML-Reference: <
20100125051052.3999.28082.sendpatchset@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yong Wang [Fri, 22 Jan 2010 01:47:50 +0000 (09:47 +0800)]
perf report: Fix segmentation fault when running with '-g none'
Segmentation fault occurs when running perf report with '-g
none'.
Reported-by: Austin Zhang <austin.zhang@intel.com>
Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
20100122014750.GA4111@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Tue, 26 Jan 2010 17:50:16 +0000 (18:50 +0100)]
perf: Reimplement frequency driven sampling
There was a bug in the old period code that caused intel_pmu_enable_all()
or native_write_msr_safe() to show up quite high in the profiles.
In staring at that code it made my head hurt, so I rewrote it in a
hopefully simpler fashion. Its now fully symetric between tick and
overflow driven adjustments and uses less data to boot.
The only complication is that it basically wants to do a u128 division.
The code approximates that in a rather simple truncate until it fits
fashion, taking care to balance the terms while truncating.
This version does not generate that sampling artefact.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Greg Kroah-Hartman [Tue, 26 Jan 2010 23:04:02 +0000 (15:04 -0800)]
fnctl: f_modown should call write_lock_irqsave/restore
Commit
703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown()
should call write_lock_irqsave instead of just write_lock_irq so that
because a caller could have a spinlock held and it would not be good to
renable interrupts.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Tavis Ormandy <taviso@google.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 26 Jan 2010 03:05:06 +0000 (19:05 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag
ext4: Fix quota accounting error with fallocate
ext4: Handle -EDQUOT error on write
Linus Torvalds [Tue, 26 Jan 2010 03:03:58 +0000 (19:03 -0800)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] sbc_fitpc2_wdt: fix I/O space access technique.
[WATCHDOG] ixp2000: Fix build failure caused by missing include
Linus Torvalds [Tue, 26 Jan 2010 03:03:45 +0000 (19:03 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: fix a memory-leak in wm8903
ALSA: hda - add possibility to choose speakers configuration for 4930g
ALSA: hda - Fix HP T5735 automute
ALSA: hda - Turn on EAPD only if available for Realtek codecs
ALSA: hda - Fix parsing pin node 0x21 on ALC259
Linus Torvalds [Tue, 26 Jan 2010 03:02:31 +0000 (19:02 -0800)]
Merge branch 'kvm-updates/2.6.33' of git://git./virt/kvm/kvm
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()
KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks
KVM: S390: fix potential array overrun in intercept handling
KVM: fix spurious interrupt with irqfd
eventfd - allow atomic read and waitqueue remove
KVM: MMU: bail out pagewalk on kvm_read_guest error
KVM: properly check max PIC pin in irq route setup
KVM: only allow one gsi per fd
KVM: x86: Fix host_mapping_level()
KVM: powerpc: Show timing option only on embedded
KVM: Fix race between APIC TMR and IRR
Linus Torvalds [Tue, 26 Jan 2010 03:02:06 +0000 (19:02 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6
* 'linux-next' of git://git.infradead.org/ubi-2.6:
UBI: fix memory leak in update path
UBI: add more checks to chdev open
UBI: initialise update marker
Linus Torvalds [Tue, 26 Jan 2010 03:00:56 +0000 (19:00 -0800)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (fschmd) Fix a memleak on multiple opens of /dev/watchdog
hwmon: (asus_atk0110) Do not fail if MBIF is missing
hwmon: (amc6821) Double unlock bug
hwmon: (smsc47m1) Fix section mismatch
Linus Torvalds [Tue, 26 Jan 2010 02:59:47 +0000 (18:59 -0800)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (95 commits)
drm/radeon/kms: preface warning printk with driver name
drm/radeon/kms: drop unnecessary printks.
drm: fix regression in fb blank handling
drm/radeon/kms: make hibernate work on IGPs
drm/vmwgfx: Optimize memory footprint for DMA buffers.
drm/ttm: Allow system memory as a busy placement.
drm/ttm: Fix race condition in ttm_bo_delayed_delete (v3, final)
drm/nv50: prevent switching off SOR when in use for DVI-over-DP
drm/nv50: fail auxch transaction if reply count not what we expect
drm/nouveau: fix failure path if userspace specifies no valid memtypes
drm/nouveau: report LVDS as disconnected if lid closed
drm/radeon/kms: fix legacy get_engine/memory clock
drm/radeon/kms/atom: atom parser fixes
drm/radeon/kms: clean up atombios pll code
drm/radeon/kms: clean up pll struct
drm/radeon/kms/atom: fix crtc lock ordering
drm/radeon: r6xx/r7xx possible security issue, system ram access
drm/radeon/kms: r600/r700 don't test ib if ib initialization fails
drm/radeon/kms: Forbid creation of framebuffer with no valid GEM object
drm/radeon/kms: r600 handle irq vector ring overflow
...
Linus Torvalds [Tue, 26 Jan 2010 02:57:25 +0000 (18:57 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Fix IRQ ->set_affinity() methods.
sparc: cpumask_of_node() should handle -1 as a node
sparc64: Update defconfig.
sparc: Add missing SW perf fault events.
sparc64: Fully support both performance counters.
sparc64: Add perf callchain support.
sparc: convert to arch_gettimeoffset()
sparc: leds_resource.end assigned to itself in clock_board_probe()
sparc32: Fix page_to_phys().
sparc: Simplify param.h by simply including <asm-generic/param.h>
sparc32: Update defconfig.
SPARC: use helpers for rlimits
sparc: copy_from_user() should not return -EFAULT
Linus Torvalds [Tue, 26 Jan 2010 02:57:07 +0000 (18:57 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
virtio_net: Make delayed refill more reliable
sfc: Use fixed-size buffers for MCDI NVRAM requests
sfc: Add workspace for GMAC bug workaround to MCDI MAC_STATS buffer
tcp_probe: avoid modulus operation and wrap fix
qlge: Only free resources if they were allocated
netns xfrm: deal with dst entries in netns
sky2: revert config space change
vlan: fix vlan_skb_recv()
netns xfrm: fix "ip xfrm state|policy count" misreport
sky2: Enable/disable WOL per hardware device
net: Fix IPv6 GSO type checks in Intel ethernet drivers
igb/igbvf: cleanup exception handling in tx_map_adv
MAINTAINERS: Add Intel igbvf maintainer
e1000/e1000e: don't use small hardware rx buffers
fmvj18x_cs: add new id (Panasonic lan & modem card)
be2net: swap only first 2 fields of mcc_wrb
Please add support for Microsoft MN-120 PCMCIA network card
be2net: fix bug in rx page posting
wimax/i2400m: Add support for more i6x50 SKUs
e1000e: enhance frame fragment detection
...
Linus Torvalds [Tue, 26 Jan 2010 02:56:12 +0000 (18:56 -0800)]
Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (25 commits)
OMAP2/3: DMTIMER: Clear pending interrupts when stopping a timer
PM debug: Fix warning when no CONFIG_DEBUG_FS
OMAP3: PM: DSS PM_WKEN to refill DMA
OMAP: timekeeping: time should not stop during suspend
OMAP3: PM: Force write last pad config register into save area
OMAP: omap3_pm_get_suspend_state() error ignored in pwrdm_suspend_get()
OMAP3: PM: Enable wake-up from McBSP2, 3 and 4 modules
OMAP3: PM debug: fix build error when !CONFIG_DEBUG_FS
OMAP3: PM: Removing redundant and potentially dangerous PRCM configration
OMAP3: Fixed ARM aux ctrl register save/restore
OMAP3: CPUidle: Fixed timer resolution
OMAP3: PM: Remove duplicate code blocks
OMAP3: PM: Disable interrupt controller AUTOIDLE before WFI
OMAP3: PM: Enable system control module autoidle
OMAP3: PM: Ack pending interrupts before entering suspend
omap: Enable GPMC clock in gpmc_init
OMAP1 clock: fix for "BUG: spinlock lockup on CPU#0"
OMAP4: clocks: Fix the clksel_rate struct DPLL divs
OMAP4: PRCM: Fix the base address for CHIRONSS reg defines
OMAP: dma_chan[lch_head].flag & OMAP_DMA_ACTIVE tested twice in omap_dma_unlink_lch()
...
Herbert Xu [Mon, 25 Jan 2010 23:51:01 +0000 (15:51 -0800)]
virtio_net: Make delayed refill more reliable
I have seen RX stalls on a machine that experienced a suspected
OOM. After the stall, the RX buffer is empty on the guest side
and there are exactly 16 entries available on the host side. As
the number of entries is less than that required by a maximal
skb, the host cannot proceed.
The guest did not have a refill job scheduled.
My diagnosis is that an OOM had occured, with the delayed refill
job scheduled. The job was able to allocate at least one skb, but
not enough to overcome the minimum required by the host to proceed.
As the refill job would only reschedule itself if it failed completely
to allocate any skbs, this would lead to an RX stall.
The following patch removes this stall possibility by always
rescheduling the refill job until the ring is totally refilled.
Testing has shown that the RX stall no longer occurs whereas
previously it would occur within a day.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 25 Jan 2010 23:49:59 +0000 (15:49 -0800)]
sfc: Use fixed-size buffers for MCDI NVRAM requests
The low-level MCDI code always uses 32-bit MMIO operations, and
callers must pad input and output buffers to multiples of 4 bytes.
The MCDI NVRAM functions are not doing this. Also, their buffers are
declared as variable-length arrays with no explicit maximum length.
Switch to a fixed buffer size based on the chunk size used by the
MTD driver (which is a multiple of 4).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guido Barzini [Mon, 25 Jan 2010 23:49:19 +0000 (15:49 -0800)]
sfc: Add workspace for GMAC bug workaround to MCDI MAC_STATS buffer
Due to a hardware bug in the SFC9000 family, the firmware must
transfer raw GMAC statistics to host memory before aggregating them
into the cooked (speed-independent) MAC statistics. Extend the stats
buffer to support this.
The length of the buffer is explicit in the MAC_STATS command, so this
change is backward-compatible on both sides.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Mon, 25 Jan 2010 23:47:50 +0000 (15:47 -0800)]
tcp_probe: avoid modulus operation and wrap fix
By rounding up the buffer size to power of 2, several expensive
modulus operations can be avoided. This patch also solves a bug where
the gap need when ring gets full was not being accounted for.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Breno Leitao [Mon, 25 Jan 2010 23:46:58 +0000 (15:46 -0800)]
qlge: Only free resources if they were allocated
Currently qlge tries to release regions even if they were not allocated.
This causes messages like the following in the kernel log
Trying to free nonexistent resource <
00000000006af400-
00000000006af4ff>
Trying to free nonexistent resource <
00003c04ff9f4000-
00003c04ff9f7fff>
Trying to free nonexistent resource <
00003c04ffc00000-
00003c04ffcfffff>
This patch fixes the goto logic in order to not release the resources
if they were not allocated.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Denis Turischev [Thu, 21 Jan 2010 14:10:07 +0000 (16:10 +0200)]
[WATCHDOG] sbc_fitpc2_wdt: fix I/O space access technique.
The mdelay function was used between I/O access commands, that causes peak
in CPU usage. Fix it by substitution mdelay to msleep.
Expand usage on fitPC2 compatible boards according to DMI identification.
Signed-off-by: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Takashi Iwai [Mon, 25 Jan 2010 16:00:01 +0000 (17:00 +0100)]
Merge branch 'fix/hda' into for-linus
Guennadi Liakhovetski [Fri, 22 Jan 2010 17:00:03 +0000 (18:00 +0100)]
ASoC: fix a memory-leak in wm8903
Remember to free the temporary register-cache.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
Wei Yongjun [Fri, 22 Jan 2010 06:21:29 +0000 (14:21 +0800)]
KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()
In function kvm_arch_vcpu_init(), if the memory malloc for
vcpu->arch.mce_banks is fail, it does not free the memory
of lapic date. This patch fixed it.
Cc: stable@kernel.org
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Wei Yongjun [Fri, 22 Jan 2010 06:18:47 +0000 (14:18 +0800)]
KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks
vcpu->arch.mce_banks is malloc in kvm_arch_vcpu_init(), but
never free in any place, this may cause memory leak. So this
patch fixed to free it in kvm_arch_vcpu_uninit().
Cc: stable@kernel.org
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Christian Borntraeger [Thu, 21 Jan 2010 11:19:07 +0000 (12:19 +0100)]
KVM: S390: fix potential array overrun in intercept handling
kvm_handle_sie_intercept uses a jump table to get the intercept handler
for a SIE intercept. Static code analysis revealed a potential problem:
the intercept_funcs jump table was defined to contain (0x48 >> 2) entries,
but we only checked for code > 0x48 which would cause an off-by-one
array overflow if code == 0x48.
Use the compiler and ARRAY_SIZE to automatically set the limits.
Cc: stable@kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Michael S. Tsirkin [Wed, 13 Jan 2010 17:12:30 +0000 (19:12 +0200)]
KVM: fix spurious interrupt with irqfd
kvm didn't clear irqfd counter on deassign, as a result we could get a
spurious interrupt when irqfd is assigned back. this leads to poor
performance and, in theory, guest crash.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Davide Libenzi [Wed, 13 Jan 2010 17:34:36 +0000 (09:34 -0800)]
eventfd - allow atomic read and waitqueue remove
KVM needs a wait to atomically remove themselves from the eventfd ->poll()
wait queue head, in order to handle correctly their IRQfd deassign
operation.
This patch introduces such API, plus a way to read an eventfd from its
context.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Marcelo Tosatti [Thu, 14 Jan 2010 19:41:27 +0000 (17:41 -0200)]
KVM: MMU: bail out pagewalk on kvm_read_guest error
Exit the guest pagetable walk loop if reading gpte failed. Otherwise its
possible to enter an endless loop processing the previous present pte.
Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Tue, 12 Jan 2010 18:42:09 +0000 (16:42 -0200)]
KVM: properly check max PIC pin in irq route setup
Otherwise memory beyond irq_states[16] might be accessed.
Noticed by Juan Quintela.
Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Michael S. Tsirkin [Wed, 13 Jan 2010 16:58:09 +0000 (18:58 +0200)]
KVM: only allow one gsi per fd
Looks like repeatedly binding same fd to multiple gsi's with irqfd can
use up a ton of kernel memory for irqfd structures.
A simple fix is to allow each fd to only trigger one gsi: triggering a
storm of interrupts in guest is likely useless anyway, and we can do it
by binding a single gsi to many interrupts if we really want to.
Cc: stable@kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Acked-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Tue, 5 Jan 2010 11:02:28 +0000 (19:02 +0800)]
KVM: x86: Fix host_mapping_level()
When found a error hva, should not return PAGE_SIZE but the level...
Also clean up the coding style of the following loop.
Cc: stable@kernel.org
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Alexander Graf [Sun, 20 Dec 2009 21:24:06 +0000 (22:24 +0100)]
KVM: powerpc: Show timing option only on embedded
Embedded PowerPC KVM has an exit timing implementation to track and evaluate
how much time was spent in which exit path.
For Book3S, we don't implement it. So let's not expose it as a config option
either.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 29 Dec 2009 10:42:16 +0000 (12:42 +0200)]
KVM: Fix race between APIC TMR and IRR
When we queue an interrupt to the local apic, we set the IRR before the TMR.
The vcpu can pick up the IRR and inject the interrupt before setting the TMR,
and perhaps even EOI it, causing incorrect behaviour.
The race is really insignificant since it can only occur on the first
interrupt (usually following interrupts will not change TMR), but it's better
closed than open.
Fixed by reordering setting the TMR vs IRR.
Cc: stable@kernel.org
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Hans de Goede [Mon, 25 Jan 2010 14:00:50 +0000 (15:00 +0100)]
hwmon: (fschmd) Fix a memleak on multiple opens of /dev/watchdog
When /dev/watchdog gets opened a second time we return -EBUSY, but
we already have got a kref then, so we end up leaking our data struct.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Luca Tettamanti [Mon, 25 Jan 2010 14:00:49 +0000 (15:00 +0100)]
hwmon: (asus_atk0110) Do not fail if MBIF is missing
MBIF (motherboard identification) is only used to print the name of
the board, it's not essential for the driver; do not fail if it's
missing. Based on Juan's patch.
Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Acked-by: Juan RP <xtraeme@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Dan Carpenter [Mon, 25 Jan 2010 14:00:49 +0000 (15:00 +0100)]
hwmon: (amc6821) Double unlock bug
The mutex gets unlocked after we goto EXIT.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jeff Mahoney [Mon, 25 Jan 2010 14:00:48 +0000 (15:00 +0100)]
hwmon: (smsc47m1) Fix section mismatch
smsc47m1_restore is called from sm_smsc47m1_exit, which is an __exit
function, so it can't be __init.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Łukasz Wojniłowicz [Sun, 24 Jan 2010 13:12:37 +0000 (14:12 +0100)]
ALSA: hda - add possibility to choose speakers configuration for 4930g
Now one can choose speaker configuration in e.g. PulseAudio mixer
Signed-off-by: Łukasz Wojniłowicz <lukasz.wojnilowicz@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Alexey Dobriyan [Mon, 25 Jan 2010 06:47:53 +0000 (22:47 -0800)]
netns xfrm: deal with dst entries in netns
GC is non-existent in netns, so after you hit GC threshold, no new
dst entries will be created until someone triggers cleanup in init_net.
Make xfrm4_dst_ops and xfrm6_dst_ops per-netns.
This is not done in a generic way, because it woule waste
(AF_MAX - 2) * sizeof(struct dst_ops) bytes per-netns.
Reorder GC threshold initialization so it'd be done before registering
XFRM policies.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Sun, 24 Jan 2010 18:46:06 +0000 (18:46 +0000)]
sky2: revert config space change
Obviously, this register had some other impact that is causing
the regression. Either it is masking some other access or needs
to be reset in some path.
Either, way it is best to just revert the change for 2.6.33
This reverts commit
166a0fd4c788ec7f10ca8194ec6d526afa12db75.
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Mon, 25 Jan 2010 06:13:55 +0000 (16:13 +1000)]
drm/radeon/kms: preface warning printk with driver name
This just adds a little more info to the warning for old -ati/mesa
userspaces.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 25 Jan 2010 06:13:12 +0000 (16:13 +1000)]
drm/radeon/kms: drop unnecessary printks.
These printks aren't required anymore.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Zhenyu Wang [Mon, 18 Jan 2010 08:47:04 +0000 (16:47 +0800)]
drm: fix regression in fb blank handling
commit
731b5a15a3b1474a41c2ca29b4c32b0f21bc852e
Author: James Simmons <jsimmons@infradead.org>
Date: Thu Oct 29 20:39:07 2009 +0000
drm/kms: properly handle fbdev blanking
uses DRM_MODE_DPMS_ON for FB_BLANK_NORMAL, but DRM_MODE_DPMS_ON
is actually for turning output on instead of blank.
This makes fb blank broken on my T61, it put LVDS on but leave
pipe disabled which made screen totally white or caused some
'burning' effect.
[airlied: James objects to this but at this point in 2.6.33,
I can't see a patch that will fix this properly like he wants coming
in time and otherwise this is a regression - proper fix for 2.6.34
hopefully.]
Cc: James Simmons <jsimmons@infradead.org>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 25 Jan 2010 03:08:08 +0000 (13:08 +1000)]
drm/radeon/kms: make hibernate work on IGPs
This is the least invasive fix without migrating the radeon driver
to pm_ops from what I can see. We just always migrate VRAM objects
on IGPs for now and we can fix it up later to migrate depending
on STR vs STD.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Hellstrom [Sat, 16 Jan 2010 15:05:05 +0000 (16:05 +0100)]
drm/vmwgfx: Optimize memory footprint for DMA buffers.
Use VRAM whenever there is free space for DMA buffers,
but use system GMR memory if using VRAM would cause an eviction.
This significantly reduces the guest system memory usage for
VMs with a large amount of VRAM allocated.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Hellstrom [Sat, 16 Jan 2010 15:05:04 +0000 (16:05 +0100)]
drm/ttm: Allow system memory as a busy placement.
This is needed to fix a vmwgfx memory usage bug.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 25 Jan 2010 06:04:21 +0000 (16:04 +1000)]
Merge remote branch 'korg/drm-radeon-next' into drm-linus
* korg/drm-radeon-next:
drm/radeon/kms: fix legacy get_engine/memory clock
drm/radeon/kms/atom: atom parser fixes
drm/radeon/kms: clean up atombios pll code
drm/radeon/kms: clean up pll struct
drm/radeon/kms/atom: fix crtc lock ordering
drm/radeon: r6xx/r7xx possible security issue, system ram access
drm/radeon/kms: r600/r700 don't test ib if ib initialization fails
drm/radeon/kms: Forbid creation of framebuffer with no valid GEM object
drm/radeon/kms: r600 handle irq vector ring overflow
drm/radeon/kms: r600/r700 don't process IRQ if not initialized
drm/radeon/kms: r600/r700 disable irq at suspend
drm/radeon/kms/r4xx: cleanup atom path
drm/radeon/kms: fix atombios_crtc_set_base
drm/radeon/kms/atom: upstream parser updates
drm/radeon/kms/atom: fix some parser bugs
drm/radeon/kms: fix hardcoded mmio size in register functions
drm/radeon/kms/r100: fix bug in CS parser
drm/radeon/kms/r200: fix bug in CS parser
drm/radeon/kms/r200: fix bug in CS parser
Dave Airlie [Mon, 25 Jan 2010 06:04:11 +0000 (16:04 +1000)]
Merge remote branch 'nouveau/for-airlied' of ../drm-nouveau-next into drm-linus
* 'nouveau/for-airlied' of ../drm-nouveau-next:
drm/nv50: prevent switching off SOR when in use for DVI-over-DP
drm/nv50: fail auxch transaction if reply count not what we expect
drm/nouveau: fix failure path if userspace specifies no valid memtypes
drm/nouveau: report LVDS as disconnected if lid closed
drm/nv50: prevent accidently turning off encoders we're actually using
drm/nv50: fix alignment of per-channel fifo cache
drm/nouveau: Evict buffers in VRAM before freeing sgdma
drm/nouveau: Acknowledge DMA_VTX_PROTECTION PGRAPH interrupts
drm/nouveau: fix thinko in nv04_instmem.c
drm/nouveau: fix a race condition in nouveau_dma_wait()
Eric Dumazet [Mon, 25 Jan 2010 03:52:24 +0000 (19:52 -0800)]
vlan: fix vlan_skb_recv()
Bruno Prémont found commit
9793241fe92f7d930
(vlan: Precise RX stats accounting) added a regression for non
hw accelerated vlans.
[ 26.390576] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 26.396369] IP: [<
df856b89>] vlan_skb_recv+0x89/0x280 [8021q]
vlan_dev_info() was used with original device, instead of
skb->dev. Also spotted by Américo Wang.
Reported-By: Bruno Prémont <bonbons@linux-vserver.org>
Tested-By: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Luca Barbieri [Wed, 20 Jan 2010 19:01:30 +0000 (20:01 +0100)]
drm/ttm: Fix race condition in ttm_bo_delayed_delete (v3, final)
Resending this with Thomas Hellstrom's signoff for merging into 2.6.33
ttm_bo_delayed_delete has a race condition, because after we do:
kref_put(&nentry->list_kref, ttm_bo_release_list);
we are not holding the list lock and not holding any reference to
objects, and thus every bo in the list can be removed and freed at
this point.
However, we then use the next pointer we stored, which is not guaranteed
to be valid.
This was apparently the cause of some Nouveau oopses I experienced.
This patch rewrites the function so that it keeps the reference to nentry
until nentry itself is freed and we already got a reference to nentry->next.
v2 updated by me according to Thomas Hellstrom's feedback.
v3 proposed by Thomas Hellstrom. Commit comment updated by me.
Both updates fixed minor efficiency/style issues only and all three versions
should be correct.
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Ben Skeggs [Fri, 22 Jan 2010 00:57:01 +0000 (10:57 +1000)]
drm/nv50: prevent switching off SOR when in use for DVI-over-DP
Another hack because of us exposing each encoder block's function as
an encoder rather than exposing a single encoder that deals with them
all.
A proper fix will come, it's just rather invasive so this hack will
do until then.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 21 Jan 2010 23:10:05 +0000 (09:10 +1000)]
drm/nv50: fail auxch transaction if reply count not what we expect
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 21 Jan 2010 05:03:23 +0000 (15:03 +1000)]
drm/nouveau: fix failure path if userspace specifies no valid memtypes
We need to add the buffer to the list even if we fail, otherwise the
validate_fini() call won't unreserve + unreference the GEM object,
making TTM very unhappy.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 18 Jan 2010 01:42:37 +0000 (11:42 +1000)]
drm/nouveau: report LVDS as disconnected if lid closed
Also adds a module option to ignore the status reported via ACPI, in case
we hit systems with broken ACPI.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Linus Torvalds [Sun, 24 Jan 2010 18:38:07 +0000 (10:38 -0800)]
Merge branch 'timers-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clockevent: Don't remove broadcast device when cpu is dead