Frederic Weisbecker [Sat, 13 Dec 2008 19:18:13 +0000 (20:18 +0100)]
tracing/ftrace: add the printk-msg-only option
Impact: display ftrace_printk messages "as is"
By default, ftrace_printk() messages are output together with other
information such as the pid, the caller, ...
Sometimes a developer just wants the ftrace_printk message left "as is", without
any other information.
This is done by providing a default-off option called printk-msg-only.
To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options`
Before the patch:
<...>-2739 [000] 145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep
<...>-2739 [000] 145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep
After the patch, with the printk-msg-only option enabled:
I'm an ftrace_printk msg in __might_sleep
I'm another ftrace_printk msg in __might_sleep
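A rough user-space sketch of what the option changes in the output path. The struct, its field names, and the format string are illustrative assumptions modeled on the sample lines above, not the tracer's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for one ftrace_printk record. */
struct print_entry {
    const char *comm;      /* task name, "<...>" when unknown */
    int pid;
    int cpu;
    unsigned long secs;    /* timestamp, seconds part */
    unsigned long usecs;   /* timestamp, microseconds part */
    const char *caller;
    const char *msg;
};

/* Format one record; with msg_only set, only the raw message is emitted. */
static int format_entry(char *buf, size_t len,
                        const struct print_entry *e, bool msg_only)
{
    assert(buf != NULL && e != NULL);
    if (msg_only)
        return snprintf(buf, len, "%s\n", e->msg);
    return snprintf(buf, len, "%s-%d [%03d] %lu.%06lu: %s: %s\n",
                    e->comm, e->pid, e->cpu, e->secs, e->usecs,
                    e->caller, e->msg);
}
```

With msg_only false the line carries the pid, cpu, timestamp and caller prefix; with msg_only true the same record collapses to the bare message.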
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 16 Dec 2008 21:08:58 +0000 (22:08 +0100)]
tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
Impact: prevent a trace recursion
After some tests with function graph tracer under x86-32, I saw some recursions
caused by ring_buffer_time_stamp() that calls preempt_enable_no_notrace() which
calls preempt_schedule() which is traced itself.
This patch re-enables preemption without rescheduling.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 16 Dec 2008 11:03:38 +0000 (12:03 +0100)]
Merge branches 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/hw-branch-tracing' into tracing/core
Ingo Molnar [Fri, 12 Dec 2008 11:13:36 +0000 (12:13 +0100)]
tracing/function-graph-tracer: add a new .irqentry.text section, fix
Impact: build fix
32-bit x86 needs this section too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 10 Dec 2008 07:08:22 +0000 (08:08 +0100)]
sched: fix tracepoints in scheduler
The trace point only caught one of many places where a task changes cpu;
put it in the right place so we get all of them.
Change the signature while we're at it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 9 Dec 2008 22:55:25 +0000 (23:55 +0100)]
tracing/function-graph-tracer: Output arrows signal on hardirq call/return
Impact: make hardirq calls more obvious in the output
When a hardirq is triggered inside the code flow, the output now has
two arrows that mark the entry and return of the hardirq.
0) | bit_waitqueue() {
0) 0.880 us | __phys_addr();
0) 2.699 us | }
0) | __wake_up_bit() {
0) ==========> | smp_apic_timer_interrupt() {
0) 0.797 us | native_apic_mem_write();
0) 0.715 us | exit_idle();
0) | irq_enter() {
0) 0.722 us | idle_cpu();
0) 5.519 us | }
0) | hrtimer_interrupt() {
0) | ktime_get() {
0) | ktime_get_ts() {
0) 0.805 us | getnstimeofday();
[...]
0) ! 108.528 us | }
0) | irq_exit() {
0) | do_softirq() {
0) | __do_softirq() {
0) 0.895 us | __local_bh_disable();
0) | run_timer_softirq() {
0) 0.827 us | hrtimer_run_pending();
0) 1.226 us | _spin_lock_irq();
0) | _spin_unlock_irq() {
0) 6.550 us | }
0) 0.924 us | _local_bh_enable();
0) + 12.129 us | }
0) + 13.911 us | }
0) 0.707 us | idle_cpu();
0) + 17.009 us | }
0) ! 137.419 us | }
0) <========== |
0) 1.045 us | }
0) ! 148.908 us | }
0) ! 151.022 us | }
0) ! 153.022 us | }
0) 0.963 us | journal_mark_dirty();
0) 0.925 us | __brelse();
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 9 Dec 2008 22:54:20 +0000 (23:54 +0100)]
tracing/function-graph-tracer: annotate do_IRQ and smp_apic_timer_interrupt
Impact: move most important x86 irq entry-points to a separate subsection
Annotate do_IRQ and smp_apic_timer_interrupt to put them into the .irqentry.text
subsection. These functions will thus be recognized as hardirq entry points by the
function-graph-tracer. Other irq entries could be annotated as well; they are far
less important, but they can be added on request.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Tue, 9 Dec 2008 22:53:16 +0000 (23:53 +0100)]
tracing/function-graph-tracer: add a new .irqentry.text section
Impact: let the function-graph-tracer be aware of the irq entrypoints
Add a new .irqentry.text section to store the irq entry point functions
in a single section. This way, the tracer will be able to signal
an interrupt triggering in its output by recognizing these entry points.
Also, make this section recordable for dynamic tracing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Thu, 11 Dec 2008 15:14:23 +0000 (16:14 +0100)]
tracing/fastboot: include missing headers
For now, include/trace/boot.h doesn't need to include the headers required
by its functions and structures, because the files that include it already
do so.
But boot.h could also be needed by other files in the future.
So this patch adds the necessary headers for future use.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Stephen Rothwell [Thu, 11 Dec 2008 15:10:08 +0000 (16:10 +0100)]
tracing/fastboot: fix len of func buffer
Impact: fix possible stack overrun
This is a port of a patch included in the mainline (KSYM_SYMBOL_LEN fixes).
The current func len is not large enough to hold the maximum symbol length; the
right size is KSYM_SYMBOL_LEN.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 12 Dec 2008 07:21:19 +0000 (08:21 +0100)]
x86, bts: fix build error
Impact: build fix
arch/x86/kernel/ds.c: In function 'ds_request':
arch/x86/kernel/ds.c:236: sorry, unimplemented: inlining failed in call to 'ds_get_context': recursive inlining
but the recursion here is scary ...
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Thu, 11 Dec 2008 12:53:26 +0000 (13:53 +0100)]
x86, bts, ftrace: adapt the hw-branch-tracer to the ds.c interface
Impact: restructure code, cleanup
Remove BTS bits from the hw-branch-tracer (renamed from bts-tracer) and
use the ds interface.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Thu, 11 Dec 2008 12:49:59 +0000 (13:49 +0100)]
x86, bts: provide in-kernel branch-trace interface
Impact: cleanup
Move the BTS bits from ptrace.c into ds.c.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Markus Metzger [Thu, 11 Dec 2008 12:45:23 +0000 (13:45 +0100)]
x86, bts: turn BUG_ON into WARN_ON_ONCE
Impact: make the ds code more debuggable
Turn BUG_ON's into WARN_ON_ONCE.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 12 Dec 2008 06:40:08 +0000 (07:40 +0100)]
Merge branches 'tracing/function-graph-tracer' and 'tracing/ring-buffer' into tracing/core
Ingo Molnar [Mon, 8 Dec 2008 15:55:53 +0000 (16:55 +0100)]
tracing/function-graph-tracer: fix 'flags' variable mismatch
this warning:
kernel/trace/trace.c: In function ‘trace_vprintk’:
kernel/trace/trace.c:3626: warning: ‘flags’ may be used uninitialized in this function
shows some confusion about irq_flags / flags use here. We already have
irq_flags so remove the extra flags variable.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sat, 6 Dec 2008 02:43:41 +0000 (03:43 +0100)]
tracing/function-graph-tracer: append the tracing_graph_flag
Impact: Provide a way to pause the function graph tracer
As suggested by Steven Rostedt, the previous patch that prevented spinlock
functions from being traced shouldn't have used a raw_spinlock to fix it.
It's much better to follow lockdep with a normal spinlock, so this patch
adds a new flag for each task to make it possible to pause the function
graph tracer. We can also send an ftrace_printk without worrying about
the irrelevant traced spinlock during insertion.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sat, 6 Dec 2008 02:41:33 +0000 (03:41 +0100)]
tracing/function-graph-tracer: turn tracing_selftest_running into an int
Impact: cleanup
Apply some suggestions of Steven Rostedt:
- turn tracing_selftest_running into a simple int (no need for an atomic_t)
- set it __read_mostly
- fix a comment style
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Sat, 6 Dec 2008 02:40:00 +0000 (03:40 +0100)]
tracing/function-graph-tracer: introduce __notrace_funcgraph to filter special functions
Impact: trace more functions
When the function graph tracer is configured, three more files are excluded
from tracing just to keep four functions from being traced. This affects the
normal function tracer too.
arch/x86/kernel/process_64/32.c:
I had crashes when I left this file traced. After some debugging, I saw
that the "current" task pointer is changed inside __switch_to(), i.e.
"write_pda(pcurrent, next_p);" in process_64.c. Since the tracer stores
the original return address of the function inside current, we had
crashes. Only __switch_to() has to be excluded from tracing.
kernel/module.c and kernel/extable.c:
Because of a function used internally by the function graph tracer:
__kernel_text_address()
To let the other functions in these files be traced, this patch
introduces the __notrace_funcgraph function prefix, which is __notrace if
the function graph tracer is configured and nothing otherwise.
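A minimal user-space sketch of such a conditional prefix. The spelling of the underlying macro (notrace here), its definition through GCC's no_instrument_function attribute, and the helper in_text_range() are assumptions of the sketch, not taken from the patch.

```c
#include <assert.h>

/* Assumed definition: tell the compiler not to instrument the function. */
#define notrace __attribute__((no_instrument_function))

/* Defined here only to mimic a build with the graph tracer configured. */
#define CONFIG_FUNCTION_GRAPH_TRACER 1

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#define __notrace_funcgraph notrace
#else
#define __notrace_funcgraph            /* expands to nothing */
#endif

/* Hypothetical helper of the kind the graph tracer itself would call. */
static __notrace_funcgraph int in_text_range(unsigned long addr,
                                             unsigned long start,
                                             unsigned long end)
{
    assert(start <= end);
    return addr >= start && addr < end;
}
```

Only the annotated function is left out of tracing; everything else in the same file stays traceable.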
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lai Jiangshan [Mon, 8 Dec 2008 02:58:08 +0000 (10:58 +0800)]
ring_buffer: fix comments
Impact: comments cleanup
fix incorrect comments for enum ring_buffer_type
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Mon, 8 Dec 2008 00:56:06 +0000 (01:56 +0100)]
tracing/function-graph-tracer: implement a print_headers function
Impact: provide trace headers that explain the output a bit
This patch implements the print_headers callback for the function graph
tracer. These headers are output according to the current trace options.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 5 Dec 2008 04:30:56 +0000 (23:30 -0500)]
ftrace: use init_struct_pid as swapper pid
Impact: clean up
Using (struct pid *)-1 as the pointer for ftrace_swapper_pid is
a little confusing for others. This patch uses the address of the
actual init pid structure instead. This change is only for
clarity. It does not affect the code itself. Hopefully soon the
swapper tasks will all have their own pid structure and then
we can clean up the code a bit more.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Thu, 4 Dec 2008 22:51:23 +0000 (23:51 +0100)]
tracing/ftrace: provide the macro task_curr_ret_stack()
Impact: cleanup
As suggested by Steven Rostedt, this patch provides a new macro
task_curr_ret_stack() to move the cpp conditional CONFIG into
the linux/ftrace.h header.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Thu, 4 Dec 2008 22:49:47 +0000 (23:49 +0100)]
tracing/ftrace: fix the check of ftrace_trace_task
Impact: fix default empty traces on function-graph-tracer
The current ftrace_trace_task() checks whether ftrace_pid_trace is allocated
and returns 1 if it is.
If it is NULL, it checks the pid tracing flag bit of the current
task (which is not set by default).
So by default, a task is not traced.
Actually, all tasks should be traced by default, and filtered by pid only when
ftrace_pid_trace is allocated.
The appropriate condition should be to return 1 if filter_by_pid is
set.
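The two behaviours can be sketched in user space as follows. This is inferred from the description above only; struct task, pid_filter and both function names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task pid tracing flag. */
struct task {
    bool trace_flag;
};

/* NULL means that no pid filter has been set. */
static const void *pid_filter;

/* Behaviour described as broken: without a filter nothing is traced. */
static int trace_task_before(const struct task *t)
{
    assert(t != NULL);
    if (pid_filter)
        return 1;
    return t->trace_flag;
}

/* Intended behaviour: trace everything unless a filter is set. */
static int trace_task_after(const struct task *t)
{
    assert(t != NULL);
    if (!pid_filter)
        return 1;
    return t->trace_flag;
}
```

With no filter set and the flag clear, the first variant reports "do not trace" for every task, which matches the empty default traces mentioned in the Impact line.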
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Thu, 4 Dec 2008 22:47:35 +0000 (23:47 +0100)]
tracing/ftrace: don't insert TRACE_PRINT during selftests
Impact: fix false results in tracer selftests
After setting an ftrace_printk somewhere in the kernel, I saw the
function tracer selftest failing.
When a selftest runs, the ring buffer is inspected to see if
some entries were inserted. But concurrent insertions such as
ftrace_printk could occur at the same time and give
false positive or negative results.
This patch prevents TRACE_PRINT entries from being inserted
during selftests.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 5 Dec 2008 13:45:22 +0000 (14:45 +0100)]
Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/urgent' into tracing/core
Frederic Weisbecker [Wed, 3 Dec 2008 22:45:11 +0000 (23:45 +0100)]
tracing/function-graph-tracer: handle ftrace_printk entries
Handle the TRACE_PRINT entries from the function graph tracer
and output them as a C comment just below the function that called
ftrace_printk, as if it were a comment inside this function.
Example with an ftrace_printk inside might_sleep() function:
void __might_sleep(char *file, int line)
{
static unsigned long prev_jiffy; /* ratelimiting */
ftrace_printk("Hi I'm a comment in might_sleep() :-)");
A chunk of a resulting trace:
0) | _reiserfs_free_block() {
0) | reiserfs_read_bitmap_block() {
0) | __bread() {
0) | __getblk() {
0) | __find_get_block() {
0) 0.698 us | mark_page_accessed();
0) 2.267 us | }
0) | __might_sleep() {
0) | /* Hi I'm a comment in might_sleep() :-) */
0) 1.321 us | }
0) 5.872 us | }
0) 7.313 us | }
0) 8.718 us | }
And this patch brings two minor fixes:
- The newline after a switch-out task has disappeared
- The "|" sign just before the cpu number on task-switch has been deleted.
0) 0.616 us | pick_next_task_rt();
0) 1.457 us | _spin_trylock();
0) 0.653 us | _spin_unlock();
0) 0.728 us | _spin_trylock();
0) 0.631 us | _spin_unlock();
0) 0.729 us | native_load_sp0();
0) 0.593 us | native_load_tls();
------------------------------------------
0) cat-2834 => migrati-3
------------------------------------------
0) | finish_task_switch() {
0) 0.841 us | _spin_unlock_irq();
0) 0.616 us | post_schedule_rt();
0) 3.882 us | }
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Liming Wang [Thu, 4 Dec 2008 06:24:49 +0000 (14:24 +0800)]
ftrace: avoid duplicated function when writing set_graph_function
Impact: fix a bug in function filter setting
When writing a function to set_graph_function, we should check whether it
already exists in set_graph_function to avoid duplicates.
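A small user-space sketch of the duplicate check. The table layout, the capacity constant and graph_filter_add() are hypothetical names chosen for illustration; the kernel keeps its own data structure for this.

```c
#include <assert.h>
#include <string.h>

#define MAX_GRAPH_FUNCS 32   /* capacity assumed for this sketch */

/* Hypothetical filter table holding the function names written so far. */
struct graph_filter {
    const char *names[MAX_GRAPH_FUNCS];
    int count;
};

/* Returns 1 if added, 0 if already present, -1 if the table is full. */
static int graph_filter_add(struct graph_filter *f, const char *name)
{
    int i;

    assert(f != NULL && name != NULL);
    for (i = 0; i < f->count; i++)
        if (strcmp(f->names[i], name) == 0)
            return 0;            /* duplicate: leave the table unchanged */
    if (f->count >= MAX_GRAPH_FUNCS)
        return -1;
    f->names[f->count++] = name;
    return 1;
}
```

Writing the same name twice leaves a single entry, so the table is not filled up by repeated writes.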
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 4 Dec 2008 08:18:28 +0000 (09:18 +0100)]
tracing: fix typo and missing inline function
Impact: fix build bugs
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Thu, 4 Dec 2008 05:26:41 +0000 (00:26 -0500)]
ftrace: add ability to only trace swapper tasks
Impact: new feature
This patch lets the swapper tasks of all CPUs be filtered by the
set_ftrace_pid file.
If '0' is echoed into this file, then all the idle tasks (aka swapper)
are flagged to be traced. This affects the idle tasks of all CPUs.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Thu, 4 Dec 2008 05:26:40 +0000 (00:26 -0500)]
ftrace: use struct pid
Impact: clean up, extend PID filtering to PID namespaces
Eric Biederman suggested using the struct pid for filtering on
pids in the kernel. This patch is based off of a demonstration
of an implementation that Eric sent me in an email.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Thu, 4 Dec 2008 05:26:39 +0000 (00:26 -0500)]
pid: fix the do_each_pid_task() macro
Impact: macro side-effects fix
This patch adds parentheses around 'pid' in the do_each_pid_task
macro to allow callers to pass in more complex parameters.
e.g. do_each_pid_task(*pid, type, task)
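The effect of the missing parentheses can be shown with a much smaller macro. This is a sketch of the general pitfall, not the do_each_pid_task macro itself; struct pid_like, PID_NR and nr_via_double_pointer() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pid; only the field used here. */
struct pid_like {
    int nr;
};

/*
 * Without parentheses around the parameter, PID_NR_BAD(*pp) would expand
 * to (*pp->nr), which parses as *(pp->nr) and does not compile when pp is
 * a pointer to a pointer:
 *
 *   #define PID_NR_BAD(pid) (pid->nr)
 */
#define PID_NR(pid) ((pid)->nr)

/* Passing a dereference expression works because the macro wraps it. */
static int nr_via_double_pointer(struct pid_like **pp)
{
    assert(pp != NULL && *pp != NULL);
    return PID_NR(*pp);
}
```

The '->' operator binds tighter than unary '*', which is why the argument has to be wrapped before the member access.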
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 20:36:59 +0000 (15:36 -0500)]
ftrace: trace single pid for function graph tracer
Impact: New feature
This patch makes the changes to set_ftrace_pid apply to the function
graph tracer.
# echo $$ > /debugfs/tracing/set_ftrace_pid
# echo function_graph > /debugfs/tracing/current_tracer
Will cause only the current task to be traced. Note, the trace flags are
also inherited by child processes, so the children of the shell
will also be traced.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 20:36:58 +0000 (15:36 -0500)]
ftrace: use task struct trace flag to filter on pid
Impact: clean up
Use the new task struct trace flags to determine if a process should be
traced or not.
Note: this moves the searching of the pid to the slow path of setting
the pid field. This needs to be converted to the pid name space.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 20:36:57 +0000 (15:36 -0500)]
ftrace: graph of a single function
This patch adds the file:
/debugfs/tracing/set_graph_function
which can be used along with the function graph tracer.
When this file is empty, the function graph tracer will act as
usual. When the file has a function in it, the function graph
tracer will only trace that function.
For example:
# echo blk_unplug > /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/trace
[...]
------------------------------------------
| 2) make-19003 => kjournald-2219
------------------------------------------
2) | blk_unplug() {
2) | dm_unplug_all() {
2) | dm_get_table() {
2) 1.381 us | _read_lock();
2) 0.911 us | dm_table_get();
2) 1. 76 us | _read_unlock();
2) + 12.912 us | }
2) | dm_table_unplug_all() {
2) | blk_unplug() {
2) 0.778 us | generic_unplug_device();
2) 2.409 us | }
2) 5.992 us | }
2) 0.813 us | dm_table_put();
2) + 29. 90 us | }
2) + 34.532 us | }
You can add up to 32 functions into this file. Currently we limit it
to 32, but this may change with later improvements.
To add another function, use the append '>>':
# echo sys_read >> /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/set_graph_function
blk_unplug
sys_read
Using the '>' will clear out the function and write anew:
# echo sys_write > /debug/tracing/set_graph_function
# cat /debug/tracing/set_graph_function
sys_write
Note, if you have function graph running while doing this, the small
time between clearing it and updating it will cause the graph to
record all functions. This should not be an issue because after
it sets the filter, only those functions will be recorded from then on.
If you need to only record a particular function then set this
file first before starting the function graph tracer. In the future
this side effect may be corrected.
The set_graph_function file is similar to set_ftrace_filter, but
it does not take wildcards, nor does it allow more than one
function to be set with a single write. There is no technical reason for
this; I just have not had the time to implement it yet.
Note, dynamic ftrace must be enabled for this to appear because it
uses the dynamic ftrace records to match the name to the mcount
call sites.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 4 Dec 2008 08:07:44 +0000 (09:07 +0100)]
Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core
Ingo Molnar [Thu, 4 Dec 2008 08:07:19 +0000 (09:07 +0100)]
Merge commit 'v2.6.28-rc7' into tracing/core
Linus Torvalds [Thu, 4 Dec 2008 00:45:56 +0000 (16:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix setting of max_segment_size and seg_boundary mask
block: internal dequeue shouldn't start timer
block: set disk->node_id before it's being used
When block layer fails to map iov, it calls bio_unmap_user to undo
Linus Torvalds [Thu, 4 Dec 2008 00:41:15 +0000 (16:41 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc/83xx: Fix MCU support merge issue in mpc8349emitx.dts
powerpc: Fix dma_map_sg() cache flushing on non coherent platforms
Linus Torvalds [Thu, 4 Dec 2008 00:40:37 +0000 (16:40 -0800)]
Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
NLM: client-side nlm_lookup_host() should avoid matching on srcaddr
nfsd: use of unitialized list head on error exit in nfs4recover.c
Add a reference to sunrpc in svc_addsock
nfsd: clean up grace period on early exit
Linus Torvalds [Thu, 4 Dec 2008 00:20:19 +0000 (16:20 -0800)]
iTCO_wdt: fix typo when setting TCO_EN bit
The code used '&= 0x00002000' when it tried to set the TCO_EN bit, which
obviously didn't set that bit at all, but instead just reset all the
other bits in the SMI_EN register.
This bug seemingly caused various random behavior, with Frans Pop
reporting that X.org just silently hung at startup and Rafael Wysocki
reporting the fan spinning at full speed.
See
http://lkml.org/lkml/2008/12/3/178
http://bugzilla.kernel.org/show_bug.cgi?id=12162
The problem seems to have been triggered by "[WATCHDOG] iTCO_wdt :
problem with rebooting on new ICH9 based motherboards" (commit
7cd5b08be3c489df11b559fef210b81133764ad4), but the bogus code existed
before that too (in the "supermicro_old_pre_stop()" function), it just
apparently never showed up due to different logic.
In that commit the broken code got moved around and now gets executed
much more.
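The difference between the two operators is easy to see on a sample register value. The sketch below only uses the mask quoted above; the function names and the sample value 0x33 are made up for illustration, and no real hardware register is touched.

```c
#include <assert.h>
#include <stdint.h>

#define TCO_EN 0x00002000u   /* mask quoted in the commit message */

/* Buggy form: clears every other bit and never sets TCO_EN. */
static uint32_t set_tco_en_buggy(uint32_t smi_en)
{
    smi_en &= TCO_EN;
    return smi_en;
}

/* Fixed form: sets TCO_EN and keeps the other bits as they were. */
static uint32_t set_tco_en_fixed(uint32_t smi_en)
{
    smi_en |= TCO_EN;
    return smi_en;
}

/* Quick self-check of both variants on an arbitrary sample value. */
static void check_tco_en(void)
{
    assert(set_tco_en_buggy(0x00000033u) == 0x00000000u);
    assert(set_tco_en_fixed(0x00000033u) == 0x00002033u);
}
```

With '&=' the sample value loses all of its enabled bits, which is consistent with the unrelated breakage reported above.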
Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Frans Pop <elendil@planet.nl>
Cc: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt [Wed, 3 Dec 2008 16:04:51 +0000 (11:04 -0500)]
ftrace: fix race in function graph during fork
Impact: graph tracer race/crash fix
There is a nasty race in startup of a new process running the
function graph tracer. In fork.c:
total_forks++;
spin_unlock(&current->sighand->siglock);
write_unlock_irq(&tasklist_lock);
ftrace_graph_init_task(p);
proc_fork_connector(p);
cgroup_post_fork(p);
return p;
The new task is free to run as soon as the tasklist_lock is released.
This is before the ftrace_graph_init_task. If the task does run
it will be using the same ret_stack and curr_ret_stack as the parent.
This will cause crashes that are difficult to debug.
This patch moves the ftrace_graph_init_task to just after the alloc_pid
code. This fixes the above race.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 16:04:50 +0000 (11:04 -0500)]
trace: fix output of stack trace
Impact: fix to output of stack trace
If a function is not found in the stack of the stack tracer, the
number printed is quite strange. This fixes the algorithm to handle
missing functions better.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Anton Vorontsov [Thu, 27 Nov 2008 17:36:45 +0000 (20:36 +0300)]
powerpc/83xx: Fix MCU support merge issue in mpc8349emitx.dts
Just found the merge issue in
442746989d92afc125040e0f29b33602ad94da99
("powerpc/83xx: Add support for MCU microcontroller in .dts files"):
the commit adds the MCU controller node into the DMA node, which is
wrong because the MCU sits on the I2C bus. Fix this by moving the MCU
node into the I2C controller node.
The original patch[1] was OK though. ;-)
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Milan Broz [Wed, 3 Dec 2008 11:55:08 +0000 (12:55 +0100)]
block: fix setting of max_segment_size and seg_boundary mask
Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
devices.
When stacking devices (LVM over MD over SCSI), some of the request queue
parameters are not set up correctly by default in some cases, namely
max_segment_size and seg_boundary mask.
If you create an MD device over SCSI, these attributes are zeroed.
The problem appears when another device-mapper mapping is stacked on top
of this one; the queue attributes are then set in DM this way:
request_queue   max_segment_size  seg_boundary_mask
SCSI            65536             0xffffffff
MD RAID1        0                 0
LVM             65536             -1 (64bit)
Unfortunately, bio_add_page() (and bio_phys_segments(), respectively) calculates
the number of physical segments according to these parameters.
During generic_make_request() the segment count is recalculated and can
push the bio->bi_phys_segments count over the allowed limit. (After
bio_clone() in a stacking operation.)
This is especially a problem in the CCISS driver, where it produces an oops here:
BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
(MAXSGENTRIES is 31 by default.)
Sometimes even this command is enough to cause an oops:
dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10
This command generates bios with 250 sectors, allocated in 32 4k-pages
(the last page uses only 1024 bytes).
In the LVM layer, a bio with 31 segments is allocated (still OK for CCISS);
unfortunately, on the lower layer it is recalculated to 32 segments, which
violates the CCISS restriction and triggers the BUG_ON().
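The numbers quoted for the dd command can be checked with a few lines of arithmetic. This sketch assumes 512-byte sectors, 4096-byte pages and a page-aligned buffer; struct request_shape and shape_of() are hypothetical helpers, not block layer code.

```c
#include <assert.h>

#define SECTOR_BYTES 512u
#define PAGE_BYTES   4096u

/* Hypothetical summary of how one request of a given size is laid out. */
struct request_shape {
    unsigned int sectors;
    unsigned int pages;
    unsigned int last_page_bytes;
};

/* Split a page-aligned request of 'bytes' bytes into sectors and pages. */
static struct request_shape shape_of(unsigned int bytes)
{
    struct request_shape s;

    assert(bytes > 0);
    s.sectors = bytes / SECTOR_BYTES;
    s.pages = (bytes + PAGE_BYTES - 1) / PAGE_BYTES;          /* round up */
    s.last_page_bytes = bytes - (s.pages - 1) * PAGE_BYTES;
    return s;
}
```

For bs=128000 this gives 250 sectors and 32 pages with 1024 bytes in the last one, so one segment per page is already one more than the limit of 31.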
The patch tries to fix it by:
* initializing the attributes above in the request queue constructor
blk_queue_make_request()
* making sure that blk_queue_stack_limits() inherits the settings
(DM uses its own function to set the limits because
blk_queue_stack_limits() was introduced later. It should probably switch
to the generic stack limit function too.)
* setting the default seg_boundary value in one place (blkdev.h)
* using this mask as the default in DM (instead of -1, which differs on 64bit)
Bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=471639
http://bugzilla.kernel.org/show_bug.cgi?id=8672
Signed-off-by: Milan Broz <mbroz@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Wed, 3 Dec 2008 11:41:26 +0000 (12:41 +0100)]
block: internal dequeue shouldn't start timer
blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
both start the timeout timer. Barrier code dequeues the original
barrier request but doesn't pass the request itself to the lower level
driver, only broken down proxy requests; however, the original
barrier request goes through the same dequeue path, so the timeout timer is
started on it. If the barrier sequence takes long enough, this timer
expires but the low level driver has no idea about this request and
an oops follows.
The timeout timer shouldn't have been started on the original barrier
request as it never goes through actual IO. This patch unexports
elv_dequeue_request(), which has no external user anyway, and makes it
operate on the elevator proper w/o adding the timer, and makes
blkdev_dequeue_request() call elv_dequeue_request() and add the timer.
Internal users which don't pass the request to the driver - barrier code
and end_that_request_last() - are converted to use
elv_dequeue_request().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Cheng Renquan [Thu, 20 Nov 2008 07:37:37 +0000 (08:37 +0100)]
block: set disk->node_id before it's being used
disk->node_id is referenced when allocating in disk_expand_part_tbl(), so we
should set it before it is used.
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Petr Vandrovec [Wed, 19 Nov 2008 10:12:14 +0000 (11:12 +0100)]
When block layer fails to map iov, it calls bio_unmap_user to undo
mapping. Which is good if pages were mapped - but if they were provided
by someone else and just copied then bad things happen - pages are
released once here, and once by caller, leading to user triggerable BUG
at include/linux/mm.h:246.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Ingo Molnar [Wed, 3 Dec 2008 09:33:58 +0000 (10:33 +0100)]
tracing/function-graph-tracer: enabled by default
CONFIG_FUNCTION_GRAPH_TRACER already depends on FUNCTION_TRACER
(which makes it non-default), so making it default-n is pointless.
So enable it by default - it's a nice extension of the function tracer.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Wed, 3 Dec 2008 01:32:12 +0000 (02:32 +0100)]
tracing/function-graph-tracer: improve duration output
Impact: better trace output of duration for long calls
The old duration output could not exceed 9999.999 us, to fit the column,
and the nanosecs always used 3 digits. As Ingo suggested, it's better
to show the whole elapsed microseconds and reduce the nanosecs precision
if needed to fit the maximum of 7 digits. If the usecs need even more digits,
the case should be rare and important enough to break the column alignment a
bit to show it.
So, depending on the duration value, we now have these patterns:
u.nnn us
uu.nnn us
uuu.nnn us
uuuu.nnn us
uuuuu.nn us
uuuuuu.n us
uuuuuuuu..... us
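One way to produce these patterns is sketched below in user-space C. It follows the table above (7 digits in total, fewer nanosecond digits as the microsecond part grows) but is not the tracer's implementation; format_duration() and its signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Format a duration given in nanoseconds, keeping at most 7 digits. */
static int format_duration(char *buf, size_t len, unsigned long long ns)
{
    unsigned long long usecs = ns / 1000;
    unsigned long long rem = ns % 1000;     /* nanoseconds below 1 us */
    unsigned long long tmp = usecs;
    int digits = 1;

    assert(buf != NULL);

    /* Count the decimal digits of the microsecond part. */
    while (tmp >= 10) {
        tmp /= 10;
        digits++;
    }

    if (digits <= 4)                         /* u.nnn up to uuuu.nnn */
        return snprintf(buf, len, "%llu.%03llu us", usecs, rem);
    if (digits == 5)                         /* uuuuu.nn */
        return snprintf(buf, len, "%llu.%02llu us", usecs, rem / 10);
    if (digits == 6)                         /* uuuuuu.n */
        return snprintf(buf, len, "%llu.%01llu us", usecs, rem / 100);
    return snprintf(buf, len, "%llu us", usecs);  /* too long: usecs only */
}
```

The sketch truncates the dropped nanosecond digits rather than rounding them; whether the tracer rounds is not stated here.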
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Frederic Weisbecker [Wed, 3 Dec 2008 01:30:37 +0000 (02:30 +0100)]
tracing/function-graph-tracer: display unified style cmdline and pid
Impact: extend function-graph output: let one know which thread called a function
This patch implements a helper function to print the cmdline/pid pair.
Its output is provided during task switching, and on each row if the new
"funcgraph-proc" default-off option is set through the trace_options file.
The output is center-aligned and never exceeds 14 characters. The cmdline
is truncated to 7 chars.
But note that if the pid exceeds 6 characters, the column will overflow (but
that situation is abnormal).
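A user-space sketch of such a helper, assuming a 14-character column and a 7-character cmdline limit as described. The exact split of the padding and the clipping of over-long labels are choices of this sketch (the text above says the real column overflows instead); format_proc() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROC_COL_WIDTH 14
#define COMM_MAX       7

/* Write "comm-pid", centered in a 14-character column, into out. */
static void format_proc(char out[PROC_COL_WIDTH + 1], const char *comm, int pid)
{
    char label[32];
    int len, left;

    assert(out != NULL && comm != NULL);

    /* Truncate the command name to 7 characters, then append the pid. */
    len = snprintf(label, sizeof(label), "%.*s-%d", COMM_MAX, comm, pid);
    if (len > PROC_COL_WIDTH)
        len = PROC_COL_WIDTH;    /* sketch only: clip instead of overflowing */

    left = (PROC_COL_WIDTH - len) / 2;
    memset(out, ' ', PROC_COL_WIDTH);
    memcpy(out + left, label, (size_t)len);
    out[PROC_COL_WIDTH] = '\0';
}
```

A task named "migration/0" with pid 3 is rendered as "migrati-3" surrounded by padding, in the same truncated style as the task-switch headers elsewhere in this log.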
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:06 +0000 (23:50 -0500)]
ftrace: add checks on ret stack in function graph
Impact: robustness checks
Add more checks in the function graph code to detect errors and
perhaps print out better information if a bug happens.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:05 +0000 (23:50 -0500)]
ftrace: function graph return for function entry
Impact: feature, let entry function decide to trace or not
This patch lets the graph tracer entry function decide whether the return
should be traced as well. This requires every function graph
entry function to return 1 if the return should be traced, or 0 if it should
not.
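The contract can be sketched in user space like this. The callback type, the small return stack and both function names are hypothetical; the sketch only shows that the return hook is installed when, and only when, the entry callback returns nonzero.

```c
#include <assert.h>
#include <stddef.h>

#define RET_STACK_DEPTH 8   /* arbitrary depth for this sketch */

/* Hypothetical entry callback: nonzero means "trace the return too". */
typedef int (*graph_entry_fn)(const char *func);

/* Hypothetical per-task stack of functions whose return is hooked. */
struct ret_stack {
    const char *funcs[RET_STACK_DEPTH];
    int depth;
};

/* Called on function entry; returns 1 if a return hook was installed. */
static int on_function_entry(struct ret_stack *rs, graph_entry_fn entry,
                             const char *func)
{
    assert(rs != NULL && entry != NULL && func != NULL);
    if (!entry(func))
        return 0;               /* entry declined: skip the return hook */
    if (rs->depth >= RET_STACK_DEPTH)
        return 0;               /* no room left to record the return */
    rs->funcs[rs->depth++] = func;
    return 1;
}

/* Example callback: only functions starting with 'b' are of interest. */
static int trace_only_b(const char *func)
{
    return func[0] == 'b';
}
```

A filter such as the single-function graph described elsewhere in this log fits this shape: the entry callback says no, and nothing is recorded for the return.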
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:04 +0000 (23:50 -0500)]
ftrace: print real return in dumpstack for function graph
Impact: better dumpstack output
I noticed in my crash dumps and even in the stack tracer that a
lot of functions listed in the stack trace are simply
return_to_handler, which is ftrace graph's way of inserting its own
call into the return of a function.
But we lose track of where the actual function was called from.
This patch adds hooks to the dumpstack mechanism that detect
this and find the real function to print. Both are printed to
let the user know that a hook is still in place.
This does give a funny side effect in the stack tracer output:
        Depth   Size      Location    (80 entries)
        -----   ----      --------
  0)     4144      48   save_stack_trace+0x2f/0x4d
  1)     4096     128   ftrace_call+0x5/0x2b
  2)     3968      16   mempool_alloc_slab+0x16/0x18
  3)     3952     384   return_to_handler+0x0/0x73
  4)     3568    -240   stack_trace_call+0x11d/0x209
  5)     3808     144   return_to_handler+0x0/0x73
  6)     3664    -128   mempool_alloc+0x4d/0xfe
  7)     3792     128   return_to_handler+0x0/0x73
  8)     3664     -32   scsi_sg_alloc+0x48/0x4a [scsi_mod]
As you can see, the sizes reported for the real functions are now negative.
This is due to them not being found inside the stack.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:03 +0000 (23:50 -0500)]
ring-buffer: change "page" variable names to "bpage"
Impact: clean up
Andrew Morton pointed out that, by kernel convention, a variable
named "page" should be of type struct page. The ring buffer uses
a variable named "page" for a pointer to something else.
This patch converts those to be called "bpage" (as in "buffer page").
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:02 +0000 (23:50 -0500)]
ftrace: add ftrace_graph_stop()
Impact: new ftrace_graph_stop function
While developing more features of function graph, I hit a bug that
caused the WARN_ON to trigger in the prepare_ftrace_return function.
Well, it was hard for me to find out what was happening because the
bug would not print, it would just cause a hard lockup or reboot.
The reason is that it is not safe to call printk from this function.
Looking further, I also found that it calls unregister_ftrace_graph,
which grabs a mutex and calls kstop machine. This would definitely
lock the box up if it were to trigger.
This patch adds a fast and safe ftrace_graph_stop() which will
stop the function tracer. Then it is safe to call the WARN_ON.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:09 +0000 (15:34 -0500)]
ftrace: have function graph use mcount caller address
Impact: consistency change for function graph
This patch makes function graph record the mcount caller address
the same way the function tracer does.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:08 +0000 (15:34 -0500)]
ftrace: clean up function graph asm
Impact: clean up
Macros exist for x86 asm that handle both x86_64 and i386.
This patch updates function graph asm to use them.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:07 +0000 (15:34 -0500)]
ring-buffer: read page interface
Impact: new API to ring buffer
This patch adds a new interface into the ring buffer that allows a
page to be read from the ring buffer on a given CPU. For every page
read, one must also be given to allow for a "swap" of the pages.
        rpage = ring_buffer_alloc_read_page(buffer);
        if (!rpage)
                goto err;
        ret = ring_buffer_read_page(buffer, &rpage, cpu, full);
        if (!ret)
                goto empty;

        process_page(rpage);
        ring_buffer_free_read_page(rpage);
The caller of these functions must handle any waits that are
needed to wait for new data. The ring_buffer_read_page will simply
return 0 if there is no data, or if "full" is set and the writer
is still on the current page.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:06 +0000 (15:34 -0500)]
ring-buffer: move some metadata into buffer page
Impact: get ready for splice changes
This patch moves the commit and timestamp into the beginning of each
data page of the buffer. This change will allow the page to be moved
to another location (disk, network, etc) and still have information
in the page to be able to read it.
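The idea of a self-describing page can be sketched as follows. The struct layout, field widths, and the 4096-byte page size are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096	/* assumed page size for this sketch */

/* Hypothetical layout: the metadata lives at the start of the page itself. */
struct data_page {
	uint64_t time_stamp;	/* timestamp associated with the page */
	uint64_t commit;	/* number of valid bytes in data[] */
	unsigned char data[PAGE_BYTES - 2 * sizeof(uint64_t)];
};

/* Append len bytes to the page; returns 0 on success, -1 if it does not fit. */
static int page_append(struct data_page *p, const void *src, size_t len)
{
	if (len > sizeof(p->data) - p->commit)
		return -1;
	memcpy(p->data + p->commit, src, len);
	p->commit += len;
	return 0;
}
```

Because the timestamp and commit count travel with the data, a raw copy of the page (to a file or a socket, for example) can still be decoded without access to the ring buffer's own bookkeeping.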
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:05 +0000 (15:34 -0500)]
ftrace: replace raw_local_irq_save with local_irq_save
Impact: fix for lockdep and ftrace
The raw_local_irq_save/restore confuses lockdep. This patch
converts them to the local_irq_save/restore variants.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 3 Dec 2008 07:54:47 +0000 (08:54 +0100)]
Merge commit 'v2.6.28-rc7'; branch 'x86/dumpstack' into tracing/ftrace
Merge x86/dumpstack into tracing/ftrace because upcoming ftrace changes
depend on cleanups already in x86/dumpstack.
Also merge to latest upstream -rc.
Ingo Molnar [Wed, 3 Dec 2008 07:49:21 +0000 (08:49 +0100)]
Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core
Benjamin Herrenschmidt [Sun, 30 Nov 2008 18:53:40 +0000 (18:53 +0000)]
powerpc: Fix dma_map_sg() cache flushing on non coherent platforms
On PowerPC 4xx or other non cache-coherent platforms, we lost the
appropriate cache flushing in dma_map_sg() when merging the 32 and
64-bit DMA code (commit 4fc665b88a79a45bae8bbf3a05563c27c7337c3d,
"powerpc: Merge 32 and 64-bit dma code"). This restores it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Linus Torvalds [Tue, 2 Dec 2008 23:58:20 +0000 (15:58 -0800)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] hpwdt: Fix kdump when using hpwdt
[WATCHDOG] hpwdt: set the mapped BIOS address space as executable
[WATCHDOG] iTCO_wdt: add PCI ID's for ICH9 & ICH10 chipsets
[WATCHDOG] iTCO_wdt : correct status clearing
[WATCHDOG] iTCO_wdt : problem with rebooting on new ICH9 based motherboards
[WATCHDOG] fix mtx1_wdt compilation failure
Linus Torvalds [Tue, 2 Dec 2008 23:56:55 +0000 (15:56 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
* 'linux-next' of git://git.infradead.org/ubifs-2.6:
UBIFS: pre-allocate bulk-read buffer
UBIFS: do not allocate too much
UBIFS: do not print scary memory allocation warnings
UBIFS: allow for gaps when dirtying the LPT
UBIFS: fix compilation warnings
MAINTAINERS: change UBI/UBIFS git tree URLs
UBIFS: endian handling fixes and annotations
UBIFS: remove printk
Linus Torvalds [Tue, 2 Dec 2008 23:56:17 +0000 (15:56 -0800)]
Merge branch 'kvm-updates/2.6.28' of git://git./linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: MMU: avoid creation of unreachable pages in the shadow
KVM: ppc: stop leaking host memory on VM exit
KVM: MMU: fix sync of ptes addressed at owner pagetable
KVM: ia64: Fix: Use correct calling convention for PAL_VPS_RESUME_HANDLER
KVM: ia64: Fix incorrect kbuild CFLAGS override
KVM: VMX: Fix interrupt loss during race with NMI
KVM: s390: Fix problem state handling in guest sigp handler
Linus Torvalds [Tue, 2 Dec 2008 23:55:43 +0000 (15:55 -0800)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Fix offset calculation in compute_size()
rtc: rtc-starfire fixes
Linus Torvalds [Tue, 2 Dec 2008 23:55:05 +0000 (15:55 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
MAINTAINERS: add netdev to ATM
ATM: horizon, fix hrz_probe fail path
pppol2tp: Add missing sock_put() in pppol2tp_release()
net: Fix soft lockups/OOM issues w/ unix garbage collector
macvlan: don't broadcast PAUSE frames to macvlan devices
Phonet: fix oops in phonet_address_del() on non-Phonet device
netfilter: ctnetlink: fix GFP_KERNEL allocation under spinlock
sungem: Fix PCS_MIICTRL register write in gem_init_phy().
net: make skb_truesize_bug() call WARN()
net: hp-plus uses eip_poll
net/wireless/reg.c: fix bad WARN_ON in if statement
ath5k: disable beacon filter when station is not associated
ath5k: fix Security issue in DebugFS part of ath5k
ath9k: correct expected max RX buffer size
ath9k: Fix SW-IOMMU bounce buffer starvation
mac80211 : Fix setting ad-hoc mode and non-ibss channel
iwlagn: fix DMA sync
phylib: Add Vitesse VSC8221 SGMII PHY
rose: zero length frame filtering in af_rose.c
bridge: netfilter: fix update_pmtu crash with GRE
...
Linus Torvalds [Tue, 2 Dec 2008 23:53:41 +0000 (15:53 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Update defconfigs for 2.6.28-rc7
macfb: Do not overflow fb_fix_screeninfo.id
Linus Torvalds [Tue, 2 Dec 2008 23:53:10 +0000 (15:53 -0800)]
Merge git://git./linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
alim15x3: fix sparse warning
ide: remove dead code from drive_is_ready()
ide: fix build for DEBUG_PM
ide: respect current DMA setting during resume
ide: add SAMSUNG SP0822N with firmware WA100-10 to ivb_list[]
amd74xx: workaround unreliable AltStatus register for nVidia controllers
ide: fix the ide_release_lock imbalance
Linus Torvalds [Tue, 2 Dec 2008 23:52:28 +0000 (15:52 -0800)]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] stex: switch to block timeout
[SCSI] make scsi_eh_try_stu use block timeout
[SCSI] megaraid_sas: switch to block timeout
[SCSI] ibmvscsi: switch to block timeout
[SCSI] aacraid: switch to block timeout
[SCSI] zfcp: prevent double decrement on host_busy while being busy
[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP
[SCSI] zfcp: eliminate race between validation and locking
[SCSI] zfcp: verify for correct rport state before scanning for SCSI devs
[SCSI] zfcp: returning an ERR_PTR where a NULL value is expected
[SCSI] zfcp: Fix opening of wka ports
[SCSI] zfcp: fix remote port status check
[SCSI] fc_transport: fix old bug on bitflag definitions
[SCSI] Fix hang in starved list processing
Mark Salter [Tue, 2 Dec 2008 14:38:09 +0000 (14:38 +0000)]
MN10300: Fix application of kernel module relocations
This fixes the MN10300 kernel module linking to match the toolchain. RELA
relocs don't use the value at the location being relocated. This has been
working because the tools always leave the value at the target location
cleared.
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dean Nelson [Tue, 2 Dec 2008 14:06:01 +0000 (08:06 -0600)]
sgi-gru: call fs_initcall() if statically linked
If xpc.ko and gru.ko are both statically linked into the kernel, then
xpc_init() can get called before gru_init() and make a call to one of the
gru's exported functions before the gru has initialized itself. The end
result is a NULL dereference.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kumar Gala [Tue, 2 Dec 2008 19:37:01 +0000 (13:37 -0600)]
powerpc: Use physical cpu id when setting the processor affinity
In the CONFIG_SMP case the irq_choose_cpu() code was returning back
a logical cpu id not the physical id. We were writing that directly
into the HW register.
We need to be calling get_hard_smp_processor_id() so irq_choose_cpu()
always returns a physical cpu id.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Tue, 2 Dec 2008 18:31:52 +0000 (10:31 -0800)]
vmscan: evict streaming IO first
Count the insertion of new pages in the statistics used to drive the
pageout scanning code. This should help the kernel quickly evict
streaming file IO.
We count on the fact that new file pages start on the inactive file LRU
and new anonymous pages start on the active anon list. This means
streaming file IO will increment the recent scanned file statistic, while
leaving the recent rotated file statistic alone, driving pageout scanning
to the file LRUs.
Pageout activity does its own list manipulation.
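The accounting described above can be modelled in a few lines of user-space C. The struct, the function names, and the percentage formula are assumptions for illustration only; the real counters live in kernel structures and feed a more involved calculation.

```c
enum { LRU_ANON = 0, LRU_FILE = 1 };

/* Hypothetical counters, indexed by LRU_ANON / LRU_FILE. */
struct reclaim_stat {
	unsigned long recent_scanned[2];
	unsigned long recent_rotated[2];
};

/*
 * Account for a newly inserted page. Every new page counts as scanned;
 * only pages that start on an active list also count as rotated.
 */
static void count_new_page(struct reclaim_stat *rs, int file, int active)
{
	rs->recent_scanned[file]++;
	if (active)
		rs->recent_rotated[file]++;
}

/* Share of recently scanned pages that were rotated, in percent. */
static unsigned long rotated_percent(const struct reclaim_stat *rs, int file)
{
	/* +1 avoids a division by zero in this sketch. */
	return rs->recent_rotated[file] * 100 / (rs->recent_scanned[file] + 1);
}
```

In this model a burst of streaming file pages (inactive on insertion) raises recent_scanned for the file LRU while its rotated share stays at zero, which is the signal the commit message relies on to steer pageout scanning toward the file LRUs.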
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Gene Heskett <gene.heskett@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kay Sievers [Tue, 2 Dec 2008 18:31:50 +0000 (10:31 -0800)]
bdi: register sysfs bdi device only once per queue
Devices which share the same queue, like floppies and mtd devices, get
registered multiple times in the bdi interface, but bdi accounts only the
last registered device of the devices sharing one queue.
On remove, all earlier registered devices leak, stay around in sysfs, and
cause "duplicate filename" errors if the devices are re-created.
This prevents the creation of multiple bdi interfaces per queue, and the
bdi device will carry the dev_t name of the block device which is the
first one registered, of the pool of devices using the same queue.
[akpm@linux-foundation.org: add a WARN_ON so we know which drivers are misbehaving]
Tested-by: Peter Korsgaard <jacmet@sunsite.dk>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junjiro R. Okajima [Tue, 2 Dec 2008 18:31:46 +0000 (10:31 -0800)]
nfsd: fix vm overcommit crash fix #2
The previous patch from Alan Cox ("nfsd: fix vm overcommit crash",
commit 731572d39fcd3498702eda4600db4c43d51e0b26) fixed the problem where
knfsd crashes on exported shmemfs objects when strict overcommit is set.
But the patch did not handle the case where CONFIG_SECURITY is
disabled.
This patch copies a part of his fix which is mainly for detecting a bug
earlier.
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Tue, 2 Dec 2008 19:58:26 +0000 (20:58 +0100)]
m68k: Update defconfigs for 2.6.28-rc7
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Hannes Eder [Tue, 2 Dec 2008 19:40:04 +0000 (20:40 +0100)]
alim15x3: fix sparse warning
Fix this sparse warning:
drivers/ide/alim15x3.c:594:2: warning: returning void-valued expression
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Bartlomiej Zolnierkiewicz [Tue, 2 Dec 2008 19:40:04 +0000 (20:40 +0100)]
ide: remove dead code from drive_is_ready()
We guarantee 400ns delay at the time of issuing the command.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Bartlomiej Zolnierkiewicz [Tue, 2 Dec 2008 19:40:03 +0000 (20:40 +0100)]
ide: fix build for DEBUG_PM
Also while at it:
* Drop unused arguments from ide_complete_power_step().
* Move DEBUG_PM printk() from ide_end_drive_cmd() to
ide_complete_power_step().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Bartlomiej Zolnierkiewicz [Tue, 2 Dec 2008 19:40:03 +0000 (20:40 +0100)]
ide: respect current DMA setting during resume
Respect current DMA setting during resume, otherwise PIO timings
may get destroyed if host uses shared PIO/MWDMA timings.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Bartlomiej Zolnierkiewicz [Tue, 2 Dec 2008 19:40:03 +0000 (20:40 +0100)]
ide: add SAMSUNG SP0822N with firmware WA100-10 to ivb_list[]
Should fix kernel.org bug #10225:
http://bugzilla.kernel.org/show_bug.cgi?id=10225
Reported-by: Matthias B. <haferfrost@web.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Bartlomiej Zolnierkiewicz [Tue, 2 Dec 2008 19:40:03 +0000 (20:40 +0100)]
amd74xx: workaround unreliable AltStatus register for nVidia controllers
It seems that on some nVidia controllers using AltStatus register
can be unreliable so default to Status register if the PCI device
is in Compatibility Mode. In order to achieve this:
* Add ide_pci_is_in_compatibility_mode() inline helper to <linux/ide.h>.
* Add IDE_HFLAG_BROKEN_ALTSTATUS host flag and set it in amd74xx host
driver for nVidia controllers in Compatibility Mode.
* Teach actual_try_to_identify() and drive_is_ready() about the new flag.
This fixes the regression caused by removal of CONFIG_IDEPCI_SHARE_IRQ
config option in 2.6.25 and using AltStatus register unconditionally when
available (kernel.org bugs #11659 and #10216).
[ Moreover for CONFIG_IDEPCI_SHARE_IRQ=y (which is what most people
and distributions use) it never worked correctly. ]
Thanks to Remy LABENE and Lars Winterfeld for help with debugging the problem.
More info at:
http://bugzilla.kernel.org/show_bug.cgi?id=11659
http://bugzilla.kernel.org/show_bug.cgi?id=10216
Reported-by: Remy LABENE <remy.labene@free.fr>
Tested-by: Remy LABENE <remy.labene@free.fr>
Tested-by: Lars Winterfeld <lars.winterfeld@tu-ilmenau.de>
Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Michael Schmitz [Tue, 2 Dec 2008 19:40:02 +0000 (20:40 +0100)]
ide: fix the ide_release_lock imbalance
ide_release_lock() spits out lots of:
ide_release_lock: bug
warnings on Atari Falcon.
Fix the ide_release_lock imbalance.
Signed-off-by: Michael Schmitz <schmitz@biophys.uni-duesseldorf.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Finn Thain [Tue, 18 Nov 2008 19:40:40 +0000 (20:40 +0100)]
macfb: Do not overflow fb_fix_screeninfo.id
Don't overflow the 16-character fb_fix_screeninfo id string (fixes some
console erasing and blanking artifacts). Have the ID default to "Unknown"
on machines with no built-in video and no nubus devices. Check for
fb_alloc_cmap failure.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Frederic Weisbecker [Mon, 1 Dec 2008 23:20:39 +0000 (00:20 +0100)]
tracing/function-graph-tracer: support for x86-64
Impact: extend and enable the function graph tracer to 64-bit x86
This patch implements the support for function graph tracer under x86-64.
Both static and dynamic tracing are supported.
This adds some small CPP-conditional asm to arch/x86/kernel/ftrace.c. I
wanted to use probe_kernel_read/write to make the return address
saving/patching code more generic, but it causes tracing recursion.
It would perhaps be useful to implement a notrace version of these
functions for ports to other archs.
Note that arch/x86/process_64.c is not traced, as in X86-32. I first
thought __switch_to() was responsible for crashes during tracing because I
believed the current task was changed inside it, but that's actually not the
case (actually yes, but not the "current" pointer).
So I will have to investigate to find the functions that cause harm here, to
enable tracing of the other functions inside (but there is no issue at
this time, while process_64.c stays out of -pg flags).
A small possible race condition is fixed by this patch too. When the
tracer allocates a return stack dynamically, the current depth is
initialized after the allocation rather than before. An interrupt could occur
at this time and, after seeing that the return stack is allocated, the tracer
could try to trace it with a random uninitialized depth. This is a
precaution, even though I have not hit the problem.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Liming Wang [Tue, 2 Dec 2008 02:33:08 +0000 (10:33 +0800)]
function trace: fix a bug of single thread function trace
Impact: fix "no output from tracer" bug caused by ftrace_update_pid_func()
When disabling single thread function trace using
"echo -1 > set_ftrace_pid", the normal function trace
has to be restored to the original function, otherwise the normal
function trace will not work well.
Without this commit, something like the following happens:
$ ps |grep 850
850 root 2556 S -/bin/sh
$ echo 850 > /debug/tracing/set_ftrace_pid
$ echo function > /debug/tracing/current_tracer
$ echo 1 > /debug/tracing/tracing_enabled
$ sleep 1
$ echo 0 > /debug/tracing/tracing_enabled
$ cat /debug/tracing/trace_pipe |wc -l
59704
$ echo -1 > /debug/tracing/set_ftrace_pid
$ echo 1 > /debug/tracing/tracing_enabled
$ sleep 1
$ echo 0 > /debug/tracing/tracing_enabled
$ more /debug/tracing/trace_pipe
<====== nothing output now!
It should output trace records.
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 2 Dec 2008 08:20:44 +0000 (09:20 +0100)]
Merge branches 'tracing/branch-tracer', 'tracing/ftrace', 'tracing/function-graph-tracer', 'tracing/markers', 'tracing/powerpc', 'tracing/stack-tracer' and 'tracing/tracepoints' into tracing/core
Ingo Molnar [Tue, 2 Dec 2008 08:20:29 +0000 (09:20 +0100)]
Merge branch 'tracing/urgent' into tracing/core
Conflicts:
kernel/trace/ring_buffer.c
Linus Torvalds [Tue, 2 Dec 2008 03:59:23 +0000 (19:59 -0800)]
Linux 2.6.28-rc7
Linus Torvalds [Tue, 2 Dec 2008 03:56:34 +0000 (19:56 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (25 commits)
em28xx: remove backward compat macro added on a previous fix
V4L/DVB (9748): em28xx: fix compile warning
V4L/DVB (9743): em28xx: fix oops audio
V4L/DVB (9742): em28xx-alsa: implement another locking schema
V4L/DVB (9732): sms1xxx: use new firmware for Hauppauge WinTV MiniStick
V4L/DVB (9691): gspca: Move the video device to a separate area.
V4L/DVB (9690): gspca: Lock the subdrivers via module_get/put.
V4L/DVB (9689): gspca: Memory leak when disconnect while streaming.
V4L/DVB (9668): em28xx: fix a race condition with hald
V4L/DVB (9664): af9015: don't reconnect device in USB-bus
V4L/DVB (9647): em28xx: void having two concurrent control URB's
V4L/DVB (9646): em28xx: avoid allocating/dealocating memory on every control urb
V4L/DVB (9645): em28xx: Avoid memory leaks if registration fails
V4L/DVB (9639): Make dib0700 remote control support work with firmware v1.20
V4L/DVB (9635): v4l: s2255drv fix firmware test on big-endian
V4L/DVB (9634): Make sure the i2c gate is open before powering down tuner
V4L/DVB (9632): make em28xx aux audio input work
V4L/DVB (9631): Make s2api work for ATSC support
V4L/DVB (9627): em28xx: Avoid i2c register error for boards without eeprom
V4L/DVB (9608): Fix section mismatch warning for dm1105 during make
...
Andrew Morton [Mon, 1 Dec 2008 21:14:08 +0000 (13:14 -0800)]
drivers/gpu/drm/i915/i915_irq.c: fix warning
drivers/gpu/drm/i915/i915_irq.c: In function 'i915_disable_pipestat':
drivers/gpu/drm/i915/i915_irq.c:101: warning: control may reach end of non-void function 'i915_pipestat' being inlined
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarkko Lavinen [Mon, 1 Dec 2008 21:14:08 +0000 (13:14 -0800)]
i82875p_edac: fix module remove
Fix module removal bugs of i82875p_edac. Also i82975x_edac code seems to
have the same module removal bugs as in i82875p_edac.
The problems were:
1. In module removal i82875p_remove_one() is never called.
Variable i82875p_registered is never changed from 1, which
guarantees i82875p_remove_one() is not called (and even if it were
called, it would be called in the wrong order).
As a result, the edac_mc workqueue is not stopped and keeps probing.
If kernel debugging options are not enabled, the user may not notice
anything going wrong.
If debugging options are enabled and I do "rmmod i82875p_edac", I
get:
edac debug: edac_pci_workq_function() checking
BUG: unable to handle kernel paging request at f882d16f
...
call trace:
[<f8834df3>] ? edac_mc_workq_function+0x55/0x7e [edac_core]
[<c0233974>] ? run_workqueue+0xd7/0x1a5
[<c023392f>] ? run_workqueue+0x92/0x1a5
[<f8834d9e>] ? edac_mc_workq_function+0x0/0x7e [edac_core]
[<c0233af9>] ? worker_thread+0xb7/0xc3
[<c0236a7b>] ? autoremove_wake_function+0x0/0x33
[<c0233a42>] ? worker_thread+0x0/0xc3
[<c0236809>] ? kthread+0x3b/0x61
[<c02367ce>] ? kthread+0x0/0x61
[<c0204587>] ? kernel_thread_helper+0x7/0x10
The fix for this is to get rid of the needless variable i82875p_registered
altogether and run i82875p_remove_one() *before*
pci_unregister_driver().
2. edac_mc_del_mc() uses mci after freeing mci
edac_mc_del_mc() calls edac_remove_sysfs_mci_device(). The
kobject refcount of mci drops to 0 and mci is freed. After this
mci is accessed via debug print and i82875p_remove_one() still
uses mci->pvt and tries to free mci again with edac_mc_free().
The fix for this is to add kobject_get(&mci->edac_mci_kobj) after
edac_mc_alloc(). Then the mci is still available after returning
from edac_mc_del_mc() with refcount 1, and mci->pvt is still
available. When i82875p_remove_one() finally calls edac_mc_free(),
this will cause kobject_put() and mci is released properly.
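The reference-counting argument in item 2 can be illustrated with a tiny user-space model. The struct and the obj_get()/obj_put() helpers are hypothetical stand-ins for the kobject API; the "freed" flag replaces the actual release so the effect can be observed.

```c
/* Hypothetical refcounted object; "freed" stands in for the real release. */
struct obj {
	int refcount;
	int freed;
};

/* Take an extra reference, as kobject_get() does in the fix. */
static void obj_get(struct obj *o)
{
	o->refcount++;
}

/* Drop a reference; the object is released when the count reaches zero. */
static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		o->freed = 1;
}
```

Without the extra obj_get() after allocation, the first obj_put() (the one modelling edac_mc_del_mc()) would already release the object, and any later access would be a use-after-free. With it, the object survives until the final obj_put() that models edac_mc_free().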
Signed-off-by: Jarkko Lavinen <jlavi@iki.fi>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarkko Lavinen [Mon, 1 Dec 2008 21:14:06 +0000 (13:14 -0800)]
i82875p_edac: fix overflow device resource setup
When I do "modprobe i82875p_edac" on my Asus P4C800 MB on kernels 2.6.26
or later, the module load fails due to BAR 0 collision. On 2.6.25 the
module loads just fine.
The overflow device on the MB seems to be hidden and its resources are not
allocated at normal PCI bus init. Log shows the missing resource problem:
EDAC DEBUG: i82875p_probe1()
PCI: 0000:00:06.0 reg 10 32bit mmio: [fecf0000, fecf0fff]
pci 0000:00:06.0: device not available because of BAR 0
[0xfecf0000-0xfecf0fff] collisions
EDAC i82875p: i82875p_setup_overfl_dev(): Failed to enable overflow
device
The patch below fixes this by calling pci_bus_assign_resources() after
the overflow device is revealed and added to the bus. With this patch
I am again able to load and use the module.
Signed-off-by: Jarkko Lavinen <jlavi@iki.fi>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Baryshkov [Mon, 1 Dec 2008 21:14:05 +0000 (13:14 -0800)]
fbdev: fix FB console blanking
Commit aef7db4bd5a3b6068dfa05919a3d685199eed116 fixed the problem with
recursive locking in the fb blanking code when the blank is caused by the user
setting /sys/class/graphics/fb*/blank. However, this broke the fbcon timeout
blanking.
If you use a driver that defines the ->fb_blank operation and at the same time
that driver relies on another driver (e.g. backlight or lcd class) to blank
the screen, when the fbcon times out and tries to blank the fb, it will
call only the fb driver's blanker and won't notify the other driver. Thus FB
output is disabled, but the screen isn't blanked.
Restore fbcon blanking and at the same time apply the proper fix for the
above problem: if fbcon_blank is called with FBINFO_FLAG_USEREVENT, we are
already called through notification from fb_blank, thus we don't have to
blank the fb again.
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Mon, 1 Dec 2008 21:14:04 +0000 (13:14 -0800)]
ntfs: don't fool kernel-doc
kernel-doc handles macros now (it has for quite some time), so change the
ntfs_debug() macro's kernel-doc to be just before the macro instead of
before a phony function prototype.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Mon, 1 Dec 2008 21:14:03 +0000 (13:14 -0800)]
kernel-doc: handle varargs cleanly
The method for listing varargs in kernel-doc notation is:
* @...: these arguments are printed by the @fmt argument
but scripts/kernel-doc is confused: it always lists varargs as:
... variable arguments
and ignores the @...: line's description, but then prints that
line after the list of function parameters as though it's
not part of the function parameters.
This patch makes kernel-doc print the supplied @... description if it is
present; otherwise a boilerplate "variable arguments" is printed.
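For reference, a kernel-doc comment that documents varargs with an "@..." line might look like the following. The function log_msg() is a made-up example, not something from the kernel tree.

```c
#include <stdarg.h>
#include <stdio.h>

/**
 * log_msg - format a message into a buffer
 * @buf: destination buffer
 * @size: size of @buf in bytes
 * @fmt: printf-style format string
 * @...: these arguments are printed by the @fmt argument
 *
 * Returns the number of characters that would have been written,
 * not counting the terminating NUL.
 */
static int log_msg(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);
	return n;
}
```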
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Mon, 1 Dec 2008 21:14:02 +0000 (13:14 -0800)]
lib/idr.c: fix rcu related race with idr_find
2nd part of the fixes needed for
http://bugzilla.kernel.org/show_bug.cgi?id=11796.
When the idr tree is either grown or shrunk, the updates to the number
of layers and to the top pointer were not atomic. This race caused crashes.
The attached patch fixes that by replicating the layers counter in each
layer, so idr_find doesn't need idp->layers anymore.
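The technique of storing the height in every node can be sketched with a simplified radix tree. This is not the idr code: the node layout, the 2-bit fanout, and tree_find() are assumptions chosen to keep the example short. The point is that a lookup reads the top pointer once and derives the depth from the node it got, so it never combines a stale layer count with a new top pointer.

```c
#include <stddef.h>

#define BITS   2
#define FANOUT (1 << BITS)
#define MASK   (FANOUT - 1)

struct node {
	int layer;		/* height of this node; 0 means its slots hold user pointers */
	void *slot[FANOUT];	/* child nodes, or user pointers when layer == 0 */
};

/* Look up id using only the top pointer; the depth comes from top->layer. */
static void *tree_find(struct node *top, unsigned int id)
{
	struct node *p = top;	/* single read of the top pointer */
	int n;

	if (!p)
		return NULL;

	n = (p->layer + 1) * BITS;	/* number of id bits this tree covers */
	if (n < 32 && (id >> n))
		return NULL;		/* id is outside the tree's range */

	while (p) {
		n -= BITS;
		if (p->layer == 0)
			return p->slot[(id >> n) & MASK];
		p = p->slot[(id >> n) & MASK];
	}
	return NULL;
}
```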
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Clement Calmels <cboulte@gmail.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>