Paul Mackerras [Mon, 9 Feb 2009 11:42:47 +0000 (22:42 +1100)]
perf_counters: make software counters work as per-cpu counters
Impact: kernel crash fix
Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
counter as a per-cpu counter would reliably crash the system, because
it calls __task_delta_exec with a null pointer. The page fault,
context switch and cpu migration counters also won't function
correctly as per-cpu counters since they reference the current task.
This fixes the problem by redirecting the task_clock counter to the
cpu_clock counter when used as a per-cpu counter, and by implementing
per-cpu page fault, context switch and cpu migration counters.
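The redirection described above can be pictured with a small
userspace sketch. The type and function names below (sw_event,
sw_counter, effective_event) are hypothetical illustrations, not the
kernel's actual identifiers; the only point is that a task-clock
counter with no task attached falls back to the cpu-clock
implementation instead of dereferencing a null task pointer.

```c
#include <stddef.h>

/* Hypothetical event types, for illustration only. */
enum sw_event { SW_CPU_CLOCK, SW_TASK_CLOCK };

struct task;                      /* opaque; never dereferenced here */

struct sw_counter {
    enum sw_event type;
    struct task *task;            /* NULL when used as a per-cpu counter */
};

/* Pick the implementation that is safe for this counter. */
static enum sw_event effective_event(const struct sw_counter *c)
{
    /* A per-cpu counter has no task, so task_clock cannot work. */
    if (c->type == SW_TASK_CLOCK && c->task == NULL)
        return SW_CPU_CLOCK;
    return c->type;
}
```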
Along the way, this:
- Initializes counter->ctx earlier, in perf_counter_alloc, so that
sw_perf_counter_init can use it
- Adds code to kernel/sched.c to count task migrations into each
cpu, in rq->nr_migrations_in
- Exports the per-cpu context switch and task migration counts
via new functions added to kernel/sched.c
- Makes sure that if sw_perf_counter_init fails, we don't try to
initialize the counter as a hardware counter. Since the user has
passed a negative, non-raw event type, they clearly don't intend
for it to be interpreted as a hardware event.
Reported-by: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 5 Feb 2009 14:23:08 +0000 (15:23 +0100)]
perfcounters: fix "perf counters kills oprofile" bug, v2
Impact: fix kernel crash
Both oprofile and perfcounters register an NMI die handler, but only one
can handle the NMI. Conveniently, oprofile unregisters its notifier
when not actively in use, so setting its notifier priority higher than
perfcounter's allows oprofile to borrow the NMI for the duration of its
run. Tested/works both as module and built-in.
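As a rough illustration of why the priority matters, here is a
userspace sketch of a priority-ordered handler chain. It is not the
kernel's notifier code: the struct, the function names and the
NOTIFY_* values are simplified stand-ins. The higher-priority handler
is called first and, by claiming the event, keeps the lower-priority
one from seeing it; once it unregisters, the other handler takes over
again.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's notifier return codes. */
enum { NOTIFY_DONE = 0, NOTIFY_STOP = 1 };

struct handler {
    int priority;                 /* higher priority is called first */
    int (*fn)(void);
    struct handler *next;
};

/* Insert so that the chain stays sorted by descending priority. */
static void chain_register(struct handler **head, struct handler *h)
{
    while (*head && (*head)->priority >= h->priority)
        head = &(*head)->next;
    h->next = *head;
    *head = h;
}

static void chain_unregister(struct handler **head, struct handler *h)
{
    while (*head && *head != h)
        head = &(*head)->next;
    if (*head)
        *head = h->next;
}

/* Return the handler that claimed the event, or NULL if none did. */
static struct handler *chain_dispatch(struct handler *head)
{
    for (; head; head = head->next)
        if (head->fn() == NOTIFY_STOP)
            return head;
    return NULL;
}

/* A handler that always claims the event. */
static int claim(void)
{
    return NOTIFY_STOP;
}
```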
While testing, I found that if kerneltop was generating NMIs at very
high frequency, the kernel may panic when oprofile registered its
handler. This turned out to be because oprofile registers its handler
before reset_value has been allocated, so if an NMI comes in while it's
still setting up, kaboom. Rather than try more invasive changes, I
followed the lead of other places in op_model_ppro.c, and simply
returned in that highly unlikely event. (debug warnings attached)
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Wed, 4 Feb 2009 16:11:34 +0000 (17:11 +0100)]
perfcounters: fix "perf counters kill oprofile" bug
With oprofile as a module, and unloaded by profiling script,
both oprofile and kerneltop work fine.. unless you leave kerneltop
running when you start profiling, then you may see badness.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jaswinder Singh Rajput [Sun, 1 Feb 2009 16:37:39 +0000 (22:07 +0530)]
x86: irqinit_32.c fix compilation warning
Fix:
arch/x86/kernel/irqinit_32.c:124: warning: 'smp_intr_init' defined but not used
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Thu, 29 Jan 2009 13:06:52 +0000 (14:06 +0100)]
perfcounters: fix refcounting bug
Don't kfree in-use counters.
Running...
while true; do perfstat -e 1 -c true; done
...on all cores for a while doesn't seem to be eating ram, and my oops
is gone.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Sun, 25 Jan 2009 10:38:09 +0000 (02:38 -0800)]
x86: make irqinit_32.c more like irqinit_64.c, v2
Impact: cleanup
1. add smp_intr_init and apic_intr_init for 32bit, the same as 64bit
2. move the apic_intr_init() call before setting the gates with interrupt[i]
3. for 64bit, if ia32_emulation is not used, make the 0x80 vector available for per_cpu use.
[ v2: should use !test_bit() instead of test_bit() with 32bit ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Fri, 23 Jan 2009 13:16:53 +0000 (14:16 +0100)]
perfcounters fix section mismatch warning in perf_counter.c::perf_counters_lapic_init()
Fix:
WARNING: arch/x86/kernel/built-in.o(.text+0xdd0f): Section mismatch in reference from the function pmc_generic_enable() to the function .cpuinit.text:perf_counters_lapic_init()
The function pmc_generic_enable() references
the function __cpuinit perf_counters_lapic_init().
This is often because pmc_generic_enable lacks a __cpuinit
annotation or the annotation of perf_counters_lapic_init is wrong.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Fri, 23 Jan 2009 13:36:16 +0000 (14:36 +0100)]
perfcounters: ratelimit performance counter interrupts
Ratelimit performance counter interrupts to 100KHz per CPU.
This replaces the irq-delta-time based method.
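One way to picture a rate limit of this kind is a per-CPU interrupt
budget that is refilled on every timer tick. The sketch below is a
userspace illustration under stated assumptions, not the patch itself:
the names are hypothetical and TICK_HZ is an assumed tick rate of 1000,
which turns 100KHz into a budget of 100 interrupts per tick.

```c
#include <stdbool.h>

#define RATE_LIMIT_HZ      100000u   /* 100KHz, from the commit text */
#define TICK_HZ            1000u     /* assumed timer tick rate */
#define MAX_IRQS_PER_TICK  (RATE_LIMIT_HZ / TICK_HZ)

/* Hypothetical per-CPU state. */
struct cpu_budget {
    unsigned int irqs_this_tick;
};

/* Called per counter interrupt; false means the budget is exhausted. */
static bool irq_within_budget(struct cpu_budget *b)
{
    return ++b->irqs_this_tick <= MAX_IRQS_PER_TICK;
}

/* Called from the timer tick to start a fresh budget. */
static void budget_tick(struct cpu_budget *b)
{
    b->irqs_this_tick = 0;
}
```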
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Galbraith [Fri, 23 Jan 2009 09:13:01 +0000 (10:13 +0100)]
perfcounters: throttle on too high IRQ rates
Starting kerneltop with only -c 100 seems to be a bad idea: it can
easily lock the system due to perfcounter IRQ overload.
So add throttling: if a new IRQ arrives less than
PERFMON_MIN_PERIOD_NS after the previous one, turn off perfcounters and
unthrottle them from the next timer tick.
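The delta-time throttle can be sketched in userspace as follows. The
structure and function names are hypothetical, and MIN_PERIOD_NS is a
made-up threshold standing in for PERFMON_MIN_PERIOD_NS, whose real
value is not given here.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_PERIOD_NS 10000ULL    /* hypothetical threshold */

/* Hypothetical per-CPU throttle state. */
struct cpu_throttle {
    uint64_t last_irq_ns;         /* timestamp of the previous interrupt */
    bool throttled;               /* counters are switched off */
};

/* Returns true if the interrupt should be processed normally. */
static bool on_counter_irq(struct cpu_throttle *t, uint64_t now_ns)
{
    uint64_t delta = now_ns - t->last_irq_ns;

    t->last_irq_ns = now_ns;
    if (t->throttled)
        return false;
    if (delta < MIN_PERIOD_NS) {
        t->throttled = true;      /* too fast: turn counters off */
        return false;
    }
    return true;
}

/* The next timer tick lifts the throttle again. */
static void on_timer_tick(struct cpu_throttle *t)
{
    t->throttled = false;
}
```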
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 23 Jan 2009 10:00:43 +0000 (11:00 +0100)]
Merge branch 'core/percpu' into perfcounters/core
Ingo Molnar [Fri, 23 Jan 2009 10:09:15 +0000 (11:09 +0100)]
x86, xen: fix hardirq.h merge fallout
Impact: build fix
This build error:
arch/x86/xen/suspend.c:22: error: implicit declaration of function 'fix_to_virt'
arch/x86/xen/suspend.c:22: error: 'FIX_PARAVIRT_BOOTMAP' undeclared (first use in this function)
arch/x86/xen/suspend.c:22: error: (Each undeclared identifier is reported only once
arch/x86/xen/suspend.c:22: error: for each function it appears in.)
triggers because the hardirq.h unification removed an implicit fixmap.h
include - on which arch/x86/xen/suspend.c depended. Add the fixmap.h
include explicitly.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 23 Jan 2009 09:20:15 +0000 (10:20 +0100)]
Merge branch 'core/percpu' into perfcounters/core
Conflicts:
arch/x86/include/asm/hardirq_32.h
arch/x86/include/asm/hardirq_64.h
Semantic merge:
arch/x86/include/asm/hardirq.h
[ added apic_perf_irqs field. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 23 Jan 2009 09:06:18 +0000 (10:06 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu
Brian Gerst [Fri, 23 Jan 2009 02:03:32 +0000 (11:03 +0900)]
x86: make irq_cpustat_t fields conditional
Impact: shrink size of irq_cpustat_t when possible
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Fri, 23 Jan 2009 02:03:31 +0000 (11:03 +0900)]
x86: merge hardirq_{32,64}.h into hardirq.h
Impact: cleanup
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Fri, 23 Jan 2009 02:03:31 +0000 (11:03 +0900)]
x86: sync hardirq_{32,64}.h
Impact: better code generation and removal of unused field for 32bit
In general, use the 64-bit version.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Fri, 23 Jan 2009 02:03:29 +0000 (11:03 +0900)]
x86: remove include of apic.h from hardirq_64.h
Impact: cleanup
APIC definitions aren't needed here. Remove the include and fix
up the fallout.
tj: added include to mce_intel_64.c.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Fri, 23 Jan 2009 02:03:28 +0000 (11:03 +0900)]
x86: remove idle_timestamp from 32bit irq_cpustat_t
Impact: bogus irq_cpustat field removed
idle_timestamp is left over from the removed irqbalance code.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Ingo Molnar [Wed, 21 Jan 2009 15:37:27 +0000 (16:37 +0100)]
Merge commit 'v2.6.29-rc2' into perfcounters/core
Conflicts:
include/linux/syscalls.h
Nick Piggin [Tue, 20 Jan 2009 03:36:04 +0000 (04:36 +0100)]
x86: make UV support configurable
Make X86 SGI Ultraviolet support configurable. Saves about 13K of text size
on my modest config.
   text    data     bss     dec     hex filename
6770537 1158680  694356 8623573  8395d5 vmlinux
6757492 1157664  694228 8609384  835e68 vmlinux.nouv
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 21 Jan 2009 10:30:07 +0000 (11:30 +0100)]
x86: uv cleanup, build fix #2
Fix more build-failure fallout from the UV cleanup - the UV drivers
were not updated to include <asm/uv/uv.h>.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 21 Jan 2009 09:32:44 +0000 (10:32 +0100)]
x86: make x86_32 use tlb_64.c, build fix, clean up X86_L1_CACHE_BYTES
Fix:
arch/x86/mm/tlb.c:47: error: ‘CONFIG_X86_INTERNODE_CACHE_BYTES’ undeclared here (not in a function)
The CONFIG_X86_INTERNODE_CACHE_BYTES symbol is only defined on 64-bit,
because vsmp support is 64-bit only. Define it on 32-bit too - where it
will always be equal to X86_L1_CACHE_BYTES.
Also move the default of X86_L1_CACHE_BYTES (which is separate from the
more commonly used L1_CACHE_SHIFT kconfig symbol) from 128 bytes to
64 bytes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 21 Jan 2009 09:39:51 +0000 (10:39 +0100)]
Merge branch 'x86/mm' into core/percpu
Conflicts:
arch/x86/mm/fault.c
Ingo Molnar [Wed, 21 Jan 2009 09:24:27 +0000 (10:24 +0100)]
x86: uv cleanup, build fix
Fix:
arch/x86/mm/srat_64.c: In function ‘acpi_numa_processor_affinity_init’:
arch/x86/mm/srat_64.c:141: error: implicit declaration of function ‘get_uv_system_type’
arch/x86/mm/srat_64.c:141: error: ‘UV_X2APIC’ undeclared (first use in this function)
arch/x86/mm/srat_64.c:141: error: (Each undeclared identifier is reported only once
arch/x86/mm/srat_64.c:141: error: for each function it appears in.)
A couple of UV definitions were moved to asm/uv/uv.h, but srat_64.c did
not include that header. Add it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 21 Jan 2009 09:08:53 +0000 (10:08 +0100)]
x86, mm: move tlb.c to arch/x86/mm/
Impact: cleanup
Now that it's unified, move the (SMP) TLB flushing code from arch/x86/kernel/
to arch/x86/mm/, where it belongs logically.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 21 Jan 2009 09:14:17 +0000 (10:14 +0100)]
Merge branch 'cpus4096' into core/percpu
Conflicts:
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
arch/x86/kernel/tlb_32.c
Merge it here because both the cpumask changes and the ongoing percpu
work are touching the TLB code. The percpu changes take precedence, as
they eliminate tlb_32.c altogether.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 21 Jan 2009 09:04:52 +0000 (10:04 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: rename tlb_64.c to tlb.c
Impact: file rename
tlb_64.c is now the tlb code for both 32 and 64. Rename it to tlb.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: make x86_32 use tlb_64.c
Impact: less contention when issuing invalidate IPI, cleanup
Make x86_32 use the same tlb code as 64bit. The 64bit code uses
multiple IPI vectors for tlb shootdown to reduce contention. This
patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
code paths.
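A sketch of how spreading senders over several vectors reduces
contention: each sending CPU is mapped onto one of the 8 vectors, so
only CPUs that share a vector compete for it. The modulo mapping and
the base vector number below are assumptions for illustration, not
taken from this patch.

```c
#define NUM_INVALIDATE_TLB_VECTORS   8u      /* the 8 IPIs mentioned above */
#define INVALIDATE_TLB_VECTOR_START  0xf0u   /* hypothetical base vector */

/* Map a sending CPU to the IPI vector it uses (assumed modulo scheme). */
static unsigned int tlb_vector_for_cpu(unsigned int cpu)
{
    return INVALIDATE_TLB_VECTOR_START + cpu % NUM_INVALIDATE_TLB_VECTORS;
}
```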
Note that the usage of asmlinkage is inconsistent for x86_32 and 64
and calls for further cleanup. This has been noted with a FIXME
comment in tlb_64.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: prepare for tlb merge
Impact: clean up, ipi vector number reordering for x86_32
Make the following changes to prepare for tlb merge.
* reorder x86_32 ipi vectors
* adjust tlb_32.c and tlb_64.c such that their logic coincides exactly
- on spurious invalidate ipi, tlb_32 acks the irq
- tlb_64 now has proper memory barriers around clearing
flush_cpumask (no change in generated code)
* unexport flush_tlb_page from tlb_32.c, there's no user
* use unsigned int for cpu id
* drop unnecessary includes from tlb_64.c
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: uv cleanup
Impact: cleanup
Make the following uv related cleanups.
* collect visible uv related definitions and interfaces into uv/uv.h
and use it. This cleans up the messy situation where on 64bit, uv
is defined properly, on 32bit generic it's a dummy and on the rest
undefined. After this cleanup, uv is defined on 64 and dummy on
32.
* update uv_flush_tlb_others() such that it takes the cpumask of
to-be-flushed cpus as argument, instead of that minus self, and
returns the yet-to-be-flushed cpumask, instead of modifying the
passed-in parameter. This interface change will ease the dummy
implementation of uv_flush_tlb_others() and makes uv tlb flush
related stuff defined in tlb_uv proper.
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: merge irq_regs.h
Impact: cleanup, better irq_regs code generation for x86_64
Make 64-bit use the same optimizations as 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Wed, 21 Jan 2009 08:26:06 +0000 (17:26 +0900)]
x86: merge mmu_context.h
Impact: cleanup
tj: * changed cpu to unsigned as was done on mmu_context_64.h as cpu
id is officially unsigned int
* added missing ';' to 32bit version of deactivate_mm()
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32
Impact: cleanup
%fs is currently set to __KERNEL_DS at boot, and conditionally
switched to __KERNEL_PERCPU for secondary cpus. Instead, initialize
GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and
set %fs to __KERNEL_PERCPU unconditionally.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: fix percpu_write with 64-bit constants
Impact: slightly better code generation for percpu_to_op()
The processor will sign-extend 32-bit immediate values in 64-bit
operations. Use the 'e' constraint ("32-bit signed integer constant,
or a symbolic reference known to fit that range") for 64-bit constants.
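To make the sign-extension rule concrete, the helper below checks
whether a 64-bit constant can be encoded as a sign-extended 32-bit
immediate. The function name is made up for this illustration; it
only demonstrates the range that the 'e' constraint describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v survives truncation to 32 bits followed by sign extension. */
static bool fits_simm32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

For example, -1 fits (all upper bits are copies of the sign bit),
while 0x80000000 and 0xffffffff do not, because sign extension would
set their upper 32 bits.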
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: clean up gdt_page definition
Impact: cleanup && more compact percpu area layout with future changes
Move 64-bit GDT to page-aligned section and clean up comment
formatting.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Wed, 21 Jan 2009 08:26:05 +0000 (17:26 +0900)]
x86: update canary handling during switch
Impact: cleanup
In switch_to(), instead of taking offset to irq_stack_union.stack,
make it a proper percpu access using __percpu_arg() and per_cpu_var().
Signed-off-by: Tejun Heo <tj@kernel.org>
Nick Piggin [Tue, 20 Jan 2009 03:24:26 +0000 (04:24 +0100)]
x86: optimise x86's do_page_fault (C entry point for the page fault path)
Impact: cleanup, restructure code to improve assembly
gcc isn't _all_ that smart about spilling registers to stack or reusing
stack slots, even with branch annotations. do_page_fault contained a lot
of functionality, so split unlikely paths into their own functions, and
mark them as noinline just to be sure. I consider this actually to be
somewhat of a cleanup too: the main function now contains about half
the number of lines so the normal path is easier to read, while the error
cases are also nicely split away.
Also, ensure the order of arguments to functions is always the same: regs,
addr, error_code. This can reduce code size a tiny bit, and just looks neater
too.
And add a couple of branch annotations.
Before:
do_page_fault:
subq $360, %rsp #,
After:
do_page_fault:
subq $56, %rsp #,
bloat-o-meter:
add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542)
function                                     old     new   delta
__bad_area_nosemaphore                         -     506    +506
no_context                                     -     474    +474
vmalloc_fault                                  -     424    +424
spurious_fault                                 -     358    +358
mm_fault_error                                 -     272    +272
bad_area_access_error                          -      89     +89
bad_area                                       -      89     +89
bad_area_nosemaphore                           -      10     +10
do_page_fault                               2464     784   -1680
Yes, the total size increases by 542 bytes, due to the extra function calls.
But these will very rarely be called (except for vmalloc_fault) in a normal
workload. Importantly, do_page_fault is less than 1/3rd its original size,
and touches far less stack.
Existing gotos and branch hints did move a lot of the infrequently used text
out of the fastpath, but that's even further improved after this patch.
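The general pattern (not the actual fault.c code) looks roughly like
the sketch below. It relies on the GCC/Clang noinline attribute and
__builtin_expect; the struct, the error value and the function names
are hypothetical. The large local buffer lives only in the out-of-line
error function, so the hot function's stack frame stays small, and the
argument order is kept as regs, addr, error_code throughout.

```c
#include <stdio.h>

/* GCC/Clang builtin; the kernel wraps it as unlikely(). */
#define unlikely(x) __builtin_expect(!!(x), 0)

#define ERR_BAD_ADDR (-14)        /* hypothetical error value */

struct regs {                     /* hypothetical stand-in for pt_regs */
    unsigned long ip;
};

/* Cold path: kept out of line so its stack use stays out of the caller. */
static __attribute__((noinline)) int
bad_address(struct regs *regs, unsigned long addr, unsigned long error_code)
{
    char report[256];             /* large local, only in this frame */

    snprintf(report, sizeof(report),
             "bad access at %#lx (ip %#lx, error %lu)\n",
             addr, regs->ip, error_code);
    fputs(report, stderr);
    return ERR_BAD_ADDR;
}

/* Hot path: small frame, the rare case is a single call. */
static int handle_fault(struct regs *regs, unsigned long addr,
                        unsigned long error_code)
{
    if (unlikely(addr == 0))
        return bad_address(regs, addr, error_code);
    return 0;
}
```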
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 20 Jan 2009 08:23:28 +0000 (09:23 +0100)]
Merge commit 'v2.6.29-rc2' into x86/mm
Ingo Molnar [Tue, 20 Jan 2009 08:13:15 +0000 (09:13 +0100)]
x86, cpumask: fix tlb flush race
Impact: fix bootup crash
The cpumask is now passed in as a reference to mm->cpu_vm_mask, not on
the stack - hence it is not constant anymore during the TLB flush.
That way it could race and some static sanity checks would trigger:
[ 238.154287] ------------[ cut here ]------------
[ 238.156039] kernel BUG at arch/x86/kernel/tlb_32.c:130!
[ 238.156039] invalid opcode: 0000 [#1] SMP
[ 238.156039] last sysfs file: /sys/class/net/eth2/address
[ 238.156039] Modules linked in:
[ 238.156039]
[ 238.156039] Pid: 6493, comm: ifup-eth Not tainted (2.6.29-rc2-tip #1) P4DC6
[ 238.156039] EIP: 0060:[<c0118f87>] EFLAGS: 00010202 CPU: 2
[ 238.156039] EIP is at native_flush_tlb_others+0x35/0x158
[ 238.156039] EAX: c0ef972c EBX: f6143301 ECX: 00000000 EDX: 00000000
[ 238.156039] ESI: f61433a8 EDI: f6143200 EBP: f34f3e00 ESP: f34f3df0
[ 238.156039] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 238.156039] Process ifup-eth (pid: 6493, ti=f34f2000 task=f399ab00 task.ti=f34f2000)
[ 238.156039] Stack:
[ 238.156039]  ffffffff f61433a8 ffffffff f6143200 f34f3e18 c0118e9c 00000000 f6143200
[ 238.156039]  f61433a8 f5bec738 f34f3e28 c0119435 c2b5b830 f6143200 f34f3e34 c01c2dc3
[ 238.156039]  bffd9000 f34f3e60 c01c3051 00000000 ffffffff f34f3e4c 00000000 00000071
[ 238.156039] Call Trace:
[ 238.156039]  [<c0118e9c>] ? flush_tlb_others+0x52/0x5b
[ 238.156039]  [<c0119435>] ? flush_tlb_mm+0x7f/0x8b
[ 238.156039]  [<c01c2dc3>] ? tlb_finish_mmu+0x2d/0x55
[ 238.156039]  [<c01c3051>] ? exit_mmap+0x124/0x170
[ 238.156039]  [<c013e965>] ? mmput+0x40/0xf5
[ 238.156039]  [<c01e4788>] ? flush_old_exec+0x640/0x94b
[ 238.156039]  [<c01ddb4e>] ? fsnotify_access+0x37/0x39
[ 238.156039]  [<c01e3435>] ? kernel_read+0x39/0x4b
[ 238.156039]  [<c021bc8a>] ? load_elf_binary+0x4a1/0x11bb
[ 238.156039]  [<c01c0af9>] ? might_fault+0x51/0x9c
[ 238.156039]  [<c010a2cc>] ? paravirt_read_tsc+0x20/0x4f
[ 238.156039]  [<c010a406>] ? native_sched_clock+0x5d/0x60
[ 238.156039]  [<c01e2fda>] ? search_binary_handler+0xab/0x2c4
[ 238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[ 238.156039]  [<c04ae9a5>] ? _raw_read_unlock+0x21/0x46
[ 238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[ 238.156039]  [<c01e2fe1>] ? search_binary_handler+0xb2/0x2c4
[ 238.156039]  [<c01e4076>] ? do_execve+0x21c/0x2ee
[ 238.156039]  [<c01029b7>] ? sys_execve+0x51/0x8c
[ 238.156039]  [<c0103eaf>] ? sysenter_do_call+0x12/0x43
Fix it by not assuming that the cpumask is constant.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 20 Jan 2009 07:23:45 +0000 (08:23 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu
Tejun Heo [Mon, 19 Jan 2009 03:21:28 +0000 (12:21 +0900)]
linker script: kill PERCPU_VADDR_PREALLOC()
Impact: cleanup
With .data.percpu.first in place, PERCPU_VADDR_PREALLOC() is no longer
necessary. Kill it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Mon, 19 Jan 2009 00:52:25 +0000 (19:52 -0500)]
x86: remove pda.h
Impact: cleanup
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Brian Gerst [Mon, 19 Jan 2009 03:21:28 +0000 (12:21 +0900)]
x86: move stack_canary into irq_stack
Impact: x86_64 percpu area layout change, irq_stack now at the beginning
Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section. If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.
tj: * updated subject
* dropped asm relocation of irq_stack_ptr
* updated comments a bit
* rebased on top of stack canary changes
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Mon, 19 Jan 2009 03:21:28 +0000 (12:21 +0900)]
x86: rework __per_cpu_load adjustments
Impact: cleanup
Use cpu_number to determine if the adjustment is necessary.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Mon, 19 Jan 2009 03:21:27 +0000 (12:21 +0900)]
percpu: refactor percpu.h
Impact: cleanup
Refactor the DEFINE_PER_CPU_* macros and add .data.percpu.first
section.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Mon, 19 Jan 2009 03:21:27 +0000 (12:21 +0900)]
x86: remove pda_init()
Impact: cleanup
Copy the code to cpu_init() to satisfy the requirement that the cpu
be reinitialized. Remove all other calls, since the segments are
already initialized in head_64.S.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Tue, 20 Jan 2009 03:29:19 +0000 (12:29 +0900)]
x86: conditionalize stack canary handling in hot path
Impact: no unnecessary stack canary swapping during context switch
There's no point in moving stack_canary around during context switch
if it's not enabled. Conditionalize it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Tue, 20 Jan 2009 03:29:19 +0000 (12:29 +0900)]
x86: cleanup stack protector
Impact: cleanup
Make the following cleanups.
* remove duplicate comment from boot_init_stack_canary() which fits
better in the other place - cpu_idle().
* move stack_canary offset check from __switch_to() to
boot_init_stack_canary().
Signed-off-by: Tejun Heo <tj@kernel.org>
Ingo Molnar [Mon, 19 Jan 2009 19:49:37 +0000 (20:49 +0100)]
x86: fully honor "nolapic", fix
Impact: build fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Mon, 19 Jan 2009 16:12:20 +0000 (17:12 +0100)]
Merge branch 'master' of git://git./linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096
Ingo Molnar [Mon, 19 Jan 2009 11:36:09 +0000 (12:36 +0100)]
Merge branch 'stackprotector' into core/percpu
Ingo Molnar [Sun, 18 Jan 2009 17:37:14 +0000 (18:37 +0100)]
Merge branch 'core/percpu' into stackprotector
Conflicts:
arch/x86/include/asm/pda.h
arch/x86/include/asm/system.h
Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 18 Jan 2009 17:15:49 +0000 (18:15 +0100)]
Merge branch 'core/percpu' into perfcounters/core
Conflicts:
arch/x86/include/asm/pda.h
We merge tip/core/percpu into tip/perfcounters/core because of a
semantic and contextual conflict: the former eliminates the PDA,
while the latter extends it with apic_perf_irqs field.
Resolve the conflict by moving the new field to the irq_cpustat
structure on 64-bit too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Sun, 18 Jan 2009 16:41:32 +0000 (17:41 +0100)]
Merge branch 'tj-percpu' of git://git./linux/kernel/git/tj/misc into core/percpu
Brian Gerst [Sun, 18 Jan 2009 15:38:59 +0000 (00:38 +0900)]
x86-64: Use absolute displacements for per-cpu accesses.
Accessing memory through %gs should not use rip-relative addressing.
Adding a P prefix for the argument tells gcc to not add (%rip) to
the memory references.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:59 +0000 (00:38 +0900)]
x86-64: Move isidle from PDA to per-cpu.
tj: s/isidle/is_idle/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:59 +0000 (00:38 +0900)]
x86-64: Move nodenumber from PDA to per-cpu.
tj: * s/nodenumber/node_number/
* removed now unused pda variable from pda_init()
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move irqcount from PDA to per-cpu.
tj: s/irqcount/irq_count/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move oldrsp from PDA to per-cpu.
tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda
is still referenced in the file
* s/oldrsp/old_rsp/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move kernelstack from PDA to per-cpu.
Also clean up PER_CPU_VAR usage in xen-asm_64.S
tj: * remove now unused stack_thread_info()
* s/kernelstack/kernel_stack/
* added FIXME comment in xen-asm_64.S
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.
tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
for voyager.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Convert exception stacks to per-cpu
Move the exception stacks to per-cpu, removing specific allocation code.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:58 +0000 (00:38 +0900)]
x86-64: Convert irqstacks to per-cpu
Move the irqstackptr variable from the PDA to per-cpu. Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a separate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.
tj: * sprinkle some underbars around.
* irq_stack_ptr is not used till traps_init(), no reason to
initialize it early. On SMP, just leaving it NULL till proper
initialization in setup_per_cpu_areas() works. Dropped
is_boot_cpu and early irq_stack_ptr initialization.
* do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
instead of (char, irq_stack[IRQ_STACK_SIZE]).
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:57 +0000 (00:38 +0900)]
x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Brian Gerst [Sun, 18 Jan 2009 15:38:57 +0000 (00:38 +0900)]
x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Paul Mackerras [Sat, 17 Jan 2009 07:10:22 +0000 (18:10 +1100)]
perf_counter: Add counter enable/disable ioctls
Impact: New perf_counter features
This primarily adds a way for perf_counter users to enable and disable
counters and groups. Enabling or disabling a counter or group also
enables or disables all of the child counters that have been cloned
from it to monitor children of the task monitored by the top-level
counter. The userspace interface to enable/disable counters is via
ioctl on the counter file descriptor.
Along the way this extends the code that handles child counters to
handle child counter groups properly. A group with multiple counters
will be cloned to child tasks if and only if the group leader has the
hw_event.inherit bit set - if it is set the whole group is cloned as a
group in the child task.
In order to be able to enable or disable all child counters of a given
top-level counter, we need a way to find them all. Hence I have added
a child_list field to struct perf_counter, which is the head of the
list of children for a top-level counter, or the link in that list for
a child counter. That list is protected by the perf_counter.mutex
field.
This also adds a mutex to the perf_counter_context struct. Previously
the list of counters was protected just by the lock field in the
context, which meant that perf_counter_init_task had to take that lock
and then take whatever lock/mutex protects the top-level counter's
child_list. But the counter enable/disable functions need to take
that lock in order to traverse the list, then for each counter take
the lock in that counter's context in order to change the counter's
state safely, which will lead to a deadlock.
To solve this, we now have both a mutex and a spinlock in the context,
and taking either is sufficient to ensure the list of counters can't
change - you have to take both before changing the list. Now
perf_counter_init_task takes the mutex instead of the lock (which
incidentally means that inherit_counter can use GFP_KERNEL instead of
GFP_ATOMIC) and thus avoids the possible deadlock. Similarly the new
enable/disable functions can take the mutex while traversing the list
of child counters without incurring a possible deadlock when the
counter manipulation code locks the context for a child counter.
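The "either lock to read, both locks to modify" rule can be sketched
in userspace with pthreads. This is an illustration only: the struct
and function names are hypothetical, and a second pthread mutex stands
in for the kernel spinlock. Because every writer holds both locks, a
reader holding either one is guaranteed the list cannot change
underneath it.

```c
#include <pthread.h>
#include <stddef.h>

struct counter {
    struct counter *next;
};

struct context {
    pthread_mutex_t mutex;        /* sleeping lock */
    pthread_mutex_t lock;         /* stands in for the spinlock */
    struct counter *counters;     /* list protected by mutex AND lock */
};

static void ctx_init(struct context *ctx)
{
    pthread_mutex_init(&ctx->mutex, NULL);
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->counters = NULL;
}

/* Writers take both locks, always mutex first, then the inner lock. */
static void ctx_add(struct context *ctx, struct counter *c)
{
    pthread_mutex_lock(&ctx->mutex);
    pthread_mutex_lock(&ctx->lock);
    c->next = ctx->counters;
    ctx->counters = c;
    pthread_mutex_unlock(&ctx->lock);
    pthread_mutex_unlock(&ctx->mutex);
}

/* Reader that is allowed to sleep: the mutex alone is enough. */
static int ctx_count_sleepable(struct context *ctx)
{
    int n = 0;

    pthread_mutex_lock(&ctx->mutex);
    for (struct counter *c = ctx->counters; c; c = c->next)
        n++;
    pthread_mutex_unlock(&ctx->mutex);
    return n;
}

/* Reader in a non-sleeping path: the inner lock alone is enough. */
static int ctx_count_atomic(struct context *ctx)
{
    int n = 0;

    pthread_mutex_lock(&ctx->lock);
    for (struct counter *c = ctx->counters; c; c = c->next)
        n++;
    pthread_mutex_unlock(&ctx->lock);
    return n;
}
```

A reader holding only the mutex may sleep (for example to allocate
with GFP_KERNEL in the kernel case) without blocking paths that need
only the inner lock.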
We also had a misfeature that the first counter added to a context
would possibly not go on until the next sched-in, because we were
using ctx->nr_active to detect if the context was running on a CPU.
But nr_active is the number of active counters, and if that was zero
(because the context didn't have any counters yet) it would look like
the context wasn't running on a cpu and so the retry code in
__perf_install_in_context wouldn't retry. So this adds an 'is_active'
field that is set when the context is on a CPU, even if it has no
counters. The is_active field is only used for task contexts, not for
per-cpu contexts.
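The difference between the two tests is easy to see in a tiny sketch.
The struct and helper names below are hypothetical; only the field
names nr_active and is_active come from the text above.

```c
#include <stdbool.h>

struct task_context {
    int  nr_active;               /* number of active counters */
    bool is_active;               /* context is currently on a CPU */
};

/* Old test: wrong for a running context that has no counters yet. */
static bool looks_running_old(const struct task_context *ctx)
{
    return ctx->nr_active > 0;
}

/* New test: independent of how many counters exist. */
static bool looks_running_new(const struct task_context *ctx)
{
    return ctx->is_active;
}
```

For a context that is scheduled in but still empty, the old test
reports "not running" and the new one reports "running".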
If we enable a subsidiary counter in a group that is active on a CPU,
and the arch code can't enable the counter, then we have to pull the
whole group off the CPU. We do this with group_sched_out, which gets
moved up in the file so it comes before all its callers. This also
adds similar logic to __perf_install_in_context so that the "all on,
or none" invariant of groups is preserved when adding a new counter to
a group.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tejun Heo [Sat, 17 Jan 2009 06:26:32 +0000 (15:26 +0900)]
linker script: add missing .data.percpu.page_aligned
arm, arm/mach-integrator and powerpc were missing
.data.percpu.page_aligned in their percpu output section definitions.
Add it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tejun Heo [Sat, 17 Jan 2009 05:42:50 +0000 (14:42 +0900)]
linker script: add missing VMLINUX_SYMBOL
The newly added PERCPU_*() macros define and use __per_cpu_load, but
VMLINUX_SYMBOL() was missing from the usages, causing build failures on
archs where the linker-visible symbol differs from the C symbol
(e.g. blackfin).
Signed-off-by: Tejun Heo <tj@kernel.org>
Mike Travis [Fri, 16 Jan 2009 23:58:13 +0000 (15:58 -0800)]
x86: put trigger in to detect mismatched apic versions.
Fire off one message if two APICs are discovered with different
APIC versions.
Signed-off-by: Mike Travis <travis@sgi.com>
Mike Travis [Fri, 16 Jan 2009 23:31:15 +0000 (15:31 -0800)]
cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage
Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().
Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.
Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>
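Note: the shape of the conversion, from "change my own CPU affinity, do the access, restore it" to "hand a callback to a run-on-that-CPU helper", can be sketched in user space. Everything here is a stand-in: sketch_work_on_cpu() simply calls the callback in the caller's thread instead of dispatching to another CPU, and the struct and function names are hypothetical rather than copied from acpi-cpufreq.c.

```c
/* Hypothetical command block handed to the per-CPU callback. */
struct sketch_drv_cmd {
    unsigned int reg;   /* which register to read (illustrative) */
    unsigned int val;   /* filled in by the callback */
};

/*
 * Stand-in for the kernel helper: a real implementation would run fn
 * on the given CPU; this one just calls it in the caller's thread.
 */
static long sketch_work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
{
    (void)cpu;
    return fn(arg);
}

/* Callback that performs the access; here it only fakes a value. */
static long sketch_do_drv_read(void *data)
{
    struct sketch_drv_cmd *cmd = data;

    cmd->val = cmd->reg + 1;   /* placeholder for a hardware read */
    return 0;
}

/* Wrapper: no affinity save/restore, only a dispatch to the target CPU. */
static unsigned int sketch_drv_read(unsigned int cpu, unsigned int reg)
{
    struct sketch_drv_cmd cmd = { .reg = reg, .val = 0 };

    sketch_work_on_cpu(cpu, sketch_do_drv_read, &cmd);
    return cmd.val;
}
```

The caller no longer needs a saved copy of its own CPU mask, which is where the stack saving mentioned in the Impact line would come from.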
Rusty Russell [Fri, 16 Jan 2009 23:31:15 +0000 (15:31 -0800)]
work_on_cpu: Use our own workqueue.
Impact: remove potential clashes with generic kevent workqueue
Annoyingly, some places we want to use work_on_cpu are already in
workqueues. As per Ingo's suggestion, we create a different workqueue
for work_on_cpu.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Rusty Russell [Fri, 16 Jan 2009 23:31:15 +0000 (15:31 -0800)]
work_on_cpu: don't try to get_online_cpus() in work_on_cpu.
Impact: remove potential circular lock dependency with cpu hotplug lock
This has caused more problems than it solved, with a pile of cpu
hotplug locking issues.
Followup patches will get_online_cpus() in callers that need it, but
if they don't do it they're no worse than before when they were using
set_cpus_allowed without locking.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Linus Torvalds [Fri, 16 Jan 2009 20:43:00 +0000 (12:43 -0800)]
Linux 2.6.29-rc2
Linus Torvalds [Fri, 16 Jan 2009 20:40:37 +0000 (12:40 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (23 commits)
ACPI PCI hotplug: harden against panic regression
ACPI: rename main.c to sleep.c
dell-laptop: move to drivers/platform/x86/ from drivers/misc/
eeepc-laptop: enable Bluetooth ACPI details
ACPI: fix ACPI_FADT_S4_RTC_WAKE comment
kprobes: check CONFIG_FREEZER instead of CONFIG_PM
PM: Fix freezer compilation if PM_SLEEP is unset
thermal fixup for broken BIOS which has invalid trip points.
ACPI: EC: Don't trust ECDT tables from ASUS
ACPI: EC: Limit workaround for ASUS notebooks even more
ACPI: thinkpad-acpi: bump up version to 0.22
ACPI: thinkpad-acpi: handle HKEY event 6030
ACPI: thinkpad-acpi: clean-up fan subdriver quirk
ACPI: thinkpad-acpi: start the event hunt season
ACPI: thinkpad-acpi: handle HKEY thermal and battery alarms
ACPI: thinkpad-acpi: clean up hotkey_notify()
ACPI: thinkpad-acpi: use killable instead of interruptible mutexes
ACPI: thinkpad-acpi: add UWB radio support
ACPI: thinkpad-acpi: preserve radio state across shutdown
ACPI: thinkpad-acpi: resume with radios disabled
...
Linus Torvalds [Fri, 16 Jan 2009 20:40:11 +0000 (12:40 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netxen: include ipv6.h (fixes build failure)
netxen: avoid invalid iounmap
James Bottomley [Thu, 15 Jan 2009 20:12:27 +0000 (15:12 -0500)]
ACPI PCI hotplug: harden against panic regression
ACPI hotplug panic with current git head
http://lkml.org/lkml/2009/1/10/136
Rather than reverting the entire commit that causes the crash:
e8c331e963c58b83db24b7d0e39e8c07f687dbc6
"PCI hotplug: introduce functions for ACPI slot detection"
simply harden against it while the changes to
the hotplug code on this particular machine are understood.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 16 Jan 2009 19:45:34 +0000 (14:45 -0500)]
Merge branch 'misc' into release
Len Brown [Fri, 16 Jan 2009 19:45:24 +0000 (14:45 -0500)]
Merge branch 'thinkpad-acpi' into release
Len Brown [Fri, 16 Jan 2009 19:45:11 +0000 (14:45 -0500)]
Merge branches 'bugzilla-11884' and 'bugzilla-8544' into release
Len Brown [Fri, 16 Jan 2009 18:52:03 +0000 (13:52 -0500)]
ACPI: rename main.c to sleep.c
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 9 Jan 2009 22:23:38 +0000 (17:23 -0500)]
dell-laptop: move to drivers/platform/x86/ from drivers/misc/
Signed-off-by: Len Brown <len.brown@intel.com>
Jonathan McDowell [Wed, 3 Dec 2008 20:31:11 +0000 (20:31 +0000)]
eeepc-laptop: enable Bluetooth ACPI details
Although rfkill support for the EEE bluetooth device has been added to
2.6.28-rc, the appropriate ACPI accessor definitions were not added, so
the support was non-functional. The patch below adds the get and set
accessors and has been verified to work on an EEE 901.
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Acked-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Corentin Chary <corentincj@iksaif.net>
Signed-off-by: Len Brown <len.brown@intel.com>
David Brownell [Fri, 9 Jan 2009 20:17:08 +0000 (12:17 -0800)]
ACPI: fix ACPI_FADT_S4_RTC_WAKE comment
Make the comment for ACPI_FADT_S4_RTC_WAKE match the ACPI spec;
that bit has nothing to do with status bits.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Masami Hiramatsu [Tue, 6 Jan 2009 20:15:32 +0000 (21:15 +0100)]
kprobes: check CONFIG_FREEZER instead of CONFIG_PM
Check CONFIG_FREEZER instead of CONFIG_PM because kprobe booster
depends on freeze_processes() and thaw_processes() when CONFIG_PREEMPT=y.
This fixes a linkage error which occurs when CONFIG_PREEMPT=y, CONFIG_PM=y
and CONFIG_FREEZER=n.
Reported-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Len Brown <len.brown@intel.com>
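Note: why the choice of config symbol matters can be shown with the preprocessor alone. The guard expressions below are an assumed shape, not quoted from the kprobes source, and the SKETCH_ names are hypothetical. The configuration mirrors the failing case from the message.

```c
/* Configuration from the report: preemption and PM on, freezer off. */
#define CONFIG_PREEMPT 1
#define CONFIG_PM 1
/* CONFIG_FREEZER is intentionally left undefined. */

/* Assumed shape of the old guard, keyed on CONFIG_PM. */
#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PM)
#define SKETCH_BOOSTER_OLD 1
#else
#define SKETCH_BOOSTER_OLD 0
#endif

/* Assumed shape of the new guard, keyed on CONFIG_FREEZER. */
#if !defined(CONFIG_PREEMPT) || defined(CONFIG_FREEZER)
#define SKETCH_BOOSTER_NEW 1
#else
#define SKETCH_BOOSTER_NEW 0
#endif

/* Nonzero when code that calls into the freezer would be compiled in. */
static int sketch_booster_enabled_old(void) { return SKETCH_BOOSTER_OLD; }
static int sketch_booster_enabled_new(void) { return SKETCH_BOOSTER_NEW; }
```

With this configuration the old guard compiles in code that needs freeze_processes() and thaw_processes() even though the freezer is not built, while the new guard leaves it out.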
Rafael J. Wysocki [Tue, 6 Jan 2009 20:14:04 +0000 (21:14 +0100)]
PM: Fix freezer compilation if PM_SLEEP is unset
Freezer fails to compile with the following configuration
settings:
CONFIG_CGROUPS=y
CONFIG_CGROUP_FREEZER=y
CONFIG_MODULES=y
CONFIG_FREEZER=y
CONFIG_PM=y
CONFIG_PM_SLEEP=n
Fix this by making process.o compilation depend on CONFIG_FREEZER.
Reported-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Fri, 16 Jan 2009 17:53:42 +0000 (12:53 -0500)]
thermal fixup for broken BIOS which has invalid trip points.
The ACPI thermal driver only re-evaluates VALID trip points.
For the broken BIOS shown in
http://bugzilla.kernel.org/show_bug.cgi?id=8544
the active[0] is set to invalid at boot time
and it will not be re-evaluated again.
We can still get a single warning message at boot time.
http://marc.info/?l=linux-kernel&m=120496222629983&w=2
http://bugzilla.kernel.org/show_bug.cgi?id=12203
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Len Brown <len.brown@intel.com>
Dhananjay Phadke [Fri, 16 Jan 2009 19:03:25 +0000 (11:03 -0800)]
netxen: include ipv6.h (fixes build failure)
Fixes a build error in absence of CONFIG_IPV6:
drivers/net/netxen/netxen_nic_main.c:1189: error: implicit declaration of function 'ipv6_hdr'
drivers/net/netxen/netxen_nic_main.c:1189: error: invalid type argument of '->'
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Starikovskiy [Tue, 13 Jan 2009 23:57:53 +0000 (02:57 +0300)]
ACPI: EC: Don't trust ECDT tables from ASUS
http://bugzilla.kernel.org/show_bug.cgi?id=9399
http://bugzilla.kernel.org/show_bug.cgi?id=11880
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Dhananjay Phadke [Fri, 16 Jan 2009 19:03:01 +0000 (11:03 -0800)]
netxen: avoid invalid iounmap
For NX3031 only one I/O range is mapped, so unmapping the other
two, which are used by older chips, causes this warning on
ppc64:
"Attempt to iounmap early bolted mapping at 0x0000000000000000"
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
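Note: one way to avoid unmapping ranges that were never mapped is to skip null pointers during teardown. The sketch below assumes that approach; the driver's real fix is not shown in this log and may instead key off the chip revision. All names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_NR_RANGES 3

/* Hypothetical adapter state: up to three mapped I/O ranges. */
struct sketch_adapter {
    void *range[SKETCH_NR_RANGES];  /* NULL means "never mapped" */
};

/* No-op stand-in for the real unmap routine. */
static void sketch_noop_unmap(void *addr)
{
    (void)addr;
}

/* Unmaps only ranges that were mapped; returns how many were released. */
static int sketch_unmap_ranges(struct sketch_adapter *ad, void (*unmap)(void *))
{
    int released = 0;

    for (int i = 0; i < SKETCH_NR_RANGES; i++) {
        if (ad->range[i] == NULL)
            continue;               /* skip ranges this chip never mapped */
        unmap(ad->range[i]);
        ad->range[i] = NULL;
        released++;
    }
    return released;
}
```

An adapter with a single mapped range then triggers exactly one unmap call, and no call is made with address zero.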
Alexey Starikovskiy [Tue, 13 Jan 2009 23:57:47 +0000 (02:57 +0300)]
ACPI: EC: Limit workaround for ASUS notebooks even more
References: http://bugzilla.kernel.org/show_bug.cgi?id=11884
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Fri, 16 Jan 2009 17:32:33 +0000 (09:32 -0800)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix ioctl arg size (userland incompatible change!)
Btrfs: Clear the device->running_pending flag before bailing on congestion
Chris Mason [Fri, 16 Jan 2009 16:59:08 +0000 (11:59 -0500)]
Btrfs: fix ioctl arg size (userland incompatible change!)
The structure used to send device in btrfs ioctl calls was not
properly aligned, and so 32 bit ioctls would not work properly on
64 bit kernels.
We could fix this with compat ioctls, but we're just one byte away
and it doesn't make sense to carry around the compat ioctls forever
at this stage in the project.
This patch brings the ioctl arg up to an evenly aligned 4k.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
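Note: the alignment problem arises because some 32-bit ABIs align a 64-bit integer to 4 bytes inside a struct while 64-bit ABIs align it to 8, so a struct whose natural size is not a multiple of 8 gets different tail padding, and therefore a different ioctl argument size, on each. The sketch below shows a layout that avoids this. The struct name and the 4088-byte array length are illustrative assumptions, not quoted from the Btrfs headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative length chosen so the struct totals 4096 bytes. */
#define SKETCH_NAME_BYTES 4088

struct sketch_vol_args {
    int64_t fd;
    char    name[SKETCH_NAME_BYTES];
};

/*
 * 8 + 4088 = 4096, a multiple of both 4 and 8, so no tail padding is
 * needed whichever of those alignments the ABI gives int64_t.
 */
static_assert(sizeof(struct sketch_vol_args) == 4096,
              "ioctl argument should be exactly 4k");
```

Because the size is identical for 32-bit and 64-bit callers, the same ioctl number and copy length work for both without a compat layer.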
Chris Mason [Fri, 16 Jan 2009 16:58:19 +0000 (11:58 -0500)]
Btrfs: Clear the device->running_pending flag before bailing on congestion
Btrfs maintains a queue of async bio submissions so the checksumming
threads don't have to wait on get_request_wait. In order to avoid
extra wakeups, this code has a running_pending flag that is used
to tell new submissions they don't need to wake the thread.
When the threads notice congestion on a single device, they
may decide to requeue the job and move on to other devices. This
makes sure the running_pending flag is cleared before the
job is requeued.
It should help avoid IO stalls by making sure the task is woken up
when new submissions come in.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
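Note: the interaction between the running_pending flag and wakeups can be modelled with a small single-threaded sketch. It ignores locking and real threads entirely; the struct and function names are hypothetical, and only the flag name comes from the message.

```c
#include <stdbool.h>

/* Hypothetical, single-threaded model of a device's async bio queue. */
struct sketch_device {
    bool running_pending;  /* worker is busy; submitters may skip the wakeup */
    int  queued;           /* bios waiting for the worker */
    int  wakeups;          /* how many times the worker was woken */
};

/* Submitter side: wake the worker only if it is not already running. */
static void sketch_submit(struct sketch_device *dev)
{
    dev->queued++;
    if (!dev->running_pending)
        dev->wakeups++;
}

/* Worker side: on congestion, clear the flag before requeueing the job. */
static void sketch_bail_on_congestion(struct sketch_device *dev)
{
    dev->running_pending = false;
    /* the job would be requeued here */
}
```

If the flag were left set when the worker moved on, later submissions would skip the wakeup and the queued bios could sit idle, which is the stall the message refers to.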
Linus Torvalds [Fri, 16 Jan 2009 16:41:09 +0000 (08:41 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
serial: Add 16850 uart type support to OF uart driver
hvc_console: Remove tty->low_latency
powerpc: Get the number of SLBs from "slb-size" property
powerpc: is_hugepage_only_range() must account for both 4kB and 64kB slices
powerpc/ps3: printing fixups for l64 to ll64 conversion drivers/video
powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/scsi
powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/ps3
powerpc/ps3: Printing fixups for l64 to ll64 conversion sound/ppc
powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/char
powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/block
powerpc/ps3: Printing fixups for l64 to ll64 conversion arch/powerpc
powerpc/ps3: ps3_repository_read_mm_info() takes u64 * arguments
powerpc/ps3: clear_bit()/set_bit() operate on unsigned longs
powerpc/ps3: The lv1_ routines have u64 parameters
powerpc/ps3: Use dma_addr_t down through the stack
powerpc/ps3: set_dabr() takes an unsigned long
powerpc: Cleanup from l64 to ll64 change drivers/scsi
Linus Torvalds [Fri, 16 Jan 2009 16:40:57 +0000 (08:40 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_fsl: Return non-zero on error in probe()
drivers/ata/pata_ali.c: s/isa_bridge/ali_isa_bridge/ to fix alpha build
libata: New driver for OCTEON SOC Compact Flash interface (v7).
libata: Add another column to the ata_timing table.
sata_via: Add VT8261 support
pata_atiixp: update port enabledness test handling
[libata] get-identity ioctl: Fix use of invalid memory pointer
Linus Torvalds [Fri, 16 Jan 2009 16:40:40 +0000 (08:40 -0800)]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] Skip deleted devices in __scsi_device_lookup_by_target()
[SCSI] Add SUN Universal Xport to no attach blacklist
[SCSI] iscsi_tcp: make padbuf non-static
[SCSI] mpt fusion: Add Firmware debug support
[SCSI] mpt fusion: Add separate msi enable disable for FC,SPI,SAS
[SCSI] mpt fusion: Update MPI Headers to version 01.05.19
[SCSI] qla2xxx: Fix ISP restart bug in multiq code
Linus Torvalds [Fri, 16 Jan 2009 16:39:52 +0000 (08:39 -0800)]
Merge branch 'drm-next' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: lock correct mutex around object unreference.
drm/i915: add support for physical memory objects
drm/i915: make LVDS fixed mode a preferred mode
drm: handle depth & bpp changes correctly
drm: initial KMS config fixes
drm/i915: setup sarea properly in master_priv
drm/i915: set vblank enabled flag correctly across IRQ install/uninstall
drm/i915: don't enable vblanks on disabled pipes
Linus Torvalds [Fri, 16 Jan 2009 16:14:51 +0000 (08:14 -0800)]
Revert "PCI PM: Register power state of devices during initialization"
This reverts commit 98e6e286d7b01deb7453b717aa38ebb69d6cefc0, as Yinghai Lu
reports that it breaks kexec with at least the e1000 and e1000e
drivers. The reason is that the shutdown sequence puts the hardware
into D3 sleep, and the commit causes us to claim that it then is in D0
(running) state just because we don't understand the PM capabilities.
Which then later makes "pci_set_power_state()" not do anything, and the
device never wakes up properly and just returns 0xff to everything.
Reported-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jesse Barnes <jesse.barnes@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
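Note: the failure mode described above, where a wrong cached state turns a later power-state change into a no-op, can be modelled in a few lines. This is a deliberately reduced sketch with hypothetical names; the real pci_set_power_state() contains considerably more logic.

```c
/* Hypothetical power states and device model. */
enum sketch_pm_state { SKETCH_D0 = 0, SKETCH_D3 = 3 };

struct sketch_pci_dev {
    enum sketch_pm_state hw_state;      /* what the hardware is really in */
    enum sketch_pm_state cached_state;  /* what the core believes */
};

/* Returns 1 if the hardware was reprogrammed, 0 if the call was skipped. */
static int sketch_set_power_state(struct sketch_pci_dev *dev,
                                  enum sketch_pm_state target)
{
    if (dev->cached_state == target)
        return 0;                   /* "already there": nothing is done */
    dev->hw_state = target;
    dev->cached_state = target;
    return 1;
}
```

With the hardware left in D3 by the shutdown path and the cached state claiming D0, a request for D0 is skipped and the device stays asleep.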