profile/ivi/kernel-x86-ivi.git
15 years agotracing: clean up splice code
Steven Rostedt [Mon, 9 Feb 2009 17:06:29 +0000 (12:06 -0500)]
tracing: clean up splice code

Ingo Molnar suggested a series of clean ups for the splice code.
This patch implements those suggestions.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agotracing: Move pipe waiting code out of tracing_read_pipe().
Eduard - Gabriel Munteanu [Mon, 9 Feb 2009 06:15:55 +0000 (08:15 +0200)]
tracing: Move pipe waiting code out of tracing_read_pipe().

This moves the pipe waiting code from tracing_read_pipe() into
tracing_wait_pipe(), which is useful to implement other fops, like
splice_read.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agotracing: splice support for tracing_pipe
Eduard - Gabriel Munteanu [Mon, 9 Feb 2009 06:15:56 +0000 (08:15 +0200)]
tracing: splice support for tracing_pipe

Added and implemented tracing_pipe_fops->splice_read(). This allows
userspace programs to get tracing data more efficiently.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agotracing/function-graph-tracer: handle the leaf functions from trace_pipe
Frederic Weisbecker [Fri, 6 Feb 2009 17:30:44 +0000 (18:30 +0100)]
tracing/function-graph-tracer: handle the leaf functions from trace_pipe

When one cats the trace file, the leaf functions are printed without brackets:

 function();

whereas in the trace_pipe file we'll see the following:

 function() {
 }

This is because the ring_buffer handling is not the same between those two files.
On the trace file, when an entry is printed, the iterator advanced and then we can
check the next entry.

There is no iterator with trace_pipe, the current entry to print has been peeked
and not consumed. So checking the next entry will still return the current one while
we don't consume it.

This patch introduces a new value for the output callbacks to ask the tracing
core to not consume the current entry after printing it.

We need it because we will have to consume the current entry ourself to check
the next one.

Now the trace_pipe is able to handle well the leaf functions.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/blktrace: move the tracing file to kernel/trace, fix
Ingo Molnar [Mon, 9 Feb 2009 11:06:54 +0000 (12:06 +0100)]
tracing/blktrace: move the tracing file to kernel/trace, fix

Impact: build fix

The BLK_DEV_IO_TRACE entry used to be in block/Kconfig - which
file itself was dependent on CONFIG_BLOCK. But now the entry is
in kernel/trace/Kconfig - which is present even on !CONFIG_BLOCK.

So add a 'depends on BLOCK' to BLK_DEV_IO_TRACE.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing: handle unregistering the current tracer
Arnaldo Carvalho de Melo [Sat, 7 Feb 2009 20:52:59 +0000 (18:52 -0200)]
tracing: handle unregistering the current tracer

Impact: simplification

Instead of requiring that plugins have the sequence:

  my_tracer_stop(my_trace_array);
  unregister_tracer(my_tracer);

it should be possible just do a:

  unregister_tracer(my_tracer);

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/function-graph-tracer: drop the kernel_text_address check
Frederic Weisbecker [Sat, 7 Feb 2009 23:04:02 +0000 (00:04 +0100)]
tracing/function-graph-tracer: drop the kernel_text_address check

When the function graph tracer picks a return address, it ensures this address
is really a kernel text one by calling __kernel_text_address()

Actually this path has never been taken.Its role was more likely to debug the tracer
on the beginning of its development but this function is wasteful since it is called
for every traced function.

The fault check is already sufficient.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/power: move the power trace headers to a dedicated file
Frederic Weisbecker [Sat, 7 Feb 2009 21:16:12 +0000 (22:16 +0100)]
tracing/power: move the power trace headers to a dedicated file

Impact: cleanup

Move the power tracer headers to trace/power.h to keep ftrace.h and power bits
more easy to maintain as separated topics.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/function-graph-tracer: provide a selftest for the function graph tracer
Frederic Weisbecker [Sat, 7 Feb 2009 20:33:57 +0000 (21:33 +0100)]
tracing/function-graph-tracer: provide a selftest for the function graph tracer

Making it more easy to do a basic regression test for this tracer.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing/blktrace: move the tracing file to kernel/trace
Frederic Weisbecker [Sat, 7 Feb 2009 19:46:45 +0000 (20:46 +0100)]
tracing/blktrace: move the tracing file to kernel/trace

Impact: cleanup

Move blktrace.c to kernel/trace, also move its config entry.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'tip/tracing/core/devel' of git://git.kernel.org/pub/scm/linux/kernel...
Ingo Molnar [Mon, 9 Feb 2009 09:35:12 +0000 (10:35 +0100)]
Merge branch 'tip/tracing/core/devel' of git://git./linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace

Conflicts:
kernel/trace/trace_hw_branches.c

15 years agoMerge commit 'v2.6.29-rc4' into tracing/core
Ingo Molnar [Mon, 9 Feb 2009 09:32:48 +0000 (10:32 +0100)]
Merge commit 'v2.6.29-rc4' into tracing/core

15 years agoLinux 2.6.29-rc4 v2.6.29-rc4
Linus Torvalds [Sun, 8 Feb 2009 20:37:20 +0000 (12:37 -0800)]
Linux 2.6.29-rc4

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-update
Linus Torvalds [Sun, 8 Feb 2009 20:35:26 +0000 (12:35 -0800)]
Merge git://git./linux/kernel/git/arjan/linux-2.6-async-update

* git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-update:
  async: use list_move_tail
  async: Rename _special -> _domain for clarity.
  async: Add some documentation.
  async: Handle kthread_run() return codes.
  async: Fix running list handling.

15 years agoradeonfb: Fix resume from D3Cold on some platforms
Benjamin Herrenschmidt [Thu, 5 Feb 2009 01:06:52 +0000 (12:06 +1100)]
radeonfb: Fix resume from D3Cold on some platforms

For historical reason, this driver used its own saving/restoring
of the PCI config space, and used the state of it on resume as
an indication as to whether it needed to re-POST the chip or not.

This methods breaks with the later core changes since the core will
have restored things for us.

This patch fixes it by removing that custom code, using standard
core methods to save/restore state, and testing for the need to
re-POST by comparing the content of a few key PLL registers.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoaty128fb: Properly save PCI state before changing PCI PM level
Benjamin Herrenschmidt [Thu, 5 Feb 2009 01:06:51 +0000 (12:06 +1100)]
aty128fb: Properly save PCI state before changing PCI PM level

This fixes aty128fb to properly save the PCI config space -before- it
potentially switches the PM state of the chip. This avoids a
warning with the new PM core and is the right thing to do anyway.

I also replaced the hand-coded switch to D2 with a call to the
genericc pci_set_power_state() and removed the code that switches it
back to D0 since the generic code is doing that for us nowadays.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoatyfb: Properly save PCI state before changing PCI PM level
Benjamin Herrenschmidt [Thu, 5 Feb 2009 01:06:50 +0000 (12:06 +1100)]
atyfb: Properly save PCI state before changing PCI PM level

This fixes atyfb to properly save the PCI config space -before- it
potentially switches the PM state of the chip. This avoids a
warning with the new PM core and is the right thing to do anyway.

I also slightly cleaned up the code that checks whether we are
running on a PowerMac to do a runtime check instead of a compile
check only, and replaced a deprecated number with the proper
symbolic constant.

Finally, I removed the useless switch to D0 from resume since
the core does it for us.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoasync: use list_move_tail
Stefan Richter [Mon, 2 Feb 2009 12:24:34 +0000 (13:24 +0100)]
async: use list_move_tail

list.h provides a dedicated primitive for
"list_del followed by list_add_tail"... list_move_tail.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
15 years agoasync: Rename _special -> _domain for clarity.
Cornelia Huck [Tue, 20 Jan 2009 14:31:31 +0000 (15:31 +0100)]
async: Rename _special -> _domain for clarity.

Rename the async_*_special() functions to async_*_domain(), which
describes the purpose of these functions much better.
[Broke up long lines to silence checkpatch]

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
15 years agoasync: Add some documentation.
Cornelia Huck [Mon, 19 Jan 2009 12:45:33 +0000 (13:45 +0100)]
async: Add some documentation.

Add some kerneldoc to the async interface.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
15 years agoasync: Handle kthread_run() return codes.
Cornelia Huck [Mon, 19 Jan 2009 12:45:31 +0000 (13:45 +0100)]
async: Handle kthread_run() return codes.

If we fail to create the manager thread, fall back to non-fastboot.
If we fail to create an async thread, try again after waiting for
a bit.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
15 years agoasync: Fix running list handling.
Cornelia Huck [Mon, 19 Jan 2009 12:45:28 +0000 (13:45 +0100)]
async: Fix running list handling.

async_schedule() should pass in async_running as the running
list, and run_one_entry() should put the entry to be run on
the provided running list instead of always on the generic one.

Reported-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
15 years agotrace: trivial fixes in comment typos.
Wenji Huang [Fri, 6 Feb 2009 09:33:27 +0000 (17:33 +0800)]
trace: trivial fixes in comment typos.

Impact: clean up

Fixed several typos in the comments.

Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agoring-buffer: use generic version of in_nmi
Steven Rostedt [Fri, 6 Feb 2009 06:45:16 +0000 (01:45 -0500)]
ring-buffer: use generic version of in_nmi

Impact: clean up

Now that a generic in_nmi is available, this patch removes the
special code in the ring_buffer and implements the in_nmi generic
version instead.

With this change, I was also able to rename the "arch_ftrace_nmi_enter"
back to "ftrace_nmi_enter" and remove the code from the ring buffer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agoftrace: change function graph tracer to use new in_nmi
Steven Rostedt [Fri, 6 Feb 2009 06:14:26 +0000 (01:14 -0500)]
ftrace: change function graph tracer to use new in_nmi

The function graph tracer piggy backed onto the dynamic ftracer
to use the in_nmi custom code for dynamic tracing. The problem
was (as Andrew Morton pointed out) it really only wanted to bail
out if the context of the current CPU was in NMI context. But the
dynamic ftrace in_nmi custom code was true if _any_ CPU happened
to be in NMI context.

Now that we have a generic in_nmi interface, this patch changes
the function graph code to use it instead of the dynamic ftarce
custom code.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agonmi: add generic nmi tracking state
Steven Rostedt [Fri, 6 Feb 2009 05:51:37 +0000 (00:51 -0500)]
nmi: add generic nmi tracking state

This code adds an in_nmi() macro that uses the current tasks preempt count
to track when it is in NMI context. Other parts of the kernel can
use this to determine if the context is in NMI context or not.

This code was inspired by the -rt patch in_nmi version that was
written by Peter Zijlstra, who borrowed that code from
Mathieu Desnoyers.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agoftrace, x86: rename in_nmi variable
Steven Rostedt [Fri, 6 Feb 2009 03:30:07 +0000 (22:30 -0500)]
ftrace, x86: rename in_nmi variable

Impact: clean up

The in_nmi variable in x86 arch ftrace.c is a misnomer.
Andrew Morton pointed out that the in_nmi variable is incremented
by all CPUS. It can be set when another CPU is running an NMI.

Since this is actually intentional, the fix is to rename it to
what it really is: "nmi_running"

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agoring-buffer: allow tracing_off to be used in core kernel code
Steven Rostedt [Fri, 6 Feb 2009 00:54:51 +0000 (19:54 -0500)]
ring-buffer: allow tracing_off to be used in core kernel code

tracing_off() is the fastest way to stop recording to the ring buffers.
This may be used in places like panic and die, just before the
ftrace_dump is called.

This patch adds the appropriate CPP conditionals to make it a stub
function when the ring buffer is not configured it.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agoring-buffer: add NMI protection for spinlocks
Steven Rostedt [Thu, 5 Feb 2009 23:43:07 +0000 (18:43 -0500)]
ring-buffer: add NMI protection for spinlocks

Impact: prevent deadlock in NMI

The ring buffers are not yet totally lockless with writing to
the buffer. When a writer crosses a page, it grabs a per cpu spinlock
to protect against a reader. The spinlocks taken by a writer are not
to protect against other writers, since a writer can only write to
its own per cpu buffer. The spinlocks protect against readers that
can touch any cpu buffer. The writers are made to be reentrant
with the spinlocks disabling interrupts.

The problem arises when an NMI writes to the buffer, and that write
crosses a page boundary. If it grabs a spinlock, it can be racing
with another writer (since disabling interrupts does not protect
against NMIs) or with a reader on the same CPU. Luckily, most of the
users are not reentrant and protects against this issue. But if a
user of the ring buffer becomes reentrant (which is what the ring
buffers do allow), if the NMI also writes to the ring buffer then
we risk the chance of a deadlock.

This patch moves the ftrace_nmi_enter called by nmi_enter() to the
ring buffer code. It replaces the current ftrace_nmi_enter that is
used by arch specific code to arch_ftrace_nmi_enter and updates
the Kconfig to handle it.

When an NMI is called, it will set a per cpu variable in the ring buffer
code and will clear it when the NMI exits. If a write to the ring buffer
crosses page boundaries inside an NMI, a trylock is used on the spin
lock instead. If the spinlock fails to be acquired, then the entry
is discarded.

This bug appeared in the ftrace work in the RT tree, where event tracing
is reentrant. This workaround solved the deadlocks that appeared there.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agotrace: remove deprecated entry->cpu
Steven Rostedt [Sun, 8 Feb 2009 00:38:43 +0000 (19:38 -0500)]
trace: remove deprecated entry->cpu

Impact: fix to prevent developers from using entry->cpu

With the new ring buffer infrastructure, the cpu for the entry is
implicit with which CPU buffer it is on.

The original code use to record the current cpu into the generic
entry header, which can be retrieved by entry->cpu. When the
ring buffer was introduced, the users were convert to use the
the cpu number of which cpu ring buffer was in use (this was passed
to the tracers by the iterator: iter->cpu).

Unfortunately, the cpu item in the entry structure was never removed.
This allowed for developers to use it instead of the proper iter->cpu,
unknowingly, using an uninitialized variable. This was not the fault
of the developers, since it would seem like the logical place to
retrieve the cpu identifier.

This patch removes the cpu item from the entry structure and fixes
all the users that should have been using iter->cpu.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes...
Linus Torvalds [Sat, 7 Feb 2009 18:46:30 +0000 (10:46 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI PM: make the PM core more careful with drivers using the new PM framework
  PCI PM: Read power state from device after trying to change it on resume
  PCI PM: Do not disable and enable bridges during suspend-resume
  PCI: PCIe portdrv: Simplify suspend and resume
  PCI PM: Fix saving of device state in pci_legacy_suspend
  PCI PM: Check if the state has been saved before trying to restore it
  PCI PM: Fix handling of devices without drivers
  PCI: return error on failure to read PCI ROMs
  PCI: properly clean up ASPM link state on device remove

15 years agomodule: remove over-zealous check in __module_get()
Rusty Russell [Sat, 7 Feb 2009 07:45:56 +0000 (18:15 +1030)]
module: remove over-zealous check in __module_get()

Impact: fix spurious BUG_ON() triggered under load

module_refcount() isn't reliable outside stop_machine(), as demonstrated
by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
(an inc on one cpu and dec on another while module_refcount() is tallying
 can give false results, for example).

Almost noone should be using __module_get, but that's another issue.

Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Sat, 7 Feb 2009 16:30:20 +0000 (08:30 -0800)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (30 commits)
  ACPI: Kconfig text - Fix the ACPI_CONTAINER module name according to the real module name.
  eeepc-laptop: fix oops when changing backlight brightness during eeepc-laptop init
  ACPICA: Fix table entry truncation calculation
  ACPI: Enable bit 11 in _PDC to advertise hw coord
  ACPI: struct device - replace bus_id with dev_name(), dev_set_name()
  ACPI: add missing KERN_* constants to printks
  ACPI: dock: Don't eval _STA on every show_docked sysfs read
  ACPI: disable ACPI cleanly when bad RSDP found
  ACPI: delete CPU_IDLE=n code
  ACPI: cpufreq: Remove deprecated /proc/acpi/processor/../performance proc entries
  ACPI: make some IO ports off-limits to AML
  ACPICA: add debug dump of BIOS _OSI strings
  ACPI: proc_dir_entry 'video/VGA' already registered
  ACPI: Skip the first two elements in the _BCL package
  ACPI: remove BM_RLD access from idle entry path
  ACPI: remove locking from PM1x_STS register reads
  eeepc-laptop: use netlink interface
  eeepc-laptop: Implement rfkill hotplugging in eeepc-laptop
  eeepc-laptop: Check return values from rfkill_register
  eeepc-laptop: Add support for extended hotkeys
  ...

15 years agoMerge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc...
Len Brown [Sat, 7 Feb 2009 06:34:56 +0000 (01:34 -0500)]
Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc', 'printk' and 'processor' into release

15 years agoACPI: Kconfig text - Fix the ACPI_CONTAINER module name according to the real module...
Thierry Vignaud [Sat, 7 Feb 2009 06:12:19 +0000 (01:12 -0500)]
ACPI: Kconfig text - Fix the ACPI_CONTAINER module name according to the real module name.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoeeepc-laptop: fix oops when changing backlight brightness during eeepc-laptop init
Darren Salt [Sat, 7 Feb 2009 06:02:07 +0000 (01:02 -0500)]
eeepc-laptop: fix oops when changing backlight brightness during eeepc-laptop init

I got the following oops while changing the backlight brightness during
startup.  When it happens, it prevents use of the hotkeys, Fn-Fx, and the
lid button.

It's a clear use-before-init, as I verified by testing with an
appropriately-placed "else printk".

BUG: unable to handle kernel NULL pointer dereference at 00000000
*pde = 00000000
Oops: 0002 [#1] PREEMPT SMP
Pid: 160, comm: kacpi_notify Not tainted (2.6.28.1-eee901 #4) 901
EIP: 0060:[<c0264e68>]  [<c0264e68>] eeepc_hotk_notify+26/da
EFLAGS: 00010246 CPU: 1
Using defaults from ksymoops -t elf32-i386 -a i386
EAX: 00000009 EBX: 00000000 ECX: 00000009 EDX: f70dbf64
ESI: 00000029 EDI: f7335188 EBP: c02112c9 ESP: f70dbf80
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 f70731e0 f73acd50 c02164ac f7335180 f70aa040 c02112e6 f733518c c012b62f
 f70aa044 f70aa040 c012bdba f70aa04c 00000000 c012be6e 00000000 f70bdf80
 c012e198 f70dbfc4 f70dbfc4 f70aa040 c012bdba 00000000 c012e0c9 c012e091
Call Trace:
 [<c02164ac>] ? acpi_ev_notify_dispatch+4c/55
 [<c02112e6>] ? acpi_os_execute_deferred+1d/25
 [<c012b62f>] ? run_workqueue+71/f1
 [<c012bdba>] ? worker_thread+0/bf
 [<c012be6e>] ? worker_thread+b4/bf
 [<c012e198>] ? autoremove_wake_function+0/2b
 [<c012bdba>] ? worker_thread+0/bf
 [<c012e0c9>] ? kthread+38/5f
 [<c012e091>] ? kthread+0/5f
 [<c0103abf>] ? kernel_thread_helper+7/10
Code: 00 00 00 00 c3 83 3d 60 5c 50 c0 00 56 89 d6 53 0f 84 c4 00 00 00 8d 42
e0 83 f8 0f 77 0f 8b 1d 68 5c 50 c0 89 d8 e8 a9 fa ff ff <89> 03 8b 1d 60 5c
50 c0 89 f2 83 e2 7f 0f b7 4c 53 10 8d 41 01

Signed-off-by: Darren Salt <linux@youmustbejoking.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoACPICA: Fix table entry truncation calculation
Myron Stowe [Fri, 30 Jan 2009 22:44:53 +0000 (15:44 -0700)]
ACPICA: Fix table entry truncation calculation

During early boot, ACPI RSDT/XSDT table entries are gathered into the
'initial_tables[]' array.  This array is currently statically defined (see
./drivers/acpi/tables.c).  When there are more table entries than can be
held in the 'initial_tables[]' array, the message "Truncating N table
entries!" is output.  As currently implemented, this message will always
erroneously calculate N as 0.

This patch fixes the calculation that determines how many table entries
will be missing (truncated).

This modification may be used under either the GPL or the BSD-style
license used for Intel ACPI CA code.

Signed-off-by: Myron Stowe <myron.stowe@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoACPI: Enable bit 11 in _PDC to advertise hw coord
Pallipadi, Venkatesh [Mon, 2 Feb 2009 19:57:18 +0000 (11:57 -0800)]
ACPI: Enable bit 11 in _PDC to advertise hw coord

Bit 11 in intel PDC definitions is meant for OS capability to handle
hardware coordination of P-states. In Linux we have always supported
hwardware coordination of P-states. Just let the BIOSes know that we
support it, by setting this bit.

Some BIOSes use this bit to choose between hardware or software coordination
and without this change below, BIOSes switch to software coordination, which
is not very optimal in terms of power consumption and extra wakeups from idle.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoACPI: struct device - replace bus_id with dev_name(), dev_set_name()
Kay Sievers [Sun, 25 Jan 2009 22:40:56 +0000 (23:40 +0100)]
ACPI: struct device - replace bus_id with dev_name(), dev_set_name()

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoACPI: add missing KERN_* constants to printks
Frank Seidel [Wed, 4 Feb 2009 16:03:07 +0000 (17:03 +0100)]
ACPI: add missing KERN_* constants to printks

According to kerneljanitors todo list all printk calls (beginning
a new line) should have an according KERN_* constant.
Those are the missing peaces here for the acpi subsystem.

Signed-off-by: Frank Seidel <frank@f-seidel.de>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoACPI: dock: Don't eval _STA on every show_docked sysfs read
Holger Macht [Tue, 20 Jan 2009 11:18:24 +0000 (12:18 +0100)]
ACPI: dock: Don't eval _STA on every show_docked sysfs read

Some devices trigger a DEVICE_CHECK on every evalutation of _STA. This
can also be seen in commit 8b59560a3baf2e7c24e0fb92ea5d09eca92805db
(ACPI: dock: avoid check _STA method).  If an undock is processed, the
dock driver sends a uevent and userspace might read the show_docked
property in sysfs. This causes an evaluation of _STA of the particular
device which causes the dock driver to immediately dock again.

In any case, evaluation of _STA (show_docked) does not necessarily mean
that we are docked, so check with the internal device structure.

http://bugzilla.kernel.org/show_bug.cgi?id=12360

Signed-off-by: Holger Macht <hmacht@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Sat, 7 Feb 2009 02:52:55 +0000 (18:52 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  CRED: Fix SUID exec regression

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
Linus Torvalds [Sat, 7 Feb 2009 02:37:22 +0000 (18:37 -0800)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (37 commits)
  Btrfs: Make sure dir is non-null before doing S_ISGID checks
  Btrfs: Fix memory leak in cache_drop_leaf_ref
  Btrfs: don't return congestion in write_cache_pages as often
  Btrfs: Only prep for btree deletion balances when nodes are mostly empty
  Btrfs: fix btrfs_unlock_up_safe to walk the entire path
  Btrfs: change btrfs_del_leaf to drop locks earlier
  Btrfs: Change btrfs_truncate_inode_items to stop when it hits the inode
  Btrfs: Don't try to compress pages past i_size
  Btrfs: join the transaction in __btrfs_setxattr
  Btrfs: Handle SGID bit when creating inodes
  Btrfs: Make btrfs_drop_snapshot work in larger and more efficient chunks
  Btrfs: Change btree locking to use explicit blocking points
  Btrfs: hash_lock is no longer needed
  Btrfs: disable leak debugging checks in extent_io.c
  Btrfs: sort references by byte number during btrfs_inc_ref
  Btrfs: async threads should try harder to find work
  Btrfs: selinux support
  Btrfs: make btrfs acls selectable
  Btrfs: Catch missed bios in the async bio submission thread
  Btrfs: fix readdir on 32 bit machines
  ...

15 years agoeCryptfs: Regression in unencrypted filename symlinks
Tyler Hicks [Sat, 7 Feb 2009 00:06:51 +0000 (18:06 -0600)]
eCryptfs: Regression in unencrypted filename symlinks

The addition of filename encryption caused a regression in unencrypted
filename symlink support.  ecryptfs_copy_filename() is used when dealing
with unencrypted filenames and it reported that the new, copied filename
was a character longer than it should have been.

This caused the return value of readlink() to count the NULL byte of the
symlink target.  Most applications don't care about the extra NULL byte,
but a version control system (bzr) helped in discovering the bug.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'x86/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux...
Linus Torvalds [Sat, 7 Feb 2009 02:36:02 +0000 (18:36 -0800)]
Merge branch 'x86/fixes' of git://git./linux/kernel/git/frob/linux-2.6-roland

* 'x86/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
  x86-64: fix int $0x80 -ENOSYS return

15 years agox86-64: fix int $0x80 -ENOSYS return
Roland McGrath [Sat, 7 Feb 2009 02:15:18 +0000 (18:15 -0800)]
x86-64: fix int $0x80 -ENOSYS return

One of my past fixes to this code introduced a different new bug.
When using 32-bit "int $0x80" entry for a bogus syscall number,
the return value is not correctly set to -ENOSYS.  This only happens
when neither syscall-audit nor syscall tracing is enabled (i.e., never
seen if auditd ever started).  Test program:

/* gcc -o int80-badsys -m32 -g int80-badsys.c
   Run on x86-64 kernel.
   Note to reproduce the bug you need auditd never to have started.  */

#include <errno.h>
#include <stdio.h>

int
main (void)
{
  long res;
  asm ("int $0x80" : "=a" (res) : "0" (99999));
  printf ("bad syscall returns %ld\n", res);
  return res != -ENOSYS;
}

The fix makes the int $0x80 path match the sysenter and syscall paths.

Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Roland McGrath <roland@redhat.com>
15 years agoMerge branch 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux...
Linus Torvalds [Sat, 7 Feb 2009 02:10:04 +0000 (18:10 -0800)]
Merge branch 'to-linus' of git://git./linux/kernel/git/frob/linux-2.6-roland

* 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
  elf core dump: fix get_user use

15 years agoelf core dump: fix get_user use
Roland McGrath [Sat, 7 Feb 2009 01:34:07 +0000 (17:34 -0800)]
elf core dump: fix get_user use

The elf_core_dump() code does its work with set_fs(KERNEL_DS) in force,
so vma_dump_size() needs to switch back with set_fs(USER_DS) to safely
use get_user() for a normal user-space address.

Checking for VM_READ optimizes out the case where get_user() would fail
anyway.  The vm_file check here was already superfluous given the control
flow earlier in the function, so that is a cleanup/optimization unrelated
to other changes but an obvious and trivial one.

Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
15 years agoCRED: Fix SUID exec regression
David Howells [Fri, 6 Feb 2009 11:45:46 +0000 (11:45 +0000)]
CRED: Fix SUID exec regression

The patch:

commit a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
CRED: Make execve() take advantage of copy-on-write credentials

moved the place in which the 'safeness' of a SUID/SGID exec was performed to
before de_thread() was called.  This means that LSM_UNSAFE_SHARE is now
calculated incorrectly.  This flag is set if any of the usage counts for
fs_struct, files_struct and sighand_struct are greater than 1 at the time the
determination is made.  All of which are true for threads created by the
pthread library.

However, since we wish to make the security calculation before irrevocably
damaging the process so that we can return it an error code in the case where
we decide we want to reject the exec request on this basis, we have to make the
determination before calling de_thread().

So, instead, we count up the number of threads (CLONE_THREAD) that are sharing
our fs_struct (CLONE_FS), files_struct (CLONE_FILES) and sighand_structs
(CLONE_SIGHAND/CLONE_THREAD) with us.  These will be killed by de_thread() and
so can be discounted by check_unsafe_exec().

We do have to be careful because CLONE_THREAD does not imply FS or FILES.

We _assume_ that there will be no extra references to these structs held by the
threads we're going to kill.

This can be tested with the attached pair of programs.  Build the two programs
using the Makefile supplied, and run ./test1 as a non-root user.  If
successful, you should see something like:

[dhowells@andromeda tmp]$ ./test1
--TEST1--
uid=4043, euid=4043 suid=4043
exec ./test2
--TEST2--
uid=4043, euid=0 suid=0
SUCCESS - Correct effective user ID

and if unsuccessful, something like:

[dhowells@andromeda tmp]$ ./test1
--TEST1--
uid=4043, euid=4043 suid=4043
exec ./test2
--TEST2--
uid=4043, euid=4043 suid=4043
ERROR - Incorrect effective user ID!

The non-root user ID you see will depend on the user you run as.

[test1.c]
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>

static void *thread_func(void *arg)
{
while (1) {}
}

int main(int argc, char **argv)
{
pthread_t tid;
uid_t uid, euid, suid;

printf("--TEST1--\n");
getresuid(&uid, &euid, &suid);
printf("uid=%d, euid=%d suid=%d\n", uid, euid, suid);

if (pthread_create(&tid, NULL, thread_func, NULL) < 0) {
perror("pthread_create");
exit(1);
}

printf("exec ./test2\n");
execlp("./test2", "test2", NULL);
perror("./test2");
_exit(1);
}

[test2.c]
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
uid_t uid, euid, suid;

getresuid(&uid, &euid, &suid);
printf("--TEST2--\n");
printf("uid=%d, euid=%d suid=%d\n", uid, euid, suid);

if (euid != 0) {
fprintf(stderr, "ERROR - Incorrect effective user ID!\n");
exit(1);
}
printf("SUCCESS - Correct effective user ID\n");
exit(0);
}

[Makefile]
CFLAGS = -D_GNU_SOURCE -Wall -Werror -Wunused
all: test1 test2

test1: test1.c
gcc $(CFLAGS) -o test1 test1.c -lpthread

test2: test2.c
gcc $(CFLAGS) -o test2 test2.c
sudo chown root.root test2
sudo chmod +s test2

Reported-by: David Smith <dsmith@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: David Smith <dsmith@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
15 years agovfs: Don't call attach_nobh_buffers() with an empty list
Dave Kleikamp [Fri, 6 Feb 2009 20:59:26 +0000 (14:59 -0600)]
vfs: Don't call attach_nobh_buffers() with an empty list

This is a modification of a patch by Bill Pemberton <wfp5p@virginia.edu>

nobh_write_end() could call attach_nobh_buffers() with head == NULL.
This would result in a trap when attach_nobh_buffers() attempted to
access bh->b_this_page.

This can be illustrated by running the writev01 testcase from LTP on jfs.

This error was introduced by commit 5b41e74a "vfs: fix data leak in
nobh_write_end()".  That patch did not take into account that if
PageMappedToDisk() is true upon entry to nobh_write_begin(), then no
buffers will be allocated for the page.  In that case, we won't have to
worry about a failed write leaving unitialized data in the page.

Of course, head != NULL implies !page_has_buffers(page), so no need to
test both.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Dmitri Monakhov <dmonakhov@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Fri, 6 Feb 2009 19:14:23 +0000 (11:14 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Add missing COEF initialization for ALC887
  ALSA: hda - Add missing initialization for ALC272
  sound: usb-audio: handle wMaxPacketSize for FIXED_ENDPOINT devices
  ALSA: hda - Fix misc workqueue issues
  ALSA: hda - Add quirk for FSC Amilo Xi2550

15 years agoACPI: disable ACPI cleanly when bad RSDP found
Len Brown [Fri, 6 Feb 2009 19:00:56 +0000 (14:00 -0500)]
ACPI: disable ACPI cleanly when bad RSDP found

When ACPI is disabled in the BIOS of this VIA C3 box,
it invalidates the RSDP, which Linux notices:

ACPI Error (tbxfroot-0218): A valid RSDP was not found [20080926]

Bug Linux neglected to disable ACPI at that stage,
and later scribbled on smp_found_config:

ACPI: No APIC-table, disabling MPS

But this box doesn't run well in legacy PIC mode,
it needed IOAPIC mode to perform correctly:

http://lkml.org/lkml/2009/2/5/39

So exit ACPI mode cleanly when we first detect
that it is hopeless.

Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoACPI: delete CPU_IDLE=n code
Len Brown [Fri, 6 Feb 2009 17:24:17 +0000 (12:24 -0500)]
ACPI: delete CPU_IDLE=n code

CPU_IDLE=y has been default for ACPI=y since Nov-2007,
and has shipped in many distributions since then.

Here we delete the CPU_IDLE=n ACPI idle code, since
nobody should be using it, and we don't want to
maintain two versions.

Signed-off-by: Len Brown <len.brown@intel.com>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394...
Linus Torvalds [Fri, 6 Feb 2009 16:48:16 +0000 (08:48 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  ieee1394: dv1394: move deprecation message from module init to file open
  firewire: core: Remove card from list of cards when enable fails

15 years agoAdd Sascha Hauer to .mailmap
Uwe Kleine-König [Fri, 6 Feb 2009 13:53:18 +0000 (14:53 +0100)]
Add Sascha Hauer to .mailmap

This fixes the shortlog attribution e.g. for 106757b38fff

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoadd another mailmap entry for Uwe Kleine-König
Uwe Kleine-König [Fri, 6 Feb 2009 13:53:19 +0000 (14:53 +0100)]
add another mailmap entry for Uwe Kleine-König

I created commit 7971db5a4b4176ad5df590fce07a962c643a2740 on a machine
where I forgot to set user.name and user.email before.  The default
values were not optimal.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agofork.c: fix NULL pointer dereference when nr_threads == threads-max
Li Zefan [Fri, 6 Feb 2009 08:17:19 +0000 (08:17 +0000)]
fork.c: fix NULL pointer dereference when nr_threads == threads-max

I happened to forked lots of processes, and hit NULL pointer dereference.
It is because in copy_process() after checking max_threads, 0 is returned
but not -EAGAIN.

The bug is introduced by "CRED: Detach the credentials from task_struct"
(commit f1752eec6145c97163dbce62d17cf5d928e28a27).

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoBtrfs: Make sure dir is non-null before doing S_ISGID checks
Chris Mason [Fri, 6 Feb 2009 16:35:57 +0000 (11:35 -0500)]
Btrfs: Make sure dir is non-null before doing S_ISGID checks

The S_ISGID check in btrfs_new_inode caused an oops during subvol creation
because sometimes the dir is null.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
15 years agoMerge branch 'for-linus' of git://neil.brown.name/md
Linus Torvalds [Fri, 6 Feb 2009 15:41:10 +0000 (07:41 -0800)]
Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
  md: Ensure an md array never has too many devices.
  md: Fix a bug in linear.c causing which_dev() to return the wrong device.
  md: Allow read error in a single drive raid1 to be passed up.

15 years agoieee1394: dv1394: move deprecation message from module init to file open
Stefan Richter [Tue, 3 Feb 2009 16:54:31 +0000 (17:54 +0100)]
ieee1394: dv1394: move deprecation message from module init to file open

On many Linux installations, the dv1394 driver will be auto-loaded
whenever an AV/C device (e.g. camcorder or audio device) is plugged in.
An irritating message would then appear in the kernel log.

Defer this message to until a dv1394 character device file is actually
used by a program.  Also include the program name in the message and
update the message slightly.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
15 years agoMerge branch 'fix/usb-audio' into for-linus
Takashi Iwai [Fri, 6 Feb 2009 13:25:13 +0000 (14:25 +0100)]
Merge branch 'fix/usb-audio' into for-linus

15 years agoMerge branch 'fix/hda' into for-linus
Takashi Iwai [Fri, 6 Feb 2009 13:25:04 +0000 (14:25 +0100)]
Merge branch 'fix/hda' into for-linus

15 years agoALSA: hda - Add missing COEF initialization for ALC887
Takashi Iwai [Fri, 6 Feb 2009 11:46:59 +0000 (12:46 +0100)]
ALSA: hda - Add missing COEF initialization for ALC887

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: hda - Add missing initialization for ALC272
Takashi Iwai [Fri, 6 Feb 2009 11:45:52 +0000 (12:45 +0100)]
ALSA: hda - Add missing initialization for ALC272

ALC272 needs EAPD for speaker outputs as well as other similar ALC
codecs.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agosound: usb-audio: handle wMaxPacketSize for FIXED_ENDPOINT devices
Clemens Ladisch [Fri, 6 Feb 2009 07:13:07 +0000 (08:13 +0100)]
sound: usb-audio: handle wMaxPacketSize for FIXED_ENDPOINT devices

For audio devices that do not have proper audio descriptors (e.g.,
Edirol UA-20), we use hardcoded parameters from our quirks list.
However, we must still read the maximum packet size from the standard
endpoint descriptor; otherwise, we might use packets that are too big
and therefore rejected by the USB core.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agomd: Ensure an md array never has too many devices.
NeilBrown [Fri, 6 Feb 2009 07:02:46 +0000 (18:02 +1100)]
md: Ensure an md array never has too many devices.

Each different metadata format supported by md supports a
different maximum number of devices.
We really should be enforcing this maximum in the kernel, but
we aren't quite doing that properly.

We currently only enforce it at the 'hot_add' point, which is an
older interface which is not used by current userspace.

We need to also enforce it at 'add_new_disk' time for active arrays
and at 'do_md_run' time when starting a new array.

So move the test from 'hot_add' into 'bind_rdev_to_array' which is
called from both 'hot_add' and 'add_new_disk, and add a new
test in 'analyse_sbs' which is called from 'do_md_run'.

This bug (or missing feature) has been around "forever" and so
the patch is suitable for any -stable that is currently maintained.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: Fix a bug in linear.c causing which_dev() to return the wrong device.
Andre Noll [Fri, 6 Feb 2009 04:10:52 +0000 (15:10 +1100)]
md: Fix a bug in linear.c causing which_dev() to return the wrong device.

ab5bd5cbc8d4b868378d062eed3d4240930fbb86 introduced the following
bug in linear software raid for large arrays on 32 bit machines:

which_dev() computes the device holding a given sector by shifting
down the sector number to a 32 bit range, dividing by the array
spacing and looking up the resulting index in the hash table of
the array.

Because the computed index might be slightly too small, a loop at
the end of which_dev() increases the index until the given sector
actually falls into the range of the device associated with that index.

The changes of the above mentioned commit caused this loop to check
whether the _index_ rather than the sector number is small enough,
effectively bypassing the loop and thus possibly returning the wrong
device.

As reported by Simon Kirby, this leads to errors such as

linear_make_request: Sector 2340486136 out of bounds on dev sdi: 156301312 sectors, offset 2109870464

Fix this bug by introducing a local variable for the index so that
the variable containing the passed sector is left unchanged.

Cc: stable@kernel.org
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
15 years agomd: Allow read error in a single drive raid1 to be passed up.
NeilBrown [Fri, 6 Feb 2009 04:06:47 +0000 (15:06 +1100)]
md: Allow read error in a single drive raid1 to be passed up.

If a raid1 only has a single working device and gets a read error,
we choose to simply return that error up to the filesystem (or whatever)
rather than failing the whole array.

However the codes doesn't quite do that.  We attempt a readbalance
which allocates the same drive, so we retry the read - indefinitely.

Instead:  If read_balance in the error case chooses the same drive that just
failed, treat it as a failure and don't retry.

Signed-off-by: NeilBrown <neilb@suse.de>
15 years agoprevent kprobes from catching spurious page faults
Masami Hiramatsu [Thu, 5 Feb 2009 22:12:39 +0000 (17:12 -0500)]
prevent kprobes from catching spurious page faults

Prevent kprobes from catching spurious faults which will cause infinite
recursive page-fault and memory corruption by stack overflow.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agobraino in sg_ioctl_trans()
Al Viro [Fri, 6 Feb 2009 00:32:27 +0000 (00:32 +0000)]
braino in sg_ioctl_trans()

... and yes, gcc is insane enough to eat that without complaint.
We probably want sparse to scream on those...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoring_buffer: remove unused flags parameter, fix
Ingo Molnar [Fri, 6 Feb 2009 00:12:02 +0000 (01:12 +0100)]
ring_buffer: remove unused flags parameter, fix

Oprofile's ring-buffer use was not considered.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfashe...
Linus Torvalds [Fri, 6 Feb 2009 00:12:38 +0000 (16:12 -0800)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  Revert "configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item()"

15 years agoMerge branch 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal...
Linus Torvalds [Fri, 6 Feb 2009 00:11:54 +0000 (16:11 -0800)]
Merge branch 'sh/for-2.6.29' of git://git./linux/kernel/git/lethal/sh-2.6

* 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Fix up T-bit error handling in SH-4A mutex fastpath.
  sh: Fix up spurious syscall restarting.
  sh: fcnvds fix with denormalized numbers on SH-4 FPU.
  sh: Only reserve memory under CONFIG_ZERO_PAGE_OFFSET when it != 0.
  sh: Handle calling csum_partial with misaligned data
  sh: ap325rxa: Enable ov772x in defconfig.
  sh: ap325rxa: Add ov772x support.
  sh: ap325rxa: control camera power toggling.
  sh: mach-migor: Enable ov772x and tw9910 in defconfig.

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 6 Feb 2009 00:11:32 +0000 (16:11 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  Revert "tcp: Always set urgent pointer if it's beyond snd_nxt"
  ipv6: Copy cork options in ip6_append_data
  udp: Fix UDP short packet false positive
  gianfar: Fix potential soft reset race
  gianfar: Fix BD_LENGTH_MASK definition
  cxgb3: Fix lro switch
  iwlwifi: save PCI state before suspend, restore after resume
  iwlwifi: clean key table in iwl_clear_stations_table

15 years agotrace: Call tracing_reset_online_cpus before tracer->init()
Arnaldo Carvalho de Melo [Thu, 5 Feb 2009 20:02:00 +0000 (18:02 -0200)]
trace: Call tracing_reset_online_cpus before tracer->init()

Impact: cleanup

To make it easy for ftrace plugin writers, as this was open coded in
the existing plugins

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agotracing: Introduce trace_buffer_{lock_reserve,unlock_commit}
Arnaldo Carvalho de Melo [Thu, 5 Feb 2009 18:14:13 +0000 (16:14 -0200)]
tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}

Impact: new API

These new functions do what previously was being open coded, reducing
the number of details ftrace plugin writers have to worry about.

It also standardizes the handling of stacktrace, userstacktrace and
other trace options we may introduce in the future.

With this patch, for instance, the blk tracer (and some others already
in the tree) can use the "userstacktrace" /d/tracing/trace_options
facility.

$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
  trace_vprintk              |   -5
  trace_graph_return         |  -22
  trace_graph_entry          |  -26
  trace_function             |  -45
  __ftrace_trace_stack       |  -27
  ftrace_trace_userstack     |  -29
  tracing_sched_switch_trace |  -66
  tracing_stop               |   +1
  trace_seq_to_user          |   -1
  ftrace_trace_special       |  -63
  ftrace_special             |   +1
  tracing_sched_wakeup_trace |  -70
  tracing_reset_online_cpus  |   -1
 13 functions changed, 2 bytes added, 355 bytes removed, diff: -353

linux-2.6-tip/block/blktrace.c:
  __blk_add_trace |  -58
 1 function changed, 58 bytes removed, diff: -58

linux-2.6-tip/kernel/trace/trace.c:
  trace_buffer_lock_reserve  |  +88
  trace_buffer_unlock_commit |  +86
 2 functions changed, 174 bytes added, diff: +174

/tmp/vmlinux.after:
 16 functions changed, 176 bytes added, 413 bytes removed, diff: -237

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoring_buffer: remove unused flags parameter
Arnaldo Carvalho de Melo [Thu, 5 Feb 2009 18:12:56 +0000 (16:12 -0200)]
ring_buffer: remove unused flags parameter

Impact: API change, cleanup

>From ring_buffer_{lock_reserve,unlock_commit}.

$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
  trace_vprintk              |  -14
  trace_graph_return         |  -14
  trace_graph_entry          |  -10
  trace_function             |   -8
  __ftrace_trace_stack       |   -8
  ftrace_trace_userstack     |   -8
  tracing_sched_switch_trace |   -8
  ftrace_trace_special       |  -12
  tracing_sched_wakeup_trace |   -8
 9 functions changed, 90 bytes removed, diff: -90

linux-2.6-tip/block/blktrace.c:
  __blk_add_trace |   -1
 1 function changed, 1 bytes removed, diff: -1

/tmp/vmlinux.after:
 10 functions changed, 91 bytes removed, diff: -91

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoRevert "tcp: Always set urgent pointer if it's beyond snd_nxt"
David S. Miller [Thu, 5 Feb 2009 23:38:31 +0000 (15:38 -0800)]
Revert "tcp: Always set urgent pointer if it's beyond snd_nxt"

This reverts commit 64ff3b938ec6782e6585a83d5459b98b0c3f6eb8.

Jeff Chua reports that it breaks rlogin for him.

Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoipv6: Copy cork options in ip6_append_data
Herbert Xu [Thu, 5 Feb 2009 23:15:50 +0000 (15:15 -0800)]
ipv6: Copy cork options in ip6_append_data

As the options passed to ip6_append_data may be ephemeral, we need
to duplicate it for corking.  This patch applies the simplest fix
which is to memdup all the relevant bits.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Thu, 5 Feb 2009 23:08:11 +0000 (15:08 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-2.6

15 years agoudp: Fix UDP short packet false positive
Jesper Dangaard Brouer [Thu, 5 Feb 2009 23:05:45 +0000 (15:05 -0800)]
udp: Fix UDP short packet false positive

The UDP header pointer assignment must happen after calling
pskb_may_pull().  As pskb_may_pull() can potentially alter the SKB
buffer.

This was exposted by running multicast traffic through the NIU driver,
as it won't prepull the protocol headers into the linear area on
receive.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoseq_file: fix big-enough lseek() + read()
Alexey Dobriyan [Thu, 5 Feb 2009 21:30:05 +0000 (00:30 +0300)]
seq_file: fix big-enough lseek() + read()

lseek() further than length of the file will leave stale ->index
(second-to-last during iteration). Next seq_read() will not notice
that ->f_pos is big enough to return 0, but will print last item
as if ->f_pos is pointing to it.

Introduced in commit cb510b8172602a66467f3551b4be1911f5a7c8c2
aka "seq_file: more atomicity in traverse()".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoseq_file: move traverse so it can be used from seq_read
Eric Biederman [Wed, 4 Feb 2009 23:12:25 +0000 (15:12 -0800)]
seq_file: move traverse so it can be used from seq_read

In 2.6.25 some /proc files were converted to use the seq_file
infrastructure.  But seq_files do not correctly support pread(), which
broke some usersapce applications.

To handle pread correctly we can't assume that f_pos is where we left it
in seq_read.  So move traverse() so that we can eventually use it in
seq_read and do thus some day support pread().

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Cc: Paul Turner <pjt@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosgi-xp: fix writing past the end of kzalloc()'d space
Dean Nelson [Wed, 4 Feb 2009 23:12:24 +0000 (15:12 -0800)]
sgi-xp: fix writing past the end of kzalloc()'d space

A missing type cast results in writing way beyond the end of a kzalloc()'d
memory segment resulting in slab corruption. But it seems like the better
solution is to define ->recv_msg_slots as a 'void *' rather than a
'struct xpc_notify_mq_msg_uv *' and add the type cast.

Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoalpha: fixup BUG macro
Alexey Dobriyan [Wed, 4 Feb 2009 23:12:21 +0000 (15:12 -0800)]
alpha: fixup BUG macro

Do usual do {} while (0) dance, otherwise

fs/gfs2/util.c:99: error: expected expression before 'else'
drivers/scsi/lpfc/lpfc_sli.c:363: error: expected expression before 'else'

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agosx.c: fix missed unlock_kernel() on error path in sx_fw_ioctl()
Dan Carpenter [Wed, 4 Feb 2009 23:12:20 +0000 (15:12 -0800)]
sx.c: fix missed unlock_kernel() on error path in sx_fw_ioctl()

If we return directly with -EPERM then lock_kernel() is still held.

This was found with a code checker (http://repo.or.cz/w/smatch.git/).

[akpm@linux-foundation.org: fix another such path - missed func_exit()]
Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: <R.E.Wolff@BitWizard.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoatyfb: fix CONFIG_ namespace violations
Randy Dunlap [Wed, 4 Feb 2009 23:12:20 +0000 (15:12 -0800)]
atyfb: fix CONFIG_ namespace violations

Fix namespace violations by changing non-kconfig CONFIG_ names to CNFG_*.

Fixes breakage in staging/, which adds a real CONFIG_PANEL.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agortc-ds1390: fix compilation warnings in drivers/rtc/rtc-ds1390.c
Manish Katiyar [Wed, 4 Feb 2009 23:12:19 +0000 (15:12 -0800)]
rtc-ds1390: fix compilation warnings in drivers/rtc/rtc-ds1390.c

drivers/rtc/rtc-ds1390.c:125: warning: unused variable 'rtc'

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodrivers/video/backlight: rename da903x to da903x_bl
Mike Rapoport [Wed, 4 Feb 2009 23:12:18 +0000 (15:12 -0800)]
drivers/video/backlight: rename da903x to da903x_bl

Currently both da903x backlight and voltage reulator drivers have the
same name. Rename the backlight driver to allow use of both drivers as
modules.

Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Eric Miao <eric.miao@marvell.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoatmel-ssc: fix misuse of dev_dbg when requested ssc instance is not found
Hans-Christian Egtvedt [Wed, 4 Feb 2009 23:12:17 +0000 (15:12 -0800)]
atmel-ssc: fix misuse of dev_dbg when requested ssc instance is not found

The ssc pointer is not valid when the id is not found in the list.
Convert the message from a debug one into an error message and avoid
dereferencing the bad pointer.

Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Huang Weiyi <weiyi.huang@gmail.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodo_wp_page: fix regression with execute in place
Carsten Otte [Wed, 4 Feb 2009 23:12:16 +0000 (15:12 -0800)]
do_wp_page: fix regression with execute in place

Fix do_wp_page for VM_MIXEDMAP mappings.

In the case where pfn_valid returns 0 for a pfn at the beginning of
do_wp_page and the mapping is not shared writable, the code branches to
label `gotten:' with old_page == NULL.

In case the vma is locked (vma->vm_flags & VM_LOCKED), lock_page,
clear_page_mlock, and unlock_page try to access the old_page.

This patch checks whether old_page is valid before it is dereferenced.

The regression was introduced by "mlock: mlocked pages are unevictable"
(commit b291f000393f5a0b679012b39d79fbc85c018233).

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agowait: prevent exclusive waiter starvation
Johannes Weiner [Wed, 4 Feb 2009 23:12:14 +0000 (15:12 -0800)]
wait: prevent exclusive waiter starvation

With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.

Interruptible waiters don't do that when aborting due to a signal.  And if
an aborting waiter is concurrently woken up through the waitqueue, noone
will ever wake up the next waiter.

This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently.  The aborted contender
didn't acquire the lock and therefor never did an unlock followed by
waking up the next waiter.

Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock.  Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.

Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Mentored-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org> ["after some testing"]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomaintainers: general@lists.openfabrics.org is moderated
Randy Dunlap [Wed, 4 Feb 2009 23:12:13 +0000 (15:12 -0800)]
maintainers: general@lists.openfabrics.org is moderated

I got the "list is moderated message," so add it here.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolis3lv02d: add axes knowledge for HP 6710
Martin Kebert [Wed, 4 Feb 2009 23:12:12 +0000 (15:12 -0800)]
lis3lv02d: add axes knowledge for HP 6710

Add support for the HP laptops of model 6710x for having correctly setup
axes.

Signed-off-by: Martin Kebert <gkmarty@gmail.com>
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolis3lv02d: add axes knowledge for HP 6730
Pavel Herrmann [Wed, 4 Feb 2009 23:12:11 +0000 (15:12 -0800)]
lis3lv02d: add axes knowledge for HP 6730

Add support for the HP laptops of model 6730x for having correctly setup
axes.

Signed-off-by: Pavel Herrmann <morpheus.ibis@gmail.com>
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolis3lv02d: add axes knowledge for HP 6530
Eric Piel [Wed, 4 Feb 2009 23:12:11 +0000 (15:12 -0800)]
lis3lv02d: add axes knowledge for HP 6530

Add support for the HP laptops of model 6530x for having correctly setup
axes.

Reported-by: Jerome Poulin <jeromepoulin@gmail.com>
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolis3lv02d: add axes knowledge for HP 6510b
Jiri Tersel [Wed, 4 Feb 2009 23:12:09 +0000 (15:12 -0800)]
lis3lv02d: add axes knowledge for HP 6510b

According to dmesg my laptop model HP 6510b is not being recognized by this
driver. After I have modified "lis3lv02d.c" axes in Neverball are OK.

Signed-off-by: Jiri Tersel <tersel@mail.muni.cz>
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agohp-wmi: fix error path in hp_wmi_bios_setup()
Andrew Morton [Wed, 4 Feb 2009 23:12:07 +0000 (15:12 -0800)]
hp-wmi: fix error path in hp_wmi_bios_setup()

The error-path code can call rfkill_unregister() with a pointer which does
not contain the result of a call to rfkill_register().  It goes BUG().

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12560.

Cc: Frans Pop <elendil@planet.nl>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Matthew Garrett <mjg@redhat.com>
Reported-by: Helge Deller <deller@gmx.de>
Testted-by: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agorevert "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY"
Andrew Morton [Wed, 4 Feb 2009 23:12:06 +0000 (15:12 -0800)]
revert "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY"

Revert commit 0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f because it causes
(arguably poorly designed) existing userspace to spend interminable
periods closing billions of not-open file descriptors.

We could bring this back, with some sort of opt-in tunable in /proc, which
defaults to "off".

Peter's alanysis follows:

: I spent several hours trying to get to the bottom of a serious
: performance issue that appeared on one of our servers after upgrading to
: 2.6.28.  In the end it's what could be considered a userspace bug that
: was triggered by a change in 2.6.28.  Since this might also affect other
: people I figured I'd at least document what I found here, and maybe we
: can even do something about it:
:
:
: So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
: the team maintaining our ftp archive complained that one of their
: scripts that previously ran in a few minutes still hadn't even come
: close to being done after an hour or so.  Downgrading to 2.6.27 fixed
: that.
:
: Turns out that script is forking a lot and something in it or python or
: whereever closes all the file descriptors it doesn't want to pass on.
: That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
: closes them all with a few exceptions.
:
: Turns out that takes a long time when your limit -n is now 2^20 (1048576).
:
: With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
: now a thousand times that.
:
: 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
: RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
: allows, as the title implies, to set the limit for number of files to
: infinity.
:
: Closer investigation showed that the broken default ulimit did not apply
: to "system" processes (like stuff started from init).  In the end I
: could establish that all processes that passed through pam_limit at one
: point had the bad resource limit.
:
: Apparently the pam library in Debian etch (4.0) initializes the limits
: to some default values when it doesn't have any settings in limit.conf
: to override them.  Turns out that for nofiles this is RLIM_INFINITY.
: Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
: package version 0.79-5 fixes that - tho I'm not sure what side effects
: that has.
:
: Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
: uses a different pam (version).

Reported-by: Peter Palfrader <weasel@debian.org>
Cc: Adam Tkac <vonsch@gmail.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <stable@kernel.org> [2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoshm: fix shmctl(SHM_INFO) lockup with !CONFIG_SHMEM
Tony Battersby [Wed, 4 Feb 2009 23:12:04 +0000 (15:12 -0800)]
shm: fix shmctl(SHM_INFO) lockup with !CONFIG_SHMEM

shm_get_stat() assumes that the inode is a "struct shmem_inode_info",
which is incorrect for !CONFIG_SHMEM (see fs/ramfs/inode.c:
ramfs_get_inode() vs.  mm/shmem.c: shmem_get_inode()).

This bad assumption can cause shmctl(SHM_INFO) to lockup when
shm_get_stat() tries to spin_lock(&info->lock).  Users of !CONFIG_SHMEM
may encounter this lockup simply by invoking the 'ipcs' command.

Reported by Jiri Olsa back in February 2008:
http://lkml.org/lkml/2008/2/29/74

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Reported-by: Jiri Olsa <olsajiri@gmail.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: <stable@kernel.org> [2.6.everything]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>