Linus Torvalds [Fri, 1 Oct 2010 17:53:45 +0000 (10:53 -0700)]
Merge branch 'idle-release' of git://git./linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
intel_idle: Voluntary leave_mm before entering deeper
acpi_idle: add missing \n to printk
intel_idle: add missing __percpu markup
intel_idle: Change mode 755 => 644
cpuidle: Fix typos
intel_idle: PCI quirk to prevent Lenovo Ideapad s10-3 boot hang
Linus Torvalds [Fri, 1 Oct 2010 17:53:06 +0000 (10:53 -0700)]
Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
omap: McBSP: tx_irq_completion used in rx_irq_handler
omap: Fix compile dependency to LEDS_CLASS
Frederic Weisbecker [Thu, 30 Sep 2010 22:15:38 +0000 (15:15 -0700)]
reiserfs: fix unwanted reiserfs lock recursion
Prevent from recursively locking the reiserfs lock in reiserfs_unpack()
because we may call journal_begin() that requires the lock to be taken
only once, otherwise it won't be able to release the lock while taking
other mutexes, ending up in inverted dependencies between the journal
mutex and the reiserfs lock for example.
This fixes:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.35.4.4a #3
-------------------------------------------------------
lilo/1620 is trying to acquire lock:
(&journal->j_mutex){+.+...}, at: [<
d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
but task is already holding lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<
d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[<
c10562b7>] lock_acquire+0x67/0x80
[<
c12facad>] __mutex_lock_common+0x4d/0x410
[<
c12fb0c8>] mutex_lock_nested+0x18/0x20
[<
d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
[<
d0325c06>] do_journal_begin_r+0x86/0x340 [reiserfs]
[<
d0325f77>] journal_begin+0x77/0x140 [reiserfs]
[<
d0315be4>] reiserfs_remount+0x224/0x530 [reiserfs]
[<
c10b6a20>] do_remount_sb+0x60/0x110
[<
c10cee25>] do_mount+0x625/0x790
[<
c10cf014>] sys_mount+0x84/0xb0
[<
c12fca3d>] syscall_call+0x7/0xb
-> #0 (&journal->j_mutex){+.+...}:
[<
c10560f6>] __lock_acquire+0x1026/0x1180
[<
c10562b7>] lock_acquire+0x67/0x80
[<
c12facad>] __mutex_lock_common+0x4d/0x410
[<
c12fb0c8>] mutex_lock_nested+0x18/0x20
[<
d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
[<
d0325f77>] journal_begin+0x77/0x140 [reiserfs]
[<
d0326271>] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
[<
d030d06c>] reiserfs_get_block+0x22c/0x1530 [reiserfs]
[<
c10db9db>] __block_prepare_write+0x1bb/0x3a0
[<
c10dbbe6>] block_prepare_write+0x26/0x40
[<
d030b738>] reiserfs_prepare_write+0x88/0x170 [reiserfs]
[<
d03294d6>] reiserfs_unpack+0xe6/0x120 [reiserfs]
[<
d0329782>] reiserfs_ioctl+0x272/0x320 [reiserfs]
[<
c10c3188>] vfs_ioctl+0x28/0xa0
[<
c10c3bbd>] do_vfs_ioctl+0x32d/0x5c0
[<
c10c3eb3>] sys_ioctl+0x63/0x70
[<
c12fca3d>] syscall_call+0x7/0xb
other info that might help us debug this:
2 locks held by lilo/1620:
#0: (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<
d032945a>] reiserfs_unpack+0x6a/0x120 [reiserfs]
#1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<
d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
stack backtrace:
Pid: 1620, comm: lilo Not tainted 2.6.35.4.4a #3
Call Trace:
[<
c10560f6>] __lock_acquire+0x1026/0x1180
[<
c10562b7>] lock_acquire+0x67/0x80
[<
c12facad>] __mutex_lock_common+0x4d/0x410
[<
c12fb0c8>] mutex_lock_nested+0x18/0x20
[<
d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
[<
d0325f77>] journal_begin+0x77/0x140 [reiserfs]
[<
d0326271>] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
[<
d030d06c>] reiserfs_get_block+0x22c/0x1530 [reiserfs]
[<
c10db9db>] __block_prepare_write+0x1bb/0x3a0
[<
c10dbbe6>] block_prepare_write+0x26/0x40
[<
d030b738>] reiserfs_prepare_write+0x88/0x170 [reiserfs]
[<
d03294d6>] reiserfs_unpack+0xe6/0x120 [reiserfs]
[<
d0329782>] reiserfs_ioctl+0x272/0x320 [reiserfs]
[<
c10c3188>] vfs_ioctl+0x28/0xa0
[<
c10c3bbd>] do_vfs_ioctl+0x32d/0x5c0
[<
c10c3eb3>] sys_ioctl+0x63/0x70
[<
c12fca3d>] syscall_call+0x7/0xb
Reported-by: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: All since 2.6.32 <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frederic Weisbecker [Thu, 30 Sep 2010 22:15:37 +0000 (15:15 -0700)]
reiserfs: fix dependency inversion between inode and reiserfs mutexes
The reiserfs mutex already depends on the inode mutex, so we can't lock
the inode mutex in reiserfs_unpack() without using the safe locking API,
because reiserfs_unpack() is always called with the reiserfs mutex locked.
This fixes:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.35c #13
-------------------------------------------------------
lilo/1606 is trying to acquire lock:
(&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<
d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
but task is already holding lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<
d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[<
c1056347>] lock_acquire+0x67/0x80
[<
c12f083d>] __mutex_lock_common+0x4d/0x410
[<
c12f0c58>] mutex_lock_nested+0x18/0x20
[<
d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
[<
d0329e9a>] reiserfs_lookup_privroot+0x2a/0x90 [reiserfs]
[<
d0316b81>] reiserfs_fill_super+0x941/0xe60 [reiserfs]
[<
c10b7d17>] get_sb_bdev+0x117/0x170
[<
d0313e21>] get_super_block+0x21/0x30 [reiserfs]
[<
c10b74ba>] vfs_kern_mount+0x6a/0x1b0
[<
c10b7659>] do_kern_mount+0x39/0xe0
[<
c10cebe0>] do_mount+0x340/0x790
[<
c10cf0b4>] sys_mount+0x84/0xb0
[<
c12f25cd>] syscall_call+0x7/0xb
-> #0 (&sb->s_type->i_mutex_key#8){+.+.+.}:
[<
c1056186>] __lock_acquire+0x1026/0x1180
[<
c1056347>] lock_acquire+0x67/0x80
[<
c12f083d>] __mutex_lock_common+0x4d/0x410
[<
c12f0c58>] mutex_lock_nested+0x18/0x20
[<
d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
[<
d0329772>] reiserfs_ioctl+0x272/0x320 [reiserfs]
[<
c10c3228>] vfs_ioctl+0x28/0xa0
[<
c10c3c5d>] do_vfs_ioctl+0x32d/0x5c0
[<
c10c3f53>] sys_ioctl+0x63/0x70
[<
c12f25cd>] syscall_call+0x7/0xb
other info that might help us debug this:
1 lock held by lilo/1606:
#0: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<
d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
stack backtrace:
Pid: 1606, comm: lilo Not tainted 2.6.35c #13
Call Trace:
[<
c1056186>] __lock_acquire+0x1026/0x1180
[<
c1056347>] lock_acquire+0x67/0x80
[<
c12f083d>] __mutex_lock_common+0x4d/0x410
[<
c12f0c58>] mutex_lock_nested+0x18/0x20
[<
d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
[<
d0329772>] reiserfs_ioctl+0x272/0x320 [reiserfs]
[<
c10c3228>] vfs_ioctl+0x28/0xa0
[<
c10c3c5d>] do_vfs_ioctl+0x32d/0x5c0
[<
c10c3f53>] sys_ioctl+0x63/0x70
[<
c12f25cd>] syscall_call+0x7/0xb
Reported-by: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: <stable@kernel.org> [2.6.32 and later]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kukjin Kim [Thu, 30 Sep 2010 22:15:35 +0000 (15:15 -0700)]
MAINTAINERS: update maintainer for S5P ARM ARCHITECTURES
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Kyungmin Park <kmpark@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Petr Vandrovec [Thu, 30 Sep 2010 22:15:34 +0000 (15:15 -0700)]
MAINTAINERS: update matroxfb & ncpfs status
I moved couple years ago, so let's update my email and snail mail.
And I do not have any access to Matrox hardware anymore, and I'm quite
unresponsive to matroxfb bug reports (sorry Alan), so saying that I'm
maintainer is a bit far fetched.
For ncpfs I do not use ncpfs in my daily life either, but at least I can
test that one, so I can stay listed here for odd fixes.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Olsa [Thu, 30 Sep 2010 22:15:33 +0000 (15:15 -0700)]
proc: make /proc/pid/limits world readable
Having the limits file world readable will ease the task of system
management on systems where root privileges might be restricted.
Having admin restricted with root priviledges, he/she could not check
other users process' limits.
Also it'd align with most of the /proc stat files.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Eugene Teo <eugene@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don Mullis [Thu, 30 Sep 2010 22:15:32 +0000 (15:15 -0700)]
lib/list_sort: do not pass bad pointers to cmp callback
If the original list is a POT in length, the first callback from line 73
will pass a==b both pointing to the original list_head. This is dangerous
because the 'list_sort()' user can use 'container_of()' and accesses the
"containing" object, which does not necessary exist for the list head. So
the user can access RAM which does not belong to him. If this is a write
access, we can end up with memory corruption.
Signed-off-by: Don Mullis <don.mullis@gmail.com>
Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Rosenberg [Thu, 30 Sep 2010 22:15:31 +0000 (15:15 -0700)]
sys_semctl: fix kernel stack leakage
The semctl syscall has several code paths that lead to the leakage of
uninitialized kernel stack memory (namely the IPC_INFO, SEM_INFO,
IPC_STAT, and SEM_STAT commands) during the use of the older, obsolete
version of the semid_ds struct.
The copy_semid_to_user() function declares a semid_ds struct on the stack
and copies it back to the user without initializing or zeroing the
"sem_base", "sem_pending", "sem_pending_last", and "undo" pointers,
allowing the leakage of 16 bytes of kernel stack memory.
The code is still reachable on 32-bit systems - when calling semctl()
newer glibc's automatically OR the IPC command with the IPC_64 flag, but
invoking the syscall directly allows users to use the older versions of
the struct.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marcin Slusarz [Thu, 30 Sep 2010 22:15:30 +0000 (15:15 -0700)]
i7core_edac: fix panic in udimm sysfs attributes registration
Array of udimm sysfs attributes was not ended with NULL marker, leading to
dereference of random memory.
EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0
EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1
EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2
BUG: unable to handle kernel NULL pointer dereference at
00000000000001a4
IP: [<
ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name
RIP: 0010:[<
ffffffff81330b36>] [<
ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
(...)
Call Trace:
[<
ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1
[<
ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2
[<
ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557
[<
ffffffff81428901>] i7core_probe+0xccf/0xec0
RIP [<
ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1
---[ end trace
20de320855b81d78 ]---
Kernel panic - not syncing: Attempted to kill init!
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 30 Sep 2010 22:15:29 +0000 (15:15 -0700)]
drivers/serial/mrst_max3110.c needs linux/irq.h
sparc64 allmodconfig:
drivers/serial/mrst_max3110.c: In function `serial_m3110_startup':
drivers/serial/mrst_max3110.c:470: error: `IRQ_TYPE_EDGE_FALLING' undeclared (first use in this function)
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 30 Sep 2010 22:15:29 +0000 (15:15 -0700)]
arch/m68k/mac/macboing.c: use unsigned long for irqflags
Fix the warnings
arch/m68k/mac/macboing.c: In function 'mac_mksound':
arch/m68k/mac/macboing.c:189: warning: comparison of distinct pointer types lacks a cast
arch/m68k/mac/macboing.c:211: warning: comparison of distinct pointer types lacks a cast
arch/m68k/mac/macboing.c: In function 'mac_quadra_start_bell':
arch/m68k/mac/macboing.c:241: warning: comparison of distinct pointer types lacks a cast
arch/m68k/mac/macboing.c:263: warning: comparison of distinct pointer types lacks a cast
arch/m68k/mac/macboing.c: In function 'mac_quadra_ring_bell':
arch/m68k/mac/macboing.c:283: warning: comparison of distinct pointer types lacks a cast
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Thu, 30 Sep 2010 22:15:28 +0000 (15:15 -0700)]
drivers/serial/mfd.c needs slab.h
alpha allmodconfig:
drivers/serial/mfd.c:144: error: implicit declaration of function 'kzalloc'
drivers/serial/mfd.c:144: warning: assignment makes pointer from integer without a cast
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ira W. Snyder [Thu, 30 Sep 2010 22:15:27 +0000 (15:15 -0700)]
kfifo: fix scatterlist usage
The kfifo_dma family of functions use sg_mark_end() on the last element in
their scatterlist. This forces use of a fresh scatterlist for each DMA
operation, which makes recycling a single scatterlist impossible.
Change the behavior of the kfifo_dma functions to match the usage of the
dma_map_sg function. This means that users must respect the returned
nents value. The sample code is updated to reflect the change.
This bug is trivial to cause: call kfifo_dma_in_prepare() such that it
prepares a scatterlist with a single entry comprising the whole fifo.
This is the case when you map the entirety of a newly created empty fifo.
This causes the setup_sgl() function to mark the first scatterlist entry
as the end of the chain, no matter what comes after it.
Afterwards, add and remove some data from the fifo such that another call
to kfifo_dma_in_prepare() will create two scatterlist entries. It returns
nents=2. However, due to the previous sg_mark_end() call, sg_is_last()
will now return true for the first scatterlist element. This causes the
sample code to print a single scatterlist element when it should print
two.
By removing the call to sg_mark_end(), we make the API as similar as
possible to the DMA mapping API. All users are required to respect the
returned nents.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Cc: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Suresh Siddha [Fri, 1 Oct 2010 01:19:07 +0000 (21:19 -0400)]
intel_idle: Voluntary leave_mm before entering deeper
Avoid TLB flush IPIs for the cores in deeper c-states by voluntary leave_mm()
before entering into that state. CPUs tend to flush TLB in those c-states
anyways.
acpi_idle does this with C3-type states, but it was not caried over
when intel_idle was introduced. intel_idle can apply it
to C-states in addition to those that ACPI might export as C3...
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Thu, 30 Sep 2010 15:37:38 +0000 (08:37 -0700)]
Fix up more fallout form alpha signal cleanups
Commit
c52c2ddc1dfa ("alpha: switch osf_sigprocmask() to use of
sigprocmask()") had several problems. The more obvious compile issues
got fixed in commit
0f44fbd297e1 ("alpha: fix compile problem in
arch/alpha/kernel/signal.c"), but it also caused a regression.
Since _BLOCKABLE is already the set of signals that can be blocked, the
code should do "newmask & _BLOCKABLE" rather than inverting _BLOCKABLE
before masking.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Patch-by: Al Viro <viro@zeniv.linux.org.uk>
Patch-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 30 Sep 2010 03:38:07 +0000 (20:38 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/jlbec/ocfs2
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
ocfs2: Don't walk off the end of fast symlinks.
Linus Torvalds [Thu, 30 Sep 2010 01:41:19 +0000 (18:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/djbw/async_tx
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: fix interrupt clearing for mv_xor
missing inline keyword for static function in linux/dmaengine.h
dma/shdma: move dereference below the NULL check
Joel Becker [Thu, 30 Sep 2010 00:33:05 +0000 (17:33 -0700)]
ocfs2: Don't walk off the end of fast symlinks.
ocfs2 fast symlinks are NUL terminated strings stored inline in the
inode data area. However, disk corruption or a local attacker could, in
theory, remove that NUL. Because we're using strlen() (my fault,
introduced in a731d1 when removing vfs_follow_link()), we could walk off
the end of that string.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Cc: stable@kernel.org
Linus Torvalds [Wed, 29 Sep 2010 21:58:11 +0000 (14:58 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: force background CIL push under sustained load
Linus Torvalds [Wed, 29 Sep 2010 21:57:53 +0000 (14:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sameo/mfd-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: Fix max8925 irq control bit incorrect setting
mfd: Ignore non-GPIO IRQs when setting wm831x IRQ types
Daniel J Blueman [Wed, 29 Sep 2010 20:01:55 +0000 (21:01 +0100)]
fix OMAP2 MTD build failure
Fix build failure from recent interface change and merge.
Tested on OMAP3430.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Chinner [Fri, 24 Sep 2010 08:13:44 +0000 (18:13 +1000)]
xfs: force background CIL push under sustained load
I have been seeing occasional pauses in transaction throughput up to
30s long under heavy parallel workloads. The only notable thing was
that the xfsaild was trying to be active during the pauses, but
making no progress. It was running exactly 20 times a second (on the
50ms no-progress backoff), and the number of pushbuf events was
constant across this time as well. IOWs, the xfsaild appeared to be
stuck on buffers that it could not push out.
Further investigation indicated that it was trying to push out inode
buffers that were pinned and/or locked. The xfsbufd was also getting
woken at the same frequency (by the xfsaild, no doubt) to push out
delayed write buffers. The xfsbufd was not making any progress
because all the buffers in the delwri queue were pinned. This scan-
and-make-no-progress dance went one in the trace for some seconds,
before the xfssyncd came along an issued a log force, and then
things started going again.
However, I noticed something strange about the log force - there
were way too many IO's issued. 516 log buffers were written, to be
exact. That added up to 129MB of log IO, which got me very
interested because it's almost exactly 25% of the size of the log.
He delayed logging code is suppose to aggregate the minimum of 25%
of the log or 8MB worth of changes before flushing. That's what
really puzzled me - why did a log force write 129MB instead of only
8MB?
Essentially what has happened is that no CIL pushes had occurred
since the previous tail push which cleared out 25% of the log space.
That caused all the new transactions to block because there wasn't
log space for them, but they kick the xfsaild to push the tail.
However, the xfsaild was not making progress because there were
buffers it could not lock and flush, and the xfsbufd could not flush
them because they were pinned. As a result, both the xfsaild and the
xfsbufd could not move the tail of the log forward without the CIL
first committing.
The cause of the problem was that the background CIL push, which
should happen when 8MB of aggregated changes have been committed, is
being held off by the concurrent transaction commit load. The
background push does a down_write_trylock() which will fail if there
is a concurrent transaction commit holding the push lock in read
mode. With 8 CPUs all doing transactions as fast as they can, there
was enough concurrent transaction commits to hold off the background
push until tail-pushing could no longer free log space, and the halt
would occur.
It should be noted that there is no reason why it would halt at 25%
of log space used by a single CIL checkpoint. This bug could
definitely violate the "no transaction should be larger than half
the log" requirement and hence result in corruption if the system
crashed under heavy load. This sort of bug is exactly the reason why
delayed logging was tagged as experimental....
The fix is to start blocking background pushes once the threshold
has been exceeded. Rework the threshold calculations to keep the
amount of log space a CIL checkpoint can use to below that of the
AIL push threshold to avoid the problem completely.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Kevin Liu [Wed, 8 Sep 2010 13:44:36 +0000 (09:44 -0400)]
mfd: Fix max8925 irq control bit incorrect setting
In max8925_irq_sync_unlock(), irq control bit is set at the same time.
Zero means enabling irq, and one means disabling irq.
The original code is:
irq_chg[0] &= irq_data->enable;
It should be changed to:
irq_chg[0] &= ~irq_data->enable;
Otherwise, irq control bit is mess.
Signed-off-by: Kevin Liu <kliu5@marvell.com>
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Mon, 16 Aug 2010 19:26:51 +0000 (20:26 +0100)]
mfd: Ignore non-GPIO IRQs when setting wm831x IRQ types
The driver was originally tested with an additional patch which
made this unneeded but that patch had issuges and got lost on the
way to mainline, causing problems when the errors are reported.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Cc: stable@kernel.org
Len Brown [Wed, 29 Sep 2010 03:30:58 +0000 (23:30 -0400)]
Merge branch 'meego-7093' into idle-release
Len Brown [Sat, 25 Sep 2010 00:50:02 +0000 (20:50 -0400)]
acpi_idle: add missing \n to printk
otherwise, these two lines print as one:
ACPI: acpi_idle yielding to intel_idle
ACPI: SSDT
3f5d8741 00203 (v02 PmRef Cpu0Ist
00003000 INTL
20050624)
Signed-off-by: Len Brown <len.brown@intel.com>
Namhyung Kim [Sat, 7 Aug 2010 18:10:03 +0000 (03:10 +0900)]
intel_idle: add missing __percpu markup
intel_idle_cpuidle_devices is a percpu pointer
but was missing __percpu markup.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Thomas Weber [Mon, 12 Jul 2010 06:56:43 +0000 (08:56 +0200)]
intel_idle: Change mode 755 => 644
Remove execution permission from source file.
Signed-off-by: Thomas Weber <weber@corscience.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Lucas De Marchi [Tue, 7 Sep 2010 16:53:49 +0000 (12:53 -0400)]
cpuidle: Fix typos
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Wed, 29 Sep 2010 01:01:22 +0000 (18:01 -0700)]
Linux 2.6.36-rc6
David Howells [Wed, 29 Sep 2010 00:57:02 +0000 (01:57 +0100)]
MN10300: Handle missing sys_cacheflush() when caching disabled
When caching is disabled on the MN10300 arch, the sys_cacheflush()
function is removed by conditional stuff in the makefiles, but is still
referred to by the syscall table.
Provide a null version that just returns 0 when caching is disabled (or
-EINVAL if the arguments are silly).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 28 Sep 2010 20:26:57 +0000 (13:26 -0700)]
alpha: fix compile problem in arch/alpha/kernel/signal.c
Tssk. Apparently Al hadn't checked commit
c52c2ddc1dfa ("alpha: switch
osf_sigprocmask() to use of sigprocmask()") at all. It doesn't compile.
Fixed as per suggestions from Michael Cree.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 28 Sep 2010 19:38:52 +0000 (12:38 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ahci: fix module refcount breakage introduced by libahci split
Tejun Heo [Tue, 21 Sep 2010 07:25:48 +0000 (09:25 +0200)]
ahci: fix module refcount breakage introduced by libahci split
libata depends on scsi_host_template for module reference counting and
sht's should be owned by each low level driver. During libahci split,
the sht was left with libahci.ko leaving the actual low level drivers
not reference counted. This made ahci and ahci_platform always
unloadable even while they're being actively used.
Fix it by defining AHCI_SHT() macro in ahci.h and defining a sht for
each low level ahci driver.
stable: only applicable to 2.6.35.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Pedro Francisco <pedrogfrancisco@gmail.com>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Linus Torvalds [Tue, 28 Sep 2010 19:13:13 +0000 (12:13 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon (coretemp): Fix build breakage if SMP is undefined
Linus Torvalds [Tue, 28 Sep 2010 19:02:22 +0000 (12:02 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fix pci_resource_alignment prototype
Linus Torvalds [Tue, 28 Sep 2010 19:01:26 +0000 (12:01 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
tcp: Fix >4GB writes on 64-bit.
net/9p: Mount only matching virtio channels
de2104x: fix ethtool
tproxy: check for transparent flag in ip_route_newports
ipv6: add IPv6 to neighbour table overflow warning
tcp: fix TSO FACK loss marking in tcp_mark_head_lost
3c59x: fix regression from patch "Add ethtool WOL support"
ipv6: add a missing unregister_pernet_subsys call
s390: use free_netdev(netdev) instead of kfree()
sgiseeq: use free_netdev(netdev) instead of kfree()
rionet: use free_netdev(netdev) instead of kfree()
ibm_newemac: use free_netdev(netdev) instead of kfree()
smsc911x: Add MODULE_ALIAS()
net: reset skb queue mapping when rx'ing over tunnel
br2684: fix scheduling while atomic
de2104x: fix TP link detection
de2104x: fix power management
de2104x: disable autonegotiation on broken hardware
net: fix a lockdep splat
e1000e: 82579 do not gate auto config of PHY by hardware during nominal use
...
Guenter Roeck [Tue, 28 Sep 2010 01:01:49 +0000 (18:01 -0700)]
hwmon (coretemp): Fix build breakage if SMP is undefined
Commit
e40cc4bdfd4b89813f072f72bd9c7055814d3f0f introduced
a build breakage if CONFIG_SMP is undefined. This commit
fixes the problem.
This fix is only a workaround. For a real fix, cpu_sibling_mask() should
be defined in UP include code, eg in linux/smp.h, and asm/smp.h should not be
included directly. This fix is currently not possible because asm/smp.h defines
cpu_sibling_mask() unconditionally and is included directly from many source
files.
Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Linus Torvalds [Tue, 28 Sep 2010 04:19:27 +0000 (21:19 -0700)]
Merge branch 'x86/urgent' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Avoid 'constant_test_bit()' misoptimization due to cast to non-volatile
David S. Miller [Tue, 28 Sep 2010 03:24:54 +0000 (20:24 -0700)]
tcp: Fix >4GB writes on 64-bit.
Fixes kernel bugzilla #16603
tcp_sendmsg() truncates iov_len to an 'int' which a 4GB write to write
zero bytes, for example.
There is also the problem higher up of how verify_iovec() works. It
wants to prevent the total length from looking like an error return
value.
However it does this using 'int', but syscalls return 'long' (and
thus signed 64-bit on 64-bit machines). So it could trigger
false-positives on 64-bit as written. So fix it to use 'long'.
Reported-by: Olaf Bonorden <bono@onlinehome.de>
Reported-by: Daniel Büse <dbuese@gmx.de>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Rosenberg [Mon, 27 Sep 2010 16:30:28 +0000 (12:30 -0400)]
Fix pktcdvd ioctl dev_minor range check
The PKT_CTRL_CMD_STATUS device ioctl retrieves a pointer to a
pktcdvd_device from the global pkt_devs array. The index into this
array is provided directly by the user and is a signed integer, so the
comparison to ensure that it falls within the bounds of this array will
fail when provided with a negative index.
This can be used to read arbitrary kernel memory or cause a crash due to
an invalid pointer dereference. This can be exploited by users with
permission to open /dev/pktcdvd/control (on many distributions, this is
readable by group "cdrom").
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
[ Rather than add a cast, just make the function take the right type -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Mon, 27 Sep 2010 12:12:33 +0000 (13:12 +0100)]
MN10300: Default config choice GDBSTUB_TTYSM0 should be GDBSTUB_ON_TTYSM0
The configuration choice for the port on which the GDB stub listens has
a default of GDBSTUB_TTYSM0, but this should be GDBSTUB_ON_TTYSM0 to
match the option.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sven Eckelmann [Mon, 27 Sep 2010 22:54:44 +0000 (15:54 -0700)]
net/9p: Mount only matching virtio channels
p9_virtio_create will only compare the the channel's tag characters
against the device name till the end of the channel's tag but not till
the end of the device name. This means that if a user defines channels
with the tags foo and foobar then he would mount foo when he requested
foonot and may mount foo when he requested foobar.
Thus it is necessary to check both string lengths against each other in
case of a successful partial string match.
Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ondrej Zary [Mon, 27 Sep 2010 11:41:45 +0000 (11:41 +0000)]
de2104x: fix ethtool
When the interface is up, using ethtool breaks it because:
a) link is put down but media_timer interval is not shortened to NO_LINK
b) rxtx is stopped but not restarted
Also manual 10baseT-HD (and probably FD too - untested) mode does not work -
the link is forced up, packets are transmitted but nothing is received.
Changing CSR14 value to match documentation (not disabling link check) fixes this.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 27 Sep 2010 22:04:23 +0000 (15:04 -0700)]
Merge branch 'vhost-net' of git://git./linux/kernel/git/mst/vhost
Ulrich Weber [Mon, 27 Sep 2010 03:31:00 +0000 (03:31 +0000)]
tproxy: check for transparent flag in ip_route_newports
as done in ip_route_connect()
Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ulrich Weber [Mon, 27 Sep 2010 22:02:18 +0000 (15:02 -0700)]
ipv6: add IPv6 to neighbour table overflow warning
IPv4 and IPv6 have separate neighbour tables, so
the warning messages should be distinguishable.
[ Add a suitable message prefix on the ipv4 side as well -DaveM ]
Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Fri, 24 Sep 2010 13:22:06 +0000 (13:22 +0000)]
tcp: fix TSO FACK loss marking in tcp_mark_head_lost
When TCP uses FACK algorithm to mark lost packets in
tcp_mark_head_lost(), if the number of packets in the (TSO) skb is
greater than the number of packets that should be marked lost, TCP
incorrectly exits the loop and marks no packets lost in the skb. This
underestimates tp->lost_out and affects the recovery/retransmission.
This patch fargments the skb and marks the correct amount of packets
lost.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 27 Sep 2010 19:33:54 +0000 (12:33 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/cxgb3: Turn off RX coalescing for iWARP connections
Linus Torvalds [Mon, 27 Sep 2010 19:32:36 +0000 (12:32 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (28 commits)
ARM: 6411/1: vexpress: set RAM latencies to 1 cycle for PL310 on ct-ca9x4 tile
ARM: 6409/1: davinci: map sram using MT_MEMORY_NONCACHED instead of MT_DEVICE
ARM: 6408/1: omap: Map only available sram memory
ARM: 6407/1: mmu: Setup MT_MEMORY and MT_MEMORY_NONCACHED L1 entries
ARM: pxa: remove pr_<level> uses of KERN_<level>
ARM: pxa168fb: clear enable bit when not active
ARM: pxa: fix cpu_is_pxa*() not expanding to zero when not configured
ARM: pxa168: fix corrected reset vector
ARM: pxa: Use PIO for PI2C communication on Palm27x
ARM: pxa: Fix Vpac270 gpio_power for MMC
ARM: 6401/1: plug a race in the alignment trap handler
ARM: 6406/1: at91sam9g45: fix i2c bus speed
leds: leds-ns2: fix locking
ARM: dove: fix __io() definition to use bus based offset
dmaengine: fix interrupt clearing for mv_xor
ARM: kirkwood: Unbreak PCIe I/O port
ARM: Fix build error when using KCONFIG_CONFIG
ARM: 6383/1: Implement phys_mem_access_prot() to avoid attributes aliasing
ARM: 6400/1: at91: fix arch_gettimeoffset fallout
ARM: 6398/1: add proc info for ARM11MPCore/Cortex-A9 from ARM
...
Linus Torvalds [Mon, 27 Sep 2010 19:32:00 +0000 (12:32 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: fix memory handling/allocation in rdma_request()
Linus Torvalds [Mon, 27 Sep 2010 19:31:12 +0000 (12:31 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
amd64_edac: Fix driver module removal
Linus Torvalds [Mon, 27 Sep 2010 19:29:39 +0000 (12:29 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
TOMOYO: Don't abuse sys_getpid(), sys_getppid()
Linus Torvalds [Mon, 27 Sep 2010 19:28:19 +0000 (12:28 -0700)]
Merge branch 'drm-intel-fixes' of git://git./linux/kernel/git/ickle/drm-intel
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel:
drm/i915/sdvo: Handle unsupported GET_SUPPORTED_ENHANCEMENTS gracefully
drm/i915/sdvo: Cleanup connector on error path
drm/i915: Fix 945GM regression in
e259befd
Linus Torvalds [Mon, 27 Sep 2010 19:27:00 +0000 (12:27 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: sdhci-s3c: fix NULL ptr access in sdhci_s3c_remove
mmc: sdhci-s3c: fix incorrect spinlock usage after merge
mmc: MAINTAINERS: add myself as MMC maintainer
Linus Torvalds [Mon, 27 Sep 2010 19:26:33 +0000 (12:26 -0700)]
Merge branch 'urgent' of git://git./linux/kernel/git/brodo/pcmcia-2.6
* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: pd6729: Fix error path
pcmcia: preserve configuration information if request_io fails partly
Linus Torvalds [Mon, 27 Sep 2010 19:25:10 +0000 (12:25 -0700)]
Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
intel-iommu: Use symbolic values instead of magic numbers in Lenovo w/a
intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space
Linus Torvalds [Mon, 27 Sep 2010 19:22:21 +0000 (12:22 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Fix rounding-bug in __unmap_single
x86/amd-iommu: Work around S3 BIOS bug
x86/amd-iommu: Set iommu configuration flags in enable-loop
x86, setup: Fix earlyprintk=serial,0x3f8,115200
x86, setup: Fix earlyprintk=serial,ttyS0,115200
Linus Torvalds [Mon, 27 Sep 2010 19:21:48 +0000 (12:21 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86: Catch spurious interrupts after disabling counters
tracing/x86: Don't use mcount in kvmclock.c
tracing/x86: Don't use mcount in pvclock.c
Al Viro [Sun, 26 Sep 2010 18:29:12 +0000 (19:29 +0100)]
mn10300: check __get_user/__put_user results...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 26 Sep 2010 18:29:02 +0000 (19:29 +0100)]
mn10300: get rid of set_fs(USER_DS) in sigframe setup
It really has no business being there; short of a serious kernel bug
we should already have USER_DS at that point. It shouldn't have been
done on x86 either...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 26 Sep 2010 18:28:52 +0000 (19:28 +0100)]
mn10300: ->restart_block.fn needs to be reset on sigreturn
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 26 Sep 2010 18:28:42 +0000 (19:28 +0100)]
mn10300: prevent double syscall restarts
set ->orig_d0 to -1, same as what sigreturn does
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 26 Sep 2010 18:28:32 +0000 (19:28 +0100)]
mn10300: avoid SIGSEGV delivery loop
force_sigsegv() is there for purpose...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 26 Sep 2010 18:28:22 +0000 (19:28 +0100)]
alpha: __get_user/__put_user results need to be checked...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 26 Sep 2010 18:28:12 +0000 (19:28 +0100)]
alpha: switch osf_sigprocmask() to use of sigprocmask()
get rid of a useless wrapper, while we are at it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Beulich [Mon, 27 Sep 2010 18:07:00 +0000 (11:07 -0700)]
3c59x: fix regression from patch "Add ethtool WOL support"
This patch (commit
690a1f2002a3091bd18a501f46c9530f10481463) added a
new call site for acpi_set_WOL() without checking that the function is
actually suitable to be called via
vortex_set_wol+0xcd/0xe0 [3c59x]
dev_ethtool+0xa5a/0xb70
dev_ioctl+0x2e0/0x4b0
T.961+0x49/0x50
sock_ioctl+0x47/0x290
do_vfs_ioctl+0x7f/0x340
sys_ioctl+0x80/0xa0
system_call_fastpath+0x16/0x1b
i.e. outside of code paths run when the device is not yet enabled or
already disabled. In particular, putting the device into D3hot is a
pretty bad idea when it was already brought up.
Furthermore, all prior callers of the function made sure they're
actually dealing with a PCI device, while the newly added one didn't.
In the same spirit, the .get_wol handler shouldn't indicate support
for WOL for non-PCI devices.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Wise [Sun, 19 Sep 2010 00:38:21 +0000 (19:38 -0500)]
RDMA/cxgb3: Turn off RX coalescing for iWARP connections
The HW by default has RX coalescing on. For iWARP connections, this
causes a 100ms delay in connection establishement due to the ingress
MPA Start message being stalled in HW. So explicitly turn RX
coalescing off when setting up iWARP connections.
This was causing very bad performance for NP64 gather operations using
Open MPI, due to the way it sets up connections on larger jobs.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Will Deacon [Mon, 27 Sep 2010 13:55:15 +0000 (14:55 +0100)]
ARM: 6411/1: vexpress: set RAM latencies to 1 cycle for PL310 on ct-ca9x4 tile
The PL310 on the ct-ca9x4 tile for the Versatile Express does not need
to add additional latency when accessing its cache RAMs. Unfortunately,
the boot monitor sets this up for an 8-cycle delay on reads and writes,
resulting in greatly reduced memory performance when the L2 cache is
enabled.
This patch sets the L2 RAM latencies to the correct value of 1 cycle
on the ct-ca9x4 tile before enabling the L2 cache.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Davidlohr Bueso [Mon, 13 Sep 2010 15:53:18 +0000 (15:53 +0000)]
net/9p: fix memory handling/allocation in rdma_request()
Return -ENOMEM when erroring on kmalloc and fix memory leaks when returning on error.
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Borislav Petkov [Sun, 26 Sep 2010 10:42:23 +0000 (12:42 +0200)]
amd64_edac: Fix driver module removal
f4347553b30ec66530bfe63c84530afea3803396 removed the edac polling
mechanism in favor of using a notifier chain for conveying MCE
information to edac. However, the module removal path didn't test
whether the driver had setup the polling function workqueue at all and
the rmmod process was hanging in the kernel at try_to_del_timer_sync()
in the cancel_delayed_work() path, trying to cancel an uninitialized
work struct.
Fix that by adding a balancing check to the workqueue removal path.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Alexander Chumachenko [Thu, 1 Apr 2010 12:34:52 +0000 (15:34 +0300)]
x86: Avoid 'constant_test_bit()' misoptimization due to cast to non-volatile
While debugging bit_spin_lock() hang, it was tracked down to gcc-4.4
misoptimization of non-inlined constant_test_bit() due to non-volatile
addr when 'const volatile unsigned long *addr' cast to 'unsigned long *'
with subsequent unconditional jump to pause (and not to the test) leading
to hang.
Compiling with gcc-4.3 or disabling CONFIG_OPTIMIZE_INLINING yields inlined
constant_test_bit() and correct jump, thus working around the kernel bug.
Other arches than asm-x86 may implement this slightly differently;
2.6.29 mitigates the misoptimization by changing the function prototype
(commit
c4295fbb6048d85f0b41c5ced5cbf63f6811c46c) but probably fixing the issue
itself is better.
Signed-off-by: Alexander Chumachenko <ledest@gmail.com>
Signed-off-by: Michael Shigorin <mike@osdn.org.ua>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Neil Horman [Fri, 24 Sep 2010 09:55:52 +0000 (09:55 +0000)]
ipv6: add a missing unregister_pernet_subsys call
Clean up a missing exit path in the ipv6 module init routines. In
addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
for the ipv6_addr_label_ops structure. But if module loading fails, or if the
ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
which leaves a now-bogus address on the pernet_list, leading to oopses in
subsequent registrations. This patch cleans up both the failed load path and
the unload path. Tested by myself with good results.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
include/net/addrconf.h | 1 +
net/ipv6/addrconf.c | 11 ++++++++---
net/ipv6/addrlabel.c | 5 +++++
3 files changed, 14 insertions(+), 3 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasiliy Kulikov [Mon, 27 Sep 2010 01:56:06 +0000 (18:56 -0700)]
s390: use free_netdev(netdev) instead of kfree()
Freeing netdev without free_netdev() leads to net, tx leaks.
I might lead to dereferencing freed pointer.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
@@
struct net_device* dev;
@@
-kfree(dev)
+free_netdev(dev)
Signed-off-by: David S. Miller <davem@davemloft.net>
Kulikov Vasiliy [Sat, 25 Sep 2010 23:58:06 +0000 (23:58 +0000)]
sgiseeq: use free_netdev(netdev) instead of kfree()
Freeing netdev without free_netdev() leads to net, tx leaks.
I might lead to dereferencing freed pointer.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
@@
struct net_device* dev;
@@
-kfree(dev)
+free_netdev(dev)
Signed-off-by: David S. Miller <davem@davemloft.net>
Kulikov Vasiliy [Sat, 25 Sep 2010 23:58:03 +0000 (23:58 +0000)]
rionet: use free_netdev(netdev) instead of kfree()
Freeing netdev without free_netdev() leads to net, tx leaks.
I might lead to dereferencing freed pointer.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
@@
struct net_device* dev;
@@
-kfree(dev)
+free_netdev(dev)
Signed-off-by: David S. Miller <davem@davemloft.net>
Kulikov Vasiliy [Sat, 25 Sep 2010 23:58:00 +0000 (23:58 +0000)]
ibm_newemac: use free_netdev(netdev) instead of kfree()
Freeing netdev without free_netdev() leads to net, tx leaks.
I might lead to dereferencing freed pointer.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
@@
struct net_device* dev;
@@
-kfree(dev)
+free_netdev(dev)
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent Stehlé [Mon, 27 Sep 2010 01:50:05 +0000 (18:50 -0700)]
smsc911x: Add MODULE_ALIAS()
This enables auto loading for the smsc911x ethernet driver.
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 23 Sep 2010 11:19:54 +0000 (11:19 +0000)]
net: reset skb queue mapping when rx'ing over tunnel
Reset queue mapping when an skb is reentering the stack via a tunnel.
On second pass, the queue mapping from the original device is no
longer valid.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karl Hiramoto [Thu, 23 Sep 2010 01:50:54 +0000 (01:50 +0000)]
br2684: fix scheduling while atomic
You can't call atomic_notifier_chain_unregister() while in atomic context.
Fix, call un/register_atmdevice_notifier in module __init and __exit.
Bug report:
http://comments.gmane.org/gmane.linux.network/172603
Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sun, 26 Sep 2010 04:55:13 +0000 (05:55 +0100)]
TOMOYO: Don't abuse sys_getpid(), sys_getppid()
System call entry functions sys_*() are never to be called from
general kernel code. The fact that they aren't declared in header
files should have been a clue. These functions also don't exist on
Alpha since it has sys_getxpid() instead.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
Ondrej Zary [Sat, 25 Sep 2010 10:39:17 +0000 (10:39 +0000)]
de2104x: fix TP link detection
Compex FreedomLine 32 PnP-PCI2 cards have only TP and BNC connectors but the
SROM contains AUI port too. When TP loses link, the driver switches to
non-existing AUI port (which reports that carrier is always present).
Connecting TP back generates LinkPass interrupt but de_media_interrupt() is
broken - it only updates the link state of currently connected media, ignoring
the fact that LinkPass and LinkFail bits of MacStatus register belong to the
TP port only (the chip documentation says that).
This patch changes de_media_interrupt() to switch media to TP when link goes
up (and media type is not locked) and also to update the link state only when
the TP port is used.
Also the NonselPortActive (and also SelPortActive) bits of SIAStatus register
need to be cleared (by writing 1) after reading or they're useless.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ondrej Zary [Fri, 24 Sep 2010 23:57:02 +0000 (23:57 +0000)]
de2104x: fix power management
At least my 21041 cards come out of suspend with bus mastering disabled so
they did not work after resume(no data transferred).
After adding pci_set_master(), the driver oopsed immediately on resume -
because de_clean_rings() is called on suspend but de_init_rings() call
was missing in resume.
Also disable link (reset SIA) before sleep (de4x5 does this too).
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Szyprowski [Thu, 23 Sep 2010 14:22:05 +0000 (16:22 +0200)]
mmc: sdhci-s3c: fix NULL ptr access in sdhci_s3c_remove
If not all clocks have been defined in platform data, the driver will
cause a null pointer dereference when it is removed. This patch fixes
this issue.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Marek Szyprowski [Mon, 20 Sep 2010 13:03:42 +0000 (15:03 +0200)]
mmc: sdhci-s3c: fix incorrect spinlock usage after merge
In the commit
f522886e202a34a2191dd5d471b3c4d46410a9a0 a merge conflict
in the sdhci-s3c driver been fixed. However the fix used incorrect
spinlock operation - it caused a race with sdhci interrupt service. The
correct way to solve it is to use spin_lock_irqsave/irqrestore() calls.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Chris Ball [Fri, 10 Sep 2010 16:05:24 +0000 (12:05 -0400)]
mmc: MAINTAINERS: add myself as MMC maintainer
Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: Pierre Ossman <pierre-list@ossman.eu>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rahul Ruikar [Sun, 26 Sep 2010 11:30:29 +0000 (17:00 +0530)]
pcmcia: pd6729: Fix error path
In error return path
call pci_disable_device() which was enabled earlier.
Signed-off-by: Rahul Ruikar <rahul.ruikar@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Al Viro [Sat, 25 Sep 2010 20:07:51 +0000 (21:07 +0100)]
alpha: fix usp value in multithreaded coredumps
rdusp() gives us the right value only for the current thread...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sat, 25 Sep 2010 20:07:14 +0000 (21:07 +0100)]
alpha: fix hae_cache race in RESTORE_ALL
We want interrupts disabled on all paths leading to RESTORE_ALL;
otherwise, we are risking an IRQ coming between the updates of
alpha_mv->hae_cache and *alpha_mv->hae_register and set_hae()
within the IRQ getting badly confused.
RESTORE_ALL used to play with disabling IRQ itself, but that got
removed back in 2002, without making sure we had them disabled
on all paths. It's cheaper to make sure we have them disabled than
to revert to original variant...
Remove the detritus left from that commit back in 2002; we used to
need a reload of $0 and $1 since swpipl would change those, but
doing that had become pointless when we stopped doing swpipl in
there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 25 Sep 2010 16:51:54 +0000 (09:51 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: prevent merges of discard and write requests
Linus Torvalds [Sat, 25 Sep 2010 16:51:31 +0000 (09:51 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: sound/pci/rme9652: prevent reading uninitialized stack memory
ALSA: hda - Fix auto-parse of SPDIF input of Realtek codecs
ASoC: Fix multi-componentism
ASoC: Fix soc-cache buffer overflow bug
ALSA: oxygen: fix analog capture on Claro halo cards
ALSA: hda - Add Dell Latitude E6400 model quirk
ASoC: fix clkdev API usage in sh/migor.c
Larry Woodman [Fri, 24 Sep 2010 16:04:48 +0000 (12:04 -0400)]
Avoid pgoff overflow in remap_file_pages
Thomas Pollet noticed that the remap_file_pages() system call in
fremap.c has a potential overflow in the first part of the if statement
below, which could cause it to process bogus input parameters.
Specifically the pgoff + size parameters could be wrap thereby
preventing the system call from failing when it should.
Reported-by: Thomas Pollet <thomas.pollet@gmail.com>
Signed-off-by: Larry Woodman <lwoodman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Sat, 25 Sep 2010 15:57:53 +0000 (17:57 +0200)]
Merge branch 'fix/hda' into for-linus
Takashi Iwai [Sat, 25 Sep 2010 15:57:49 +0000 (17:57 +0200)]
Merge branch 'fix/asoc' into for-linus
Dan Rosenberg [Sat, 25 Sep 2010 15:07:27 +0000 (11:07 -0400)]
ALSA: sound/pci/rme9652: prevent reading uninitialized stack memory
The SNDRV_HDSP_IOCTL_GET_CONFIG_INFO and
SNDRV_HDSP_IOCTL_GET_CONFIG_INFO ioctls in hdspm.c and hdsp.c allow
unprivileged users to read uninitialized kernel stack memory, because
several fields of the hdsp{m}_config_info structs declared on the stack
are not altered or zeroed before being copied back to the user. This
patch takes care of it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Santosh Shilimkar [Fri, 24 Sep 2010 06:21:05 +0000 (07:21 +0100)]
ARM: 6409/1: davinci: map sram using MT_MEMORY_NONCACHED instead of MT_DEVICE
On Davinci SRAM is mapped as MT_DEVICE becasue of the section
mapping pre-requisite instead of intended MT_MEMORY_NONCACHED
Since the section mapping limitation gets fixed with first
patch in this series, the MT_MEMORY_NONCACHED can be used now.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Santosh Shilimkar [Fri, 24 Sep 2010 06:19:49 +0000 (07:19 +0100)]
ARM: 6408/1: omap: Map only available sram memory
Currently we map 1 MB section while setting up SRAM on OMAPs
Regardless of the actual memory. The physical OCM RAM available
on OMAP SOCs is in order of KBs. This patch maps only available
sram and cleans up some un-necessary cpu_is_xxx checks.
Mapping un-available or non-accessible(secure) memory on the newer ARM
processor is dangerous. Because ARM CPUs can now speculatively prefetch,
we should avoid mapping any no-existing or secure memory.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Santosh Shilimkar [Fri, 24 Sep 2010 06:18:22 +0000 (07:18 +0100)]
ARM: 6407/1: mmu: Setup MT_MEMORY and MT_MEMORY_NONCACHED L1 entries
This patch populates the L1 entries for MT_MEMORY and MT_MEMORY_NONCACHED
types so that at boot-up, we can map memories outside system memory
at page level granularity
Previously the mapping was limiting to section level, which creates
unnecessary additional mapping for which physical memory may not
present. On the newer ARM with speculation, this is dangerous and can
result in untraceable aborts.
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adrian Hunter [Sat, 25 Sep 2010 10:42:55 +0000 (12:42 +0200)]
block: prevent merges of discard and write requests
Add logic to prevent two I/O requests being merged if
only one of them is a discard. Ditto secure discard.
Without this fix, it is possible for write requests
to transform into discard requests. For example:
Submit bio 1 to discard 8 sectors from sector n
Submit bio 2 to write 8 sectors from sector n + 16
Submit bio 3 to write 8 sectors from sector n + 8
Bio 1 becomes request 1. Bio 2 becomes request 2.
Bio 3 is merged with request 2, and then subsequently
request 2 is merged with request 1 resulting in just
one I/O request which discards all 24 sectors.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
(Moved the checks above the position checks /Jens)
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>