Stephen Rothwell [Mon, 20 Oct 2008 11:59:18 +0000 (22:59 +1100)]
staging: sxg depends on X86
sxghif.h has code that explicitly will not build fo other architecures.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 20 Oct 2008 16:06:35 +0000 (09:06 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netfilter: replace old NF_ARP calls with NFPROTO_ARP
netfilter: fix compilation error with NAT=n
netfilter: xt_recent: use proc_create_data()
netfilter: snmp nat leaks memory in case of failure
netfilter: xt_iprange: fix range inversion match
netfilter: netns: use NFPROTO_NUMPROTO instead of NUMPROTO for tables array
netfilter: ctnetlink: remove obsolete NAT dependency from Kconfig
pkt_sched: sch_generic: Fix oops in sch_teql
dccp: Port redirection support for DCCP
tcp: Fix IPv6 fallout from 'Port redirection support for TCP'
netdev: change name dropping error codes
ipvs: Update CONFIG_IP_VS_IPV6 description and help text
Linus Torvalds [Mon, 20 Oct 2008 16:03:12 +0000 (09:03 -0700)]
Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (69 commits)
Revert "[MTD] m25p80.c code cleanup"
[MTD] [NAND] GPIO driver depends on ARM... for now.
[MTD] [NAND] sh_flctl: fix compile error
[MTD] [NOR] AT49BV6416 has swapped erase regions
[MTD] [NAND] GPIO NAND flash driver
[MTD] cmdlineparts documentation change - explain where mtd-id comes from
[MTD] cfi_cmdset_0002.c: Add Macronix CFI V1.0 TopBottom detection
[MTD] [NAND] Fix compilation warnings in drivers/mtd/nand/cs553x_nand.c
[JFFS2] Write buffer offset adjustment for NOR-ECC (Sibley) flash
[MTD] mtdoops: Fix a bug where block may not be erased
[MTD] mtdoops: Add a magic number to logged kernel oops
[MTD] mtdoops: Fix an off by one error
[JFFS2] Correct parameter names of jffs2_compress() in comments
[MTD] [NAND] sh_flctl: add support for Renesas SuperH FLCTL
[MTD] [NAND] Bug on atmel_nand HW ECC : OOB info not correctly written
[MTD] [MAPS] Remove unused variable after ROM API cleanup.
[MTD] m25p80.c extended jedec support (v2)
[MTD] remove unused mtd parameter in of_mtd_parse_partitions()
[MTD] [NAND] remove dead Kconfig associated with !CONFIG_PPC_MERGE
[MTD] [NAND] driver extension to support NAND on TQM85xx modules
...
Parag Warudkar [Sun, 19 Oct 2008 03:28:50 +0000 (20:28 -0700)]
x86: sysfs: kill owner field from attribute
Tejun's commit
7b595756ec1f49e0049a9e01a1298d53a7faaa15 made sysfs
attribute->owner unnecessary. But the field was left in the structure to
ease the merge. It's been over a year since that change and it is now
time to start killing attribute->owner along with its users - one arch at
a time!
This patch is attempt #1 to get rid of attribute->owner only for
CONFIG_X86_64 or CONFIG_X86_32 . We will deal with other arches later on
as and when possible - avr32 will be the next since that is something I
can test. Compile (make allyesconfig / make allmodconfig / custom config)
and boot tested.
akpm: the idea is that we put the declaration of sttribute.owner inside
`#ifndef CONFIG_X86'. But that proved to be too ambitious for now because
new usages kept on turning up in subsystem trees.
[akpm: remove the ifdef for now]
Signed-off-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: Greg KH <greg@kroah.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Sun, 19 Oct 2008 03:28:49 +0000 (20:28 -0700)]
fs/Kconfig: move CIFS out
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:48 +0000 (20:28 -0700)]
include/linux/bcd.h: remove comments
- the macros are gone
- there's no more code in this file,
LGPL + GPL = GPL,
and the code that was moved to lib/bcd.c is anyway trivial
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:47 +0000 (20:28 -0700)]
remove the obsolete BCD*BIN/BIN*BCD macros
Remove the following obsolete macros:
- BCD2BIN
- BIN2BCD
- BCD_TO_BIN
- BIN_TO_BCD
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:47 +0000 (20:28 -0700)]
drivers/scsi/sr_vendor.c: use bcd2bin
Change sr_vendor.c to use the new bcd2bin function instead of the obsolete
BCD2BIN macro.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:46 +0000 (20:28 -0700)]
i2c: use bcd2bin/bin2bcd
Change i2c to use the new bcd2bin/bin2bcd functions instead of the
obsolete BCD2BIN/BIN2BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:45 +0000 (20:28 -0700)]
mn10300: use bcd2bin/bin2bcd
Change mn10300 to use the new bcd2bin/bin2bcd functions instead of the
obsolete BCD_TO_BIN/BIN_TO_BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:44 +0000 (20:28 -0700)]
mips: use bcd2bin/bin2bcd
Changes mips to use the new bcd2bin/bin2bcd functions instead of the
obsolete BCD_TO_BIN/BIN_TO_BCD/BCD2BIN/BIN2BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Sun, 19 Oct 2008 03:28:43 +0000 (20:28 -0700)]
drivers/rtc/rtc-bq4802.c: don't use BIN_2_BCD and BCD_2_BIN
These are going away.
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:42 +0000 (20:28 -0700)]
rtc: use bcd2bin/bin2bcd
Change various rtc related code to use the new bcd2bin/bin2bcd functions
instead of the obsolete BCD_TO_BIN/BIN_TO_BCD/BCD2BIN/BIN2BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:41 +0000 (20:28 -0700)]
drivers/rtc/: use bcd2bin/bin2bcd
Change drivers/rtc/ to use the new bcd2bin/bin2bcd functions instead of
the obsolete BCD_TO_BIN/BIN_TO_BCD/BCD2BIN/BIN2BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:40 +0000 (20:28 -0700)]
cris: use bcd2bin/bin2bcd
Change cris to use the new bcd2bin/bin2bcd functions instead of the
obsolete BCD_TO_BIN/BIN_TO_BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Chris Zankel <zankel@tensilica.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:39 +0000 (20:28 -0700)]
alpha: use bcd2bin/bin2bcd
Change alpha to use the new bcd2bin/bin2bcd functions instead of the
obsolete BCD_TO_BIN/BIN_TO_BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:39 +0000 (20:28 -0700)]
acpi: use bcd2bin/bin2bcd
Change ACPI to use the new bcd2bin/bin2bcd functions instead of the
obsolete BCD_TO_BIN/BIN_TO_BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:38 +0000 (20:28 -0700)]
make mm/rmap.c:anon_vma_cachep static
This patch makes the needlessly global anon_vma_cachep static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Harvey Harrison [Sun, 19 Oct 2008 03:28:37 +0000 (20:28 -0700)]
byteorder: remove direct includes of linux/byteorder/swab[b].h
A consolidated implementation will provide this generically through
asm/byteorder, remove direct includes to avoid breakage when the
changeover to the new implementation occurs.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Harvey Harrison [Sun, 19 Oct 2008 03:28:37 +0000 (20:28 -0700)]
byteorder: provide swabb.h generically in asm/byteorder.h
This is needed during the transition to the new byteorder headers as the
swabb.h functionality will be provided from asm/byteorder.h in the new
version. To avoid breakage on arches still using the old implementation,
provide swabb.h from asm/byteorder.h as well.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Harvey Harrison [Sun, 19 Oct 2008 03:28:36 +0000 (20:28 -0700)]
byteorder: use generic C version for value byteswapping
This makes the new implementation of the byteorder helpers match the old
in how it degraded when an arch-defined version was not available:
1) swab()
- look for arch defined
- if not, use generic c version
2) swabp()
- look for arch-defined
- if not, deref pointer and use swab()
3) swabs()
- look for arch defined
- if not, use swabp
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Harvey Harrison [Sun, 19 Oct 2008 03:28:35 +0000 (20:28 -0700)]
byteorder: add new headers for make headers-install
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Benjamin Herrenschmidt [Sun, 19 Oct 2008 03:28:35 +0000 (20:28 -0700)]
edac cell: fix incorrect edac_mode
The cell_edac driver is setting the edac_mode field of the csrow's to an
incorrect value, causing the sysfs show routine for that field to go out
of an array bound and Oopsing the kernel when used.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: <stable@kernel.org> [2.6.27.x, 2.6.26.x. 2.6.25.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andre Haupt [Sun, 19 Oct 2008 03:28:33 +0000 (20:28 -0700)]
pc8736x_gpio: add support for PC87365 chips
This is only compile tested, because I do not own appropriate hardware.
Signed-off-by: Andre Haupt <andre@bitwigglers.org>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Korty [Sun, 19 Oct 2008 03:28:32 +0000 (20:28 -0700)]
message queues: increase range limits
Increase the range of various posix message queue limits.
Posix gives the message queue user the ability to 'trade off' the maximum
size of messages with the number of possible messages that can be 'in
flight'. Linux currently makes this trade off more restrictive than it
needs to be.
In particular, the maximum message size today can be made no smaller than
8192. This greatly restricts those applications that would like to have
the ability to post large numbers of very small messages.
So this task lowers the limit that the maximum message size can be set to,
from 8192 to 128. It also lowers the limit that the maximum #number of
messages in flight can be set to, from 10 to 1.
With these changes the message queue user can make better trade offs
between #messages and message size, in order to get everything to fit
within the setrlimit(RLIMIT_MSGQUEUE) limit for that particular user.
This patch also applies the values in
/proc/sys/fs/mqueue/msg_max
/proc/sys/fs/mqueue/msgsize_max
as the defaults for the max #messages allowed and the max message size
allowed, respectively, for those applications that do not supply these.
Previously, the defaults were hardwired to 10 and 8192, respectively.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ken'ichi Ohmichi [Sun, 19 Oct 2008 03:28:30 +0000 (20:28 -0700)]
kdump: add vmlist.addr to vmcoreinfo for x86 vmalloc translation.
Add the symbols 'vmlist' and offset 'vm_struct.addr' to the vmcoreinfo[1]
data for i386 vmalloc translation.
makedumpfile[2] needs VMALLOC_START value for distinguishing a vmalloc
address or not, because it should choose suitable translation method. If
applying this patch, makedumpfile will be able to take VMALLOC_START value
from 'vmlist.addr'.
vmcoreinfo[1]:
The vmcoreinfo data has the minimum debugging information only for dump
filtering. makedumpfile[2] uses it to distinguish unnecessary pages and
creates a small dumpfile.
makedumpfile[2]:
dump filtering command
https://sourceforge.net/projects/makedumpfile/
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon Horman [Sun, 19 Oct 2008 03:28:29 +0000 (20:28 -0700)]
always reserve elfcore header memory in crash kernel
elfcore header memory needs to be reserved in a crash kernel. This means
that the relevant code should be protected by CONFIG_CRASH_DUMP rather
than CONFIG_PROC_VMCORE.
Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon Horman [Sun, 19 Oct 2008 03:28:29 +0000 (20:28 -0700)]
kdump: add is_vmcore_usable() and vmcore_unusable()
The usage of elfcorehdr_addr has changed recently such that being set to
ELFCORE_ADDR_MAX is used by is_kdump_kernel() to indicate if the code is
executing in a kernel executed as a crash kernel.
However, arch/ia64/kernel/setup.c:reserve_elfcorehdr will rest
elfcorehdr_addr to ELFCORE_ADDR_MAX on error, which means any subsequent
calls to is_kdump_kernel() will return 0, even though they should return
1.
Ok, at this point in time there are no subsequent calls, but I think its
fair to say that there is ample scope for error or at the very least
confusion.
This patch add an extra state, ELFCORE_ADDR_ERR, which indicates that
elfcorehdr_addr was passed on the command line, and thus execution is
taking place in a crashdump kernel, but vmcore can't be used for some
reason. This is tested for using is_vmcore_usable() and set using
vmcore_unusable(). A subsequent patch makes use of this new code.
To summarise, the states that elfcorehdr_addr can now be in are as follows:
ELFCORE_ADDR_MAX: not a crashdump kernel
ELFCORE_ADDR_ERR: crashdump kernel but vmcore is unusable
any other value: crash dump kernel and vmcore is usable
Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon Horman [Sun, 19 Oct 2008 03:28:28 +0000 (20:28 -0700)]
kdump: use is_kdump_kernel() in sba_init()
o Make use of is_kdump_kernel() rather than checking elfcorehdr_addr directly.
o Remove CONFIG_CRASH_DUMP as is_kdump_kernel() is safe to call anywhere
o Remove CONFIG_PROC_FS as it is bogus, the check
should occur regardless of if CONFIG_PROC_FS is set or not.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon Horman [Sun, 19 Oct 2008 03:28:27 +0000 (20:28 -0700)]
kdump: update elfcorehdr documentation to reflect supported architectures
IA64, PPC and SH also support the elfcorehdr command line.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Goyal [Sun, 19 Oct 2008 03:28:25 +0000 (20:28 -0700)]
kdump: make elfcorehdr_addr independent of CONFIG_PROC_VMCORE
o elfcorehdr_addr is used by not only the code under CONFIG_PROC_VMCORE
but also by the code which is not inside CONFIG_PROC_VMCORE. For
example, is_kdump_kernel() is used by powerpc code to determine if
kernel is booting after a panic then use previous kernel's TCE table.
So even if CONFIG_PROC_VMCORE is not set in second kernel, one should be
able to correctly determine that we are booting after a panic and setup
calgary iommu accordingly.
o So remove the assumption that elfcorehdr_addr is under
CONFIG_PROC_VMCORE.
o Move definition of elfcorehdr_addr to arch dependent crash files.
(Unfortunately crash dump does not have an arch independent file
otherwise that would have been the best place).
o kexec.c is not the right place as one can Have CRASH_DUMP enabled in
second kernel without KEXEC being enabled.
o I don't see sh setup code parsing the command line for
elfcorehdr_addr. I am wondering how does vmcore interface work on sh.
Anyway, I am atleast defining elfcoredhr_addr so that compilation is not
broken on sh.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Sun, 19 Oct 2008 03:28:24 +0000 (20:28 -0700)]
kthread_bind: use wait_task_inactive(TASK_UNINTERRUPTIBLE)
Now that wait_task_inactive(task, state) checks task->state == state,
we can simplify the code and make this debugging check more robust.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland McGrath [Sun, 19 Oct 2008 03:28:23 +0000 (20:28 -0700)]
add CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS
This adds a kconfig option to change the /proc/PID/coredump_filter default.
Fedora has been carrying a trivial patch to change the hard-wired value for
this default, since Fedora 8. The default default can't change safely
because there are old GDB versions out there (all before 6.7) that are
confused by the core dump files created by the MMF_DUMP_ELF_HEADERS setting.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kawai Hidehiro <hidehiro.kawai.ez@hitachi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Sun, 19 Oct 2008 03:28:22 +0000 (20:28 -0700)]
coredump: format_corename: don't append .%pid if multi-threaded
If the coredumping is multi-threaded, format_corename() appends .%pid to
the corename. This was needed before the proper multi-thread core dump
support, now all the threads in the mm go into a single unified core file.
Remove this special case, it is not even documented and we have "%p"
and core_uses_pid.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: La Monte Yarroll <piggy@laurelnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:28:21 +0000 (20:28 -0700)]
make ptrace_untrace() static
ptrace_untrace() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lai Jiangshan [Sun, 19 Oct 2008 03:28:21 +0000 (20:28 -0700)]
bitmask: remove bitmap_scnprintf_len()
bitmap_scnprintf_len() is not used now, so we remove it.
Otherwise we have to maintain it and make its return
value always equal to bitmap_scnprintf()'s return value.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lai Jiangshan [Sun, 19 Oct 2008 03:28:20 +0000 (20:28 -0700)]
cpuset: use seq_*mask_* to print masks
1) seq_file excepts that m->count == m->size when it's buf is full,
so current code will causes bugs when buf is overflow.
2) There is not too good that cpuset accesses struct seq_file's
fields directly.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lai Jiangshan [Sun, 19 Oct 2008 03:28:19 +0000 (20:28 -0700)]
seq_file: add seq_cpumask_list(), seq_nodemask_list()
seq_cpumask_list(), seq_nodemask_list() are very like seq_cpumask(),
seq_nodemask(), but they print human readable string.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lai Jiangshan [Sun, 19 Oct 2008 03:28:18 +0000 (20:28 -0700)]
seq_file: don't call bitmap_scnprintf_len()
"m->count + len < m->size" is true commonly, so bitmap_scnprintf()
is commonly called. this fix saves a call to bitmap_scnprintf_len().
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rakib Mullick [Sun, 19 Oct 2008 03:28:18 +0000 (20:28 -0700)]
cpuset.c: remove extra variable
Remove the use of int cpus_nonempty variable from 'update_flag' function.
Signed-off-by: Md.Rakib H. Mullick <rakib.mullick@gmail.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Sun, 19 Oct 2008 03:28:16 +0000 (20:28 -0700)]
memcg: allocate all page_cgroup at boot
Allocate all page_cgroup at boot and remove page_cgroup poitner from
struct page. This patch adds an interface as
struct page_cgroup *lookup_page_cgroup(struct page*)
All FLATMEM/DISCONTIGMEM/SPARSEMEM and MEMORY_HOTPLUG is supported.
Remove page_cgroup pointer reduces the amount of memory by
- 4 bytes per PAGE_SIZE.
- 8 bytes per PAGE_SIZE
if memory controller is disabled. (even if configured.)
On usual 8GB x86-32 server, this saves 8MB of NORMAL_ZONE memory.
On my x86-64 server with 48GB of memory, this saves 96MB of memory.
I think this reduction makes sense.
By pre-allocation, kmalloc/kfree in charge/uncharge are removed.
This means
- we're not necessary to be afraid of kmalloc faiulre.
(this can happen because of gfp_mask type.)
- we can avoid calling kmalloc/kfree.
- we can avoid allocating tons of small objects which can be fragmented.
- we can know what amount of memory will be used for this extra-lru handling.
I added printk message as
"allocated %ld bytes of page_cgroup"
"please try cgroup_disable=memory option if you don't want"
maybe enough informative for users.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Sun, 19 Oct 2008 03:28:11 +0000 (20:28 -0700)]
memcg: atomic ops for page_cgroup->flags
This patch makes page_cgroup->flags to be atomic_ops and define functions
(and macros) to access it.
Before trying to modify memory resource controller, this atomic operation
on flags is necessary. Most of flags in this patch is for LRU and modfied
under mz->lru_lock but we'll add another flags which is not for LRU soon.
For example, we'll place LOCK bit on flags field. We need atomic
operation to modify LRU bit without LOCK.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Sun, 19 Oct 2008 03:28:10 +0000 (20:28 -0700)]
memcg: optimize per-cpu statistics
Some obvious optimization to memcg.
I found mem_cgroup_charge_statistics() is a little big (in object) and
does unnecessary address calclation. This patch is for optimization to
reduce the size of this function.
And res_counter_charge() is 'likely' to succeed.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Sun, 19 Oct 2008 03:28:10 +0000 (20:28 -0700)]
memcg: avoid accounting special pages
There are not-on-LRU pages which can be mapped and they are not worth to
be accounted. (becasue we can't shrink them and need dirty codes to
handle specical case) We'd like to make use of usual objrmap/radix-tree's
protcol and don't want to account out-of-vm's control pages.
When special_mapping_fault() is called, page->mapping is tend to be NULL
and it's charged as Anonymous page. insert_page() also handles some
special pages from drivers.
This patch is for avoiding to account special pages.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Sun, 19 Oct 2008 03:28:09 +0000 (20:28 -0700)]
memcg: make page->mapping NULL before uncharge
This patch tries to make page->mapping to be NULL before
mem_cgroup_uncharge_cache_page() is called.
"page->mapping == NULL" is a good check for "whether the page is still
radix-tree or not". This patch also adds BUG_ON() to
mem_cgroup_uncharge_cache_page();
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Sun, 19 Oct 2008 03:28:08 +0000 (20:28 -0700)]
memcg: move charge swapin under lock
While page-cache's charge/uncharge is done under page_lock(), swap-cache
isn't. (anonymous page is charged when it's newly allocated.)
This patch moves do_swap_page()'s charge() call under lock. I don't see
any bad problem *now* but this fix will be good for future for avoiding
unnecessary racy state.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lai Jiangshan [Sun, 19 Oct 2008 03:28:07 +0000 (20:28 -0700)]
devcgroup: remove spin_lock()
Since we introduced rcu for read side, spin_lock is used only for update.
But we always hold cgroup_lock() when update, so spin_lock() is not need.
Additional cleanup:
1) include linux/rcupdate.h explicitly
2) remove unused variable cur_devcgroup in devcgroup_update_access()
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Sun, 19 Oct 2008 03:28:07 +0000 (20:28 -0700)]
devcgroup: remove unused variable
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Sun, 19 Oct 2008 03:28:06 +0000 (20:28 -0700)]
devcgroup: use kmemdup()
This saves 40 bytes on my x86_32 box.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menage [Sun, 19 Oct 2008 03:28:05 +0000 (20:28 -0700)]
cgroups: fix declaration of cgroup_mm_owner_callbacks
The choice of real/dummy declaration for cgroup_mm_owner_callbacks()
shouldn't be based on CONFIG_MM_OWNER, but on CONFIG_CGROUPS. Otherwise
kernel/exit.c fails to compile when something other than a cgroups
controller selects CONFIG_MM_OWNER
Signed-off-by: Paul Menage <menage@google.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menage [Sun, 19 Oct 2008 03:28:04 +0000 (20:28 -0700)]
cgroups: convert tasks file to use a seq_file with shared pid array
Rather than pre-generating the entire text for the "tasks" file each
time the file is opened, we instead just generate/update the array of
process ids and use a seq_file to report these to userspace. All open
file handles on the same "tasks" file can share a pid array, which may
be updated any time that no thread is actively reading the array. By
sharing the array, the potential for userspace to DoS the system by
opening many handles on the same "tasks" file is removed.
[Based on a patch by Lai Jiangshan, extended to use seq_file]
Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lai Jiangshan [Sun, 19 Oct 2008 03:28:03 +0000 (20:28 -0700)]
cgroups: fix probable race with put_css_set[_taskexit] and find_css_set
put_css_set_taskexit may be called when find_css_set is called on other
cpu. And the race will occur:
put_css_set_taskexit side find_css_set side
|
atomic_dec_and_test(&kref->refcount) |
/* kref->refcount = 0 */ |
....................................................................
| read_lock(&css_set_lock)
| find_existing_css_set
| get_css_set
| read_unlock(&css_set_lock);
....................................................................
__release_css_set |
....................................................................
| /* use a released css_set */
|
[put_css_set is the same. But in the current code, all put_css_set are
put into cgroup mutex critical region as the same as find_css_set.]
[akpm@linux-foundation.org: repair comments]
[menage@google.com: eliminate race in css_set refcounting]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sesterhenn [Sun, 19 Oct 2008 03:28:02 +0000 (20:28 -0700)]
hfsplus: fix possible deadlock when handling corrupted extents
A corrupted extent for the extent file itself may try to get an impossible
extent, causing a deadlock if I see it correctly.
Check the inode number after the first_blocks checks and fail if it's the
extent file, as according to the spec the extent file should have no
extent for itself.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Sun, 19 Oct 2008 03:28:01 +0000 (20:28 -0700)]
hfsplus: missing O_LARGEFILE check
hfsplus: O_LARGEFILE checking is missing
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=8490
From: Alan Cox <alan@redhat.com>
Reported-by: didier <did447@gmail.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sandeen [Sun, 19 Oct 2008 03:28:00 +0000 (20:28 -0700)]
ext3: avoid printk floods in the face of directory corruption
A very large directory with many read failures (either due to storage
problems, or due to invalid size & blocks from corruption) will generate a
printk storm as the filesystem continues to try to read all the blocks.
This flood of messages can tie up the box until it is complete - which may
be a very long time, especially for very large corrupted values.
This is fixed by only reporting the corruption once each time we try to
read the directory.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Sun, 19 Oct 2008 03:28:00 +0000 (20:28 -0700)]
ext3: truncate block allocated on a failed ext3_write_begin
For blocksize < pagesize we need to remove blocks that got allocated in
block_write_begin() if we fail with ENOSPC for later blocks.
block_write_begin() internally does this if it allocated page locally.
This makes sure we don't have blocks outside inode.i_size during ENOSPC.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eugene Dashevsky [Sun, 19 Oct 2008 03:27:59 +0000 (20:27 -0700)]
ext3: fix ext3_dx_readdir hash collision handling
This fixes a bug where readdir() would return a directory entry twice
if there was a hash collision in an hash tree indexed directory.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Eugene Dashevsky <eugene@ibrix.com>
Signed-off-by: Mike Snitzer <msnitzer@ibrix.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Sun, 19 Oct 2008 03:27:58 +0000 (20:27 -0700)]
jbd: ordered data integrity fix
In ordered mode, if a file data buffer being dirtied exists in the
committing transaction, we write the buffer to the disk, move it from the
committing transaction to the running transaction, then dirty it. But we
don't have to remove the buffer from the committing transaction when the
buffer couldn't be written out, otherwise it would miss the error and the
committing transaction would not abort.
This patch adds an error check before removing the buffer from the
committing transaction.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Sun, 19 Oct 2008 03:27:57 +0000 (20:27 -0700)]
ext3: add an option to control error handling on file data
If the journal doesn't abort when it gets an IO error in file data blocks,
the file data corruption will spread silently. Because most of
applications and commands do buffered writes without fsync(), they don't
notice the IO error. It's scary for mission critical systems. On the
other hand, if the journal aborts whenever it gets an IO error in file
data blocks, the system will easily become inoperable. So this patch
introduces a filesystem option to determine whether it aborts the journal
or just call printk() when it gets an IO error in file data.
If you mount a ext3 fs with data_err=abort option, it aborts on file data
write error. If you mount it with data_err=ignore, it doesn't abort, just
call printk(). data_err=ignore is the default.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mingming Cao [Sun, 19 Oct 2008 03:27:56 +0000 (20:27 -0700)]
ext3: fix ext3 block reservation early ENOSPC issue
We could run into ENOSPC error on ext3, even when there is free blocks on
the filesystem.
The problem is triggered in the case the goal block group has 0 free
blocks , and the rest block groups are skipped due to the check of
"free_blocks < windowsz/2". Current code could fall back to non
reservation allocation to prevent early ENOSPC after examing all the block
groups with reservation on , but this code was bypassed if the reservation
window is turned off already, which is true in this case.
This patch fixed two issues:
1) We don't need to turn off block reservation if the goal block group has
0 free blocks left and continue search for the rest of block groups.
Current code the intention is to turn off the block reservation if the
goal allocation group has a few (some) free blocks left (not enough for
make the desired reservation window),to try to allocation in the goal
block group, to get better locality. But if the goal blocks have 0 free
blocks, it should leave the block reservation on, and continues search for
the next block groups,rather than turn off block reservation completely.
2) we don't need to check the window size if the block reservation is off.
The problem was originally found and fixed in ext4.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josef Bacik [Sun, 19 Oct 2008 03:27:55 +0000 (20:27 -0700)]
ext3: don't try to resize if there are no reserved gdt blocks left
When trying to resize a ext3 fs and you run out of reserved gdt blocks,
you get an error that doesn't actually tell you what went wrong, it just
says that the gdb it picked is not correct, which is the case since you
don't have any reserved gdt blocks left. This patch adds a check to make
sure you have reserved gdt blocks to use, and if not prints out a more
relevant error.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Andreas Dilger <adilger@sun.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Sun, 19 Oct 2008 03:27:54 +0000 (20:27 -0700)]
jbd: don't dirty original metadata buffer on abort
Currently, original metadata buffers are dirtied when they are unfiled
whether the journal has aborted or not. Eventually these buffers will be
written-back to the filesystem by pdflush. This means some metadata
buffers are written to the filesystem without journaling if the journal
aborts. So if both journal abort and system crash happen at the same
time, the filesystem would become inconsistent state. Additionally,
replaying journaled metadata can overwrite the latest metadata on the
filesystem partly. Because, if the journal aborts, journaled metadata are
preserved and replayed during the next mount not to lose uncheckpointed
metadata. This would also break the consistency of the filesystem.
This patch prevents original metadata buffers from being dirtied on abort
by clearing BH_JBDDirty flag from those buffers. Thus, no metadata
buffers are written to the filesystem without journaling.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hidehiro Kawai [Sun, 19 Oct 2008 03:27:53 +0000 (20:27 -0700)]
jbd: abort when failed to log metadata buffers
If we failed to write metadata buffers to the journal space and succeeded
to write the commit record, stale data can be written back to the
filesystem as metadata in the recovery phase.
To avoid this, when we failed to write out metadata buffers, abort the
journal before writing the commit record.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Richard Holden [Sun, 19 Oct 2008 03:27:53 +0000 (20:27 -0700)]
phonedev: remove BKL
The phone_device array is covered by the phone_lock mutex in all cases and
request_module no longer needs the BKL so we can remove the only remaining
instance of the BKL from phonedev.
Signed-off-by: Richard Holden <aciddeath@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Sun, 19 Oct 2008 03:27:51 +0000 (20:27 -0700)]
fb: convert lock/unlock_kernel() into local fb mutex
Change lock_kernel()/unlock_kernel() to local fb mutex. Each frame buffer
instance has its own mutex.
The one line try_to_load() function is unrolled to request_module() in two
places for readability.
[righi.andrea@gmail.com: fb: fix NULL pointer BUG dereference in fb_open()]
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Sun, 19 Oct 2008 03:27:51 +0000 (20:27 -0700)]
fb: push down the BKL in the ioctl handler
Framebuffer is heavily BKL dependant at the moment so just wrap the ioctl
handler in the driver as we push down.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Sun, 19 Oct 2008 03:27:49 +0000 (20:27 -0700)]
gpiolib: fix oops in gpio_get_value_cansleep()
We can get the following oops from gpio_get_value_cansleep() when a GPIO
controller doesn't provide a get() callback:
Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0x00000000
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [
00000000] 0x0
LR [
c0182fb0] gpio_get_value_cansleep+0x40/0x50
Call Trace:
[
c7b79e80] [
c0183f28] gpio_value_show+0x5c/0x94
[
c7b79ea0] [
c01a584c] dev_attr_show+0x30/0x7c
[
c7b79eb0] [
c00d6b48] fill_read_buffer+0x68/0xe0
[
c7b79ed0] [
c00d6c54] sysfs_read_file+0x94/0xbc
[
c7b79ef0] [
c008f24c] vfs_read+0xb4/0x16c
[
c7b79f10] [
c008f580] sys_read+0x4c/0x90
[
c7b79f40] [
c0013a14] ret_from_syscall+0x0/0x38
It's OK to request the value of *any* GPIO; most GPIOs are bidirectional,
so configuring them as outputs just enables an output driver and doesn't
disable the input logic.
So the problem is that gpio_get_value_cansleep() isn't making the same
sanity check that gpio_get_value() does: making sure this GPIO isn't one
of the atypical "no input logic" cases.
Reported-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org> [2.6.27.x, 2.6.26.x, 2.6.25.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven A. Falco [Sun, 19 Oct 2008 03:27:48 +0000 (20:27 -0700)]
gpio: modify sysfs gpio export so that "value" displays as 0 or 1
gpiolib can export GPIOs to userspace via sysfs. This patch modifies the
gpio_value_show() so that any non-zero value is explicitly printed as "1",
rather than whatever numerical value the lower-level driver returns.
Signed-off-by: Steve Falco <sfalco@harris.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Sun, 19 Oct 2008 03:27:47 +0000 (20:27 -0700)]
rtc-cmos: export second NVRAM bank
Teach rtc-cmos about the second bank of registers found on most modern x86
systems, giving access to 128 bytes more NVRAM.
This version only sees that extra NVRAM when both register banks are
provided as part of *one* PNP resource. Since BIOS on some systems
presents them using two IO resources, and nothing merges them, this can't
always show all the NVRAM. (We're supposed to be able to use PNP id
PNP0b01 too, but BIOS tables doesn't often seem to use that particular
option.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
roel kluin [Sun, 19 Oct 2008 03:27:46 +0000 (20:27 -0700)]
Altix serial: fix
In function sn_sal_switch_to_asynch(): drivers/serial/sn_console.c:713:
HZ * SN_SAL_UART_FIFO_DEPTH / SN_SAL_UART_FIFO_SPEED_CPS;
After preprocessing (see defines in patch) this becomes HZ * 16 / 9600 / 10
(associativity from left to right), not equivalent to HZ * 16 / 960.
Looks-obviously-right-to: Tony Luck <tony.luck@intel.com>
Cc: Jes Sorensen <jes@sgi.com>
Acked-by: Pat Gefre <pfg@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sun, 19 Oct 2008 03:27:45 +0000 (20:27 -0700)]
make probe_serial_gsc() static
This patch makes the needlessly global probe_serial_gsc() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Sun, 19 Oct 2008 03:27:44 +0000 (20:27 -0700)]
Char: sx, remove bogus iomap
readl/writel are not expected to accept iomap return value. Replace
bogus mapping by standard ioremap.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: <R.E.Wolff@BitWizard.nl>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:43 +0000 (20:27 -0700)]
hwmon: applesmc: lighter wait mechanism, drastic improvement
The read fail ratio is sensitive to the delay between the first byte
written and the first byte read; apparently the sensors cannot be rushed.
Increasing the minimum wait time, without changing the total wait time,
improves the fail ratio from a 8% chance that any of the sensors fails in
one read, down to 0.4%, on a Macbook Air. On a Macbook Pro 3,1, the
effect is even more apparent. By reducing the number of status polls, the
ratio is further improved to below 0.1%. Finally, increasing the total
wait time brings the fail ratio down to virtually zero.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Tested-by: Bob McElrath <bob@mcelrath.org>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:42 +0000 (20:27 -0700)]
hwmon: applesmc: Add support for Macbook Pro 3
Add temperature sensor support for Macbook Pro 3.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:41 +0000 (20:27 -0700)]
hwmon: applesmc: Add support for Macbook Pro 4
Adds temperature sensor support for the Macbook Pro 4.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Sun, 19 Oct 2008 03:27:41 +0000 (20:27 -0700)]
drivers/hwmon/applesmc.c: remove unneeded casts
dmi_system_id.driver_data is already void*.
Cc: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:40 +0000 (20:27 -0700)]
hwmon: applesmc: add support for Macbook Air
This patch adds accelerometer, backlight and temperature sensor support
for the Macbook Air.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:39 +0000 (20:27 -0700)]
hwmon: applesmc: allow for variable ALV0 and ALV1 package length
On some recent Macbooks, the package length for the light sensors ALV0 and
ALV1 has changed from 6 to 10. This patch allows for a variable package
length encompassing both variants.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:39 +0000 (20:27 -0700)]
hwmon: applesmc: prolong status wait
The time to wait for a status change while reading or writing to the SMC
ports is a balance between read reliability and system performance. The
current setting yields rougly three errors in a thousand when
simultaneously reading three different temperature values on a Macbook
Air. This patch increases the setting to a value yielding roughly one
error in ten thousand, with no noticable system performance degradation.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:38 +0000 (20:27 -0700)]
hwmon: applesmc: fix the 'wait status failed: c != 8' problem
On many Macbooks since mid 2007, the Pro, C2D and Air models, applesmc
fails to read some or all SMC ports. This problem has various effects,
such as flooded logfiles, malfunctioning temperature sensors,
accelerometers failing to initialize, and difficulties getting backlight
functionality to work properly.
The root of the problem seems to be the command protocol. The current
code sends out a command byte, then repeatedly polls for an ack before
continuing to send or recieve data. From experiments leading to this
patch, it seems the command protocol never quite worked or changed so that
one now sends a command byte, waits a little bit, polls for an ack, and if
it fails, repeats the whole thing by sending the command byte again.
This patch implements a send_command function according to the new
interpretation of the protocol, and should work also for earlier models.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Sun, 19 Oct 2008 03:27:35 +0000 (20:27 -0700)]
hwmon: applesmc: specified number of bytes to read should match actual
At one single place in the code, the specified number of bytes to read and
the actual number of bytes read differ by one. This one-liner patch fixes
that inconsistency.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Cc: Riki Oktarianto <rkoktarianto@gmail.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jim Cromie [Sun, 19 Oct 2008 03:27:35 +0000 (20:27 -0700)]
hwmon/pc87360 separate alarm files: add therm-min/max/crit-alarms
Adds therm-min/max/crit-alarm callbacks, sensor-device-attribute
declarations, and refs to those new decls in the macro used to initialize
the therm_group (of sysfs files)
The thermistors use voltage channels to measure; so they don't have a
fault-alarm, but unlike the other voltages, they do have an overtemp,
which we call crit (by convention).
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jim Cromie [Sun, 19 Oct 2008 03:27:34 +0000 (20:27 -0700)]
hwmon/pc87360 separate alarm files: add dev_dbg help
temp and vin status register values may be set by chip specifications, set
again by bios, or by this previously loaded driver. Debug output nicely
displays modprobe init=\d actions.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jim Cromie [Sun, 19 Oct 2008 03:27:33 +0000 (20:27 -0700)]
hwmon/pc87360 separate alarm files: define LDNI_MAX const
Driver handles 3 logical devices in fixed length array. Give this a
define-d constant.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jim Cromie [Sun, 19 Oct 2008 03:27:32 +0000 (20:27 -0700)]
hwmon/pc87360 separate alarm files: add temp-min/max/crit/fault-alarms
Adds temp-min/max/crit/fault-alarm callbacks, sensor-device-attribute
declarations, and refs to those new decls in the macro used to initialize
the temp_group (of sysfs files)
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jim Cromie [Sun, 19 Oct 2008 03:27:32 +0000 (20:27 -0700)]
hwmon/pc87360 separate alarm files: add in-min/max-alarms
Adds vin-min/max-alarm callbacks, sensor-device-attribute declarations,
and refs to those new decls in the macro used to initialize the vin_group
(of sysfs files)
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jim Cromie [Sun, 19 Oct 2008 03:27:30 +0000 (20:27 -0700)]
hwmon/pc87360 separate alarm files: define some constants
Bring hwmon/pc87360 into agreement with
Documentation/hwmon/sysfs-interface.
Patchset adds separate limit alarms for voltages and temps, it also adds
temp[123]_fault files. On my Soekris, temps 1,2 are unused/unconnected,
so temp[123]_fault = 1,1,0 respectively. This agrees with
/usr/bin/sensors, which has always shown them as OPEN. Temps 4,5,6 are
thermistor based, and dont have a fault bit in their status register.
This patch:
2 different kinds of constants added:
- CHAN_ALM_* constants for (later) vin, temp alarm callbacks.
- CHAN_* conversion constants, used in _init_device, partly for RW1C bits
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ameya Palande [Sun, 19 Oct 2008 03:27:30 +0000 (20:27 -0700)]
intel-iommu: typo fix and correct word in the comment
Fix for a typo and and replacing incorrect word in the comment.
Signed-off-by: Ameya Palande <2ameya@gmail.com>
Cc: "Ashok Raj" <ashok.raj@intel.com>
Cc: "Shaohua Li" <shaohua.li@intel.com>
Cc: "Anil S Keshavamurthy" <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WANG Cong [Sun, 19 Oct 2008 03:27:29 +0000 (20:27 -0700)]
kernel/configs.c: remove useless comments
These comments are useless, remove them.
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Piel [Sun, 19 Oct 2008 03:27:28 +0000 (20:27 -0700)]
HP-WMI: additional keycode (or typo)
On my HP 2510, pressing the (i) button generates an unknown keycode:
0x213b. So here is a patch adding support for it. However, as it seems
there is already support for a similar button connected to 0x231b as
keycode, I wonder if it could be a typo in the driver?
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Sun, 19 Oct 2008 03:27:27 +0000 (20:27 -0700)]
Fix documentation of sysrq-q
I fell into the trap recently that it only dumps hrtimers instead of
all timers. Fix the documentation.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WANG Cong [Sun, 19 Oct 2008 03:27:26 +0000 (20:27 -0700)]
uml: fix a compile error
Fix
arch/um/sys-i386/signal.c: In function 'copy_sc_from_user':
arch/um/sys-i386/signal.c:182: warning: dereferencing 'void *' pointer
arch/um/sys-i386/signal.c:182: error: request for member '_fxsr_env' in something not a structure or union
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Huang Weiyi [Sun, 19 Oct 2008 03:27:25 +0000 (20:27 -0700)]
arch/m68k/bvme6000/rtc.c: remove duplicated include
Removed duplicated include file <linux/smp_lock.h> in
arch/m68k/bvme6000/rtc.c.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:24 +0000 (20:27 -0700)]
container freezer: document the cgroup freezer subsystem.
Describe why we need the freezer subsystem and how to use it in a
documentation file. Since the cgroups.txt file is focused on the
subsystem-agnostic portions of cgroups make a directory and move the old
cgroups.txt file at the same time.
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: containers@lists.linux-foundation.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:24 +0000 (20:27 -0700)]
container freezer: rename check_if_frozen()
check_if_frozen() sounds like it should return something when in fact it's
just updating the freezer state.
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:23 +0000 (20:27 -0700)]
container freezer: make freezer state names less generic
Rename cgroup freezer states to be less generic to avoid any name
collisions while also better describing what each state is.
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:22 +0000 (20:27 -0700)]
container freezer: prevent frozen tasks or cgroups from changing
Don't let frozen tasks or cgroups change. This means frozen tasks can't
leave their current cgroup for another cgroup. It also means that tasks
cannot be added to or removed from a cgroup in the FROZEN state. We
enforce these rules by checking for frozen tasks and cgroups in the
can_attach() function.
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:22 +0000 (20:27 -0700)]
container freezer: skip frozen cgroups during power management resume
When a system is resumed after a suspend, it will also unfreeze frozen
cgroups.
This patchs modifies the resume sequence to skip the tasks which are part
of a frozen control group.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:21 +0000 (20:27 -0700)]
container freezer: implement freezer cgroup subsystem
This patch implements a new freezer subsystem in the control groups
framework. It provides a way to stop and resume execution of all tasks in
a cgroup by writing in the cgroup filesystem.
The freezer subsystem in the container filesystem defines a file named
freezer.state. Writing "FROZEN" to the state file will freeze all tasks
in the cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup. Reading will return the current state.
* Examples of usage :
# mkdir /containers/freezer
# mount -t cgroup -ofreezer freezer /containers
# mkdir /containers/0
# echo $some_pid > /containers/0/tasks
to get status of the freezer subsystem :
# cat /containers/0/freezer.state
RUNNING
to freeze all tasks in the container :
# echo FROZEN > /containers/0/freezer.state
# cat /containers/0/freezer.state
FREEZING
# cat /containers/0/freezer.state
FROZEN
to unfreeze all tasks in the container :
# echo RUNNING > /containers/0/freezer.state
# cat /containers/0/freezer.state
RUNNING
This is the basic mechanism which should do the right thing for user space
task in a simple scenario.
It's important to note that freezing can be incomplete. In that case we
return EBUSY. This means that some tasks in the cgroup are busy doing
something that prevents us from completely freezing the cgroup at this
time. After EBUSY, the cgroup will remain partially frozen -- reflected
by freezer.state reporting "FREEZING" when read. The state will remain
"FREEZING" until one of these things happens:
1) Userspace cancels the freezing operation by writing "RUNNING" to
the freezer.state file
2) Userspace retries the freezing operation by writing "FROZEN" to
the freezer.state file (writing "FREEZING" is not legal
and returns EIO)
3) The tasks that blocked the cgroup from entering the "FROZEN"
state disappear from the cgroup's set of tasks.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: export thaw_process]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Helsley [Sun, 19 Oct 2008 03:27:19 +0000 (20:27 -0700)]
container freezer: make refrigerator always available
Now that the TIF_FREEZE flag is available in all architectures, extract
the refrigerator() and freeze_task() from kernel/power/process.c and make
it available to all.
The refrigerator() can now be used in a control group subsystem
implementing a control group freezer.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>