Theodore Ts'o [Sat, 13 Jun 2009 14:09:41 +0000 (10:09 -0400)]
ext4: change s_mount_opt to be an unsigned int
We can only fit 32 options in s_mount_opt because an unsigned long is
32-bits on a x86 machine. So use an unsigned int to save space on
64-bit platforms.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Akira Fujita [Wed, 17 Jun 2009 23:24:03 +0000 (19:24 -0400)]
ext4: online defrag -- Add EXT4_IOC_MOVE_EXT ioctl
The EXT4_IOC_MOVE_EXT exchanges the blocks between orig_fd and donor_fd,
and then write the file data of orig_fd to donor_fd.
ext4_mext_move_extent() is the main fucntion of ext4 online defrag,
and this patch includes all functions related to ext4 online defrag.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Theodore Ts'o [Mon, 27 Apr 2009 21:33:23 +0000 (17:33 -0400)]
ext4: avoid unnecessary spinlock in critical POSIX ACL path
If a filesystem supports POSIX ACL's, the VFS layer expects the filesystem
to do POSIX ACL checks on any files not owned by the caller, and it does
this for every single pathname component that it looks up.
That obviously can be pretty expensive if the filesystem isn't careful
about it, especially with locking. That's doubly sad, since the common
case tends to be that there are no ACL's associated with the files in
question.
ext4 already caches the ACL data so that it doesn't have to look it up
over and over again, but it does so by taking the inode->i_lock spinlock
on every lookup. Which is a noticeable overhead even if it's a private
lock, especially on CPU's where the serialization is expensive (eg Intel
Netburst aka 'P4').
For the special case of not actually having any ACL's, all that locking is
unnecessary. Even if somebody else were to be changing the ACL's on
another CPU, we simply don't care - if we've seen a NULL ACL, we might as
well use it.
So just load the ACL speculatively without any locking, and if it was
NULL, just use it. If it's non-NULL (either because we had a cached
entry, or because the cache hasn't been filled in at all), it means that
we'll need to get the lock and re-load it properly.
(This commit was ported from a patch originally authored by Linus for
ext3.)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Linus Torvalds [Mon, 27 Apr 2009 21:24:22 +0000 (17:24 -0400)]
ext3: avoid unnecessary spinlock in critical POSIX ACL path
If a filesystem supports POSIX ACL's, the VFS layer expects the filesystem
to do POSIX ACL checks on any files not owned by the caller, and it does
this for every single pathname component that it looks up.
That obviously can be pretty expensive if the filesystem isn't careful
about it, especially with locking. That's doubly sad, since the common
case tends to be that there are no ACL's associated with the files in
question.
ext3 already caches the ACL data so that it doesn't have to look it up
over and over again, but it does so by taking the inode->i_lock spinlock
on every lookup. Which is a noticeable overhead even if it's a private
lock, especially on CPU's where the serialization is expensive (eg Intel
Netburst aka 'P4').
For the special case of not actually having any ACL's, all that locking is
unnecessary. Even if somebody else were to be changing the ACL's on
another CPU, we simply don't care - if we've seen a NULL ACL, we might as
well use it.
So just load the ACL speculatively without any locking, and if it was
NULL, just use it. If it's non-NULL (either because we had a cached
entry, or because the cache hasn't been filled in at all), it means that
we'll need to get the lock and re-load it properly.
This is noticeable even on Nehalem, which does locking quite well (much
better than P4). From lmbench:
Processor, Processes - times in microseconds - smaller is better
--------------------------------------------------------------------
Host OS Mhz null null open slct fork exec sh
call I/O stat clos TCP proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ----
- before:
nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.45 2.18 69.1 273. 1141
nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.48 2.28 69.9 253. 1140
nehalem.l Linux 2.6.30- 3193 0.04 0.10 0.95 1.42 2.19 68.6 284. 1141
- after:
nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.44 2.12 68.3 282. 1094
nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.20 67.0 308. 1123
nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.36 67.4 293. 1148
where you can see what appears to be a roughly 3% improvement in stat
and open/close latencies from just the removal of the locking overhead.
Of course, this only matters for files you don't own (the owner never
needs to do the ACL checks), but that's the common case for libraries,
header files, and executables. As well as for the base components of any
absolute pathname, even if you are the owner of the final file.
[ At some point we probably want to move this ACL caching logic entirely
into the VFS layer (and only call down to the filesystem when
uncached), but in the meantime this improves ext3 a bit.
A similar fix to btrfs makes a much bigger difference (15x improvement
in lmbench) due to broken caching. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Theodore Ts'o [Wed, 17 Jun 2009 15:48:11 +0000 (11:48 -0400)]
ext4: convert instrumentation from markers to tracepoints
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Theodore Ts'o [Wed, 17 Jun 2009 15:47:48 +0000 (11:47 -0400)]
jbd2: convert instrumentation from markers to tracepoints
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Linus Torvalds [Wed, 17 Jun 2009 04:26:42 +0000 (21:26 -0700)]
Merge branch 'next-i2c' of git://aeryn.fluff.org.uk/bjdooks/linux
* 'next-i2c' of git://aeryn.fluff.org.uk/bjdooks/linux:
i2c-stu300: Make driver depend on MACH_U300
i2c-s3c2410: use resource_size()
i2c: Use resource_size macro
i2c: ST DDC I2C U300 bus driver v3
i2c-bfin-twi: pull in io.h for ioremap()
Linus Torvalds [Wed, 17 Jun 2009 04:20:39 +0000 (21:20 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon: switch to using late_initcall
radeon legacy chips: tv dac bg/dac adj updates
drm/radeon: introduce kernel modesetting for radeon hardware
drm: Add the TTM GPU memory manager subsystem.
drm: Memory fragmentation from lost alignment blocks
drm/radeon: fix mobility flags on new PCI IDs.
David Howells [Tue, 16 Jun 2009 20:36:49 +0000 (21:36 +0100)]
AFS: Correctly translate auth error aborts and don't failover in such cases
Authentication error abort codes should be translated to appropriate
Linux error codes, rather than all being translated to EREMOTEIO - which
indicates that the server had internal problems.
Additionally, a server shouldn't be marked unavailable and the next
server tried if an authentication error occurs. This will quickly make
all the servers unavailable to the client. Instead the error should be
returned straight to the user.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 16 Jun 2009 20:36:44 +0000 (21:36 +0100)]
RxRPC: Don't attempt to reuse aborted connections
Connections that have seen a connection-level abort should not be reused
as the far end will just abort them again; instead a new connection
should be made.
Connection-level aborts occur due to such things as authentication
failures.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 17 Jun 2009 04:15:42 +0000 (21:15 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (425 commits)
V4L/DVB (11870): gspca - main: VIDIOC_ENUM_FRAMESIZES ioctl added.
V4L/DVB (12004): poll method lose race condition
V4L/DVB (11894): flexcop-pci: dmesg visible names broken
V4L/DVB (11892): Siano: smsendian - declare function as extern
V4L/DVB (11891): Siano: smscore - bind the GPIO SMS protocol
V4L/DVB (11890): Siano: smscore - remove redundant code
V4L/DVB (11889): Siano: smsdvb - add DVB v3 events
V4L/DVB (11888): Siano: smsusb - remove redundant ifdef
V4L/DVB (11887): Siano: smscards - add board (target) events
V4L/DVB (11886): Siano: smscore - fix some new GPIO definitions names
V4L/DVB (11885): Siano: Add new GPIO management interface
V4L/DVB (11884): Siano: smssdio - revert to stand alone module
V4L/DVB (11883): Siano: cards - add two additional (USB) devices
V4L/DVB (11824): Siano: smsusb - change exit func debug msg
V4L/DVB (11823): Siano: smsusb - fix typo in module description
V4L/DVB (11822): Siano: smscore - bug fix at get_device_mode
V4L/DVB (11821): Siano: smscore - fix isdb-t firmware name
V4L/DVB (11820): Siano: smscore - fix byte ordering bug
V4L/DVB (11819): Siano: smscore - fix get_common_buffer bug
V4L/DVB (11818): Siano: smscards - assign gpio to HPG targets
...
Benjamin Herrenschmidt [Wed, 17 Jun 2009 03:48:39 +0000 (13:48 +1000)]
mm: Move pgtable_cache_init() earlier
Some architectures need to initialize SLAB caches to be able
to allocate page tables. They do that from pgtable_cache_init()
so the later should be called earlier now, best is before
vmalloc_init().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 17 Jun 2009 02:50:13 +0000 (19:50 -0700)]
Merge branch 'akpm'
* akpm: (182 commits)
fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
fbdev: *bfin*: fix __dev{init,exit} markings
fbdev: *bfin*: drop unnecessary calls to memset
fbdev: bfin-t350mcqb-fb: drop unused local variables
fbdev: blackfin has __raw I/O accessors, so use them in fb.h
fbdev: s1d13xxxfb: add accelerated bitblt functions
tcx: use standard fields for framebuffer physical address and length
fbdev: add support for handoff from firmware to hw framebuffers
intelfb: fix a bug when changing video timing
fbdev: use framebuffer_release() for freeing fb_info structures
radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
s3c-fb: CPUFREQ frequency scaling support
s3c-fb: fix resource releasing on error during probing
carminefb: fix possible access beyond end of carmine_modedb[]
acornfb: remove fb_mmap function
mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
mb862xxfb: restrict compliation of platform driver to PPC
Samsung SoC Framebuffer driver: add Alpha Channel support
atmel-lcdc: fix pixclock upper bound detection
offb: use framebuffer_alloc() to allocate fb_info struct
...
Manually fix up conflicts due to kmemcheck in mm/slab.c
Mike Frysinger [Tue, 16 Jun 2009 22:34:43 +0000 (15:34 -0700)]
fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 16 Jun 2009 22:34:42 +0000 (15:34 -0700)]
fbdev: *bfin*: fix __dev{init,exit} markings
The remove member of the platform_driver bfin_t350mcqb_driver should use
__devexit_p() to refer to the remove function, and that function should
get __devexit markings. Likewise, the probe function should be marked
with __devinit and not __init.
Also, module_init() functions should be marked with __init rather than
__devinit.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vivek Kutal [Tue, 16 Jun 2009 22:34:42 +0000 (15:34 -0700)]
fbdev: *bfin*: drop unnecessary calls to memset
The dma_alloc_* functions sets the memory to 0 before returning so there
is no need to call memset after the allocation. Also no point in clearing
the memory when disabling the buffer.
Signed-off-by: Vivek Kutal <vivek.kutal@azingo.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 16 Jun 2009 22:34:41 +0000 (15:34 -0700)]
fbdev: bfin-t350mcqb-fb: drop unused local variables
The local fbinfo/info vars in the suspend functions don't actually get
used which cause ugly gcc warnings, so drop them.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 16 Jun 2009 22:34:40 +0000 (15:34 -0700)]
fbdev: blackfin has __raw I/O accessors, so use them in fb.h
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kristoffer Ericson [Tue, 16 Jun 2009 22:34:40 +0000 (15:34 -0700)]
fbdev: s1d13xxxfb: add accelerated bitblt functions
Add accelerated bitblt functions to s1d13xxx based video chipsets, more
specificly functions copyarea and fillrect.
It has only been tested and activated for 13506 chipsets but is expected
to work for the majority of s1d13xxx based chips. This patch also cleans
up the driver with respect of whitespaces and other formatting issues. We
update the current status comments.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:39 +0000 (15:34 -0700)]
tcx: use standard fields for framebuffer physical address and length
Use standard fields fbinfo.fix.smem_start and fbinfo.fix.smem_len for
physical address and length of framebuffer.
This also fixes output of the 'fbset -i' command - address and length of
the framebuffer are displayed correctly.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Airlie [Tue, 16 Jun 2009 22:34:38 +0000 (15:34 -0700)]
fbdev: add support for handoff from firmware to hw framebuffers
With KMS we have ran into an issue where we really want the KMS fb driver
to be the one running the console, so panics etc can be shown by switching
out of X etc.
However with vesafb/efifb built-in, we end up with those on fb0 and the
KMS fb driver on fb1, driving the same piece of hw, so this adds an fb
info flag to denote a firmware fbdev, and adds a new aperture base/size
range which can be compared when the hw drivers are installed to see if
there is a conflict with a firmware driver, and if there is the firmware
driver is unregistered and the hw driver takes over.
It uses new aperture_base/size members instead of comparing on the fix
smem_start/length, as smem_start/length might for example only cover the
first 1MB of the PCI aperture, and we could allocate the kms fb from 8MB
into the aperture, thus they would never overlap.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Peter Jones <pjones@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menzel [Tue, 16 Jun 2009 22:34:37 +0000 (15:34 -0700)]
intelfb: fix a bug when changing video timing
When changing video timing dynamically via fbset the screen sporadically
is rendered black.
With the attached fix which disables VCO prior to timing register change
the problem disappears.
I had a look at the Xserver register setup code. Here the VCO is
disabled in the same way [1].
This patch is taken from vga-sync-field version 0.0.11 [2][3].
[1] http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/tree/src/i830_=
driver.c
[2] http://lowbyte.de/vga-sync-fields/vga-sync-fields-0.0.11.tgz
[3] http://easy-vdr.de/git?p=frc.git/.git;a=commit;h=
dcc3b863e5a663652587619c357bd20075af6896
2587619c357bd20075af6896
Signed-off-by: Thomas Hilber <sparkie@lowbyte.de>
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:36 +0000 (15:34 -0700)]
fbdev: use framebuffer_release() for freeing fb_info structures
Use the framebuffer_release() for freeing fb_info structures allocated
with framebuffer_alloc().
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Tue, 16 Jun 2009 22:34:35 +0000 (15:34 -0700)]
radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
P2G2CLK_ALWAYS_ONb is tested twice, 2nd should be P2G2CLK_DAC_ALWAYS_ONb.
[akpm@linux-foundation.org: remove duplicated bitwise-OR of PIXCLKS_CNTL__R300_P2G2CLK_ALWAYS_ONb too]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Tue, 16 Jun 2009 22:34:34 +0000 (15:34 -0700)]
s3c-fb: CPUFREQ frequency scaling support
Add support for CPU frequency scaling in the S3C24XX video driver.
Signed-off-by: Simtec Linux Team <linux@simtec.co.uk>
Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:33 +0000 (15:34 -0700)]
s3c-fb: fix resource releasing on error during probing
All resources are released in s3c_fb_win_release so remove other places of
resources releasing. Add releasing of an allocated fb_info structure as
well.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Tue, 16 Jun 2009 22:34:32 +0000 (15:34 -0700)]
carminefb: fix possible access beyond end of carmine_modedb[]
This check is off-by-one.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:32 +0000 (15:34 -0700)]
acornfb: remove fb_mmap function
The driver's fb_mmap function is essentially the same as a generic fb_mmap
function. Delete driver's function and use the generic one.
A difference is that generic function marks frame buffer memory as VM_IO |
VM_RESERVED. The driver's function marks it as VM_IO only.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Tue, 16 Jun 2009 22:34:31 +0000 (15:34 -0700)]
mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
With this change, the driver builds fine on Microblaze, which helps
allyesconfig compile tests.
I did not test sparc, but the change should have the same effect there.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Anatolij Gustschin <agust@denx.de>
Tested-by: Anatolij Gustschin <agust@denx.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Julian Calaby [Tue, 16 Jun 2009 22:34:29 +0000 (15:34 -0700)]
mb862xxfb: restrict compliation of platform driver to PPC
The OpenFirmware part of this driver is uncompilable on SPARC due to it's
dependance on several PPC specific functions.
Restricting this to PPC to prevent these build errors:
CC drivers/video/mb862xx/mb862xxfb.o
drivers/video/mb862xx/mb862xxfb.c: In function 'of_platform_mb862xx_probe':
drivers/video/mb862xx/mb862xxfb.c:559: error: implicit declaration of function 'of_address_to_resource'
drivers/video/mb862xx/mb862xxfb.c:575: error: 'NO_IRQ' undeclared (first use in this function)
drivers/video/mb862xx/mb862xxfb.c:575: error: (Each undeclared identifier is reported only once
drivers/video/mb862xx/mb862xxfb.c:575: error: for each function it appears in.)
This was found using randconfig builds.
Signed-off-by: Julian Calaby <julian.calaby@gmail.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
InKi Dae [Tue, 16 Jun 2009 22:34:27 +0000 (15:34 -0700)]
Samsung SoC Framebuffer driver: add Alpha Channel support
Add support for the ARGB1888 and ARGB4888 hardware to the Samsung SoC
Framebuffer driver (s3c-fb.c).
ARGB1888 and ARGB4888 is decided by var->transp.length and this variable
is set by s3c_fb_check_var().
In s3c_fb_check_var(), if var->vits_per_pixel is 25 or 28, then
var->transp.length would be 1 or 3.
Therefore alpha mode(ARGB1888 or ARGB4888) could be decided through that
variable.
For using alpha mode, you need to set the following: This code should be
added to your machine code as platform data.
static struct s3c_fb_pd_win xxx_fb_win0 = {
/* this is to ensure we use win0 */
.win_mode = {
.pixclock = (8+8+8+240)*(38+4+38+400),
.left_margin = 8,
.right_margin = 8,
.upper_margin = 38,
.lower_margin = 38,
.hsync_len = 8,
.vsync_len = 4,
.xres = 240,
.yres = 400,
},
.max_bpp = 32,
.default_bpp = 24,
};
static struct s3c_fb_pd_win xxx_fb_win1 = {
.win_mode = {
.pixclock = (8+8+8+240)*(38+4+38+400),
.left_margin = 8,
.right_margin = 8,
.upper_margin = 38,
.lower_margin = 38,
.hsync_len = 8,
.vsync_len = 4,
.xres = 240,
.yres = 400,
},
.max_bpp = 32,
.default_bpp = 28,
};
static struct s3c_fb_platdata xxx_lcd_pdata __initdata = {
.win[0] = &ncp_fb_win0,
.win[1] = &ncp_fb_win1,
.vidcon0 = VIDCON0_VIDOUT_RGB | VIDCON0_PNRMODE_RGB,
.vidcon1 = VIDCON1_INV_HSYNC | VIDCON1_INV_VSYNC,
.setup_gpio = xxx_fb_gpio_setup,
};
s3c_fb_set_platdata(&xxx_lcd_pdata);
The above code sets pixelformat for window0 layer to RGB888 and window1
layer to ARGB4888.
Signed-off-by: InKi Dae <inki.dae@samsung.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kyungmin Park <kmpark@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Nizette [Tue, 16 Jun 2009 22:34:24 +0000 (15:34 -0700)]
atmel-lcdc: fix pixclock upper bound detection
AFAICT the code which checks that the requested pixclock value is within
bounds is incorrect. It ensures that the lcdc core clock is at least
(bytes per pixel) times higher than the pixel clock rather than just
greater than or equal to.
There are tighter restrictions on the pixclock value as a function of bus
width for STN panels but even then it isn't a simple relationship as
currently checked for. IMO either something like the below patch should
be applied or else more detailed checking logic should be implemented
which takes in to account the panel type as well.
Signed-off-by: Ben Nizette <bn@niasdigital.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Daniel Glockner <dg@emlix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:23 +0000 (15:34 -0700)]
offb: use framebuffer_alloc() to allocate fb_info struct
Use the framebuffer_alloc() function to allocate the fb_info structure so
the structure is correctly initialized after allocation.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:23 +0000 (15:34 -0700)]
igafb: use framebuffer_alloc() to allocate fb_info struct
Use the framebuffer_alloc() function to allocate the fb_info
structure so the structure is correctly initialized after allocation.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Helt [Tue, 16 Jun 2009 22:34:22 +0000 (15:34 -0700)]
chipsfb: remove redundant assignment
The removed assignment is done inside the framebuffer_alloc() earlier.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menzel [Tue, 16 Jun 2009 22:34:21 +0000 (15:34 -0700)]
Documentation/fb/vesafb.txt: fix typo
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Cc: Gerd Knorr <kraxel@goldbach.in-berlin.de>
Cc: Nico Schmoigl <schmoigl@rumms.uni-mannheim.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menzel [Tue, 16 Jun 2009 22:34:20 +0000 (15:34 -0700)]
fbdev: add video modes for resolutions and timings of PAL RGB
This patch was taken from vga-sync-field version 0.0.3 [1][2].
[1] http://lowbyte.de/vga-sync-fields/vga-sync-fields-0.0.3.tgz
[2] http://git.hellersdorfer-jugendchor.de/?p=3Dvga2scart.git;a=3Dcommit;h=
=3Dc5c8ed6c51fc9879dbf38d8b91d5db6f4300ea03
Signed-off-by: Thomas Hilber <sparkie@lowbyte.de>
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Tue, 16 Jun 2009 22:34:19 +0000 (15:34 -0700)]
fbdev: move logo externs to header file
Now we have __initconst, we can finally move the external declarations for
the various Linux logo structures to <linux/linux_logo.h>.
James' ack dates back to the previous submission (way to long ago), when the
logos were still __initdata, which caused failures on some platforms with some
toolchain versions.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: James Simmons <jsimmons@infradead.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sam Ravnborg [Tue, 16 Jun 2009 22:34:18 +0000 (15:34 -0700)]
fbdev: generated logo sources depend on scripts/pnmtologo
The generated logo sources are not automatically regenerated if
scripts/pnmtologo.c has changed. Add the missing dependency to fix this.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Tue, 16 Jun 2009 22:34:17 +0000 (15:34 -0700)]
lis3: add click function
The LIS302DL accelerometer chip has a 'click' feature which can be used to
detect sudden motion on any of the three axis. Configuration data is
passed via spi platform_data and no action is taken if that's not
specified, so it won't harm any existing platform.
To make the configuration effective, the IRQ lines need to be set up
appropriately. This patch also adds a way to do that from board support
code.
The DD_* definitions were factored out to an own enum because they are
specific to LIS3LV02D devices.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Piel [Tue, 16 Jun 2009 22:34:16 +0000 (15:34 -0700)]
lis3: add three new laptop models
Separate the 6710 and 6715, and set the right axis information for the
6715.
Reported-by: Isaac702 <isaac702@gmail.com>
Add the 6930.
Reported-by: Christian Weidle <slateroni@gmail.com>
Add the 2710.
Reported-by: Pavel Herrmann <morpheus.ibis@gmail.com>
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Piel [Tue, 16 Jun 2009 22:34:15 +0000 (15:34 -0700)]
lis3: use input_polled_device
Now that there is no need to hookup on the open/close of the joystick,
it's possible to use the simplified interface input_polled_device, instead
of creating our own kthread.
[randy.dunlap@oracle.com: fix Kconfig]
[randy.dunlap@oracle.com: fix Kconfig some more]
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Piel [Tue, 16 Jun 2009 22:34:14 +0000 (15:34 -0700)]
lis3: remove automatic shutdown of the device
After measurement on my laptop, it seems that turning off the device does
not bring any energy saving (within 0.1W precision). So let's keep the
device always on. It simplifies the code, and it avoids the problem of
reading a wrong value sometimes just after turning the device on.
Moreover, since commit
ef2cfc790bf5f0ff189b01eabc0f4feb5e8524df had been
too zealous, the device was actually never turned off anyway. This patch
also restores the damages done by this commit concerning the
initialisation/poweroff.
Also do more clean up with the usage of the lis3_dev global variable.
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Piel [Tue, 16 Jun 2009 22:34:13 +0000 (15:34 -0700)]
lis3: fix misc device unregistering and printk
Can only unregister the misc device if it was registered before. Also
remove debugging messages, which in addition were not properly formated.
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfram Strepp [Tue, 16 Jun 2009 22:34:13 +0000 (15:34 -0700)]
rb_tree: remove redundant if()-condition in rb_erase()
Furthermore, notice that the initial checks:
if (!node->rb_left)
child = node->rb_right;
else if (!node->rb_right)
child = node->rb_left;
else
{
...
}
guarantee that old->rb_right is set in the final else branch, therefore
we can omit checking that again.
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfram Strepp [Tue, 16 Jun 2009 22:34:12 +0000 (15:34 -0700)]
rb_tree: make clear distinction between two different cases in rb_erase()
There are two cases when a node, having 2 childs, is erased:
'normal case': the successor is not the right-hand-child of the node to be erased
'special case': the successor is the right-hand child of the node to be erased
Here some ascii-art, with following symbols (referring to the code):
O: node to be deleted
N: the successor of O
P: parent of N
C: child of N
L: some other node
normal case:
O N
/ \ / \
/ \ / \
L \ L \
/ \ P ----> / \ P
/ \ / \
/ /
N C
\ / \
\
C
/ \
special case:
O|P N
/ \ / \
/ \ / \
L \ L \
/ \ N ----> / C
\ / \
\
C
/ \
Notice that for the special case we don't have to reconnect C to N.
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfram Strepp [Tue, 16 Jun 2009 22:34:11 +0000 (15:34 -0700)]
rb_tree: reorganize code in rb_erase() for additional changes
First, move some code around in order to make the next change more obvious.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wolfram Strepp <wstrepp@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:10 +0000 (15:34 -0700)]
MAINTAINERS: add file patterns to TTY LAYER
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:09 +0000 (15:34 -0700)]
MAINTAINERS: add Paul McKenney to RCU and RCUTORTURE
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:09 +0000 (15:34 -0700)]
MAINTAINERS: add file pattern to CISCO FCOE HBA DRIVER
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:08 +0000 (15:34 -0700)]
MAINTAINERS: mention scripts/get_maintainer.pl in the preface
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:07 +0000 (15:34 -0700)]
MAINTAINERS: remove L: linux-kernel@vger. from all but "THE REST"
lkml is added to all CC lists via pattern matching on "THE REST"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:07 +0000 (15:34 -0700)]
MAINTAINERS: mark ALSA lists as moderated
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:06 +0000 (15:34 -0700)]
MAINTAINERS: update M32R file patterns after rename
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:06 +0000 (15:34 -0700)]
MAINTAINERS: add file patterns to "THE REST"
These file patterns match all sources.
By default, scripts/get_maintainers.pl excludes Linus Torvalds
from the CC: list. Option --git-chief-penguins will include him.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:05 +0000 (15:34 -0700)]
MAINTAINERS: swap mismarked ECRYPT FS M: and P: entries
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:04 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: change "die" to "warn" when command line file is not a patch
fixes git send-email with a cover letter
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:04 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: allow 8 bit characters in email addresses
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:03 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: don't print maintainers when not requested
Fixed bug introduced after using rfc822 address checking.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:02 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: support both "P:/M:" and integrated "M:" lines
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:02 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: better email name quoting
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:01 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: support M: lines with names and multiple entries per M: line
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:01 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: warn on missing git or git repository
support older versions of grep (use -E not -P)
no need to return data in routine recent_git_signoffs
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:34:00 +0000 (15:34 -0700)]
scripts/get_maintainer.pl: improve --git-chief-penquins (Linus Torvalds) filtering
Moved linux-kernel@vger.kernel.org to MAINTAINERS
lkml will be added to all CC lists via F: pattern match
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:33:59 +0000 (15:33 -0700)]
scripts/get_maintainer.pl: better fix for subscriber-only mailing lists
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 16 Jun 2009 22:33:58 +0000 (15:33 -0700)]
scripts/get_maintainer.pl: output first field only in mailing lists and after maintainers.
Fix mailing lists that are described, but not "(subscriber-only)"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zygo Blaxell [Tue, 16 Jun 2009 22:33:57 +0000 (15:33 -0700)]
lib/genalloc.c: remove unmatched write_lock() in gen_pool_destroy
There is a call to write_lock() in gen_pool_destroy which is not balanced
by any corresponding write_unlock(). This causes problems with preemption
because the preemption-disable counter is incremented in the write_lock()
call, but never decremented by any call to write_unlock(). This bug is
gen_pool_destroy, and one of them is non-x86 arch-specific code.
Signed-off-by: Zygo Blaxell <zygo.blaxell@xandros.com>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tomas Szepe [Tue, 16 Jun 2009 22:33:56 +0000 (15:33 -0700)]
CONFIG_FILE_LOCKING should not depend on CONFIG_BLOCK
CONFIG_FILE_LOCKING should not depend on CONFIG_BLOCK.
This makes it possible to run complete systems out of a CONFIG_BLOCK=n
initramfs on current kernels again (this last worked on 2.6.27.*).
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Florian Fainelli [Tue, 16 Jun 2009 22:33:53 +0000 (15:33 -0700)]
drivers: add support for the TI VLYNQ bus
Add support for the TI VLYNQ high-speed, serial and packetized bus.
This bus allows external devices to be connected to the System-on-Chip and
appear in the main system memory just like any memory mapped peripheral.
It is widely used in TI's networking and multimedia SoC, including the AR7
SoC.
Signed-off-by: Eugene Konev <ejka@imfi.kspu.ru>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Tue, 16 Jun 2009 22:33:52 +0000 (15:33 -0700)]
console: make blank timeout value a boot option
The console blank timer is currently hardcoded to 10*60 seconds which
might be annoying on systems with no input devices attached to wake up the
console again. Especially during development, disabling the screen saver
can be handy - for example when debugging the root fs mount mechanism or
other scenarios where no userspace program could be started to do that at
runtime from userspace.
This patch defines a core_param for the variable in charge which allows
users to entirely disable the blank feature at boot time by setting it 0.
The value can still be overwritten at runtime using the standard ioctl
call - this just allows to conditionally change the default.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Figo.zhang [Tue, 16 Jun 2009 22:33:51 +0000 (15:33 -0700)]
Documentation/atomic_ops.txt: fix sample code
list_add() lost a parameter in sample code.
Signed-off-by: Figo.zhang <figo1802@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maciej W. Rozycki [Tue, 16 Jun 2009 22:33:50 +0000 (15:33 -0700)]
eisa.ids: add Network Peripherals FDDI boards
Add EISA IDs for Network Peripherals FDDI boards. Descriptions taken from
the respective EISA configuration files.
It's unlikely we'll ever support these cards, the problem being the lack
of documentation. Assuming the policy for the EISA ID database is the
same as for PCI I'm sending these entries for the sake of completeness.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Marc Zyngier <maz@misterjones.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Masatake YAMATO [Tue, 16 Jun 2009 22:33:49 +0000 (15:33 -0700)]
syscalls.h: remove duplicated declarations for sys_pipe2
sys_pipe2 is declared twice in include/linux/syscalls.h.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Tue, 16 Jun 2009 22:33:47 +0000 (15:33 -0700)]
kmap_types: make most arches use generic header file
Convert most arches to use asm-generic/kmap_types.h.
Move the KM_FENCE_ macro additions into asm-generic/kmap_types.h,
controlled by __WITH_KM_FENCE from each arch's kmap_types.h file.
Would be nice to be able to add custom KM_types per arch, but I don't yet
see a nice, clean way to do that.
Built on x86_64, i386, mips, sparc, alpha(tonyb), powerpc(tonyb), and
68k(tonyb).
Note: avr32 should be able to remove KM_PTE2 (since it's not used) and
then just use the generic kmap_types.h file. Get avr32 maintainer
approval.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaswinder Singh Rajput [Tue, 16 Jun 2009 22:33:46 +0000 (15:33 -0700)]
Documentation/accounting/getdelays.c intialize the variable before using it
Fix compilation warning:
Documentation/accounting/getdelays.c: In function `main':
Documentation/accounting/getdelays.c:249: warning: `cmd_type' may be used uninitialized in this function
This is in fact a false positive.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Tue, 16 Jun 2009 22:33:45 +0000 (15:33 -0700)]
hexdump: remove the trailing space
For example:
hex_dump_to_buffer("AB", 2, 16, 1, buf, 100, 0);
pr_info("[%s]\n", buf);
I'd expect the output to be "[41 42]", but actually it's "[41 42 ]"
This patch also makes the required buf to be minimum. To print the hex
format of "AB", a buf with size 6 should be sufficient, but
hex_dump_to_buffer() required at least 8.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Tue, 16 Jun 2009 22:33:44 +0000 (15:33 -0700)]
use printk_once() in several places
There are some places to be able to use printk_once instead of hard coding.
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Peterson [Tue, 16 Jun 2009 22:33:43 +0000 (15:33 -0700)]
slow-work: use round_jiffies() for thread pool's cull and OOM timers
Round the slow work queue's cull and OOM timeouts to whole second boundary
with round_jiffies(). The slow work queue uses a pair of timers to cull
idle threads and, after OOM, to delay new thread creation.
This patch also extracts the mod_timer() logic for the cull timer into a
separate helper function.
By rounding non-time-critical timers such as these to whole seconds, they
will be batched up to fire at the same time rather than being spread out.
This allows the CPU wake up less, which saves power.
Signed-off-by: Chris Peterson <cpeterso@cpeterso.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Huang Shijie [Tue, 16 Jun 2009 22:33:42 +0000 (15:33 -0700)]
lib: do code optimization for radix_tree_lookup() and radix_tree_lookup_slot()
radix_tree_lookup() and radix_tree_lookup_slot() have much the
same code except for the return value.
Introduce radix_tree_lookup_element() to do the real work.
/*
* is_slot == 1 : search for the slot.
* is_slot == 0 : search for the node.
*/
static void * radix_tree_lookup_element(struct radix_tree_root *root,
unsigned long index, int is_slot);
Signed-off-by: Huang Shijie <shijie8@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Tue, 16 Jun 2009 22:33:40 +0000 (15:33 -0700)]
groups: move code to kernel/groups.c
Move supplementary groups implementation to kernel/groups.c .
kernel/sys.c already accumulated quite a few random stuff.
Do strictly copy/paste + add required headers to compile. Compile-tested
on many configs and archs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Tue, 16 Jun 2009 22:33:39 +0000 (15:33 -0700)]
remove put_cpu_no_resched()
put_cpu_no_resched() is an optimization of put_cpu() which unfortunately
can cause high latencies.
The nfs iostats code uses put_cpu_no_resched() in a code sequence where a
reschedule request caused by an interrupt between the get_cpu() and the
put_cpu_no_resched() can delay the reschedule for at least HZ.
The other users of put_cpu_no_resched() optimize correctly in interrupt
code, but there is no real harm in using the put_cpu() function which is
an alias for preempt_enable(). The extra check of the preemmpt count is
not as critical as the potential source of missing a reschedule.
Debugged in the preempt-rt tree and verified in mainline.
Impact: remove a high latency source
[akpm@linux-foundation.org: build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Philipp Reisner [Tue, 16 Jun 2009 22:33:38 +0000 (15:33 -0700)]
drbd: add major number to major.h
Since we have had a LANANA major number for years, and it is documented in
devices.txt, I think that this first patch can go upstream.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 16 Jun 2009 22:33:37 +0000 (15:33 -0700)]
headers: move module_bug_finalize()/module_bug_cleanup() definitions into module.h
They're in linux/bug.h at present, which causes include order tangles. In
particular, linux/bug.h cannot be used by linux/atomic.h because,
according to Nikanth:
linux/bug.h pulls in linux/module.h => linux/spinlock.h => asm/spinlock.h
(which uses atomic_inc) => asm/atomic.h.
bug.h is a pretty low-level thing and module.h is a higher-level thing,
IMO.
Cc: Nikanth Karthikesan <knikanth@novell.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Tue, 16 Jun 2009 22:33:36 +0000 (15:33 -0700)]
poll: avoid extra wakeups in select/poll
After introduction of keyed wakeups Davide Libenzi did on epoll, we are
able to avoid spurious wakeups in poll()/select() code too.
For example, typical use of poll()/select() is to wait for incoming
network frames on many sockets. But TX completion for UDP/TCP frames call
sock_wfree() which in turn schedules thread.
When scheduled, thread does a full scan of all polled fds and can sleep
again, because nothing is really available. If number of fds is large,
this cause significant load.
This patch makes select()/poll() aware of keyed wakeups and useless
wakeups are avoided. This reduces number of context switches by about 50%
on some setups, and work performed by sofirq handlers.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robert P. J. Day [Tue, 16 Jun 2009 22:33:35 +0000 (15:33 -0700)]
ntfs: use is_power_of_2() function for clarity.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robert P. J. Day [Tue, 16 Jun 2009 22:33:34 +0000 (15:33 -0700)]
kernel/kfifo.c: replace conditional test with is_power_of_2()
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Blunck [Tue, 16 Jun 2009 22:33:33 +0000 (15:33 -0700)]
atomic: only take lock when the counter drops to zero on UP as well
_atomic_dec_and_lock() should not unconditionally take the lock before
calling atomic_dec_and_test() in the UP case. For consistency reasons it
should behave exactly like in the SMP case.
Besides that this works around the problem that with CONFIG_DEBUG_SPINLOCK
this spins in __spin_lock_debug() if the lock is already taken even if the
counter doesn't drop to 0.
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Smith [Tue, 16 Jun 2009 22:33:33 +0000 (15:33 -0700)]
utsname.h: make new_utsname fields use the proper length constant
The members of the new_utsname structure are defined with magic numbers
that *should* correspond to the constant __NEW_UTS_LEN+1. Everywhere
else, code assumes this and uses the constant, so this patch makes the
structure match.
Originally suggested by Serge here:
https://lists.linux-foundation.org/pipermail/containers/2009-March/016258.html
Signed-off-by: Dan Smith <danms@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Tue, 16 Jun 2009 22:33:32 +0000 (15:33 -0700)]
uml: bad macro expansion, parameter is member
`ELF_CORE_COPY_REGS(x, y)' will make expansions like:
`(y)[0] = (x)->x.gp[0]' but correct is `(y)[0] = (x)->regs.gp[0]'
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: WANG Cong <amwang@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Amerigo Wang [Tue, 16 Jun 2009 22:33:30 +0000 (15:33 -0700)]
uml: fix a section warning
When compiling uml on x86_64:
MODPOST vmlinux.o
WARNING: vmlinux.o (.__syscall_stub.2): unexpected non-allocatable section.
Did you forget to use "ax"/"aw" in a .S file?
Note that for example <linux/init.h> contains
section definitions for use in .S files.
Because modpost checks for missing SHF_ALLOC section flag. So just add
it.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Tue, 16 Jun 2009 22:33:29 +0000 (15:33 -0700)]
um: remove obsolete hw_interrupt_type
The defines and typedefs (hw_interrupt_type, no_irq_type, irq_desc_t) have
been kept around for migration reasons. After more than two years it's
time to remove them finally.
This patch cleans up one of the remaining users. When all such patches
hit mainline we can remove the defines and typedefs finally.
Impact: cleanup
Convert the last remaining users to struct irq_chip and remove the
define.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 16 Jun 2009 22:33:28 +0000 (15:33 -0700)]
uml: UML net driver does not allow for vlans
See ancient discussion at
http://marc.info/?l=user-mode-linux-devel&m=
101990155831279&w=2
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=7854
Signed-off-by: Alan Cox <alan@linux.intel.com>
Reported-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Roland Kletzing <devzero@web.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Tue, 16 Jun 2009 22:33:26 +0000 (15:33 -0700)]
m32r: remove obsolete hw_interrupt_type
The defines and typedefs (hw_interrupt_type, no_irq_type, irq_desc_t) have
been kept around for migration reasons. After more than two years it's
time to remove them finally.
This patch cleans up one of the remaining users. When all such patches
hit mainline we can remove the defines and typedefs finally.
Impact: cleanup
Convert the last remaining users to struct irq_chip and remove the
define.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Tue, 16 Jun 2009 22:33:25 +0000 (15:33 -0700)]
alpha: bad macro expansion, parameter is member
`for_each_mem_cluster(x, y, z)' will expand to
`for ((x) = (y)->x ...' but correct is `for ((x) = (y)->cluster ...'
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Tue, 16 Jun 2009 22:33:25 +0000 (15:33 -0700)]
alpha: remove obsolete hw_interrupt_type
The defines and typedefs (hw_interrupt_type, no_irq_type, irq_desc_t) have
been kept around for migration reasons. After more than two years it's
time to remove them finally.
This patch cleans up one of the remaining users. When all such patches
hit mainline we can remove the defines and typedefs finally.
Impact: cleanup
Convert the last remaining users to struct irq_chip and remove the
define.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Tue, 16 Jun 2009 22:33:24 +0000 (15:33 -0700)]
mm: fix lumpy reclaim lru handling at isolate_lru_pages
At lumpy reclaim, a page failed to be taken by __isolate_lru_page() can be
pushed back to "src" list by list_move(). But the page may not be from
"src" list. This pushes the page back to wrong LRU. And list_move()
itself is unnecessary because the page is not on top of LRU. Then, leave
it as it is if __isolate_lru_page() fails.
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Tue, 16 Jun 2009 22:33:23 +0000 (15:33 -0700)]
vmscan: count the number of times zone_reclaim() scans and fails
On NUMA machines, the administrator can configure zone_reclaim_mode that
is a more targetted form of direct reclaim. On machines with large NUMA
distances for example, a zone_reclaim_mode defaults to 1 meaning that
clean unmapped pages will be reclaimed if the zone watermarks are not
being met.
There is a heuristic that determines if the scan is worthwhile but it is
possible that the heuristic will fail and the CPU gets tied up scanning
uselessly. Detecting the situation requires some guesswork and
experimentation so this patch adds a counter "zreclaim_failed" to
/proc/vmstat. If during high CPU utilisation this counter is increasing
rapidly, then the resolution to the problem may be to set
/proc/sys/vm/zone_reclaim_mode to 0.
[akpm@linux-foundation.org: name things consistently]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Tue, 16 Jun 2009 22:33:22 +0000 (15:33 -0700)]
vmscan: do not unconditionally treat zones that fail zone_reclaim() as full
On NUMA machines, the administrator can configure zone_reclaim_mode that
is a more targetted form of direct reclaim. On machines with large NUMA
distances for example, a zone_reclaim_mode defaults to 1 meaning that
clean unmapped pages will be reclaimed if the zone watermarks are not
being met. The problem is that zone_reclaim() failing at all means the
zone gets marked full.
This can cause situations where a zone is usable, but is being skipped
because it has been considered full. Take a situation where a large tmpfs
mount is occuping a large percentage of memory overall. The pages do not
get cleaned or reclaimed by zone_reclaim(), but the zone gets marked full
and the zonelist cache considers them not worth trying in the future.
This patch makes zone_reclaim() return more fine-grained information about
what occured when zone_reclaim() failued. The zone only gets marked full
if it really is unreclaimable. If it's a case that the scan did not occur
or if enough pages were not reclaimed with the limited reclaim_mode, then
the zone is simply skipped.
There is a side-effect to this patch. Currently, if zone_reclaim()
successfully reclaimed SWAP_CLUSTER_MAX, an allocation attempt would go
ahead. With this patch applied, zone watermarks are rechecked after
zone_reclaim() does some work.
This bug was introduced by commit
9276b1bc96a132f4068fdee00983c532f43d3a26
("memory page_alloc zonelist caching speedup") way back in 2.6.19 when the
zonelist_cache was introduced. It was not intended that zone_reclaim()
aggressively consider the zone to be full when it failed as full direct
reclaim can still be an option. Due to the age of the bug, it should be
considered a -stable candidate.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Tue, 16 Jun 2009 22:33:20 +0000 (15:33 -0700)]
vmscan: properly account for the number of page cache pages zone_reclaim() can reclaim
A bug was brought to my attention against a distro kernel but it affects
mainline and I believe problems like this have been reported in various
guises on the mailing lists although I don't have specific examples at the
moment.
The reported problem was that malloc() stalled for a long time (minutes in
some cases) if a large tmpfs mount was occupying a large percentage of
memory overall. The pages did not get cleaned or reclaimed by
zone_reclaim() because the zone_reclaim_mode was unsuitable, but the lists
are uselessly scanned frequencly making the CPU spin at near 100%.
This patchset intends to address that bug and bring the behaviour of
zone_reclaim() more in line with expectations which were noticed during
investigation. It is based on top of mmotm and takes advantage of
Kosaki's work with respect to zone_reclaim().
Patch 1 fixes the heuristics that zone_reclaim() uses to determine if the
scan should go ahead. The broken heuristic is what was causing the
malloc() stall as it uselessly scanned the LRU constantly. Currently,
zone_reclaim is assuming zone_reclaim_mode is 1 and historically it
could not deal with tmpfs pages at all. This fixes up the heuristic so
that an unnecessary scan is more likely to be correctly avoided.
Patch 2 notes that zone_reclaim() returning a failure automatically means
the zone is marked full. This is not always true. It could have
failed because the GFP mask or zone_reclaim_mode were unsuitable.
Patch 3 introduces a counter zreclaim_failed that will increment each
time the zone_reclaim scan-avoidance heuristics fail. If that
counter is rapidly increasing, then zone_reclaim_mode should be
set to 0 as a temporarily resolution and a bug reported because
the scan-avoidance heuristic is still broken.
This patch:
On NUMA machines, the administrator can configure zone_reclaim_mode that
is a more targetted form of direct reclaim. On machines with large NUMA
distances for example, a zone_reclaim_mode defaults to 1 meaning that
clean unmapped pages will be reclaimed if the zone watermarks are not
being met.
There is a heuristic that determines if the scan is worthwhile but the
problem is that the heuristic is not being properly applied and is
basically assuming zone_reclaim_mode is 1 if it is enabled. The lack of
proper detection can manfiest as high CPU usage as the LRU list is scanned
uselessly.
Historically, once enabled it was depending on NR_FILE_PAGES which may
include swapcache pages that the reclaim_mode cannot deal with. Patch
vmscan-change-the-number-of-the-unmapped-files-in-zone-reclaim.patch by
Kosaki Motohiro noted that zone_page_state(zone, NR_FILE_PAGES) included
pages that were not file-backed such as swapcache and made a calculation
based on the inactive, active and mapped files. This is far superior when
zone_reclaim==1 but if RECLAIM_SWAP is set, then NR_FILE_PAGES is a
reasonable starting figure.
This patch alters how zone_reclaim() works out how many pages it might be
able to reclaim given the current reclaim_mode. If RECLAIM_SWAP is set in
the reclaim_mode it will either consider NR_FILE_PAGES as potential
candidates or else use NR_{IN}ACTIVE}_PAGES-NR_FILE_MAPPED to discount
swapcache and other non-file-backed pages. If RECLAIM_WRITE is not set,
then NR_FILE_DIRTY number of pages are not candidates. If RECLAIM_SWAP is
not set, then NR_FILE_MAPPED are not.
[kosaki.motohiro@jp.fujitsu.com: Estimate unmapped pages minus tmpfs pages]
[fengguang.wu@intel.com: Fix underflow problem in Kosaki's estimate]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wu Fengguang [Tue, 16 Jun 2009 22:33:17 +0000 (15:33 -0700)]
writeback: skip new or to-be-freed inodes
1) I_FREEING tests should be coupled with I_CLEAR
The two I_FREEING tests are racy because clear_inode() can set i_state to
I_CLEAR between the clear of I_SYNC and the test of I_FREEING.
2) skip I_WILL_FREE inodes in generic_sync_sb_inodes() to avoid possible
races with generic_forget_inode()
generic_forget_inode() sets I_WILL_FREE call writeback on its own, so
generic_sync_sb_inodes() shall not try to step in and create possible races:
generic_forget_inode
inode->i_state |= I_WILL_FREE;
spin_unlock(&inode_lock);
generic_sync_sb_inodes()
spin_lock(&inode_lock);
__iget(inode);
__writeback_single_inode
// see non zero i_count
may WARN here ==> WARN_ON(inode->i_state & I_WILL_FREE);
spin_unlock(&inode_lock);
may call generic_forget_inode again ==> iput(inode);
The above race and warning didn't turn up because writeback_inodes() holds
the s_umount lock, so generic_forget_inode() finds MS_ACTIVE and returns
early. But we are not sure the UBIFS calls and future callers will
guarantee that. So skip I_WILL_FREE inodes for the sake of safety.
Cc: Eric Sandeen <sandeen@sandeen.net>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>