Guennadi Liakhovetski [Wed, 9 Mar 2011 12:09:27 +0000 (13:09 +0100)]
ARM: mach-shmobile: fix SDHI IO address-range
SDHI registers occupy only a 0x100 byte large window, not 0x200 byte.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Wed, 9 Mar 2011 16:28:55 +0000 (17:28 +0100)]
mmc: tmio: only access registers above 0xff, if available
Not all tmio implementations have registers above oxff. Accessing
them on thise platforms is dangerous. In some cases it leads to
address wrapping to addresses below 0x100, which corrupts random
unrelated registers.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Wed, 9 Mar 2011 10:33:35 +0000 (11:33 +0100)]
mfd: remove now redundant sh_mobile_sdhi.h header
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Wed, 9 Mar 2011 10:29:48 +0000 (11:29 +0100)]
sh: convert boards to use linux/mmc/sh_mobile_sdhi.h
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Wed, 9 Mar 2011 10:27:56 +0000 (11:27 +0100)]
ARM: mach-shmobile: convert boards to use linux/mmc/sh_mobile_sdhi.h
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Mon, 14 Mar 2011 08:52:33 +0000 (09:52 +0100)]
mmc: tmio: convert the SDHI MMC driver from MFD to a platform driver
On sh-mobile platforms the SDHI driver was using the tmio_mmc SD/SDIO
MFD cell driver. Now that the tmio_mmc driver has been split into a
core and a separate MFD glue, we can support SDHI natively without the
need to emulate an MFD controller. This also allows to support systems
with an on-SoC SDHI controller and a separate MFD with a TMIO core.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Wed, 9 Mar 2011 12:44:29 +0000 (13:44 +0100)]
sh: ecovec: use the CONFIG_MMC_TMIO symbols instead of MFD
The CONFIG_MFD_SH_MOBILE_SDHI Kconfig symbol is going to disappear soon,
switch ecovec to using CONFIG_MMC_TMIO(_MODULE).
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Wed, 23 Mar 2011 11:42:44 +0000 (12:42 +0100)]
mmc: tmio: split core functionality, DMA and MFD glue
TMIO MMC chips contain an SD / SDIO IP core from Panasonic, similar to
the one, used in MN5774 and other MN57xx controllers. These IP cores are
included in many multifunction devices, in sh-mobile chips from Renesas,
in the latter case they can also use DMA. Some sh-mobile implementations
also have some other specialities, that MFD-based solutions don't have.
This makes supporting all these features in a monolithic driver inconveniet
and error-prone. This patch splits the driver into 3 parts: the core,
the MFD glue and the DMA support. In case of a modular build, two modules
will be built: mmc_tmio_core and mmc_tmio.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Mon, 7 Mar 2011 10:33:11 +0000 (11:33 +0100)]
mmc: tmio: use PIO for short transfers
This patch allows transferring of some requests in PIO and some in DMA
mode and defaults to using DMA only for transfers longer than 8 bytes.
This is especially useful with SDIO, which can have lots of 2- and 4-byte
transfers, creating unnecessary high overhead, when executed in DMA.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Guennadi Liakhovetski [Fri, 4 Mar 2011 08:56:21 +0000 (09:56 +0100)]
mmc: tmio-mmc: Improve DMA stability on sh-mobile
On some SDHI tmio implementations the order of DMA and command completion
interrupts swaps, which leads to malfunction. This patch postpones
DMA activation until the MMC command completion IRQ time.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Yoshihiro Shimoda [Fri, 25 Mar 2011 06:06:19 +0000 (15:06 +0900)]
mmc: fix mmc_app_send_scr() for dma transfer
This patch is based on the commit "
af51715079e7fb6b290e1881d63d815dc4de5011":
* Bugfix to that mmc_send_cxd_data() code: dma-to-stack is
unsafe/nonportable, so kmalloc a bounce buffer instead.
The driver may invalidate the mmc_card->csd when host driver uses dma.
So this subroutine also needs a kmalloc buffer.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Richard Zhu [Mon, 21 Mar 2011 05:22:16 +0000 (13:22 +0800)]
mmc: sdhci-esdhc: enable esdhc on imx53
Fix the NO INT in the Multi-BLK IO in SD/MMC, and Multi-BLK read in
SDIO on imx53.
The CMDTYPE of the CMD register (offset 0xE) should be set to "11"
when the STOP CMD12 is issued on imx53 to abort one open ended
multi-blk IO. Otherwise the TC INT wouldn't be generated.
In exact block transfer, the controller doesn't complete the
operations automatically as required at the end of the transfer
and remains on hold if the abort command is not sent on imx53.
As a result, the TC flag is not asserted and SW receives timeout
exception. Set bit1 of Vendor Spec register to fix it.
Signed-off-by: Richard Zhu <Hong-Xing.Zhu@freescale.com>
Signed-off-by: Richard Zhao <richard.zhao@freescale.com>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Richard Zhu [Fri, 25 Mar 2011 13:18:27 +0000 (09:18 -0400)]
mmc: sdhci-esdhc: use writel/readl as general APIs
Add one flag to indicate the GPIO CD/WP is enabled or not
on imx platforms, and reuse the writel/readl as the general
APIs for imx SOCs.
Signed-off-by: Richard Zhu <Hong-Xing.Zhu@freescale.com>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Richard Zhu [Mon, 21 Mar 2011 05:22:14 +0000 (13:22 +0800)]
mmc: sdhci: add the abort CMDTYPE bits definition
Add the abort CMDTYPE bits definition of command register (offset 0xE)
Signed-off-by: Richard Zhu <Hong-Xing.Zhu@freescale.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Richard Zhu [Mon, 21 Mar 2011 05:22:13 +0000 (13:22 +0800)]
mmc: sdhci-esdhc: remove SDHCI_QUIRK_NO_CARD_NO_RESET from esdhc
sdhci-esdhc-imx does not need SDHCI_QUIRK_NO_CARD_NO_RESET.
Make it OF-specific.
Signed-off-by: Richard Zhu <Hong-Xing.Zhu@freescale.com>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
David Miller [Fri, 18 Mar 2011 22:00:23 +0000 (15:00 -0700)]
mmc: of_mmc_spi: Need to include irq.h and of_irq.h
Since these are the headers that provide irq_of_parse_and_map()
and NO_IRQ.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Chris Ball <cjb@laptop.org>
Pawel Moll [Fri, 11 Mar 2011 17:18:07 +0000 (17:18 +0000)]
mmc: mmci: Add ARM variant with extended FIFO
New IO FPGA implementation for Versatile Express boards contain
MMCI (PL180) cell with FIFO extended to 128 words (512 bytes).
Matt Waddel reports that this patch improves MMC performance on
his vexpress system, and also fixes "mmcblk0: error -5 transferring
data" errors.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Matt Waddel <matt.waddel@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Jaehoon Chung [Fri, 25 Feb 2011 02:08:13 +0000 (11:08 +0900)]
mmc: dw_mmc: set fixed burst in BMOD register
This patch uses the fixed burst bit when using an internal DMA controller.
I found increased performance with IDMAC when this bit is set.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Sergei Shtylyov [Thu, 17 Mar 2011 20:46:17 +0000 (16:46 -0400)]
mmc: use pci_dev->revision
The SDHCI driver uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so
it was not converted by commit
44c10138fd4bbc4b6d6bff0873c24902f2a9da65
(PCI: Change all drivers to use pci_device->revision). The newer VIA
driver has similar code too. This patch converts both drivers to use
the 'revision' field of 'struct pci_dev'.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Harald Welte <HaraldWelte@viatech.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Chris Ball [Wed, 16 Mar 2011 21:46:45 +0000 (17:46 -0400)]
mmc: mmc_test: Remove set-but-unused variable.
Fixes:
drivers/mmc/card/mmc_test.c: In function ‘mmc_test_seq_perf’:
drivers/mmc/card/mmc_test.c:1878:28: warning: variable ‘ts’ set but not
used [-Wunused-but-set-variable]
There's no reason to be calling timespec_sub() here, because
mmc_test_print_avg_rate() is going to do that itself.
Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: Adrian Hunter <adrian.hunter@nokia.com>
Linus Torvalds [Fri, 25 Mar 2011 02:01:30 +0000 (19:01 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fs: simplify iget & friends
fs: pull inode->i_lock up out of writeback_single_inode
fs: rename inode_lock to inode_hash_lock
fs: move i_wb_list out from under inode_lock
fs: move i_sb_list out from under inode_lock
fs: remove inode_lock from iput_final and prune_icache
fs: Lock the inode LRU list separately
fs: factor inode disposal
fs: protect inode->i_state with inode->i_lock
autofs4: Do not potentially dereference NULL pointer returned by fget() in autofs_dev_ioctl_setpipefd()
autofs4 - remove autofs4_lock
autofs4 - fix d_manage() return on rcu-walk
autofs4 - fix autofs4_expire_indirect() traversal
autofs4 - fix dentry leak in autofs4_expire_direct()
autofs4 - reinstate last used update on access
vfs - check non-mountpoint dentry might block in __follow_mount_rcu()
Stephen Rothwell [Fri, 25 Mar 2011 01:30:05 +0000 (12:30 +1100)]
[media] rc: update for bitop name changes
Fix the following compile failure:
drivers/media/rc/ite-cir.c: In function 'ite_decode_bytes':
drivers/media/rc/ite-cir.c:190: error: implicit declaration of function 'generic_find_next_le_bit'
drivers/media/rc/ite-cir.c:199: error: implicit declaration of function 'generic_find_next_zero_le_bit'
Caused by commit
620a32bba4a2 ("[media] rc: New rc-based ite-cir driver
for several ITE CIRs") interacting with commit
c4945b9ed472
("asm-generic: rename generic little-endian bitops functions").
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Wed, 23 Mar 2011 19:03:28 +0000 (15:03 -0400)]
fs: simplify iget & friends
Merge get_new_inode/get_new_inode_fast into iget5_locked/iget_locked
as those were the only callers. Remove the internal ifind/ifind_fast
helpers - ifind_fast only had a single caller, and ifind had two
callers wanting it to do different things. Also clean up the comments
in this area to focus on information important to a developer trying
to use it, instead of overloading them with implementation details.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:43 +0000 (22:23 +1100)]
fs: pull inode->i_lock up out of writeback_single_inode
First thing we do in writeback_single_inode() is take the i_lock and
the last thing we do is drop it. A caller already holds the i_lock,
so pull the i_lock out of writeback_single_inode() to reduce the
round trips on this lock during inode writeback.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:42 +0000 (22:23 +1100)]
fs: rename inode_lock to inode_hash_lock
All that remains of the inode_lock is protecting the inode hash list
manipulation and traversals. Rename the inode_lock to
inode_hash_lock to reflect it's actual function.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:41 +0000 (22:23 +1100)]
fs: move i_wb_list out from under inode_lock
Protect the inode writeback list with a new global lock
inode_wb_list_lock and use it to protect the list manipulations and
traversals. This lock replaces the inode_lock as the inodes on the
list can be validity checked while holding the inode->i_lock and
hence the inode_lock is no longer needed to protect the list.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:40 +0000 (22:23 +1100)]
fs: move i_sb_list out from under inode_lock
Protect the per-sb inode list with a new global lock
inode_sb_list_lock and use it to protect the list manipulations and
traversals. This lock replaces the inode_lock as the inodes on the
list can be validity checked while holding the inode->i_lock and
hence the inode_lock is no longer needed to protect the list.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:39 +0000 (22:23 +1100)]
fs: remove inode_lock from iput_final and prune_icache
Now that inode state changes are protected by the inode->i_lock and
the inode LRU manipulations by the inode_lru_lock, we can remove the
inode_lock from prune_icache and the initial part of iput_final().
instead of using the inode_lock to protect the inode during
iput_final, use the inode->i_lock instead. This protects the inode
against new references being taken while we change the inode state
to I_FREEING, as well as preventing prune_icache from grabbing the
inode while we are manipulating it. Hence we no longer need the
inode_lock in iput_final prior to setting I_FREEING on the inode.
For prune_icache, we no longer need the inode_lock to protect the
LRU list, and the inodes themselves are protected against freeing
races by the inode->i_lock. Hence we can lift the inode_lock from
prune_icache as well.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:38 +0000 (22:23 +1100)]
fs: Lock the inode LRU list separately
Introduce the inode_lru_lock to protect the inode_lru list. This
lock is nested inside the inode->i_lock to allow the inode to be
added to the LRU list in iput_final without needing to deal with
lock inversions. This keeps iput_final() clean and neat.
Further, where marking the inode I_FREEING and removing it from the
LRU, move the LRU list manipulation within the inode->i_lock to keep
the list manipulation consistent with iput_final. This also means
that most of the open coded LRU list removal + unused inode
accounting can now use the inode_lru_list_del() wrappers which
cleans the code up further.
However, this locking change means what the LRU traversal in
prune_icache() inverts this lock ordering and needs to use trylock
semantics on the inode->i_lock to avoid deadlocking. In these cases,
if we fail to lock the inode we move it to the back of the LRU to
prevent spinning on it.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:37 +0000 (22:23 +1100)]
fs: factor inode disposal
We have a couple of places that dispose of inodes. factor the
disposal into evict() to isolate this code and make it simpler to
peel away the inode_lock from the code.
While doing this, change the logic flow in iput_final() to separate
the different cases that need to be handled to make the transitions
the inode goes through more obvious.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dave Chinner [Tue, 22 Mar 2011 11:23:36 +0000 (22:23 +1100)]
fs: protect inode->i_state with inode->i_lock
Protect inode state transitions and validity checks with the
inode->i_lock. This enables us to make inode state transitions
independently of the inode_lock and is the first step to peeling
away the inode_lock from the code.
This requires that __iget() is done atomically with i_state checks
during list traversals so that we don't race with another thread
marking the inode I_FREEING between the state check and grabbing the
reference.
Also remove the unlock_new_inode() memory barrier optimisation
required to avoid taking the inode_lock when clearing I_NEW.
Simplify the code by simply taking the inode->i_lock around the
state change and wakeup. Because the wakeup is no longer tricky,
remove the wake_up_inode() function and open code the wakeup where
necessary.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Fri, 25 Mar 2011 00:51:12 +0000 (17:51 -0700)]
Merge branch 'slab/urgent' of git://git./linux/kernel/git/penberg/slab-2.6
* 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
SLUB: Write to per cpu data when allocating it
slub: Fix debugobjects with lockless fastpath
David Rientjes [Thu, 24 Mar 2011 22:18:15 +0000 (15:18 -0700)]
lib, arch: add filter argument to show_mem and fix private implementations
Commit
ddd588b5dd55 ("oom: suppress nodes that are not allowed from
meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which
resulted in build warnings on all architectures that implement their own
versions of show_mem():
lib/lib.a(show_mem.o): In function `show_mem':
show_mem.c:(.text+0x1f4): multiple definition of `show_mem'
arch/sparc/mm/built-in.o:(.text+0xd70): first defined here
The fix is to remove __show_mem() and add its argument to show_mem() in
all implementations to prevent this breakage.
Architectures that implement their own show_mem() actually don't do
anything with the argument yet, but they could be made to filter nodes
that aren't allowed in the current context in the future just like the
generic implementation.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 25 Mar 2011 00:27:20 +0000 (17:27 -0700)]
Merge branch 'drm-core-next' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/vblank: update recently added vbl interface to be more future proof.
drm radeon: Return -EINVAL on wrong pm sysfs access
drm/radeon/kms: fix hardcoded EDID handling
Revert "drm/i915: Don't save/restore hardware status page address register"
drm/i915: Avoid unmapping pages from a NULL address space
drm/i915: Fix use after free within tracepoint
drm/i915: Restore missing command flush before interrupt on BLT ring
drm/i915: Disable pagefaults along execbuffer relocation fast path
drm/i915: Fix computation of pitch for dumb bo creator
drm/i915: report correct render clock frequencies on SNB
drm/i915/dp: Correct the order of deletion for ghost eDP devices
drm/i915: Fix tiling corruption from pipelined fencing
drm/i915: Re-enable self-refresh
drm/i915: Prevent racy removal of request from client list
drm/i915: skip redundant operations whilst enabling pipes and planes
drm/i915: Remove surplus POSTING_READs before wait_for_vblank
drm/radeon/kms: prefer legacy pll algo for tv-out
drm: check for modesetting on modeset ioctls
drm/kernel: vblank wait on crtc > 1
drm: Fix use-after-free in drm_gem_vm_close()
Christoph Lameter [Thu, 24 Mar 2011 19:51:38 +0000 (14:51 -0500)]
SLUB: Write to per cpu data when allocating it
It turns out that the cmpxchg16b emulation has to access vmalloced
percpu memory with interrupts disabled. If the memory has never
been touched before then the fault necessary to establish the
mapping will not to occur and the kernel will fail on boot.
Fix that by reusing the CONFIG_PREEMPT code that writes the
cpu number into a field on every cpu. Writing to the per cpu
area before causes the mapping to be established before we get
to a cmpxchg16b emulation.
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Thomas Gleixner [Thu, 24 Mar 2011 19:26:46 +0000 (21:26 +0200)]
slub: Fix debugobjects with lockless fastpath
On Thu, 24 Mar 2011, Ingo Molnar wrote:
> RIP: 0010:[<
ffffffff810570a9>] [<
ffffffff810570a9>] get_next_timer_interrupt+0x119/0x260
That's a typical timer crash, but you were unable to debug it with
debugobjects because commit
d3f661d6 broke those.
Cc: Christoph Lameter <cl@linux.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Jesper Juhl [Thu, 24 Mar 2011 17:51:37 +0000 (01:51 +0800)]
autofs4: Do not potentially dereference NULL pointer returned by fget() in autofs_dev_ioctl_setpipefd()
In fs/autofs4/dev-ioctl.c::autofs_dev_ioctl_setpipefd() we call fget(),
which may return NULL, but we do not explicitly test for that NULL return
so we may end up dereferencing a NULL pointer - bad.
When I originally submitted this patch I had chosen EBUSY as the return
value to use if this happens. Ian Kent was kind enough to explain why that
would most likely be wrong and why EBADF should most likely be used
instead. This version of the patch uses EBADF.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ian Kent [Thu, 24 Mar 2011 17:51:31 +0000 (01:51 +0800)]
autofs4 - remove autofs4_lock
The autofs4_lock introduced by the rcu-walk changes has unnecessarily
broad scope. The locking is better handled by the per-autofs super
block lookup_lock.
Signed-off-by: Ian Kent <raven@themaw.net>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ian Kent [Thu, 24 Mar 2011 17:51:25 +0000 (01:51 +0800)]
autofs4 - fix d_manage() return on rcu-walk
The daemon never needs to block and, in the rcu-walk case an error
return isn't used, so always return zero.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ian Kent [Thu, 24 Mar 2011 17:51:20 +0000 (01:51 +0800)]
autofs4 - fix autofs4_expire_indirect() traversal
The vfs-scale changes changed the traversal used in
autofs4_expire_indirect() from a list to a depth first tree traversal
which isn't right.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ian Kent [Thu, 24 Mar 2011 17:51:14 +0000 (01:51 +0800)]
autofs4 - fix dentry leak in autofs4_expire_direct()
There is a missing dput() when returning from autofs4_expire_direct()
when we see that the dentry is already a pending mount.
Signed-off-by: Ian Kent <raven@themaw.net>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ian Kent [Thu, 24 Mar 2011 17:51:08 +0000 (01:51 +0800)]
autofs4 - reinstate last used update on access
When direct (and offset) mounts were introduced the the last used
timeout could no longer be updated in ->d_revalidate(). This is
because covered direct mounts would be followed over without calling
the autofs file system. As a result the definition of the busyness
check for all entries was changed to be "actually busy" being an open
file or working directory within the automount. But now we have a call
back in the follow so the last used update on any access can be
re-instated. This requires DCACHE_MANAGE_TRANSIT to always be set.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Ian Kent [Thu, 24 Mar 2011 17:51:02 +0000 (01:51 +0800)]
vfs - check non-mountpoint dentry might block in __follow_mount_rcu()
When following a mount in rcu-walk mode we must check if the incoming dentry
is telling us it may need to block, even if it isn't actually a mountpoint.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Thu, 24 Mar 2011 17:16:26 +0000 (10:16 -0700)]
Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
Linus Torvalds [Thu, 24 Mar 2011 17:07:50 +0000 (10:07 -0700)]
Merge branch 'next' of git://git./linux/kernel/git/dhowells/linux-2.6-mn10300
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
MN10300: gcc 4.6 vs am33 inline assembly
MN10300: Deprecate gdbstub
MN10300: Allow KGDB to use the MN10300 serial ports
MN10300: Emulate single stepping in KGDB on MN10300
MN10300: Generalise kernel debugger kernel halt, reboot or power off hook
KGDB: Notify GDB of machine halt, reboot or power off
MN10300: Use KGDB
MN10300: Create generic kernel debugger hooks
MN10300: Create general kernel debugger cache flushing
MN10300: Introduce a general config option for kernel debugger hooks
MN10300: The icache invalidate functions should disable the icache first
MN10300: gdbstub: Restrict single-stepping to non-preemptable non-SMP configs
Linus Torvalds [Thu, 24 Mar 2011 17:05:23 +0000 (10:05 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
pstore: cleanups to pstore_dump()
[IA64] New syscalls for 2.6.39
Linus Torvalds [Thu, 24 Mar 2011 17:04:59 +0000 (10:04 -0700)]
Merge branch 'rmobile-latest' of git://git./linux/kernel/git/lethal/sh-2.6
* 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
mmc: Add MMC_PROGRESS_*
mmc, ARM: Rename SuperH Mobile ARM zboot helpers
ARM: mach-shmobile: add coherent DMA mask to CEU camera devices
ARM: mach-shmobile: Dynamic backlight control for Mackerel
Linus Torvalds [Thu, 24 Mar 2011 17:04:05 +0000 (10:04 -0700)]
Merge branch 'sh-latest' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Fix build alloc_thread_info_node function
sh: Fix ptrace hw_breakpoint handling
sh: Fix ptrace fpu state initialisation
sh: Re-enable GENERIC_HARDIRQS_NO_DEPRECATED.
sh: pmb: Use struct syscore_ops instead of sysdevs
sh: Use struct syscore_ops instead of sysdevs
sh: Conver to asm-generic/sizes.h.
sh: wire up sys_syncfs.
Linus Torvalds [Thu, 24 Mar 2011 17:02:55 +0000 (10:02 -0700)]
Merge branch 'usb-linus' of git://git./linux/kernel/git/gregkh/usb-2.6
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: cdc-acm: fix potential null-pointer dereference on disconnect
USB: cdc-acm: fix potential null-pointer dereference
USB: cdc-acm: fix memory corruption / panic
USB: Fix 'bad dma' problem on WDM device disconnect
usb: wwan: fix compilation without CONFIG_PM_RUNTIME
USB: uss720 fixup refcount position
usb: musb: blackfin: fix typo in new bfin_musb_vbus_status func
usb: musb: blackfin: fix typo in new dev_pm_ops struct
usb: musb: blackfin: fix typo in platform driver name
usb: musb: Fix for merge issue
ehci-hcd: Bug fix: don't set a QH's Halt bit
USB: Do not pass negative length to snoop_urb()
Linus Torvalds [Thu, 24 Mar 2011 16:50:13 +0000 (09:50 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (442 commits)
[media] videobuf2-dma-contig: make cookie() return a pointer to dma_addr_t
[media] sh_mobile_ceu_camera: Do not call vb2's mem_ops directly
[media] V4L: soc-camera: explicitly require V4L2_BUF_TYPE_VIDEO_CAPTURE
[media] v4l: soc-camera: Store negotiated buffer settings
[media] rc: interim support for 32-bit NEC-ish scancodes
[media] mceusb: topseed 0x0011 needs gen3 init for tx to work
[media] lirc_zilog: error out if buffer read bytes != chunk size
[media] lirc: silence some compile warnings
[media] hdpvr: use same polling interval as other OS
[media] ir-kbd-i2c: pass device code w/key in hauppauge case
[media] rc/keymaps: Remove the obsolete rc-rc5-tv keymap
[media] remove the old RC_MAP_HAUPPAUGE_NEW RC map
[media] rc/keymaps: Rename Hauppauge table as rc-hauppauge
[media] rc-rc5-hauppauge-new: Fix Hauppauge Grey mapping
[media] rc-rc5-hauppauge-new: Add support for the old Black RC
[media] rc-rc5-hauppauge-new: Add the old control to the table
[media] rc-winfast: Fix the keycode tables
[media] a800: Fix a few wrong IR key assignments
[media] opera1: Use multimedia keys instead of an app-specific mapping
[media] dw2102: Use multimedia keys instead of an app-specific mapping
...
Fix up trivial conflicts (remove/modify and some real conflicts) in:
arch/arm/mach-omap2/devices.c
drivers/staging/Kconfig
drivers/staging/Makefile
drivers/staging/dabusb/dabusb.c
drivers/staging/dabusb/dabusb.h
drivers/staging/easycap/easycap_ioctl.c
drivers/staging/usbvideo/usbvideo.c
drivers/staging/usbvideo/vicam.c
Linus Torvalds [Thu, 24 Mar 2011 16:33:14 +0000 (09:33 -0700)]
Merge branch 'for-linus' of git://android.git./kernel/tegra
* 'for-linus' of git://android.git.kernel.org/kernel/tegra:
ARM: tegra: harmony: initialize the TPS65862 PMIC
ARM: tegra: update defconfig
ARM: tegra: harmony: update PCI-e initialization sequence
ARM: tegra: trimslice: enable MMC/SD slots
ARM: tegra: enable new drivers in defconfig
ARM: tegra: Add Toshiba AC100 support
ARM: tegra: harmony: Set WM8903 gpio_base
ARM: tegra: harmony: I2C-related portions of audio support
ARM: tegra: harmony: register i2c devices
ARM: tegra: seaboard: register i2c devices
ARM: tegra: harmony: Beginnings of audio support
ARM: tegra: create defines for SD-related GPIO names
ARM: tegra: add devices.c entries for audio
Linus Torvalds [Thu, 24 Mar 2011 16:30:20 +0000 (09:30 -0700)]
Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
spi/pl022: Add loopback support for the SPI on 5500
spi/omap_mcspi: Fix broken last word xfer
of/flattree: minor cleanups
dt: eliminate OF_NO_DEEP_PROBE and test for NULL match table
dt: protect against NULL matches passed to of_match_node()
dt: Refactor of_platform_bus_probe()
Simon Horman [Thu, 24 Mar 2011 07:04:38 +0000 (07:04 +0000)]
mmc: Add MMC_PROGRESS_*
This is my second attempt to make this enum generally available.
The first attempt added MMCIF_PROGRESS_* to include/linux/mmc/sh_mmcif.h.
However this is not sufficiently generic as the enum will be
used by SDHI boot code.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Simon Horman [Thu, 24 Mar 2011 07:04:37 +0000 (07:04 +0000)]
mmc, ARM: Rename SuperH Mobile ARM zboot helpers
These headers and helpers will also be used for SDHI boot
so the mmcif name will start to make a lot less sense.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Linus Torvalds [Thu, 24 Mar 2011 15:25:53 +0000 (08:25 -0700)]
Merge branch 'idle-release' of git://git./linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
intel_idle: Rename cpuidle states
Linus Torvalds [Thu, 24 Mar 2011 15:25:15 +0000 (08:25 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits)
ACPI: minor printk format change in acpi_pad
ACPI: make acpi_pad /sys output more readable
ACPICA: Update version to
20110316
ACPICA: Header support for SLIC table
ACPI: Make sure the FADT is at least rev 2 before using the reset register
ACPI: Bug compatibility for Windows on the ACPI reboot vector
ACPICA: Fix access width for reset vector
ACPI battery: fribble sysfs files from a resume notifier
ACPI button: remove unused procfs I/F
ACPI, APEI, Add PCIe AER error information printing support
PCIe, AER, use pre-generated prefix in error information printing
ACPI, APEI, Add ERST record ID cache
ACPI: Use syscore_ops instead of sysdev class and sysdev
ACPI: Remove the unused EC sysdev class
ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree
ACPI: use __init where possible in processor driver
Thermal_Framework-Fix_crash_during_hwmon_unregister
ACPICA: Update version to
20110211.
ACPICA: Add mechanism to defer _REG methods for some installed handlers
ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name
...
Linus Torvalds [Thu, 24 Mar 2011 15:24:28 +0000 (08:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/vapier/blackfin
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin:
Blackfin: bf54x: re-enable anomaly
05000353 for all revs
Blackfin: enable atomic64_t support
Blackfin: wire up new syncfs syscall
Blackfin: SMP: flush CoreB cache when shutting down
Linus Torvalds [Thu, 24 Mar 2011 15:22:34 +0000 (08:22 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/ubi-2.6
* 'for-linus' of git://git.infradead.org/ubi-2.6:
UBIFS: fix assertion warning and refine comments
UBIFS: kill CONFIG_UBIFS_FS_DEBUG_CHKS
UBIFS: use GFP_NOFS properly
UBI: use GFP_NOFS properly
Linus Torvalds [Thu, 24 Mar 2011 15:20:39 +0000 (08:20 -0700)]
Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.39' of git://linux-nfs.org/~bfields/linux:
SUNRPC: Remove resource leak in svc_rdma_send_error()
nfsd: wrong index used in inner loop
nfsd4: fix comment and remove unused nfsd4_file fields
nfs41: make sure nfs server return right ca_maxresponsesize_cached
nfsd: fix compile error
svcrpc: fix bad argument in unix_domain_find
nfsd4: fix struct file leak
nfsd4: minor nfs4state.c reshuffling
svcrpc: fix rare race on unix_domain creation
nfsd41: modify the members value of nfsd4_op_flags
nfsd: add proc file listing kernel's gss_krb5 enctypes
gss:krb5 only include enctype numbers in gm_upcall_enctypes
NFSD, VFS: Remove dead code in nfsd_rename()
nfsd: kill unused macro definition
locks: use assign_type()
Linus Torvalds [Thu, 24 Mar 2011 15:02:21 +0000 (08:02 -0700)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
Squashfs: Use vmalloc rather than kmalloc for zlib workspace
Squashfs: handle corruption of directory structure
Squashfs: wrap squashfs_mount() definition
Squashfs: xz_wrapper doesn't need to include squashfs_fs_i.h anymore
Squashfs: Update documentation to include compression options
Squashfs: Update Kconfig help text to include xz compression
Squashfs: add compression options support to xz decompressor
Squashfs: extend decompressor framework to handle compression options
Linus Torvalds [Thu, 24 Mar 2011 14:59:46 +0000 (07:59 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB: Increase DMA max_segment_size on Mellanox hardware
IB/mad: Improve an error message so error code is included
RDMA/nes: Don't print success message at level KERN_ERR
RDMA/addr: Fix return of uninitialized ret value
IB/srp: try to use larger FMR sizes to cover our mappings
IB/srp: add support for indirect tables that don't fit in SRP_CMD
IB/srp: rework mapping engine to use multiple FMR entries
IB/srp: allow sg_tablesize to be set for each target
IB/srp: move IB CM setup completion into its own function
IB/srp: always avoid non-zero offsets into an FMR
Linus Torvalds [Thu, 24 Mar 2011 14:59:01 +0000 (07:59 -0700)]
Merge branch 'for-next' of git://git./linux/kernel/git/sameo/mfd-2.6
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (90 commits)
mfd: Push byte swaps out of wm8994 bulk read path
mfd: Rename ab8500 gpadc header
mfd: Constify WM8994 write path
mfd: Push byte swap out of WM8994 bulk I/O
mfd: Avoid copying data in WM8994 I2C write
mfd: Remove copy from WM831x I2C write function
mfd: Staticise WM8994 PM ops
regulator: Add a subdriver for TI TPS6105x regulator portions v2
mfd: Add a core driver for TI TPS61050/TPS61052 chips v2
gpio: Add Tunnel Creek support to sch_gpio
mfd: Add Tunnel Creek support to lpc_sch
pci_ids: Add Intel Tunnel Creek LPC Bridge device ID.
regulator: MAX8997/8966 support
mfd: Add WM8994 bulk register write operation
mfd: Append additional read write on 88pm860x
mfd: Adopt mfd_data in 88pm860x input driver
mfd: Adopt mfd_data in 88pm860x regulator
mfd: Adopt mfd_data in 88pm860x led
mfd: Adopt mfd_data in 88pm860x backlight
mfd: Fix MAX8997 Kconfig entry typos
...
Linus Torvalds [Thu, 24 Mar 2011 14:57:38 +0000 (07:57 -0700)]
Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
exofs: deprecate the commands pending counter
exofs: Write sbi->s_nextid as part of the Create command
exofs: Add option to mount by osdname
exofs: Override read-ahead to align on stripe_size
exofs: simple fsync race fix
exofs: Optimize read_4_write
exofs: Trivial: fix some indentation and debug prints
exofs: Remove redundant unlikely()
Linus Torvalds [Thu, 24 Mar 2011 14:56:52 +0000 (07:56 -0700)]
Merge git://git./linux/kernel/git/lethal/fbdev-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6: (140 commits)
MAINTAINERS: de-orphan fbdev.
MAINTAINERS: Add file pattern for fb dt bindings.
video: Move sm501fb devicetree binding documentation to a better place.
fbcon: fix situation where fbcon gets deinitialised and can't reinit.
video, sm501: add OF binding to support SM501
video, sm501: add edid and commandline support
video, sm501: add I/O functions for use on powerpc
video: Fix EDID macros H_SYNC_WIDTH and H_SYNC_OFFSET
fbcon: Bugfix soft cursor detection in Tile Blitting
video: add missing framebuffer_release in error path
video: metronomefb: add __devexit_p around reference to metronomefb_remove
video: hecubafb: add __devexit_p around reference to hecubafb_remove
drivers:video:aty:radeon_base Fix typo occationally to occasionally
atmel_lcdfb: add fb_blank function
atmel_lcdfb: implement inverted contrast pwm
video: s3c-fb: return proper error if clk_get fails
uvesafb,vesafb: create WC or WB PAT-entries
video: ffb: fix ffb_probe error path
radeonfb: Let hwmon driver probe the "monid" I2C bus
fbdev: sh_mobile_lcdc: checking NULL instead of IS_ERR()
...
Artem Bityutskiy [Wed, 23 Mar 2011 08:32:58 +0000 (10:32 +0200)]
UBIFS: fix assertion warning and refine comments
This patch fixes the following UBIFS assertion warning:
UBIFS assert failed in do_readpage at 115 (pid 199)
[<
b00321b8>] (unwind_backtrace+0x0/0xdc) from [<
af025118>]
(do_readpage+0x108/0x594 [ubifs])
[<
af025118>] (do_readpage+0x108/0x594 [ubifs]) from [<
af025764>]
(ubifs_write_end+0x1c0/0x2e8 [ubifs])
[<
af025764>] (ubifs_write_end+0x1c0/0x2e8 [ubifs]) from
[<
b00a0164>] (generic_file_buffered_write+0x18c/0x270)
[<
b00a0164>] (generic_file_buffered_write+0x18c/0x270) from
[<
b00a08d4>] (__generic_file_aio_write+0x478/0x4c0)
[<
b00a08d4>] (__generic_file_aio_write+0x478/0x4c0) from
[<
b00a0984>] (generic_file_aio_write+0x68/0xc8)
[<
b00a0984>] (generic_file_aio_write+0x68/0xc8) from
[<
af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs])
[<
af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs]) from
[<
b00d104c>] (do_sync_write+0xb0/0x100)
[<
b00d104c>] (do_sync_write+0xb0/0x100) from [<
b00d1abc>]
(vfs_write+0xac/0x154)
[<
b00d1abc>] (vfs_write+0xac/0x154) from [<
b00d1c10>]
(sys_write+0x3c/0x68)
[<
b00d1c10>] (sys_write+0x3c/0x68) from [<
b002d9a0>]
(ret_fast_syscall+0x0/0x2c)
The 'PG_checked' flag is used to indicate that the page does not
supposedly exist on the media (e.g., a hole or a page beyond the
inode size), so it requires slightly bigger budget, because we have
to account the indexing size increase. And this flag basically
tells that the budget for this page has to be "new page budget".
The "new page budget" is slightly bigger than the "existing page
budget".
The 'do_readpage()' function has the following assertion which
sometimes is hit: 'ubifs_assert(!PageChecked(page))'. Obviously,
the meaning of this assertion is: "I should not be asked to read
a page which does not exist on the media".
However, in 'ubifs_write_begin()' we have a small "trick". Notice,
that VFS may write pages which were not read yet, so the page data
were not loaded from the media to the page cache yet. If VFS tells
that it is going to change only some part of the page, we obviously
have to load it from the media. However, if VFS tells that it is
going to change whole page, we do not read it from the media for
optimization purposes.
However, since we do not read it, we do not know if it exists on
the media or not (a hole, etc). So we set the 'PG_checked' flag
to this page to force bigger budget, just in case.
So 'ubifs_write_begin()' sets 'PG_checked'. Then we are in
'ubifs_write_end()'. And VFS tells us: "hey, for some reasons I
changed my mind and did not change whole page". Frankly, I do not
know why this happens, but I hit this somehow on an ARM platform.
And this is extremely rare.
So in this case UBIFS does the following:
1. Cancels allocated budget.
2. Loads the page from the media by calling 'do_readpage()'.
3. Asks VFS to repeat the whole write operation from the very
beginning (call '->write_begin() again, etc).
And the assertion warning is hit at the step 2 - remember we have
the 'PG_checked' set for this page, and 'do_readpage()' does not
like this. So this patch fixes the problem by adding step 1.5 and
cleaning the 'PG_checked' before calling 'do_readpage()'.
All in all, this patch does not fix any functionality issue, but it
silences UBIFS false positive warning which may happen in very very
rare cases.
And while on it, this patch also improves a commentary which explains
the reasons of setting the 'PG_checked' flag for the page. The old
commentary was a bit difficult to understand.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Artem Bityutskiy [Mon, 21 Mar 2011 14:16:29 +0000 (16:16 +0200)]
UBIFS: kill CONFIG_UBIFS_FS_DEBUG_CHKS
Simplify UBIFS configuration menu and kill the option to enable self-check
compile-time. We do not really need this because we can do this run-time
using the module parameters or the corresponding sysfs interfaces. And
there is a value in simplifying the kernel configuration menu which becomes
increasingly large.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Artem Bityutskiy [Thu, 24 Mar 2011 14:14:26 +0000 (16:14 +0200)]
UBIFS: use GFP_NOFS properly
This patch fixes a brown-paperbag bug which was introduced by me:
I used incorrect "GFP_KERNEL | GFP_NOFS" allocation flags to make
sure my allocations do not cause write-back. But the correct form
is "GFP_NOFS".
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Artem Bityutskiy [Thu, 24 Mar 2011 14:09:56 +0000 (16:09 +0200)]
UBI: use GFP_NOFS properly
This patch fixes a brown-paperbag bug which was introduced by me:
I used incorrect "GFP_KERNEL | GFP_NOFS" allocation flags to make
sure my allocations do not cause write-back. But the correct form
is "GFP_NOFS".
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Dave Airlie [Thu, 24 Mar 2011 10:54:35 +0000 (20:54 +1000)]
drm/vblank: update recently added vbl interface to be more future proof.
This makes the interface a bit cleaner by leaving a single gap in the
vblank bit space instead of creating two gaps.
Suggestions from Michel on mailing list/irc.
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Renninger [Wed, 23 Mar 2011 15:14:09 +0000 (15:14 +0000)]
drm radeon: Return -EINVAL on wrong pm sysfs access
Throw an error if someone tries to fill this with
wrong data, instead of simply ignoring the input.
Now you get:
echo hello >/sys/../power_method
-bash: echo: write error: Invalid argument
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Alexander.Deucher@amd.com
CC: dri-devel@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Wed, 23 Mar 2011 08:10:10 +0000 (08:10 +0000)]
drm/radeon/kms: fix hardcoded EDID handling
On some servers there is a hardcoded EDID provided
in the vbios so that the driver will always see a
display connected even if something like a KVM
prevents traditional means like DDC or load
detection from working properly. Also most
server boards with DVI are not actually DVI, but
DVO connected to a virtual KVM service processor.
If we fail to detect a monitor via DDC or load
detection and a hardcoded EDID is available, use
it.
Additionally, when using the hardcoded EDID, use
a copy of it rather than the actual one stored
in the driver as the detect() and get_modes()
functions may free it if DDC is successful.
This fixes the virtual KVM on several internal
servers.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 24 Mar 2011 10:21:45 +0000 (20:21 +1000)]
Merge remote branch 'intel/drm-intel-fixes' of ../drm-next into drm-core-next
* 'intel/drm-intel-fixes' of ../drm-next:
Revert "drm/i915: Don't save/restore hardware status page address register"
drm/i915: Avoid unmapping pages from a NULL address space
drm/i915: Fix use after free within tracepoint
drm/i915: Restore missing command flush before interrupt on BLT ring
drm/i915: Disable pagefaults along execbuffer relocation fast path
drm/i915: Fix computation of pitch for dumb bo creator
drm/i915: report correct render clock frequencies on SNB
drm/i915/dp: Correct the order of deletion for ghost eDP devices
drm/i915: Fix tiling corruption from pipelined fencing
drm/i915: Re-enable self-refresh
drm/i915: Prevent racy removal of request from client list
drm/i915: skip redundant operations whilst enabling pipes and planes
drm/i915: Remove surplus POSTING_READs before wait_for_vblank
Chris Wilson [Wed, 23 Mar 2011 17:53:28 +0000 (17:53 +0000)]
Revert "drm/i915: Don't save/restore hardware status page address register"
This reverts commit
a7a75c8f70d6f6a2f16c9f627f938bbee2d32718.
There are two different variations on how Intel hardware addresses the
"Hardware Status Page". One as a location in physical memory and the
other as an offset into the virtual memory of the GPU, used in more
recent chipsets. (The HWS itself is a cacheable region of memory which
the GPU can write to without requiring CPU synchronisation, used for
updating various details of hardware state, such as the position of
the GPU head in the ringbuffer, the last breadcrumb seqno, etc).
These two types of addresses were updated in different locations of code
- one inline with the ringbuffer initialisation, and the other during
device initialisation. (The HWS page is logically associated with
the rings, and there is one HWS page per ring.) During resume, only the
ringbuffers were being re-initialised along with the virtual HWS page,
leaving the older physical address HWS untouched. This then caused a
hang on the older gen3/4 (915GM, 945GM, 965GM) the first time we tried
to synchronise the GPU as the breadcrumbs were never being updated.
Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Jan Niehusmann <jan@gondor.com>
Reported-and-tested-by: Justin P. Mattock <justinmattock@gmail.com>
Reported-and-tested-by: Michael "brot" Groh <brot@minad.de>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Nobuhiro Iwamatsu [Thu, 24 Mar 2011 05:47:40 +0000 (05:47 +0000)]
sh: Fix build alloc_thread_info_node function
By commit
b6a84016bd2598e35ead635147fa53619982648d,
alloc_thread_info was replaced by alloc_thread_info_node.
However, the change of the function name and the addition of the argument
were incomplete.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 24 Mar 2011 06:17:25 +0000 (15:17 +0900)]
Merge branch 'master' of git://git./linux/kernel/git/torvalds/linux-2.6 into sh-latest
Linus Torvalds [Thu, 24 Mar 2011 03:51:42 +0000 (20:51 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
deal with races in /proc/*/{syscall,stack,personality}
proc: enable writing to /proc/pid/mem
proc: make check_mem_permission() return an mm_struct on success
proc: hold cred_guard_mutex in check_mem_permission()
proc: disable mem_write after exec
mm: implement access_remote_vm
mm: factor out main logic of access_process_vm
mm: use mm_struct to resolve gate vma's in __get_user_pages
mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
mm: arch: make in_gate_area take an mm_struct instead of a task_struct
mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
x86: mark associated mm when running a task in 32 bit compatibility mode
x86: add context tag to mark mm when running a task in 32-bit compatibility mode
auxv: require the target to be tracable (or yourself)
close race in /proc/*/environ
report errors in /proc/*/*map* sanely
pagemap: close races with suid execve
make sessionid permissions in /proc/*/task/* match those in /proc/*
fix leaks in path_lookupat()
Fix up trivial conflicts in fs/proc/base.c
Linus Torvalds [Thu, 24 Mar 2011 03:37:26 +0000 (20:37 -0700)]
Merge branch 'devel' of /home/rmk/linux-2.6-arm
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (35 commits)
ARM: Update (and cut down) mach-types
ARM: 6771/1: vexpress: add support for multiple core tiles
ARM: 6797/1: hw_breakpoint: Fix newlines in WARNings
ARM: 6751/1: vexpress: select applicable errata workarounds in Kconfig
ARM: 6753/1: omap4: Enable ARM local timers with OMAP4430 es1.0 exception
ARM: 6759/1: smp: Select local timers vs broadcast timer support runtime
ARM: pgtable: add pud-level code
ARM: 6673/1: LPAE: use phys_addr_t instead of unsigned long for start of membanks
ARM: Use long long format when printing meminfo physical addresses
ARM: integrator: add Integrator/CP sched_clock support
ARM: realview/vexpress: consolidate SMP bringup code
ARM: realview/vexpress: consolidate localtimer support
ARM: integrator/versatile: consolidate FPGA IRQ handling code
ARM: rationalize versatile family Kconfig/Makefile
ARM: realview: remove old AMBA device DMA definitions
ARM: versatile: remove old AMBA device DMA definitions
ARM: vexpress: use new init_early for clock tree and sched_clock init
ARM: realview: use new init_early for clock tree and sched_clock init
ARM: versatile: use new init_early for clock tree and sched_clock init
ARM: integrator: use new init_early for clock tree init
...
Philippe Langlais [Wed, 23 Mar 2011 10:05:16 +0000 (11:05 +0100)]
spi/pl022: Add loopback support for the SPI on 5500
Extend the vendor data with a loopback field, and add new
amba-pl022 vendor data for the DB5500 pl023, as the pl023
on db8500 and db5500 vary.
Signed-off-by: Prajadevi H <prajadevi.h@stericsson.com>
Signed-off-by: Philippe Langlais <philippe.langlais@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Olaf Hering [Wed, 23 Mar 2011 23:43:29 +0000 (16:43 -0700)]
crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state. To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.
Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.
Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup()
to early_param(). It adds an address range check from x86 also on ia64
and powerpc.
[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
FUJITA Tomonori [Wed, 23 Mar 2011 23:43:28 +0000 (16:43 -0700)]
remove dma64_addr_t
There is no user now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mandeep Singh Baines [Wed, 23 Mar 2011 23:43:27 +0000 (16:43 -0700)]
taskstats: use appropriate printk priority level
printk()s without a priority level default to KERN_WARNING. To reduce
noise at KERN_WARNING, this patch set the priority level appriopriately
for unleveled printks()s. This should be useful to folks that look at
dmesg warnings closely.
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:26 +0000 (16:43 -0700)]
userns: rename is_owner_or_cap to inode_owner_or_capable
And give it a kernel-doc comment.
[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:25 +0000 (16:43 -0700)]
userns: userns: check user namespace for task->file uid equivalence checks
Cheat for now and say all files belong to init_user_ns. Next step will be
to let superblocks belong to a user_ns, and derive inode_userns(inode)
from inode->i_sb->s_user_ns. Finally we'll introduce more flexible
arrangements.
Changelog:
Feb 15: make is_owner_or_cap take const struct inode
Feb 23: make is_owner_or_cap bool
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:24 +0000 (16:43 -0700)]
userns: user namespaces: convert several capable() calls
CAP_IPC_OWNER and CAP_IPC_LOCK can be checked against current_user_ns(),
because the resource comes from current's own ipc namespace.
setuid/setgid are to uids in own namespace, so again checks can be against
current_user_ns().
Changelog:
Jan 11: Use task_ns_capable() in place of sched_capable().
Jan 11: Use nsown_capable() as suggested by Bastian Blank.
Jan 11: Clarify (hopefully) some logic in futex and sched.c
Feb 15: use ns_capable for ipc, not nsown_capable
Feb 23: let copy_ipcs handle setting ipc_ns->user_ns
Feb 23: pass ns down rather than taking it from current
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:23 +0000 (16:43 -0700)]
userns: add a user namespace owner of ipc ns
Changelog:
Feb 15: Don't set new ipc->user_ns if we didn't create a new
ipc_ns.
Feb 23: Move extern declaration to ipc_namespace.h, and group
fwd declarations at top.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:22 +0000 (16:43 -0700)]
userns: user namespaces: convert all capable checks in kernel/sys.c
This allows setuid/setgid in containers. It also fixes some corner cases
where kernel logic foregoes capability checks when uids are equivalent.
The latter will need to be done throughout the whole kernel.
Changelog:
Jan 11: Use nsown_capable() as suggested by Bastian Blank.
Jan 11: Fix logic errors in uid checks pointed out by Bastian.
Feb 15: allow prlimit to current (was regression in previous version)
Feb 23: remove debugging printks, uninline set_one_prio_perm and
make it bool, and document its return value.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:21 +0000 (16:43 -0700)]
userns: make has_capability* into real functions
So we can let type safety keep things sane, and as a bonus we can remove
the declaration of init_user_ns in capability.h.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:20 +0000 (16:43 -0700)]
userns: allow ptrace from non-init user namespaces
ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e. the same rules as for two tasks in the init user
namespace). ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.
Changelog:
Dec 31: Address feedback by Eric:
. Correct ptrace uid check
. Rename may_ptrace_ns to ptrace_capable
. Also fix the cap_ptrace checks.
Jan 1: Use const cred struct
Jan 11: use task_ns_capable() in place of ptrace_capable().
Feb 23: same_or_ancestore_user_ns() was not an appropriate
check to constrain cap_issubset. Rather, cap_issubset()
only is meaningful when both capsets are in the same
user_ns.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:19 +0000 (16:43 -0700)]
userns: allow killing tasks in your own or child userns
Changelog:
Dec 8: Fixed bug in my check_kill_permission pointed out by
Eric Biederman.
Dec 13: Apply Eric's suggestion to pass target task into kill_ok_by_cred()
for clarity
Dec 31: address comment by Eric Biederman:
don't need cred/tcred in check_kill_permission.
Jan 1: use const cred struct.
Jan 11: Per Bastian Blank's advice, clean up kill_ok_by_cred().
Feb 16: kill_ok_by_cred: fix bad parentheses
Feb 23: per akpm, let compiler inline kill_ok_by_cred
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:18 +0000 (16:43 -0700)]
userns: allow sethostname in a container
Changelog:
Feb 23: let clone_uts_ns() handle setting uts->user_ns
To do so we need to pass in the task_struct who'll
get the utsname, so we can get its user_ns.
Feb 23: As per Oleg's coment, just pass in tsk, instead of two
of its members.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:17 +0000 (16:43 -0700)]
userns: security: make capabilities relative to the user namespace
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serge E. Hallyn [Wed, 23 Mar 2011 23:43:16 +0000 (16:43 -0700)]
userns: add a user_namespace as creator/owner of uts_namespace
The expected course of development for user namespaces targeted
capabilities is laid out at https://wiki.ubuntu.com/UserNamespace.
Goals:
- Make it safe for an unprivileged user to unshare namespaces. They
will be privileged with respect to the new namespace, but this should
only include resources which the unprivileged user already owns.
- Provide separate limits and accounting for userids in different
namespaces.
Status:
Currently (as of 2.6.38) you can clone with the CLONE_NEWUSER flag to
get a new user namespace if you have the CAP_SYS_ADMIN, CAP_SETUID, and
CAP_SETGID capabilities. What this gets you is a whole new set of
userids, meaning that user 500 will have a different 'struct user' in
your namespace than in other namespaces. So any accounting information
stored in struct user will be unique to your namespace.
However, throughout the kernel there are checks which
- simply check for a capability. Since root in a child namespace
has all capabilities, this means that a child namespace is not
constrained.
- simply compare uid1 == uid2. Since these are the integer uids,
uid 500 in namespace 1 will be said to be equal to uid 500 in
namespace 2.
As a result, the lxc implementation at lxc.sf.net does not use user
namespaces. This is actually helpful because it leaves us free to
develop user namespaces in such a way that, for some time, user
namespaces may be unuseful.
Bugs aside, this patchset is supposed to not at all affect systems which
are not actively using user namespaces, and only restrict what tasks in
child user namespace can do. They begin to limit privilege to a user
namespace, so that root in a container cannot kill or ptrace tasks in the
parent user namespace, and can only get world access rights to files.
Since all files currently belong to the initila user namespace, that means
that child user namespaces can only get world access rights to *all*
files. While this temporarily makes user namespaces bad for system
containers, it starts to get useful for some sandboxing.
I've run the 'runltplite.sh' with and without this patchset and found no
difference.
This patch:
copy_process() handles CLONE_NEWUSER before the rest of the namespaces.
So in the case of clone(CLONE_NEWUSER|CLONE_NEWUTS) the new uts namespace
will have the new user namespace as its owner. That is what we want,
since we want root in that new userns to be able to have privilege over
it.
Changelog:
Feb 15: don't set uts_ns->user_ns if we didn't create
a new uts_ns.
Feb 23: Move extern init_user_ns declaration from
init/version.c to utsname.h.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 23 Mar 2011 23:43:14 +0000 (16:43 -0700)]
procfs: kill the global proc_mnt variable
After the previous cleanup in proc_get_sb() the global proc_mnt has no
reasons to exists, kill it.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric W. Biederman [Wed, 23 Mar 2011 23:43:13 +0000 (16:43 -0700)]
pidns: call pid_ns_prepare_proc() from create_pid_namespace()
Reorganize proc_get_sb() so it can be called before the struct pid of the
first process is allocated.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric W. Biederman [Wed, 23 Mar 2011 23:43:12 +0000 (16:43 -0700)]
pid: remove the child_reaper special case in init/main.c
This patchset is a cleanup and a preparation to unshare the pid namespace.
These prerequisites prepare for Eric's patchset to give a file descriptor
to a namespace and join an existing namespace.
This patch:
It turns out that the existing assignment in copy_process of the
child_reaper can handle the initial assignment of child_reaper we just
need to generalize the test in kernel/fork.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Richard Weinberger [Wed, 23 Mar 2011 23:43:11 +0000 (16:43 -0700)]
sysctl: restrict write access to dmesg_restrict
When dmesg_restrict is set to 1 CAP_SYS_ADMIN is needed to read the kernel
ring buffer. But a root user without CAP_SYS_ADMIN is able to reset
dmesg_restrict to 0.
This is an issue when e.g. LXC (Linux Containers) are used and complete
user space is running without CAP_SYS_ADMIN. A unprivileged and jailed
root user can bypass the dmesg_restrict protection.
With this patch writing to dmesg_restrict is only allowed when root has
CAP_SYS_ADMIN.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Dan Rosenberg <drosenberg@vsecurity.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: James Morris <jmorris@namei.org>
Cc: Eugene Teo <eugeneteo@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Petr Holasek [Wed, 23 Mar 2011 23:43:09 +0000 (16:43 -0700)]
sysctl: add some missing input constraint checks
Add boundaries of allowed input ranges for: dirty_expire_centisecs,
drop_caches, overcommit_memory, page-cluster and panic_on_oom.
Signed-off-by: Petr Holasek <pholasek@redhat.com>
Acked-by: Dave Young <hidave.darkstar@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Denis Kirjanov [Wed, 23 Mar 2011 23:43:08 +0000 (16:43 -0700)]
sysctl_check: drop dead code
Drop dead code.
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Denis Kirjanov [Wed, 23 Mar 2011 23:43:08 +0000 (16:43 -0700)]
sysctl_check: drop table->procname checks
Since the for loop checks for the table->procname drop useless
table->procname checks inside the loop body
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Wed, 23 Mar 2011 23:43:07 +0000 (16:43 -0700)]
rapidio: fix potential null deref on failure path
If rio is not a switch then "rswitch" is null.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>