review.tizen.org Git - platform/adaptation/renesas_rcar/renesas

wait_for_helper: SIGCHLD from user-space can lead to use-after-free

1. wait_for_helper() calls allow_signal(SIGCHLD) to ensure the child
   can't autoreap itself.

   However, this means that a spurious SIGCHILD from user-space can
   set TIF_SIGPENDING and:

    - kernel_thread() or sys_wait4() can fail due to signal_pending()

    - worse, wait4() can fail before ____call_usermodehelper() execs
      or exits. In this case the caller may kfree(subprocess_info)
      while the child still uses this memory.

   Change the code to use SIG_DFL instead of magic "(void __user *)2"
   set by allow_signal(). This means that SIGCHLD won't be delivered,
   yet the child won't autoreap itsefl.

   The problem is minor, only root can send a signal to this kthread.

2. If sys_wait4(&ret) fails it doesn't populate "ret", in this case
   wait_for_helper() reports a random value from uninitialized var.

   With this patch sys_wait4() should never fail, but still it makes
   sense to initialize ret = -ECHILD so that the caller can notice
   the problem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

call_usermodehelper: no need to unblock signals

____call_usermodehelper() correctly calls flush_signal_handlers() to set
SIG_DFL, but sigemptyset(->blocked) and recalc_sigpending() are not
needed.

This kthread was forked by workqueue thread, all signals must be unblocked
and ignored, no pending signal is possible.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

umh: creds: kill subprocess_info->cred logic

Now that nobody ever changes subprocess_info->cred we can kill this member
and related code. ____call_usermodehelper() always runs in the context of
freshly forked kernel thread, it has the proper ->cred copied from its
parent kthread, keventd.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

umh: creds: convert call_usermodehelper_keys() to use subprocess_info->init()

call_usermodehelper_keys() uses call_usermodehelper_setkeys() to change
subprocess_info->cred in advance. Now that we have info->init() we can
change this code to set tgcred->session_keyring in context of execing
kernel thread.

Note: since currently call_usermodehelper_keys() is never called with
UMH_NO_WAIT, call_usermodehelper_keys()->key_get() and umh_keys_cleanup()
are not really needed, we could rely on install_session_keyring_to_cred()
which does key_get() on success.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

exec: replace call_usermodehelper_pipe with use of umh init function and resolve limit

The first patch in this series introduced an init function to the
call_usermodehelper api so that processes could be customized by caller.
This patch takes advantage of that fact, by customizing the helper in
do_coredump to create the pipe and set its core limit to one (for our
recusrsion check). This lets us clean up the previous uglyness in the
usermodehelper internals and factor call_usermodehelper out entirely.
While I'm at it, we can also modify the helper setup to look for a core
limit value of 1 rather than zero for our recursion check

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kmod: add init function to usermodehelper

About 6 months ago, I made a set of changes to how the core-dump-to-a-pipe
feature in the kernel works.  We had reports of several races, including
some reports of apps bypassing our recursion check so that a process that
was forked as part of a core_pattern setup could infinitely crash and
refork until the system crashed.

We fixed those by improving our recursion checks.  The new check basically
refuses to fork a process if its core limit is zero, which works well.

Unfortunately, I've been getting grief from maintainer of user space
programs that are inserted as the forked process of core_pattern.  They
contend that in order for their programs (such as abrt and apport) to
work, all the running processes in a system must have their core limits
set to a non-zero value, to which I say 'yes'.  I did this by design, and
think thats the right way to do things.

But I've been asked to ease this burden on user space enough times that I
thought I would take a look at it.  The first suggestion was to make the
recursion check fail on a non-zero 'special' number, like one.  That way
the core collector process could set its core size ulimit to 1, and enable
the kernel's recursion detection.  This isn't a bad idea on the surface,
but I don't like it since its opt-in, in that if a program like abrt or
apport has a bug and fails to set such a core limit, we're left with a
recursively crashing system again.

So I've come up with this.  What I've done is modify the
call_usermodehelper api such that an extra parameter is added, a function
pointer which will be called by the user helper task, after it forks, but
before it exec's the required process.  This will give the caller the
opportunity to get a call back in the processes context, allowing it to do
whatever it needs to to the process in the kernel prior to exec-ing the
user space code.  In the case of do_coredump, this callback is ues to set
the core ulimit of the helper process to 1.  This elimnates the opt-in
problem that I had above, as it allows the ulimit for core sizes to be set
to the value of 1, which is what the recursion check looks for in
do_coredump.

This patch:

Create new function call_usermodehelper_fns() and allow it to assign both
an init and cleanup function, as we'll as arbitrary data.

The init function is called from the context of the forked process and
allows for customization of the helper process prior to calling exec.  Its
return code gates the continuation of the process, or causes its exit.
Also add an arbitrary data pointer to the subprocess_info struct allowing
for data to be passed from the caller to the new process, and the
subsequent cleanup process

Also, use this patch to cleanup the cleanup function.  It currently takes
an argp and envp pointer for freeing, which is ugly.  Lets instead just
make the subprocess_info structure public, and pass that to the cleanup
and init routines

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

signals: check_kill_permission(): don't check creds if same_thread_group()

Andrew Tridgell reports that aio_read(SIGEV_SIGNAL) can fail if the
notification from the helper thread races with setresuid(), see
http://samba.org/~tridge/junkcode/aio_uid.c

This happens because check_kill_permission() doesn't permit sending a
signal to the task with the different cred->xids.  But there is not any
security reason to check ->cred's when the task sends a signal (private or
group-wide) to its sub-thread.  Whatever we do, any thread can bypass all
security checks and send SIGKILL to all threads, or it can block a signal
SIG and do kill(gettid(), SIG) to deliver this signal to another
sub-thread.  Not to mention that CLONE_THREAD implies CLONE_VM.

Change check_kill_permission() to avoid the credentials check when the
sender and the target are from the same thread group.

Also, move "cred = current_cred()" down to avoid calling get_current()
twice.

Note: David Howells pointed out we could relax this even more, the
CLONE_SIGHAND (without CLONE_THREAD) case probably does not need
these checks too.

Roland said:
: The glibc (libpthread) that does set*id across threads has
: been in use for a while (2.3.4?), probably in distro's using kernels as old
: or older than any active -stable streams.  In the race in question, this
: kernel bug is breaking valid POSIX application expectations.

Reported-by: Andrew Tridgell <tridge@samba.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: <stable@kernel.org> [all kernel versions]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ptrace: PTRACE_GETFDPIC: fix the unsafe usage of child->mm

Now that Mike Frysinger unified the FDPIC ptrace code, we can fix the
unsafe usage of child->mm in ptrace_request(PTRACE_GETFDPIC).

We have the reference to task_struct, and ptrace_check_attach() verified
the tracee is stopped. But nothing can protect from SIGKILL after that,
we must not assume child->mm != NULL.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@snapgear.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ptrace: unify FDPIC implementations

The Blackfin/FRV/SuperH guys all have the same exact FDPIC ptrace code in
their arch handlers (since they were probably copied & pasted). Since
these ptrace interfaces are an arch independent aspect of the FDPIC code,
unify them in the common ptrace code so new FDPIC ports don't need to copy
and paste this fundamental stuff yet again.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpusets: randomize node rotor used in cpuset_mem_spread_node()

Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems). Part of the reason is that
the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at
node 0 for newly created tasks.

This patch changes the rotor to be initialized to a random node number of
the cpuset.

[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpusets: new round-robin rotor for SLAB allocations

We have observed several workloads running on multi-node systems where
memory is assigned unevenly across the nodes in the system.  There are
numerous reasons for this but one is the round-robin rotor in
cpuset_mem_spread_node().

For example, a simple test that writes a multi-page file will allocate
pages on nodes 0 2 4 6 ...  Odd nodes are skipped.  (Sometimes it
allocates on odd nodes & skips even nodes).

An example is shown below.  The program "lfile" writes a file consisting
of 10 pages.  The program then mmaps the file & uses get_mempolicy(...,
MPOL_F_NODE) to determine the nodes where the file pages were allocated.
The output is shown below:

# ./lfile
allocated on nodes: 2 4 6 0 1 2 6 0 2

There is a single rotor that is used for allocating both file pages & slab
pages.  Writing the file allocates both a data page & a slab page
(buffer_head).  This advances the RR rotor 2 nodes for each page
allocated.

A quick confirmation seems to confirm this is the cause of the uneven
allocation:

# echo 0 >/dev/cpuset/memory_spread_slab
# ./lfile
allocated on nodes: 6 7 8 9 0 1 2 3 4 5

This patch introduces a second rotor that is used for slab allocations.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: clean up memory thresholds

Introduce struct mem_cgroup_thresholds. It helps to reduce number of
checks of thresholds type (memory or mem+swap).

[akpm@linux-foundation.org: repair comment]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cgroups: make cftype.unregister_event() void-returning

Since we are unable to handle an error returned by
cftype.unregister_event() properly, let's make the callback
void-returning.

mem_cgroup_unregister_event() has been rewritten to be a "never fail"
function. On mem_cgroup_usage_register_event() we save old buffer for
thresholds array and reuse it in mem_cgroup_usage_unregister_event() to
avoid allocation.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: fix mis-accounting of file mapped racy with migration

FILE_MAPPED per memcg of migrated file cache is not properly updated,
because our hook in page_add_file_rmap() can't know to which memcg
FILE_MAPPED should be counted.

Basically, this patch is for fixing the bug but includes some big changes
to fix up other messes.

Now, at migrating mapped file, events happen in following sequence.

1. allocate a new page.
2. get memcg of an old page.
3. charge ageinst a new page before migration. But at this point,
    no changes to new page's page_cgroup, no commit for the charge.
    (IOW, PCG_USED bit is not set.)
4. page migration replaces radix-tree, old-page and new-page.
5. page migration remaps the new page if the old page was mapped.
6. Here, the new page is unlocked.
7. memcg commits the charge for newpage, Mark the new page's page_cgroup
    as PCG_USED.

Because "commit" happens after page-remap, we can count FILE_MAPPED
at "5", because we should avoid to trust page_cgroup->mem_cgroup.
if PCG_USED bit is unset.
(Note: memcg's LRU removal code does that but LRU-isolation logic is used
for helping it. When we overwrite page_cgroup->mem_cgroup, page_cgroup is
not on LRU or page_cgroup->mem_cgroup is NULL.)

We can lose file_mapped accounting information at 5 because FILE_MAPPED
is updated only when mapcount changes 0->1. So we should catch it.

BTW, historically, above implemntation comes from migration-failure
of anonymous page. Because we charge both of old page and new page
with mapcount=0, we can't catch
  - the page is really freed before remap.
  - migration fails but it's freed before remap
or .....corner cases.

New migration sequence with memcg is:

1. allocate a new page.
2. mark PageCgroupMigration to the old page.
3. charge against a new page onto the old page's memcg. (here, new page's pc
    is marked as PageCgroupUsed.)
4. page migration replaces radix-tree, page table, etc...
5. At remapping, new page's page_cgroup is now makrked as "USED"
    We can catch 0->1 event and FILE_MAPPED will be properly updated.

    And we can catch SWAPOUT event after unlock this and freeing this
    page by unmap() can be caught.

7. Clear PageCgroupMigration of the old page.

So, FILE_MAPPED will be correctly updated.

Then, for what MIGRATION flag is ?
  Without it, at migration failure, we may have to charge old page again
  because it may be fully unmapped. "charge" means that we have to dive into
  memory reclaim or something complated. So, it's better to avoid
  charge it again. Before this patch, __commit_charge() was working for
  both of the old/new page and fixed up all. But this technique has some
  racy condtion around FILE_MAPPED and SWAPOUT etc...
  Now, the kernel use MIGRATION flag and don't uncharge old page until
  the end of migration.

I hope this change will make memcg's page migration much simpler.  This
page migration has caused several troubles.  Worth to add a flag for
simplification.

Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memcontrol - uninitialised return value

Only an out of memory error will cause ret to be set.

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: remove unnecessary use of atomic

The bottom 4 hunks are atomically changing memory to which there are no
aliases as it's freshly allocated, so there's no need to use atomic
operations.

The other hunks are just atomic_read and atomic_set, and do not involve
any read-modify-write. The use of atomic_{read,set} doesn't prevent a
read/write or write/write race, so if a race were possible (I'm not saying
one is), then it would still be there even with atomic_set.

See:
http://digitalvampire.org/blog/index.php/2007/05/13/atomic-cargo-cults/

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: make oom killer a no-op when no killable task can be found

It's pointless to try to kill current if select_bad_process() did not find
an eligible task to kill in mem_cgroup_out_of_memory() since it's
guaranteed that current is a member of the memcg that is oom and it is, by
definition, unkillable.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: update documentation

Some information are old, and I think current document doesn't work as "a
guide for users". We need summary of all of our controls, at least.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: move charge of file pages

This patch adds support for moving charge of file pages, which include
normal file, tmpfs file and swaps of tmpfs file.  It's enabled by setting
bit 1 of <target cgroup>/memory.move_charge_at_immigrate.

Unlike the case of anonymous pages, file pages(and swaps) in the range
mmapped by the task will be moved even if the task hasn't done page fault,
i.e.  they might not be the task's "RSS", but other task's "RSS" that maps
the same file.  And mapcount of the page is ignored(the page can be moved
even if page_mapcount(page) > 1).  So, conditions that the page/swap
should be met to be moved is that it must be in the range mmapped by the
target task and it must be charged to the old cgroup.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: clean up move charge

This patch cleans up move charge code by:

- define functions to handle pte for each types, and make
is_target_pte_for_mc() cleaner.

- instead of checking the MOVE_CHARGE_TYPE_ANON bit, define a function
that checks the bit.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: oom kill disable and oom status

This adds a feature to disable oom-killer for memcg, if disabled, of
course, tasks under memcg will stop.

But now, we have oom-notifier for memcg.  And the world around memcg is
not under out-of-memory.  memcg's out-of-memory just shows memcg hits
limit.  Then, administrator or management daemon can recover the situation
by

- kill some process
- enlarge limit, add more swap.
- migrate some tasks
- remove file cache on tmps (difficult ?)

Unlike oom-killer, you can take enough information before killing tasks.
(by gcore, or, ps etc.)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: oom notifier

Considering containers or other resource management softwares in userland,
event notification of OOM in memcg should be implemented. Now, memcg has
"threshold" notifier which uses eventfd, we can make use of it for oom
notification.

This patch adds oom notification eventfd callback for memcg. The usage is
very similar to threshold notifier, but control file is memory.oom_control
and no arguments other than eventfd is required.

% cgroup_event_notifier /cgroup/A/memory.oom_control dummy
(About cgroup_event_notifier, see Documentation/cgroup/)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: David Rientjes <rientjes@google.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: oom wakeup filter

memcg's oom waitqueue is a system-wide wait_queue (for handling
hierarchy.) So, it's better to add custom wake function and do filtering
in wake up path.

This patch adds a filtering feature for waking up oom-waiters. Hierarchy
is properly handled.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation/cgroups/cgroups.txt: fix reference to "numtasks"

Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation: SubmittingDrivers: Resources

- Add additional location (Git) for the kernel master tree
- Add reference to Git Project

Signed-off-by: Abraham Arce <x0066660@ti.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ufs: permit mounting of BorderWare filesystems

I recently had to recover some files from an old broken machine that was
running BorderWare Document Gateway.  It's basically a drop in web server
for sharing files.  From the look of the init process and using strings on
of a few files it seems to be based on FreeBSD 3.3.

The process turned out to be more difficult than I imagined, but to cut a
long story short BorderWare in their wisdom use a nonstandard magic number
in their UFS (ufstype=44bsd) file systems.  Thus Linux refuses to mount
the file systems in order to recover the data.  After a bit of hunting I
was able to make a quick fix to fs/ufs/super.c in order to detect the new
magic number.

I assume that this number is the same for all installations.  It's quite
easy to find out from ufs_fs.h.  The superblock sits 8k into the block
device and the magic number its 1372 bytes into the superblock struct.

# dd if=/dev/sda5 skip=$(( 8192 + 1372 )) bs=1 count=4 2> /dev/null | hd
00000000  97 26 24 0f                                       |.&$.|
#

Signed-off-by: Thomas Stewart <thomas@stewarts.org.uk>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/telephony/ixj.c: use memdup_user

Use memdup_user when user data is immediately copied into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = $kmalloc@p\|kzalloc@p$(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbdev: bf54x-lq043fb: fix unused warnings with backlight code

The current backlight code is stubbed out, so the new props changes added
some warnings:
drivers/video/bf54x-lq043fb.c: In function 'bfin_bf54x_probe':
drivers/video/bf54x-lq043fb.c:666: warning: label 'out9' defined but not used
drivers/video/bf54x-lq043fb.c:504: warning: unused variable 'props'

Fix em !

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbdev: bfin-t350mcqb-fb: avoid unused warnings in backlight code

The current backlight code is stubbed out, so the new props changes added
some warnings about unused label/prop.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/video/via: use memdup_user

Use memdup_user when user data is immediately copied into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = $kmalloc@p\|kzalloc@p$(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Joseph Chan <JosephChan@via.com.tw>
Cc: Scott Fang <ScottFang@viatech.com.cn>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

add support for S3 Trio3D/1X/2X

Add support for S3 Trio3D/1X (86C360) and S3 Trio3D/2X (86C362 and 86C368)
cards to s3fb driver. Tested with 86C362 AGP and 86C368 PCI&AGP.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Ondrej Zajicek <santiago@crfreenet.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/gpio/it8761e_gpio: check return value of gpiochip_remove()

This eliminates the following build warning:

drivers/gpio/it8761e_gpio.c: In function `it8761e_gpio_exit':
drivers/gpio/it8761e_gpio.c:220: warning: ignoring return value of `gpiochip_remove', declared with attribute warn_unused_result

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Denis Turischev <denis@compulab.co.il>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio: add Penwell gpio support

Intel Penwell chip has two 96 pins GPIO blocks, which are very similiar as
Intel Langwell chip GPIO block, except for pin number difference. This
patch expends the original Langwell GPIO driver to support Penwell's.

Signed-off-by: Alek Du <alek.du@intel.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm: omap: remove the unused omap_gpio_set_debounce methods

Nobody uses that anymore, so remove and expect drivers to use the gpiolib
implementation.

Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm: omap: switch over to gpio_set_debounce

Stop using the omap-specific implementations for gpio debouncing now that
gpiolib provides its own support.

Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm: omap: gpio: implement set_debounce method

OMAP supports debouncing of gpio lines, implement the method using
gpiolib.

Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpiolib: introduce set_debounce method

A few architectures, like OMAP, allow you to set a debouncing time for the
gpio before generating the IRQ.  Teach gpiolib about that.

Mark said:
: This would be generally useful for embedded systems, especially where
: the interrupt concerned is a wake source.  It allows drivers to avoid
: spurious interrupts from noisy sources so if the hardware supports it
: the driver can avoid having to explicitly wait for the signal to become
: stable and software has to cope with fewer events.  We've lived without
: it for quite some time, though.

David said:
: I looked at adding debounce support to the generic GPIO calls (and thus
: gpiolib) some time back, but decided against it.  I forget why at this
: time (check list archives) but it wasn't because of lack of utility in
: certain contexts.
:
: One thing to watch out for is just how variable the hardware capabilities
: are.  Atmel GPIOs have something like a fixed number of 32K clock cycles
: for debounce, twl4030 had something odd, OMAPs were more like the Atmel
: chips but with a different clock.  In some cases debouncing had to be
: ganged, not per-GPIO.  And so forth.

Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: David Brownell <david-b@pacbell.net>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpiolib: make gpiochip_add() show a better error message

The current message, 'not registered' is confusing as it implies it was
not registered with something, whereas printing 'failed to register'
implies it was the gpiochip_add() call that did not work correctly.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio: max732x: fix input configuration for open-drain pins

Fix a bug I noticed while hacking on the max732x driver for interrupt
support. According to the datasheets, open-drain pins have to be
configured as output-high (which in that case is actually high impedance)
to be used as input.

Signed-off-by: Marc Zyngier <maz@misterjones.org>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

max732x: correct nr_port checking off by one error

Setup both client_group_a and client_group_b if nr_port > 8 (not including
nr_port==8).

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Eric Miao <eric.miao@marvell.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pl061: fix offset value range checking

The valid offset value is 0..PL061_GPIO_NR-1, this patch corrects the
offset value range checking.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpiolib: document that names can contain printk format specifiers

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpiolib: a gpio is unsigned, so use %u to print it

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpiolib: make names array and its values const

gpiolib doesn't need to modify the names and I assume most initializers
use string constants that shouldn't be modified anyhow.

[akpm@linux-foundation.org: fix drivers/gpio/cs5535-gpio.c]
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Kevin Wells <kevin.wells@nxp.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gpio: add interrupt handling capability to max732x

Most of the GPIO expanders supported by the max732x driver have interrupt
generation capability by reporting changes on input pins through an INT#
pin. This patch implements the irq_chip functionnality (edge detection
only).

Signed-off-by: Marc Zyngier <maz@misterjones.org>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Jebediah Huang <jebediah.huang@gmail.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc: AB8500 RTC driver

Add a driver for the RTC on the AB8500 power management chip. This is a
client of the AB8500 MFD driver.

Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/autofs4: use memdup_user

Use memdup_user when user data is immediately copied into the allocated
region.  Elimination of the variable ads, which is no longer useful.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = $kmalloc@p\|kzalloc@p$(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/message/i2o/i2o_config.c: use memdup_user

Use memdup_user when user data is immediately copied into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = $kmalloc@p\|kzalloc@p$(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/char/vt.c: use memdup_user

Use memdup_user when user data is immediately copied into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = $kmalloc@p\|kzalloc@p$(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/mmc/host: use ERR_CAST

Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes more
clear what is the purpose of the operation, which otherwise looks like a
no-op.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
type T;
T x;
identifier f;
@@

T f (...) { <+...
- ERR_PTR(PTR_ERR(x))
+ x
...+> }

@@
expression x;
@@

- ERR_PTR(PTR_ERR(x))
+ ERR_CAST(x)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci-spear: ST SPEAr based SDHCI controller glue

Add a glue layer to support the sdhci driver on the ST SPEAr platform.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Cc: <shiraz.hashim@st.com>
Cc: Linus Walleij <linus.ml.walleij@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: add new function for RAW (Read after Write) operation

SDIO specification allows RAW (Read after Write) operation using
IO_RW_DIRECT command (CMD52) by setting the RAW bit. This operation is
similar to ordinary read/write commands, except that both write and read
are performed using single command/response pair. The Linux SDIO layer
already supports this internaly, only external function is missing for
drivers to make use, which is added by this patch.

This type of command is required to implement proper power save mode
support in wl1251 wifi driver.

Android has similar patch for G1 in it's tree for the same reason:

http://android.git.kernel.org/?p=kernel/common.git;a=commitdiff;h=74a47786f6ecbe6c1cf9fb15efe6a968451deb52

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Kalle Valo <kalle.valo@iki.fi>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: remove the "state" argument to mmc_suspend_host()

Even though many mmc host drivers pass a pm_message_t argument to
mmc_suspend_host() that argument isn't used the by MMC core. As host
drivers are converted to dev_pm_ops they'll have to construct
pm_message_t's (as they won't be passed by the PM subsystem any more) just
to appease the mmc suspend interface.

We might as well just delete the unused paramter.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Acked-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>ZZ
Acked-by: Sascha Sommer <saschasommer@freenet.de>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: OMAP HS-MMC: convert to dev_pm_ops

Convert PM operations to use dev_pm_ops. This will facilitate the runtime
PM coversion which will add to dev_pm_ops hooks.

Note that dev_pm_ops version of the suspend hook no longer takes a 'state'
argument. However, the MMC core function mmc_suspend_host() still takes a
'state' argument, but it is unused, so a dummy state variable was created
to pass to the MMC core.

In the future, the MMC core should be converted to drop this state
argument and the rest of the MMC drivers could be easily converted to
dev_pm_ops as well.

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Madhusudhan Chikkature <madhu.cr@ti.com>
Cc: Adrian Hunter <adrian.hunter@nokia.com>
Cc: Matt Fleming <matt@console-pimps.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Denis Karpov <ext-denis.2.karpov@nokia.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

omap_hsmmc: improve interrupt synchronisation

The following changes were needed:
- do not use in_interrupt() because it will not work
with threaded interrupts

In addition, the following improvements were made:
- ensure DMA is unmapped only after the final DMA interrupt
- ensure a request is completed only after the final DMA interrupt
- disable controller interrupts when a request is not in progress
- remove the spin-lock protecting the start of a new request from
an unexpected interrupt because the locking was complicated and
a 'req_in_progress' flag suffices (since the spin-lock only defers
the unexpected interrupts anyway)
- instead use the spin-lock to protect the MMC interrupt handler
from the DMA interrupt handler
- remove the semaphore preventing DMA from being started while
the previous DMA is still in progress - the other changes make that
impossible, so it is now a BUG_ON condition
- ensure the controller interrupt status is clear before exiting
the interrrupt handler

In general, these changes make the code safer but do not fix any specific
bugs so backporting is not necessary.

Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Tested-by: Venkatraman S <svenkatr@ti.com>
Acked-by: Madhusudhan Chikkature <madhu.cr@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci: enable multiblock transfers in sdhci-s3c

Wifi over SDIO doesn't work correctly without multiblock, so enable this.
This patch depends on the following patches:

Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
Cc: Thomas Abraham <thomas.ab@samsung.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: s3c6410: enable ADMA feature in 6410 sdhci controller

Enable the ADMA feature in the 6410 SDHCI controller driver.

Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
Signed-off-by: Thomas Abraham <thomas.ab@samsung.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: s3c6410: add new quirk in sdhci driver and update ADMA descriptor build

The s3c6410 sdhci controller does not support the 'End' attribute and NOP
attribute in the same 8-Byte ADMA descriptor. This patch adds a new quirk
to identify sdhci host contollers with such behaviour. In addition to
this, for controllers using the new quirk, the last entry in the ADMA
descritor table is marked with the 'End' attribute (instead of using a NOP
descriptor with 'End' attribute).

Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
Signed-off-by: Thomas Abraham <thomas.ab@samsung.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci: build fix: rename SDHCI I/O accessor functions

Unfortunately some architectures #define their read{b,w,l} and
write{b,w,l} I/O accessors which makes the SDHCI I/O accessor functions of
the same names subject to preprocessing. This leads to the following
compiler error,

In file included from drivers/mmc/host/sdhci.c:26:
drivers/mmc/host/sdhci.h:318:35: error: macro "writel" passed 3 arguments, but takes just 2

Rename the SDHCI I/O functions so that CONFIG_MMC_SDHCI_IO_ACCESSORS can
be enabled for architectures that implement their read{b,w,l} and
write{b,w,l} functions with macros.

Signed-off-by: Matt Fleming <matt@console-pimps.org>
Cc: Zhangfei Gao <zgao6@marvell.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: SDHCI_INT_DATA_MASK typo error

Signed-off-by: Zhangfei Gao <zgao6@marvell.com>
Reviewed-by: Matt Fleming <matt@console-pimps.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: atmel-mci: Add support for SDIO interrupts

Atmel-mci support for SDIO interrupts.  This adds the enable_sdio_irq()
function and the configuration of sdio irq mask per slot.  With this irq
mask information, we keep the idea of multiple slot per sd/mmc host (not
only A and B).  MMC_CAP_SDIO_IRQ is added according to slot configuration.

A new little function is added to run mmc_signal_sdio_irq() during
interrupt handling routine.

Signed-off-by: Anders Grahn <anders.grahn@hd-wireless.se>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: add support MMCIF for SuperH

MMCIF is the MMC Host Interface in SuperH.

Signed-off-by: Yusuke Goda <yusuke.goda.sx@renesas.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: atmel-mci: enable SD high speed support

Enable high speed support for atmel-mci driver. This support is dependent
of the revision of the IP and, of course, the capacity of the SD card
used.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Reviewed-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: sd: clean up redundant memset

The clearing of mrq via a memset at the top of the for loop in
mmc_wait_for_app_cmd() is not required as mrq is not used and there is
another clearing of mrq just below. We remove the first memset since if
the initial tests in the for loop fail the memset is not required.

Signed-off-by: Mark Asselstine <asselsm@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

davinci: mmc: updates to suspend/resume implementation

Improve the suspend and resume callbacks in DaVinci MMC host controller
driver. Modify the reset status of the contorller and clock during
suspend and resume. Also migrate the power management callbacks from
platform driver to dev_pm_ops structure.

Tested on DA850/OMAP-L138 EVM.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Chaithrika U S <chaithrika@ti.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Vipin Bhandari <vipin.bhandari@ti.com>
Cc: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

davinci: mmc: add a function to control reset state of the controller

Add a helper function which will aid in changing the reset
status of the controller.

Signed-off-by: Chaithrika U S <chaithrika@ti.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: Vipin Bhandari <vipin.bhandari@ti.com>
Cc: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci-pltfm: do not print errors in case of an extended iomem size

Some hosts have an extended SDHCI iomem size, so the driver should
only print errors if the iomem size is less than 0x100.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Richard Röjfors <richard.rojfors@pelagicore.com>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: Ben Dooks <ben@simtec.co.uk>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci-pltfm: implement platform data passing

This includes platform ops, quirks and (de)initialization callbacks.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Richard Röjfors <richard.rojfors@pelagicore.com>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: Ben Dooks <ben@simtec.co.uk>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci: implement CAP_CLOCK_BASE_BROKEN quirk

Some hosts (e.g. as found in CNS3xxx SOCs) report wrong value in
CLOCK_BASE capability field, and currently there is no way to force the
SDHCI core to use the platform-provided base clock value.

This patch implements CAP_CLOCK_BASE_BROKEN quirk. When enabled, the
SDHCI core will always use base clock frequency provided by the platform.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Richard Röjfors <richard.rojfors@pelagicore.com>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: Ben Dooks <ben@simtec.co.uk>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc-omap: add support for 16-bit and 32-bit registers

The omap850 and omap730 use 16-bit registers instead of 32-bit, requiring
a modification of the register addresses in the mmc-omap driver. To
resolve this, a bit shift is performed on base register addresses, either
by 1 or 2 bits depending on the CPU in use. This yields the correct
registers for each CPU.

Signed-off-by: Marek Belisko <marek.belisko@open-nandra.com>
Signed-off-by: Cory Maccarrone <darkstar6262@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: Ladislav Michl <ladis@linux-mips.org>
Cc: Ben Dooks <ben@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

davinci: mmc: pass number of SG segments as platform data

On some platforms like DM355, the number of EDMA parameter slots available
for EDMA_SLOT_ANY usage are few.  In such cases, if MMC/SD uses 16 slots
for each instance of MMC controller, then the number of slots available
for other modules will be very few.

By passing the number of EDMA slots to be used in MMC driver from platform
data, EDMA slots available for other purposes can be controlled.

Most of the platforms will not use this platform data variable.  But on
DM355, as the number of EDMA resources available is limited, the number of
scatter- gather segments used inside the MMC driver can be 8 (passed as
platform data) instead of 16.  On DM355, when the number of scatter-gather
segments was reduced to 8, I saw a performance difference of about
0.25-0.4 Mbytes/sec during write.  Read performance variations were
negligible.

Signed-off-by: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Acked-by: Kevin Hilman <khilman@deeprootsystems.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  agp: amd64, fix pci reference leaks
  drm/edid: Allow non-fatal checksum errors in CEA blocks
  drm/radeon/kms: suppress a build warning (unused variable)
  drm: Fixes linux-next & linux-2.6 checkstack warnings:
  nouveau: fix acpi_lid_open undefined
  drm/radeon/kms: release AGP bridge at suspend

do_generic_file_read: clear page errors when issuing a fresh read of the page

I/O errors can happen due to temporary failures, like multipath
errors or losing network contact with the iSCSI server. Because
of that, the VM will retry readpage on the page.

However, do_generic_file_read does not clear PG_error. This
causes the system to be unable to actually use the data in the
page cache page, even if the subsequent readpage completes
successfully!

The function filemap_fault has had a ClearPageError before
readpage forever. This patch simply adds the same to
do_generic_file_read.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Larry Woodman <lwoodman@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git./linux/kernel/git/pkl/squashfs-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  squashfs: update documentation to include description of xattr layout
  squashfs: fix name reading in squashfs_xattr_get
  squashfs: constify xattr handlers
  squashfs: xattr fix sparse warnings
  squashfs: xattr_lookup sparse fix
  squashfs: add xattr support configure option
  squashfs: add new extended inode types
  squashfs: add support for xattr reading
  squashfs: add xattr id support

Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: roccat: fix build failure if built as module
  HID: roccat: propagate special events of roccat hardware to userspace
  HID: Add the GYR4101US USB ID to hid-gyration
  HID: fix hid-roccat-kone for bin_attr API change

Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: emu10k1: allow high-resolution mixer controls
  ALSA: pcm: fix delta calculation at boundary wraparound
  ALSA: hda_intel: fix handling of non-completion stream interrupts
  ALSA: usb/caiaq: fix Traktor Kontrol X1 ABS_HAT2X axis
  ALSA: hda: Fix model quirk for Dell M1730
  ALSA: hda - iMac9,1 sound fixes
  ALSA: hda: Use LPIB for Toshiba A100-259
  ALSA: hda: Use LPIB for Acer Aspire 5110
  ALSA: aw2-alsa.c: use pci_ids.h defines and fix checkpatch.pl noise
  ALSA: usb-audio: add support for Akai MPD16
  ALSA: pcm: fix the fix of the runtime->boundary calculation

Revert "endian: #define __BYTE_ORDER"

This reverts commit b3b77c8caef1750ebeea1054e39e358550ea9f55, which was
also totally broken (see commit 0d2daf5cc858 that reverted the crc32
version of it).  As reported by Stephen Rothwell, it causes problems on
big-endian machines:

> In file included from fs/jfs/jfs_types.h:33,
>                  from fs/jfs/jfs_incore.h:26,
>                  from fs/jfs/file.c:22:
> fs/jfs/endian24.h:36:101: warning: "__LITTLE_ENDIAN" is not defined

The kernel has never had that crazy "__BYTE_ORDER == __LITTLE_ENDIAN"
model.  It's not how we do things, and it isn't how we _should_ do
things.  So don't go there.

Requested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nuc900: add maintainer entries for Wan ZongShun

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "crc32: use __BYTE_ORDER macro for endian detection"

It doesn't work on big-endian - those architectures don't define
__LITTLE_ENDIAN.

Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/fscache/object-list.c: fix warning on 32-bit

fs/fscache/object-list.c: In function 'fscache_objlist_lookup':
fs/fscache/object-list.c:105: warning: cast to pointer from integer of different size

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nommu: allow private mappings of read-only devices

Slightly rearrange the logic that determines capabilities and vm_flags.
Disable BDI_CAP_MAP_DIRECT in all cases if the device can't support the
protections. Allow private readonly mappings of readonly backing devices.

Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: David McCullough <davidm@snapgear.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mempolicy: ERR_PTR dereference in mpol_shared_policy_init()

The original code called mpol_put(new) while "new" was an ERR_PTR.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'fix/hda' into for-linus

agp: amd64, fix pci reference leaks

Stanse found pci reference leaks in uli_agp_init and nforce3_agp_init
initialization functions.

The PCI devices are bridges, so it's not critical, but still worth fixing.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/edid: Allow non-fatal checksum errors in CEA blocks

Switches will try to update the topology address and not correctly fix
up the checksum, so just let it slide.

https://bugs.freedesktop.org/28229

Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: suppress a build warning (unused variable)

At least 'make CONFIG_DEBUG_SECTION_MISMATCH=y' causes
drivers/gpu/drm/radeon/atombios_crtc.c: In function 'atombios_crtc_set_pll':
drivers/gpu/drm/radeon/atombios_crtc.c:684: warning: 'pll' may be used uninitialized in this function
which has the looks of a falso positive.

Add a default: case so that gcc rests assured that all possible pll_id's are covered.
Keep the present cases that fall through to the default one for self-documentation.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

squashfs: update documentation to include description of xattr layout

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>

Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (63 commits)
  drivers/net/usb/asix.c: Fix pointer cast.
  be2net: Bug fix to avoid disabling bottom half during firmware upgrade.
  proc_dointvec: write a single value
  hso: add support for new products
  Phonet: fix potential use-after-free in pep_sock_close()
  ath9k: remove VEOL support for ad-hoc
  ath9k: change beacon allocation to prefer the first beacon slot
  sock.h: fix kernel-doc warning
  cls_cgroup: Fix build error when built-in
  macvlan: do proper cleanup in macvlan_common_newlink() V2
  be2net: Bug fix in init code in probe
  net/dccp: expansion of error code size
  ath9k: Fix rx of mcast/bcast frames in PS mode with auto sleep
  wireless: fix sta_info.h kernel-doc warnings
  wireless: fix mac80211.h kernel-doc warnings
  iwlwifi: testing the wrong variable in iwl_add_bssid_station()
  ath9k_htc: rare leak in ath9k_hif_usb_alloc_tx_urbs()
  ath9k_htc: dereferencing before check in hif_usb_tx_cb()
  rt2x00: Fix rt2800usb TX descriptor writing.
  rt2x00: Fix failed SLEEP->AWAKE and AWAKE->SLEEP transitions.
  ...

Merge branch 'alpha-next' of git://git./linux/kernel/git/mattst88/alpha-2.6

* 'alpha-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: simplify and optimize sched_find_first_bit
  alpha: invoke oom-killer from page fault
  Convert alpha to use clocksources instead of arch_gettimeoffset

Merge git://git./linux/kernel/git/gregkh/driver-core-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
driver core: add devname module aliases to allow module on-demand auto-loading

Revert "module: drop the lock while waiting for module to complete initialization."

This reverts commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455, since
Rafael reports that it causes occasional kernel paging request faults in
load_module().

Dropping the module lock and re-taking it deep in the call-chain is
definitely not the right thing to do. That just turns the mutex from a
lock into a "random non-locking data structure" that doesn't actually
protect what it's supposed to protect.

Requested-and-tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Brandon Philips <brandon@ifup.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/net/usb/asix.c: Fix pointer cast.

Stephen Rothwell reports the following new warning:

drivers/net/usb/asix.c: In function 'asix_rx_fixup':
drivers/net/usb/asix.c:325: warning: cast from pointer to integer of different size
drivers/net/usb/asix.c:354: warning: cast from pointer to integer of different size

The code just cares about the low alignment bits, so use
an "unsigned long" cast instead of one to "u32".

Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Bug fix to avoid disabling bottom half during firmware upgrade.

Certain firmware commands/operations to upgrade firmware could take several
seconds to complete. The code presently disables bottom half during these
operations which could lead to unpredictable behaviour in certain cases. This
patch now does all firmware upgrade operations asynchronously using a
completion variable.

Signed-off-by: Sarveshwar Bandi <sarveshwarb@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

proc_dointvec: write a single value

The commit 00b7c3395aec3df43de5bd02a3c5a099ca51169f
"sysctl: refactor integer handling proc code"
modified the behaviour of writing to /proc.
Before the commit, write("1\n") to /proc/sys/kernel/printk succeeded. But
now it returns EINVAL.

This commit supports writing a single value to a multi-valued entry.

Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Reviewed-and-tested-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hso: add support for new products

This patch adds a few new product id's for the hso driver.

Signed-off-by: Filip Aben <f.aben@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Phonet: fix potential use-after-free in pep_sock_close()

sk_common_release() might destroy our last reference to the socket.
So an extra temporary reference is needed during cleanup.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

alpha: simplify and optimize sched_find_first_bit

Search only the first 100 bits instead of 140, saving a couple
instructions. The resulting code is about 1/3 faster (40K ticks/1000
iterations down to 30K ticks/1000 iterations).

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>

alpha: invoke oom-killer from page fault

As explained in commit 1c0fe6e3bd, we want to call the architecture
independent oom killer when getting an unexplained OOM from
handle_mm_fault, rather than simply killing current.

[mattst88: kill now unused 'survive' label]
Cc: linux-alpha@vger.kernel.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: linux-arch@vger.kernel.org
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>

Convert alpha to use clocksources instead of arch_gettimeoffset

Alpha has a tsc like rpcc counter that it uses to manage time.
This can be converted to an actual clocksource instead of utilizing
the arch_gettimeoffset method that is really only there for legacy
systems with no continuous counter.

Further cleanups could be made if alpha converted to the clockevent
model.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Richard Henderson <rth@twiddle.net>
Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Tested-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: John Stultz <johnstul@us.ibm.com>

driver core: add devname module aliases to allow module on-demand auto-loading

This adds:
  alias: devname:<name>
to some common kernel modules, which will allow the on-demand loading
of the kernel module when the device node is accessed.

Ideally all these modules would be compiled-in, but distros seems too
much in love with their modularization that we need to cover the common
cases with this new facility. It will allow us to remove a bunch of pretty
useless init scripts and modprobes from init scripts.

The static device node aliases will be carried in the module itself. The
program depmod will extract this information to a file in the module directory:
  $ cat /lib/modules/2.6.34-00650-g537b60d-dirty/modules.devname
  # Device nodes to trigger on-demand module loading.
  microcode cpu/microcode c10:184
  fuse fuse c10:229
  ppp_generic ppp c108:0
  tun net/tun c10:200
  dm_mod mapper/control c10:235

Udev will pick up the depmod created file on startup and create all the
static device nodes which the kernel modules specify, so that these modules
get automatically loaded when the device node is accessed:
  $ /sbin/udevd --debug
  ...
  static_dev_create_from_modules: mknod '/dev/cpu/microcode' c10:184
  static_dev_create_from_modules: mknod '/dev/fuse' c10:229
  static_dev_create_from_modules: mknod '/dev/ppp' c108:0
  static_dev_create_from_modules: mknod '/dev/net/tun' c10:200
  static_dev_create_from_modules: mknod '/dev/mapper/control' c10:235
  udev_rules_apply_static_dev_perms: chmod '/dev/net/tun' 0666
  udev_rules_apply_static_dev_perms: chmod '/dev/fuse' 0666

A few device nodes are switched to statically allocated numbers, to allow
the static nodes to work. This might also useful for systems which still run
a plain static /dev, which is completely unsafe to use with any dynamic minor
numbers.

Note:
The devname aliases must be limited to the *common* and *single*instance*
device nodes, like the misc devices, and never be used for conceptually limited
systems like the loop devices, which should rather get fixed properly and get a
control node for losetup to talk to, instead of creating a random number of
device nodes in advance, regardless if they are ever used.

This facility is to hide the mess distros are creating with too modualized
kernels, and just to hide that these modules are not compiled-in, and not to
paper-over broken concepts. Thanks! :)

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Ian Kent <raven@themaw.net>
Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>