review.tizen.org Git - platform/kernel/linux-arm64.git/log

mm, oom: base root bonus on current usage

A 3% of system memory bonus is sometimes too excessive in comparison to
other processes.

With commit a63d83f427fb ("oom: badness heuristic rewrite"), the OOM
killer tries to avoid killing privileged tasks by subtracting 3% of
overall memory (system or cgroup) from their per-task consumption.  But
as a result, all root tasks that consume less than 3% of overall memory
are considered equal, and so it only takes 33+ privileged tasks pushing
the system out of memory for the OOM killer to do something stupid and
kill dhclient or other root-owned processes.  For example, on a 32G
machine it can't tell the difference between the 1M agetty and the 10G
fork bomb member.

The changelog describes this 3% boost as the equivalent to the global
overcommit limit being 3% higher for privileged tasks, but this is not
the same as discounting 3% of overall memory from _every privileged task
individually_ during OOM selection.

Replace the 3% of system memory bonus with a 3% of current memory usage
bonus.

By giving root tasks a bonus that is proportional to their actual size,
they remain comparable even when relatively small.  In the example
above, the OOM killer will discount the 1M agetty's 256 badness points
down to 179, and the 10G fork bomb's 262144 points down to 183500 points
and make the right choice, instead of discounting both to 0 and killing
agetty because it's first in the task list.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: don't lose the SOFT_DIRTY flag on mprotect

The SOFT_DIRTY bit shows that the content of memory was changed after a
defined point in the past.  mprotect() doesn't change the content of
memory, so it must not change the SOFT_DIRTY bit.

This bug causes a malfunction: on the first iteration all pages are
dumped.  On other iterations only pages with the SOFT_DIRTY bit are
dumped.  So if the SOFT_DIRTY bit is cleared from a page by mistake, the
page is not dumped and its content will be restored incorrectly.

This patch does nothing with _PAGE_SWP_SOFT_DIRTY, becase pte_modify()
is called only for present pages.

Fixes commit 0f8975ec4db2 ("mm: soft-dirty bits for user memory changes
tracking").

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/slub.c: fix page->_count corruption (again)

Commit abca7c496584 ("mm: fix slab->page _count corruption when using
slub") notes that we can not _set_ a page->counters directly, except
when using a real double-cmpxchg.  Doing so can lose updates to
->_count.

That is an absolute rule:

        You may not *set* page->counters except via a cmpxchg.

Commit abca7c496584 fixed this for the folks who have the slub
cmpxchg_double code turned off at compile time, but it left the bad case
alone.  It can still be reached, and the same bug triggered in two
cases:

1. Turning on slub debugging at runtime, which is available on
   the distro kernels that I looked at.
2. On 64-bit CPUs with no CMPXCHG16B (some early AMD x86-64
   cpus, evidently)

There are at least 3 ways we could fix this:

1. Take all of the exising calls to cmpxchg_double_slab() and
   __cmpxchg_double_slab() and convert them to take an old, new
   and target 'struct page'.
2. Do (1), but with the newly-introduced 'slub_data'.
3. Do some magic inside the two cmpxchg...slab() functions to
   pull the counters out of new_counters and only set those
   fields in page->{inuse,frozen,objects}.

I've done (2) as well, but it's a bunch more code.  This patch is an
attempt at (3).  This was the most straightforward and foolproof way
that I could think to do this.

This would also technically allow us to get rid of the ugly

#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
       defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)

in 'struct page', but leaving it alone has the added benefit that
'counters' stays 'unsigned' instead of 'unsigned long', so all the
copies that the slub code does stay a bit smaller.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mempolicy.c: fix mempolicy printing in numa_maps

As a result of commit 5606e3877ad8 ("mm: numa: Migrate on reference
policy"), /proc/<pid>/numa_maps prints the mempolicy for any <pid> as
"prefer:N" for the local node, N, of the process reading the file.

This should only be printed when the mempolicy of <pid> is
MPOL_PREFERRED for node N.

If the process is actually only using the default mempolicy for local
node allocation, make sure "default" is printed as expected.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Robert Lippert <rlippert@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: remove zram->lock in read path and change it with mutex

Finally, we separated zram->lock dependency from 32bit stat/ table
handling so there is no reason to use rw_semaphore between read and
write path so this patch removes the lock from read path totally and
changes rw_semaphore with mutex.  So, we could do

old:

  read-read: OK
  read-write: NO
  write-write: NO

Now:

  read-read: OK
  read-write: OK
  write-write: NO

The below data proves mixed workload performs well 11 times and there is
also enhance on write-write path because current rw-semaphore doesn't
support SPIN_ON_OWNER.  It's side effect but anyway good thing for us.

Write-related tests perform better (from 61% to 1058%) but read path has
good/bad(from -2.22% to 1.45%) but they are all marginal within stddev.

  CPU 12
  iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0

  ==Initial write                ==Initial write
  records: 10                    records: 10
  avg:  516189.16                avg:  839907.96
  std:   22486.53 (4.36%)        std:   47902.17 (5.70%)
  max:  546970.60                max:  909910.35
  min:  481131.54                min:  751148.38
  ==Rewrite                      ==Rewrite
  records: 10                    records: 10
  avg:  509527.98                avg: 1050156.37
  std:   45799.94 (8.99%)        std:   40695.44 (3.88%)
  max:  611574.27                max: 1111929.26
  min:  443679.95                min:  980409.62
  ==Read                         ==Read
  records: 10                    records: 10
  avg: 4408624.17                avg: 4472546.76
  std:  281152.61 (6.38%)        std:  163662.78 (3.66%)
  max: 4867888.66                max: 4727351.03
  min: 4058347.69                min: 4126520.88
  ==Re-read                      ==Re-read
  records: 10                    records: 10
  avg: 4462147.53                avg: 4363257.75
  std:  283546.11 (6.35%)        std:  247292.63 (5.67%)
  max: 4912894.44                max: 4677241.75
  min: 4131386.50                min: 4035235.84
  ==Reverse Read                 ==Reverse Read
  records: 10                    records: 10
  avg: 4565865.97                avg: 4485818.08
  std:  313395.63 (6.86%)        std:  248470.10 (5.54%)
  max: 5232749.16                max: 4789749.94
  min: 4185809.62                min: 3963081.34
  ==Stride read                  ==Stride read
  records: 10                    records: 10
  avg: 4515981.80                avg: 4418806.01
  std:  211192.32 (4.68%)        std:  212837.97 (4.82%)
  max: 4889287.28                max: 4686967.22
  min: 4210362.00                min: 4083041.84
  ==Random read                  ==Random read
  records: 10                    records: 10
  avg: 4410525.23                avg: 4387093.18
  std:  236693.22 (5.37%)        std:  235285.23 (5.36%)
  max: 4713698.47                max: 4669760.62
  min: 4057163.62                min: 3952002.16
  ==Mixed workload               ==Mixed workload
  records: 10                    records: 10
  avg:  243234.25                avg: 2818677.27
  std:   28505.07 (11.72%)       std:  195569.70 (6.94%)
  max:  288905.23                max: 3126478.11
  min:  212473.16                min: 2484150.69
  ==Random write                 ==Random write
  records: 10                    records: 10
  avg:  555887.07                avg: 1053057.79
  std:   70841.98 (12.74%)       std:   35195.36 (3.34%)
  max:  683188.28                max: 1096125.73
  min:  437299.57                min:  992481.93
  ==Pwrite                       ==Pwrite
  records: 10                    records: 10
  avg:  501745.93                avg:  810363.09
  std:   16373.54 (3.26%)        std:   19245.01 (2.37%)
  max:  518724.52                max:  833359.70
  min:  464208.73                min:  765501.87
  ==Pread                        ==Pread
  records: 10                    records: 10
  avg: 4539894.60                avg: 4457680.58
  std:  197094.66 (4.34%)        std:  188965.60 (4.24%)
  max: 4877170.38                max: 4689905.53
  min: 4226326.03                min: 4095739.72

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: remove workqueue for freeing removed pending slot

Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity")
introduced free request pending code to avoid scheduling by mutex under
spinlock and it was a mess which made code lenghty and increased
overhead.

Now, we don't need zram->lock any more to free slot so this patch
reverts it and then, tb_lock should protect it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: introduce zram->tb_lock

Currently, the zram table is protected by zram->lock but it's rather
coarse-grained lock and it makes hard for scalibility.

Let's use own rwlock instead of depending on zram->lock. This patch
adds new locking so obviously, it would make slow but this patch is just
prepartion for removing coarse-grained rw_semaphore(ie, zram->lock)
which is hurdle about zram scalability.

Final patch in this patchset series will remove the lock from read-path
and change rw_semaphore with mutex in write path. With bonus, we could
drop pending slot free mess in next patch.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: use atomic operation for stat

Some of fields in zram->stats are protected by zram->lock which is
rather coarse-grained so let's use atomic operation without explict
locking.

This patch is ready for removing dependency of zram->lock in read path
which is very coarse-grained rw_semaphore.  Of course, this patch adds
new atomic operation so it might make slow but my 12CPU test couldn't
spot any regression.  All gain/lose is marginal within stddev.

  iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0

  ==Initial write                ==Initial write
  records: 50                    records: 50
  avg:  412875.17                avg:  415638.23
  std:   38543.12 (9.34%)        std:   36601.11 (8.81%)
  max:  521262.03                max:  502976.72
  min:  343263.13                min:  351389.12
  ==Rewrite                      ==Rewrite
  records: 50                    records: 50
  avg:  416640.34                avg:  397914.33
  std:   60798.92 (14.59%)       std:   46150.42 (11.60%)
  max:  543057.07                max:  522669.17
  min:  304071.67                min:  316588.77
  ==Read                         ==Read
  records: 50                    records: 50
  avg: 4147338.63                avg: 4070736.51
  std:  179333.25 (4.32%)        std:  223499.89 (5.49%)
  max: 4459295.28                max: 4539514.44
  min: 3753057.53                min: 3444686.31
  ==Re-read                      ==Re-read
  records: 50                    records: 50
  avg: 4096706.71                avg: 4117218.57
  std:  229735.04 (5.61%)        std:  171676.25 (4.17%)
  max: 4430012.09                max: 4459263.94
  min: 2987217.80                min: 3666904.28
  ==Reverse Read                 ==Reverse Read
  records: 50                    records: 50
  avg: 4062763.83                avg: 4078508.32
  std:  186208.46 (4.58%)        std:  172684.34 (4.23%)
  max: 4401358.78                max: 4424757.22
  min: 3381625.00                min: 3679359.94
  ==Stride read                  ==Stride read
  records: 50                    records: 50
  avg: 4094933.49                avg: 4082170.22
  std:  185710.52 (4.54%)        std:  196346.68 (4.81%)
  max: 4478241.25                max: 4460060.97
  min: 3732593.23                min: 3584125.78
  ==Random read                  ==Random read
  records: 50                    records: 50
  avg: 4031070.04                avg: 4074847.49
  std:  192065.51 (4.76%)        std:  206911.33 (5.08%)
  max: 4356931.16                max: 4399442.56
  min: 3481619.62                min: 3548372.44
  ==Mixed workload               ==Mixed workload
  records: 50                    records: 50
  avg:  149925.73                avg:  149675.54
  std:    7701.26 (5.14%)        std:    6902.09 (4.61%)
  max:  191301.56                max:  175162.05
  min:  133566.28                min:  137762.87
  ==Random write                 ==Random write
  records: 50                    records: 50
  avg:  404050.11                avg:  393021.47
  std:   58887.57 (14.57%)       std:   42813.70 (10.89%)
  max:  601798.09                max:  524533.43
  min:  325176.99                min:  313255.34
  ==Pwrite                       ==Pwrite
  records: 50                    records: 50
  avg:  411217.70                avg:  411237.96
  std:   43114.99 (10.48%)       std:   33136.29 (8.06%)
  max:  530766.79                max:  471899.76
  min:  320786.84                min:  317906.94
  ==Pread                        ==Pread
  records: 50                    records: 50
  avg: 4154908.65                avg: 4087121.92
  std:  151272.08 (3.64%)        std:  219505.04 (5.37%)
  max: 4459478.12                max: 4435857.38
  min: 3730512.41                min: 3101101.67

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: remove unnecessary free

Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity")
introduced pending zram slot free in zram's write path in case of
missing slot free by memory allocation failure in zram_slot_free_notify
but it is not necessary because we have already freed the slot right
before overwriting.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: delay pending free request in read path

Sergey reported we don't need to handle pending free request every I/O
so that this patch removes it in read path while we remain it in write
path.

Let's consider below example.

Swap subsystem ask to zram "A" block free by swap_slot_free_notify but
zram had been pended it without real freeing. Swap subsystem allocates
"A" block for new data but request pended for a long time just handled
and zram blindly free new data on the "A" block. :(

That's why we couldn't remove handle pending free request right before
zram-write.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: fix race between reset and flushing pending work

Dan and Sergey reported that there is a racy between reset and flushing
of pending work so that it could make oops by freeing zram->meta in
reset while zram_slot_free can access zram->meta if new request is
adding during the race window.

This patch moves flush after taking init_lock so it prevents new request
so that it closes the race.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zsmalloc: add maintainers

tAdd adds maintainer information for zsmalloc into the MAINTAINERS file.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: add zram maintainers

Add maintainer information for zram into the MAINTAINERS file.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zsmalloc: add copyright

Add my copyright to the zsmalloc source code which I maintain.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: add copyright

Add my copyright to the zram source code which I maintain.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: remove old private project comment

Remove the old private compcache project address so upcoming patches
should be sent to LKML because we Linux kernel community will take care.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: promote zram from staging

Zram has lived in staging for a LONG LONG time and have been
fixed/improved by many contributors so code is clean and stable now.  Of
course, there are lots of product using zram in real practice.

The major TV companys have used zram as swap since two years ago and
recently our production team released android smart phone with zram
which is used as swap, too and recently Android Kitkat start to use zram
for small memory smart phone.  And there was a report Google released
their ChromeOS with zram, too and cyanogenmod have been used zram long
time ago.  And I heard some disto have used zram block device for tmpfs.
In addition, I saw many report from many other peoples.  For example,
Lubuntu start to use it.

The benefit of zram is very clear.  With my experience, one of the
benefit was to remove jitter of video application with backgroud memory
pressure.  It would be effect of efficient memory usage by compression
but more issue is whether swap is there or not in the system.  Recent
mobile platforms have used JAVA so there are many anonymous pages.  But
embedded system normally are reluctant to use eMMC or SDCard as swap
because there is wear-leveling and latency issues so if we do not use
swap, it means we can't reclaim anoymous pages and at last, we could
encounter OOM kill.  :(

Although we have real storage as swap, it was a problem, too.  Because
it sometime ends up making system very unresponsible caused by slow swap
storage performance.

Quote from Luigi on Google
"Since Chrome OS was mentioned: the main reason why we don't use swap
  to a disk (rotating or SSD) is because it doesn't degrade gracefully
  and leads to a bad interactive experience.  Generally we prefer to
  manage RAM at a higher level, by transparently killing and restarting
  processes.  But we noticed that zram is fast enough to be competitive
  with the latter, and it lets us make more efficient use of the
  available RAM.  " and he announced.
http://www.spinics.net/lists/linux-mm/msg57717.html

Other uses case is to use zram for block device.  Zram is block device
so anyone can format the block device and mount on it so some guys on
the internet start zram as /var/tmp.
http://forums.gentoo.org/viewtopic-t-838198-start-0.html

Let's promote zram and enhance/maintain it instead of removing.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zsmalloc: move it under mm

This patch moves zsmalloc under mm directory.

Before that, description will explain why we have needed custom
allocator.

Zsmalloc is a new slab-based memory allocator for storing compressed
pages.  It is designed for low fragmentation and high allocation success
rate on large object, but <= PAGE_SIZE allocations.

zsmalloc differs from the kernel slab allocator in two primary ways to
achieve these design goals.

zsmalloc never requires high order page allocations to back slabs, or
"size classes" in zsmalloc terms.  Instead it allows multiple
single-order pages to be stitched together into a "zspage" which backs
the slab.  This allows for higher allocation success rate under memory
pressure.

Also, zsmalloc allows objects to span page boundaries within the zspage.
This allows for lower fragmentation than could be had with the kernel
slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE.  With the
kernel slab allocator, if a page compresses to 60% of it original size,
the memory savings gained through compression is lost in fragmentation
because another object of the same size can't be stored in the leftover
space.

This ability to span pages results in zsmalloc allocations not being
directly addressable by the user.  The user is given an
non-dereferencable handle in response to an allocation request.  That
handle must be mapped, using zs_map_object(), which returns a pointer to
the mapped region that can be used.  The mapping is necessary since the
object data may reside in two different noncontigious pages.

The zsmalloc fulfills the allocation needs for zram perfectly

[sjenning@linux.vnet.ibm.com: borrow Seth's quote]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/smp.c: remove cpumask_ipi

After commit 9a46ad6d6df3 ("smp: make smp_call_function_many() use logic
similar to smp_call_function_single()"), cfd->cpumask is accessed only
in smp_call_function_many(). So there is no more need to copy it into
cfd->cpumask_ipi before putting csd into the list. The cpumask_ipi
field is obsolete and can be removed.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Wang YanQing <udknight@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel: use lockless list for smp_call_function_single

Make smp_call_function_single and friends more efficient by using a
lockless list.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/net/phy/mdio_bus.c: call put_device on device_register() failure

It is required to call put_device() if device_register() fails, so that
we give up the last reference to the device. Calling put_device allows
for mdiobus_release to be executed, kfreeing the bus.

Signed-off-by: Levente Kurusa <levex@linux.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: David Daney <david.daney@cavium.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/video/backlight/lcd.c: call put_device if device_register fails

Currently we kfree the container of the device which failed to register.
This is wrong as the last reference is not given up with a put_device
call. Also, now that we have put_device() callen, we no longer need the
kfree as the new_ld->dev.release function will take care of kfreeing the
associated memory.

Signed-off-by: Levente Kurusa <levex@linux.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memblock, bootmem: restore goal for alloc_low

Now we have memblock_virt_alloc_low to replace original bootmem api in
swiotlb.

But we should not use BOOTMEM_LOW_LIMIT for arch that does not support
CONFIG_NOBOOTMEM, as old api take 0.

| #define alloc_bootmem_low(x) \
| __alloc_bootmem_low(x, SMP_CACHE_BYTES, 0)
|#define alloc_bootmem_low_pages_nopanic(x) \
| __alloc_bootmem_low_nopanic(x, PAGE_SIZE, 0)

and we have
#define BOOTMEM_LOW_LIMIT __pa(MAX_DMA_ADDRESS)
for CONFIG_NOBOOTMEM.

Restore goal to 0 to fix ia64 crash, that Tony found.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Tony Luck <tony.luck@gmail.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-3.14/drivers' of git://git.kernel.dk/linux-block

Pull block IO driver changes from Jens Axboe:

- bcache update from Kent Overstreet.

- two bcache fixes from Nicholas Swenson.

- cciss pci init error fix from Andrew.

- underflow fix in the parallel IDE pg_write code from Dan Carpenter.
   I'm sure the 1 (or 0) users of that are now happy.

- two PCI related fixes for sx8 from Jingoo Han.

- floppy init fix for first block read from Jiri Kosina.

- pktcdvd error return miss fix from Julia Lawall.

- removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from
   Michael Opdenacker.

- comment typo fix for the loop driver from Olaf Hering.

- potential oops fix for null_blk from Raghavendra K T.

- two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing
   an OOM problem and a problem with handling security locked conditions

* 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
  mg_disk: Spelling s/finised/finished/
  null_blk: Null pointer deference problem in alloc_page_buffers
  mtip32xx: Correctly handle security locked condition
  mtip32xx: Make SGL container per-command to eliminate high order dma allocation
  drivers/block/loop.c: fix comment typo in loop_config_discard
  drivers/block/cciss.c:cciss_init_one(): use proper errnos
  drivers/block/paride/pg.c: underflow bug in pg_write()
  drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
  drivers/block/sx8.c: use module_pci_driver()
  floppy: bail out in open() if drive is not responding to block0 read
  bcache: Fix auxiliary search trees for key size > cacheline size
  bcache: Don't return -EINTR when insert finished
  bcache: Improve bucket_prio() calculation
  bcache: Add bch_bkey_equal_header()
  bcache: update bch_bkey_try_merge
  bcache: Move insert_fixup() to btree_keys_ops
  bcache: Convert sorting to btree_keys
  bcache: Convert debug code to btree_keys
  bcache: Convert btree_iter to struct btree_keys
  bcache: Refactor bset_tree sysfs stats
  ...

Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block

Pull core block IO changes from Jens Axboe:
"The major piece in here is the immutable bio_ve series from Kent, the
  rest is fairly minor.  It was supposed to go in last round, but
  various issues pushed it to this release instead.  The pull request
  contains:

   - Various smaller blk-mq fixes from different folks.  Nothing major
     here, just minor fixes and cleanups.

   - Fix for a memory leak in the error path in the block ioctl code
     from Christian Engelmayer.

   - Header export fix from CaiZhiyong.

   - Finally the immutable biovec changes from Kent Overstreet.  This
     enables some nice future work on making arbitrarily sized bios
     possible, and splitting more efficient.  Related fixes to immutable
     bio_vecs:

        - dm-cache immutable fixup from Mike Snitzer.
        - btrfs immutable fixup from Muthu Kumar.

  - bio-integrity fix from Nic Bellinger, which is also going to stable"

* 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
  xtensa: fixup simdisk driver to work with immutable bio_vecs
  block/blk-mq-cpu.c: use hotcpu_notifier()
  blk-mq: for_each_* macro correctness
  block: Fix memory leak in rw_copy_check_uvector() handling
  bio-integrity: Fix bio_integrity_verify segment start bug
  block: remove unrelated header files and export symbol
  blk-mq: uses page->list incorrectly
  blk-mq: use __smp_call_function_single directly
  btrfs: fix missing increment of bi_remaining
  Revert "block: Warn and free bio if bi_end_io is not set"
  block: Warn and free bio if bi_end_io is not set
  blk-mq: fix initializing request's start time
  block: blk-mq: don't export blk_mq_free_queue()
  block: blk-mq: make blk_sync_queue support mq
  block: blk-mq: support draining mq queue
  dm cache: increment bi_remaining when bi_end_io is restored
  block: fixup for generic bio chaining
  block: Really silence spurious compiler warnings
  block: Silence spurious compiler warnings
  block: Kill bio_pair_split()
  ...

Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
- Handle some loose ends from the vfs read delegation support.
   (For example nfsd can stop breaking leases on its own in a
    fewer places where it can now depend on the vfs to.)
- Make life a little easier for NFSv4-only configurations
   (thanks to Kinglong Mee).
- Fix some gss-proxy problems (thanks Jeff Layton).
- miscellaneous bug fixes and cleanup

* 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
  nfsd: consider CLAIM_FH when handing out delegation
  nfsd4: fix delegation-unlink/rename race
  nfsd4: delay setting current_fh in open
  nfsd4: minor nfs4_setlease cleanup
  gss_krb5: use lcm from kernel lib
  nfsd4: decrease nfsd4_encode_fattr stack usage
  nfsd: fix encode_entryplus_baggage stack usage
  nfsd4: simplify xdr encoding of nfsv4 names
  nfsd4: encode_rdattr_error cleanup
  nfsd4: nfsd4_encode_fattr cleanup
  minor svcauth_gss.c cleanup
  nfsd4: better VERIFY comment
  nfsd4: break only delegations when appropriate
  NFSD: Fix a memory leak in nfsd4_create_session
  sunrpc: get rid of use_gssp_lock
  sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
  sunrpc: don't wait for write before allowing reads from use-gss-proxy file
  nfsd: get rid of unused function definition
  Define op_iattr for nfsd4_open instead using macro
  NFSD: fix compile warning without CONFIG_NFSD_V3
  ...

ipmi: Add missing rv in ipmi_parisc_probe()

Fix

  drivers/char/ipmi/ipmi_si_intf.c: In function 'ipmi_parisc_probe':
  drivers/char/ipmi/ipmi_si_intf.c:2752:2: error: 'rv' undeclared (first use in this function)
  drivers/char/ipmi/ipmi_si_intf.c:2752:2: note: each undeclared identifier is reported only once for each function it appears in

Introduced by commit d02b3709ff8e ("ipmi: Cleanup error return")

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nfs: fix xattr inode op pointers when disabled

Chris Mason reported a NULL pointer derefernence in generic_getxattr()
that was due to sb->s_xattr being NULL.

The reason is that the nfs #ifdef's for ACL support were misplaced, and
the nfs3 inode operations had the xattr operation pointers set up, even
though xattrs were not actually supported. As a result, the xattr code
was being called without the infrastructure having been set up.

Move the #ifdef's appropriately.

Reported-and-tested-by: Chris Mason <clm@fb.com>
Acked-by: Al Viro viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
"Been a bit busy, first week of kids school, and waiting on other trees
  to go in before I could send this, so its a bit later than I'd
  normally like.

  Highlights:
   - core:
      timestamp fixes, lots of misc cleanups
   - new drivers:
      bochs virtual vga
   - vmwgfx:
      major overhaul for their nextgen virt gpu.
   - i915:
      runtime D3 on HSW, watermark fixes, power well work, fbc fixes,
      bdw is no longer prelim.
   - nouveau:
      gk110/208 acceleration, more pm groundwork, old overlay support
   - radeon:
      dpm rework and clockgating for CIK, pci config reset, big endian
      fixes
   - tegra:
      panel support and DSI support, build as module, prime.
   - armada, omap, gma500, rcar, exynos, mgag200, cirrus, ast:
      fixes
   - msm:
      hdmi support for mdp5"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (595 commits)
  drm/nouveau: resume display if any later suspend bits fail
  drm/nouveau: fix lock unbalance in nouveau_crtc_page_flip
  drm/nouveau: implement hooks for needed for drm vblank timestamping support
  drm/nouveau/disp: add a method to fetch info needed by drm vblank timestamping
  drm/nv50: fill in crtc mode struct members from crtc_mode_fixup
  drm/radeon/dce8: workaround for atom BlankCrtc table
  drm/radeon/DCE4+: clear bios scratch dpms bit (v2)
  drm/radeon: set si_notify_smc_display_change properly
  drm/radeon: fix DAC interrupt handling on DCE5+
  drm/radeon: clean up active vram sizing
  drm/radeon: skip async dma init on r6xx
  drm/radeon/runpm: don't runtime suspend non-PX cards
  drm/radeon: add ring to fence trace functions
  drm/radeon: add missing trace point
  drm/radeon: fix VMID use tracking
  drm: ast,cirrus,mgag200: use drm_can_sleep
  drm/gma500: Lock struct_mutex around cursor updates
  drm/i915: Fix the offset issue for the stolen GEM objects
  DRM: armada: fix missing DRM_KMS_FB_HELPER select
  drm/i915: Decouple GPU error reporting from ring initialisation
  ...

Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma

Pull slave-dma updates from Vinod Koul:
- new driver for BCM2835 used in R-pi
- new driver for MOXA ART
- dma_get_any_slave_channel API for DT based systems
- minor fixes and updates spread acrooss driver

[ The fsl-ssi dual fifo mode support addition clashed badly with the
  other changes to fsl-ssi that came in through the sound merge.  I did
  a very rough cut at fixing up the conflict, but Nicolin Chen (author
  of both sides) will need to verify and check things ]

* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (36 commits)
  dmaengine: mmp_pdma: fix mismerge
  dma: pl08x: Export pl08x_filter_id
  acpi-dma: align documentation with kernel-doc format
  dma: fix vchan_cookie_complete() debug print
  DMA: dmatest: extend the "device" module parameter to 32 characters
  drivers/dma: fix error return code
  dma: omap: Set debug level to debugging messages
  dmaengine: fix kernel-doc style typos for few comments
  dma: tegra: add support for Tegra148/124
  dma: dw: use %pad instead of casting dma_addr_t
  dma: dw: join split up messages
  dma: dw: fix style of multiline comment
  dmaengine: k3dma: fix sparse warnings
  dma: pl330: Use dma_get_slave_channel() in the of xlate callback
  dma: pl330: Differentiate between submitted and issued descriptors
  dmaengine: sirf: Add device_slave_caps interface
  DMA: Freescale: change BWC from 256 bytes to 1024 bytes
  dmaengine: Add MOXA ART DMA engine driver
  dmaengine: Add DMA_PRIVATE to BCM2835 driver
  dma: imx-sdma: Assign a default script number for ROM firmware cases
  ...

Merge tag 'for-linus' of git://git./linux/kernel/git/olof/chrome-platform

Pull chrome platform cleanups and improvements from Olof Johansson:
- Use deferred probing on Chrome OS platforms for the i2c device
   registration.  This fixes a long-standing race of initialization of
   touchpad/screen on Chromebooks.
- Added in platform device registration for pstore console on supported
   hardware
- Misc smaller fixes (__initdata, module exit cleanup, etc)

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
  platform/chrome: unregister platform driver/device when module exit
  platform/chrome: Make i2c_adapter_names static
  platform/chrome: chromeos_laptop - fix incorrect placement of __initdata tag
  platform/chrome: chromeos_laptop - Use deferred probing
  platform/chrome: chromeos_laptop - Restructure device associations
  platform/chrome: Add pstore platform_device

Merge tag 'iommu-updates-v3.14' of git://git./linux/kernel/git/joro/iommu

Pull IOMMU Updates from Joerg Roedel:
"A few patches have been queued up for this merge window:

   - improvements for the ARM-SMMU driver (IOMMU_EXEC support, IOMMU
     group support)
   - updates and fixes for the shmobile IOMMU driver
   - various fixes to generic IOMMU code and the Intel IOMMU driver
   - some cleanups in IOMMU drivers (dev_is_pci() usage)"

* tag 'iommu-updates-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (36 commits)
  iommu/vt-d: Fix signedness bug in alloc_irte()
  iommu/vt-d: free all resources if failed to initialize DMARs
  iommu/vt-d, trivial: clean sparse warnings
  iommu/vt-d: fix wrong return value of dmar_table_init()
  iommu/vt-d: release invalidation queue when destroying IOMMU unit
  iommu/vt-d: fix access after free issue in function free_dmar_iommu()
  iommu/vt-d: keep shared resources when failed to initialize iommu devices
  iommu/vt-d: fix invalid memory access when freeing DMAR irq
  iommu/vt-d, trivial: simplify code with existing macros
  iommu/vt-d, trivial: use defined macro instead of hardcoding
  iommu/vt-d: mark internal functions as static
  iommu/vt-d, trivial: clean up unused code
  iommu/vt-d, trivial: check suitable flag in function detect_intel_iommu()
  iommu/vt-d, trivial: print correct domain id of static identity domain
  iommu/vt-d, trivial: refine support of 64bit guest address
  iommu/vt-d: fix resource leakage on error recovery path in iommu_init_domains()
  iommu/vt-d: fix a race window in allocating domain ID for virtual machines
  iommu/vt-d: fix PCI device reference leakage on error recovery path
  drm/msm: Fix link error with !MSM_IOMMU
  iommu/vt-d: use dedicated bitmap to track remapping entry allocation status
  ...

Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:
- new driver for bcm281xx watchdog device
- new driver for gpio based watchdog devices
- remove DEFINE_PCI_DEVICE_TABLE macro for watchdog device drivers
- conversion of davinci_wdt and mpc8xxx_wdt to watchdog core
- improvements on davinci_wdt, at91/dt, at91sam9_wdt and s3c2410_wdt
- Auto-detect IO address and expand supported chips on w836* super-I/O
   chipsets
- core: Make dt "timeout-sec" property work on drivers w/out min/max
- fix Kconfig dependencies
- sirf: Remove redundant of_match_ptr helper
- mach-moxart: add restart handler
- hpwdt patch to display better panic information
- imx2_wdt: disable watchdog timer during low power mode

* git://www.linux-watchdog.org/linux-watchdog: (31 commits)
  watchdog: w83627hf_wdt: Reset watchdog trigger during initialization
  watchdog: w83627hf: Add support for W83697HF and W83697UG
  watchdog: w83627hf: Auto-detect IO address and supported chips
  watchdog: at91sam9_wdt: increase security margin on watchdog counter reset
  watchdog: at91sam9_wdt: avoid spurious watchdog reset during init
  watchdog: at91sam9_wdt: fix secs_to_ticks
  ARM: at91/dt: add watchdog properties to kizbox board
  ARM: at91/dt: add sam9 watchdog default options to SoCs
  watchdog: at91sam9_wdt: update device tree doc
  watchdog: at91sam9_wdt: better watchdog support
  watchdog: sp805_wdt depends also on ARM64
  watchdog: mach-moxart: add restart handler
  watchdog: mpc8xxx_wdt convert to watchdog core
  watchdog: sirf: Remove redundant of_match_ptr helper
  watchdog: hpwdt patch to display informative string
  watchdog: dw_wdt: remove build dependencies
  watchdog: imx2_wdt: disable watchdog timer during low power mode
  watchdog: s3c2410_wdt: Report when the watchdog reset the system
  watchdog: s3c2410_wdt: use syscon regmap interface to configure pmu register
  watchdog: s3c2410_wdt: Handle rounding a little better for timeout
  ...

Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull more i2c updates from Wolfram Sang:
"Mostly bugfixes, small but wanted cleanups, and Paul's init.h removal
  applied"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: rcar: fix NACK error code
  i2c: update i2c_algorithm documentation
  i2c: rcar: use devm_clk_get to ensure clock is properly ref-counted
  i2c: rcar: do not print error if device nacks transfer
  i2c: rely on driver core when sanitizing devices
  i2c: delete non-required instances of include <linux/init.h>
  i2c: acorn: is tristate and should use module.h
  i2c: piix4: Standardize log messages
  i2c: piix4: Use different message for AMD Auxiliary SMBus Controller
  i2c: piix4: Add support for AMD ML and CZ SMBus changes

Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging

Pull hwmon updates from Jean Delvare:
"This include it87 driver improvements, and a tree-wide change of my
  e-mail address"

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  Update Jean Delvare's e-mail address
  hwmon: (it87) Print proper names for the IT8771E and IT8772E
  hwmon: (it87) Add support for the ITE IT8603E

Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86

Pull x86 platform drivers update from Matthew Garrett:
"Nothing amazingly special here.  Some cleanups, a new driver to
  support a single button on some new HPs, a tiny amount of hardware
  enablement"

* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
  ipc: add intel-mid's pci id macros
  hp-wireless: new driver for hp wireless button for Windows 8
  toshiba_acpi: Support RFKILL hotkey scancode
  hp_accel: Add a new PnP ID HPQ6007 for new HP laptops
  sony-laptop: remove unnecessary assigment of len
  fujitsu-laptop: fix error return code
  dell-laptop: Only install the i8042 filter when rfkill is active
  X86 platform: New BayTrail IOSF-SB MBI driver
  drivers: platform: Include appropriate header file in mxm-wmi.c
  drivers: platform: Mark functions as static in hp_accel.c
  dell-laptop: rkill whitelist Precision models
  ipc: simplify platform data approach
  asus-wmi: Convert to use devm_hwmon_device_register_with_groups
  compal-laptop: Use devm_hwmon_device_register_with_groups
  compal-laptop: Replace SENSOR_DEVICE_ATTR with DEVICE_ATTR
  eeepc-laptop: Convert to use devm_hwmon_device_register_with_groups
  compal-laptop: Use devm_kzalloc to allocate local data structure
  dell-laptop: fix to return error code in dell_send_intensity()

Merge tag 'blackfin-for-linus' of git://git./linux/kernel/git/realmz6/blackfin-linux

Pull blackfin updates from Steven Miao:
"Some minor changes and bug fixes"

* tag 'blackfin-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/realmz6/blackfin-linux:
  From: Eunbong Song <eunb.song@samsung.com>
  Add platfrom device resource for bfin-sport on bf533 stamp
  fix build error for bf527-ezkit_defconfig for old silicon
  blackfin: Support L1 SRAM parity checking feature on bf60x
  blackfin: bf609: update the anomaly list to Nov 2013
  blackfin: delete non-required instances of <linux/init.h>
  From: Paul Walmsley <pwalmsley@nvidia.com>
  06/18] smp, blackfin: kill SMP single function call interrupt
  arch: blackfin: uapi: be sure of "_UAPI" prefix for all guard macros

Merge branch 'x86-intel-mid-for-linus' of git://git./linux/kernel/git/tip/tip

Pull intel MID cleanups from Peter Anvin:
"Miscellaneous cleanups to the intel-mid code merged earlier in this
  merge window"

* 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, intel-mid: Cleanup some platform code's header files
  x86, intel-mid: Add missing 'void' to functions without arguments
  x86: Don't add new __cpuinit users to Merrifield platform code
  x86: Don't introduce more __cpuinit users in intel_mid_weak_decls.h

Merge branch 'x86-x32-for-linus' of git://git./linux/kernel/git/tip/tip

Pull more x32 uabi type fixes from Peter Anvin:
"Despite the branch name, **most of these changes are to generic
  code**.  They change types so that they make an increasing amount of
  the exported uapi kernel headers usable for libc.

  The ARM64 people are also interested in these changes for their ILP32
  ABI"

* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  uapi: Use __kernel_long_t in struct mq_attr
  uapi: Use __kernel_ulong_t in shmid64_ds/shminfo64/shm_info
  x86, uapi, x32: Use __kernel_ulong_t in x86 struct semid64_ds
  uapi: Use __kernel_ulong_t in struct msqid64_ds
  uapi: Use __kernel_long_t in struct msgbuf
  uapi, asm-generic: Use __kernel_ulong_t in uapi struct ipc64_perm
  uapi: Use __kernel_long_t/__kernel_ulong_t in <linux/resource.h>
  uapi: Use __kernel_long_t in struct timex

Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull more ARM updates from Russell King:
"Some further changes for this merge window:
   - fix bug building with gcc 4.6.4 and EABI.
   - fix pgtbl macro with some LPAE configurations
   - fix initrd override - FDT was overriding the command line, and it
     should be the other way around.
   - fix byteswap of instructions in undefined instruction handler
   - add basic support for SolidRun Hummingboard and Cubox-i boards"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: fix building with gcc 4.6.4
  ARM: 7941/2: Fix incorrect FDT initrd parameter override
  ARM: 7947/1: Make pgtbl macro more robust
  ARM: 7946/1: asm: __und_usr_thumb need byteswap instructions in BE case
  ARM: 7930/1: Introduce atomic MMIO modify
  ARM: imx: initial SolidRun Cubox-i support
  ARM: imx: initial SolidRun HummingBoard support

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Several fixups, of note:

  1) Fix unlock of not held spinlock in RXRPC code, from Alexey
     Khoroshilov.

  2) Call pci_disable_device() from the correct shutdown path in bnx2x
     driver, from Yuval Mintz.

  3) Fix qeth build on s390 for some configurations, from Eugene
     Crosser.

  4) Cure locking bugs in bond_loadbalance_arp_mon(), from Ding
     Tianhong.

  5) Must do netif_napi_add() before registering netdevice in sky2
     driver, from Stanislaw Gruszka.

  6) Fix lost bug fix during merge due to code movement in ieee802154,
     noticed and fixed by the eagle eyed Stephen Rothwell.

  7) Get rid of resource leak in xen-netfront driver, from Annie Li.

  8) Bounds checks in qlcnic driver are off by one, from Manish Chopra.

  9) TPROXY can leak sockets when TCP early demux is enabled, fix from
     Holger Eitzenberger"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits)
  qeth: fix build of s390 allmodconfig
  bonding: fix locking in bond_loadbalance_arp_mon()
  tun: add device name(iff) field to proc fdinfo entry
  DT: net: davinci_emac: "ti, davinci-no-bd-ram" property is actually optional
  DT: net: davinci_emac: "ti, davinci-rmii-en" property is actually optional
  bnx2x: Fix generic option settings
  net: Fix warning on make htmldocs caused by skbuff.c
  llc: remove noisy WARN from llc_mac_hdr_init
  qlcnic: Fix loopback test failure
  qlcnic: Fix tx timeout.
  qlcnic: Fix initialization of vlan list.
  qlcnic: Correct off-by-one errors in bounds checks
  net: Document promote_secondaries
  net: gre: use icmp_hdr() to get inner ip header
  i40e: Add missing braces to i40e_dcb_need_reconfig()
  xen-netfront: fix resource leak in netfront
  net: 6lowpan: fixup for code movement
  hyperv: Add support for physically discontinuous receive buffer
  sky2: initialize napi before registering device
  net: Fix memory leak if TPROXY used with TCP early demux
  ...

Merge git://git./linux/kernel/git/davem/sparc

Pull sparc update from David Miller:
"Two cleanups from Paul Gortmaker and hook up the new scheduler system
  calls"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Hook up sched_setattr and sched_getattr syscalls.
  sparc: don't use module_init in non-modular pci.c code
  sparc: delete non-required instances of include <linux/init.h>

Merge git://git./linux/kernel/git/davem/ide

Pull IDE fixes from David Miller:
"Two header file inclusion fixes from Rashika Kheria"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
drivers: ide: Include appropriate header file in ide-pio-blacklist.c
drivers: ide: Include appropriate header file in ide-cd_verbose.c

Merge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next

more fixes for nouveau.

* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau: resume display if any later suspend bits fail
  drm/nouveau: fix lock unbalance in nouveau_crtc_page_flip
  drm/nouveau: implement hooks for needed for drm vblank timestamping support
  drm/nouveau/disp: add a method to fetch info needed by drm vblank timestamping
  drm/nv50: fill in crtc mode struct members from crtc_mode_fixup

Merge branch 'drm-next-3.14' of git://people.freedesktop.org/~agd5f/linux into drm-next

more radeon fixes

* 'drm-next-3.14' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon/dce8: workaround for atom BlankCrtc table
  drm/radeon/DCE4+: clear bios scratch dpms bit (v2)
  drm/radeon: set si_notify_smc_display_change properly
  drm/radeon: fix DAC interrupt handling on DCE5+
  drm/radeon: clean up active vram sizing
  drm/radeon: skip async dma init on r6xx
  drm/radeon/runpm: don't runtime suspend non-PX cards
  drm/radeon: add ring to fence trace functions
  drm/radeon: add missing trace point
  drm/radeon: fix VMID use tracking

Merge branch 'akpm' (patches from Andrew Morton)

Merge random fixes from Andrew Morton:
"Random fixes.

  I have one batch remaining for -rc1, mainly zram changes which await a
  merge of Jens's trees"

* emailed patches fron Andrew Morton akpm@linux-foundation.org>:
  MAINTAINERS: ADI Linux development mailing lists: change to the new server
  Documentation: fix multiple typo occurences s/KenelVersion/KernelVersion/
  dma-debug: fix overlap detection
  memblock: add limit checking to memblock_virt_alloc
  mm/readahead.c: fix do_readahead() for no readpage(s)
  mm/slub.c: do not VM_BUG_ON_PAGE() for temporary on-stack pages
  slab: fix wrong retval on kmem_cache_create_memcg error path
  s390/compat: change parameter types from unsigned long to compat_ulong_t
  fs/compat: fix lookup_dcookie() parameter handling
  fs/compat: fix parameter handling for compat readv/writev syscalls
  mm/mempolicy.c: convert to pr_foo()
  mm: numa: initialise numa balancing after jump label initialisation
  mm/page-writeback.c: do not count anon pages as dirtyable memory
  mm/page-writeback.c: fix dirty_balance_reserve subtraction from dirtyable memory
  mm: document improved handling of swappiness==0
  lib/genalloc.c: add check gen_pool_dma_alloc() if dma pointer is not NULL

MAINTAINERS: ADI Linux development mailing lists: change to the new server

Update Blackfin arch maintainer's email as well.

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Cc: Michael Hennerich <michael.hennerich@analog.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation: fix multiple typo occurences s/KenelVersion/KernelVersion/

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Greg KH <greg@kroah.com>
Acked-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dma-debug: fix overlap detection

Commit 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()") was
reworked to expand the overlap counter to the full range expressable by
3 tag bits, but it has a thinko in treating the overlap counter as a
pure reference count for the entry.

Instead of deleting when the reference-count drops to zero, we need to
delete when the overlap-count drops below zero. Also, when detecting
overflow we can just test the overlap-count > MAX rather than applying
special meaning to 0.

Regression report available here:
http://marc.info/?l=linux-netdev&m=139073373932386&w=2

This patch, now tested on the original net_dma case, sees the expected
handful of reports before the eventual data corruption occurs.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memblock: add limit checking to memblock_virt_alloc

In original bootmem wrapper for memblock, we have limit checking.

Add it to memblock_virt_alloc, to address arm and x86 booting crash.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Reported-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Reported-by: Olof Johansson <olof@lixom.net>
Tested-by: Olof Johansson <olof@lixom.net>
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: "Strashko, Grygorii" <grygorii.strashko@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/readahead.c: fix do_readahead() for no readpage(s)

Commit 63d0f0a3c7e1 ("mm/readahead.c:do_readhead(): don't check for
->readpage") unintentionally made do_readahead return 0 for all valid
files regardless of whether readahead was supported, rather than the
expected -EINVAL.  This gets forwarded on to userspace, and results in
sys_readahead appearing to succeed in cases that don't make sense (e.g.
when called on pipes or sockets).  This issue is detected by the LTP
readahead01 testcase.

As the exact return value of force_page_cache_readahead is currently
never used, we can simplify it to return only 0 or -EINVAL (when
readpage or readpages is missing).  With that in place we can simply
forward on the return value of force_page_cache_readahead in
do_readahead.

This patch performs said change, restoring the expected semantics.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/slub.c: do not VM_BUG_ON_PAGE() for temporary on-stack pages

Commit 309381feaee5 ("mm: dump page when hitting a VM_BUG_ON using
VM_BUG_ON_PAGE") added a bunch of VM_BUG_ON_PAGE() calls.

But, most of the ones in the slub code are for _temporary_ 'struct
page's which are declared on the stack and likely have lots of gunk in
them. Dumping their contents out will just confuse folks looking at
bad_page() output. Plus, if we try to page_to_pfn() on them or
soemthing, we'll probably oops anyway.

Turn them back in to VM_BUG_ON()s.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

slab: fix wrong retval on kmem_cache_create_memcg error path

On kmem_cache_create_memcg() error path we set 'err', but leave 's' (the
new cache ptr) undefined.  The latter can be NULL if we could not
allocate the cache, or pointing to a freed area if we failed somewhere
later while trying to initialize it.  Initially we checked 'err'
immediately before exiting the function and returned NULL if it was set
ignoring the value of 's':

    out_unlock:
        ...
        if (err) {
            /* report error */
            return NULL;
        }
        return s;

Recently this check was, in fact, broken by commit f717eb3abb5e ("slab:
do not panic if we fail to create memcg cache"), which turned it to:

    out_unlock:
        ...
        if (err && !memcg) {
            /* report error */
            return NULL;
        }
        return s;

As a result, if we are failing creating a cache for a memcg, we will
skip the check and return 's' that can contain crap.  Obviously, commit
f717eb3abb5e intended not to return crap on error allocating a cache for
a memcg, but only to remove the error reporting in this case, so the
check should look like this:

    out_unlock:
        ...
        if (err) {
            if (!memcg)
                return NULL;
            /* report error */
            return NULL;
        }
        return s;

[rientjes@google.com: despaghettification]
[vdavydov@parallels.com: patch monkeying]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

s390/compat: change parameter types from unsigned long to compat_ulong_t

Change parameter types of s390's compat ipc syscall from unsigned long
to compat_ulong_t to enforce zero extension of these parameters.

This is not really a bug, since s390_ipc compat syscall is only a
wrapper to the generic compat_sys_ipc() syscall, which performs correct
zero and sign extension.

This was introduced with commit 56e41d3c5aa8 ("merge compat sys_ipc
instances").

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/compat: fix lookup_dcookie() parameter handling

Commit d5dc77bfeeab ("consolidate compat lookup_dcookie()") coverted all
architectures to the new compat_sys_lookup_dcookie() syscall.

The "len" paramater of the new compat syscall must have the type
compat_size_t in order to enforce zero extension for architectures where
the ABI requires that the caller of a function performed zero and/or
sign extension to 64 bit of all parameters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org> [v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/compat: fix parameter handling for compat readv/writev syscalls

We got a report that the pwritev syscall does not work correctly in
compat mode on s390.

It turned out that with commit 72ec35163f9f ("switch compat readv/writev
variants to COMPAT_SYSCALL_DEFINE") we lost the zero extension of a
couple of syscall parameters because the some parameter types haven't
been converted from unsigned long to compat_ulong_t.

This is needed for architectures where the ABI requires that the caller
of a function performed zero and/or sign extension to 64 bit of all
parameters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org> [v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/mempolicy.c: convert to pr_foo()

A few printk(KERN_*'s have snuck in there.

Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: initialise numa balancing after jump label initialisation

The command line parsing takes place before jump labels are initialised
which generates a warning if numa_balancing= is specified and
CONFIG_JUMP_LABEL is set.

On older kernels before commit c4b2c0c5f647 ("static_key: WARN on usage
before jump_label_init was called") the kernel would have crashed. This
patch enables automatic numa balancing later in the initialisation
process if numa_balancing= is specified.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/page-writeback.c: do not count anon pages as dirtyable memory

The VM is currently heavily tuned to avoid swapping.  Whether that is
good or bad is a separate discussion, but as long as the VM won't swap
to make room for dirty cache, we can not consider anonymous pages when
calculating the amount of dirtyable memory, the baseline to which
dirty_background_ratio and dirty_ratio are applied.

A simple workload that occupies a significant size (40+%, depending on
memory layout, storage speeds etc.) of memory with anon/tmpfs pages and
uses the remainder for a streaming writer demonstrates this problem.  In
that case, the actual cache pages are a small fraction of what is
considered dirtyable overall, which results in an relatively large
portion of the cache pages to be dirtied.  As kswapd starts rotating
these, random tasks enter direct reclaim and stall on IO.

Only consider free pages and file pages dirtyable.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Tested-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/page-writeback.c: fix dirty_balance_reserve subtraction from dirtyable memory

Tejun reported stuttering and latency spikes on a system where random
tasks would enter direct reclaim and get stuck on dirty pages.  Around
50% of memory was occupied by tmpfs backed by an SSD, and another disk
(rotating) was reading and writing at max speed to shrink a partition.

: The problem was pretty ridiculous.  It's a 8gig machine w/ one ssd and 10k
: rpm harddrive and I could reliably reproduce constant stuttering every
: several seconds for as long as buffered IO was going on on the hard drive
: either with tmpfs occupying somewhere above 4gig or a test program which
: allocates about the same amount of anon memory.  Although swap usage was
: zero, turning off swap also made the problem go away too.
:
: The trigger conditions seem quite plausible - high anon memory usage w/
: heavy buffered IO and swap configured - and it's highly likely that this
: is happening in the wild too.  (this can happen with copying large files
: to usb sticks too, right?)

This patch (of 2):

The dirty_balance_reserve is an approximation of the fraction of free
pages that the page allocator does not make available for page cache
allocations.  As a result, it has to be taken into account when
calculating the amount of "dirtyable memory", the baseline to which
dirty_background_ratio and dirty_ratio are applied.

However, currently the reserve is subtracted from the sum of free and
reclaimable pages, which is non-sensical and leads to erroneous results
when the system is dominated by unreclaimable pages and the
dirty_balance_reserve is bigger than free+reclaimable.  In that case, at
least the already allocated cache should be considered dirtyable.

Fix the calculation by subtracting the reserve from the amount of free
pages, then adding the reclaimable pages on top.

[akpm@linux-foundation.org: fix CONFIG_HIGHMEM build]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Tested-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: document improved handling of swappiness==0

Prior to commit fe35004fbf9e ("mm: avoid swapping out with
swappiness==0") setting swappiness to 0, reclaim code could still evict
recently used user anonymous memory to swap even though there is a
significant amount of RAM used for page cache.

The behaviour of setting swappiness to 0 has since changed. When set,
the reclaim code does not initiate swap until the amount of free pages
and file-backed pages, is less than the high water mark in a zone.

Let's update the documentation to reflect this.

[akpm@linux-foundation.org: remove comma, per Randy]
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Bryn M. Reeves <bmr@redhat.com>
Cc: Satoru Moriya <satoru.moriya@hds.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/genalloc.c: add check gen_pool_dma_alloc() if dma pointer is not NULL

In the gen_pool_dma_alloc() the dma pointer can be NULL and while
assigning gen_pool_virt_to_phys(pool, vaddr) to dma caused the following
crash on da850 evm:

   Unable to handle kernel NULL pointer dereference at virtual address 00000000
   Internal error: Oops: 805 [#1] PREEMPT ARM
   Modules linked in:
   CPU: 0 PID: 1 Comm: swapper Tainted: G        W    3.13.0-rc1-00001-g0609e45-dirty #5
   task: c4830000 ti: c4832000 task.ti: c4832000
   PC is at gen_pool_dma_alloc+0x30/0x3c
   LR is at gen_pool_virt_to_phys+0x74/0x80
   Process swapper, call trace:
     gen_pool_dma_alloc+0x30/0x3c
     davinci_pm_probe+0x40/0xa8
     platform_drv_probe+0x1c/0x4c
     driver_probe_device+0x98/0x22c
     __driver_attach+0x8c/0x90
     bus_for_each_dev+0x6c/0x8c
     bus_add_driver+0x124/0x1d4
     driver_register+0x78/0xf8
     platform_driver_probe+0x20/0xa4
     davinci_init_late+0xc/0x14
     init_machine_late+0x1c/0x28
     do_one_initcall+0x34/0x15c
     kernel_init_freeable+0xe4/0x1ac
     kernel_init+0x8/0xec

This patch fixes the above.

[akpm@linux-foundation.org: update kerneldoc]
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Nicolin Chen <b42378@freescale.com>
Cc: Joe Perches <joe@perches.com>
Cc: Sachin Kamat <sachin.kamat@linaro.org>
Cc: <stable@vger.kernel.org> [3.13.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull fanotify use-after-free fixes from Jan Kara:
"Three fixes for the fanotify use after free problems guys were
  reporting.

  I have ended up with different lifetime rules for struct
  fanotify_event_info depending on whether it is for permission event or
  normal event which isn't ideal.  My plan is to split these into two
  different structures (as permission events need larger struct anyway)
  which will make the rules trivial again.  But that can wait for later
  I guess (but I can add the patch to the pile if you want), now I
  wanted to make -rc1 boot for these guys"

[ "These guys" being Jiri Kosina and Dave Jones that reported the slab
  corruption issues due to incorrect object lifetimes ]

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: Fix use after free for permission events
  fsnotify: Do not return merged event from fsnotify_add_notify_event()
  fanotify: Fix use after free in mask checking

ceph: fix posix ACL hooks

The merge of commit 7221fe4c2ed7 ("ceph: add acl for cephfs") raced with
upstream changes in the generic POSIX ACL code (eg commit 2aeccbe957d0
"fs: add generic xattr_acl handlers" and others).

Some of the fallout was fixed in commit 4db658ea0ca ("ceph: Fix up after
semantic merge conflict"), but it was incomplete: the set_acl
inode_operation wasn't getting set, and the prototype needed to be
adjusted a bit (it doesn't take a dentry anymore).

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/nouveau: resume display if any later suspend bits fail

If either idling channels or suspending the fence were to fail, the
display would never be resumed. Also if a client fails, resume the fence
(not functionally important, but it would potentially leak memory).

See https://bugs.freedesktop.org/show_bug.cgi?id=70213

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix lock unbalance in nouveau_crtc_page_flip

Fixes a regression introduced by d5c1e84b3a130f0
"drm/nouveau: hold mutex while syncing to kernel channel".

Cc: stable@vger.kernel.org #3.13
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: implement hooks for needed for drm vblank timestamping support

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau/disp: add a method to fetch info needed by drm vblank timestamping

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nv50: fill in crtc mode struct members from crtc_mode_fixup

The DRM uses the adjusted mode to calculate constants for vblank
timestamping. Our encoder mode_fixup (usually) replaces this data
with our backend mode information, which doesn't have the needed
data filled in already.

Reported-by: Mario Kleiner mario.kleiner.de@gmail.com
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/radeon/dce8: workaround for atom BlankCrtc table

Some DCE8 boards have a funky BlankCrtc table that results
in a timeout when trying to blank the display. The
timeout is harmless (all operations needed from the table
are complete), but wastes time and is confusing to users so
work around it.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73420

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/radeon/DCE4+: clear bios scratch dpms bit (v2)

The BlankCrtc table in some DCE8 boards has some
logic shortcuts for the vbios when this bit is set.
Clear it for driver use.

v2: fix typo

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73420

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/radeon: set si_notify_smc_display_change properly

This is effectively a revert of 4573388c92ee60b4ed72b8d95b73df861189988c.

Forcing a display active when there is none causes problems with
dpm on some SI boards which results in improperly initialized
dpm state and boot failures on some boards. As for the bug commit
4573388c92ee tried to address, one can manually force the state to
high for better performance when using the card as a headless compute
node until a better fix is developed.

bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=73788
https://bugs.freedesktop.org/show_bug.cgi?id=69395

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
cc: stable@vger.kernel.org

drm/radeon: fix DAC interrupt handling on DCE5+

DCE5 and newer hardware only has 1 DAC. Use the correct
offset. This may fix display problems on certain board
configurations.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/radeon: clean up active vram sizing

If we are not able to properly initialize one of the gpu
engines for buffer paging, we limit vram to the size of
the cpu visible aperture. We generally either use the gfx
or dma engine to do this. Clean up the size limiting code
to only adjust the size based on what ring is selected
for buffer paging rather than making assumptions about which
engine is selected for paging.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

drm/radeon: skip async dma init on r6xx

The hw is buggy and it's not currently used, but it's
currently still initialized by the driver. Skip the init.
Skipping init also seems to improve stability with dpm on
some r6xx asics.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=66963

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

drm/radeon/runpm: don't runtime suspend non-PX cards

Prevent runtime suspend of non-PX GPUs. Runtime suspend is
not what we want in those cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/radeon: add ring to fence trace functions

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add missing trace point

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: fix VMID use tracking

Otherwise we allocate a new VMID on nearly every submit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Update Jean Delvare's e-mail address

Signed-off-by: Jean Delvare <khali@linux-fr.org>

hwmon: (it87) Print proper names for the IT8771E and IT8772E

The driver prints IT8771F and IT8772F instead of IT8771E and IT8772E
respectively when the driver is loaded. This is a cosmetic only bug
but let's fix it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>

hwmon: (it87) Add support for the ITE IT8603E

Add support for IT8603E.

This closes bug #57861:
https://bugzilla.kernel.org/show_bug.cgi?id=57861

[JD: Fixes and clean-ups.]

Signed-off-by: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

xtensa: fixup simdisk driver to work with immutable bio_vecs

Geert reported:

arch/xtensa/platforms/iss/simdisk.c:108:23: error: 'struct bio' has no member named 'bi_sector'
arch/xtensa/platforms/iss/simdisk.c:110:2: error: incompatible types when assigning to type 'int' from type 'struct bvec_iter'
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:110:2: error: request for member 'bv_len' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_size' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_bvec_done' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_idx' in something not a structure or union
arch/xtensa/platforms/iss/simdisk.c:111:18: error: request for member 'bi_bvec_done' in something not a structure or union
make[2]: *** [arch/xtensa/platforms/iss/simdisk.o] Error 1

Fixup the usage of bio_for_each_segment(). Also fix wrong use
of __bio_kunmap_atomic() - it needs the mapped buffer passed in,
not the originally mapped page.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fanotify: Fix use after free for permission events

Currently struct fanotify_event_info has been destroyed immediately
after reporting its contents to userspace. However that is wrong for
permission events because those need to stay around until userspace
provides response which is filled back in fanotify_event_info. So change
to code to free permission events only after we have got the response
from userspace.

Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Reported-and-tested-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Jan Kara <jack@suse.cz>

fsnotify: Do not return merged event from fsnotify_add_notify_event()

The event returned from fsnotify_add_notify_event() cannot ever be used
safely as the event may be freed by the time the function returns (after
dropping notification_mutex). So change the prototype to just return
whether the event was added or merged into some existing event.

Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Reported-and-tested-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Jan Kara <jack@suse.cz>

fanotify: Fix use after free in mask checking

We cannot use the event structure returned from
fsnotify_add_notify_event() because that event can be freed by the time
that function returns. Use the mask argument passed into the event
handler directly instead. This also fixes a possible problem when we
could unnecessarily wait for permission response for a normal fanotify
event which got merged with a permission event.

We also disallow merging of permission event with any other event so
that we know the permission event which we just created is the one on
which we should wait for permission response.

Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Reported-and-tested-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Jan Kara <jack@suse.cz>

dmaengine: mmp_pdma: fix mismerge

The merge between 2b7f65b11d87f "mmp_pdma: Style neatening" and
8010dad55a0ab0 "dma: add dma_get_any_slave_channel(), for use in of_xlate()"
caused a build error by leaving obsolete code in place:

  mmp_pdma.c: In function 'mmp_pdma_dma_xlate':
  mmp_pdma.c:909:31: error: 'candidate' undeclared
  mmp_pdma.c:912:3: error: label 'retry' used but not defined
  mmp_pdma.c:901:24: warning: unused variable 'c' [-Wunused-variable]

This removes the extraneous lines.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

sparc: Hook up sched_setattr and sched_getattr syscalls.

Signed-off-by: David S. Miller <davem@davemloft.net>

qeth: fix build of s390 allmodconfig

commit 949efd1c "qeth: bridgeport support - basic control" broke
s390 allmodconfig. This patch fixes this by eliminating one of the
cross-module calls, and by making two other calls via function
pointers in the qeth_discipline structure.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: fix locking in bond_loadbalance_arp_mon()

The commit 1d3ee88ae0d605629bf369
(bonding: add netlink attributes to slave link dev)
has add rtmsg_ifinfo() in bond_set_active_slave() and
bond_set_backup_slave(), so the two function need to
called in RTNL lock, but bond_loadbalance_arp_mon()
only calling these functions in RCU, warning message
will occurs.

fix this by add a new function bond_slave_state_change(),
which will reset the slave's state after slave link check,
so remove the bond_set_xxx_slave() from the cycle and only
record the slave_state_changed, this will call the new
function to set all slaves to new state in RTNL later.

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: add device name(iff) field to proc fdinfo entry

A file descriptor opened for /dev/net/tun and a tun device are
connected with ioctl.  Though understanding the connection is
important for trouble shooting, no way is given to a user to know
the connected device for a given file descriptor at userland.

This patch adds a new fdinfo field for the device name connected to
a file descriptor opened for /dev/net/tun.

Here is an example of the field:

    # lsof | grep tun
    qemu-syst 4565         qemu   25u      CHR             10,200       0t138      12921 /dev/net/tun
    ...

    # cat /proc/4565/fdinfo/25
    pos: 138
    flags: 0104002
    iff: vnet0

    # ip link show dev vnet0
    8: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...

changelog:

    v2: indent iff just like the other fdinfo fields are.
    v3: remove unused variable.
        Both are suggested by David Miller <davem@davemloft.net>.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'DT'

Sergei Shtylyov says:

====================
DT: net: davinci_emac: couple more properties actually optional

Though described as required, couple more properties in the DaVinci EMAC
binding are actually optional, as the driver will happily function without them.
The patchset is against DaveM's 'net.git' tree this time.

[1/2] DT: net: davinci_emac: "ti,davinci-rmii-en" property is actually optional
[2/2] DT: net: davinci_emac: "ti,davinci-no-bd-ram" property is actually optional
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

DT: net: davinci_emac: "ti, davinci-no-bd-ram" property is actually optional

The "ti,davinci-no-bd-ram" property for the DaVinci EMAC binding simply can't be
required one, as it's boolean (which means it's absent if false).

While at it, document the property better...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

DT: net: davinci_emac: "ti, davinci-rmii-en" property is actually optional

Though described as required, the "ti,davinci-rmii-en" property for the DaVinci
EMAC binding seems actually optional, as the driver should happily work without
it; the property is not specified either in the example device node or in the
actual EMAC device node for DA850 device tree, only AM3517 one.

While at it, document the property better...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc: don't use module_init in non-modular pci.c code

The pci.o is built for SPARC64_PCI -- which is bool, and hence
this code is either present or absent.  It will never be modular,
so using module_init as an alias for __initcall can be somewhat
misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future.  If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.

Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups.  As __initcall gets
mapped onto device_initcall, our use of device_initcall
directly in this change means that the runtime impact is
zero -- it will remain at level 6 in initcall ordering.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc: delete non-required instances of include <linux/init.h>

None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: ide: Include appropriate header file in ide-pio-blacklist.c

Include appropriate header file include/linux/ide.h in file
ide-pio-blacklist.c because function ide_scan_pio_blacklist() has it's
prototype declaration in include/linux/ide.h.

This eliminates the following warning in ide-pio-blacklist.c:
drivers/ide/ide-pio-blacklist.c:85:5: warning: no previous prototype for ‘ide_scan_pio_blacklist’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: ide: Include appropriate header file in ide-cd_verbose.c

Include appropriate header file ide-cd.h in ide-cd_verbose.c because
function ide_cd_log_error() has its prototype declaration in ide-cd.h.
Also, include linux/ide.h because it contains certain declarations
necessary for including ide-cd.h.

This eliminates the following warnings in ide-cd_verbose.c:
drivers/ide/ide-cd_verbose.c:251:6: warning: no previous prototype for ‘ide_cd_log_error’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

From: Eunbong Song <eunb.song@samsung.com>

This patch removes CONFIG_MTD_PARTITIONS in config files for blackfin.
Because CONFIG_MTD_PARTITIONS was removed by commit
6a8a98b22b10f1560d5f90aded4a54234b9b2724.

Signed-off-by: Eunbong Song <eunb.song@samsung.com>
Signed-off-by: Steven Miao <realmz6@gmail.com>

Add platfrom device resource for bfin-sport on bf533 stamp

Signed-off-by: Aaron Wu <Aaron.wu@analog.com>
Signed-off-by: Steven Miao <realmz6@gmail.com>