review.tizen.org Git - kernel/kernel-generic.git/log

frontswap: fix incorrect zeroing and allocation size for frontswap_map

The bitmap accessed by bitops must have enough size to hold the required
numbers of bits rounded up to a multiple of BITS_PER_LONG.  And the
bitmap must not be zeroed by memset() if the number of bits cleared is
not a multiple of BITS_PER_LONG.

This fixes incorrect zeroing and allocation size for frontswap_map.  The
incorrect zeroing part doesn't cause any problem because frontswap_map
is freed just after zeroing.  But the wrongly calculated allocation size
may cause the problem.

For 32bit systems, the allocation size of frontswap_map is about twice
as large as required size.  For 64bit systems, the allocation size is
smaller than requeired if the number of bits is not a multiple of
BITS_PER_LONG.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()

audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect
the rule itself freed in kill_rules().

The reason is when it is killed, the 'rule' itself may have already
released, we should not access it.  one example: we add a rule to an
inode, just at the same time the other task is deleting this inode.

The work flow for adding a rule:

    audit_receive() -> (need audit_cmd_mutex lock)
      audit_receive_skb() ->
        audit_receive_msg() ->
          audit_receive_filter() ->
            audit_add_rule() ->
              audit_add_tree_rule() -> (need audit_filter_mutex lock)
                ...
                unlock audit_filter_mutex
                get_tree()
                ...
                iterate_mounts() -> (iterate all related inodes)
                  tag_mount() ->
                    tag_trunk() ->
                      create_trunk() -> (assume it is 1st rule)
                        fsnotify_add_mark() ->
                          fsnotify_add_inode_mark() ->  (add mark to inode->i_fsnotify_marks)
                        ...
                        get_tree(); (each inode will get one)
                ...
                lock audit_filter_mutex

The work flow for deleting an inode:

    __destroy_inode() ->
     fsnotify_inode_delete() ->
       __fsnotify_inode_delete() ->
        fsnotify_clear_marks_by_inode() ->  (get mark from inode->i_fsnotify_marks)
          fsnotify_destroy_mark() ->
           fsnotify_destroy_mark_locked() ->
             audit_tree_freeing_mark() ->
               evict_chunk() ->
                 ...
                 tree->goner = 1
                 ...
                 kill_rules() ->   (assume current->audit_context == NULL)
                   call_rcu() ->   (rule->tree != NULL)
                     audit_free_rule_rcu() ->
                       audit_free_rule()
                 ...
                 audit_schedule_prune() ->  (assume current->audit_context == NULL)
                   kthread_run() ->    (need audit_cmd_mutex and audit_filter_mutex lock)
                     prune_one() ->    (delete it from prue_list)
                       put_tree(); (match the original get_tree above)

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: migration: add migrate_entry_wait_huge()

When we have a page fault for the address which is backed by a hugepage
under migration, the kernel can't wait correctly and do busy looping on
hugepage fault until the migration finishes. As a result, users who try
to kick hugepage migration (via soft offlining, for example) occasionally
experience long delay or soft lockup.

This is because pte_offset_map_lock() can't get a correct migration entry
or a correct page table lock for hugepage. This patch introduces
migration_entry_wait_huge() to solve this.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [2.6.35+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: add missing lockres put in dlm_mig_lockres_handler

dlm_mig_lockres_handler() is missing a dlm_lockres_put() on an error path.

Signed-off-by: joyce <xuejiufei@huawei.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/page_alloc.c: fix watermark check in __zone_watermark_ok()

The watermark check consists of two sub-checks.  The first one is:

if (free_pages <= min + lowmem_reserve)
return false;

The check assures that there is minimal amount of RAM in the zone.  If
CMA is used then the free_pages is reduced by the number of free pages
in CMA prior to the over-mentioned check.

if (!(alloc_flags & ALLOC_CMA))
free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);

This prevents the zone from being drained from pages available for
non-movable allocations.

The second check prevents the zone from getting too fragmented.

for (o = 0; o < order; o++) {
free_pages -= z->free_area[o].nr_free << o;
min >>= 1;
if (free_pages <= min)
return false;
}

The field z->free_area[o].nr_free is equal to the number of free pages
including free CMA pages.  Therefore the CMA pages are subtracted twice.
This may cause a false positive fail of __zone_watermark_ok() if the CMA
area gets strongly fragmented.  In such a case there are many 0-order
free pages located in CMA.  Those pages are subtracted twice therefore
they will quickly drain free_pages during the check against
fragmentation.  The test fails even though there are many free non-cma
pages in the zone.

This patch fixes this issue by subtracting CMA pages only for a purpose of
(free_pages <= min + lowmem_reserve) check.

Laura said:

  We were observing allocation failures of higher order pages (order 5 =
  128K typically) under tight memory conditions resulting in driver
  failure.  The output from the page allocation failure showed plenty of
  free pages of the appropriate order/type/zone and mostly CMA pages in
  the lower orders.

  For full disclosure, we still observed some page allocation failures
  even after applying the patch but the number was drastically reduced and
  those failures were attributed to fragmentation/other system issues.

Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: <stable@vger.kernel.org> [3.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/misc/sgi-gru/grufile.c: fix info leak in gru_get_config_info()

The "info.fill" array isn't initialized so it can leak uninitialized stack
information to user space.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Robin Holt <holt@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

aio: fix io_destroy() regression by using call_rcu()

There was a regression introduced by 36f5588905c1 ("aio: refcounting
cleanup"), reported by Jens Axboe - the refcounting cleanup switched to
using RCU in the shutdown path, but the synchronize_rcu() was done in
the context of the io_destroy() syscall greatly increasing the time it
could block.

This patch switches it to call_rcu() and makes shutdown asynchronous
(more asynchronous than it was originally; before the refcount changes
io_destroy() would still wait on pending kiocbs).

Note that there's a global quota on the max outstanding kiocbs, and that
quota must be manipulated synchronously; otherwise io_setup() could
return -EAGAIN when there isn't quota available, and userspace won't
have any way of waiting until shutdown of the old kioctxs has finished
(besides busy looping).

So we release our quota before kioctx shutdown has finished, which
should be fine since the quota never corresponded to anything real
anyways.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Tested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc-at91rm9200: use shadow IMR on at91sam9x5

Add support for the at91sam9x5-family which must use the shadow
interrupt mask due to a hardware issue (causing RTC_IMR to always be
zero).

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc-at91rm9200: add shadow interrupt mask

Add shadow interrupt-mask register which can be used on SoCs where the
actual hardware register is broken.

Note that some care needs to be taken to make sure the shadow mask
corresponds to the actual hardware state. The added overhead is not an
issue for the non-broken SoCs due to the relatively infrequent
interrupt-mask updates. We do, however, only use the shadow mask value
as a fall-back when it actually needed as there is still a theoretical
possibility that the mask is incorrect (see the code for details).

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc-at91rm9200: refactor interrupt-register handling

Add accessors for the interrupt register.

This will allow us to easily add a shadow interrupt-mask register to use
on SoCs where the interrupt-mask register cannot be used.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc-at91rm9200: add configuration support

Add configuration support which can be used to implement SoC-specific
workarounds for broken hardware.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc-at91rm9200: add match-table compile guard

The members of Atmel's at91sam9x5 family (9x5) have a broken RTC
interrupt mask register (AT91_RTC_IMR).  It does not reflect enabled
interrupts but instead always returns zero.

The kernel's rtc-at91rm9200 driver handles the RTC for the 9x5 family.
Currently when the date/time is set, an interrupt is generated and this
driver neglects to handle the interrupt.  The kernel complains about the
un-handled interrupt and disables it henceforth.  This not only breaks
the RTC function, but since that interrupt is shared (Atmel's SYS
interrupt) then other things break as well (e.g.  the debug port no
longer accepts characters).

Tested on the at91sam9g25.  Bug confirmed by Atmel.

This patch (of 5):

Add missing match-table compile guard.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: Robert Nelson <Robert.Nelson@digikey.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/ocfs2/namei.c: remove unecessary ERROR when removing non-empty directory

While removing a non-empty directory, the kernel dumps a message:

(rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39

Suppress the error message from being printed in the dmesg so users
don't panic.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion

read_swap_cache_async() can race against get_swap_page(), and stumble
across a SWAP_HAS_CACHE entry in the swap map whose page wasn't brought
into the swapcache yet.

This transient swap_map state is expected to be transitory, but the
actual placement of discard at scan_swap_map() inserts a wait for I/O
completion thus making the thread at read_swap_cache_async() to loop
around its -EEXIST case, while the other end at get_swap_page() is
scheduled away at scan_swap_map(). This can leave the system deadlocked
if the I/O completion happens to be waiting on the CPU waitqueue where
read_swap_cache_async() is busy looping and !CONFIG_PREEMPT.

This patch introduces a cond_resched() call to make the aforementioned
read_swap_cache_async() busy loop condition to bail out when necessary,
thus avoiding the subtle race window.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-twl.c: fix missing device_init_wakeup() when booted with device tree

When booted in legacy mode device_init_wakeup() gets called by
drivers/mfd/twl-core.c when the children are initialized. However, when
booted using device tree, the children are created with
of_platform_populate() instead add_children().

This means that the RTC driver will not have device_init_wakeup() set,
and we need to call it from the driver probe like RTC drivers typically
do.

Without this we cannot test PM wake-up events on omaps for cases where
there may not be any physical wake-up event.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Reported-by: Kevin Hilman <khilman@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Jingoo Han <jg1.han@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cciss: fix broken mutex usage in ioctl

If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked
(as is normal with the Array Configuration Utility) the process will
hang as below.  It attempts to acquire the same mutex twice, once in
do_ioctl() and once in cciss_unlocked_open().  The BKL was recursive,
the mutex isn't.

  Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
  [...]
  acu             D 0000000000000001     0  3246   3191 0x00000080
  Call Trace:
    schedule+0x29/0x70
    schedule_preempt_disabled+0xe/0x10
    __mutex_lock_slowpath+0x17b/0x220
    mutex_lock+0x2b/0x50
    cciss_unlocked_open+0x2f/0x110 [cciss]
    __blkdev_get+0xd3/0x470
    blkdev_get+0x5c/0x1e0
    register_disk+0x182/0x1a0
    add_disk+0x17c/0x310
    cciss_add_disk+0x13a/0x170 [cciss]
    cciss_update_drive_info+0x39b/0x480 [cciss]
    rebuild_lun_table+0x258/0x370 [cciss]
    cciss_ioctl+0x34f/0x470 [cciss]
    do_ioctl+0x49/0x70 [cciss]
    __blkdev_driver_ioctl+0x28/0x30
    blkdev_ioctl+0x200/0x7b0
    block_ioctl+0x3c/0x40
    do_vfs_ioctl+0x89/0x350
    SyS_ioctl+0xa1/0xb0
    system_call_fastpath+0x16/0x1b

This mutex usage was added into the ioctl path when the big kernel lock
was removed.  As it turns out, these paths are all thread safe anyway
(or can easily be made so) and we don't want ioctl() to be single
threaded in any case.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mike Miller <mike.miller@hp.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE

audit_log_start() does wait_for_auditd() in a loop until
audit_backlog_wait_time passes or audit_skb_queue has a room.

If signal_pending() is true this becomes a busy-wait loop, schedule() in
TASK_INTERRUPTIBLE won't block.

Thanks to Guy for fully investigating and explaining the problem.

(akpm: that'll cause the system to lock up on a non-preemptible
uniprocessor kernel)

(Guy: "Our customer was in fact running a uniprocessor machine, and they
reported a system hang.")

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Guy Streeter <streeter@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-cmos.c: fix accidentally enabling rtc channel

During resume, we call hpet_rtc_timer_init after masking an irq bit in
hpet. This will cause the call to hpet_disable_rtc_channel to be undone
if RTC_AIE is the only bit not masked.

Allowing the cmos interrupt handler to run before resuming caused some
issues where the timer for the alarm was not removed. This would cause
other, later timers to not be cleared, so utilities such as hwclock
would time out when waiting for the update interrupt.

[akpm@linux-foundation.org: coding-style tweak]
Signed-off-by: Derek Basehore <dbasehore@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-tps6586x.c: device wakeup flags correction

Use device_init_wakeup() instead of device_set_wakeup_capable() and move
it before rtc dev registering. This fixes alarmtimer not registered
when tps6586x rtc is the only wakeup compatible rtc in the system.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: don't initialize kmem-cache destroying work for root caches

struct memcg_cache_params has a union.  Different parts of this union
are used for root and non-root caches.  A part with destroying work is
used only for non-root caches.

  BUG: unable to handle kernel paging request at 0000000fffffffe0
  IP: kmem_cache_alloc+0x41/0x1f0
  Modules linked in: netlink_diag af_packet_diag udp_diag tcp_diag inet_diag unix_diag ip6table_filter ip6_tables i2c_piix4 virtio_net virtio_balloon microcode i2c_core pcspkr floppy
  CPU: 0 PID: 1929 Comm: lt-vzctl Tainted: G      D      3.10.0-rc1+ #2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  RIP: kmem_cache_alloc+0x41/0x1f0
  Call Trace:
   getname_flags.part.34+0x30/0x140
   getname+0x38/0x60
   do_sys_open+0xc5/0x1e0
   SyS_open+0x22/0x30
   system_call_fastpath+0x16/0x1b
  Code: f4 53 48 83 ec 18 8b 05 8e 53 b7 00 4c 8b 4d 08 21 f0 a8 10 74 0d 4c 89 4d c0 e8 1b 76 4a 00 4c 8b 4d c0 e9 92 00 00 00 4d 89 f5 <4d> 8b 45 00 65 4c 03 04 25 48 cd 00 00 49 8b 50 08 4d 8b 38 49
  RIP  [<ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Li Zefan <lizefan@huawei.com>
Cc: <stable@vger.kernel.org> [3.9.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: ocfs2_prep_new_orphaned_file() should return ret

If an error occurs, for example an EIO in __ocfs2_prepare_orphan_dir,
ocfs2_prep_new_orphaned_file will release the inode_ac, then when the
caller of ocfs2_prep_new_orphaned_file gets a 0 return, it will refer to
a NULL ocfs2_alloc_context struct in the following functions. A kernel
panic happens.

Signed-off-by: "Xiaowei.Hu" <xiaowei.hu@oracle.com>
Reviewed-by: shencanquan <shencanquan@huawei.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/mpi/mpicoder.c: looping issue, need stop when equal to zero, found by 'EXTRA_FLAGS=-W'.

For 'while' looping, need stop when 'nbytes == 0', or will cause issue.
('nbytes' is size_t which is always bigger or equal than zero).

The related warning: (with EXTRA_CFLAGS=-W)

lib/mpi/mpicoder.c:40:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kmsg: honor dmesg_restrict sysctl on /dev/kmsg

The dmesg_restrict sysctl currently covers the syslog method for access
dmesg, however /dev/kmsg isn't covered by the same protections.  Most
people haven't noticed because util-linux dmesg(1) defaults to using the
syslog method for access in older versions.  With util-linux dmesg(1)
defaults to reading directly from /dev/kmsg.

To fix /dev/kmsg, let's compare the existing interfaces and what they
allow:

- /proc/kmsg allows:
  - open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
    single-reader interface (SYSLOG_ACTION_READ).
  - everything, after an open.

- syslog syscall allows:
  - anything, if CAP_SYSLOG.
  - SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
    dmesg_restrict==0.
  - nothing else (EPERM).

The use-cases were:
- dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
- sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
   destructive SYSLOG_ACTION_READs.

AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
clear the ring buffer.

Based on the comments in devkmsg_llseek, it sounds like actions besides
reading aren't going to be supported by /dev/kmsg (i.e.
SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
syslog syscall actions.

To this end, move the check as Josh had done, but also rename the
constants to reflect their new uses (SYSLOG_FROM_CALL becomes
SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
allows destructive actions after a capabilities-constrained
SYSLOG_ACTION_OPEN check.

- /dev/kmsg allows:
  - open if CAP_SYSLOG or dmesg_restrict==0
  - reading/polling, after open

Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192

[akpm@linux-foundation.org: use pr_warn_once()]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Josh Boyer <jwboyer@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

reboot: rigrate shutdown/reboot to boot cpu

We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus.  The slowdown was tracked to commit
f96972f2dc63 ("kernel/sys.c: call disable_nonboot_cpus() in
kernel_restart()").

The current implementation does all the work of hot removing the cpus
before halting the system.  We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter
for specifying the reboot cpu.  Note, this code was shamelessly copied
from arch/x86/kernel/reboot.c with bits removed pertaining to the
reboot_cpu command line parameter.

Signed-off-by: Robin Holt <holt@sgi.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

CPU hotplug: provide a generic helper to disable/enable CPU hotplug

There are instances in the kernel where we would like to disable CPU
hotplug (from sysfs) during some important operation. Today the freezer
code depends on this and the code to do it was kinda tailor-made for
that.

Restructure the code and make it generic enough to be useful for other
usecases too.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme

Pull NVMe fixes from Matthew Wilcox.

* 'fixes-3.10' of git://git.infradead.org/users/willy/linux-nvme:
  NVMe: Add MSI support
  NVMe: Use dma_set_mask() correctly
  Return the result from user admin command IOCTL even in case of failure
  NVMe: Do not cancel command multiple times
  NVMe: fix error return code in nvme_submit_bio_queue()
  NVMe: check for integer overflow in nvme_map_user_pages()
  MAINTAINERS: update NVM EXPRESS DRIVER file list
  NVMe: Fix a signedness bug in nvme_trans_modesel_get_mp
  NVMe: Remove redundant version.h header include

Merge branch 'fixes' of git://git./virt/kvm/kvm

Pull kvm bugfixes from Gleb Natapov:
"There is one more fix for MIPS KVM ABI here, MIPS and PPC build
  breakage fixes and a couple of PPC bug fixes"

* 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()
  kvm/ppc/booke: Hold srcu lock when calling gfn functions
  kvm/ppc/booke64: Disable e6500 support
  kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage
  mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG
  kvm: Add definition of KVM_REG_MIPS
  KVM: add kvm_para_available to asm-generic/kvm_para.h

kvm/ppc/booke64: Fix lazy ee handling in kvmppc_handle_exit()

EE is hard-disabled on entry to kvmppc_handle_exit(), so call
hard_irq_disable() so that PACA_IRQ_HARD_DIS is set, and soft_enabled
is unset.

Without this, we get warnings such as arch/powerpc/kernel/time.c:300,
and sometimes host kernel hangs.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

kvm/ppc/booke: Hold srcu lock when calling gfn functions

KVM core expects arch code to acquire the srcu lock when calling
gfn_to_memslot and similar functions.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

kvm/ppc/booke64: Disable e6500 support

The previous patch made 64-bit booke KVM build again, but Altivec
support is still not complete, and we can't prevent the guest from
turning on Altivec (which can corrupt host state until state
save/restore is implemented). Disable e6500 on KVM until this is
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

kvm/ppc/booke64: Fix AltiVec interrupt numbers and build breakage

Interrupt numbers defined for Book3E follows IVORs definition. Align
BOOKE_INTERRUPT_ALTIVEC_UNAVAIL and BOOKE_INTERRUPT_ALTIVEC_ASSIST to this
rule which also fixes the build breakage.
IVORs 32 and 33 are shared so reflect this in the interrupts naming.

This fixes a build break for 64-bit booke KVM.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

mips/kvm: Use KVM_REG_MIPS and proper size indicators for *_ONE_REG

The API requires that the GET_ONE_REG and SET_ONE_REG ioctls have this
extra information encoded in the register identifiers.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

kvm: Add definition of KVM_REG_MIPS

We use 0x7000000000000000ULL as 0x6000000000000000ULL is reserved for
ARM64.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

Fix lockup related to stop_machine being stuck in __do_softirq.

The stop machine logic can lock up if all but one of the migration
threads make it through the disable-irq step and the one remaining
thread gets stuck in __do_softirq.  The reason __do_softirq can hang is
that it has a bail-out based on jiffies timeout, but in the lockup case,
jiffies itself is not incremented.

To work around this, re-add the max_restart counter in __do_irq and stop
processing irqs after 10 restarts.

Thanks to Tejun Heo and Rusty Russell and others for helping me track
this down.

This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce
latencies").

It may be worth looking into ath9k to see if it has issues with its irq
handler at a later date.

The hang stack traces look something like this:

    ------------[ cut here ]------------
    WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
    Watchdog detected hard LOCKUP on cpu 2
    Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
    Pid: 23, comm: migration/2 Tainted: G         C   3.9.4+ #11
    Call Trace:
     <NMI>   warn_slowpath_common+0x85/0x9f
      warn_slowpath_fmt+0x46/0x48
      watchdog_overflow_callback+0x9c/0xa7
      __perf_event_overflow+0x137/0x1cb
      perf_event_overflow+0x14/0x16
      intel_pmu_handle_irq+0x2dc/0x359
      perf_event_nmi_handler+0x19/0x1b
      nmi_handle+0x7f/0xc2
      do_nmi+0xbc/0x304
      end_repeat_nmi+0x1e/0x2e
     <<EOE>>
      cpu_stopper_thread+0xae/0x162
      smpboot_thread_fn+0x258/0x260
      kthread+0xc7/0xcf
      ret_from_fork+0x7c/0xb0
    ---[ end trace 4947dfa9b0a4cec3 ]---
    BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
    Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
    irq event stamp: 835637905
    hardirqs last  enabled at (835637904): __do_softirq+0x9f/0x257
    hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
    softirqs last  enabled at (5654720): __do_softirq+0x1ff/0x257
    softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
    CPU 1
    Pid: 17, comm: migration/1 Tainted: G        WC   3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
    RIP: tasklet_hi_action+0xf0/0xf0
    Process migration/1
    Call Trace:
     <IRQ>
      __do_softirq+0x117/0x257
      irq_exit+0x5f/0xbb
      smp_apic_timer_interrupt+0x8a/0x98
      apic_timer_interrupt+0x72/0x80
     <EOI>
      printk+0x4d/0x4f
      stop_machine_cpu_stop+0x22c/0x274
      cpu_stopper_thread+0xae/0x162
      smpboot_thread_fn+0x258/0x260
      kthread+0xc7/0xcf
      ret_from_fork+0x7c/0xb0

Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Pekka Riikonen <priikone@iki.fi>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag '9p-3.10-bug-fix-1' of git://git./linux/kernel/git/ericvh/v9fs

Pull net/9p bug fix from Eric Van Hensbergen:
"zero copy error fix"

* tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: Handle error in zero copy request correctly for 9p2000.u

Merge tag 'spi-v3.10-rc4' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A few nasty issues, particularly a race with the interrupt controller
  in the xilinx driver, together with a couple of more minor fixes and a
  much needed move of the mailing list away from sourceforge."

* tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: hspi: fixup long delay time
  spi: spi-xilinx: Remove ISR race condition
  spi: topcliff-pch: fix error return code in pch_spi_probe()
  spi: topcliff-pch: Pass correct pointer to free_irq()
  spi: Move mailing list to vger

Merge tag 'stable/for-linus-3.10-rc5-tag' of git://git./linux/kernel/git/konrad/xen

Pull xen fixes from Konrad Rzeszutek Wilk:
"Two bug-fixes for regressions:
   - xen/tmem stopped working after a certain combination of
     modprobe/swapon was used
   - cpu online/offlining would trigger WARN_ON."

* tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it.
  xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.

Merge tag 'regmap-v3.10-rc4' of git://git./linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
"The biggest fix here is Lars-Peter's fix for custom locking callbacks
  which is pretty localised but important for those devices that use the
  feature.  Otherwise we've got a couple of fairly small cleanups which
  would have been sent sooner were it not for letting Lars-Peter's patch
  soak for a while"

* tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: rbtree: Fixed node range check on sync
  regmap: regcache: Fixup locking for custom lock callbacks
  regmap: debugfs: Check return value of regmap_write()

Merge git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This fixes a build problem in sahara and temporarily disables two new
  optimisations because of performance regressions until a permanent fix
  is ready"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: sahara - fix building as module
  crypto: blowfish - disable AVX2 implementation
  crypto: twofish - disable AVX2 implementation

xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it.

Commit 10a7a0771399a57a297fca9615450dbb3f88081a ("xen: tmem: enable Xen
tmem shim to be built/loaded as a module") allows the tmem module
to be loaded any time. For this work the frontswap API had to
be able to asynchronously to call tmem_frontswap_init before
or after the swap image had been set. That was added in git
commit 905cd0e1bf9ffe82d6906a01fd974ea0f70be97a
("mm: frontswap: lazy initialization to allow tmem backends to build/run as modules").

Which means we could do this (The common case):

modprobe tmem [so calls frontswap_register_ops, no ->init]
modifies tmem_frontswap_poolid = -1
swapon /dev/xvda1 [__frontswap_init, calls -> init, tmem_frontswap_poolid is
< 0 so tmem hypercall done]

Or the failing one:

swapon /dev/xvda1 [calls __frontswap_init, sets the need_init bitmap]
modprobe tmem [calls frontswap_register_ops, -->init calls, finds out
tmem_frontswap_poolid is 0, does not make a hypercall.
Later in the module_init, sets tmem_frontswap_poolid=-1]

Which meant that in the failing case we would not call the hypercall
to initialize the pool and never be able to make any frontswap
backend calls.

Moving the frontswap_register_ops after setting the tmem_frontswap_poolid
fixes it.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>

Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull powerpc fixes from Benjamin Herrenschmidt:
"This is purely regressions (though not all recent ones) or stable
  material"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Partial revert of "Context switch more PMU related SPRs"
  powerpc/perf: Fix deadlock caused by calling printk() in PMU exception
  powerpc/hw_breakpoints: Add DABRX cpu feature to fix 32-bit regression
  powerpc/power8: Update denormalization handler
  powerpc/pseries: Simplify denormalization handler
  powerpc/power8: Fix oprofile and perf
  powerpc/eeh: Don't check RTAS token to get PE addr
  powerpc/pci: Check the bus address instead of resource address in pcibios_fixup_resources

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"The biggest two fixes are fixing a compilation error with the
  decompressor, and a problem with our __my_cpu_offset implementation.

  Other changes are very trivial and small, which seems to be the way
  for most -rc stuff."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7747/1: pcpu: ensure __my_cpu_offset cannot be re-ordered across barrier()
  ARM: 7750/1: update legacy CPU ID in decompressor cache support jump table
  ARM: 7743/1: compressed/head.S: work around new binutils warning
  ARM: 7742/1: topology: export cpu_topology
  ARM: 7737/1: fix kernel decompressor compilation error with CONFIG_DEBUG_SEMIHOSTING

powerpc: Partial revert of "Context switch more PMU related SPRs"

In commit 59affcd I added context switching of more PMU SPRs, because
they are potentially exposed to userspace on Power8. However despite me
being a smart arse in the commit message it's actually not correct. In
particular it interacts badly with a global perf record.

We will have to do something more complicated, but that will have to
wait for 3.11.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/perf: Fix deadlock caused by calling printk() in PMU exception

In commit bc09c21 "Fix finding overflowed PMC in interrupt" we added
a printk() to the PMU exception handler. Unfortunately that is not safe.

The problem is that the PMU exception may run even when interrupts are
soft disabled, aka NMI context. We do this so that we can profile parts
of the kernel that have interrupts soft-disabled.

But by calling printk() from the exception handler, we can potentially
deadlock in the printk code on logbuf_lock, eg:

  [c00000038ba575c0] c000000000081928 .vprintk_emit+0xa8/0x540
  [c00000038ba576a0] c0000000007bcde8 .printk+0x48/0x58
  [c00000038ba57710] c000000000076504 .perf_event_interrupt+0x2d4/0x490
  [c00000038ba57810] c00000000001f6f8 .performance_monitor_exception+0x48/0x60
  [c00000038ba57880] c0000000000032cc performance_monitor_common+0x14c/0x180
  --- Exception: f01 (Performance Monitor) at c0000000007b25d4 ._raw_spin_lock_irq
  +0x64/0xc0
  [c00000038ba57bf0] c00000000007ed90 .devkmsg_read+0xd0/0x5a0
  [c00000038ba57d00] c0000000001c2934 .vfs_read+0xc4/0x1e0
  [c00000038ba57d90] c0000000001c2cd8 .SyS_read+0x58/0xd0
  [c00000038ba57e30] c000000000009d54 syscall_exit+0x0/0x98
  --- Exception: c01 (System Call) at 00001fffffbf6f7c
  SP (3ffff6d4de10) is in userspace

Fix it by making sure we only call printk() when we are not in NMI
context.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/hw_breakpoints: Add DABRX cpu feature to fix 32-bit regression

When introducing support for DABRX in 4474ef0, we broke older 32-bit CPUs
that don't have that register.

Some CPUs have a DABR but not DABRX. Configuration are:
- No 32bit CPUs have DABRX but some have DABR.
- POWER4+ and below have the DABR but no DABRX.
- 970 and POWER5 and above have DABR and DABRX.
- POWER8 has DAWR, hence no DABRX.

This introduces CPU_FTR_DABRX and sets it on appropriate CPUs. We use
the top 64 bits for CPU FTR bits since only 64 bit CPUs have this.

Processors that don't have the DABRX will still work as they will fall
back to software filtering these breakpoints via perf_exclude_event().

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: "Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov>
cc: stable@vger.kernel.org (v3.9 only)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/power8: Update denormalization handler

POWER8 can take a denormalisation exception on any VSX registers.

This does the extra 32 VSX registers we don't currently handle.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pseries: Simplify denormalization handler

The following simplifies the denorm code by using macros to generate the long
stream of almost identical instructions.

This patch results in no changes to the output binary, but removes a lot of
lines of code.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/power8: Fix oprofile and perf

In 2ac6f42 powerpc/cputable: Fix oprofile_cpu_type on power8
we broke all power8 hw events.

This reverts this change and uses oprofile_type instead. Perf now works
on POWER8 again and oprofile will revert to using timers on POWER8.

Kudos to mpe this fix.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Don't check RTAS token to get PE addr

RTAS token "ibm,get-config-addr-info" or ibm,get-config-addr-info2"
are used to retrieve the PE address according to PCI address, which
made up of domain/bus/slot/function. If we don't have those 2 tokens,
the domain/bus/slot/function would be used as the address for EEH
RTAS operations. Some older f/w might not have those 2 tokens and
that blocks the EEH functionality to be initialized. It was introduced
by commit e2af155c ("powerpc/eeh: pseries platform EEH initialization").

The patch skips the check on those 2 tokens so we can bring up EEH
functionality successfully. And domain/bus/slot/function will be
used as address for EEH RTAS operations.

Cc: <stable@vger.kernel.org> # v3.4+
Reported-by: Robert Knight <knight@princeton.edu>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Robert Knight <knight@princeton.edu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pci: Check the bus address instead of resource address in pcibios_fixup_resources

If a BAR has the value of 0, we would assume that it is unset yet and
then mark the resource as unset and would reassign it later. But after
commit 6c5705fe (powerpc/PCI: get rid of device resource fixups)
the pcibios_fixup_resources is invoked after the bus address was
translated to linux resource. So the value of res->start is resource
address. And since the resource and bus address may be different, we
should translate it to the bus address before doing the check.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fix from Guenter Roeck:
"Improve chip detection in ADM1021 driver to avoid misdetections

  This is not a critical patch, but one we'll want to have applied to
  -stable, since the misdetection especially of LM84 has been causing
  trouble for quite some time."

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (adm1021) Strengthen chip detection for ADM1021, LM84 and MAX1617

Linux 3.10-rc5

hpfs: fix warnings when the filesystem fills up

This patch fixes warnings due to missing lock on write error path.

  WARNING: at fs/hpfs/hpfs_fn.h:353 hpfs_truncate+0x75/0x80 [hpfs]()
  Hardware name: empty
  Pid: 26563, comm: dd Tainted: P           O 3.9.4 #12
  Call Trace:
    hpfs_truncate+0x75/0x80 [hpfs]
    hpfs_write_begin+0x84/0x90 [hpfs]
    _hpfs_bmap+0x10/0x10 [hpfs]
    generic_file_buffered_write+0x121/0x2c0
    __generic_file_aio_write+0x1c7/0x3f0
    generic_file_aio_write+0x7c/0x100
    do_sync_write+0x98/0xd0
    hpfs_file_write+0xd/0x50 [hpfs]
    vfs_write+0xa2/0x160
    sys_write+0x51/0xa0
    page_fault+0x22/0x30
    system_call_fastpath+0x1a/0x1f

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@kernel.org # 2.6.39+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:

- Trivial: unused variable removal

- Posix-timers: Add the clock ID to the new proc interface to make it
   useful.  The interface is new and should be functional when we reach
   the final 3.10 release.

- Cure a false positive warning in the tick code introduced by the
   overhaul in 3.10

- Fix for a persistent clock detection regression introduced in this
   cycle

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Correct run-time detection of persistent_clock.
  ntp: Remove unused variable flags in __hardpps
  posix-timers: Show clock ID in proc file
  tick: Cure broadcast false positive pending bit warning

Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux

Pull irqdomain bug fixes from Grant Likely:
"This branch contains a set of straight forward bug fixes to the
  irqdomain code and to a couple of drivers that make use of it."

* tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux:
  irqchip: Return -EPERM for reserved IRQs
  irqdomain: document the simple domain first_irq
  kernel/irq/irqdomain.c: before use 'irq_data', need check it whether valid.
  irqdomain: export irq_domain_add_simple

irqchip: Return -EPERM for reserved IRQs

The irqdomain core will report a log message for any attempted map call
that fails unless the error code is -EPERM. This patch changes the
Versatile irq controller drivers to use -EPERM because it is normal for
a subset of the IRQ inputs to be marked as reserved on the various
Versatile platforms.

Signed-off-by: Grant Likely <grant.likely@linaro.org>

irqdomain: document the simple domain first_irq

The first_irq needs to be zero to get a linear domain and that
comes with special semantics. We want to simplify this going
forward but some documentation never hurts.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>

kernel/irq/irqdomain.c: before use 'irq_data', need check it whether valid.

Since irq_data may be NULL, if so, we WARN_ON(), and continue, 'hwirq'
which related with 'irq_data' has to initialize later, or it will cause
issue.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>

irqdomain: export irq_domain_add_simple

All other irq_domain_add_* functions are exported already, and apparently
this one got left out by mistake, which causes build errors for ARM
allmodconfig kernels:

ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-rcar.ko] undefined!
ERROR: "irq_domain_add_simple" [drivers/gpio/gpio-em.ko] undefined!

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Grant Likely <grant.likely@linaro.org>

Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"Another week, another batch of fixes for arm-soc platforms.

  Nothing controversial here, a handful of fixes for regressions and/or
  serious problems across several of the platforms.  Things are slowing
  down nicely on fix rates for 3.10"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: exynos: add debug_ll_io_init() call in exynos_init_io()
  ARM: EXYNOS: uncompress - print debug messages if DEBUG_LL is defined
  ARM: shmobile: sh73a0: Update CMT clockevent rating to 80
  sh-pfc: r8a7779: Don't group USB OVC and PENC pins
  ARM: mxs: icoll: Fix interrupts gpio bank 0
  ARM: imx: clk-imx6q: AXI clock select index is incorrect
  ARM: bcm2835: override the HW UART periphid
  ARM: mvebu: Fix bug in coherency fabric low level init function
  ARM: Kirkwood: TS219: Fix crash by double PCIe instantiation
  ARM: ux500: Provide supplies for AUX1, AUX2 and AUX3
  ARM: ux500: Only configure wake-up reasons on ux500 based platforms
  ARM: dts: imx: fix clocks for cspi
  ARM i.MX6q: fix for ldb_di_sels

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"MIPS fixes across the field.  The only area that's standing out is the
  exception handling which received it's dose of breakage as part of the
  microMIPS patchset"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: ralink: add missing SZ_1M multiplier
  MIPS: Compat: Fix cputime_to_timeval() arguments in compat binfmt_elf.
  MIPS: OCTEON: Improve _machine_halt implementation.
  MIPS: rtlx: Fix implicit declaration of function set_vi_handler()
  MIPS: Trap exception handling fixes
  MIPS: Quit exposing Kconfig symbols in uapi headers.
  MIPS: Remove duplicate definition of check_for_high_segbits.

Merge branch 'for-linus' of git://git./linux/kernel/git/gerg/m68knommu

Pull m68knommu fix from Greg Ungerer:
"A single fix for compilation breakage to many of the ColdFire CPU
targets"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: only use local gpio_request_one if not using GPIOLIB

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Regression fixers for the big 3:

   - nouveau: hdmi audio, dac load detect, s/r regressions fixed
   - radeon: long standing system hang fixed, hdmi audio and rs780 fast
     fb fixes
   - intel: one old regression, a WARN removal, and a stop X dying fix

  Otherwise one mgag200 fix, a couple of arm build fixes, and a core use
  after free fix."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nv50/kms: use dac loadval from vbios, where it's available
  drm/nv50/disp: force dac power state during load detect
  drm/nv50-nv84/fifo: fix resume regression introduced by playlist race fix
  drm/nv84/disp: Fix HDMI audio regression
  drm/i915/sdvo: Use &intel_sdvo->ddc instead of intel_sdvo->i2c for DDC.
  drm/radeon: don't allow audio on DCE6
  drm/radeon: Use direct mapping for fast fb access on RS780/RS880 (v2)
  radeon: Fix system hang issue when using KMS with older cards
  drm/i915: no lvds quirk for hp t5740
  drm/i915: Quirk the pipe A quirk in the modeset state checker
  drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus
  drm/mgag200: Add missing write to index before accessing data register
  drm/nouveau: use mdelay instead of large udelay constants
  drm/tilcd: select BACKLIGHT_LCD_SUPPORT
  drm: fix a use-after-free when GPU acceleration disabled

Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

Pull slave-dmaengine fixes from Vinod Koul:
"Fix from Andy is for dmatest regression reported by Will and Rabin has
  fixed runtime ref counting for st_dma40"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dmatest: do not allow to interrupt ongoing tests
  dmaengine: ste_dma40: fix pm runtime ref counting

Merge tag 'trace-fixes-v3.10-rc3-v3' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"This contains 4 fixes.

  The first two fix the case where full RCU debugging is enabled,
  enabling function tracing causes a live lock of the system.  This is
  due to the added debug checks in rcu_dereference_raw() that is used by
  the function tracer.  These checks are also traced by the function
  tracer as well as cause enough overhead to the function tracer to slow
  down the system enough that the time to finish an interrupt can take
  longer than when the next interrupt is triggered, causing a live lock
  from the timer interrupt.

  Talking this over with Paul McKenney, we came up with a fix that adds
  a new rcu_dereference_raw_notrace() that does not perform these added
  checks, and let the function tracer use that.

  The third commit fixes a failed compile when branch tracing is
  enabled, due to the conversion of the trace_test_buffer() selftest
  that the branch trace wasn't converted for.

  The forth patch fixes a bug caught by the RCU lockdep code where a
  rcu_read_lock() is performed when rcu is disabled (either going to or
  from idle, or user space).  This happened on the irqsoff tracer as it
  calls task_uid().  The fix here was to use current_uid() when possible
  that doesn't use rcu locking.  Which luckily, is always used when
  irqsoff calls this code."

* tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Use current_uid() for critical time tracing
  tracing: Fix bad parameter passed in branch selftest
  ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends
  rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()

Revert "ACPI / scan: do not match drivers against objects having scan handlers"

Commit 9f29ab11ddbf ("ACPI / scan: do not match drivers against objects
having scan handlers") introduced a boot regression on Tony's ia64 HP
rx2600.  Tony says:

  "It panics with the message:

   Kernel panic - not syncing: Unable to find SBA IOMMU: Try a generic or DIG kernel

   [...] my problem comes from arch/ia64/hp/common/sba_iommu.c
   where the code in sba_init() says:

        acpi_bus_register_driver(&acpi_sba_ioc_driver);
        if (!ioc_list) {

   but because of this change we never managed to call ioc_init()
   so ioc_list doesn't get set up, and we die."

Revert it to avoid this breakage and we'll fix the problem it attempted
to address later.

Reported-by: Tony Luck <tony.luck@gmail.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'mxs-fixes-3.10' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes

From Shawn Guo, mxs fixes for 3.10:

- Since the time we move to MULTI_IRQ_HANDLER, the 0x7f polling for no
  interrupt in icoll_handle_irq() becomes insane, because 0x7f is an
  valid interrupt number, the irq of gpio bank 0.  That unnecessary
  polling results in the driver not detecting when irq 0x7f is active
  which makes the machine effectively dead lock.  The fix removes the
  interrupt poll loop and allows usage of gpio0 interrupt without an
  infinite loop.

* tag 'mxs-fixes-3.10' of git://git.linaro.org/people/shawnguo/linux-2.6:
  ARM: mxs: icoll: Fix interrupts gpio bank 0

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'imx-fixes-3.10-2' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes

From Shawn Guo, imx fixes for 3.10, take 2:

- One device tree fix for all spi node to have per clock added.
  The clock is needed by spi driver to calculate bit rate divisor.
  The spi node in the current device trees either does not have the
  clock or is defined as dummy clock, in which case the driver probe
  will fail or spi will run at a wrong bit rate.

- Two imx6q clock fixes, which correct axi_sels and ldb_di_sels.

* tag 'imx-fixes-3.10-2' of git://git.linaro.org/people/shawnguo/linux-2.6:
  ARM: imx: clk-imx6q: AXI clock select index is incorrect
  ARM: dts: imx: fix clocks for cspi
  ARM i.MX6q: fix for ldb_di_sels

Signed-off-by: Olof Johansson <olof@lixom.net>

ARM: exynos: add debug_ll_io_init() call in exynos_init_io()

If the early MMU mapping of the UART happens to get booted out of the
TLB between the start of paging_init() and when we finally re-add the
UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if
we've got early_printk enabled. Avoid this hang by calling
debug_ll_io_init() early.

Without this patch, you can reliably reproduce a hang when early
printk is enabled by adding flush_tlb_all() at the start of
exynos_init_io(). After this patch the hang goes away.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'renesas-fixes-for-v3.10' of git://git./linux/kernel/git/horms/renesas into fixes

From Simon Horman, Renesas ARM based SoC fixes for v3.10:
- Correction to USB OVC and PENC pin groupings on r8a7779 SoC.
  This avoids conflicts when the USB_OVCn pins are used by another function.
  This has been observed to be a problem in v3.10-rc1.
- Update CMT clock rating for sh73a0 SoC to resolve boot failure
  on kzm9g-reference. This resolves a regression between v3.9 and v3.10-rc1.

* tag 'renesas-fixes-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: sh73a0: Update CMT clockevent rating to 80
  sh-pfc: r8a7779: Don't group USB OVC and PENC pins

Signed-off-by: Olof Johansson <olof@lixom.net>

ARM: EXYNOS: uncompress - print debug messages if DEBUG_LL is defined

Printing low-level debug messages make an assumption that the specified
UART port has been preconfigured by the bootloader. Incorrectly
specified UART port results in system getting stalled while printing the
message "Uncompressing Linux... done, booting the kernel"
This UART port number is specified through S3C_LOWLEVEL_UART_PORT. Since
the UART port might different for different board, it is not possible to
specify it correctly for every board that use a common defconfig file.

Calling this print subroutine only when DEBUG_LL fixes the problem. By
disabling DEBUG_LL in default config file, we would be able to boot
multiple boards with different default UART ports.

With this current approach, we miss the print "Uncompressing Linux...
done, booting the kernel." when DEBUG_LL is not defined.

Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband

Pull infiniband fixes from Roland Dreier:
- qib RCU/lockdep fix
- iser device removal fix, plus doc fixes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/qib: Fix lockdep splat in qib_alloc_lkey()
  MAINTAINERS: Add entry for iSCSI Extensions for RDMA (iSER) initiator
  IB/iser: Add Mellanox copyright
  IB/iser: Fix device removal flow

Merge tag 'vfio-v3.10-rc5' of git://github.com/awilliam/linux-vfio

Pull vfio fix from Alex Williamson:
"fix rmmod crash"

* tag 'vfio-v3.10-rc5' of git://github.com/awilliam/linux-vfio:
vfio: fix crash on rmmod

Merge tag 'ecryptfs-3.10-rc5-msync' of git://git./linux/kernel/git/tyhicks/ecryptfs

Pull ecryptfs fixes from Tyler Hicks:
- Fixes how eCryptfs handles msync to sync both the upper and lower
   file
- A couple of MAINTAINERS updates

* tag 'ecryptfs-3.10-rc5-msync' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  eCryptfs: Check return of filemap_write_and_wait during fsync
  Update eCryptFS maintainers
  ecryptfs: fixed msync to flush data

Merge branch 'for-3.10' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fix from Steve French:
"Fix one byte buffer overrun with prefixpaths on cifs mounts which can
cause a problem with mount depending on the string length"

* 'for-3.10' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix off-by-one bug in build_unc_path_to_root

dmatest: do not allow to interrupt ongoing tests

When user interrupts ongoing transfers the dmatest may end up with console
lockup, oops, or data mismatch. This patch prevents user to abort any ongoing
test.

Documentation is updated accordingly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

Merge tag 'sound-3.10' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
- A pile of small regression fix patches for HD-audio VIA codecs
- Quirks for HD-aduio and USB-audio devices
- A trivial SIS7019 error path fix

* tag 'sound-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio - Fix invalid volume resolution on Logitech HD webcam c270
  ALSA: usb-audio - Apply Logitech QuickCam Pro 9000 quirk only to audio iface
  ALSA: hda/via - Clean up duplicated codes
  ALSA: hda/via - Fix wrongly cleared pins after suspend on VT1802
  ALSA: hda - Add keep_eapd_on flag to generic parser
  ALSA: hda - Allow setting automute/automic hooks after parsing
  ALSA: hda/via - Disable broken dynamic power control
  ALSA: usb-audio: fix Roland/Cakewalk UM-3G support
  ALSA: hda - Add headset quirk for two Dell machines
  ALSA: hda - add dock support for Thinkpad T431s
  ALSA: sis7019: fix error return code in sis_chip_create()

Merge tag 'pm+acpi-3.10-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael J Wysocki:

- Fix for an ACPI PM regression causing Toshiba P870-303 to crash
   during boot from Rafael J Wysocki.

- ACPI fix for an issue causing some drivers to attempt to bind to
   devices they shouldn't touch from Aaron Lu.

- Fix for a recent cpufreq regression related to a possible race with
   CPU offline from Michael Wang.

- ACPI cpufreq regression fix for an issue causing turbo frequencies to
   be underutilized in some cases from Ross Lagerwall.

- cpufreq-cpu0 driver fix related to incorrect clock ACPI usage from
   Guennadi Liakhovetski.

- HP WMI driver fix for an issue causing GPS initialization and
   poweroff failures on HP Elitebook 6930p from Lan Tianyu.

- APEI (ACPI Platform Error Interface) fix for an issue in the error
   code path in ghes_probe() from Wei Yongjun.

- New ACPI video driver blacklist entries for HP m4 and HP Pavilion g6
   from Alex Hung and Ash Willis.

* tag 'pm+acpi-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PM: Do not execute _PS0 for devices without _PSC during initialization
  cpufreq: cpufreq-cpu0: use the exact frequency for clk_set_rate()
  cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()
  ACPI / scan: do not match drivers against objects having scan handlers
  ACPI / APEI: fix error return code in ghes_probe()
  acpi-cpufreq: set current frequency based on target P-State
  ACPI / video: ignore BIOS initial backlight value for HP Pavilion g6
  ACPI / video: ignore BIOS initial backlight value for HP m4
  x86 / platform / hp_wmi: Fix bluetooth_rfkill misuse in hp_wmi_rfkill_setup()

hwmon: (adm1021) Strengthen chip detection for ADM1021, LM84 and MAX1617

On a system with both MAX1617 and JC42 sensors, JC42 sensors can be misdetected
as LM84. Strengthen detection sufficiently enough to avoid this misdetection.
Also improve detection for ADM1021.

Modeled after chip detection code in sensors-detect command.

Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Jean Delvare <khali@linux-fr.org>

Merge branch 'pm-fixes'

* pm-fixes:
  cpufreq: cpufreq-cpu0: use the exact frequency for clk_set_rate()
  cpufreq: protect 'policy->cpus' from offlining during __gov_queue_work()
  acpi-cpufreq: set current frequency based on target P-State

Merge branch 'acpi-fixes'

* acpi-fixes:
  ACPI / PM: Do not execute _PS0 for devices without _PSC during initialization
  ACPI / scan: do not match drivers against objects having scan handlers
  ACPI / APEI: fix error return code in ghes_probe()
  ACPI / video: ignore BIOS initial backlight value for HP Pavilion g6
  ACPI / video: ignore BIOS initial backlight value for HP m4
  x86 / platform / hp_wmi: Fix bluetooth_rfkill misuse in hp_wmi_rfkill_setup()

ACPI / PM: Do not execute _PS0 for devices without _PSC during initialization

Commit b378549 (ACPI / PM: Do not power manage devices in unknown
initial states) added code to force devices without _PSC, but having
_PS0 defined in the ACPI namespace, into ACPI power state D0 by
executing _PS0 for them. That turned out to break Toshiba P870-303,
however, so revert that code.

References: https://bugzilla.kernel.org/show_bug.cgi?id=58201
Reported-and-tested-by: Jerome Cantenot <jerome.cantenot@gmail.com>
Tracked-down-by: Lan Tianyu <tianyu.lan@intel.com>
Cc: 3.9+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Merge git://git./linux/kernel/git/davem/net

Pull networking fix from David Miller:
"This is a quick one commit pull request to cure the regression
introduced by the MSG_CMSG_COMPAT change."

(Background: commit 1be374a0518a completely broke 32-bit COMPAT handling
by not only disallowing MSG_CMSG_COMPAT from user APIs, but clearing it
in our own internal use too!)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Unbreak compat_sys_{send,recv}msg

Merge tag 'staging-3.10-rc4' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg Kroah-Hartman:
"Here are some staging and IIO driver fixes for the 3.10-rc5 release.

  All of them are tiny, and fix a number of reported issues (build and
  runtime)"

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'staging-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio:inkern: Fix typo/bug in convert raw to processed.
  iio: frequency: ad4350: Fix bug / typo in mask
  inkern: iio_device_put after incorrect return/goto
  staging: alarm-dev: information leak in alarm_compat_ioctl()
  iio:callback buffer: free the scan_mask
  staging: alarm-dev: information leak in alarm_ioctl()
  drivers: staging: zcache: fix compile error
  staging: dwc2: fix value of dma_mask

Merge tag 'tty-3.10-rc4' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg Kroah-Hartman:
"Here are some small bugfixes, and one revert, of serial driver issues
  that have been reported"

* tag 'tty-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  Revert "serial: 8250: Make SERIAL_8250_RUNTIME_UARTS work correctly"
  serial: samsung: enable clock before clearing pending interrupts during init
  serial/imx: disable hardware flow control at startup

Merge tag 'usb-3.10-rc4' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
"Here are a number of USB bugfixes and new device ids for the 3.10-rc5
  tree.

  Nothing major here, a number of new device ids (and movement from the
  option to the zte_ev driver of a number of ids that we had previously
  gotten wrong, some xhci bugfixes, some usb-serial driver fixes that
  were recently found, some host controller fixes / reverts, and a
  variety of smaller other things"

* tag 'usb-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (29 commits)
  USB: option,zte_ev: move most ZTE CDMA devices to zte_ev
  USB: option: blacklist network interface on Huawei E1820
  USB: whiteheat: fix broken port configuration
  USB: serial: fix TIOCMIWAIT return value
  USB: mos7720: fix hardware flow control
  USB: keyspan: remove unused endpoint-array access
  USB: keyspan: fix bogus array index
  USB: zte_ev: fix broken open
  USB: serial: Add Option GTM681W to qcserial device table.
  USB: Serial: cypress_M8: Enable FRWD Dongle hidcom device
  USB: EHCI: fix regression related to qh_refresh()
  usbfs: Increase arbitrary limit for USB 3 isopkt length
  USB: zte_ev: fix control-message timeouts
  USB: mos7720: fix message timeouts
  USB: iuu_phoenix: fix bulk-message timeout
  USB: ark3116: fix control-message timeout
  USB: mos7840: fix DMA to stack
  USB: mos7720: fix DMA to stack
  USB: visor: fix initialisation of Treo/Kyocera devices
  USB: serial: fix Treo/Kyocera interrrupt-in urb context
  ...

Merge tag 'pci-v3.10-fixes-3' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
"This fixes a crash when booting a 32-bit kernel via the EFI boot stub.

  PCI ROM from EFI
      x86/PCI: Map PCI setup data with ioremap() so it can be in highmem"

* tag 'pci-v3.10-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  x86/PCI: Map PCI setup data with ioremap() so it can be in highmem

Merge tag 'for-linus-v3.10-rc5' of git://oss.sgi.com/xfs/xfs

Pull more xfs updates from Ben Myers:
"Here are several fixes for filesystems with CRC support turned on:
  fixes for quota, remote attributes, and recovery.  There is also some
  feature work related to CRCs: the implementation of CRCs for the inode
  unlinked lists, disabling noattr2/attr2 options when appropriate, and
  bumping the maximum number of ACLs.

  I would have preferred to defer this last category of items to 3.11.
  This would require setting a feature bit for the on-disk changes, so
  there is some pressure to get these in 3.10.  I believe this
  represents the end of the CRC related queue.

   - Rework of dquot CRCs
   - Fix for remote attribute invalidation of a leaf
   - Fix ordering of transaction replay in recovery
   - Implement CRCs for inode unlinked list
   - Disable noattr2/attr2 mount options when CRCs are enabled
   - Bump the limitation of ACL entries for v5 superblocks"

* tag 'for-linus-v3.10-rc5' of git://oss.sgi.com/xfs/xfs:
  xfs: increase number of ACL entries for V5 superblocks
  xfs: disable noattr2/attr2 mount options for CRC enabled filesystems
  xfs: inode unlinked list needs to recalculate the inode CRC
  xfs: fix log recovery transaction item reordering
  xfs: fix remote attribute invalidation for a leaf
  xfs: rework dquot CRCs

net: Unbreak compat_sys_{send,recv}msg

I broke them in this commit:

    commit 1be374a0518a288147c6a7398792583200a67261
    Author: Andy Lutomirski <luto@amacapital.net>
    Date:   Wed May 22 14:07:44 2013 -0700

        net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept
MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints.  It
also reverts some unnecessary checks in sys_socketcall.

Apparently I was suffering from underscore blindness the first time around.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tracing: Use current_uid() for critical time tracing

The irqsoff tracer records the max time that interrupts are disabled.
There are hooks in the assembly code that calls back into the tracer when
interrupts are disabled or enabled.

When they are enabled, the tracer checks if the amount of time they
were disabled is larger than the previous recorded max interrupts off
time. If it is, it creates a snapshot of the currently running trace
to store where the last largest interrupts off time was held and how
it happened.

During testing, this RCU lockdep dump appeared:

[ 1257.829021] ===============================
[ 1257.829021] [ INFO: suspicious RCU usage. ]
[ 1257.829021] 3.10.0-rc1-test+ #171 Tainted: G        W
[ 1257.829021] -------------------------------
[ 1257.829021] /home/rostedt/work/git/linux-trace.git/include/linux/rcupdate.h:780 rcu_read_lock() used illegally while idle!
[ 1257.829021]
[ 1257.829021] other info that might help us debug this:
[ 1257.829021]
[ 1257.829021]
[ 1257.829021] RCU used illegally from idle CPU!
[ 1257.829021] rcu_scheduler_active = 1, debug_locks = 0
[ 1257.829021] RCU used illegally from extended quiescent state!
[ 1257.829021] 2 locks held by trace-cmd/4831:
[ 1257.829021]  #0:  (max_trace_lock){......}, at: [<ffffffff810e2b77>] stop_critical_timing+0x1a3/0x209
[ 1257.829021]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff810dae5a>] __update_max_tr+0x88/0x1ee
[ 1257.829021]
[ 1257.829021] stack backtrace:
[ 1257.829021] CPU: 3 PID: 4831 Comm: trace-cmd Tainted: G        W    3.10.0-rc1-test+ #171
[ 1257.829021] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
[ 1257.829021]  0000000000000001 ffff880065f49da8 ffffffff8153dd2b ffff880065f49dd8
[ 1257.829021]  ffffffff81092a00 ffff88006bd78680 ffff88007add7500 0000000000000003
[ 1257.829021]  ffff88006bd78680 ffff880065f49e18 ffffffff810daebf ffffffff810dae5a
[ 1257.829021] Call Trace:
[ 1257.829021]  [<ffffffff8153dd2b>] dump_stack+0x19/0x1b
[ 1257.829021]  [<ffffffff81092a00>] lockdep_rcu_suspicious+0x109/0x112
[ 1257.829021]  [<ffffffff810daebf>] __update_max_tr+0xed/0x1ee
[ 1257.829021]  [<ffffffff810dae5a>] ? __update_max_tr+0x88/0x1ee
[ 1257.829021]  [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021]  [<ffffffff810dbf85>] update_max_tr_single+0x11d/0x12d
[ 1257.829021]  [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021]  [<ffffffff810e2b15>] stop_critical_timing+0x141/0x209
[ 1257.829021]  [<ffffffff8109569a>] ? trace_hardirqs_on+0xd/0xf
[ 1257.829021]  [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021]  [<ffffffff810e3057>] time_hardirqs_on+0x2a/0x2f
[ 1257.829021]  [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021]  [<ffffffff8109550c>] trace_hardirqs_on_caller+0x16/0x197
[ 1257.829021]  [<ffffffff8109569a>] trace_hardirqs_on+0xd/0xf
[ 1257.829021]  [<ffffffff811002b9>] user_enter+0xfd/0x107
[ 1257.829021]  [<ffffffff810029b4>] do_notify_resume+0x92/0x97
[ 1257.829021]  [<ffffffff8154bdca>] int_signal+0x12/0x17

What happened was entering into the user code, the interrupts were enabled
and a max interrupts off was recorded. The trace buffer was saved along with
various information about the task: comm, pid, uid, priority, etc.

The uid is recorded with task_uid(tsk). But this is a macro that uses rcu_read_lock()
to retrieve the data, and this happened to happen where RCU is blind (user_enter).

As only the preempt and irqs off tracers can have this happen, and they both
only have the tsk == current, if tsk == current, use current_uid() instead of
task_uid(), as current_uid() does not use RCU as only current can change its uid.

This fixes the RCU suspicious splat.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

USB: option,zte_ev: move most ZTE CDMA devices to zte_ev

Per some ZTE Linux drivers I found for the AC2716, the following patch
moves most ZTE CDMA devices from option to zte_ev. The blacklist stuff
that option does is not required with zte_ev, because it doesn't
implement any of the send_setup hooks which the blacklist suppressed.

I did not move the 2718 over because I could not find any ZTE Linux
drivers for that device, nor even any Windows drivers.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: option: blacklist network interface on Huawei E1820

The mode used by Windows for the Huawei E1820 will use the
same ff/ff/ff class codes for both serial and network
functions.

Reported-by: Graham Inggs <graham.inggs@uct.ac.za>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: whiteheat: fix broken port configuration

When configuring the port (e.g. set_termios) the port minor number
rather than the port number was used in the request (and they only
coincide for minor number 0).

Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfs: increase number of ACL entries for V5 superblocks

The limit of 25 ACL entries is arbitrary, but baked into the on-disk
format. For version 5 superblocks, increase it to the maximum nuber
of ACLs that can fit into a single xattr.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinuguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 5c87d4bc1a86bd6e6754ac3d6e111d776ddcfe57)

xfs: disable noattr2/attr2 mount options for CRC enabled filesystems

attr2 format is always enabled for v5 superblock filesystems, so the
mount options to enable or disable it need to be cause mount errors.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit d3eaace84e40bf946129e516dcbd617173c1cf14)

xfs: inode unlinked list needs to recalculate the inode CRC

The inode unlinked list manipulations operate directly on the inode
buffer, and so bypass the inode CRC calculation mechanisms. Hence an
inode on the unlinked list has an invalid CRC. Fix this by
recalculating the CRC whenever we modify an unlinked list pointer in
an inode, ncluding during log recovery. This is trivial to do and
results in unlinked list operations always leaving a consistent
inode in the buffer.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 0a32c26e720a8b38971d0685976f4a7d63f9e2ef)

xfs: fix log recovery transaction item reordering

There are several constraints that inode allocation and unlink
logging impose on log recovery. These all stem from the fact that
inode alloc/unlink are logged in buffers, but all other inode
changes are logged in inode items. Hence there are ordering
constraints that recovery must follow to ensure the correct result
occurs.

As it turns out, this ordering has been working mostly by chance
than good management. The existing code moves all buffers except
cancelled buffers to the head of the list, and everything else to
the tail of the list. The problem with this is that is interleaves
inode items with the buffer cancellation items, and hence whether
the inode item in an cancelled buffer gets replayed is essentially
left to chance.

Further, this ordering causes problems for log recovery when inode
CRCs are enabled. It typically replays the inode unlink buffer long before
it replays the inode core changes, and so the CRC recorded in an
unlink buffer is going to be invalid and hence any attempt to
validate the inode in the buffer is going to fail. Hence we really
need to enforce the ordering that the inode alloc/unlink code has
expected log recovery to have since inode chunk de-allocation was
introduced back in 2003...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit a775ad778073d55744ed6709ccede36310638911)

xfs: fix remote attribute invalidation for a leaf

When invalidating an attribute leaf block block, there might be
remote attributes that it points to. With the recent rework of the
remote attribute format, we have to make sure we calculate the
length of the attribute correctly. We aren't doing that in
xfs_attr3_leaf_inactive(), so fix it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinuguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 59913f14dfe8eb772ff93eb442947451b4416329)

xfs: rework dquot CRCs

Calculating dquot CRCs when the backing buffer is written back just
doesn't work reliably. There are several places which manipulate
dquots directly in the buffers, and they don't calculate CRCs
appropriately, nor do they always set the buffer up to calculate
CRCs appropriately.

Firstly, if we log a dquot buffer (e.g. during allocation) it gets
logged without valid CRC, and so on recovery we end up with a dquot
that is not valid.

Secondly, if we recover/repair a dquot, we don't have a verifier
attached to the buffer and hence CRCs are not calculated on the way
down to disk.

Thirdly, calculating the CRC after we've changed the contents means
that if we re-read the dquot from the buffer, we cannot verify the
contents of the dquot are valid, as the CRC is invalid.

So, to avoid all the dquot CRC errors that are being detected by the
read verifier, change to using the same model as for inodes. That
is, dquot CRCs are calculated and written to the backing buffer at
the time the dquot is flushed to the backing buffer. If we modify
the dquot directly in the backing buffer, calculate the CRC
immediately after the modification is complete. Hence the dquot in
the on-disk buffer should always have a valid CRC.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 6fcdc59de28817d1fbf1bd58cc01f4f3fac858fb)

MIPS: ralink: add missing SZ_1M multiplier

On RT5350 the memory size is set to Bytes and not MegaBytes due to a missing
multiplier.

Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: John Crispin <blogic@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/5378/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>