Jackie Liu [Wed, 14 Aug 2019 09:35:22 +0000 (17:35 +0800)]
io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list
This patch may fix two issues:
First, when IOSQE_IO_DRAIN set, the next IOs need to be inserted into
defer list to delay execution, but link io will be actively scheduled to
run by calling io_queue_sqe.
Second, when multiple LINK_IOs are inserted together with defer_list,
the LINK_IO is no longer keep order.
|-------------|
| LINK_IO | ----> insert to defer_list -----------
|-------------| |
| LINK_IO | ----> insert to defer_list ----------|
|-------------| |
| LINK_IO | ----> insert to defer_list ----------|
|-------------| |
| NORMAL_IO | ----> insert to defer_list ----------|
|-------------| |
|
queue_work at same time <-----|
Fixes: 9e645e1105c ("io_uring: add support for sqe links")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 15 Aug 2019 17:09:16 +0000 (11:09 -0600)]
block: remove REQ_NOWAIT_INLINE
We had a few issues with this code, and there's still a problem around
how we deal with error handling for chained/split bios. For now, just
revert the code and we'll try again with a thoroug solution. This
reverts commits:
e15c2ffa1091 ("block: fix O_DIRECT error handling for bio fragments")
0eb6ddfb865c ("block: Fix __blkdev_direct_IO() for bio fragments")
6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
893a1c97205a ("blk-mq: allow REQ_NOWAIT to return an error inline")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Aleix Roca Nonell [Thu, 15 Aug 2019 12:03:22 +0000 (14:03 +0200)]
io_uring: fix manual setup of iov_iter for fixed buffers
Commit
bd11b3a391e3 ("io_uring: don't use iov_iter_advance() for fixed
buffers") introduced an optimization to avoid using the slow
iov_iter_advance by manually populating the iov_iter iterator in some
cases.
However, the computation of the iterator count field was erroneous: The
first bvec was always accounted for an extent of page size even if the
bvec length was smaller.
In consequence, some I/O operations on fixed buffers were unable to
operate on the full extent of the buffer, consistently skipping some
bytes at the end of it.
Fixes: bd11b3a391e3 ("io_uring: don't use iov_iter_advance() for fixed buffers")
Cc: stable@vger.kernel.org
Signed-off-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Wenwen Wang [Sun, 11 Aug 2019 17:23:22 +0000 (12:23 -0500)]
xen/blkback: fix memory leaks
In read_per_ring_refs(), after 'req' and related memory regions are
allocated, xen_blkif_map() is invoked to map the shared frame, irq, and
etc. However, if this mapping process fails, no cleanup is performed,
leading to memory leaks. To fix this issue, invoke the cleanup before
returning the error.
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
zhengbin [Mon, 12 Aug 2019 12:36:55 +0000 (20:36 +0800)]
blk-mq: move cancel of requeue_work to the front of blk_exit_queue
blk_exit_queue will free elevator_data, while blk_mq_requeue_work
will access it. Move cancel of requeue_work to the front of
blk_exit_queue to avoid use-after-free.
blk_exit_queue blk_mq_requeue_work
__elevator_exit blk_mq_run_hw_queues
blk_mq_exit_sched blk_mq_run_hw_queue
dd_exit_queue blk_mq_hctx_has_pending
kfree(elevator_data) blk_mq_sched_has_work
dd_has_work
Fixes: fbc2a15e3433 ("blk-mq: move cancel of requeue_work into blk_mq_release")
Cc: stable@vger.kernel.org
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Sun, 11 Aug 2019 03:41:41 +0000 (21:41 -0600)]
Merge branch 'nvme-5.3-rc' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Sagi:
"Few nvme fixes for the next rc round.
- detect capacity changes on the mpath disk from Anthony
- probe/remove fix from Keith
- various fixes to pass blktests from Logan
- deadlock in reset/scan race fix
- nvme-rdma use-after-free fix
- deadlock fix when passthru commands race mpath disk info update"
* 'nvme-5.3-rc' of git://git.infradead.org/nvme:
nvme-pci: Fix async probe remove race
nvme: fix controller removal race with scan work
nvme-rdma: fix possible use-after-free in connect error flow
nvme: fix a possible deadlock when passthru commands sent to a multipath device
nvme-core: Fix extra device_put() call on error path
nvmet-file: fix nvmet_file_flush() always returning an error
nvmet-loop: Flush nvme_delete_wq when removing the port
nvmet: Fix use-after-free bug when a port is removed
nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
Coly Li [Fri, 9 Aug 2019 06:14:05 +0000 (14:14 +0800)]
bcache: Revert "bcache: use sysfs_match_string() instead of __sysfs_match_string()"
This reverts commit
89e0341af082dbc170019f908846f4a424efc86b.
In drivers/md/bcache/sysfs.c:bch_snprint_string_list(), NULL pointer at
the end of list is necessary. Remove the NULL from last element of each
lists will cause the following panic,
[ 4340.455652] bcache: register_cache() registered cache device nvme0n1
[ 4340.464603] bcache: register_bdev() registered backing device sdk
[ 4421.587335] bcache: bch_cached_dev_run() cached dev sdk is running already
[ 4421.587348] bcache: bch_cached_dev_attach() Caching sdk as bcache0 on set
354e1d46-d99f-4d8b-870b-
078b80dc88a6
[ 5139.247950] general protection fault: 0000 [#1] SMP NOPTI
[ 5139.247970] CPU: 9 PID: 5896 Comm: cat Not tainted 4.12.14-95.29-default #1 SLE12-SP4
[ 5139.247988] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 04/18/2019
[ 5139.248006] task:
ffff888fb25c0b00 task.stack:
ffff9bbacc704000
[ 5139.248021] RIP: 0010:string+0x21/0x70
[ 5139.248030] RSP: 0018:
ffff9bbacc707bf0 EFLAGS:
00010286
[ 5139.248043] RAX:
ffffffffa7e432e3 RBX:
ffff8881c20da02a RCX:
ffff0a00ffffff04
[ 5139.248058] RDX:
3f00656863616362 RSI:
ffff8881c20db000 RDI:
ffffffffffffffff
[ 5139.248075] RBP:
ffff8881c20db000 R08:
0000000000000000 R09:
ffff8881c20da02a
[ 5139.248090] R10:
0000000000000004 R11:
0000000000000000 R12:
ffff9bbacc707c48
[ 5139.248104] R13:
0000000000000fd6 R14:
ffffffffc0665855 R15:
ffffffffc0665855
[ 5139.248119] FS:
00007faf253b8700(0000) GS:
ffff88903f840000(0000) knlGS:
0000000000000000
[ 5139.248137] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 5139.248149] CR2:
00007faf25395008 CR3:
0000000f72150006 CR4:
00000000007606e0
[ 5139.248164] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 5139.248179] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 5139.248193] PKRU:
55555554
[ 5139.248200] Call Trace:
[ 5139.248210] vsnprintf+0x1fb/0x510
[ 5139.248221] snprintf+0x39/0x40
[ 5139.248238] bch_snprint_string_list.constprop.15+0x5b/0x90 [bcache]
[ 5139.248256] __bch_cached_dev_show+0x44d/0x5f0 [bcache]
[ 5139.248270] ? __alloc_pages_nodemask+0xb2/0x210
[ 5139.248284] bch_cached_dev_show+0x2c/0x50 [bcache]
[ 5139.248297] sysfs_kf_seq_show+0xbb/0x190
[ 5139.248308] seq_read+0xfc/0x3c0
[ 5139.248317] __vfs_read+0x26/0x140
[ 5139.248327] vfs_read+0x87/0x130
[ 5139.248336] SyS_read+0x42/0x90
[ 5139.248346] do_syscall_64+0x74/0x160
[ 5139.248358] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 5139.248370] RIP: 0033:0x7faf24eea370
[ 5139.248379] RSP: 002b:
00007fff82d03f38 EFLAGS:
00000246 ORIG_RAX:
0000000000000000
[ 5139.248395] RAX:
ffffffffffffffda RBX:
0000000000020000 RCX:
00007faf24eea370
[ 5139.248411] RDX:
0000000000020000 RSI:
00007faf25396000 RDI:
0000000000000003
[ 5139.248426] RBP:
00007faf25396000 R08:
00000000ffffffff R09:
0000000000000000
[ 5139.248441] R10:
000000007c9d4d41 R11:
0000000000000246 R12:
00007faf25396000
[ 5139.248456] R13:
0000000000000003 R14:
0000000000000000 R15:
0000000000000fff
[ 5139.248892] Code: ff ff ff 0f 1f 80 00 00 00 00 49 89 f9 48 89 cf 48 c7 c0 e3 32 e4 a7 48 c1 ff 30 48 81 fa ff 0f 00 00 48 0f 46 d0 48 85 ff 74 45 <44> 0f b6 02 48 8d 42 01 45 84 c0 74 38 48 01 fa 4c 89 cf eb 0e
The simplest way to fix is to revert commit
89e0341af082 ("bcache: use
sysfs_match_string() instead of __sysfs_match_string()").
This bug was introduced in Linux v5.2, so this fix only applies to
Linux v5.2 is enough for stable tree maintainer.
Fixes: 89e0341af082 ("bcache: use sysfs_match_string() instead of __sysfs_match_string()")
Cc: stable@vger.kernel.org
Cc: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reported-by: Peifeng Lin <pflin@suse.com>
Acked-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mikulas Patocka [Thu, 8 Aug 2019 15:17:01 +0000 (11:17 -0400)]
loop: set PF_MEMALLOC_NOIO for the worker thread
A deadlock with this stacktrace was observed.
The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.
In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.
PID: 474 TASK:
ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0"
#0 [
ffff8813dedfb938] __schedule at
ffffffff8173f405
#1 [
ffff8813dedfb990] schedule at
ffffffff8173fa27
#2 [
ffff8813dedfb9b0] schedule_timeout at
ffffffff81742fec
#3 [
ffff8813dedfba60] io_schedule_timeout at
ffffffff8173f186
#4 [
ffff8813dedfbaa0] bit_wait_io at
ffffffff8174034f
#5 [
ffff8813dedfbac0] __wait_on_bit at
ffffffff8173fec8
#6 [
ffff8813dedfbb10] out_of_line_wait_on_bit at
ffffffff8173ff81
#7 [
ffff8813dedfbb90] __make_buffer_clean at
ffffffffa038736f [dm_bufio]
#8 [
ffff8813dedfbbb0] __try_evict_buffer at
ffffffffa0387bb8 [dm_bufio]
#9 [
ffff8813dedfbbd0] dm_bufio_shrink_scan at
ffffffffa0387cc3 [dm_bufio]
#10 [
ffff8813dedfbc40] shrink_slab at
ffffffff811a87ce
#11 [
ffff8813dedfbd30] shrink_zone at
ffffffff811ad778
#12 [
ffff8813dedfbdc0] kswapd at
ffffffff811ae92f
#13 [
ffff8813dedfbec0] kthread at
ffffffff810a8428
#14 [
ffff8813dedfbf50] ret_from_fork at
ffffffff81745242
PID: 14127 TASK:
ffff881455749c00 CPU: 11 COMMAND: "loop1"
#0 [
ffff88272f5af228] __schedule at
ffffffff8173f405
#1 [
ffff88272f5af280] schedule at
ffffffff8173fa27
#2 [
ffff88272f5af2a0] schedule_preempt_disabled at
ffffffff8173fd5e
#3 [
ffff88272f5af2b0] __mutex_lock_slowpath at
ffffffff81741fb5
#4 [
ffff88272f5af330] mutex_lock at
ffffffff81742133
#5 [
ffff88272f5af350] dm_bufio_shrink_count at
ffffffffa03865f9 [dm_bufio]
#6 [
ffff88272f5af380] shrink_slab at
ffffffff811a86bd
#7 [
ffff88272f5af470] shrink_zone at
ffffffff811ad778
#8 [
ffff88272f5af500] do_try_to_free_pages at
ffffffff811adb34
#9 [
ffff88272f5af590] try_to_free_pages at
ffffffff811adef8
#10 [
ffff88272f5af610] __alloc_pages_nodemask at
ffffffff811a09c3
#11 [
ffff88272f5af710] alloc_pages_current at
ffffffff811e8b71
#12 [
ffff88272f5af760] new_slab at
ffffffff811f4523
#13 [
ffff88272f5af7b0] __slab_alloc at
ffffffff8173a1b5
#14 [
ffff88272f5af880] kmem_cache_alloc at
ffffffff811f484b
#15 [
ffff88272f5af8d0] do_blockdev_direct_IO at
ffffffff812535b3
#16 [
ffff88272f5afb00] __blockdev_direct_IO at
ffffffff81255dc3
#17 [
ffff88272f5afb30] xfs_vm_direct_IO at
ffffffffa01fe3fc [xfs]
#18 [
ffff88272f5afb90] generic_file_read_iter at
ffffffff81198994
#19 [
ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at
ffffffffa020c970 [xfs]
#20 [
ffff88272f5afcc0] lo_rw_aio at
ffffffffa0377042 [loop]
#21 [
ffff88272f5afd70] loop_queue_work at
ffffffffa0377c3b [loop]
#22 [
ffff88272f5afe60] kthread_worker_fn at
ffffffff810a8a0c
#23 [
ffff88272f5afec0] kthread at
ffffffff810a8428
#24 [
ffff88272f5aff50] ret_from_fork at
ffffffff81745242
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Wed, 7 Aug 2019 09:36:47 +0000 (11:36 +0200)]
bdev: Fixup error handling in blkdev_get()
Commit
89e524c04fa9 ("loop: Fix mount(2) failure due to race with
LOOP_SET_FD") converted blkdev_get() to use the new helpers for
finishing claiming of a block device. However the conversion botched the
error handling in blkdev_get() and thus the bdev has been marked as held
even in case __blkdev_get() returned error. This led to occasional
warnings with block/001 test from blktests like:
kernel: WARNING: CPU: 5 PID: 907 at fs/block_dev.c:1899 __blkdev_put+0x396/0x3a0
Correct the error handling.
CC: stable@vger.kernel.org
Fixes: 89e524c04fa9 ("loop: Fix mount(2) failure due to race with LOOP_SET_FD")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Paolo Valente [Wed, 7 Aug 2019 17:21:11 +0000 (19:21 +0200)]
block, bfq: handle NULL return value by bfq_init_rq()
As reported in [1], the call bfq_init_rq(rq) may return NULL in case
of OOM (in particular, if rq->elv.icq is NULL because memory
allocation failed in failed in ioc_create_icq()).
This commit handles this circumstance.
[1] https://lkml.org/lkml/2019/7/22/824
Cc: Hsin-Yi Wang <hsinyi@google.com>
Cc: Nicolas Boichat <drinkcat@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Hsin-Yi Wang <hsinyi@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Paolo Valente [Wed, 7 Aug 2019 14:17:54 +0000 (16:17 +0200)]
block, bfq: move update of waker and woken list to queue freeing
Since commit
13a857a4c4e8 ("block, bfq: detect wakers and
unconditionally inject their I/O"), every bfq_queue has a pointer to a
waker bfq_queue and a list of the bfq_queues it may wake. In this
respect, when a bfq_queue, say Q, remains with no I/O source attached
to it, Q cannot be woken by any other bfq_queue, and cannot wake any
other bfq_queue. Then Q must be removed from the woken list of its
possible waker bfq_queue, and all bfq_queues in the woken list of Q
must stop having a waker bfq_queue.
Q remains with no I/O source in two cases: when the last process
associated with Q exits or when such a process gets associated with a
different bfq_queue. Unfortunately, commit
13a857a4c4e8 ("block, bfq:
detect wakers and unconditionally inject their I/O") performed the
above updates only in the first case.
This commit fixes this bug by moving these updates to when Q gets
freed. This is a simple and safe way to handle all cases, as both the
above events, process exit and re-association, lead to Q being freed
soon, and because dangling references would come out only after Q gets
freed (if no update were performed).
Fixes: 13a857a4c4e8 ("block, bfq: detect wakers and unconditionally inject their I/O")
Reported-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Paolo Valente [Wed, 7 Aug 2019 14:17:53 +0000 (16:17 +0200)]
block, bfq: reset last_completed_rq_bfqq if the pointed queue is freed
Since commit
13a857a4c4e8 ("block, bfq: detect wakers and
unconditionally inject their I/O"), BFQ stores, in a per-device
pointer last_completed_rq_bfqq, the last bfq_queue that had an I/O
request completed. If some bfq_queue receives new I/O right after the
last request of last_completed_rq_bfqq has been completed, then
last_completed_rq_bfqq may be a waker bfq_queue.
But if the bfq_queue last_completed_rq_bfqq points to is freed, then
last_completed_rq_bfqq becomes a dangling reference. This commit
resets last_completed_rq_bfqq if the pointed bfq_queue is freed.
Fixes: 13a857a4c4e8 ("block, bfq: detect wakers and unconditionally inject their I/O")
Reported-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
He Zhe [Thu, 8 Aug 2019 03:09:54 +0000 (11:09 +0800)]
block: aoe: Fix kernel crash due to atomic sleep when exiting
Since commit
3582dd291788 ("aoe: convert aoeblk to blk-mq"), aoedev_downdev
has had the possibility of sleeping and causing the following crash.
BUG: scheduling while atomic: rmmod/2242/0x00000003
Modules linked in: aoe
Preemption disabled at:
[<
ffffffffc01d95e5>] flush+0x95/0x4a0 [aoe]
CPU: 7 PID: 2242 Comm: rmmod Tainted: G I 5.2.3 #1
Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.10.0025.
030220091519 03/02/2009
Call Trace:
dump_stack+0x4f/0x6a
? flush+0x95/0x4a0 [aoe]
__schedule_bug.cold+0x44/0x54
__schedule+0x44f/0x680
schedule+0x44/0xd0
blk_mq_freeze_queue_wait+0x46/0xb0
? wait_woken+0x80/0x80
blk_mq_freeze_queue+0x1b/0x20
aoedev_downdev+0x111/0x160 [aoe]
flush+0xff/0x4a0 [aoe]
aoedev_exit+0x23/0x30 [aoe]
aoe_exit+0x35/0x948 [aoe]
__se_sys_delete_module+0x183/0x210
__x64_sys_delete_module+0x16/0x20
do_syscall_64+0x4d/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f24e0043b07
Code: 73 01 c3 48 8b 0d 89 73 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f
1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff
ff 73 01 c3 48 8b 0d 59 73 0b 00 f7 d8 64 89 01 48
RSP: 002b:
00007ffe18f7f1e8 EFLAGS:
00000206 ORIG_RAX:
00000000000000b0
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f24e0043b07
RDX:
000000000000000a RSI:
0000000000000800 RDI:
0000555c3ecf87c8
RBP:
00007ffe18f7f1f0 R08:
0000000000000000 R09:
0000000000000000
R10:
00007f24e00b4ac0 R11:
0000000000000206 R12:
00007ffe18f7f238
R13:
00007ffe18f7f410 R14:
00007ffe18f80e73 R15:
0000555c3ecf8760
This patch, handling in the same way of pass two, unlocks the locks and
restart pass one after aoedev_downdev is done.
Fixes: 3582dd291788 ("aoe: convert aoeblk to blk-mq")
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 7 Aug 2019 18:23:57 +0000 (12:23 -0600)]
libata: add SG safety checks in SFF pio transfers
Abort processing of a command if we run out of mapped data in the
SG list. This should never happen, but a previous bug caused it to
be possible. Play it safe and attempt to abort nicely if we don't
have more SG segments left.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 7 Aug 2019 18:20:52 +0000 (12:20 -0600)]
libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
For passthrough requests, libata-scsi takes what the user passes in
as gospel. This can be problematic if the user fills in the CDB
incorrectly. One example of that is in request sizes. For read/write
commands, the CDB contains fields describing the transfer length of
the request. These should match with the SG_IO header fields, but
libata-scsi currently does no validation of that.
Check that the number of blocks in the CDB for passthrough requests
matches what was mapped into the request. If the CDB asks for more
data then the validated SG_IO header fields, error it.
Reported-by: Krishna Ram Prakash R <krp@gtux.in>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 6 Aug 2019 19:34:31 +0000 (13:34 -0600)]
block: fix O_DIRECT error handling for bio fragments
0eb6ddfb865c tried to fix this up, but introduced a use-after-free
of dio. Additionally, we still had an issue with error handling,
as reported by Darrick:
"I noticed a regression in xfs/747 (an unreleased xfstest for the
xfs_scrub media scanning feature) on 5.3-rc3. I'll condense that down
to a simpler reproducer:
error-test: 0 209 linear 8:48 0
error-test: 209 1 error
error-test: 210
6446894 linear 8:48 210
Basically we have a ~3G /dev/sdd and we set up device mapper to fail IO
for sector 209 and to pass the io to the scsi device everywhere else.
On 5.3-rc3, performing a directio pread of this range with a < 1M buffer
(in other words, a request for fewer than MAX_BIO_PAGES bytes) yields
EIO like you'd expect:
pread64(3, 0x7f880e1c7000,
1048576, 0) = -1 EIO (Input/output error)
pread: Input/output error
+++ exited with 0 +++
But doing it with a larger buffer succeeds(!):
pread64(3, "XFSB\0\0\20\0\0\0\0\0\0\fL\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
1146880, 0) =
1146880
read
1146880/
1146880 bytes at offset 0
1 MiB, 1 ops; 0.0009 sec (1.124 GiB/sec and 1052.6316 ops/sec)
+++ exited with 0 +++
(Note that the part of the buffer corresponding to the dm-error area is
uninitialized)
On 5.3-rc2, both commands would fail with EIO like you'd expect. The
only change between rc2 and rc3 is commit
0eb6ddfb865c ("block: Fix
__blkdev_direct_IO() for bio fragments").
AFAICT we end up in __blkdev_direct_IO with a 1120K buffer, which gets
split into two bios: one for the first BIO_MAX_PAGES worth of data (1MB)
and a second one for the 96k after that."
Fix this by noting that it's always safe to dereference dio if we get
BLK_QC_T_EAGAIN returned, as end_io hasn't been run for that case. So
we can safely increment the dio size before calling submit_bio(), and
then decrement it on failure (not that it really matters, as the bio
and dio are going away).
For error handling, return to the original method of just using 'ret'
for tracking the error, and the size tracking in dio->size.
Fixes: 0eb6ddfb865c ("block: Fix __blkdev_direct_IO() for bio fragments")
Fixes: 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Gustavo A. R. Silva [Tue, 6 Aug 2019 08:08:08 +0000 (03:08 -0500)]
ata: rb532_cf: Fix unused variable warning in rb532_pata_driver_probe
Fix the following warning (Building: rb532_defconfig mips):
drivers/ata/pata_rb532_cf.c: In function ‘rb532_pata_driver_remove’:
drivers/ata/pata_rb532_cf.c:161:24: warning: unused variable ‘info’ [-Wunused-variable]
struct rb532_cf_info *info = ah->private_data;
^~~~
Fixes: cd56f35e52d9 ("ata: rb532_cf: Convert to use GPIO descriptors")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stefan Haberland [Thu, 1 Aug 2019 11:06:30 +0000 (13:06 +0200)]
s390/dasd: fix endless loop after read unit address configuration
After getting a storage server event that causes the DASD device driver
to update its unit address configuration during a device shutdown there is
the possibility of an endless loop in the device driver.
In the system log there will be ongoing DASD error messages with RC: -19.
The reason is that the loop starting the ruac request only terminates when
the retry counter is decreased to 0. But in the sleep_on function there are
early exit paths that do not decrease the retry counter.
Prevent an endless loop by handling those cases separately.
Remove the unnecessary do..while loop since the sleep_on function takes
care of retries by itself.
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org # 2.6.25+
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Thu, 1 Aug 2019 10:21:51 +0000 (19:21 +0900)]
block: Fix __blkdev_direct_IO() for bio fragments
The recent fix to properly handle IOCB_NOWAIT for async O_DIRECT IO
(patch
6a43074e2f46) introduced two problems with BIO fragment handling
for direct IOs:
1) The dio size processed is calculated by incrementing the ret variable
by the size of the bio fragment issued for the dio. However, this size
is obtained directly from bio->bi_iter.bi_size AFTER the bio submission
which may result in referencing the bi_size value after the bio
completed, resulting in an incorrect value use.
2) The ret variable is not incremented by the size of the last bio
fragment issued for the bio, leading to an invalid IO size being
returned to the user.
Fix both problem by using dio->size (which is incremented before the bio
submission) to update the value of ret after bio submissions, including
for the last bio fragment issued.
Fixes: 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Keith Busch [Mon, 29 Jul 2019 22:34:52 +0000 (16:34 -0600)]
nvme-pci: Fix async probe remove race
Ensure the controller is not in the NEW state when nvme_probe() exits.
This will always allow a subsequent nvme_remove() to set the state to
DELETING, fixing a potential race between the initial asynchronous probe
and device removal.
Reported-by: Li Zhong <lizhongfs@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Sagi Grimberg [Thu, 25 Jul 2019 18:56:57 +0000 (11:56 -0700)]
nvme: fix controller removal race with scan work
With multipath enabled, nvme_scan_work() can read from the device
(through nvme_mpath_add_disk()) and hang [1]. However, with fabrics,
once ctrl->state is set to NVME_CTRL_DELETING, the reads will hang
(see nvmf_check_ready()) and the mpath stack device make_request
will block if head->list is not empty. However, when the head->list
consistst of only DELETING/DEAD controllers, we should actually not
block, but rather fail immediately.
In addition, before we go ahead and remove the namespaces, make sure
to clear the current path and kick the requeue list so that the
request will fast fail upon requeuing.
[1]:
--
INFO: task kworker/u4:3:166 blocked for more than 120 seconds.
Not tainted
5.2.0-rc6-vmlocalyes-00005-g808c8c2dc0cf #316
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u4:3 D 0 166 2 0x80004000
Workqueue: nvme-wq nvme_scan_work
Call Trace:
__schedule+0x851/0x1400
schedule+0x99/0x210
io_schedule+0x21/0x70
do_read_cache_page+0xa57/0x1330
read_cache_page+0x4a/0x70
read_dev_sector+0xbf/0x380
amiga_partition+0xc4/0x1230
check_partition+0x30f/0x630
rescan_partitions+0x19a/0x980
__blkdev_get+0x85a/0x12f0
blkdev_get+0x2a5/0x790
__device_add_disk+0xe25/0x1250
device_add_disk+0x13/0x20
nvme_mpath_set_live+0x172/0x2b0
nvme_update_ns_ana_state+0x130/0x180
nvme_set_ns_ana_state+0x9a/0xb0
nvme_parse_ana_log+0x1c3/0x4a0
nvme_mpath_add_disk+0x157/0x290
nvme_validate_ns+0x1017/0x1bd0
nvme_scan_work+0x44d/0x6a0
process_one_work+0x7d7/0x1240
worker_thread+0x8e/0xff0
kthread+0x2c3/0x3b0
ret_from_fork+0x35/0x40
INFO: task kworker/u4:1:1034 blocked for more than 120 seconds.
Not tainted
5.2.0-rc6-vmlocalyes-00005-g808c8c2dc0cf #316
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u4:1 D 0 1034 2 0x80004000
Workqueue: nvme-delete-wq nvme_delete_ctrl_work
Call Trace:
__schedule+0x851/0x1400
schedule+0x99/0x210
schedule_timeout+0x390/0x830
wait_for_completion+0x1a7/0x310
__flush_work+0x241/0x5d0
flush_work+0x10/0x20
nvme_remove_namespaces+0x85/0x3d0
nvme_do_delete_ctrl+0xb4/0x1e0
nvme_delete_ctrl_work+0x15/0x20
process_one_work+0x7d7/0x1240
worker_thread+0x8e/0xff0
kthread+0x2c3/0x3b0
ret_from_fork+0x35/0x40
--
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Sagi Grimberg [Fri, 26 Jul 2019 17:29:49 +0000 (10:29 -0700)]
nvme-rdma: fix possible use-after-free in connect error flow
When start_queue fails, we need to make sure to drain the
queue cq before freeing the rdma resources because we might
still race with the completion path. Have start_queue() error
path safely stop the queue.
--
[30371.808111] nvme nvme1: Failed reconnect attempt 11
[30371.808113] nvme nvme1: Reconnecting in 10 seconds...
[...]
[30382.069315] nvme nvme1: creating 4 I/O queues.
[30382.257058] nvme nvme1: Connect Invalid SQE Parameter, qid 4
[30382.257061] nvme nvme1: failed to connect queue: 4 ret=386
[30382.305001] BUG: unable to handle kernel NULL pointer dereference at
0000000000000018
[30382.305022] IP: qedr_poll_cq+0x8a3/0x1170 [qedr]
[30382.305028] PGD 0 P4D 0
[30382.305037] Oops: 0000 [#1] SMP PTI
[...]
[30382.305153] Call Trace:
[30382.305166] ? __switch_to_asm+0x34/0x70
[30382.305187] __ib_process_cq+0x56/0xd0 [ib_core]
[30382.305201] ib_poll_handler+0x26/0x70 [ib_core]
[30382.305213] irq_poll_softirq+0x88/0x110
[30382.305223] ? sort_range+0x20/0x20
[30382.305232] __do_softirq+0xde/0x2c6
[30382.305241] ? sort_range+0x20/0x20
[30382.305249] run_ksoftirqd+0x1c/0x60
[30382.305258] smpboot_thread_fn+0xef/0x160
[30382.305265] kthread+0x113/0x130
[30382.305273] ? kthread_create_worker_on_cpu+0x50/0x50
[30382.305281] ret_from_fork+0x35/0x40
--
Reported-by: Nicolas Morey-Chaisemartin <NMoreyChaisemartin@suse.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Sagi Grimberg [Wed, 31 Jul 2019 18:00:26 +0000 (11:00 -0700)]
nvme: fix a possible deadlock when passthru commands sent to a multipath device
When the user issues a command with side effects, we will end up freezing
the namespace request queue when updating disk info (and the same for
the corresponding mpath disk node).
However, we are not freezing the mpath node request queue,
which means that mpath I/O can still come in and block on blk_queue_enter
(called from nvme_ns_head_make_request -> direct_make_request).
This is a deadlock, because blk_queue_enter will block until the inner
namespace request queue is unfroze, but that process is blocked because
the namespace revalidation is trying to update the mpath disk info
and freeze its request queue (which will never complete because
of the I/O that is blocked on blk_queue_enter).
Fix this by freezing all the subsystem nsheads request queues before
executing the passthru command. Given that these commands are infrequent
we should not worry about this temporary I/O freeze to keep things sane.
Here is the matching hang traces:
--
[ 374.465002] INFO: task systemd-udevd:17994 blocked for more than 122 seconds.
[ 374.472975] Not tainted 5.2.0-rc3-mpdebug+ #42
[ 374.478522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 374.487274] systemd-udevd D 0 17994 1 0x00000000
[ 374.493407] Call Trace:
[ 374.496145] __schedule+0x2ef/0x620
[ 374.500047] schedule+0x38/0xa0
[ 374.503569] blk_queue_enter+0x139/0x220
[ 374.507959] ? remove_wait_queue+0x60/0x60
[ 374.512540] direct_make_request+0x60/0x130
[ 374.517219] nvme_ns_head_make_request+0x11d/0x420 [nvme_core]
[ 374.523740] ? generic_make_request_checks+0x307/0x6f0
[ 374.529484] generic_make_request+0x10d/0x2e0
[ 374.534356] submit_bio+0x75/0x140
[ 374.538163] ? guard_bio_eod+0x32/0xe0
[ 374.542361] submit_bh_wbc+0x171/0x1b0
[ 374.546553] block_read_full_page+0x1ed/0x330
[ 374.551426] ? check_disk_change+0x70/0x70
[ 374.556008] ? scan_shadow_nodes+0x30/0x30
[ 374.560588] blkdev_readpage+0x18/0x20
[ 374.564783] do_read_cache_page+0x301/0x860
[ 374.569463] ? blkdev_writepages+0x10/0x10
[ 374.574037] ? prep_new_page+0x88/0x130
[ 374.578329] ? get_page_from_freelist+0xa2f/0x1280
[ 374.583688] ? __alloc_pages_nodemask+0x179/0x320
[ 374.588947] read_cache_page+0x12/0x20
[ 374.593142] read_dev_sector+0x2d/0xd0
[ 374.597337] read_lba+0x104/0x1f0
[ 374.601046] find_valid_gpt+0xfa/0x720
[ 374.605243] ? string_nocheck+0x58/0x70
[ 374.609534] ? find_valid_gpt+0x720/0x720
[ 374.614016] efi_partition+0x89/0x430
[ 374.618113] ? string+0x48/0x60
[ 374.621632] ? snprintf+0x49/0x70
[ 374.625339] ? find_valid_gpt+0x720/0x720
[ 374.629828] check_partition+0x116/0x210
[ 374.634214] rescan_partitions+0xb6/0x360
[ 374.638699] __blkdev_reread_part+0x64/0x70
[ 374.643377] blkdev_reread_part+0x23/0x40
[ 374.647860] blkdev_ioctl+0x48c/0x990
[ 374.651956] block_ioctl+0x41/0x50
[ 374.655766] do_vfs_ioctl+0xa7/0x600
[ 374.659766] ? locks_lock_inode_wait+0xb1/0x150
[ 374.664832] ksys_ioctl+0x67/0x90
[ 374.668539] __x64_sys_ioctl+0x1a/0x20
[ 374.672732] do_syscall_64+0x5a/0x1c0
[ 374.676828] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 374.738474] INFO: task nvmeadm:49141 blocked for more than 123 seconds.
[ 374.745871] Not tainted 5.2.0-rc3-mpdebug+ #42
[ 374.751419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 374.760170] nvmeadm D 0 49141 36333 0x00004080
[ 374.766301] Call Trace:
[ 374.769038] __schedule+0x2ef/0x620
[ 374.772939] schedule+0x38/0xa0
[ 374.776452] blk_mq_freeze_queue_wait+0x59/0x100
[ 374.781614] ? remove_wait_queue+0x60/0x60
[ 374.786192] blk_mq_freeze_queue+0x1a/0x20
[ 374.790773] nvme_update_disk_info.isra.57+0x5f/0x350 [nvme_core]
[ 374.797582] ? nvme_identify_ns.isra.50+0x71/0xc0 [nvme_core]
[ 374.804006] __nvme_revalidate_disk+0xe5/0x110 [nvme_core]
[ 374.810139] nvme_revalidate_disk+0xa6/0x120 [nvme_core]
[ 374.816078] ? nvme_submit_user_cmd+0x11e/0x320 [nvme_core]
[ 374.822299] nvme_user_cmd+0x264/0x370 [nvme_core]
[ 374.827661] nvme_dev_ioctl+0x112/0x1d0 [nvme_core]
[ 374.833114] do_vfs_ioctl+0xa7/0x600
[ 374.837117] ? __audit_syscall_entry+0xdd/0x130
[ 374.842184] ksys_ioctl+0x67/0x90
[ 374.845891] __x64_sys_ioctl+0x1a/0x20
[ 374.850082] do_syscall_64+0x5a/0x1c0
[ 374.854178] entry_SYSCALL_64_after_hwframe+0x44/0xa9
--
Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Tested-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Logan Gunthorpe [Wed, 31 Jul 2019 23:35:34 +0000 (17:35 -0600)]
nvme-core: Fix extra device_put() call on error path
In the error path for nvme_init_subsystem(), nvme_put_subsystem()
will call device_put(), but it will get called again after the
mutex_unlock().
The device_put() only needs to be called if device_add() fails.
This bug caused a KASAN use-after-free error when adding and
removing subsytems in a loop:
BUG: KASAN: use-after-free in device_del+0x8d9/0x9a0
Read of size 8 at addr
ffff8883cdaf7120 by task multipathd/329
CPU: 0 PID: 329 Comm: multipathd Not tainted
5.2.0-rc6-vmlocalyes-00019-g70a2b39005fd-dirty #314
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x7b/0xb5
print_address_description+0x6f/0x280
? device_del+0x8d9/0x9a0
__kasan_report+0x148/0x199
? device_del+0x8d9/0x9a0
? class_release+0x100/0x130
? device_del+0x8d9/0x9a0
kasan_report+0x12/0x20
__asan_report_load8_noabort+0x14/0x20
device_del+0x8d9/0x9a0
? device_platform_notify+0x70/0x70
nvme_destroy_subsystem+0xf9/0x150
nvme_free_ctrl+0x280/0x3a0
device_release+0x72/0x1d0
kobject_put+0x144/0x410
put_device+0x13/0x20
nvme_free_ns+0xc4/0x100
nvme_release+0xb3/0xe0
__blkdev_put+0x549/0x6e0
? kasan_check_write+0x14/0x20
? bd_set_size+0xb0/0xb0
? kasan_check_write+0x14/0x20
? mutex_lock+0x8f/0xe0
? __mutex_lock_slowpath+0x20/0x20
? locks_remove_file+0x239/0x370
blkdev_put+0x72/0x2c0
blkdev_close+0x8d/0xd0
__fput+0x256/0x770
? _raw_read_lock_irq+0x40/0x40
____fput+0xe/0x10
task_work_run+0x10c/0x180
? filp_close+0xf7/0x140
exit_to_usermode_loop+0x151/0x170
do_syscall_64+0x240/0x2e0
? prepare_exit_to_usermode+0xd5/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f5a79af05d7
Code: 00 00 0f 05 48 3d 00 f0 ff ff 77 3f c3 66 0f 1f 44 00 00 53 89 fb 48 83 ec 10 e8 c4 fb ff ff 89 df 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8 06 fc ff ff 8b 44 24
RSP: 002b:
00007f5a7799c810 EFLAGS:
00000293 ORIG_RAX:
0000000000000003
RAX:
0000000000000000 RBX:
0000000000000008 RCX:
00007f5a79af05d7
RDX:
0000000000000000 RSI:
0000000000000001 RDI:
0000000000000008
RBP:
00007f5a58000f98 R08:
0000000000000002 R09:
00007f5a7935ee80
R10:
0000000000000000 R11:
0000000000000293 R12:
000055e432447240
R13:
0000000000000000 R14:
0000000000000001 R15:
000055e4324a9cf0
Allocated by task 1236:
save_stack+0x21/0x80
__kasan_kmalloc.constprop.6+0xab/0xe0
kasan_kmalloc+0x9/0x10
kmem_cache_alloc_trace+0x102/0x210
nvme_init_identify+0x13c3/0x3820
nvme_loop_configure_admin_queue+0x4fa/0x5e0
nvme_loop_create_ctrl+0x469/0xf40
nvmf_dev_write+0x19a3/0x21ab
__vfs_write+0x66/0x120
vfs_write+0x154/0x490
ksys_write+0x104/0x240
__x64_sys_write+0x73/0xb0
do_syscall_64+0xa5/0x2e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 329:
save_stack+0x21/0x80
__kasan_slab_free+0x129/0x190
kasan_slab_free+0xe/0x10
kfree+0xa7/0x200
nvme_release_subsystem+0x49/0x60
device_release+0x72/0x1d0
kobject_put+0x144/0x410
put_device+0x13/0x20
klist_class_dev_put+0x31/0x40
klist_put+0x8f/0xf0
klist_del+0xe/0x10
device_del+0x3a7/0x9a0
nvme_destroy_subsystem+0xf9/0x150
nvme_free_ctrl+0x280/0x3a0
device_release+0x72/0x1d0
kobject_put+0x144/0x410
put_device+0x13/0x20
nvme_free_ns+0xc4/0x100
nvme_release+0xb3/0xe0
__blkdev_put+0x549/0x6e0
blkdev_put+0x72/0x2c0
blkdev_close+0x8d/0xd0
__fput+0x256/0x770
____fput+0xe/0x10
task_work_run+0x10c/0x180
exit_to_usermode_loop+0x151/0x170
do_syscall_64+0x240/0x2e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 32fd90c40768 ("nvme: change locking for the per-subsystem controller list")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Logan Gunthorpe [Wed, 31 Jul 2019 23:35:33 +0000 (17:35 -0600)]
nvmet-file: fix nvmet_file_flush() always returning an error
Presently, nvmet_file_flush() always returns a call to
errno_to_nvme_status() but that helper doesn't take into account the
case when errno=0. So nvmet_file_flush() always returns an error code.
All other callers of errno_to_nvme_status() check for success before
calling it.
To fix this, ensure errno_to_nvme_status() returns success if the
errno is zero. This should prevent future mistakes like this from
happening.
Fixes: c6aa3542e010 ("nvmet: add error log support for file backend")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Logan Gunthorpe [Wed, 31 Jul 2019 23:35:32 +0000 (17:35 -0600)]
nvmet-loop: Flush nvme_delete_wq when removing the port
After calling nvme_loop_delete_ctrl(), the controllers will not
yet be deleted because nvme_delete_ctrl() only schedules work
to do the delete.
This means a race can occur if a port is removed but there
are still active controllers trying to access that memory.
To fix this, flush the nvme_delete_wq before returning from
nvme_loop_remove_port() so that any controllers that might
be in the process of being deleted won't access a freed port.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Logan Gunthorpe [Wed, 31 Jul 2019 23:35:31 +0000 (17:35 -0600)]
nvmet: Fix use-after-free bug when a port is removed
When a port is removed through configfs, any connected controllers
are still active and can still send commands. This causes a
use-after-free bug which is detected by KASAN for any admin command
that dereferences req->port (like in nvmet_execute_identify_ctrl).
To fix this, disconnect all active controllers when a subsystem is
removed from a port. This ensures there are no active controllers
when the port is eventually removed.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Denis Efremov [Wed, 31 Jul 2019 14:53:42 +0000 (08:53 -0600)]
MAINTAINERS: floppy: take over maintainership
I would like to maintain the floppy driver. After the recent fixes,
I think I know the code pretty well. Nowadays I've got 2 physical 3.5"
readers to test all the changes.
Signed-off-by: Denis Efremov <efremov@linux.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Munehisa Kamata [Wed, 31 Jul 2019 12:13:10 +0000 (20:13 +0800)]
nbd: replace kill_bdev() with __invalidate_device() again
Commit
abbbdf12497d ("replace kill_bdev() with __invalidate_device()")
once did this, but
29eaadc03649 ("nbd: stop using the bdev everywhere")
resurrected kill_bdev() and it has been there since then. So buffer_head
mappings still get killed on a server disconnection, and we can still
hit the BUG_ON on a filesystem on the top of the nbd device.
EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
block nbd0: Receive control failed (result -32)
block nbd0: shutting down sockets
print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
------------[ cut here ]------------
kernel BUG at fs/buffer.c:3057!
invalid opcode: 0000 [#1] SMP PTI
CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
RIP: 0010:submit_bh_wbc+0x18b/0x190
...
Call Trace:
jbd2_write_superblock+0xf1/0x230 [jbd2]
? account_entity_enqueue+0xc5/0xf0
jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
? __switch_to_asm+0x40/0x70
...
? lock_timer_base+0x67/0x80
kjournald2+0x121/0x360 [jbd2]
? remove_wait_queue+0x60/0x60
kthread+0xf8/0x130
? commit_timeout+0x10/0x10 [jbd2]
? kthread_bind+0x10/0x10
ret_from_fork+0x35/0x40
With __invalidate_device(), I no longer hit the BUG_ON with sync or
unmount on the disconnected device.
Fixes: 29eaadc03649 ("nbd: stop using the bdev everywhere")
Cc: linux-block@vger.kernel.org
Cc: Ratna Manoj Bolla <manoj.br@gmail.com>
Cc: nbd@other.debian.org
Cc: stable@vger.kernel.org
Cc: David Woodhouse <dwmw@amazon.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Miquel Raynal [Wed, 31 Jul 2019 12:26:51 +0000 (14:26 +0200)]
ata: libahci: do not complain in case of deferred probe
Retrieving PHYs can defer the probe, do not spawn an error when
-EPROBE_DEFER is returned, it is normal behavior.
Fixes: b1a9edbda040 ("ata: libahci: allow to use multiple PHYs")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jackie Liu [Wed, 31 Jul 2019 06:39:33 +0000 (14:39 +0800)]
io_uring: fix KASAN use after free in io_sq_wq_submit_work
[root@localhost ~]# ./liburing/test/link
QEMU Standard PC report that:
[ 29.379892] CPU: 0 PID: 84 Comm: kworker/u2:2 Not tainted
5.3.0-rc2-00051-g4010b622f1d2-dirty #86
[ 29.379902] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 29.379913] Workqueue: io_ring-wq io_sq_wq_submit_work
[ 29.379929] Call Trace:
[ 29.379953] dump_stack+0xa9/0x10e
[ 29.379970] ? io_sq_wq_submit_work+0xbf4/0xe90
[ 29.379986] print_address_description.cold.6+0x9/0x317
[ 29.379999] ? io_sq_wq_submit_work+0xbf4/0xe90
[ 29.380010] ? io_sq_wq_submit_work+0xbf4/0xe90
[ 29.380026] __kasan_report.cold.7+0x1a/0x34
[ 29.380044] ? io_sq_wq_submit_work+0xbf4/0xe90
[ 29.380061] kasan_report+0xe/0x12
[ 29.380076] io_sq_wq_submit_work+0xbf4/0xe90
[ 29.380104] ? io_sq_thread+0xaf0/0xaf0
[ 29.380152] process_one_work+0xb59/0x19e0
[ 29.380184] ? pwq_dec_nr_in_flight+0x2c0/0x2c0
[ 29.380221] worker_thread+0x8c/0xf40
[ 29.380248] ? __kthread_parkme+0xab/0x110
[ 29.380265] ? process_one_work+0x19e0/0x19e0
[ 29.380278] kthread+0x30b/0x3d0
[ 29.380292] ? kthread_create_on_node+0xe0/0xe0
[ 29.380311] ret_from_fork+0x3a/0x50
[ 29.380635] Allocated by task 209:
[ 29.381255] save_stack+0x19/0x80
[ 29.381268] __kasan_kmalloc.constprop.6+0xc1/0xd0
[ 29.381279] kmem_cache_alloc+0xc0/0x240
[ 29.381289] io_submit_sqe+0x11bc/0x1c70
[ 29.381300] io_ring_submit+0x174/0x3c0
[ 29.381311] __x64_sys_io_uring_enter+0x601/0x780
[ 29.381322] do_syscall_64+0x9f/0x4d0
[ 29.381336] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 29.381633] Freed by task 84:
[ 29.382186] save_stack+0x19/0x80
[ 29.382198] __kasan_slab_free+0x11d/0x160
[ 29.382210] kmem_cache_free+0x8c/0x2f0
[ 29.382220] io_put_req+0x22/0x30
[ 29.382230] io_sq_wq_submit_work+0x28b/0xe90
[ 29.382241] process_one_work+0xb59/0x19e0
[ 29.382251] worker_thread+0x8c/0xf40
[ 29.382262] kthread+0x30b/0x3d0
[ 29.382272] ret_from_fork+0x3a/0x50
[ 29.382569] The buggy address belongs to the object at
ffff888067172140
which belongs to the cache io_kiocb of size 224
[ 29.384692] The buggy address is located 120 bytes inside of
224-byte region [
ffff888067172140,
ffff888067172220)
[ 29.386723] The buggy address belongs to the page:
[ 29.387575] page:
ffffea00019c5c80 refcount:1 mapcount:0 mapping:
ffff88806ace5180 index:0x0
[ 29.387587] flags: 0x100000000000200(slab)
[ 29.387603] raw:
0100000000000200 dead000000000100 dead000000000122 ffff88806ace5180
[ 29.387617] raw:
0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
[ 29.387624] page dumped because: kasan: bad access detected
[ 29.387920] Memory state around the buggy address:
[ 29.388771]
ffff888067172080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[ 29.390062]
ffff888067172100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 29.391325] >
ffff888067172180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 29.392578] ^
[ 29.393480]
ffff888067172200: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
[ 29.394744]
ffff888067172280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 29.396003] ==================================================================
[ 29.397260] Disabling lock debugging due to kernel taint
io_sq_wq_submit_work free and read req again.
Cc: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
Cc: linux-block@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: f7b76ac9d17e ("io_uring: fix counter inc/dec mismatch in async_list")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Kara [Tue, 30 Jul 2019 11:10:14 +0000 (13:10 +0200)]
loop: Fix mount(2) failure due to race with LOOP_SET_FD
Commit
33ec3e53e7b1 ("loop: Don't change loop device under exclusive
opener") made LOOP_SET_FD ioctl acquire exclusive block device reference
while it updates loop device binding. However this can make perfectly
valid mount(2) fail with EBUSY due to racing LOOP_SET_FD holding
temporarily the exclusive bdev reference in cases like this:
for i in {a..z}{a..z}; do
dd if=/dev/zero of=$i.image bs=1k count=0 seek=1024
mkfs.ext2 $i.image
mkdir mnt$i
done
echo "Run"
for i in {a..z}{a..z}; do
mount -o loop -t ext2 $i.image mnt$i &
done
Fix the problem by not getting full exclusive bdev reference in
LOOP_SET_FD but instead just mark the bdev as being claimed while we
update the binding information. This just blocks new exclusive openers
instead of failing them with EBUSY thus fixing the problem.
Fixes: 33ec3e53e7b1 ("loop: Don't change loop device under exclusive opener")
Cc: stable@vger.kernel.org
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Anthony Iliopoulos [Mon, 29 Jul 2019 12:40:40 +0000 (14:40 +0200)]
nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
When CONFIG_NVME_MULTIPATH is set, only the hidden gendisk associated
with the per-controller ns is run through revalidate_disk when a
rescan is triggered, while the visible blockdev never gets its size
(bdev->bd_inode->i_size) updated to reflect any capacity changes that
may have occurred.
This prevents online resizing of nvme block devices and in extension of
any filesystems atop that will are unable to expand while mounted, as
userspace relies on the blockdev size for obtaining the disk capacity
(via BLKGETSIZE/64 ioctls).
Fix this by explicitly revalidating the actual namespace gendisk in
addition to the per-controller gendisk, when multipath is enabled.
Signed-off-by: Anthony Iliopoulos <ailiopoulos@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Kees Cook [Mon, 29 Jul 2019 21:47:22 +0000 (14:47 -0700)]
libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
Jeffrin reported a KASAN issue:
BUG: KASAN: global-out-of-bounds in ata_exec_internal_sg+0x50f/0xc70
Read of size 16 at addr
ffffffff91f41f80 by task scsi_eh_1/149
...
The buggy address belongs to the variable:
cdb.48319+0x0/0x40
Much like commit
18c9a99bce2a ("libata: zpodd: small read overflow in
eject_tray()"), this fixes a cdb[] buffer length, this time in
zpodd_get_mech_type():
We read from the cdb[] buffer in ata_exec_internal_sg(). It has to be
ATAPI_CDB_LEN (16) bytes long, but this buffer is only 12 bytes.
Reported-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Fixes: afe759511808c ("libata: identify and init ZPODD devices")
Link: https://lore.kernel.org/lkml/201907181423.E808958@keescook/
Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Gustavo A. R. Silva [Mon, 29 Jul 2019 21:10:53 +0000 (16:10 -0500)]
ataflop: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
This patch fixes the following warning (Building: m68k):
drivers/block/ataflop.c: In function ‘fd_locked_ioctl’:
drivers/block/ataflop.c:1728:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
set_capacity(floppy->disk, MAX_DISK_SIZE * 2);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/block/ataflop.c:1729:2: note: here
case FDFMTEND:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 28 Jul 2019 19:47:02 +0000 (12:47 -0700)]
Linux 5.3-rc2
Linus Torvalds [Sun, 28 Jul 2019 19:33:15 +0000 (12:33 -0700)]
Merge tag 'meminit-v5.3-rc2' of git://git./linux/kernel/git/kees/linux
Pull structleak fix from Kees Cook:
"Disable gcc-based stack variable auto-init under KASAN (Arnd
Bergmann).
This fixes a bunch of build warnings under KASAN and the
gcc-plugin-based stack auto-initialization features (which are
arguably redundant, so better to let KASAN control this)"
* tag 'meminit-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
structleak: disable STRUCTLEAK_BYREF in combination with KASAN_STACK
Linus Torvalds [Sun, 28 Jul 2019 17:35:04 +0000 (10:35 -0700)]
Merge tag 'kbuild-fixes-v5.3' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- add compile_commands.json to .gitignore
- fix false-positive warning from gen_compile_commands.py after
allnoconfig build
- remove unused code
* tag 'kbuild-fixes-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: remove unused single-used-m
gen_compile_commands: lower the entry count threshold
.gitignore: Add compilation database file
kbuild: remove unused objectify macro
Linus Torvalds [Sun, 28 Jul 2019 17:26:10 +0000 (10:26 -0700)]
Merge tag 'char-misc-5.3-rc2' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes for 5.3-rc2 to resolve
some reported issues.
Nothing major at all, some binder bugfixes for issues found, some new
mei device ids, firmware building warning fixes, habanalabs fixes, a
few other build fixes, and a MAINTAINERS update.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
test_firmware: fix a memory leak bug
hpet: Fix division by zero in hpet_time_div()
eeprom: make older eeprom drivers select NVMEM_SYSFS
vmw_balloon: Remove Julien from the maintainers list
fpga-manager: altera-ps-spi: Fix build error
mei: me: add mule creek canyon (EHL) device ids
binder: prevent transactions to context manager from its own process.
binder: Set end of SG buffer area properly.
firmware: Fix missing inline
firmware: fix build errors in paged buffer handling code
habanalabs: don't reset device when getting VRHOT
habanalabs: use %pad for printing a dma_addr_t
Linus Torvalds [Sun, 28 Jul 2019 17:18:33 +0000 (10:18 -0700)]
Merge tag 'tty-5.3-rc2' of git://git./linux/kernel/git/gregkh/tty
Pull tty fixes from Greg KH:
"Here are two tty/vt fixes:
- delete the netx-serial driver as the arch has been removed, no need
to keep the serial driver for it around either.
- vt console_lock fix to resolve a reported noisy warning at runtime
Both of these have been in linux-next with no reported issues"
* tag 'tty-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: Grab console_lock around con_is_bound in show_bind
tty: serial: netx: Delete driver
Linus Torvalds [Sun, 28 Jul 2019 17:00:06 +0000 (10:00 -0700)]
Merge tag 'spdx-5.3-rc2' of git://git./linux/kernel/git/gregkh/spdx
Pull SPDX fixes from Greg KH:
"Here are some small SPDX fixes for 5.3-rc2 for things that came in
during the 5.3-rc1 merge window that we previously missed.
Only three small patches here:
- two uapi patches to resolve some SPDX tags that were not correct
- fix an invalid SPDX tag in the iomap Makefile file
All have been properly reviewed on the public mailing lists"
* tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
iomap: fix Invalid License ID
treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again
treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headers
Linus Torvalds [Sun, 28 Jul 2019 16:52:35 +0000 (09:52 -0700)]
Merge tag 'usb-5.3-rc2' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small fixes for 5.3-rc2. All of these resolve some
reported issues, some more than others :)
Included in here is:
- xhci fix for an annoying issue with odd devices
- reversion of some usb251xb patches that should not have been merged
- usb pci quirk additions and fixups
- usb storage fix
- usb host controller error test fix
All of these have been in linux-next with no reported issues"
* tag 'usb-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: Fix crash if scatter gather is used with Immediate Data Transfer (IDT).
usb: usb251xb: Reallow swap-dx-lanes to apply to the upstream port
Revert "usb: usb251xb: Add US port lanes inversion property"
Revert "usb: usb251xb: Add US lanes inversion dts-bindings"
usb: wusbcore: fix unbalanced get/put cluster_id
usb/hcd: Fix a NULL vs IS_ERR() bug in usb_hcd_setup_local_mem()
usb-storage: Add a limitation for blk_queue_max_hw_sectors()
usb: pci-quirks: Minor cleanup for AMD PLL quirk
usb: pci-quirks: Correct AMD PLL quirk detection
Linus Torvalds [Sun, 28 Jul 2019 16:38:55 +0000 (09:38 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"Here's the first batch of fixes for this release cycle.
Main diffstat here is the re-deletion of netx. I messed up and most
likely didn't remove the files from the index when I test-merged this
and saw conflicts, and from there on out 'git rerere' remembered the
mistake and I missed checking it. Here it's done again as expected.
Besides that:
- A defconfig refresh + enabling of new drivers for u8500
- i.MX fixlets for i2c/SAI/pinmux
- sleep.S build fix for Davinci
- Broadcom devicetree build/warning fix"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: defconfig: u8500: Add new drivers
ARM: defconfig: u8500: Refresh defconfig
ARM: dts: bcm: bcm47094: add missing #cells for mdio-bus-mux
ARM: davinci: fix sleep.S build error on ARMv4
arm64: dts: imx8mq: fix SAI compatible
arm64: dts: imx8mm: Correct SAI3 RXC/TXFS pin's mux option #1
ARM: dts: imx6ul: fix clock frequency property name of I2C buses
ARM: Delete netx a second time
ARM: dts: imx7ulp: Fix usb-phy unit address format
Linus Torvalds [Sun, 28 Jul 2019 04:46:43 +0000 (21:46 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes and functional updates:
- Prevent stale huge I/O TLB mappings on 32bit. A long standing bug
which got exposed by KPTI support for 32bit
- Prevent bogus access_ok() warnings in arch_stack_walk_user()
- Add display quirks for Lenovo devices which have height and width
swapped
- Add the missing CR2 fixup for 32 bit async pagefaults. Fallout of
the CR2 bug fix series.
- Unbreak handling of force enabled HPET by moving the 'is HPET
counting' check back to the original place.
- A more accurate check for running on a hypervisor platform in the
MDS mitigation code. Not perfect, but more accurate than the
previous one.
- Update a stale and confusing comment vs. IRQ stacks"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation/mds: Apply more accurate check on hypervisor platform
x86/hpet: Undo the early counter is counting check
x86/entry/32: Pass cr2 to do_async_page_fault()
x86/irq/64: Update stale comment
x86/sysfb_efi: Add quirks for some devices with swapped width and height
x86/stacktrace: Prevent access_ok() warnings in arch_stack_walk_user()
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
x86/mm: Sync also unmappings in vmalloc_sync_all()
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
Linus Torvalds [Sun, 28 Jul 2019 04:22:33 +0000 (21:22 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Two fixes for the fair scheduling class:
- Prevent freeing memory which is accessible by concurrent readers
- Make the RCU annotations for numa groups consistent"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Use RCU accessors consistently for ->numa_group
sched/fair: Don't free p->numa_faults with concurrent readers
Linus Torvalds [Sun, 28 Jul 2019 04:17:56 +0000 (21:17 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A pile of perf related fixes:
Kernel:
- Fix SLOTS PEBS event constraints for Icelake CPUs
- Add the missing mask bit to allow counting hardware generated
prefetches on L3 for Icelake CPUs
- Make the test for hypervisor platforms more accurate (as far as
possible)
- Handle PMUs correctly which override event->cpu
- Yet another missing fallthrough annotation
Tools:
perf.data:
- Fix loading of compressed data split across adjacent records
- Fix buffer size setting for processing CPU topology perf.data
header.
perf stat:
- Fix segfault for event group in repeat mode
- Always separate "stalled cycles per insn" line, it was being
appended to the "instructions" line.
perf script:
- Fix --max-blocks man page description.
- Improve man page description of metrics.
- Fix off by one in brstackinsn IPC computation.
perf probe:
- Avoid calling freeing routine multiple times for same pointer.
perf build:
- Do not use -Wshadow on gcc < 4.8, avoiding too strict warnings
treated as errors, breaking the build"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Mark expected switch fall-throughs
perf/core: Fix creating kernel counters for PMUs that override event->cpu
perf/x86: Apply more accurate check on hypervisor platform
perf/x86/intel: Fix invalid Bit 13 for Icelake MSR_OFFCORE_RSP_x register
perf/x86/intel: Fix SLOTS PEBS event constraint
perf build: Do not use -Wshadow on gcc < 4.8
perf probe: Avoid calling freeing routine multiple times for same pointer
perf probe: Set pev->nargs to zero after freeing pev->args entries
perf session: Fix loading of compressed data split across adjacent records
perf stat: Always separate stalled cycles per insn
perf stat: Fix segfault for event group in repeat mode
perf tools: Fix proper buffer size for feature processing
perf script: Fix off by one in brstackinsn IPC computation
perf script: Improve man page description of metrics
perf script: Fix --max-blocks man page description
Linus Torvalds [Sun, 28 Jul 2019 04:10:26 +0000 (21:10 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A set of locking fixes:
- Address the fallout of the rwsem rework. Missing ACQUIREs and a
sanity check to prevent a use-after-free
- Add missing checks for unitialized mutexes when mutex debugging is
enabled.
- Remove the bogus code in the generic SMP variant of
arch_futex_atomic_op_inuser()
- Fixup the #ifdeffery in lockdep to prevent compile warnings"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/mutex: Test for initialized mutex
locking/lockdep: Clean up #ifdef checks
locking/lockdep: Hide unused 'class' variable
locking/rwsem: Add ACQUIRE comments
tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
lcoking/rwsem: Add missing ACQUIRE to read_slowpath sleep loop
locking/rwsem: Add missing ACQUIRE to read_slowpath exit when queue is empty
locking/rwsem: Don't call owner_on_cpu() on read-owner
futex: Cleanup generic SMP variant of arch_futex_atomic_op_inuser()
Linus Torvalds [Sun, 28 Jul 2019 03:49:43 +0000 (20:49 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull objtool fix from Thomas Gleixner:
"A single robustness fix for objtool to handle unbalanced CLAC
invocations under all circumstances"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Improve UACCESS coverage
Linus Torvalds [Sat, 27 Jul 2019 18:04:18 +0000 (11:04 -0700)]
Merge tag 'Wimplicit-fallthrough-5.3-rc2' of git://git./linux/kernel/git/gustavoars/linux
Pull Wimplicit-fallthrough enablement from Gustavo A. R. Silva:
"This marks switch cases where we are expecting to fall through, and
globally enables the -Wimplicit-fallthrough option in the main
Makefile.
Finally, some missing-break fixes that have been tagged for -stable:
- drm/amdkfd: Fix missing break in switch statement
- drm/amdgpu/gfx10: Fix missing break in switch statement
With these changes, we completely get rid of all the fall-through
warnings in the kernel"
* tag 'Wimplicit-fallthrough-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
Makefile: Globally enable fall-through warning
drm/i915: Mark expected switch fall-throughs
drm/amd/display: Mark expected switch fall-throughs
drm/amdkfd/kfd_mqd_manager_v10: Avoid fall-through warning
drm/amdgpu/gfx10: Fix missing break in switch statement
drm/amdkfd: Fix missing break in switch statement
perf/x86/intel: Mark expected switch fall-throughs
mtd: onenand_base: Mark expected switch fall-through
afs: fsclient: Mark expected switch fall-throughs
afs: yfsclient: Mark expected switch fall-throughs
can: mark expected switch fall-throughs
firewire: mark expected switch fall-throughs
Linus Torvalds [Sat, 27 Jul 2019 15:58:04 +0000 (08:58 -0700)]
Merge tag 's390-5.3-3' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Add ABI to kernel image file which allows e.g. the file utility to
figure out the kernel version.
- Wire up clone3 system call.
- Add support for kasan bitops instrumentation.
- uapi header cleanup: use __u{16,32,64} instead of uint{16,32,64}_t.
- Provide proper ARCH_ZONE_DMA_BITS so the s390 DMA zone is correctly
defined with 2 GB instead of the default value of 1 MB.
- Farhan Ali leaves the group of vfio-ccw maintainers.
- Various small vfio-ccw fixes.
- Add missing locking for airq_areas array in virtio code.
- Minor qdio improvements.
* tag 's390-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
MAINTAINERS: vfio-ccw: Remove myself as the maintainer
s390/mm: use shared variables for sysctl range check
virtio/s390: fix race on airq_areas[]
s390/dma: provide proper ARCH_ZONE_DMA_BITS value
s390/kasan: add bitops instrumentation
s390/bitops: make test functions return bool
s390: wire up clone3 system call
kbuild: enable arch/s390/include/uapi/asm/zcrypt.h for uapi header test
s390: use __u{16,32,64} instead of uint{16,32,64}_t in uapi header
s390/hypfs: fix a typo in the name of a function
s390/qdio: restrict QAOB usage to IQD unicast queues
s390/qdio: add sanity checks to the fast-requeue path
s390: enable detection of kernel version from bzImage
Documentation: fix vfio-ccw doc
vfio-ccw: Update documentation for csch/hsch
vfio-ccw: Don't call cp_free if we are processing a channel program
vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn
vfio-ccw: Fix memory leak and don't call cp_free in cp_init
vfio-ccw: Fix misleading comment when setting orb.cmd.c64
Linus Torvalds [Sat, 27 Jul 2019 15:49:19 +0000 (08:49 -0700)]
Merge tag 'devicetree-fixes-for-5.3-2' of git://git./linux/kernel/git/robh/linux
Pull Devicetree fixes from Rob Herring:
"The nvmem changes would typically go thru Greg's tree, but they were
missed in the merge window. [ Acked by Greg ]
Summary:
- Fix mismatches in $id values and actual filenames. Now checked by
tools.
- Convert nvmem binding to DT schema
- Fix a typo in of_property_read_bool() kerneldoc
- Remove some redundant description in al-fic interrupt-controller"
* tag 'devicetree-fixes-for-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: Fix more $id value mismatches filenames
dt-bindings: nvmem: SID: Fix the examples node names
dt-bindings: nvmem: Add YAML schemas for the generic NVMEM bindings
of: Fix typo in kerneldoc
dt-bindings: interrupt-controller: al-fic: remove redundant binding
dt-bindings: clk: allwinner,sun4i-a10-ccu: Correct path in $id
Linus Torvalds [Sat, 27 Jul 2019 15:25:51 +0000 (08:25 -0700)]
Merge tag 'libnvdimm-fixes-5.3-rc2' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A collection of locking and async operations fixes for v5.3-rc2. These
had been soaking in a branch targeting the merge window, but missed
due to a regression hunt. This fixed up version has otherwise been in
-next this past week with no reported issues.
In order to gain confidence in the locking changes the pull also
includes a debug / instrumentation patch to enable lockdep coverage
for libnvdimm subsystem operations that depend on the device_lock for
exclusion. As mentioned in the changelog it is a hack, but it works
and documents the locking expectations of the sub-system in a way that
others can use lockdep to verify. The driver core touches got an ack
from Greg.
Summary:
- Fix duplicate device_unregister() calls (multiple threads competing
to do unregister work when scheduling device removal from a sysfs
attribute of the self-same device).
- Fix badblocks registration order bug. Ensure region badblocks are
initialized in advance of namespace registration.
- Fix a deadlock between the bus lock and probe operations.
- Export device-core infrastructure to coordinate async operations
via the device ->dead state.
- Add device-core infrastructure to validate device_lock() usage with
lockdep"
* tag 'libnvdimm-fixes-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
driver-core, libnvdimm: Let device subsystems add local lockdep coverage
libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
libnvdimm/region: Register badblocks before namespaces
libnvdimm/bus: Prevent duplicate device_unregister() calls
drivers/base: Introduce kill_device()
Masahiro Yamada [Fri, 26 Jul 2019 02:17:38 +0000 (11:17 +0900)]
kbuild: remove unused single-used-m
This is unused since commit
9f69a496f100 ("kbuild: split out *.mod out
of {single,multi}-used-m rules").
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Sat, 27 Jul 2019 03:01:10 +0000 (12:01 +0900)]
gen_compile_commands: lower the entry count threshold
Running gen_compile_commands.py after building the kernel with
allnoconfig gave this:
$ ./scripts/gen_compile_commands.py
WARNING: Found 449 entries. Have you compiled the kernel?
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Toru Komatsu [Wed, 24 Jul 2019 00:22:33 +0000 (09:22 +0900)]
.gitignore: Add compilation database file
This file is used by clangd to use language server protocol.
It can be generated at each compile using scripts/gen_compile_commands.py.
Therefore it is different depending on the environment and should be
ignored.
Signed-off-by: Toru Komatsu <k0ma@utam0k.jp>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Tue, 23 Jul 2019 04:11:26 +0000 (13:11 +0900)]
kbuild: remove unused objectify macro
Commit
415008af3219 ("docs-rst: convert lsm from DocBook to ReST")
removed the last users of this macro.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Linus Torvalds [Sat, 27 Jul 2019 02:20:34 +0000 (19:20 -0700)]
Merge tag 'for-linus-
20190726-2' of git://git.kernel.dk/linux-block
Pull block DMA segment fix from Jens Axboe:
"Here's the virtual boundary segment size fix"
* tag 'for-linus-
20190726-2' of git://git.kernel.dk/linux-block:
block: fix max segment size handling in blk_queue_virt_boundary
Linus Torvalds [Sat, 27 Jul 2019 02:13:38 +0000 (19:13 -0700)]
Merge tag 'selinux-pr-
20190726' of git://git./linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small SELinux patch to add some proper bounds/overflow checking
when adding a new sid/secid"
* tag 'selinux-pr-
20190726' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: check sidtab limit before adding a new entry
Rob Herring [Fri, 26 Jul 2019 23:36:52 +0000 (17:36 -0600)]
dt-bindings: Fix more $id value mismatches filenames
The path in the schema '$id' values are wrong. Fix them.
Signed-off-by: Rob Herring <robh@kernel.org>
Maxime Ripard [Wed, 3 Jul 2019 09:54:21 +0000 (11:54 +0200)]
dt-bindings: nvmem: SID: Fix the examples node names
Now that the examples are validated, the examples in the SID binding
generates an error since the node names aren't one of the valid ones.
Let's switch for one that is ok.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Maxime Ripard [Thu, 27 Jun 2019 15:10:37 +0000 (16:10 +0100)]
dt-bindings: nvmem: Add YAML schemas for the generic NVMEM bindings
The nvmem providers and consumers have a bunch of generic properties that
are needed in a device tree. Add a YAML schemas for those.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
[Srini: Changed licence to (GPL-2.0 OR BSD-2-Clause)]
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Thierry Reding [Fri, 26 Jul 2019 10:17:44 +0000 (12:17 +0200)]
of: Fix typo in kerneldoc
"Findfrom" is not a word. Replace the function synopsis by something
that makes sense.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Linus Torvalds [Fri, 26 Jul 2019 21:20:28 +0000 (14:20 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Nine fixes: The most important core one is the dma_max_mapping_size
fix that corrects the boot problem Gunter Roeck was having. A couple
of other driver only fixes are significant, like the cxgbi selector
support addition, the alua 2 second delay and the fdomain build fix"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG
scsi: ibmvfc: fix WARN_ON during event pool release
scsi: fcoe: fix a typo
scsi: megaraid_sas: Make some functions static
scsi: megaraid_sas: fix panic on loading firmware crashdump
scsi: megaraid_sas: fix spelling mistake "megarid_sas" -> "megaraid_sas"
scsi: core: fix the dma_max_mapping_size call
scsi: fdomain: fix building pcmcia front-end
scsi: target: cxgbit: add support for IEEE_8021QAZ_APP_SEL_STREAM selector
Linus Torvalds [Fri, 26 Jul 2019 21:12:54 +0000 (14:12 -0700)]
Merge tag 'drm-fixes-2019-07-26' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"Dave seems to collect an entire streak of things happening, so again
me typing pull summary.
Nothing nefarious here, most of the fixes are for new stuff or things
users won't see. The amd-display patches are a bit different, and very
much look like they should have at least some cc: stable tags. Might
be amd is a bit too comfortable with their internal tree and not
enough looking at upstream. Dave&me are looking into this, in case
something needs rectified with process here.
Also no intel fixes pull, but intel CI is general become rather good,
still I guess expect a notch more for -rc3.
Summary:
amdgpu:
- fixes for (new in 5.3) hw support (vega20, navi)
- disable RAS
- lots of display fixes all over (audio, DSC, dongle, clock mgr)
ttm:
- fix dma_free_attrs calls to appease dma debugging
msm:
- fixes for dma-api, locking debug and compiler splats
core:
- fix cmdline mode to not apply rotation if not specified (new in 5.3)
- compiler warn fix"
* tag 'drm-fixes-2019-07-26' of git://anongit.freedesktop.org/drm/drm: (46 commits)
drm/amd/display: Set enabled to false at start of audio disable
drm/amdgpu/smu: move fan rpm query into the asic specific code
drm/amd/powerplay: custom peak clock freq for navi10
drm: silence variable 'conn' set but not used
drm/msm: stop abusing dma_map/unmap for cache
drm/msm/dpu: Correct dpu encoder spinlock initialization
drm/msm: correct NULL pointer dereference in context_init
drm/amd/display: handle active dongle port type is DP++ or DP case
drm/amd/display: do not read link setting if edp not connected
drm/amd/display: Increase size of audios array
drm/amd/display: drop ASSERT() if eDP panel is not connected
drm/amd/display: Only enable audio if speaker allocation exists
drm/amd/display: Fix dc_create failure handling and 666 color depths
drm/amd/display: allocate 4 ddc engines for RV2
drm/amd/display: put back front end initialization sequence
drm/amd/display: Wait for flip to complete
drm/amd/display: Change min_h_sync_width from 8 to 4
drm/amd/display: use encoder's engine id to find matched free audio device
drm/amd/display: fix DMCU hang when going into Modern Standby
drm/amd/display: Disable Audio on reinitialize hardware
...
Christoph Hellwig [Wed, 24 Jul 2019 16:26:56 +0000 (18:26 +0200)]
block: fix max segment size handling in blk_queue_virt_boundary
We should only set the max segment size to unlimited if we actually
have a virt boundary. Otherwise we accidentally clear that limit
when called from the SCSI midlayer, which always calls
blk_queue_virt_boundary, even if that mask is 0.
Fixes: 7ad388d8e4c7 ("scsi: core: add a host / host template field for the virt boundary")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 26 Jul 2019 18:29:24 +0000 (11:29 -0700)]
Merge tag 'docs-5.3-1' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"This is mostly a set of follow-on fixes from Mauro fixing various
fallout from the massive RST conversion; a few other small fixes as
well"
* tag 'docs-5.3-1' of git://git.lwn.net/linux: (21 commits)
docs: phy: Drop duplicate 'be made'
doc:it_IT: translations in process/
docs/vm: transhuge: fix typo in madvise reference
doc:it_IT: rephrase statement
doc:it_IT: align translation to mainline
docs: load_config.py: ensure subdirs end with "/"
docs: virtual: add it to the documentation body
docs: remove extra conf.py files
docs: load_config.py: avoid needing a conf.py just due to LaTeX docs
scripts/sphinx-pre-install: seek for Noto CJK fonts for pdf output
scripts/sphinx-pre-install: cleanup Gentoo checks
scripts/sphinx-pre-install: fix latexmk dependencies
scripts/sphinx-pre-install: don't use LaTeX with CentOS 7
scripts/sphinx-pre-install: fix script for RHEL/CentOS
docs: conf.py: only use CJK if the font is available
docs: conf.py: add CJK package needed by translations
docs: pdf: add all Documentation/*/index.rst to PDF output
docs: fix broken doc references due to renames
docs: power: add it to to the main documentation index
docs: powerpc: convert docs to ReST and rename to *.rst
...
Linus Torvalds [Fri, 26 Jul 2019 18:20:42 +0000 (11:20 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"There's more here than we usually have at this stage, but that's
mainly down to the stacktrace changes which came in slightly too late
for the merge window.
Summary:
- Big bad batch of MAINTAINERS updates
- Fix handling of SP alignment fault exceptions
- Fix PSTATE.SSBS handling on heterogeneous systems
- Fix fallout from moving to the generic vDSO implementation
- Fix stack unwinding in the face of frame corruption
- Fix off-by-one in IORT code
- Minor SVE cleanups"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id()
arm64: entry: SP Alignment Fault doesn't write to FAR_EL1
arm64: Force SSBS on context switch
MAINTAINERS: Update my email address
MAINTAINERS: Update my email address
MAINTAINERS: Fix spelling mistake in my name
MAINTAINERS: Update my email address to @kernel.org
arm64: mm: Drop pte_huge()
arm64/sve: Fix a couple of magic numbers for the Z-reg count
arm64/sve: Factor out FPSIMD to SVE state conversion
arm64: stacktrace: Better handle corrupted stacks
arm64: stacktrace: Factor out backtrace initialisation
arm64: stacktrace: Constify stacktrace.h functions
arm64: vdso: Cleanup Makefiles
arm64: vdso: fix flip/flop vdso build bug
arm64: vdso: Fix population of AT_SYSINFO_EHDR for compat vdso
Linus Torvalds [Fri, 26 Jul 2019 18:08:37 +0000 (11:08 -0700)]
Merge tag 'for-5.3-rc1-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Two regression fixes:
- hangs caused by a missing barrier in the locking code
- memory leaks of extent_state due to bad handling of a cached
pointer"
* tag 'for-5.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix extent_state leak in btrfs_lock_and_flush_ordered_range
btrfs: Fix deadlock caused by missing memory barrier
Linus Torvalds [Fri, 26 Jul 2019 17:58:44 +0000 (10:58 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull vfs umount_tree() leak fix from Al Viro:
"Fix braino introduced in 'switch the remnants of releasing the
mountpoint away from fs_pin'.
The most visible result is leaking struct mount when mounting btrfs,
making it impossible to shut down"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix the struct mount leak in umount_tree()
Linus Torvalds [Fri, 26 Jul 2019 17:32:12 +0000 (10:32 -0700)]
Merge tag 'for-linus-
20190726' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Several io_uring fixes/improvements:
- Blocking fix for O_DIRECT (me)
- Latter page slowness for registered buffers (me)
- Fix poll hang under certain conditions (me)
- Defer sequence check fix for wrapped rings (Zhengyuan)
- Mismatch in async inc/dec accounting (Zhengyuan)
- Memory ordering issue that could cause stall (Zhengyuan)
- Track sequential defer in bytes, not pages (Zhengyuan)
- NVMe pull request from Christoph
- Set of hang fixes for wbt (Josef)
- Redundant error message kill for libahci (Ding)
- Remove unused blk_mq_sched_started_request() and related ops (Marcos)
- drbd dynamic alloc shash descriptor to reduce stack use (Arnd)
- blkcg ->pd_stat() non-debug print (Tejun)
- bcache memory leak fix (Wei)
- Comment fix (Akinobu)
- BFQ perf regression fix (Paolo)
* tag 'for-linus-
20190726' of git://git.kernel.dk/linux-block: (24 commits)
io_uring: ensure ->list is initialized for poll commands
Revert "nvme-pci: don't create a read hctx mapping without read queues"
nvme: fix multipath crash when ANA is deactivated
nvme: fix memory leak caused by incorrect subsystem free
nvme: ignore subnqn for ADATA SX6000LNP
drbd: dynamically allocate shash descriptor
block: blk-mq: Remove blk_mq_sched_started_request and started_request
bcache: fix possible memory leak in bch_cached_dev_run()
io_uring: track io length in async_list based on bytes
io_uring: don't use iov_iter_advance() for fixed buffers
block: properly handle IOCB_NOWAIT for async O_DIRECT IO
blk-mq: allow REQ_NOWAIT to return an error inline
io_uring: add a memory barrier before atomic_read
rq-qos: use a mb for got_token
rq-qos: set ourself TASK_UNINTERRUPTIBLE after we schedule
rq-qos: don't reset has_sleepers on spurious wakeups
rq-qos: fix missed wake-ups in rq_qos_throttle
wait: add wq_has_single_sleeper helper
block, bfq: check also in-flight I/O in dispatch plugging
block: fix sysfs module parameters directory path in comment
...
Linus Torvalds [Fri, 26 Jul 2019 17:23:45 +0000 (10:23 -0700)]
Merge tag 'sound-5.3-rc2' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All relatively small changes:
- a regression fix for PCM link code with CONFIG_REFCOUNT_FULL;
stumbled on a slight difference between atomic_t and refcount_t
- a couple of HD-audio stabilization patches addressing the too slow
PM resume seen on some Intel chips
- a series of ALSA compress-offload API fixes, including the
regression by the previous capture stream support
- trivial LINE6 USB-audio driver fixes, a new Conexant HD-audio chip
coverage, and a fix in AC97 bus error path"
* tag 'sound-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add a conexant codec entry to let mute led work
ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips
ALSA: ac97: Fix double free of ac97_codec_device
ALSA: compress: Be more restrictive about when a drain is allowed
ALSA: compress: Don't allow paritial drain operations on capture streams
ALSA: compress: Prevent bypasses of set_params
ALSA: compress: Fix regression on compressed capture streams
ALSA: line6: Fix a typo
ALSA: pcm: Fix refcount_inc() on zero usage
ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1
ALSA: hda - Optimize resume for codecs without jack detection
Linus Torvalds [Fri, 26 Jul 2019 17:04:19 +0000 (10:04 -0700)]
Merge tag 'iommu-fixes-v5.3-rc1' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
- revert an Intel VT-d patch that caused boot problems on some machines
- fix AMD IOMMU interrupts with x2apic enabled
- fix a potential crash when Intel VT-d domain allocation fails
- fix crash in Intel VT-d driver when accessing a domain without a
flush queue
- formatting fix for new Intel VT-d debugfs code
- fix for use-after-free bug in IOVA code
- fix for a NULL-pointer dereference in Intel VT-d driver when PCI
hotplug is used
- compilation fix for one of the previous fixes
* tag 'iommu-fixes-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Add support for X2APIC IOMMU interrupts
iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVA
iommu/vt-d: Print pasid table entries MSB to LSB in debugfs
iommu/iova: Remove stale cached32_node
iommu/vt-d: Check if domain->pgd was allocated
iommu/vt-d: Don't queue_iova() if there is no flush queue
iommu/vt-d: Avoid duplicated pci dma alias consideration
Revert "iommu/vt-d: Consolidate domain_init() to avoid duplication"
Linus Torvalds [Fri, 26 Jul 2019 16:43:43 +0000 (09:43 -0700)]
Merge branch 'for-linus-5.3' of git://git./linux/kernel/git/konrad/ibft
Pull iscsi_ibft fix from Konrad Rzeszutek Wilk:
"One tiny fix to enable iSCSI IBFT to be compiled under ARM"
* 'for-linus-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft:
iscsi_ibft: make ISCSI_IBFT depend on ACPI instead of ISCSI_IBFT_FIND
Linus Torvalds [Fri, 26 Jul 2019 16:36:01 +0000 (09:36 -0700)]
Merge tag 'hwmon-for-v5.3-rc2' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"A couple of hwmon bug fixes:
- Update k8temp documentation URL
- Register address fixes in nct6775 driver
- Fix potential division by zero in occ driver"
* tag 'hwmon-for-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (k8temp) documentation: update URL of datasheet
hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
hwmon: (occ) Fix division by zero issue
Guido Günther [Fri, 26 Jul 2019 09:55:34 +0000 (11:55 +0200)]
docs: phy: Drop duplicate 'be made'
Fix duplicate words.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Al Viro [Wed, 24 Jul 2019 16:45:46 +0000 (12:45 -0400)]
fix the struct mount leak in umount_tree()
We need to drop everything we remove from the tree, whether
mnt_has_parent() is true or not. Usually the bug manifests as a slow
memory leak (leaked struct mount for initramfs); it becomes much more
visible in mount_subtree() users, such as btrfs. There we leak
a struct mount for btrfs superblock being mounted, which prevents
fs shutdown on subsequent umount.
Fixes: 56cbb429d911 ("switch the remnants of releasing the mountpoint away from fs_pin")
Reported-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Farhan Ali [Wed, 24 Jul 2019 21:32:03 +0000 (17:32 -0400)]
MAINTAINERS: vfio-ccw: Remove myself as the maintainer
I will not be able to continue with my maintainership responsibilities
going forward, so remove myself as the maintainer.
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Vasily Gorbik [Tue, 25 Jun 2019 22:00:42 +0000 (00:00 +0200)]
s390/mm: use shared variables for sysctl range check
Since commit
eec4844fae7c ("proc/sysctl: add shared variables for range
check") special shared variables are available for sysctl range check.
Reuse them for /proc/sys/vm/allocate_pgste proc handler.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Halil Pasic [Tue, 23 Jul 2019 15:11:01 +0000 (17:11 +0200)]
virtio/s390: fix race on airq_areas[]
The access to airq_areas was racy ever since the adapter interrupts got
introduced to virtio-ccw, but since commit
39c7dcb15892 ("virtio/s390:
make airq summary indicators DMA") this became an issue in practice as
well. Namely before that commit the airq_info that got overwritten was
still functional. After that commit however the two infos share a
summary_indicator, which aggravates the situation. Which means
auto-online mechanism occasionally hangs the boot with virtio_blk.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 96b14536d935 ("virtio-ccw: virtio-ccw adapter interrupt support.")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Halil Pasic [Tue, 23 Jul 2019 22:51:55 +0000 (00:51 +0200)]
s390/dma: provide proper ARCH_ZONE_DMA_BITS value
On s390 ZONE_DMA is up to 2G, i.e. ARCH_ZONE_DMA_BITS should be 31 bits.
The current value is 24 and makes __dma_direct_alloc_pages() take a
wrong turn first (but __dma_direct_alloc_pages() recovers then).
Let's correct ARCH_ZONE_DMA_BITS value and avoid wrong turns.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Petr Tesarik <ptesarik@suse.cz>
Fixes: c61e9637340e ("dma-direct: add support for allocation from ZONE_DMA and ZONE_DMA32")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Naohiro Aota [Fri, 26 Jul 2019 07:47:05 +0000 (16:47 +0900)]
btrfs: fix extent_state leak in btrfs_lock_and_flush_ordered_range
btrfs_lock_and_flush_ordered_range() loads given "*cached_state" into
cachedp, which, in general, is NULL. Then, lock_extent_bits() updates
"cachedp", but it never goes backs to the caller. Thus the caller still
see its "cached_state" to be NULL and never free the state allocated
under btrfs_lock_and_flush_ordered_range(). As a result, we will
see massive state leak with e.g. fstests btrfs/005. Fix this bug by
properly handling the pointers.
Fixes: bd80d94efb83 ("btrfs: Always use a cached extent_state in btrfs_lock_and_flush_ordered_range")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Dave Airlie [Fri, 26 Jul 2019 04:10:24 +0000 (14:10 +1000)]
Merge tag 'drm-fixes-5.3-2019-07-24' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.3-2019-07-24:
amdgpu:
- RAS fixes for vega20
- Navi VCN fix
- DC audio fixes
- DC DSC fixes
- DC dongle fixes
- DC clk mgr fixes
- Fix DDC lines on some RV2 boards
- GDS fixes for compute
- Navi SMU fixes
ttm:
- Use the same attributes when freeing d_page->vaddr
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190724210527.3415-1-alexander.deucher@amd.com
Dave Airlie [Fri, 26 Jul 2019 04:09:49 +0000 (14:09 +1000)]
Merge tag 'drm-misc-fixes-2019-07-25' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- pick up the cmdline fix which missed the merge window (Dmitry)
- a handful of msm fixes so i don't have to spin up msm-fixes (Various)
- fix -Wunused-but-set-variable warning in drm_framebuffer (Qian)
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725160909.GA106249@art_vandelay
Gustavo A. R. Silva [Fri, 7 Jun 2019 00:46:17 +0000 (19:46 -0500)]
Makefile: Globally enable fall-through warning
Now that all the fall-through warnings have been addressed in the
kernel, enable the fall-through warning globally.
Also, update the deprecated.rst file to include implicit fall-through
as 'deprecated' so people can be pointed to a single location for
justification.
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Mon, 22 Jul 2019 18:03:46 +0000 (13:03 -0500)]
drm/i915: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
drivers/gpu/drm/i915/gem/i915_gem_mman.c: In function ‘i915_gem_fault’:
drivers/gpu/drm/i915/gem/i915_gem_mman.c:342:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (!i915_terminally_wedged(i915))
^
drivers/gpu/drm/i915/gem/i915_gem_mman.c:345:2: note: here
case -EAGAIN:
^~~~
drivers/gpu/drm/i915/gem/i915_gem_pages.c: In function ‘i915_gem_object_map’:
./include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/bug.h:136:2: note: in expansion of macro ‘unlikely’
unlikely(__ret_warn_on); \
^~~~~~~~
drivers/gpu/drm/i915/i915_utils.h:49:25: note: in expansion of macro ‘WARN’
#define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
^~~~
drivers/gpu/drm/i915/gem/i915_gem_pages.c:270:3: note: in expansion of macro ‘MISSING_CASE’
MISSING_CASE(type);
^~~~~~~~~~~~
drivers/gpu/drm/i915/gem/i915_gem_pages.c:272:2: note: here
case I915_MAP_WB:
^~~~
drivers/gpu/drm/i915/i915_gpu_error.c: In function ‘error_record_engine_registers’:
./include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/bug.h:136:2: note: in expansion of macro ‘unlikely’
unlikely(__ret_warn_on); \
^~~~~~~~
drivers/gpu/drm/i915/i915_utils.h:49:25: note: in expansion of macro ‘WARN’
#define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
^~~~
drivers/gpu/drm/i915/i915_gpu_error.c:1196:5: note: in expansion of macro ‘MISSING_CASE’
MISSING_CASE(engine->id);
^~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gpu_error.c:1197:4: note: here
case RCS0:
^~~~
drivers/gpu/drm/i915/display/intel_dp.c: In function ‘intel_dp_get_fia_supported_lane_count’:
./include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
^~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/bug.h:136:2: note: in expansion of macro ‘unlikely’
unlikely(__ret_warn_on); \
^~~~~~~~
drivers/gpu/drm/i915/i915_utils.h:49:25: note: in expansion of macro ‘WARN’
#define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
^~~~
drivers/gpu/drm/i915/display/intel_dp.c:233:3: note: in expansion of macro ‘MISSING_CASE’
MISSING_CASE(lane_info);
^~~~~~~~~~~~
drivers/gpu/drm/i915/display/intel_dp.c:234:2: note: here
case 1:
^~~~
drivers/gpu/drm/i915/display/intel_display.c: In function ‘check_digital_port_conflicts’:
CC [M] drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgv100.o
drivers/gpu/drm/i915/display/intel_display.c:12043:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (WARN_ON(!HAS_DDI(to_i915(dev))))
^
drivers/gpu/drm/i915/display/intel_display.c:12046:3: note: here
case INTEL_OUTPUT_DP:
^~~~
Also, notice that the Makefile is modified to stop ignoring
fall-through warnings. The -Wimplicit-fallthrough option
will be enabled globally in v5.3.
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Fri, 26 Jul 2019 00:13:51 +0000 (19:13 -0500)]
drm/amd/display: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Mon, 22 Jul 2019 16:26:31 +0000 (11:26 -0500)]
drm/amdkfd/kfd_mqd_manager_v10: Avoid fall-through warning
In preparation to enabling -Wimplicit-fallthrough, this patch silences
the following warning:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c: In function ‘mqd_manager_init_v10’:
./include/linux/dynamic_debug.h:122:52: warning: this statement may fall through [-Wimplicit-fallthrough=]
#define __dynamic_func_call(id, fmt, func, ...) do { \
^
./include/linux/dynamic_debug.h:143:2: note: in expansion of macro ‘__dynamic_func_call’
__dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~~~~
./include/linux/dynamic_debug.h:153:2: note: in expansion of macro ‘_dynamic_func_call’
_dynamic_func_call(fmt, __dynamic_pr_debug, \
^~~~~~~~~~~~~~~~~~
./include/linux/printk.h:336:2: note: in expansion of macro ‘dynamic_pr_debug’
dynamic_pr_debug(fmt, ##__VA_ARGS__)
^~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c:432:3: note: in expansion of macro ‘pr_debug’
pr_debug("%s@%i\n", __func__, __LINE__);
^~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_mqd_manager_v10.c:433:2: note: here
case KFD_MQD_TYPE_COMPUTE:
^~~~
by removing the call to pr_debug() in KFD_MQD_TYPE_CP:
"The mqd init for CP and COMPUTE will have the same routine." [1]
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
[1] https://lore.kernel.org/lkml/
c735a1cc-a545-50fb-44e7-
c0ad93ee8ee7@amd.com/
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Sun, 21 Jul 2019 22:37:33 +0000 (17:37 -0500)]
drm/amdgpu/gfx10: Fix missing break in switch statement
Add missing break statement in order to prevent the code from falling
through to case AMDGPU_IRQ_STATE_ENABLE.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: a644d85a5cd4 ("drm/amdgpu: add gfx v10 implementation (v10)")
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Sun, 21 Jul 2019 21:41:37 +0000 (16:41 -0500)]
drm/amdkfd: Fix missing break in switch statement
Add missing break statement in order to prevent the code from falling
through to case CHIP_NAVI10.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: 14328aa58ce5 ("drm/amdkfd: Add navi10 support to amdkfd. (v3)")
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Mon, 24 Jun 2019 16:11:07 +0000 (11:11 -0500)]
perf/x86/intel: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
pmem = true;
~~~~~^~~~~~
arch/x86/events/intel/core.c:4960:2: note: here
case INTEL_FAM6_SKYLAKE_MOBILE:
^~~~
arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
pmem = true;
~~~~~^~~~~~
arch/x86/events/intel/core.c:5009:2: note: here
case INTEL_FAM6_ICELAKE_MOBILE:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Tue, 4 Jun 2019 13:58:02 +0000 (08:58 -0500)]
mtd: onenand_base: Mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warning:
drivers/mtd/nand/onenand/onenand_base.c: In function ‘onenand_check_features’:
drivers/mtd/nand/onenand/onenand_base.c:3264:17: warning: this statement may fall through [-Wimplicit-fallthrough=]
this->options |= ONENAND_HAS_NOP_1;
drivers/mtd/nand/onenand/onenand_base.c:3265:2: note: here
case ONENAND_DEVICE_DENSITY_4Gb:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Cc: Jonathan Bakker <xc-racer2@live.ca>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Sun, 19 May 2019 23:43:53 +0000 (18:43 -0500)]
afs: fsclient: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
Warning level 3 was used: -Wimplicit-fallthrough=3
fs/afs/fsclient.c: In function ‘afs_deliver_fs_fetch_acl’:
fs/afs/fsclient.c:2199:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/fsclient.c:2202:2: note: here
case 1:
^~~~
fs/afs/fsclient.c:2216:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/fsclient.c:2219:2: note: here
case 2:
^~~~
fs/afs/fsclient.c:2225:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/fsclient.c:2228:2: note: here
case 3:
^~~~
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Sun, 19 May 2019 23:56:50 +0000 (18:56 -0500)]
afs: yfsclient: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
fs/afs/yfsclient.c: In function ‘yfs_deliver_fs_fetch_opaque_acl’:
fs/afs/yfsclient.c:1984:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/yfsclient.c:1987:2: note: here
case 1:
^~~~
fs/afs/yfsclient.c:2005:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/yfsclient.c:2008:2: note: here
case 2:
^~~~
fs/afs/yfsclient.c:2014:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/yfsclient.c:2017:2: note: here
case 3:
^~~~
fs/afs/yfsclient.c:2035:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/yfsclient.c:2038:2: note: here
case 4:
^~~~
fs/afs/yfsclient.c:2047:19: warning: this statement may fall through [-Wimplicit-fallthrough=]
call->unmarshall++;
~~~~~~~~~~~~~~~~^~
fs/afs/yfsclient.c:2050:2: note: here
case 5:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Also, fix some commenting style issues.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Tue, 29 Jan 2019 17:59:28 +0000 (11:59 -0600)]
can: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warnings:
drivers/net/can/peak_canfd/peak_pciefd_main.c:668:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/net/can/spi/mcp251x.c:875:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/net/can/usb/peak_usb/pcan_usb.c:422:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/net/can/at91_can.c:895:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/net/can/at91_can.c:953:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/net/can/usb/peak_usb/pcan_usb.c: In function ‘pcan_usb_decode_error’:
drivers/net/can/usb/peak_usb/pcan_usb.c:422:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (n & PCAN_USB_ERROR_BUS_LIGHT) {
^
drivers/net/can/usb/peak_usb/pcan_usb.c:428:2: note: here
case CAN_STATE_ERROR_WARNING:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enabling
-Wimplicit-fallthrough.
Notice that in some cases spelling mistakes were fixed.
In other cases, the /* fall through */ comment is placed
at the bottom of the case statement, which is what GCC
is expecting to find.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Gustavo A. R. Silva [Mon, 11 Feb 2019 17:48:21 +0000 (11:48 -0600)]
firewire: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
This patch fixes the following warnings:
drivers/firewire/core-device.c: In function ‘set_broadcast_channel’:
drivers/firewire/core-device.c:969:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (data & cpu_to_be32(1 << 31)) {
^
drivers/firewire/core-device.c:974:3: note: here
case RCODE_ADDRESS_ERROR:
^~~~
drivers/firewire/core-iso.c: In function ‘manage_channel’:
drivers/firewire/core-iso.c:308:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
if ((data[0] & bit) == (data[1] & bit))
^
drivers/firewire/core-iso.c:312:3: note: here
default:
^~~~~~~
drivers/firewire/core-topology.c: In function ‘count_ports’:
drivers/firewire/core-topology.c:69:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
(*child_port_count)++;
~~~~~~~~~~~~~~~~~~~^~
drivers/firewire/core-topology.c:70:3: note: here
case SELFID_PORT_PARENT:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Notice that in some cases, the code comment is modified in
accordance with what GCC is expecting to find.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Cc: Kees Cook <keescook@chromium.org>
Cc: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (reworded a comment)
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Arnd Bergmann [Mon, 22 Jul 2019 11:41:20 +0000 (13:41 +0200)]
structleak: disable STRUCTLEAK_BYREF in combination with KASAN_STACK
The combination of KASAN_STACK and GCC_PLUGIN_STRUCTLEAK_BYREF
leads to much larger kernel stack usage, as seen from the warnings
about functions that now exceed the 2048 byte limit:
drivers/media/i2c/tvp5150.c:253:1: error: the frame size of 3936 bytes is larger than 2048 bytes
drivers/media/tuners/r820t.c:1327:1: error: the frame size of 2816 bytes is larger than 2048 bytes
drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:16552:1: error: the frame size of 3144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
fs/ocfs2/aops.c:1892:1: error: the frame size of 2088 bytes is larger than 2048 bytes
fs/ocfs2/dlm/dlmrecovery.c:737:1: error: the frame size of 2088 bytes is larger than 2048 bytes
fs/ocfs2/namei.c:1677:1: error: the frame size of 2584 bytes is larger than 2048 bytes
fs/ocfs2/super.c:1186:1: error: the frame size of 2640 bytes is larger than 2048 bytes
fs/ocfs2/xattr.c:3678:1: error: the frame size of 2176 bytes is larger than 2048 bytes
net/bluetooth/l2cap_core.c:7056:1: error: the frame size of 2144 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
net/bluetooth/l2cap_core.c: In function 'l2cap_recv_frame':
net/bridge/br_netlink.c:1505:1: error: the frame size of 2448 bytes is larger than 2048 bytes
net/ieee802154/nl802154.c:548:1: error: the frame size of 2232 bytes is larger than 2048 bytes
net/wireless/nl80211.c:1726:1: error: the frame size of 2224 bytes is larger than 2048 bytes
net/wireless/nl80211.c:2357:1: error: the frame size of 4584 bytes is larger than 2048 bytes
net/wireless/nl80211.c:5108:1: error: the frame size of 2760 bytes is larger than 2048 bytes
net/wireless/nl80211.c:6472:1: error: the frame size of 2112 bytes is larger than 2048 bytes
The structleak plugin was previously disabled for CONFIG_COMPILE_TEST,
but meant we missed some bugs, so this time we should address them.
The frame size warnings are distracting, and risking a kernel stack
overflow is generally not beneficial to performance, so it may be best
to disallow that particular combination. This can be done by turning
off either one. I picked the dependency in GCC_PLUGIN_STRUCTLEAK_BYREF
and GCC_PLUGIN_STRUCTLEAK_BYREF_ALL, as this option is designed to
make uninitialized stack usage less harmful when enabled on its own,
but it also prevents KASAN from detecting those cases in which it was
in fact needed.
KASAN_STACK is currently implied by KASAN on gcc, but could be made a
user selectable option if we want to allow combining (non-stack) KASAN
with GCC_PLUGIN_STRUCTLEAK_BYREF.
Note that it would be possible to specifically address the files that
print the warning, but presumably the overall stack usage is still
significantly higher than in other configurations, so this would not
address the full problem.
I could not test this with CONFIG_INIT_STACK_ALL, which may or may not
suffer from a similar problem.
Fixes: 81a56f6dcd20 ("gcc-plugins: structleak: Generalize to all variable types")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20190722114134.3123901-1-arnd@arndb.de
Signed-off-by: Kees Cook <keescook@chromium.org>
Jens Axboe [Thu, 25 Jul 2019 16:23:15 +0000 (10:23 -0600)]
Merge branch 'nvme-5.3' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph.
* 'nvme-5.3' of git://git.infradead.org/nvme:
Revert "nvme-pci: don't create a read hctx mapping without read queues"
nvme: fix multipath crash when ANA is deactivated
nvme: fix memory leak caused by incorrect subsystem free
nvme: ignore subnqn for ADATA SX6000LNP
Jens Axboe [Thu, 25 Jul 2019 16:20:18 +0000 (10:20 -0600)]
io_uring: ensure ->list is initialized for poll commands
Daniel reports that when testing an http server that uses io_uring
to poll for incoming connections, sometimes it hard crashes. This is
due to an uninitialized list member for the io_uring request. Normally
this doesn't trigger and none of the test cases caught it.
Reported-by: Daniel Kozak <kozzi11@gmail.com>
Tested-by: Daniel Kozak <kozzi11@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 25 Jul 2019 16:07:32 +0000 (09:07 -0700)]
Merge tag 'pm-5.3-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki
"These fix two issues related to the RAPL MMIO interface support added
recently and one cpufreq driver issue.
Specifics:
- Initialize the power capping subsystem and the RAPL driver earlier
in case the int340X thermal driver is built-in and attempts to
register an MMIO interface for RAPL which must not happen before
the requisite infrastructure is ready (Zhang Rui)
- Fix the int340X thermal driver's RAPL MMIO interface registration
error path (Rafael Wysocki)
- Fix possible use-after-free in the pasemi cpufreq driver (Wen
Yang)"
* tag 'pm-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
int340X/processor_thermal_device: Fix proc_thermal_rapl_remove()
powercap: Invoke powercap_init() and rapl_init() earlier
Linus Torvalds [Thu, 25 Jul 2019 16:02:34 +0000 (09:02 -0700)]
Merge tag 'riscv/for-v5.3-rc2' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
"Four minor RISC-V-related changes:
- Add support for the new clone3 syscall for RV64, relying on the
generic support
- Add DT data for the gigabit Ethernet controller on the SiFive FU540
and the HiFive Unleashed board
- Update MAINTAINERS to add me to the arch/riscv maintainers' list
- Add support for PCIe message-signaled interrupts by reusing the
generic header file"
* tag 'riscv/for-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: dts: Add DT node for SiFive FU540 Ethernet controller driver
riscv: include generic support for MSI irqdomains
MAINTAINERS: Add Paul as a RISC-V maintainer
riscv: enable sys_clone3 syscall for rv64