Serge Semin [Mon, 7 Nov 2022 20:39:44 +0000 (23:39 +0300)]
block: sed-opal: kmalloc the cmd/resp buffers
In accordance with [1] the DMA-able memory buffers must be
cacheline-aligned otherwise the cache writing-back and invalidation
performed during the mapping may cause the adjacent data being lost. It's
specifically required for the DMA-noncoherent platforms [2]. Seeing the
opal_dev.{cmd,resp} buffers are implicitly used for DMAs in the NVME and
SCSI/SD drivers in framework of the nvme_sec_submit() and sd_sec_submit()
methods respectively they must be cacheline-aligned to prevent the denoted
problem. One of the option to guarantee that is to kmalloc the buffers
[2]. Let's explicitly allocate them then instead of embedding into the
opal_dev structure instance.
Note this fix was inspired by the commit
c94b7f9bab22 ("nvme-hwmon:
kmalloc the NVME SMART log buffer").
[1] Documentation/core-api/dma-api.rst
[2] Documentation/core-api/dma-api-howto.rst
Fixes:
455a7b238cd6 ("block: Add Sed-opal library")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221107203944.31686-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Yu Kuai [Tue, 8 Nov 2022 10:34:34 +0000 (18:34 +0800)]
block, bfq: fix null pointer dereference in bfq_bio_bfqg()
Out test found a following problem in kernel 5.10, and the same problem
should exist in mainline:
BUG: kernel NULL pointer dereference, address:
0000000000000094
PGD 0 P4D 0
Oops: 0000 [#1] SMP
CPU: 7 PID: 155 Comm: kworker/7:1 Not tainted 5.10.0-01932-g19e0ace2ca1d-dirty 4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-b4
Workqueue: kthrotld blk_throtl_dispatch_work_fn
RIP: 0010:bfq_bio_bfqg+0x52/0xc0
Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b
RSP: 0018:
ffffc90001a1fba0 EFLAGS:
00010002
RAX:
ffff888100d60400 RBX:
ffff8881132e7000 RCX:
0000000000000000
RDX:
0000000000000017 RSI:
ffff888103580a18 RDI:
ffff888103580a18
RBP:
ffff8881132e7000 R08:
0000000000000000 R09:
ffffc90001a1fe10
R10:
0000000000000a20 R11:
0000000000034320 R12:
0000000000000000
R13:
ffff888103580a18 R14:
ffff888114447000 R15:
0000000000000000
FS:
0000000000000000(0000) GS:
ffff88881fdc0000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000094 CR3:
0000000100cdb000 CR4:
00000000000006e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
bfq_bic_update_cgroup+0x3c/0x350
? ioc_create_icq+0x42/0x270
bfq_init_rq+0xfd/0x1060
bfq_insert_requests+0x20f/0x1cc0
? ioc_create_icq+0x122/0x270
blk_mq_sched_insert_requests+0x86/0x1d0
blk_mq_flush_plug_list+0x193/0x2a0
blk_flush_plug_list+0x127/0x170
blk_finish_plug+0x31/0x50
blk_throtl_dispatch_work_fn+0x151/0x190
process_one_work+0x27c/0x5f0
worker_thread+0x28b/0x6b0
? rescuer_thread+0x590/0x590
kthread+0x153/0x1b0
? kthread_flush_work+0x170/0x170
ret_from_fork+0x1f/0x30
Modules linked in:
CR2:
0000000000000094
---[ end trace
e2e59ac014314547 ]---
RIP: 0010:bfq_bio_bfqg+0x52/0xc0
Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b
RSP: 0018:
ffffc90001a1fba0 EFLAGS:
00010002
RAX:
ffff888100d60400 RBX:
ffff8881132e7000 RCX:
0000000000000000
RDX:
0000000000000017 RSI:
ffff888103580a18 RDI:
ffff888103580a18
RBP:
ffff8881132e7000 R08:
0000000000000000 R09:
ffffc90001a1fe10
R10:
0000000000000a20 R11:
0000000000034320 R12:
0000000000000000
R13:
ffff888103580a18 R14:
ffff888114447000 R15:
0000000000000000
FS:
0000000000000000(0000) GS:
ffff88881fdc0000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000094 CR3:
0000000100cdb000 CR4:
00000000000006e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Root cause is quite complex:
1) use bfq elevator for the test device.
2) create a cgroup CG
3) config blk throtl in CG
blkg_conf_prep
blkg_create
4) create a thread T1 and issue async io in CG:
bio_init
bio_associate_blkg
...
submit_bio
submit_bio_noacct
blk_throtl_bio -> io is throttled
// io submit is done
5) switch elevator:
bfq_exit_queue
blkcg_deactivate_policy
list_for_each_entry(blkg, &q->blkg_list, q_node)
blkg->pd[] = NULL
// bfq policy is removed
5) thread t1 exist, then remove the cgroup CG:
blkcg_unpin_online
blkcg_destroy_blkgs
blkg_destroy
list_del_init(&blkg->q_node)
// blkg is removed from queue list
6) switch elevator back to bfq
bfq_init_queue
bfq_create_group_hierarchy
blkcg_activate_policy
list_for_each_entry_reverse(blkg, &q->blkg_list)
// blkg is removed from list, hence bfq policy is still NULL
7) throttled io is dispatched to bfq:
bfq_insert_requests
bfq_init_rq
bfq_bic_update_cgroup
bfq_bio_bfqg
bfqg = blkg_to_bfqg(blkg)
// bfqg is NULL because bfq policy is NULL
The problem is only possible in bfq because only bfq can be deactivated and
activated while queue is online, while others can only be deactivated while
the device is removed.
Fix the problem in bfq by checking if blkg is online before calling
blkg_to_bfqg().
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221108103434.2853269-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Al Viro [Tue, 1 Nov 2022 00:54:13 +0000 (00:54 +0000)]
block: blk_add_rq_to_plug(): clear stale 'last' after flush
blk_mq_flush_plug_list() empties ->mq_list and request we'd peeked there
before that call is gone; in any case, we are not dealing with a mix
of requests for different queues now - there's no requests left in the
plug.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chen Jun [Mon, 31 Oct 2022 03:12:42 +0000 (03:12 +0000)]
blk-mq: Fix kmemleak in blk_mq_init_allocated_queue
There is a kmemleak caused by modprobe null_blk.ko
unreferenced object 0xffff8881acb1f000 (size 1024):
comm "modprobe", pid 836, jiffies
4294971190 (age 27.068s)
hex dump (first 32 bytes):
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
ff ff ff ff ff ff ff ff 00 53 99 9e ff ff ff ff .........S......
backtrace:
[<
000000004a10c249>] kmalloc_node_trace+0x22/0x60
[<
00000000648f7950>] blk_mq_alloc_and_init_hctx+0x289/0x350
[<
00000000af06de0e>] blk_mq_realloc_hw_ctxs+0x2fe/0x3d0
[<
00000000e00c1872>] blk_mq_init_allocated_queue+0x48c/0x1440
[<
00000000d16b4e68>] __blk_mq_alloc_disk+0xc8/0x1c0
[<
00000000d10c98c3>] 0xffffffffc450d69d
[<
00000000b9299f48>] 0xffffffffc4538392
[<
0000000061c39ed6>] do_one_initcall+0xd0/0x4f0
[<
00000000b389383b>] do_init_module+0x1a4/0x680
[<
0000000087cf3542>] load_module+0x6249/0x7110
[<
00000000beba61b8>] __do_sys_finit_module+0x140/0x200
[<
00000000fdcfff51>] do_syscall_64+0x35/0x80
[<
000000003c0f1f71>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
That is because q->ma_ops is set to NULL before blk_release_queue is
called.
blk_mq_init_queue_data
blk_mq_init_allocated_queue
blk_mq_realloc_hw_ctxs
for (i = 0; i < set->nr_hw_queues; i++) {
old_hctx = xa_load(&q->hctx_table, i);
if (!blk_mq_alloc_and_init_hctx(.., i, ..)) [1]
if (!old_hctx)
break;
xa_for_each_start(&q->hctx_table, j, hctx, j)
blk_mq_exit_hctx(q, set, hctx, j); [2]
if (!q->nr_hw_queues) [3]
goto err_hctxs;
err_exit:
q->mq_ops = NULL; [4]
blk_put_queue
blk_release_queue
if (queue_is_mq(q)) [5]
blk_mq_release(q);
[1]: blk_mq_alloc_and_init_hctx failed at i != 0.
[2]: The hctxs allocated by [1] are moved to q->unused_hctx_list and
will be cleaned up in blk_mq_release.
[3]: q->nr_hw_queues is 0.
[4]: Set q->mq_ops to NULL.
[5]: queue_is_mq returns false due to [4]. And blk_mq_release
will not be called. The hctxs in q->unused_hctx_list are leaked.
To fix it, call blk_release_queue in exception path.
Fixes:
2f8f1336a48b ("blk-mq: always free hctx after request queue is freed")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20221031031242.94107-1-chenjun102@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chen Zhongjin [Sat, 29 Oct 2022 07:13:55 +0000 (15:13 +0800)]
block: Fix possible memory leak for rq_wb on add_disk failure
kmemleak reported memory leaks in device_add_disk():
kmemleak: 3 new suspected memory leaks
unreferenced object 0xffff88800f420800 (size 512):
comm "modprobe", pid 4275, jiffies
4295639067 (age 223.512s)
hex dump (first 32 bytes):
04 00 00 00 08 00 00 00 01 00 00 00 00 00 00 00 ................
00 e1 f5 05 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<
00000000d3662699>] kmalloc_trace+0x26/0x60
[<
00000000edc7aadc>] wbt_init+0x50/0x6f0
[<
0000000069601d16>] wbt_enable_default+0x157/0x1c0
[<
0000000028fc393f>] blk_register_queue+0x2a4/0x420
[<
000000007345a042>] device_add_disk+0x6fd/0xe40
[<
0000000060e6aab0>] nbd_dev_add+0x828/0xbf0 [nbd]
...
It is because the memory allocated in wbt_enable_default() is not
released in device_add_disk() error path.
Normally, these memory are freed in:
del_gendisk()
rq_qos_exit()
rqos->ops->exit(rqos);
wbt_exit()
So rq_qos_exit() is called to free the rq_wb memory for wbt_init().
However in the error path of device_add_disk(), only
blk_unregister_queue() is called and make rq_wb memory leaked.
Add rq_qos_exit() to the error path to fix it.
Fixes:
83cbce957446 ("block: add error handling for device_add_disk / add_disk")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221029071355.35462-1-chenzhongjin@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Sat, 29 Oct 2022 01:04:32 +0000 (09:04 +0800)]
ublk_drv: add ublk_queue_cmd() for cleanup
Add helper of ublk_queue_cmd() so that both ublk_queue_rq()
and ublk_handle_need_get_data() can reuse this helper.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221029010432.598367-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Sat, 29 Oct 2022 01:04:31 +0000 (09:04 +0800)]
ublk_drv: avoid to touch io_uring cmd in blk_mq io path
io_uring cmd is supposed to be used in ubq daemon context mainly,
and we should try to avoid to touch it in ublk io submission context,
otherwise this data could become shared between the two contexts,
and performance is hurt.
So link request into one per-queue list, and use same batching policy
of io_uring command, just avoid to touch ucmd in blk-mq io context.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221029010432.598367-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Sat, 29 Oct 2022 01:04:30 +0000 (09:04 +0800)]
ublk_drv: comment on ublk_driver entry of Kconfig
Add help info for choosing to build ublk_drv as module or builtin.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221029010432.598367-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Sat, 29 Oct 2022 01:04:29 +0000 (09:04 +0800)]
ublk_drv: return flag of UBLK_F_URING_CMD_COMP_IN_TASK in case of module
UBLK_F_URING_CMD_COMP_IN_TASK needs to be set and returned to userspace
if ublk driver is built as module, otherwise userspace may get wrong
flags shown.
Fixes:
71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221029010432.598367-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
John Garry [Wed, 26 Oct 2022 10:35:13 +0000 (18:35 +0800)]
blk-mq: Properly init requests from blk_mq_alloc_request_hctx()
Function blk_mq_alloc_request_hctx() is missing zeroing/init of rq->bio,
biotail, __sector, and __data_len members, which blk_mq_alloc_request()
has, so duplicate what we do in blk_mq_alloc_request().
Fixes:
1f5bd336b9150 ("blk-mq: add blk_mq_alloc_request_hctx")
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1666780513-121650-1-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 27 Oct 2022 13:49:21 +0000 (07:49 -0600)]
Merge tag 'nvme-6.1-2022-10-27' of git://git.infradead.org/nvme into block-6.1
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.1
- make the multipath dma alignment to match the non-multipath one
(Keith Busch)
- fix a bogus use of sg_init_marker() (Nam Cao)
- fix circulr locking in nvme-tcp (Sagi Grimberg)"
* tag 'nvme-6.1-2022-10-27' of git://git.infradead.org/nvme:
nvme-multipath: set queue dma alignment to 3
nvme-tcp: fix possible circular locking when deleting a controller under memory pressure
nvme-tcp: replace sg_init_marker() with sg_init_table()
Ming Lei [Thu, 27 Oct 2022 08:57:09 +0000 (16:57 +0800)]
blk-mq: don't add non-pt request with ->end_io to batch
dm-rq implements ->end_io callback for request issued to underlying queue,
and it isn't passthrough request.
Commit
ab3e1d3bbab9 ("block: allow end_io based requests in the completion
batch handling") doesn't clear rq->bio and rq->__data_len for request
with ->end_io in blk_mq_end_request_batch(), and this way is actually
dangerous, but so far it is only for nvme passthrough request.
dm-rq needs to clean up remained bios in case of partial completion,
and req->bio is required, then use-after-free is triggered, so the
underlying clone request can't be completed in blk_mq_end_request_batch.
Fix panic by not adding such request into batch list, and the issue
can be triggered simply by exposing nvme pci to dm-mpath simply.
Fixes:
ab3e1d3bbab9 ("block: allow end_io based requests in the completion batch handling")
Cc: dm-devel@redhat.com
Cc: Mike Snitzer <snitzer@kernel.org>
Reported-by: Changhui Zhong <czhong@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20221027085709.513175-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Yang Yingliang [Thu, 27 Oct 2022 09:19:18 +0000 (17:19 +0800)]
rbd: fix possible memory leak in rbd_sysfs_init()
If device_register() returns error in rbd_sysfs_init(), name of kobject
which is allocated in dev_set_name() called in device_add() is leaked.
As comment of device_add() says, it should call put_device() to drop
the reference count that was set in device_initialize() when it fails,
so the name can be freed in kobject_cleanup().
Fault injection test can trigger this problem:
unreferenced object 0xffff88810173aa78 (size 8):
comm "modprobe", pid 247, jiffies
4294714278 (age 31.789s)
hex dump (first 8 bytes):
72 62 64 00 81 88 ff ff rbd.....
backtrace:
[<
00000000f58fae56>] __kmalloc_node_track_caller+0x44/0x1b0
[<
00000000bdd44fe7>] kstrdup+0x3a/0x70
[<
00000000f7844d0b>] kstrdup_const+0x63/0x80
[<
000000001b0a0eeb>] kvasprintf_const+0x10b/0x190
[<
00000000a47bd894>] kobject_set_name_vargs+0x56/0x150
[<
00000000d5edbf18>] dev_set_name+0xab/0xe0
[<
00000000f5153e80>] device_add+0x106/0x1f20
Fixes:
dfc5606dc513 ("rbd: replace the rbd sysfs interface")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20221027091918.2294132-1-yangyingliang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Keith Busch [Mon, 24 Oct 2022 18:57:45 +0000 (11:57 -0700)]
nvme-multipath: set queue dma alignment to 3
NVMe spec requires all transports support dword aligned addresses, which
is already set in the namespace request_queue. Set the same limit in the
multipath device's request_queue as well.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Sun, 23 Oct 2022 08:04:43 +0000 (11:04 +0300)]
nvme-tcp: fix possible circular locking when deleting a controller under memory pressure
When destroying a queue, when calling sock_release, the network stack
might need to allocate an skb to send a FIN/RST. When that happens
during memory pressure, there is a need to reclaim memory, which
in turn may ask the nvme-tcp device to write out dirty pages, however
this is not possible due to a ctrl teardown that is going on.
Set PF_MEMALLOC to the task that releases the socket to grant access
to PF_MEMALLOC reserves. In addition, do the same for the nvme-tcp
thread as this may also originate from the swap itself and should
be more resilient to memory pressure situations.
This fixes the following lockdep complaint:
--
======================================================
WARNING: possible circular locking dependency detected
6.0.0-rc2+ #25 Tainted: G W
------------------------------------------------------
kswapd0/92 is trying to acquire lock:
ffff888114003240 (sk_lock-AF_INET-NVME){+.+.}-{0:0}, at: tcp_sendpage+0x23/0xa0
but task is already holding lock:
ffffffff97e95ca0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x987/0x10d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0x11e/0x160
kmem_cache_alloc_node+0x44/0x530
__alloc_skb+0x158/0x230
tcp_send_active_reset+0x7e/0x730
tcp_disconnect+0x1272/0x1ae0
__tcp_close+0x707/0xd90
tcp_close+0x26/0x80
inet_release+0xfa/0x220
sock_release+0x85/0x1a0
nvme_tcp_free_queue+0x1fd/0x470 [nvme_tcp]
nvme_do_delete_ctrl+0x130/0x13d [nvme_core]
nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
kernfs_fop_write_iter+0x356/0x530
vfs_write+0x4e8/0xce0
ksys_write+0xfd/0x1d0
do_syscall_64+0x58/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
-> #0 (sk_lock-AF_INET-NVME){+.+.}-{0:0}:
__lock_acquire+0x2a0c/0x5690
lock_acquire+0x18e/0x4f0
lock_sock_nested+0x37/0xc0
tcp_sendpage+0x23/0xa0
inet_sendpage+0xad/0x120
kernel_sendpage+0x156/0x440
nvme_tcp_try_send+0x48a/0x2630 [nvme_tcp]
nvme_tcp_queue_rq+0xefb/0x17e0 [nvme_tcp]
__blk_mq_try_issue_directly+0x452/0x660
blk_mq_plug_issue_direct.constprop.0+0x207/0x700
blk_mq_flush_plug_list+0x6f5/0xc70
__blk_flush_plug+0x264/0x410
blk_finish_plug+0x4b/0xa0
shrink_lruvec+0x1263/0x1ea0
shrink_node+0x736/0x1a80
balance_pgdat+0x740/0x10d0
kswapd+0x5f2/0xaf0
kthread+0x256/0x2f0
ret_from_fork+0x1f/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(sk_lock-AF_INET-NVME);
lock(fs_reclaim);
lock(sk_lock-AF_INET-NVME);
*** DEADLOCK ***
3 locks held by kswapd0/92:
#0:
ffffffff97e95ca0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x987/0x10d0
#1:
ffff88811f21b0b0 (q->srcu){....}-{0:0}, at: blk_mq_flush_plug_list+0x6b3/0xc70
#2:
ffff888170b11470 (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0xeb9/0x17e0 [nvme_tcp]
Fixes:
3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Nam Cao [Sat, 22 Oct 2022 17:46:36 +0000 (19:46 +0200)]
nvme-tcp: replace sg_init_marker() with sg_init_table()
In nvme_tcp_ddgst_update(), sg_init_marker() is called with an
uninitialized scatterlist. This is probably fine, but gcc complains:
CC [M] drivers/nvme/host/tcp.o
In file included from ./include/linux/dma-mapping.h:10,
from ./include/linux/skbuff.h:31,
from ./include/net/net_namespace.h:43,
from ./include/linux/netdevice.h:38,
from ./include/net/sock.h:46,
from drivers/nvme/host/tcp.c:12:
In function ‘sg_mark_end’,
inlined from ‘sg_init_marker’ at ./include/linux/scatterlist.h:356:2,
inlined from ‘nvme_tcp_ddgst_update’ at drivers/nvme/host/tcp.c:390:2:
./include/linux/scatterlist.h:234:11: error: ‘sg.page_link’ is used uninitialized [-Werror=uninitialized]
234 | sg->page_link |= SG_END;
| ~~^~~~~~~~~~~
drivers/nvme/host/tcp.c: In function ‘nvme_tcp_ddgst_update’:
drivers/nvme/host/tcp.c:388:28: note: ‘sg’ declared here
388 | struct scatterlist sg;
| ^~
cc1: all warnings being treated as errors
Use sg_init_table() instead, which basically memset the scatterlist to
zero first before calling sg_init_marker().
Signed-off-by: Nam Cao <namcaov@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Yu Kuai [Sat, 22 Oct 2022 02:16:15 +0000 (10:16 +0800)]
block: fix memory leak for elevator on add_disk failure
The default elevator is allocated in the beginning of device_add_disk(),
however, it's not freed in the following error path.
Fixes:
50e34d78815e ("block: disable the elevator int del_gendisk")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/20221022021615.2756171-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ye Bin [Wed, 19 Oct 2022 03:36:02 +0000 (11:36 +0800)]
blktrace: remove unnessary stop block trace in 'blk_trace_shutdown'
As previous commit, 'blk_trace_cleanup' will stop block trace if
block trace's state is 'Blktrace_running'.
So remove unnessary stop block trace in 'blk_trace_shutdown'.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019033602.752383-4-yebin@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ye Bin [Wed, 19 Oct 2022 03:36:01 +0000 (11:36 +0800)]
blktrace: fix possible memleak in '__blk_trace_remove'
When test as follows:
step1: ioctl(sda, BLKTRACESETUP, &arg)
step2: ioctl(sda, BLKTRACESTART, NULL)
step3: ioctl(sda, BLKTRACETEARDOWN, NULL)
step4: ioctl(sda, BLKTRACESETUP, &arg)
Got issue as follows:
debugfs: File 'dropped' in directory 'sda' already present!
debugfs: File 'msg' in directory 'sda' already present!
debugfs: File 'trace0' in directory 'sda' already present!
And also find syzkaller report issue like "KASAN: use-after-free Read in relay_switch_subbuf"
"https://syzkaller.appspot.com/bug?id=
13849f0d9b1b818b087341691be6cc3ac6a6bfb7"
If remove block trace without stop(BLKTRACESTOP) block trace, '__blk_trace_remove'
will just set 'q->blk_trace' with NULL. However, debugfs file isn't removed, so
will report file already present when call BLKTRACESETUP.
static int __blk_trace_remove(struct request_queue *q)
{
struct blk_trace *bt;
bt = rcu_replace_pointer(q->blk_trace, NULL,
lockdep_is_held(&q->debugfs_mutex));
if (!bt)
return -EINVAL;
if (bt->trace_state != Blktrace_running)
blk_trace_cleanup(q, bt);
return 0;
}
If do test as follows:
step1: ioctl(sda, BLKTRACESETUP, &arg)
step2: ioctl(sda, BLKTRACESTART, NULL)
step3: ioctl(sda, BLKTRACETEARDOWN, NULL)
step4: remove sda
There will remove debugfs directory which will remove recursively all file
under directory.
>> blk_release_queue
>> debugfs_remove_recursive(q->debugfs_dir)
So all files which created in 'do_blk_trace_setup' are removed, and
'dentry->d_inode' is NULL. But 'q->blk_trace' is still in 'running_trace_lock',
'trace_note_tsk' will traverse 'running_trace_lock' all nodes.
>>trace_note_tsk
>> trace_note
>> relay_reserve
>> relay_switch_subbuf
>> d_inode(buf->dentry)->i_size
To solve above issues, reference commit '
5afedf670caf', call 'blk_trace_cleanup'
unconditionally in '__blk_trace_remove' and first stop block trace in
'blk_trace_cleanup'.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019033602.752383-3-yebin@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ye Bin [Wed, 19 Oct 2022 03:36:00 +0000 (11:36 +0800)]
blktrace: introduce 'blk_trace_{start,stop}' helper
Introduce 'blk_trace_{start,stop}' helper. No functional changed.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019033602.752383-2-yebin@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Tue, 18 Oct 2022 19:50:55 +0000 (20:50 +0100)]
bio: safeguard REQ_ALLOC_CACHE bio put
bio_put() with REQ_ALLOC_CACHE assumes that it's executed not from
an irq context. Let's add a warning if the invariant is not respected,
especially since there is a couple of places removing REQ_POLLED by hand
without also clearing REQ_ALLOC_CACHE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/558d78313476c4e9c233902efa0092644c3d420a.1666122465.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Yuwei Guan [Tue, 18 Oct 2022 03:01:39 +0000 (11:01 +0800)]
block, bfq: remove unused variable for bfq_queue
it defined in
d0edc2473be9d, but there's nowhere to use it,
so remove it.
Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20221018030139.159-1-Yuwei.Guan@zeekrlife.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Christoph Böhmwalder [Thu, 20 Oct 2022 08:52:05 +0000 (10:52 +0200)]
drbd: only clone bio if we have a backing device
Commit
c347a787e34cb (drbd: set ->bi_bdev in drbd_req_new) moved a
bio_set_dev call (which has since been removed) to "earlier", from
drbd_request_prepare to drbd_req_new.
The problem is that this accesses device->ldev->backing_bdev, which is
not NULL-checked at this point. When we don't have an ldev (i.e. when
the DRBD device is diskless), this leads to a null pointer deref.
So, only allocate the private_bio if we actually have a disk. This is
also a small optimization, since we don't clone the bio to only to
immediately free it again in the diskless case.
Fixes:
c347a787e34cb ("drbd: set ->bi_bdev in drbd_req_new")
Co-developed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Co-developed-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221020085205.129090-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 20 Oct 2022 12:43:58 +0000 (05:43 -0700)]
Merge tag 'nvme-6.1-2022-10-22' of git://git.infradead.org/nvme into block-6.1
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.1
- fix nvme-hwmon for DMA non-cohehrent architectures (Serge Semin)
- add a nvme-hwmong maintainer (Christoph Hellwig)
- fix error pointer dereference in error handling (Dan Carpenter)
- fix invalid memory reference in nvmet_subsys_attr_qid_max_show
(Daniel Wagner)
- don't limit the DMA segment size in nvme-apple (Russell King)
- fix workqueue MEM_RECLAIM flushing dependency (Sagi Grimberg)
- disable write zeroes on various Kingston SSDs (Xander Li)"
* tag 'nvme-6.1-2022-10-22' of git://git.infradead.org/nvme:
nvmet: fix invalid memory reference in nvmet_subsys_attr_qid_max_show
nvmet: fix workqueue MEM_RECLAIM flushing dependency
nvme-hwmon: kmalloc the NVME SMART log buffer
nvme-hwmon: consistently ignore errors from nvme_hwmon_init
nvme: add Guenther as nvme-hwmon maintainer
nvme-apple: don't limit DMA segement size
nvme-pci: disable write zeroes on various Kingston SSD
nvme: fix error pointer dereference in error handling
Yushan Zhou [Tue, 18 Oct 2022 10:01:32 +0000 (18:01 +0800)]
ublk_drv: use flexible-array member instead of zero-length array
Eliminate the following coccicheck warning:
./drivers/block/ublk_drv.c:127:16-19: WARNING use flexible-array member instead
Signed-off-by: Yushan Zhou <katrinzhou@tencent.com>
Link: https://lore.kernel.org/r/20221018100132.355393-1-zys.zljxml@gmail.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Daniel Wagner [Fri, 7 Oct 2022 07:29:34 +0000 (09:29 +0200)]
nvmet: fix invalid memory reference in nvmet_subsys_attr_qid_max_show
The item passed into nvmet_subsys_attr_qid_max_show is not a member of
struct nvmet_port, it is part of nvmet_subsys. Hence, don't try to
dereference it as struct nvme_ctrl pointer.
Fixes:
3e980f5995e0 ("nvmet: Expose max queues to configfs")
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20220913064203.133536-1-dwagner@suse.de
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Wed, 28 Sep 2022 06:39:10 +0000 (09:39 +0300)]
nvmet: fix workqueue MEM_RECLAIM flushing dependency
The keep alive timer needs to stay on nvmet_wq, and not
modified to reschedule on the system_wq.
This fixes a warning:
------------[ cut here ]------------
workqueue: WQ_MEM_RECLAIM
nvmet-wq:nvmet_rdma_release_queue_work [nvmet_rdma] is flushing
!WQ_MEM_RECLAIM events:nvmet_keep_alive_timer [nvmet]
WARNING: CPU: 3 PID: 1086 at kernel/workqueue.c:2628
check_flush_dependency+0x16c/0x1e0
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Fixes:
8832cf922151 ("nvmet: use a private workqueue instead of the system workqueue")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Serge Semin [Tue, 18 Oct 2022 15:33:52 +0000 (17:33 +0200)]
nvme-hwmon: kmalloc the NVME SMART log buffer
Recent commit
52fde2c07da6 ("nvme: set dma alignment to dword") has
caused a regression on our platform.
It turned out that the nvme_get_log() method invocation caused the
nvme_hwmon_data structure instance corruption. In particular the
nvme_hwmon_data.ctrl pointer was overwritten either with zeros or with
garbage. After some research we discovered that the problem happened
even before the actual NVME DMA execution, but during the buffer mapping.
Since our platform is DMA-noncoherent, the mapping implied the cache-line
invalidations or write-backs depending on the DMA-direction parameter.
In case of the NVME SMART log getting the DMA was performed
from-device-to-memory, thus the cache-invalidation was activated during
the buffer mapping. Since the log-buffer isn't cache-line aligned, the
cache-invalidation caused the neighbour data to be discarded. The
neighbouring data turned to be the data surrounding the buffer in the
framework of the nvme_hwmon_data structure.
In order to fix that we need to make sure that the whole log-buffer is
defined within the cache-line-aligned memory region so the
cache-invalidation procedure wouldn't involve the adjacent data. One of
the option to guarantee that is to kmalloc the DMA-buffer [1]. Seeing the
rest of the NVME core driver prefer that method it has been chosen to fix
this problem too.
Note after a deeper researches we found out that the denoted commit wasn't
a root cause of the problem. It just revealed the invalidity by activating
the DMA-based NVME SMART log getting performed in the framework of the
NVME hwmon driver. The problem was here since the initial commit of the
driver.
[1] Documentation/core-api/dma-api-howto.rst
Fixes:
400b6a7b13a3 ("nvme: Add hardware monitoring support")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig [Tue, 18 Oct 2022 14:55:55 +0000 (16:55 +0200)]
nvme-hwmon: consistently ignore errors from nvme_hwmon_init
An NVMe controller works perfectly fine even when the hwmon
initialization fails. Stop returning errors that do not come from a
controller reset from nvme_hwmon_init to handle this case consistently.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Christoph Hellwig [Tue, 18 Oct 2022 14:59:16 +0000 (16:59 +0200)]
nvme: add Guenther as nvme-hwmon maintainer
Given that non of the overall NVMe maintainers knows this code very
deeply it probably makes sense to add Guenther as an additional
MAINTAINER for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Russell King (Oracle) [Wed, 12 Oct 2022 11:46:06 +0000 (12:46 +0100)]
nvme-apple: don't limit DMA segement size
NVMe uses PRPs for data transfers and has no specific limit for a single
DMA segement. Limiting the size will cause problems because the block
layer assumes PRP-ish devices using a virt boundary mask don't have a
segment limit. And while this is true, we also really need to tell the
DMA mapping layer about it, otherwise dma-debug will trip over it.
Fixes:
5bd2927aceba ("nvme-apple: Add initial Apple SoC NVMe driver")
Suggested-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
[hch: rewrote the commit message based on the PCIe commit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Curtin <ecurtin@redhat.com>
Reviewed-by: Sven Peter <sven@svenpeter.dev>
Xander Li [Tue, 11 Oct 2022 11:06:42 +0000 (04:06 -0700)]
nvme-pci: disable write zeroes on various Kingston SSD
Kingston SSDs do support NVMe Write_Zeroes cmd but take long time to
process. The firmware version is locked by these SSDs, we can not expect
firmware improvement, so disable Write_Zeroes cmd.
Signed-off-by: Xander Li <xander_li@kingston.com.tw>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Dan Carpenter [Sat, 15 Oct 2022 08:25:56 +0000 (11:25 +0300)]
nvme: fix error pointer dereference in error handling
There is typo here so it releases the wrong variable. "ctrl->admin_q"
was intended instead of "ctrl->fabrics_q".
Fixes:
fe60e8c53411 ("nvme: add common helpers to allocate and free tagsets")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
ZiyangZhang [Tue, 18 Oct 2022 04:53:46 +0000 (12:53 +0800)]
Documentation: document ublk user recovery feature
Add documentation for user recovery feature of ublk subsystem.
Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20221018045346.99706-2-ZiyangZhang@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Yu Kuai [Tue, 11 Oct 2022 14:22:53 +0000 (22:22 +0800)]
blk-mq: fix null pointer dereference in blk_mq_clear_rq_mapping()
Our syzkaller report a null pointer dereference, root cause is
following:
__blk_mq_alloc_map_and_rqs
set->tags[hctx_idx] = blk_mq_alloc_map_and_rqs
blk_mq_alloc_map_and_rqs
blk_mq_alloc_rqs
// failed due to oom
alloc_pages_node
// set->tags[hctx_idx] is still NULL
blk_mq_free_rqs
drv_tags = set->tags[hctx_idx];
// null pointer dereference is triggered
blk_mq_clear_rq_mapping(drv_tags, ...)
This is because commit
63064be150e4 ("blk-mq:
Add blk_mq_alloc_map_and_rqs()") merged the two steps:
1) set->tags[hctx_idx] = blk_mq_alloc_rq_map()
2) blk_mq_alloc_rqs(..., set->tags[hctx_idx])
into one step:
set->tags[hctx_idx] = blk_mq_alloc_map_and_rqs()
Since tags is not initialized yet in this case, fix the problem by
checking if tags is NULL pointer in blk_mq_clear_rq_mapping().
Fixes:
63064be150e4 ("blk-mq: Add blk_mq_alloc_map_and_rqs()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20221011142253.4015966-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 12 Oct 2022 13:15:53 +0000 (07:15 -0600)]
Merge tag 'nvme-6.1-2022-10-12' of git://git.infradead.org/nvme into block-6.1
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.1
- add NVME_QUIRK_BOGUS_NID for Lexar NM760 (Abhijit)
- avoid the deepest sleep state on ZHITAI TiPro5000 SSDs (Xi Ruoyao)
- fix possible hang caused during ctrl deletion (Sagi Grimberg)
- fix possible hang in live ns resize with ANA access (Sagi Grimberg)"
* tag 'nvme-6.1-2022-10-12' of git://git.infradead.org/nvme:
nvme-multipath: fix possible hang in live ns resize with ANA access
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760
nvme-tcp: fix possible hang caused during ctrl deletion
nvme-rdma: fix possible hang caused during ctrl deletion
Sagi Grimberg [Thu, 29 Sep 2022 07:36:47 +0000 (10:36 +0300)]
nvme-multipath: fix possible hang in live ns resize with ANA access
When we revalidate paths as part of ns size change (as of commit
e7d65803e2bb), it is possible that during the path revalidation, the
only paths that is IO capable (i.e. optimized/non-optimized) are the
ones that ns resize was not yet informed to the host, which will cause
inflight requests to be requeued (as we have available paths but none
are IO capable). These requests on the requeue list are waiting for
someone to resubmit them at some point.
The IO capable paths will eventually notify the ns resize change to the
host, but there is nothing that will kick the requeue list to resubmit
the queued requests.
Fix this by always kicking the requeue list, and if no IO capable path
exists, these requests will be queued again.
A typical log that indicates that IOs are requeued:
--
nvme nvme1: creating 4 I/O queues.
nvme nvme1: new ctrl: "testnqn1"
nvme nvme2: creating 4 I/O queues.
nvme nvme2: mapped 4/0/0 default/read/poll queues.
nvme nvme2: new ctrl: NQN "testnqn1", addr 127.0.0.1:8009
nvme nvme1: rescanning namespaces.
nvme1n1: detected capacity change from 2097152 to 4194304
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
block nvme1n1: no usable path - requeuing I/O
nvme nvme2: rescanning namespaces.
--
Reported-by: Yogev Cohen <yogev@lightbitslabs.com>
Fixes:
e7d65803e2bb ("nvme-multipath: revalidate paths during rescan")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Cc: <stable@vger.kernel.org> # v5.15+
Signed-off-by: Christoph Hellwig <hch@lst.de>
Xi Ruoyao [Wed, 28 Sep 2022 09:39:13 +0000 (17:39 +0800)]
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs
ZHITAI TiPro5000 SSDs has the same APST sleep problem as its cousin,
TiPro7000. The quirk for TiPro7000 has been added in
commit
6b961bce50e4 ("nvme-pci: avoid the deepest sleep state on
ZHITAI TiPro7000 SSDs"), use the same quirk for TiPro5000.
The ASPT data from "nvme id-ctrl /dev/nvme1":
vid : 0x1e49
ssvid : 0x1e49
sn : ZTA21T0KA2227304LM
mn : ZHITAI TiPlus5000 1TB
fr : ZTA09139
[...]
ps 0 : mp:6.50W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:5.80W operational enlat:0 exlat:0 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:3.60W operational enlat:0 exlat:0 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0500W non-operational enlat:5000 exlat:10000 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0025W non-operational enlat:8000 exlat:45000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
Reported-and-tested-by: Chang Feng <flukehn@gmail.com>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Abhijit [Mon, 10 Oct 2022 08:30:05 +0000 (10:30 +0200)]
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760
Add a quirk to fix Lexar NM760 SSD drives reporting duplicate nsids.
Signed-off-by: Abhijit <abhijit@abhijittomar.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Wed, 28 Sep 2022 06:23:25 +0000 (09:23 +0300)]
nvme-tcp: fix possible hang caused during ctrl deletion
When we delete a controller, we execute the following:
1. nvme_stop_ctrl() - stop some work elements that may be
inflight or scheduled (specifically also .stop_ctrl
which cancels ctrl error recovery work)
2. nvme_remove_namespaces() - which first flushes scan_work
to avoid competing ns addition/removal
3. continue to teardown the controller
However, if err_work was scheduled to run in (1), it is designed to
cancel any inflight I/O, particularly I/O that is originating from ns
scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
prevent forward progress of (2) as ns scanning is blocking on I/O
(that will never be cancelled).
The race is:
1. transport layer error observed -> err_work is scheduled
2. scan_work executes, discovers ns, generate I/O to it
3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
- err_work never executed
4. nvme_remove_namespaces() -> flush_work(scan_work)
--> deadlock, because scan_work is blocked on I/O that was supposed
to be cancelled by err_work, but was cancelled before executing (see
stack trace [1]).
Fix this by flushing err_work instead of cancelling it, to force it
to execute and cancel all inflight I/O.
[1]:
--
Call Trace:
<TASK>
__schedule+0x390/0x910
? scan_shadow_nodes+0x40/0x40
schedule+0x55/0xe0
io_schedule+0x16/0x40
do_read_cache_page+0x55d/0x850
? __page_cache_alloc+0x90/0x90
read_cache_page+0x12/0x20
read_part_sector+0x3f/0x110
amiga_partition+0x3d/0x3e0
? osf_partition+0x33/0x220
? put_partition+0x90/0x90
bdev_disk_changed+0x1fe/0x4d0
blkdev_get_whole+0x7b/0x90
blkdev_get_by_dev+0xda/0x2d0
device_add_disk+0x356/0x3b0
nvme_mpath_set_live+0x13c/0x1a0 [nvme_core]
? nvme_parse_ana_log+0xae/0x1a0 [nvme_core]
nvme_update_ns_ana_state+0x3a/0x40 [nvme_core]
nvme_mpath_add_disk+0x120/0x160 [nvme_core]
nvme_alloc_ns+0x594/0xa00 [nvme_core]
nvme_validate_or_alloc_ns+0xb9/0x1a0 [nvme_core]
? __nvme_submit_sync_cmd+0x1d2/0x210 [nvme_core]
nvme_scan_work+0x281/0x410 [nvme_core]
process_one_work+0x1be/0x380
worker_thread+0x37/0x3b0
? process_one_work+0x380/0x380
kthread+0x12d/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
INFO: task nvme:6725 blocked for more than 491 seconds.
Not tainted 5.15.65-f0.el7.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:nvme state:D
stack: 0 pid: 6725 ppid: 1761 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x390/0x910
? sched_clock+0x9/0x10
schedule+0x55/0xe0
schedule_timeout+0x24b/0x2e0
? try_to_wake_up+0x358/0x510
? finish_task_switch+0x88/0x2c0
wait_for_completion+0xa5/0x110
__flush_work+0x144/0x210
? worker_attach_to_pool+0xc0/0xc0
flush_work+0x10/0x20
nvme_remove_namespaces+0x41/0xf0 [nvme_core]
nvme_do_delete_ctrl+0x47/0x66 [nvme_core]
nvme_sysfs_delete.cold.96+0x8/0xd [nvme_core]
dev_attr_store+0x14/0x30
sysfs_kf_write+0x38/0x50
kernfs_fop_write_iter+0x146/0x1d0
new_sync_write+0x114/0x1b0
? intel_pmu_handle_irq+0xe0/0x420
vfs_write+0x18d/0x270
ksys_write+0x61/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x61/0xcb
--
Fixes:
3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Reported-by: Jonathan Nicklin <jnicklin@blockbridge.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Jonathan Nicklin <jnicklin@blockbridge.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Wed, 28 Sep 2022 06:23:26 +0000 (09:23 +0300)]
nvme-rdma: fix possible hang caused during ctrl deletion
When we delete a controller, we execute the following:
1. nvme_stop_ctrl() - stop some work elements that may be
inflight or scheduled (specifically also .stop_ctrl
which cancels ctrl error recovery work)
2. nvme_remove_namespaces() - which first flushes scan_work
to avoid competing ns addition/removal
3. continue to teardown the controller
However, if err_work was scheduled to run in (1), it is designed to
cancel any inflight I/O, particularly I/O that is originating from ns
scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
prevent forward progress of (2) as ns scanning is blocking on I/O
(that will never be cancelled).
The race is:
1. transport layer error observed -> err_work is scheduled
2. scan_work executes, discovers ns, generate I/O to it
3. nvme_ctop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
- err_work never executed
4. nvme_remove_namespaces() -> flush_work(scan_work)
--> deadlock, because scan_work is blocked on I/O that was supposed
to be cancelled by err_work, but was cancelled before executing.
Fix this by flushing err_work instead of cancelling it, to force it
to execute and cancel all inflight I/O.
Fixes:
b435ecea2a4d ("nvme: Add .stop_ctrl to nvme ctrl ops")
Fixes:
f6c8e432cb04 ("nvme: flush namespace scanning work just before removing namespaces")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Jens Axboe [Mon, 10 Oct 2022 17:26:40 +0000 (11:26 -0600)]
Merge branch 'for-6.1/block' into block-6.1
Merge in later fixes.
* for-6.1/block:
block: fix leaking minors of hidden disks
block: avoid sign extend problem with default queue flags mask
blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init()
block: Remove the repeat word 'can'
MAINTAINERS: Update SED-Opal Maintainers
Christoph Hellwig [Mon, 10 Oct 2022 13:18:57 +0000 (15:18 +0200)]
block: fix leaking minors of hidden disks
The major/minor of a hidden gendisk is not propagated to the block
device because it is never registered using bdev_add. But the lack of
bd_dev also causes the dynamic major minor number not to be freed.
Assign bd_dev manually to ensure the dynamic major minor gets freed.
Based on a patch by Keith Busch.
Fixes:
8ddcd653257c ("block: introduce GENHD_FL_HIDDEN")
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20221010131857.748129-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Brian Foster [Mon, 3 Oct 2022 13:35:34 +0000 (09:35 -0400)]
block: avoid sign extend problem with default queue flags mask
request_queue->queue_flags is unsigned long, which is 8-bytes on
64-bit architectures. Most queue flag modifications occur through
bit field helpers, but default flags can be logically OR'd via the
QUEUE_FLAG_MQ_DEFAULT mask. If this mask happens to include bit 31,
the assignment can sign extend the field and set all upper 32 bits.
This exact problem has been observed on a downstream kernel that
happens to use bit 31 for QUEUE_FLAG_NOWAIT. This is not an
immediate problem for current upstream because bit 31 is not
included in the default flag assignment (and is not used at all,
actually). Regardless, fix up the QUEUE_FLAG_MQ_DEFAULT mask
definition to avoid the landmine in the future.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221003133534.1075582-1-bfoster@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 9 Oct 2022 23:24:05 +0000 (16:24 -0700)]
Merge tag 'ucount-rlimits-cleanups-for-v5.19' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull ucounts update from Eric Biederman:
"Split rlimit and ucount values and max values
After the ucount rlimit code was merged a bunch of small but
siginificant bugs were found and fixed. At the time it was realized
that part of the problem was that while the ucount rlimits were very
similar to the oridinary ucounts (in being nested counts with limits)
the semantics were slightly different and the code would be less error
prone if there was less sharing.
This is the long awaited cleanup that should hopefully keep things
more comprehensible and less error prone for whoever needs to touch
that code next"
* tag 'ucount-rlimits-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ucounts: Split rlimit and ucount values and max values
Linus Torvalds [Sun, 9 Oct 2022 23:14:15 +0000 (16:14 -0700)]
Merge tag 'signal-for-v5.20' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull ptrace update from Eric Biederman:
"ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
Recently I had a conversation where it was pointed out to me that
SIGKILL sent to a tracee stropped in PTRACE_EVENT_EXIT is quite
difficult for a tracer to handle.
Keeping SIGKILL working after the process has been killed is pain from
an implementation point of view.
So since the debuggers don't want this behavior let's see if we can
remove this wart for the userspace API
If a regression is detected it should only need to be the last change
that is the reverted. The other two are just general cleanups that
make the last patch simpler"
* tag 'signal-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Drop signals received after a fatal signal has been processed
signal: Guarantee that SIGNAL_GROUP_EXIT is set on process exit
signal: Ensure SIGNAL_GROUP_EXIT gets set in do_group_exit
Linus Torvalds [Sun, 9 Oct 2022 23:10:22 +0000 (16:10 -0700)]
Merge tag 'retire_mq_sysctls-for-v5.19' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull mqueue fix from Eric Biederman:
"A fix for an unlikely but possible memory leak"
* tag 'retire_mq_sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ipc: mqueue: fix possible memory leak in init_mqueue_fs()
Linus Torvalds [Sun, 9 Oct 2022 23:01:59 +0000 (16:01 -0700)]
Merge tag 'interrupting_kthread_stop-for-v5.20' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull kthread update from Eric Biederman:
"Break out of wait loops on kthread_stop()
This is a small tweak to kthread_stop so it breaks out of
interruptible waits, that don't explicitly test for kthread_stop.
These interruptible waits occassionaly occur in kernel threads do to
code sharing"
* tag 'interrupting_kthread_stop-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: break out of wait loops on kthread_stop()
Linus Torvalds [Sun, 9 Oct 2022 21:05:15 +0000 (14:05 -0700)]
Merge tag 'powerpc-6.1-1' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Remove our now never-true definitions for pgd_huge() and p4d_leaf().
- Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit.
- Add support for syscall wrappers.
- Add support for KFENCE on 64-bit.
- Update 64-bit HV KVM to use the new guest state entry/exit accounting
API.
- Support execute-only memory when using the Radix MMU (P9 or later).
- Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests.
- Updates to our linker script to move more data into read-only
sections.
- Allow the VDSO to be randomised on 32-bit.
- Many other small features and fixes.
Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas,
Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin
Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent
Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan
Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali
Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool,
Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng
Yongjun.
* tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
KVM: PPC: Book3S HV: Fix stack frame regs marker
powerpc: Don't add __powerpc_ prefix to syscall entry points
powerpc/64s/interrupt: Fix stack frame regs marker
powerpc/64: Fix msr_check_and_set/clear MSR[EE] race
powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN
powerpc/pseries: Add firmware details to the hardware description
powerpc/powernv: Add opal details to the hardware description
powerpc: Add device-tree model to the hardware description
powerpc/64: Add logical PVR to the hardware description
powerpc: Add PVR & CPU name to hardware description
powerpc: Add hardware description string
powerpc/configs: Enable PPC_UV in powernv_defconfig
powerpc/configs: Update config files for removed/renamed symbols
powerpc/mm: Fix UBSAN warning reported on hugetlb
powerpc/mm: Always update max/min_low_pfn in mem_topology_setup()
powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range
powerpc: Drops STABS_DEBUG from linker scripts
powerpc/64s: Remove lost/old comment
powerpc/64s: Remove old STAB comment
powerpc: remove orphan systbl_chk.sh
...
Linus Torvalds [Sun, 9 Oct 2022 20:51:40 +0000 (13:51 -0700)]
Merge tag 's390-6.1-1' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Make use of the IBM z16 processor activity instrumentation facility
extension to count neural network processor assist operations: add a
new PMU device driver so that perf can make use of this.
- Rework memcpy_real() to avoid DAT-off mode.
- Rework absolute lowcore access code.
- Various small fixes and improvements all over the code.
* tag 's390-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: remove unused bus_next field from struct zpci_dev
s390/cio: remove unused ccw_device_force_console() declaration
s390/pai: Add support for PAI Extension 1 NNPA counters
s390/mm: fix no previous prototype warnings in maccess.c
s390/mm: uninline copy_oldmem_kernel() function
s390/mm,ptdump: add real memory copy page markers
s390/mm: rework memcpy_real() to avoid DAT-off mode
s390/dump: save IPL CPU registers once DAT is available
s390/pci: convert high_memory to physical address
s390/smp,ptdump: add absolute lowcore markers
s390/smp: rework absolute lowcore access
s390/smp: call smp_reinit_ipl_cpu() before scheduler is available
s390/ptdump: add missing amode31 markers
s390/mm: split lowcore pages with set_memory_4k()
s390/mm: remove unused access parameter from do_fault_error()
s390/delay: sync comment within __delay() with reality
s390: move from strlcpy with unused retval to strscpy
Linus Torvalds [Sun, 9 Oct 2022 20:24:01 +0000 (13:24 -0700)]
Merge tag 'riscv-for-linus-6.1-mw1' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
- The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
- The CD-ROM filesystems have been enabled in the defconfig.
- Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: enable THP_SWAP for RV64
RISC-V: Print SSTC in canonical order
riscv: compat: s/failed/unsupported if compat mode isn't supported
RISC-V: Increase range and default value of NR_CPUS
cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
perf: RISC-V: throttle perf events
perf: RISC-V: exclude invalid pmu counters from SBI calls
riscv: enable CD-ROM file systems in defconfig
riscv: topology: fix default topology reporting
arm64: topology: move store_cpu_topology() to shared code
Linus Torvalds [Sun, 9 Oct 2022 20:13:48 +0000 (13:13 -0700)]
Merge tag 'microblaze-v6.1' of git://git.monstr.eu/linux-2.6-microblaze
Pull microblaze updates from Michal Simek:
"This adds architecture support for error injection which can be done
only via local memory (BRAM) with enabling path for recovery after
reset.
These patches targets Triple Modular Redundacy (TMR) configuration
where 3 Microblazes are running in parallel with monitoring logic.
When an error happens (or is injected) system goes to break handler
with full CPU reset and system recovery back to origin context. More
information can be found at [1]"
Link: https://www.xilinx.com/content/dam/xilinx/support/documents/ip_documentation/tmr/v1_0/pg268-tmr.pdf
* tag 'microblaze-v6.1' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Add support for error injection
microblaze: Add custom break vector handler for mb manager
microblaze: Add xmb_manager_register function
Linus Torvalds [Sun, 9 Oct 2022 16:39:55 +0000 (09:39 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"The first batch of KVM patches, mostly covering x86.
ARM:
- Account stage2 page table allocations in memory stats
x86:
- Account EPT/NPT arm64 page table allocations in memory stats
- Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
accesses
- Drop eVMCS controls filtering for KVM on Hyper-V, all known
versions of Hyper-V now support eVMCS fields associated with
features that are enumerated to the guest
- Use KVM's sanitized VMCS config as the basis for the values of
nested VMX capabilities MSRs
- A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed a
longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed for
good
- A handful of fixes for memory leaks in error paths
- Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow
- Never write to memory from non-sleepable kvm_vcpu_check_block()
- Selftests refinements and cleanups
- Misc typo cleanups
Generic:
- remove KVM_REQ_UNHALT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
KVM: remove KVM_REQ_UNHALT
KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM: x86: never write to memory from kvm_vcpu_check_block()
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
KVM: x86: lapic does not have to process INIT if it is blocked
KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
KVM: x86: make vendor code check for all nested events
mailmap: Update Oliver's email address
KVM: x86: Allow force_emulation_prefix to be written without a reload
KVM: selftests: Add an x86-only test to verify nested exception queueing
KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
...
Linus Torvalds [Sun, 9 Oct 2022 15:56:54 +0000 (08:56 -0700)]
Merge tag 'efi-next-for-v6.1' of git://git./linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"A bit more going on than usual in the EFI subsystem. The main driver
for this has been the introduction of the LoonArch architecture last
cycle, which inspired some cleanup and refactoring of the EFI code.
Another driver for EFI changes this cycle and in the future is
confidential compute.
The LoongArch architecture does not use either struct bootparams or DT
natively [yet], and so passing information between the EFI stub and
the core kernel using either of those is undesirable. And in general,
overloading DT has been a source of issues on arm64, so using DT for
this on new architectures is a to avoid for the time being (even if we
might converge on something DT based for non-x86 architectures in the
future). For this reason, in addition to the patch that enables EFI
boot for LoongArch, there are a number of refactoring patches applied
on top of which separate the DT bits from the generic EFI stub bits.
These changes are on a separate topich branch that has been shared
with the LoongArch maintainers, who will include it in their pull
request as well. This is not ideal, but the best way to manage the
conflicts without stalling LoongArch for another cycle.
Another development inspired by LoongArch is the newly added support
for EFI based decompressors. Instead of adding yet another
arch-specific incarnation of this pattern for LoongArch, we are
introducing an EFI app based on the existing EFI libstub
infrastructure that encapulates the decompression code we use on other
architectures, but in a way that is fully generic. This has been
developed and tested in collaboration with distro and systemd folks,
who are eager to start using this for systemd-boot and also for arm64
secure boot on Fedora. Note that the EFI zimage files this introduces
can also be decompressed by non-EFI bootloaders if needed, as the
image header describes the location of the payload inside the image,
and the type of compression that was used. (Note that Fedora's arm64
GRUB is buggy [0] so you'll need a recent version or switch to
systemd-boot in order to use this.)
Finally, we are adding TPM measurement of the kernel command line
provided by EFI. There is an oversight in the TCG spec which results
in a blind spot for command line arguments passed to loaded images,
which means that either the loader or the stub needs to take the
measurement. Given the combinatorial explosion I am anticipating when
it comes to firmware/bootloader stacks and firmware based attestation
protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now
when it comes to EFI measured boot, which is that the kernel measures
the initrd and command line. Intermediate loaders can measure
additional assets if needed, but with the baseline in place, we can
deploy measured boot in a meaningful way even if you boot into Linux
straight from the EFI firmware.
Summary:
- implement EFI boot support for LoongArch
- implement generic EFI compressed boot support for arm64, RISC-V and
LoongArch, none of which implement a decompressor today
- measure the kernel command line into the TPM if measured boot is in
effect
- refactor the EFI stub code in order to isolate DT dependencies for
architectures other than x86
- avoid calling SetVirtualAddressMap() on arm64 if the configured
size of the VA space guarantees that doing so is unnecessary
- move some ARM specific code out of the generic EFI source files
- unmap kernel code from the x86 mixed mode 1:1 page tables"
* tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
efi/arm64: libstub: avoid SetVirtualAddressMap() when possible
efi: zboot: create MemoryMapped() device path for the parent if needed
efi: libstub: fix up the last remaining open coded boot service call
efi/arm: libstub: move ARM specific code out of generic routines
efi/libstub: measure EFI LoadOptions
efi/libstub: refactor the initrd measuring functions
efi/loongarch: libstub: remove dependency on flattened DT
efi: libstub: install boot-time memory map as config table
efi: libstub: remove DT dependency from generic stub
efi: libstub: unify initrd loading between architectures
efi: libstub: remove pointless goto kludge
efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap
efi: libstub: avoid efi_get_memory_map() for allocating the virt map
efi: libstub: drop pointless get_memory_map() call
efi: libstub: fix type confusion for load_options_size
arm64: efi: enable generic EFI compressed boot
loongarch: efi: enable generic EFI compressed boot
riscv: efi: enable generic EFI compressed boot
efi/libstub: implement generic EFI zboot
efi/libstub: move efi_system_table global var into separate object
...
Yu Kuai [Sun, 9 Oct 2022 10:10:38 +0000 (18:10 +0800)]
blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init()
commit
8c5035dfbb94 ("blk-wbt: call rq_qos_add() after wb_normal is
initialized") moves wbt_set_write_cache() before rq_qos_add(), which
is wrong because wbt_rq_qos() is still NULL.
Fix the problem by removing wbt_set_write_cache() and setting 'rwb->wc'
directly. Noted that this patch also remove the redundant setting of
'rab->wc'.
Fixes:
8c5035dfbb94 ("blk-wbt: call rq_qos_add() after wb_normal is initialized")
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/r/202210081045.77ddf59b-yujie.liu@intel.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20221009101038.1692875-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sat, 8 Oct 2022 17:30:44 +0000 (10:30 -0700)]
Merge tag 'mailbox-v6.1' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
- apple: implement poll and flush callbacks
- qcom: fix clocks for IPQ6018 and IPQ8074 irq handler as not-a-thread
- microchip: split reg-space into two
- imx: RST channel fix
- bcm: fix dma_map_sg error handling
- misc: spelling fix in pcc driver
* tag 'mailbox-v6.1' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: qcom-ipcc: flag IRQ NO_THREAD
mailbox: pcc: Fix spelling mistake "Plaform" -> "Platform"
mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg
mailbox: qcom-apcs-ipc: add IPQ8074 APSS clock support
dt-bindings: mailbox: qcom: correct clocks for IPQ6018 and IPQ8074
dt-bindings: mailbox: qcom: set correct #clock-cells
mailbox: mpfs: account for mbox offsets while sending
mailbox: mpfs: fix handling of the reg property
dt-bindings: mailbox: fix the mpfs' reg property
mailbox: imx: fix RST channel support
mailbox: apple: Implement poll_data() operation
mailbox: apple: Implement flush() operation
Linus Torvalds [Sat, 8 Oct 2022 17:06:48 +0000 (10:06 -0700)]
Merge tag 'clk-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"We have some late breaking reports that a patch series to rework clk
rate range support broke boot on some devices, so I've left that
branch out of this. Hopefully we can get to that next week, or punt on
it and let it bake another cycle. That means we don't really have any
changes to the core framework this time around besides a few typo
fixes. Instead this is all clk driver updates and fixes.
The usual suspects are here (again), with Qualcomm dominating the
diffstat. We look to have gained support for quite a few new Qualcomm
SoCs and Dmitry worked on updating many of the existing Qualcomm
drivers to use clk_parent_data. After that we have MediaTek drivers
getting some much needed updates, in particular to support GPU DVFS.
There are also quite a few Samsung clk driver patches, but that's
mostly because there was a maintainer change and so last release we
missed some of those patches.
Overall things look normal, but I'm slowly reviewing core framework
code nowadays and that shows given the rate range patches had to be
yanked last minute. Let's hope this situation changes soon.
New Drivers:
- Support for Renesas VersaClock7 clock generator family
- Add Spreadtrum UMS512 SoC clk support
- New clock drivers for MediaTek Helio X10 MT6795
- Display clks for Qualcomm SM6115, SM8450
- GPU clks for Qualcomm SC8280XP
- Qualcomm MSM8909 and SM6375 global and SMD RPM clk drivers
Deleted Drivers:
- Remove DaVinci DM644x and DM646x clk driver support
Updates:
- Convert Baikal-T1 CCU driver to platform driver
- Split reset support out of primary Baikal-T1 CCU driver
- Add some missing clks required for RPiVid Video Decoder on
RaspberryPi
- Mark PLLC critical on bcm2835
- More devm helpers for fixed rate registration
- Various PXA168 clk driver fixes
- Add resets for MediaTek MT8195 PCIe and USB
- Miscellaneous of_node_put() fixes
- Nuke dt-bindings/clk path (again) by moving headers to
dt-bindings/clock
- Convert gpio-clk-gate binding to YAML
- Various fixes to AMD/Xilinx Zynqmp clk driver
- Graduate AMD/Xilinx "clocking wizard" driver from staging
- Add missing DPI1_HDMI clock in MT8195 VDOSYS1
- Clock driver changes to support GPU DVFS on MT8183, MT8192, MT8195
- Fix GPU clock topology on MT8195
- Propogate rate changes from GPU clock gate up the tree
- Clock mux notifiers for GPU-related PLLs
- Conversion of more "simple" drivers to mtk_clk_simple_probe()
- Hook up mtk_clk_simple_remove() for "simple" MT8192 clock drivers
- Fixes to previous |struct clk| to |struct clk_hw| conversion on
MediaTek
- Shrink MT8192 clock driver by deduplicating clock parent lists
- Change order between 'sim_enet_root_clk' and 'enet_qos_root_clk'
clocks for i.MX8MP
- Drop unnecessary newline in i.MX8MM dt-bindings
- Add more MU1 and SAI clocks dt-bindings Ids
- Introduce slice busy bit check for i.MX93 composite clock
- Introduce white list bit check for i.MX93 composite clock
- Add new i.MX93 clock gate
- Add MU1 and MU2 clocks to i.MX93 clock provider
- Add SAI IPG clocks to i.MX93 clock provider
- add generic clocks for U(S)ART available on SAMA5D2 SoCs
- reset controller support for Polarfire clocks
- .round_rate and .set rate support for clk-mpfs
- code cleanup for clk-mpfs
- PLL support for PolarFire SoC's Clock Conditioning Circuitry
- Add watchdog, I2C, pin control/GPIO, and Ethernet clocks on R-Car
V4H
- Add SDHI, Timer (CMT/TMU), and SPI (MSIOF) clocks on R-Car S4-8
- Add I2C clocks and resets on RZ/V2M
- Document clock support for the RZ/Five SoC
- mux-variant clock using the table variant to select parents
- clock controller for the rv1126 soc
- conversion of rk3128 to yaml and relicensing of the yaml bindings
to gpl2+MIT (following dt-binding guildelines)
- Exynos7885: add FSYS, TREX and MFC clock controllers
- Exynos850: add IS and AUD (audio) clock controllers with bindings
- ExynosAutov9: add FSYS clock controllers with bindings
- ExynosAutov9: correct clock IDs in bindings of Peric 0 and 1 clock
controllers, due to duplicated entries. This is an acceptable ABI
break: recently developed/added platform so without legacies, acked
by known users/developers
- ExynosAutov9: add few missing Peric 0/1 gates
- ExynosAutov9: correct register offsets of few Peric 0/1 clocks
- Minor code improvements (use of_device_get_match_data() helper,
code style)
- Add Krzysztof Kozlowski as co-maintainer of Samsung SoC clocks, as
he already maintainers that architecture/platform
- Keep Qualcomm GDSCs enabled when PWRSTS_RET flag is there, solving
retention issues during suspend of USB on Qualcomm sc7180/sc7280
and SC8280XP
- Qualcomm SM6115 and QCM2260 are moved to reuse PLL configuration
- Qualcomm SDM660 SDCC1 moved to floor clk ops
- Support for the APCS PLLs for Qualcomm IPQ8064, IPQ8074 and IPQ6018
was added/fixed
- The Qualcomm MSM8996 CPU clocks are updated with support for ACD
- Support for Qualcomm SDM670 GCC and RPMh clks was added
- Transition to parent_data, parent_hws and use of ARRAY_SIZE() for
num_parents was done for many Qualcomm SoCs
- Support for per-reset defined delay on Qualcomm was introduced"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (283 commits)
clk: qcom: gcc-sm6375: Ensure unsigned long type
clk: qcom: gcc-sm6375: Remove unused variables
clk: qcom: kpss-xcc: convert to parent data API
clk: introduce (devm_)hw_register_mux_parent_data_table API
clk: allow building lan966x as a module
clk: clk-xgene: simplify if-if to if-else
clk: ast2600: BCLK comes from EPLL
clk: clocking-wizard: Depend on HAS_IOMEM
clk: clocking-wizard: Use dev_err_probe() helper
clk: nxp: fix typo in comment
clk: pxa: add a check for the return value of kzalloc()
clk: vc5: Add support for IDT/Renesas VersaClock 5P49V6975
dt-bindings: clock: vc5: Add 5P49V6975
clk: mvebu: armada-37xx-tbg: Remove the unneeded result variable
clk: ti: dra7-atl: Fix reference leak in of_dra7_atl_clk_probe
clk: Renesas versaclock7 ccf device driver
dt-bindings: Renesas versaclock7 device tree bindings
clk: ti: Balance of_node_get() calls for of_find_node_by_name()
clk: imx: scu: fix memleak on platform_device_add() fails
clk: vc5: Use regmap_{set,clear}_bits() where appropriate
...
Linus Torvalds [Sat, 8 Oct 2022 16:46:29 +0000 (09:46 -0700)]
Merge tag 'gpio-updates-for-v6.1-rc1' of git://git./linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
"We have a single new driver, support for a bunch of new models,
improvements in drivers and core gpiolib code as well device-tree
bindings changes.
Summary:
New driver:
- IMX System Controller Unit GPIOs
GPIO core:
- add fdinfo output for the GPIO character device file descriptors
(allows user-space to determine which processes own which GPIO
lines)
- improvements to OF GPIO code
- new quirk for Asus UM325UAZ in gpiolib-acpi
- new quirk for Freescale SPI in gpiolib-of
Driver improvements:
- add a new macro that reduces the amount of boilerplate code in ISA
drivers and use it in relevant drivers
- support two new models in gpio-pca953x
- support new model in gpio-f7188x
- convert more drivers to use immutable irq chips
- other minor tweaks
Device-tree bindings:
- add DT bindings for gpio-imx-scu
- convert Xilinx GPIO bindings to YAML
- reference the properties from the SPI peripheral device-tree
bindings instead of providing custom ones in the GPIO controller
document
- add parsing of GPIO hog nodes to the DT bindings for gpio-mpfs-gpio
- relax the node name requirements in gpio-stmpe
- add new models for gpio-rcar and gpio-pxa95xx
- add a new vendor prefix: Diodes (for Diodes, Inc.)
Misc:
- pulled in the immutable branch from the x86 platform drivers tree
including support for a new simatic board that depends on GPIO
changes"
* tag 'gpio-updates-for-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (36 commits)
gpio: tc3589x: Make irqchip immutable
gpiolib: cdev: add fdinfo output for line request file descriptors
gpio: twl4030: Reorder functions which allows to drop a forward declaraion
gpiolib: fix OOB access in quirk callbacks
gpiolib: of: factor out conversion from OF flags
gpiolib: rework quirk handling in of_find_gpio()
gpiolib: of: make Freescale SPI quirk similar to all others
gpiolib: of: do not ignore requested index when applying quirks
gpio: ws16c48: Ensure number of irq matches number of base
gpio: 104-idio-16: Ensure number of irq matches number of base
gpio: 104-idi-48: Ensure number of irq matches number of base
gpio: 104-dio-48e: Ensure number of irq matches number of base
counter: 104-quad-8: Ensure number of irq matches number of base
isa: Introduce the module_isa_driver_with_irq helper macro
gpio: pca953x: Add support for PCAL6534
gpio: pca953x: Swap if statements to save later complexity
gpio: pca953x: Fix pca953x_gpio_set_pull_up_down()
dt-bindings: gpio: pca95xx: add entry for pcal6534 and PI4IOE5V6534Q
dt-bindings: vendor-prefixes: add Diodes
gpio: mt7621: Switch to use platform_get_irq() function
...
Linus Torvalds [Sat, 8 Oct 2022 16:19:24 +0000 (09:19 -0700)]
Merge tag 'staging-6.1-rc1' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the large set of staging driver changes for 6.1-rc1.
Nothing really interesting in here at all except we deleted a driver
(fwserial) as no one had been using it for a long time. Other than
that, just the normal cleanups and minor fixes:
- rtl8723bs driver cleanups
- loads of r8188eu driver cleanups, making the driver smaller and
fixing up some firmware dependency issues.
- vt6655 driver cleanups.
- lots of other small staging driver cleanups.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (266 commits)
staging: rtl8192e: Rename variable Bandwidth to avoid CamelCase
staging: r8188eu: remove PHY_RFConfig8188E()
staging: r8188eu: remove PHY_RF6052_Config8188E()
staging: r8188eu: convert ODM_ReadAndConfig_AGC_TAB_1T_8188E() to int
staging: r8188eu: convert ODM_ReadAndConfig_PHY_REG_1T_8188E() to int
staging: r8188eu: convert ODM_ReadAndConfig_RadioA_1T_8188E() to int
staging: r8188eu: convert ODM_ReadAndConfig_MAC_REG_8188E() to int
staging: rtl8192e: cmdpkt: Use skb_put_data() instead of skb_put/memcpy pair
staging: r8188eu: Use skb_put_data() instead of skb_put/memcpy pair
staging: r8188eu: remove hal/odm_RegConfig8188E.c
staging: r8188eu: make odm_ConfigRF_RadioA_8188E() static
staging: r8188eu: make odm_ConfigMAC_8188E() static
staging: r8188eu: don't check for stop/removal in the blink worker
staging: r8188eu: don't check bSurpriseRemoved in SwLedOff
staging: rtl8192e: Remove unused variables ForcedAMSDUMaxSize, ...
staging: rtl8192e: Rename CurrentMPDU..., ForcedAMPDU... and ForcedMPDU...
staging: rtl8192e: Rename SelfMimoPs, CurrentOpMode and bForcedShortGI
staging: rtl8192e: Rename PeerMimoPs, IOTAction and IOTRaFunc
staging: rtl8192e: Rename RxRe...WinSize, RxReorder... and RxReorderDr...
staging: rtl8192e: Rename szRT2RTAggBuffer, bRegRxRe... and bCurRxReo...
...
Linus Torvalds [Sat, 8 Oct 2022 15:56:37 +0000 (08:56 -0700)]
Merge tag 'char-misc-6.1-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the large set of char/misc and other small driver subsystem
changes for 6.1-rc1. Loads of different things in here:
- IIO driver updates, additions, and changes. Probably the largest
part of the diffstat
- habanalabs driver update with support for new hardware and
features, the second largest part of the diff.
- fpga subsystem driver updates and additions
- mhi subsystem updates
- Coresight driver updates
- gnss subsystem updates
- extcon driver updates
- icc subsystem updates
- fsi subsystem updates
- nvmem subsystem and driver updates
- misc driver updates
- speakup driver additions for new features
- lots of tiny driver updates and cleanups
All of these have been in the linux-next tree for a while with no
reported issues"
* tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (411 commits)
w1: Split memcpy() of struct cn_msg flexible array
spmi: pmic-arb: increase SPMI transaction timeout delay
spmi: pmic-arb: block access for invalid PMIC arbiter v5 SPMI writes
spmi: pmic-arb: correct duplicate APID to PPID mapping logic
spmi: pmic-arb: add support to dispatch interrupt based on IRQ status
spmi: pmic-arb: check apid against limits before calling irq handler
spmi: pmic-arb: do not ack and clear peripheral interrupts in cleanup_irq
spmi: pmic-arb: handle spurious interrupt
spmi: pmic-arb: add a print in cleanup_irq
drivers: spmi: Directly use ida_alloc()/free()
MAINTAINERS: add TI ECAP driver info
counter: ti-ecap-capture: capture driver support for ECAP
Documentation: ABI: sysfs-bus-counter: add frequency & num_overflows items
dt-bindings: counter: add ti,am62-ecap-capture.yaml
counter: Introduce the COUNTER_COMP_ARRAY component type
counter: Consolidate Counter extension sysfs attribute creation
counter: Introduce the Count capture component
counter: 104-quad-8: Add Signal polarity component
counter: Introduce the Signal polarity component
counter: interrupt-cnt: Implement watch_validate callback
...
Linus Torvalds [Sat, 8 Oct 2022 00:04:10 +0000 (17:04 -0700)]
Merge tag 'driver-core-6.1-rc1' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core and debug printk changes for
6.1-rc1. Included in here is:
- dynamic debug updates for the core and the drm subsystem. The drm
changes have all been acked by the relevant maintainers
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they were
not being used and they really did not actually do anything)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits)
docs: filesystems: sysfs: Make text and code for ->show() consistent
Documentation: NBD_REQUEST_MAGIC isn't a magic number
a.out: restore CMAGIC
device property: Add const qualifier to device_get_match_data() parameter
drm_print: add _ddebug descriptor to drm_*dbg prototypes
drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print: optimize drm_debug_enabled for jump-label
drm-print: add drm_dbg_driver to improve namespace symmetry
drm-print.h: include dyndbg header
drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
drm_print: interpose drm_*dbg with forwarding macros
drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
drm_print: condense enum drm_debug_category
debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops
driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs()
Documentation: ENI155_MAGIC isn't a magic number
Documentation: NBD_REPLY_MAGIC isn't a magic number
nbd: remove define-only NBD_MAGIC, previously magic number
Documentation: FW_HEADER_MAGIC isn't a magic number
Documentation: EEPROM_MAGIC_VALUE isn't a magic number
...
Linus Torvalds [Fri, 7 Oct 2022 23:48:26 +0000 (16:48 -0700)]
Merge tag 'usb-6.1-rc1' of git://git./linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt driver changes for 6.1-rc1.
Nothing major in here, lots of little things with new devices
supported and updates for a few drivers. Highlights include:
- thunderbolt/USB4 devices supported a bit better than before, and
some new ids to enable new hardware devices
- USB gadget uvc updates for newer video formats and better v4l
integration (the v4l portions were acked by those maintainers)
- typec updates for tiny issues and more typec drivers for new chips.
- xhci tiny updates for minor issues
- big usb-serial ftdi_sio driver update to handle new devices better
- lots of tiny dwc3 fixes and updates for the IP block that is
showing up everywhere these days
- dts updates for new devices being supported
- other tiny janitorial and cleanups fixes for lots of different USB
drivers. Full details are in the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (169 commits)
usb: gadget: uvc: don't put item still in use
usb: gadget: uvc: Fix argument to sizeof() in uvc_register_video()
usb: host: ehci-exynos: switch to using gpiod API
Revert "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"
Revert "USB: fixup for merge issue with "usb: dwc3: Don't switch OTG -> peripheral if extcon is present""
dt-bindings: usb: Convert FOTG210 to dt schema
usb: mtu3: fix failed runtime suspend in host only mode
USB: omap_udc: Fix spelling mistake: "tranceiver_ctrl" -> "transceiver_ctrl"
usb: typec: ucsi_ccg: Disable UCSI ALT support on Tegra
usb: typec: Replace custom implementation of device_match_fwnode()
usb: typec: ucsi: Don't warn on probe deferral
usb: add quirks for Lenovo OneLink+ Dock
MAINTAINERS: switch dwc3 to Thinh
usb: idmouse: fix an uninit-value in idmouse_open
USB: PHY: JZ4770: Switch to use dev_err_probe() helper
usb: phy: generic: Switch to use dev_err_probe() helper
usb: ulpi: use DEFINE_SHOW_ATTRIBUTE to simplify ulpi_regs
usb: cdns3: remove dead code
usb: cdc-wdm: Use skb_put_data() instead of skb_put/memcpy pair
usb: musb: sunxi: Switch to use dev_err_probe() helper
...
Linus Torvalds [Fri, 7 Oct 2022 23:36:24 +0000 (16:36 -0700)]
Merge tag 'tty-6.1-rc1' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big set of TTY and Serial driver updates for 6.1-rc1.
Lots of cleanups in here, no real new functionality this time around,
with the diffstat being that we removed more lines than we added!
Included in here are:
- termios unification cleanups from Al Viro, it's nice to finally get
this work done
- tty serial transmit cleanups in various drivers in preparation for
more cleanup and unification in future releases (that work was not
ready for this release)
- n_gsm fixes and updates
- ktermios cleanups and code reductions
- dt bindings json conversions and updates for new devices
- some serial driver updates for new devices
- lots of other tiny cleanups and janitorial stuff. Full details in
the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (102 commits)
serial: cpm_uart: Don't request IRQ too early for console port
tty: serial: do unlock on a common path in altera_jtaguart_console_putc()
tty: serial: unify TX space reads under altera_jtaguart_tx_space()
tty: serial: use FIELD_GET() in lqasc_tx_ready()
tty: serial: extend lqasc_tx_ready() to lqasc_console_putchar()
tty: serial: allow pxa.c to be COMPILE_TESTed
serial: stm32: Fix unused-variable warning
tty: serial: atmel: Add COMMON_CLK dependency to SERIAL_ATMEL
serial: 8250: Fix restoring termios speed after suspend
serial: Deassert Transmit Enable on probe in driver-specific way
serial: 8250_dma: Convert to use uart_xmit_advance()
serial: 8250_omap: Convert to use uart_xmit_advance()
MAINTAINERS: Solve warning regarding inexistent atmel-usart binding
serial: stm32: Deassert Transmit Enable on ->rs485_config()
serial: ar933x: Deassert Transmit Enable on ->rs485_config()
tty: serial: atmel: Use FIELD_PREP/FIELD_GET
tty: serial: atmel: Make the driver aware of the existence of GCLK
tty: serial: atmel: Only divide Clock Divisor if the IP is USART
tty: serial: atmel: Separate mode clearing between UART and USART
dt-bindings: serial: atmel,at91-usart: Add gclk as a possible USART clock
...
Linus Torvalds [Fri, 7 Oct 2022 23:13:55 +0000 (16:13 -0700)]
Merge tag 'soundwire-6.1-rc1' of git://git./linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
"Updates for Intel, Cadence and Qualcomm drivers:
- another round of Intel driver cleanup to prepare for future code
reorg which is expected in next cycle (Pierre-Louis Bossart)
- bus unattach notifications processing during re-enumeration along
with Cadence driver updates for this (Richard Fitzgerald)
- Qualcomm driver updates to handle device0 status (Srinivas
Kandagatla)"
* tag 'soundwire-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: (42 commits)
soundwire: intel: add helper to stop bus
soundwire: intel: introduce helpers to start bus
soundwire: intel: introduce intel_shim_check_wake() helper
soundwire: intel: simplify read ops assignment
soundwire: intel: remove intel_init() wrapper
soundwire: intel: move shim initialization before power up/down
soundwire: intel: remove clock_stop parameter in intel_shim_init()
soundwire: intel: move all PDI initialization under intel_register_dai()
soundwire: intel: move DAI registration and debugfs init earlier
soundwire: intel: simplify flow and use devm_ for DAI registration
soundwire: intel: fix error handling on dai registration issues
soundwire: cadence: Simplify error paths in cdns_xfer_msg()
soundwire: cadence: Fix error check in cdns_xfer_msg()
soundwire: cadence: Write to correct address for each FIFO chunk
soundwire: bus: Fix wrong port number in sdw_handle_slave_alerts()
soundwire: qcom: do not send status of device 0 during alert
soundwire: qcom: update status from device id 1
soundwire: cadence: Don't overwrite msg->buf during write commands
soundwire: bus: Don't exit early if no device IDs were programmed
soundwire: cadence: Fix lost ATTACHED interrupts when enumerating
...
Linus Torvalds [Fri, 7 Oct 2022 23:03:01 +0000 (16:03 -0700)]
Merge tag 'phy-for-6.1' of git://git./linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"This contains bunch of new device support and one new Sunplus driver
along with updates which include another big round of qmp phy
conversion.
New support:
- Qualcomm SC8280XP eDP & DP and USB3 UNI phy (Bjorn Andersson)
- Rockchip rk3568 inno dsidphy (Chris Morgan)
- ocelot-serdes phy yaml binding (Colin Foster)
- Renesas gen2-usb phy yaml binding (Geert Uytterhoeven)
- RGMII suport in lan966x driver (Horatiu Vultur)
- Qualcomm SM6375 usb snps-femto-v2 bindings (Konrad Dybcio)
- Rockchip rk356x csi-dphya (Michael Riesch)
- Qualcomm sdm670 usb2 bindings (Richard Acayan)
- Sunplus USB2 PHY (Vincent Shih)
Updates:
- Mediatek hdmi, ufs, tphy and xsphy updates to use bitfield helpers
(Chunfeng Yun)
- Continued Qualcomm qmp phy driver split and cleanup. More patches
are under review and expected that next cycle might see completion
of this activity (Dmitry Baryshkov & Johan Hovold)
- TI wiz driver support for j7200 10g (Roger Quadros)
- Qualcomm femto phy driver support for override params to help with
tuning (Sandeep Maheswaram)
- SGMII support in TI wiz driver (Siddharth Vadapalli)
- dev_err_probe simplification (Yuan Can)"
* tag 'phy-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (170 commits)
phy: phy-mtk-dp: make array driving_params static const
dt-bindings: phy: qcom,qusb2: document sdm670 compatible
phy: qcom-qmp-pcie: fix resource mapping for SDM845 QHP PHY
phy: rockchip-snps-pcie3: only look for rockchip,pipe-grf on rk3588
phy: tegra: xusb: Enable usb role switch attribute
phy: mediatek: fix build warning of FIELD_PREP()
phy: qcom-qmp-usb: Use dev_err_probe() to simplify code
phy: qcom-qmp-ufs: Use dev_err_probe() to simplify code
phy: qcom-qmp-pcie-msm8996: Use dev_err_probe() to simplify code
phy: qcom-qmp-combo: Use dev_err_probe() to simplify code
phy: qualcomm: call clk_disable_unprepare in the error handling
phy: intel: Use dev_err_probe() to simplify code
phy: tegra: xusb: Use dev_err_probe() to simplify code
phy: qcom-snps: Use dev_err_probe() to simplify code
phy: qcom-qusb2: Use dev_err_probe() to simplify code
phy: qcom-qmp-pcie: Use dev_err_probe() to simplify code
phy: ti: phy-j721e-wiz: fix reference leaks in wiz_probe()
phy: mediatek: mipi: remove register access helpers
phy: mediatek: mipi: mt8183: use common helper to access registers
phy: mediatek: mipi: mt8183: use GENMASK to generate bits mask
...
Linus Torvalds [Fri, 7 Oct 2022 22:56:34 +0000 (15:56 -0700)]
Merge tag 'dmaengine-6.1-rc1' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New Support:
- MT6795 SoC dma controller (AngeloGioacchino Del Regno)
- qcom-adm controller yaml binding (Christian Marangi)
- Renesas r8a779g0 dma controller yaml binding (Geert Uytterhoeven)
- Qualcomm SM6350 GPI dma controller (Luca Weiss)
Updates:
- STM32 DMA-MDMA chaining support (Amelie Delaunay)
- make hsu driver use managed resources (Andy Shevchenko)
- the usual round of idxd driver updates (Dave Jiang & Jerry
Snitselaar)
- apple dma driver iommu and pd properties and remove use
of devres for irqs (Janne Grunau & Martin Povišer)
- device_synchronize support for Xilinx zynqmp driver (Swati
Agarwal)"
* tag 'dmaengine-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (60 commits)
dmaengine: ioat: remove unused declarations in dma.h
dmaengine: ti: k3-udma: Respond TX done if DMA_PREP_INTERRUPT is not requested
dmaengine: zynqmp_dma: Add device_synchronize support
dt-bindings: dma: add additional pbus reset to qcom,adm
dt-bindings: dma: rework qcom,adm Documentation to yaml schema
dt-bindings: dma: apple,admac: Add iommus and power-domains properties
dmaengine: dw-edma: Remove runtime PM support
dmaengine: idxd: add configuration for concurrent batch descriptor processing
dmaengine: idxd: add configuration for concurrent work descriptor processing
dmaengine: idxd: add WQ operation cap restriction support
dmanegine: idxd: reformat opcap output to match bitmap_parse() input
dmaengine: idxd: convert ats_dis to a wq flag
dmaengine: ioat: stop mod_timer from resurrecting deleted timer in __cleanup()
dmaengine: qcom-adm: fix wrong calling convention for prep_slave_sg
dmaengine: qcom-adm: fix wrong sizeof config in slave_config
dmaengine: ti: k3-psil: add additional TX threads for j721e
dmaengine: ti: k3-psil: add additional TX threads for j7200
dmaengine: apple-admac: Trigger shared reset
dmaengine: apple-admac: Do not use devres for IRQs
dmaengine: ti: edma: Remove some unused functions
...
Linus Torvalds [Fri, 7 Oct 2022 19:33:18 +0000 (12:33 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (qla2xxx, lpfc, ufs, hisi_sas, mpi3mr,
mpt3sas, target). The biggest change (from my biased viewpoint) being
that the mpi3mr now attached to the SAS transport class, making it the
first fusion type device to do so.
Beyond the usual bug fixing and security class reworks, there aren't a
huge number of core changes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits)
scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()
scsi: mpi3mr: Remove unnecessary cast
scsi: stex: Properly zero out the passthrough command structure
scsi: mpi3mr: Update driver version to 8.2.0.3.0
scsi: mpi3mr: Fix scheduling while atomic type bug
scsi: mpi3mr: Scan the devices during resume time
scsi: mpi3mr: Free enclosure objects during driver unload
scsi: mpi3mr: Handle 0xF003 Fault Code
scsi: mpi3mr: Graceful handling of surprise removal of PCIe HBA
scsi: mpi3mr: Schedule IRQ kthreads only on non-RT kernels
scsi: mpi3mr: Support new power management framework
scsi: mpi3mr: Update mpi3 header files
scsi: mpt3sas: Revert "scsi: mpt3sas: Fix ioc->base_readl() use"
scsi: mpt3sas: Revert "scsi: mpt3sas: Fix writel() use"
scsi: wd33c93: Remove dead code related to the long-gone config WD33C93_PIO
scsi: core: Add I/O timeout count for SCSI device
scsi: qedf: Populate sysfs attributes for vport
scsi: pm8001: Replace one-element array with flexible-array member
scsi: 3w-xxxx: Replace one-element array with flexible-array member
scsi: hptiop: Replace one-element array with flexible-array member in struct hpt_iop_request_ioctl_command()
...
Linus Torvalds [Fri, 7 Oct 2022 19:05:29 +0000 (12:05 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Not a big list of changes this cycle, mostly small things. The new
MANA rdma driver should come next cycle along with a bunch of work on
rxe.
Summary:
- Small bug fixes in mlx5, efa, rxe, hns, irdma, erdma, siw
- rts tracing improvements
- Code improvements: strlscpy conversion, unused parameter, spelling
mistakes, unused variables, flex arrays
- restrack device details report for hns
- Simplify struct device initialization in SRP
- Eliminate the never-used service_mask support in IB CM
- Make rxe not print to the console for some kinds of network packets
- Asymetric paths and router support in the CM through netlink
messages
- DMABUF importer support for mlx5devx umem's"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (84 commits)
RDMA/rxe: Remove error/warning messages from packet receiver path
RDMA/usnic: fix set-but-not-unused variable 'flags' warning
IB/hfi1: Use skb_put_data() instead of skb_put/memcpy pair
RDMA/hns: Unified Log Printing Style
RDMA/hns: Replacing magic number with macros in apply_func_caps()
RDMA/hns: Repacing 'dseg_len' by macros in fill_ext_sge_inl_data()
RDMA/hns: Remove redundant 'max_srq_desc_sz' in caps
RDMA/hns: Remove redundant 'num_mtt_segs' and 'max_extend_sg'
RDMA/hns: Remove redundant 'phy_addr' in hns_roce_hem_list_find_mtt()
RDMA/hns: Remove redundant 'use_lowmem' argument from hns_roce_init_hem_table()
RDMA/hns: Remove redundant 'bt_level' for hem_list_alloc_item()
RDMA/hns: Remove redundant 'attr_mask' in modify_qp_init_to_init()
RDMA/hns: Remove unnecessary brackets when getting point
RDMA/hns: Remove unnecessary braces for single statement blocks
RDMA/hns: Cleanup for a spelling error of Asynchronous
IB/rdmavt: Add __init/__exit annotations to module init/exit funcs
RDMA/rxe: Remove redundant num_sge fields
RDMA/mlx5: Enable ATS support for MRs and umems
RDMA/mlx5: Add support for dmabuf to devx umem
RDMA/core: Add UVERBS_ATTR_RAW_FD
...
Linus Torvalds [Fri, 7 Oct 2022 18:59:11 +0000 (11:59 -0700)]
Merge tag 'mtd/for-6.1' of git://git./linux/kernel/git/mtd/linux
Pull MTD updates from Miquel Raynal:
"Core MTD changes:
- mtdchar: add MEMREAD ioctl
- Add ECC error accounting for each read request
- always initialize 'stats' in struct mtd_oob_ops
- Track maximum number of bitflips for each read request
- Fix repeated word in comment
- Move from strlcpy with unused retval to strscpy
- Fix a typo in a comment
- Add binding for U-Boot bootloader partitions
MTD device drivers changes:
- FTL: use container_of() rather than cast
- docg3:
- Use correct function names in comment blocks
- Check the return value of devm_ioremap() in the probe
- physmap-core: Fix NULL pointer dereferencing in
of_select_probe_type()
- parsers: add Broadcom's U-Boot parser
Raw NAND core changes:
- Replace of_gpio_named_count() by gpiod_count()
- Remove misguided comment of nand_get_device()
- bbt: Use the bitmap API to allocate bitmaps
Raw NAND controller drivers changes:
- Meson:
- Stop supporting legacy clocks
- Refine resource getting in probe
- Convert bindings to yaml
- Fix clock handling and update the bindings accordingly
- Fix bit map use in meson_nfc_ecc_correct()
- bcm47xx:
- Fix spelling typo in comment
- STM32 FMC2:
- Switch to using devm_fwnode_gpiod_get()
- Fix dma_map_sg error check
- Cadence:
- Remove an unneeded result variable
- Marvell:
- Fix error handle regarding dma_map_sg
- Orion:
- Use devm_clk_get_optional()
- Cafe:
- Use correct function name in comment block
- Atmel:
- Unmap streaming DMA mappings
- Arasan:
- Stop using 0 as NULL pointer
- GPMI:
- Fix typo 'the the' in comment
- BRCM:
- Add individual glue driver selection
- Move Kconfig to driver folder
- FSL: Fix none ECC mode
- Intel:
- Use devm_platform_ioremap_resource_byname()
- Remove unused clk_rate member from struct ebu_nand
- Remove unused nand_pa member from ebu_nand_cs
- Don't re-define NAND_DATA_IFACE_CHECK_ONLY
- Remove undocumented compatible string
- Fix compatible string in the bindings
- Read the chip-select line from the correct OF node
- Fix maximum chip select value in the bindings"
* tag 'mtd/for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (43 commits)
mtd: rawnand: meson: stop supporting legacy clocks
dt-bindings: nand: meson: convert txt to yaml
mtd: rawnand: meson: refine resource getting in probe
mtd: rawnand: meson: fix the clock
dt-bindings: nand: meson: fix meson nfc clock
mtd: rawnand: bcm47xx: fix spelling typo in comment
mtd: rawnand: stm32_fmc2: switch to using devm_fwnode_gpiod_get()
mtd: rawnand: cadence: Remove an unneeded result variable
mtd: rawnand: Replace of_gpio_named_count() by gpiod_count()
mtd: rawnand: marvell: Fix error handle regarding dma_map_sg
mtd: rawnand: stm32_fmc2: Fix dma_map_sg error check
mtd: rawnand: remove misguided comment of nand_get_device()
mtd: rawnand: orion: Use devm_clk_get_optional()
mtd: rawnand: cafe: Use correct function name in comment block
mtd: rawnand: atmel: Unmap streaming DMA mappings
mtd: rawnand: meson: fix bit map use in meson_nfc_ecc_correct()
mtd: rawnand: arasan: stop using 0 as NULL pointer
mtd: rawnand: gpmi: Fix typo 'the the' in comment
mtd: rawnand: brcmnand: Add individual glue driver selection
mtd: rawnand: brcmnand: Move Kconfig to driver folder
...
Linus Torvalds [Fri, 7 Oct 2022 18:53:53 +0000 (11:53 -0700)]
Merge tag 'rproc-v6.1' of git://git./linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
"Support for remoteprocs that will perform recovery without help from
Linux is introduced.
The virtio integration is transitioned towards remoteproc_virtio.c and
represented by a platform_device, in preparation for instantiating
virtio instances from DeviceTree.
The iMX remoteproc driver has a couple of sparse warnings corrected
and a couple of error message printouts are cleaned up. The keystone
driver is transitioned to use the gpiod API"
* tag 'rproc-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
remoteproc: virtio: Fix warning on bindings by removing the of_match_table
remoteproc: Support attach recovery after rproc crash
remoteproc: Introduce rproc features
remoteproc: virtio: Create platform device for the remoteproc_virtio
remoteproc: Move rproc_vdev management to remoteproc_virtio.c
remoteproc: core: Introduce rproc_add_rvdev function
remoteproc: core: Introduce rproc_rvdev_add_device function
remoteproc: Harden rproc_handle_vdev() against integer overflow
remoteproc/keystone: Switch to using gpiod API
drivers/remoteproc: Fix repeated words in comments
remoteproc: imx_dsp_rproc: fix argument 2 of rproc_mem_entry_init
remoteproc: imx_rproc: Simplify some error message
Linus Torvalds [Fri, 7 Oct 2022 18:51:14 +0000 (11:51 -0700)]
Merge tag 'rpmsg-v6.1' of git://git./linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
"This fixes a double free/destroy and drops an unnecessary local
variable in the rpmsg char driver"
* tag 'rpmsg-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
rpmsg: char: Avoid double destroy of default endpoint
rpmsg: char: Remove the unneeded result variable
Linus Torvalds [Fri, 7 Oct 2022 18:48:30 +0000 (11:48 -0700)]
Merge tag 'for-v6.1' of git://git./linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
- new maintenance charging documentation
- mt6370: new charger driver
- bq25890: support input current limit
- added Qualcomm PMK8350 PON support
- misc minor fixes
* tag 'for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (22 commits)
power: supply: ab8500: remove unused static local variable
power: supply: mt6370: Fix return value check in mt6370_chg_probe()
power: supply: ab8500: Remove unused struct ab8500_chargalg_sysfs_entry
power: supply: mt6370: uses IIO interfaces, depends on IIO
power: supply: max1721x: Fix spelling mistake "Gauage" -> "Gauge"
power: supply: mt6370: Add MediaTek MT6370 charger driver
dt-bindings: power: supply: Add MediaTek MT6370 Charger
lib: add linear range index macro
power: supply: bq25890: Fix enum conversion in bq25890_power_supply_set_property()
power: supply: bq27xxx: fix NULL vs 0 warnings
power: supply: bq27xxx: fix __be16 warnings
power: supply: bq25890: Add support for setting IINLIM
power: supply: bq25890: Disable PUMPX_EN on errors
power: supply: Fix repeated word in comments
power: supply: adp5061: show unknown capacity_level as text
power: supply: adp5061: fix out-of-bounds read in adp5061_get_chg_type()
power: supply: tps65217: Fix comments typo
power: reset: qcom-pon: add support for qcom,pmk8350-pon compatible string
dt-bindings: power: reset: qcom-pon: Add new compatible "qcom,pmk8350-pon"
power: supply: cw2015: Use device managed API to simplify the code
...
Linus Torvalds [Fri, 7 Oct 2022 18:46:18 +0000 (11:46 -0700)]
Merge tag 'hsi-for-6.1' of git://git./linux/kernel/git/sre/linux-hsi
Pull HSI updates from Sebastian Reichel:
- completely switch to gpiod API
- misc small fixes
* tag 'hsi-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
HSI: nokia-modem: Replace of_gpio_count() by gpiod_count()
HSI: ssi_protocol: fix potential resource leak in ssip_pn_open()
HSI: omap_ssi_port: Fix dma_map_sg error check
HSI: cmt_speech: Pass a pointer to virt_to_page()
HSI: omap_ssi: Fix refcount leak in ssi_probe
HSI: clients: remove duplicate assignment
Linus Torvalds [Fri, 7 Oct 2022 18:32:10 +0000 (11:32 -0700)]
Merge tag 'pwm/for-6.1-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"The Rockchip and Mediatek drivers gain support for more chips and the
LPSS driver undergoes some refactoring and receives some improvements.
Other than that there are various cleanups of the core"
* tag 'pwm/for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: sysfs: Replace sprintf() with sysfs_emit()
pwm: core: Replace custom implementation of device_match_fwnode()
pwm: lpss: Add a comment to the bypass field
pwm: lpss: Make use of bits.h macros for all masks
pwm: lpss: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
pwm: lpss: Use device_get_match_data() to get device data
pwm: lpss: Move resource mapping to the glue drivers
pwm: lpss: Move exported symbols to PWM_LPSS namespace
pwm: lpss: Deduplicate board info data structures
dt-bindings: pwm: Add compatible for Mediatek MT8188
dt-bindings: pwm: rockchip: Add rockchip,rk3128-pwm
dt-bindings: pwm: rockchip: Add description for rk3588
pwm: sysfs: Switch to DEFINE_SIMPLE_DEV_PM_OPS() and pm_sleep_ptr()
pwm: rockchip: Convert to use dev_err_probe()
Linus Torvalds [Fri, 7 Oct 2022 18:24:20 +0000 (11:24 -0700)]
Merge tag 'mfd-next-6.1' of git://git./linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"Core Frameworks:
- Fix 'mfd_of_node_list' OF node entry resource leak
New Drivers:
- Add support for Ocelot VSC7512 Networking Chip
- Add support for MediaTek MT6370 subPMIC
- Add support for Richtek RT5120 (I2C) PMIC
New Device Support:
- Add support for Rockchip RV1126 and RK3588 to Syscon
- Add support for Rockchip RK817 Battery Charger to RK808
- Add support for Silergy SY7636a Voltage Regulator to Simple MFD
- Add support for Qualcomm PMP8074 PMIC to QCOM SPMI
- Add support for Secure Update to Intel M10 BMC
New Functionality:
- Provide SSP type to Intel's LPSS (PCI) SPI driver
Fix-ups:
- Remove legacy / unused code; stmpe, intel_soc_pmic_crc, syscon
- Unify / simplify; intel_soc_pmic_crc
- Trivial reordering / spelling, etc; Makefile, twl-core
- Convert to managed resources; intel_soc_pmic_crc
- Use appropriate APIs; intel_soc_pmic_crc
- strscpy() conversion; htc-i2cpld, lpc_ich, mfd-core
- GPIOD conversion; htc-i2cpld, stmpe
- Add missing header file includes; twl4030-irq
- DT goodies; stmpe, mediatek,mt6370, x-powers,axp152,
aspeed,ast2x00-scu, mediatek,mt8195-scpsys, qcom,spmi-pmic, syscon,
qcom,tcsr, rockchip,rk817, sprd,ums512-glbreg, dlg,da9063
Bug Fixes:
- Properly check return values; sm501, htc-i2cpld
- Repair Two-Wire Bus Mode; da9062-core
- Fix error handling; intel_soc_pmic_core, fsl-imx25-tsadc, lp8788,
lp8788-irq"
* tag 'mfd-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (60 commits)
mfd: syscon: Remove repetition of the regmap_get_val_endian()
mfd: ocelot-spi: Add missing MODULE_DEVICE_TABLE
power: supply: Add charger driver for Rockchip RK817
dt-bindings: mfd: mt6370: Fix the indentation in the example
mfd: da9061: Fix Failed to set Two-Wire Bus Mode.
mfd: htc-i2cpld: Fix an IS_ERR() vs NULL bug in htcpld_core_probe()
dt-bindings: mfd: qcom,tcsr: Drop simple-mfd from IPQ6018
mfd: sm501: Add check for platform_driver_register()
dt-bindings: mfd: mediatek: Add scpsys compatible for mt8186
mfd: twl4030: Add missed linux/device.h header
dt-bindings: mfd: dlg,da9063: Add missing regulator patterns
dt-bindings: mfd: sprd: Add bindings for ums512 global registers
mfd: intel_soc_pmic_chtdc_ti: Switch from __maybe_unused to pm_sleep_ptr() etc
dt-bindings: mfd: syscon: Add rk3588 QoS register compatible
mfd: stmpe: Switch to using gpiod API
mfd: qcom-spmi-pmic: Add pm7250b compatible
dt-bindings: mfd: Add missing (unevaluated|additional)Properties on child nodes
mfd/omap1: htc-i2cpld: Convert to a pure GPIO driver
mfd: intel-m10-bmc: Add d5005 bmc secure update driver
dt-bindings: mfd: syscon: Drop ref from reg-io-width
...
Linus Torvalds [Fri, 7 Oct 2022 18:13:42 +0000 (11:13 -0700)]
Merge tag 'for-linus-
2022100501' of git://git./linux/kernel/git/hid/hid
Pull HID updates from Benjamin Tissoires:
- handle of all Logitech Bluetooth HID++ devices in the Logitech HID++
drivers (Bastien Nocera)
- fix broken atomic checks in hid-multitouch by adding memory barriers
(Andri Yngvason)
- better handling of devices with AMD SFH1.1 (Basavaraj Natikar)
- better support of Nintendo clone controllers (Icenowy Zheng and
Johnothan King)
- Support for various RC controllers (Marcus Folkesson)
- Add UGEEv2 support in hid-uclogic (XP-PEN Deco Pro S and Parblo A610
PRO) (José Expósito)
- some conversions to use dev_groups (Greg Kroah-Hartman)
- HID-BPF preparatory patches, mostly to convert blank defines as enums
(Benjamin Tissoires)
* tag 'for-linus-
2022100501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (38 commits)
HID: wacom: add three styli to wacom_intuos_get_tool_type
HID: amd_sfh: Handle condition of "no sensors" for SFH1.1
HID: amd_sfh: Change dev_err to dev_dbg for additional debug info
HID: nintendo: check analog user calibration for plausibility
HID: nintendo: deregister home LED when it fails
HID: roccat: Fix use-after-free in roccat_read()
hid: topre: Add driver fixing report descriptor
HID: multitouch: Add memory barriers
HID: convert defines of HID class requests into a proper enum
HID: export hid_report_type to uapi
HID: core: store the unique system identifier in hid_device
HID: Add driver for PhoenixRC Flight Controller
HID: Add driver for VRC-2 Car Controller
HID: sony: Fix double word in comments
hid: hid-logitech-hidpp: avoid unnecessary assignments in hidpp_connect_event
HID: logitech-hidpp: Detect hi-res scrolling support
HID: logitech-hidpp: Remove hard-coded "Sw. Id." for HID++ 2.0 commands
HID: logitech-hidpp: Fix "Sw. Id." for HID++ 2.0 commands
HID: logitech-hidpp: Remove special-casing of Bluetooth devices
HID: logitech-hidpp: Enable HID++ for all the Logitech Bluetooth devices
...
Linus Torvalds [Fri, 7 Oct 2022 18:04:35 +0000 (11:04 -0700)]
Merge tag 'media/v6.1-1' of git://git./linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- New driver for Mediatek MDP V3
- New driver for NXP i.MX DW100 dewarper
- Zoran driver got promoted from staging
- Hantro and related drivers got promoted from staging
- Several VB1 drivers got moved to staging/deprecated (cpia2, fsl-viu,
meye, saa7146, av7110, stkwebcam, tm6000, vpfe_capture, davinci,
zr364xx)
- Usual set of driver fixes, improvements and cleanups
* tag 'media/v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (107 commits)
media: destage Hantro VPU driver
media: platform: mtk-mdp3: add MediaTek MDP3 driver
media: dt-binding: mediatek: add bindings for MediaTek CCORR and WDMA
media: dt-binding: mediatek: add bindings for MediaTek MDP3 components
media: xilinx: vipp: Fix refcount leak in xvip_graph_dma_init
media: xilinx: video: Add 1X12 greyscale format
media: xilinx: csi2rxss: Add 1X12 greyscale format
media: staging: media: imx: imx7-media-csi: Increase video mem limit
media: uvcvideo: Limit power line control for Sonix Technology
media: uvcvideo: Use entity get_cur in uvc_ctrl_set
media: uvcvideo: Fix typo 'the the' in comment
media: uvcvideo: Use indexed loops in uvc_ctrl_init_ctrl()
media: uvcvideo: Fix memory leak in uvc_gpio_parse
media: renesas: vsp1: Add support for RZ/G2L VSPD
media: renesas: vsp1: Add VSP1_HAS_NON_ZERO_LBA feature bit
media: renesas: vsp1: Add support for VSP software version
media: renesas: vsp1: Add support to deassert/assert reset line
media: dt-bindings: media: renesas,vsp1: Document RZ/G2L VSPD bindings
media: meson: vdec: add missing clk_disable_unprepare on error in vdec_hevc_start()
media: amphion: fix a bug that vpu core may not resume after suspend
...
Linus Torvalds [Fri, 7 Oct 2022 17:48:49 +0000 (10:48 -0700)]
Merge tag 'ata-6.1-rc1' of git://git./linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:
- Print the timeout value for internal command failures due to a
timeout (from Tomas)
- Improve parameter names in ata_dev_set_feature() to clarify this
function use (from Niklas)
- Improve the ahci driver low power mode setting initialization to
allow more flexibility for the user (from Rafael)
- Several patches to remove redundant variables in libata-core,
libata-eh and the pata_macio driver and to fix typos in comments
(from Jinpeng, Shaomin, Ye)
- Some code simplifications and macro renaming (for clarity) in various
functions of libata-core (from me)
- Add a missing check for a potential failure of sata_scr_read() in
sata_print_link_status() (from Li)
- Cleanup of libata Kconfig PATA_PLATFORM and PATA_OF_PLATFORM options
(from Lukas)
- Cleanups of ata dt-bindings and improvements of libahci_platform,
ahci and libahci code (from Serge)
- New driver for Synopsys AHCI SATA controllers, based of the generic
ahci code (from Serge). One compilation warning fix is added for this
driver (from me)
- Several fixes to macros used to discover a drive capabilities to be
consistent with the ACS specifications (from Niklas)
- A couple of simplifcations to some libata functions, removing
unnecessary arguments (from Niklas)
- An improvements to libata-eh code to avoid unnecessary link reset
when revalidating a drive after a failed command. In practice, this
extra, unneeded reset, reset does not cause any arm beyond slightly
slowing down error recovery (from Niklas)
* tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (45 commits)
ata: libata-eh: avoid needless hard reset when revalidating link
ata: libata: drop superfluous ata_eh_analyze_tf() parameter
ata: libata: drop superfluous ata_eh_request_sense() parameter
ata: fix ata_id_has_dipm()
ata: fix ata_id_has_ncq_autosense()
ata: fix ata_id_has_devslp()
ata: fix ata_id_sense_reporting_enabled() and ata_id_has_sense_reporting()
ata: libata-eh: Remove the unneeded result variable
ata: ahci_st: Enable compile test
ata: ahci_st: Fix compilation warning
MAINTAINERS: Add maintainers for DWC AHCI SATA driver
ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
ata: ahci-dwc: Add platform-specific quirks support
dt-bindings: ata: ahci: Add Baikal-T1 AHCI SATA controller DT schema
ata: ahci: Add DWC AHCI SATA controller support
ata: libahci_platform: Add function returning a clock-handle by id
dt-bindings: ata: ahci: Add DWC AHCI SATA controller DT schema
ata: ahci: Introduce firmware-specific caps initialization
ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
ata: libahci: Don't read AHCI version twice in the save-config method
...
Linus Torvalds [Fri, 7 Oct 2022 16:47:47 +0000 (09:47 -0700)]
Merge tag 'drm-next-2022-10-07-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fix from Dave Airlie:
"This reverts the patch I found with rough bisection to instability
around fences and the oops I got from netconsole.
sched:
- revert patch causing oopses"
* tag 'drm-next-2022-10-07-1' of git://anongit.freedesktop.org/drm/drm:
Revert "drm/sched: Use parent fence instead of finished"
Linus Torvalds [Fri, 7 Oct 2022 16:35:50 +0000 (09:35 -0700)]
Merge tag 'for-6.1/passthrough-2022-10-04' of git://git.kernel.dk/linux
Pull passthrough updates from Jens Axboe:
"With these changes, passthrough NVMe support over io_uring now
performs at the same level as block device O_DIRECT, and in many cases
6-8% better.
This contains:
- Add support for fixed buffers for passthrough (Anuj, Kanchan)
- Enable batched allocations and freeing on passthrough, similarly to
what we support on the normal storage path (me)
- Fix from Geert fixing an issue with !CONFIG_IO_URING"
* tag 'for-6.1/passthrough-2022-10-04' of git://git.kernel.dk/linux:
io_uring: Add missing inline to io_uring_cmd_import_fixed() dummy
nvme: wire up fixed buffer support for nvme passthrough
nvme: pass ubuffer as an integer
block: extend functionality to map bvec iterator
block: factor out blk_rq_map_bio_alloc helper
block: rename bio_map_put to blk_mq_map_bio_put
nvme: refactor nvme_alloc_request
nvme: refactor nvme_add_user_metadata
nvme: Use blk_rq_map_user_io helper
scsi: Use blk_rq_map_user_io helper
block: add blk_rq_map_user_io
io_uring: introduce fixed buffer support for io_uring_cmd
io_uring: add io_uring_cmd_import_fixed
nvme: enable batched completions of passthrough IO
nvme: split out metadata vs non metadata end_io uring_cmd completions
block: allow end_io based requests in the completion batch handling
block: change request end_io handler to pass back a return value
block: enable batched allocation for blk_mq_alloc_request()
block: kill deprecated BUG_ON() in the flush handling
Linus Torvalds [Fri, 7 Oct 2022 16:19:14 +0000 (09:19 -0700)]
Merge tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull requests via Christoph:
- handle number of queue changes in the TCP and RDMA drivers
(Daniel Wagner)
- allow changing the number of queues in nvmet (Daniel Wagner)
- also consider host_iface when checking ip options (Daniel
Wagner)
- don't map pages which can't come from HIGHMEM (Fabio M. De
Francesco)
- avoid unnecessary flush bios in nvmet (Guixin Liu)
- shrink and better pack the nvme_iod structure (Keith Busch)
- add comment for unaligned "fake" nqn (Linjun Bao)
- print actual source IP address through sysfs "address" attr
(Martin Belanger)
- various cleanups (Jackie Liu, Wolfram Sang, Genjian Zhang)
- handle effects after freeing the request (Keith Busch)
- copy firmware_rev on each init (Keith Busch)
- restrict management ioctls to admin (Keith Busch)
- ensure subsystem reset is single threaded (Keith Busch)
- report the actual number of tagset maps in nvme-pci (Keith
Busch)
- small fabrics authentication fixups (Christoph Hellwig)
- add common code for tagset allocation and freeing (Christoph
Hellwig)
- stop using the request_queue in nvmet (Christoph Hellwig)
- set min_align_mask before calculating max_hw_sectors (Rishabh
Bhatnagar)
- send a rediscover uevent when a persistent discovery controller
reconnects (Sagi Grimberg)
- misc nvmet-tcp fixes (Varun Prakash, zhenwei pi)
- MD pull request via Song:
- Various raid5 fix and clean up, by Logan Gunthorpe and David
Sloan.
- Raid10 performance optimization, by Yu Kuai.
- sbitmap wakeup hang fixes (Hugh, Keith, Jan, Yu)
- IO scheduler switching quisce fix (Keith)
- s390/dasd block driver updates (Stefan)
- support for recovery for the ublk driver (ZiyangZhang)
- rnbd drivers fixes and updates (Guoqing, Santosh, ye, Christoph)
- blk-mq and null_blk map fixes (Bart)
- various bcache fixes (Coly, Jilin, Jules)
- nbd signal hang fix (Shigeru)
- block writeback throttling fix (Yu)
- optimize the passthrough mapping handling (me)
- prepare block cgroups to being gendisk based (Christoph)
- get rid of an old PSI hack in the block layer, moving it to the
callers instead where it belongs (Christoph)
- blk-throttle fixes and cleanups (Yu)
- misc fixes and cleanups (Liu Shixin, Liu Song, Miaohe, Pankaj,
Ping-Xiang, Wolfram, Saurabh, Li Jinlin, Li Lei, Lin, Li zeming,
Miaohe, Bart, Coly, Gaosheng
* tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux: (162 commits)
sbitmap: fix lockup while swapping
block: add rationale for not using blk_mq_plug() when applicable
block: adapt blk_mq_plug() to not plug for writes that require a zone lock
s390/dasd: use blk_mq_alloc_disk
blk-cgroup: don't update the blkg lookup hint in blkg_conf_prep
nvmet: don't look at the request_queue in nvmet_bdev_set_limits
nvmet: don't look at the request_queue in nvmet_bdev_zone_mgmt_emulate_all
blk-mq: use quiesced elevator switch when reinitializing queues
block: replace blk_queue_nowait with bdev_nowait
nvme: remove nvme_ctrl_init_connect_q
nvme-loop: use the tagset alloc/free helpers
nvme-loop: store the generic nvme_ctrl in set->driver_data
nvme-loop: initialize sqsize later
nvme-fc: use the tagset alloc/free helpers
nvme-fc: store the generic nvme_ctrl in set->driver_data
nvme-fc: keep ctrl->sqsize in sync with opts->queue_size
nvme-rdma: use the tagset alloc/free helpers
nvme-rdma: store the generic nvme_ctrl in set->driver_data
nvme-tcp: use the tagset alloc/free helpers
nvme-tcp: store the generic nvme_ctrl in set->driver_data
...
Linus Torvalds [Fri, 7 Oct 2022 15:52:43 +0000 (08:52 -0700)]
Merge tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe:
- Add supported for more directly managed task_work running.
This is beneficial for real world applications that end up issuing
lots of system calls as part of handling work. Normal task_work will
always execute as we transition in and out of the kernel, even for
"unrelated" system calls. It's more efficient to defer the handling
of io_uring's deferred work until the application wants it to be run,
generally in batches.
As part of ongoing work to write an io_uring network backend for
Thrift, this has been shown to greatly improve performance. (Dylan)
- Add IOPOLL support for passthrough (Kanchan)
- Improvements and fixes to the send zero-copy support (Pavel)
- Partial IO handling fixes (Pavel)
- CQE ordering fixes around CQ ring overflow (Pavel)
- Support sendto() for non-zc as well (Pavel)
- Support sendmsg for zerocopy (Pavel)
- Networking iov_iter fix (Stefan)
- Misc fixes and cleanups (Pavel, me)
* tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux: (56 commits)
io_uring/net: fix notif cqe reordering
io_uring/net: don't update msg_name if not provided
io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL
io_uring/rw: defer fsnotify calls to task context
io_uring/net: fix fast_iov assignment in io_setup_async_msg()
io_uring/net: fix non-zc send with address
io_uring/net: don't skip notifs for failed requests
io_uring/rw: don't lose short results on io_setup_async_rw()
io_uring/rw: fix unexpected link breakage
io_uring/net: fix cleanup double free free_iov init
io_uring: fix CQE reordering
io_uring/net: fix UAF in io_sendrecv_fail()
selftest/net: adjust io_uring sendzc notif handling
io_uring: ensure local task_work marks task as running
io_uring/net: zerocopy sendmsg
io_uring/net: combine fail handlers
io_uring/net: rename io_sendzc()
io_uring/net: support non-zerocopy sendto
io_uring/net: refactor io_setup_async_addr
io_uring/net: don't lose partial send_zc on fail
...
Linus Torvalds [Fri, 7 Oct 2022 15:42:37 +0000 (08:42 -0700)]
Merge tag 'fs-for_v6.1-rc1' of git://git./linux/kernel/git/jack/linux-fs
Pull ext2, udf, reiserfs, and quota updates from Jan Kara:
- Fix for udf to make splicing work again
- More disk format sanity checks for ext2 to avoid crashes found by
syzbot
- More quota disk format checks to avoid crashes found by fuzzing
- Reiserfs & isofs cleanups
* tag 'fs-for_v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Add more checking after reading from quota file
quota: Replace all block number checking with helper function
quota: Check next/prev free block number after reading from quota file
ext2: Use kvmalloc() for group descriptor array
ext2: Add sanity checks for group and filesystem size
udf: Support splicing to file
isofs: delete unnecessary checks before brelse()
fs/reiserfs: replace ternary operator with min() and min_t()
Linus Torvalds [Fri, 7 Oct 2022 15:28:50 +0000 (08:28 -0700)]
Merge tag 'fsnotify-for_v6.1-rc1' of git://git./linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
"Two cleanups for fsnotify code"
* tag 'fsnotify-for_v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: Remove obsoleted fanotify_event_has_path()
fsnotify: remove unused declaration
Linus Torvalds [Fri, 7 Oct 2022 15:19:26 +0000 (08:19 -0700)]
Merge tag '6.1-rc-ksmbd-fixes' of git://git.samba.org/ksmbd
Pull ksmbd updates from Steve French:
- RDMA (smbdirect) fixes
- fixes for SMB3.1.1 POSIX Extensions (especially for id mapping)
- various casemapping fixes for mount and lookup
- UID mapping fixes
- fix confusing error message
- protocol negotiation fixes, including NTLMSSP fix
- two encryption fixes
- directory listing fix
- some cleanup fixes
* tag '6.1-rc-ksmbd-fixes' of git://git.samba.org/ksmbd: (24 commits)
ksmbd: validate share name from share config response
ksmbd: call ib_drain_qp when disconnected
ksmbd: make utf-8 file name comparison work in __caseless_lookup()
ksmbd: Fix user namespace mapping
ksmbd: hide socket error message when ipv6 config is disable
ksmbd: reduce server smbdirect max send/receive segment sizes
ksmbd: decrease the number of SMB3 smbdirect server SGEs
ksmbd: Fix wrong return value and message length check in smb2_ioctl()
ksmbd: set NTLMSSP_NEGOTIATE_SEAL flag to challenge blob
ksmbd: fix encryption failure issue for session logoff response
ksmbd: fix endless loop when encryption for response fails
ksmbd: fill sids in SMB_FIND_FILE_POSIX_INFO response
ksmbd: set file permission mode to match Samba server posix extension behavior
ksmbd: change security id to the one samba used for posix extension
ksmbd: update documentation
ksmbd: casefold utf-8 share names and fix ascii lowercase conversion
ksmbd: port to vfs{g,u}id_t and associated helpers
ksmbd: fix incorrect handling of iterate_dir
MAINTAINERS: remove Hyunchul Lee from ksmbd maintainers
MAINTAINERS: Add Tom Talpey as ksmbd reviewer
...
Miquel Raynal [Fri, 7 Oct 2022 14:56:14 +0000 (16:56 +0200)]
Merge tag 'nand/for-6.1' into mtd/next
Raw NAND core changes:
* Replace of_gpio_named_count() by gpiod_count()
- Remove misguided comment of nand_get_device()
- bbt: Use the bitmap API to allocate bitmaps
Raw NAND controller drivers changes:
* Meson:
- Stop supporting legacy clocks
- Refine resource getting in probe
- Convert bindings to yaml
- Fix clock handling and update the bindings accordingly
- Fix bit map use in meson_nfc_ecc_correct()
* bcm47xx:
- Fix spelling typo in comment
* STM32 FMC2:
- Switch to using devm_fwnode_gpiod_get()
- Fix dma_map_sg error check
* Cadence:
- Remove an unneeded result variable
* Marvell:
- Fix error handle regarding dma_map_sg
* Orion:
- Use devm_clk_get_optional()
* Cafe:
- Use correct function name in comment block
* Atmel:
- Unmap streaming DMA mappings
* Arasan:
- Stop using 0 as NULL pointer
* GPMI:
- Fix typo 'the the' in comment
* BRCM:
- Add individual glue driver selection
- Move Kconfig to driver folder
* FSL: Fix none ECC mode
* Intel:
- Use devm_platform_ioremap_resource_byname()
- Remove unused clk_rate member from struct ebu_nand
- Remove unused nand_pa member from ebu_nand_cs
- Don't re-define NAND_DATA_IFACE_CHECK_ONLY
- Remove undocumented compatible string
- Fix compatible string in the bindings
- Read the chip-select line from the correct OF node
- Fix maximum chip select value in the bindings
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Nicholas Piggin [Thu, 6 Oct 2022 14:33:45 +0000 (00:33 +1000)]
KVM: PPC: Book3S HV: Fix stack frame regs marker
The hard-coded marker is out of date now, fix it using the nice define.
Fixes:
17773afdcd15 ("powerpc/64: use 32-bit immediate for STACK_FRAME_REGS_MARKER")
Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006143345.129077-1-npiggin@gmail.com
Jisheng Zhang [Mon, 29 Aug 2022 14:57:42 +0000 (22:57 +0800)]
riscv: enable THP_SWAP for RV64
I have a Sipeed Lichee RV dock board which only has 512MB DDR, so
memory optimizations such as swap on zram are helpful. As is seen
in commit
d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
commit
bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
swapped out"), THP_SWAP can improve the swap throughput significantly.
Enable THP_SWAP for RV64, testing the micro-benchmark which is
introduced by commit
d0637c505f8a ("arm64: enable THP_SWAP for arm64")
shows below numbers on the Lichee RV dock board:
swp out bandwidth w/o patch: 66908 bytes/ms (mean of 10 tests)
swp out bandwidth w/ patch: 322638 bytes/ms (mean of 10 tests)
Improved by 382%!
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220829145742.3139-1-jszhang@kernel.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt [Tue, 20 Sep 2022 20:45:18 +0000 (13:45 -0700)]
RISC-V: Print SSTC in canonical order
This got out of order during a merge conflict, fix it by putting the
entries in the correct order.
Fixes:
7ab52f75a9cf ("RISC-V: Add Sstc extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220920204518.10988-1-palmer@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Dave Airlie [Fri, 7 Oct 2022 02:40:50 +0000 (12:40 +1000)]
Revert "drm/sched: Use parent fence instead of finished"
This reverts commit
e4dc45b1848bc6bcac31eb1b4ccdd7f6718b3c86.
This is causing instability on Linus' desktop, and I'm seeing
oops with VK CTS runs.
netconsole got me the following oops:
[ 1234.778760] BUG: kernel NULL pointer dereference, address:
0000000000000088
[ 1234.778782] #PF: supervisor read access in kernel mode
[ 1234.778787] #PF: error_code(0x0000) - not-present page
[ 1234.778791] PGD 0 P4D 0
[ 1234.778798] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1234.778803] CPU: 7 PID: 805 Comm: systemd-journal Not tainted 6.0.0+ #2
[ 1234.778809] Hardware name: System manufacturer System Product
Name/PRIME X370-PRO, BIOS 5603 07/28/2020
[ 1234.778813] RIP: 0010:drm_sched_job_done.isra.0+0xc/0x140 [gpu_sched]
[ 1234.778828] Code: aa 0f 1d ce e9 57 ff ff ff 48 89 d7 e8 9d 8f 3f
ce e9 4a ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 53
48 89 fb <48> 8b af 88 00 00 00 f0 ff 8d f0 00 00 00 48 8b 85 80 01 00
00 f0
[ 1234.778834] RSP: 0000:
ffffabe680380de0 EFLAGS:
00010087
[ 1234.778839] RAX:
ffffffffc04e9230 RBX:
0000000000000000 RCX:
0000000000000018
[ 1234.778897] RDX:
00000ba278e8977a RSI:
ffff953fb288b460 RDI:
0000000000000000
[ 1234.778901] RBP:
ffff953fb288b598 R08:
00000000000000e0 R09:
ffff953fbd98b808
[ 1234.778905] R10:
0000000000000000 R11:
ffffabe680380ff8 R12:
ffffabe680380e00
[ 1234.778908] R13:
0000000000000001 R14:
00000000ffffffff R15:
ffff953fbd9ec458
[ 1234.778912] FS:
00007f35e7008580(0000) GS:
ffff95428ebc0000(0000)
knlGS:
0000000000000000
[ 1234.778916] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 1234.778919] CR2:
0000000000000088 CR3:
000000010147c000 CR4:
00000000003506e0
[ 1234.778924] Call Trace:
[ 1234.778981] <IRQ>
[ 1234.778989] dma_fence_signal_timestamp_locked+0x6a/0xe0
[ 1234.778999] dma_fence_signal+0x2c/0x50
[ 1234.779005] amdgpu_fence_process+0xc8/0x140 [amdgpu]
[ 1234.779234] sdma_v3_0_process_trap_irq+0x70/0x80 [amdgpu]
[ 1234.779395] amdgpu_irq_dispatch+0xa9/0x1d0 [amdgpu]
[ 1234.779609] amdgpu_ih_process+0x80/0x100 [amdgpu]
[ 1234.779783] amdgpu_irq_handler+0x1f/0x60 [amdgpu]
[ 1234.779940] __handle_irq_event_percpu+0x46/0x190
[ 1234.779946] handle_irq_event+0x34/0x70
[ 1234.779949] handle_edge_irq+0x9f/0x240
[ 1234.779954] __common_interrupt+0x66/0x100
[ 1234.779960] common_interrupt+0xa0/0xc0
[ 1234.779965] </IRQ>
[ 1234.779968] <TASK>
[ 1234.779971] asm_common_interrupt+0x22/0x40
[ 1234.779976] RIP: 0010:finish_mkwrite_fault+0x22/0x110
[ 1234.779981] Code: 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 41
54 55 48 89 fd 53 48 8b 07 f6 40 50 08 0f 84 eb 00 00 00 48 8b 45 30
48 8b 18 <48> 89 df e8 66 bd ff ff 48 85 c0 74 0d 48 89 c2 83 e2 01 48
83 ea
[ 1234.779985] RSP: 0000:
ffffabe680bcfd78 EFLAGS:
00000202
Revert it for now and figure it out later.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Fri, 7 Oct 2022 00:57:50 +0000 (17:57 -0700)]
Merge tag 'iomap-6.1-merge-1' of git://git./fs/xfs/xfs-linux
Pull iomap updates from Darrick Wong:
"It's pretty quiet this time around -- a UAF bugfix and a new
tracepoint so we can watch file writeback:
- Fix a UAF bug when recording writeback mapping errors
- Add a tracepoint so that we can monitor writeback mappings"
* tag 'iomap-6.1-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: add a tracepoint for mappings returned by map_blocks
iomap: iomap: fix memory corruption when recording errors during writeback
Linus Torvalds [Fri, 7 Oct 2022 00:45:53 +0000 (17:45 -0700)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"The first two changes involve files outside of fs/ext4:
- submit_bh() can never return an error, so change it to return void,
and remove the unused checks from its callers
- fix I_DIRTY_TIME handling so it will be set even if the inode
already has I_DIRTY_INODE
Performance:
- Always enable i_version counter (as btrfs and xfs already do).
Remove some uneeded i_version bumps to avoid unnecessary nfs cache
invalidations
- Wake up journal waiters in FIFO order, to avoid some journal users
from not getting a journal handle for an unfairly long time
- In ext4_write_begin() allocate any necessary buffer heads before
starting the journal handle
- Don't try to prefetch the block allocation bitmaps for a read-only
file system
Bug Fixes:
- Fix a number of fast commit bugs, including resources leaks and out
of bound references in various error handling paths and/or if the
fast commit log is corrupted
- Avoid stopping the online resize early when expanding a file system
which is less than 16TiB to a size greater than 16TiB
- Fix apparent metadata corruption caused by a race with a metadata
buffer head getting migrated while it was trying to be read
- Mark the lazy initialization thread freezable to prevent suspend
failures
- Other miscellaneous bug fixes
Cleanups:
- Break up the incredibly long ext4_full_super() function by
refactoring to move code into more understandable, smaller
functions
- Remove the deprecated (and ignored) noacl and nouser_attr mount
option
- Factor out some common code in fast commit handling
- Other miscellaneous cleanups"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (53 commits)
ext4: fix potential out of bound read in ext4_fc_replay_scan()
ext4: factor out ext4_fc_get_tl()
ext4: introduce EXT4_FC_TAG_BASE_LEN helper
ext4: factor out ext4_free_ext_path()
ext4: remove unnecessary drop path references in mext_check_coverage()
ext4: update 'state->fc_regions_size' after successful memory allocation
ext4: fix potential memory leak in ext4_fc_record_regions()
ext4: fix potential memory leak in ext4_fc_record_modified_inode()
ext4: remove redundant checking in ext4_ioctl_checkpoint
jbd2: add miss release buffer head in fc_do_one_pass()
ext4: move DIOREAD_NOLOCK setting to ext4_set_def_opts()
ext4: remove useless local variable 'blocksize'
ext4: unify the ext4 super block loading operation
ext4: factor out ext4_journal_data_mode_check()
ext4: factor out ext4_load_and_init_journal()
ext4: factor out ext4_group_desc_init() and ext4_group_desc_free()
ext4: factor out ext4_geometry_check()
ext4: factor out ext4_check_feature_compatibility()
ext4: factor out ext4_init_metadata_csum()
ext4: factor out ext4_encoding_init()
...
Linus Torvalds [Fri, 7 Oct 2022 00:42:29 +0000 (17:42 -0700)]
Merge tag 'affs-for-6.1-tag' of git://git./linux/kernel/git/kdave/linux
Pull affs update from David Sterba:
"One minor update for AFFS, switching away from strlcpy"
* tag 'affs-for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
affs: move from strlcpy with unused retval to strscpy
Linus Torvalds [Fri, 7 Oct 2022 00:36:48 +0000 (17:36 -0700)]
Merge tag 'for-6.1-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"There's a bunch of performance improvements, most notably the FIEMAP
speedup, the new block group tree to speed up mount on large
filesystems, more io_uring integration, some sysfs exports and the
usual fixes and core updates.
Summary:
Performance:
- outstanding FIEMAP speed improvement
- algorithmic change how extents are enumerated leads to orders of
magnitude speed boost (uncached and cached)
- extent sharing check speedup (2.2x uncached, 3x cached)
- add more cancellation points, allowing to interrupt seeking in
files with large number of extents
- more efficient hole and data seeking (4x uncached, 1.3x cached)
- sample results:
256M, 32K extents: 4s -> 29ms (~150x)
512M, 64K extents: 30s -> 59ms (~550x)
1G, 128K extents: 225s -> 120ms (~1800x)
- improved inode logging, especially for directories (on dbench
workload throughput +25%, max latency -21%)
- improved buffered IO, remove redundant extent state tracking,
lowering memory consumption and avoiding rb tree traversal
- add sysfs tunable to let qgroup temporarily skip exact accounting
when deleting snapshot, leading to a speedup but requiring a rescan
after that, will be used by snapper
- support io_uring and buffered writes, until now it was just for
direct IO, with the no-wait semantics implemented in the buffered
write path it now works and leads to speed improvement in IOPS
(2x), throughput (2.2x), latency (depends, 2x to 150x)
- small performance improvements when dropping and searching for
extent maps as well as when flushing delalloc in COW mode
(throughput +5MB/s)
User visible changes:
- new incompatible feature block-group-tree adding a dedicated tree
for tracking block groups, this allows a much faster load during
mount and avoids seeking unlike when it's scattered in the extent
tree items
- this reduces mount time for many-terabyte sized filesystems
- conversion tool will be provided so existing filesystem can also
be updated in place
- to reduce test matrix and feature combinations requires no-holes
and free-space-tree (mkfs defaults since 5.15)
- improved reporting of super block corruption detected by scrub
- scrub also tries to repair super block and does not wait until next
commit
- discard stats and tunables are exported in sysfs
(/sys/fs/btrfs/FSID/discard)
- qgroup status is exported in sysfs
(/sys/sys/fs/btrfs/FSID/qgroups/)
- verify that super block was not modified when thawing filesystem
Fixes:
- FIEMAP fixes
- fix extent sharing status, does not depend on the cached status
where merged
- flush delalloc so compressed extents are reported correctly
- fix alignment of VMA for memory mapped files on THP
- send: fix failures when processing inodes with no links (orphan
files and directories)
- fix race between quota enable and quota rescan ioctl
- handle more corner cases for read-only compat feature verification
- fix missed extent on fsync after dropping extent maps
Core:
- lockdep annotations to validate various transactions states and
state transitions
- preliminary support for fs-verity in send
- more effective memory use in scrub for subpage where sector is
smaller than page
- block group caching progress logic has been removed, load is now
synchronous
- simplify end IO callbacks and bio handling, use chained bios
instead of own tracking
- add no-wait semantics to several functions (tree search, nocow,
flushing, buffered write
- cleanups and refactoring
MM changes:
- export balance_dirty_pages_ratelimited_flags"
* tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (177 commits)
btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
btrfs: drop extent map range more efficiently
btrfs: avoid pointless extent map tree search when flushing delalloc
btrfs: remove unnecessary next extent map search
btrfs: remove unnecessary NULL pointer checks when searching extent maps
btrfs: assert tree is locked when clearing extent map from logging
btrfs: remove unnecessary extent map initializations
btrfs: remove the refcount warning/check at free_extent_map()
btrfs: add helper to replace extent map range with a new extent map
btrfs: move open coded extent map tree deletion out of inode eviction
btrfs: use cond_resched_rwlock_write() during inode eviction
btrfs: use extent_map_end() at btrfs_drop_extent_map_range()
btrfs: move btrfs_drop_extent_cache() to extent_map.c
btrfs: fix missed extent on fsync after dropping extent maps
btrfs: remove stale prototype of btrfs_write_inode
btrfs: enable nowait async buffered writes
btrfs: assert nowait mode is not used for some btree search functions
btrfs: make btrfs_buffered_write nowait compatible
btrfs: plumb NOWAIT through the write path
btrfs: make lock_and_cleanup_extent_if_need nowait compatible
...
Linus Torvalds [Fri, 7 Oct 2022 00:31:02 +0000 (17:31 -0700)]
Merge tag 'pull-path' of git://git./linux/kernel/git/viro/vfs
Pull vfs constification updates from Al Viro:
"whack-a-mole: constifying struct path *"
* tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ecryptfs: constify path
spufs: constify path
nd_jump_link(): constify path
audit_init_parent(): constify path
__io_setxattr(): constify path
do_proc_readlink(): constify path
overlayfs: constify path
fs/notify: constify path
may_linkat(): constify path
do_sys_name_to_handle(): constify path
->getprocattr(): attribute name is const char *, TYVM...
Linus Torvalds [Fri, 7 Oct 2022 00:26:56 +0000 (17:26 -0700)]
Merge tag 'pull-tomoyo' of git://git./linux/kernel/git/viro/vfs
Pull misc tomoyo changes from Al Viro:
"A couple of assorted tomoyo patches"
* tag 'pull-tomoyo' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
tomoyo: struct path it might get from LSM callers won't have NULL dentry or mnt
tomoyo: use vsnprintf() properly
Linus Torvalds [Fri, 7 Oct 2022 00:22:11 +0000 (17:22 -0700)]
Merge tag 'pull-file_inode' of git://git./linux/kernel/git/viro/vfs
Pull file_inode() updates from Al Vrio:
"whack-a-mole: cropped up open-coded file_inode() uses..."
* tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
orangefs: use ->f_mapping
_nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mapping
dma_buf: no need to bother with file_inode()->i_mapping
nfs_finish_open(): don't open-code file_inode()
bprm_fill_uid(): don't open-code file_inode()
sgx: use ->f_mapping...
exfat_iterate(): don't open-code file_inode(file)
ibmvmc: don't open-code file_inode()
Linus Torvalds [Fri, 7 Oct 2022 00:13:18 +0000 (17:13 -0700)]
Merge tag 'pull-file' of git://git./linux/kernel/git/viro/vfs
Pull vfs file updates from Al Viro:
"struct file-related stuff"
* tag 'pull-file' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
dma_buf_getfile(): don't bother with ->f_flags reassignments
Change calling conventions for filldir_t
locks: fix TOCTOU race when granting write lease
Linus Torvalds [Thu, 6 Oct 2022 23:55:41 +0000 (16:55 -0700)]
Merge tag 'pull-d_path' of git://git./linux/kernel/git/viro/vfs
Pull vfs d_path updates from Al Viro.
* tag 'pull-d_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
d_path.c: typo fix...
dynamic_dname(): drop unused dentry argument
Linus Torvalds [Thu, 6 Oct 2022 23:49:00 +0000 (16:49 -0700)]
Merge tag 'pull-inode' of git://git./linux/kernel/git/viro/vfs
Pull vfs inode update from Al Viro:
"Saner inode_init_always(), also fixing a nilfs problem"
* tag 'pull-inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: fix UAF/GPF bug in nilfs_mdt_destroy