Tang Bin [Fri, 3 Apr 2020 22:07:54 +0000 (06:07 +0800)]
crypto: amlogic - Delete duplicate dev_err in meson_crypto_probe()
When something goes wrong, platform_get_irq() will print an error message,
so in order to avoid the situation of repeat output,we should remove
dev_err here.
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Yang Shen [Fri, 3 Apr 2020 08:16:42 +0000 (16:16 +0800)]
crypto: hisilicon/qm - stop qp by judging sq and cq tail
It is not working well to determine whether the queue is empty based on
whether the used count is 0. It is more stable to get if the queue is
stopping by checking if the tail pointer of the send queue and the
completion queue are equal.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Yang Shen [Fri, 3 Apr 2020 08:16:41 +0000 (16:16 +0800)]
crypto: hisilicon/sec2 - add controller reset support for SEC2
Add support for controller reset in SEC driver.
Signed-off-by: Yang Shen <shenyang39@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hui Tang [Fri, 3 Apr 2020 08:16:40 +0000 (16:16 +0800)]
crypto: hisilicon/hpre - add controller reset support for HPRE
Add support for the controller reset in HPRE driver.
Signed-off-by: Hui Tang <tanghui20@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shukun Tan [Fri, 3 Apr 2020 08:16:39 +0000 (16:16 +0800)]
crypto: hisilicon/zip - add controller reset support for zip
Register controller reset handle with PCIe AER.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shukun Tan [Fri, 3 Apr 2020 08:16:38 +0000 (16:16 +0800)]
crypto: hisilicon/qm - add controller reset interface
Add the main implementation of the controller reset interface, which is
roughly divided into three parts, stop, reset, and reinitialization.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hao Fang [Thu, 2 Apr 2020 06:53:03 +0000 (14:53 +0800)]
crypto: hisilicon - add vfs_num module parameter for hpre/sec
The vfs_num module parameter has been used in zip driver, this patch adds
this for HPRE and SEC driver.
Signed-off-by: Hao Fang <fanghao11@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shukun Tan [Thu, 2 Apr 2020 06:53:02 +0000 (14:53 +0800)]
crypto: hisilicon - unify SR-IOV related codes into QM
Clean the duplicate SR-IOV related codes, put all into qm.c.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Shukun Tan [Thu, 2 Apr 2020 06:53:01 +0000 (14:53 +0800)]
crypto: hisilicon - put vfs_num into struct hisi_qm
We plan to move vfs_num related code into qm.c, put the param
vfs_num into struct hisi_qm first.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hadar Gat [Fri, 27 Mar 2020 06:10:23 +0000 (09:10 +0300)]
MAINTAINERS: add HG as cctrng maintainer
I work for Arm on maintaining the TrustZone CryptoCell TRNG driver.
Signed-off-by: Hadar Gat <hadar.gat@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hadar Gat [Fri, 27 Mar 2020 06:10:22 +0000 (09:10 +0300)]
hwrng: cctrng - introduce Arm CryptoCell driver
Introduce low level Arm CryptoCell TRNG HW support.
Signed-off-by: Hadar Gat <hadar.gat@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hadar Gat [Fri, 27 Mar 2020 06:10:21 +0000 (09:10 +0300)]
dt-bindings: add device tree binding for Arm CryptoCell trng engine
The Arm CryptoCell is a hardware security engine. This patch adds DT
bindings for its TRNG (True Random Number Generator) engine.
Signed-off-by: Hadar Gat <hadar.gat@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Iuliana Prodan [Tue, 7 Apr 2020 15:58:45 +0000 (18:58 +0300)]
crypto: caam - fix the address of the last entry of S/G
For skcipher algorithms, the input, output HW S/G tables
look like this: [IV, src][dst, IV]
Now, we can have 2 conditions here:
- there is no IV;
- src and dst are equal (in-place encryption) and scattered
and the error is an "off-by-one" in the HW S/G table.
This issue was seen with KASAN:
BUG: KASAN: slab-out-of-bounds in skcipher_edesc_alloc+0x95c/0x1018
Read of size 4 at addr
ffff000022a02958 by task cryptomgr_test/321
CPU: 2 PID: 321 Comm: cryptomgr_test Not tainted
5.6.0-rc1-00165-ge4ef8383-dirty #4
Hardware name: LS1046A RDB Board (DT)
Call trace:
dump_backtrace+0x0/0x260
show_stack+0x14/0x20
dump_stack+0xe8/0x144
print_address_description.isra.11+0x64/0x348
__kasan_report+0x11c/0x230
kasan_report+0xc/0x18
__asan_load4+0x90/0xb0
skcipher_edesc_alloc+0x95c/0x1018
skcipher_encrypt+0x84/0x150
crypto_skcipher_encrypt+0x50/0x68
test_skcipher_vec_cfg+0x4d4/0xc10
test_skcipher_vec+0x178/0x1d8
alg_test_skcipher+0xec/0x230
alg_test.part.44+0x114/0x4a0
alg_test+0x1c/0x60
cryptomgr_test+0x34/0x58
kthread+0x1b8/0x1c0
ret_from_fork+0x10/0x18
Allocated by task 321:
save_stack+0x24/0xb0
__kasan_kmalloc.isra.10+0xc4/0xe0
kasan_kmalloc+0xc/0x18
__kmalloc+0x178/0x2b8
skcipher_edesc_alloc+0x21c/0x1018
skcipher_encrypt+0x84/0x150
crypto_skcipher_encrypt+0x50/0x68
test_skcipher_vec_cfg+0x4d4/0xc10
test_skcipher_vec+0x178/0x1d8
alg_test_skcipher+0xec/0x230
alg_test.part.44+0x114/0x4a0
alg_test+0x1c/0x60
cryptomgr_test+0x34/0x58
kthread+0x1b8/0x1c0
ret_from_fork+0x10/0x18
Freed by task 0:
(stack is not available)
The buggy address belongs to the object at
ffff000022a02800
which belongs to the cache dma-kmalloc-512 of size 512
The buggy address is located 344 bytes inside of
512-byte region [
ffff000022a02800,
ffff000022a02a00)
The buggy address belongs to the page:
page:
fffffe00006a8000 refcount:1 mapcount:0 mapping:
ffff00093200c400
index:0x0 compound_mapcount: 0
flags: 0xffff00000010200(slab|head)
raw:
0ffff00000010200 dead000000000100 dead000000000122 ffff00093200c400
raw:
0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff000022a02800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff000022a02880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
ffff000022a02900: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
^
ffff000022a02980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff000022a02a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes:
334d37c9e263 ("crypto: caam - update IV using HW support")
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Iuliana Prodan [Mon, 6 Apr 2020 22:47:28 +0000 (01:47 +0300)]
crypto: caam - fix use-after-free KASAN issue for RSA algorithms
Here's the KASAN report:
BUG: KASAN: use-after-free in rsa_pub_done+0x70/0xe8
Read of size 1 at addr
ffff000023082014 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc1-00162-gfcb90d5 #60
Hardware name: LS1046A RDB Board (DT)
Call trace:
dump_backtrace+0x0/0x260
show_stack+0x14/0x20
dump_stack+0xe8/0x144
print_address_description.isra.11+0x64/0x348
__kasan_report+0x11c/0x230
kasan_report+0xc/0x18
__asan_load1+0x5c/0x68
rsa_pub_done+0x70/0xe8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
irq_exit+0x114/0x128
__handle_domain_irq+0x80/0xe0
gic_handle_irq+0x50/0xa0
el1_irq+0xb8/0x180
cpuidle_enter_state+0xa4/0x490
cpuidle_enter+0x48/0x70
call_cpuidle+0x44/0x70
do_idle+0x304/0x338
cpu_startup_entry+0x24/0x40
rest_init+0xf8/0x10c
arch_call_rest_init+0xc/0x14
start_kernel+0x774/0x7b4
Allocated by task 263:
save_stack+0x24/0xb0
__kasan_kmalloc.isra.10+0xc4/0xe0
kasan_kmalloc+0xc/0x18
__kmalloc+0x178/0x2b8
rsa_edesc_alloc+0x2cc/0xe10
caam_rsa_enc+0x9c/0x5f0
test_akcipher_one+0x78c/0x968
alg_test_akcipher+0x78/0xf8
alg_test.part.44+0x114/0x4a0
alg_test+0x1c/0x60
cryptomgr_test+0x34/0x58
kthread+0x1b8/0x1c0
ret_from_fork+0x10/0x18
Freed by task 0:
save_stack+0x24/0xb0
__kasan_slab_free+0x10c/0x188
kasan_slab_free+0x10/0x18
kfree+0x7c/0x298
rsa_pub_done+0x68/0xe8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
The buggy address belongs to the object at
ffff000023082000
which belongs to the cache dma-kmalloc-256 of size 256
The buggy address is located 20 bytes inside of
256-byte region [
ffff000023082000,
ffff000023082100)
The buggy address belongs to the page:
page:
fffffe00006c2080 refcount:1 mapcount:0 mapping:
ffff00093200c200 index:0x0 compound_mapcount: 0
flags: 0xffff00000010200(slab|head)
raw:
0ffff00000010200 dead000000000100 dead000000000122 ffff00093200c200
raw:
0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff000023081f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff000023081f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
ffff000023082000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff000023082080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff000023082100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes:
bf53795025a2 ("crypto: caam - add crypto_engine support for RSA algorithms")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Iuliana Prodan [Mon, 6 Apr 2020 22:47:27 +0000 (01:47 +0300)]
crypto: caam - fix use-after-free KASAN issue for HASH algorithms
Here's the KASAN report:
BUG: KASAN: use-after-free in ahash_done+0xdc/0x3b8
Read of size 1 at addr
ffff00002303f010 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc1-00162-gfcb90d5 #59
Hardware name: LS1046A RDB Board (DT)
Call trace:
dump_backtrace+0x0/0x260
show_stack+0x14/0x20
dump_stack+0xe8/0x144
print_address_description.isra.11+0x64/0x348
__kasan_report+0x11c/0x230
kasan_report+0xc/0x18
__asan_load1+0x5c/0x68
ahash_done+0xdc/0x3b8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
irq_exit+0x114/0x128
__handle_domain_irq+0x80/0xe0
gic_handle_irq+0x50/0xa0
el1_irq+0xb8/0x180
cpuidle_enter_state+0xa4/0x490
cpuidle_enter+0x48/0x70
call_cpuidle+0x44/0x70
do_idle+0x304/0x338
cpu_startup_entry+0x24/0x40
rest_init+0xf8/0x10c
arch_call_rest_init+0xc/0x14
start_kernel+0x774/0x7b4
Allocated by task 263:
save_stack+0x24/0xb0
__kasan_kmalloc.isra.10+0xc4/0xe0
kasan_kmalloc+0xc/0x18
__kmalloc+0x178/0x2b8
ahash_edesc_alloc+0x58/0x1f8
ahash_final_no_ctx+0x94/0x6e8
ahash_final+0x24/0x30
crypto_ahash_op+0x58/0xb0
crypto_ahash_final+0x30/0x40
do_ahash_op+0x2c/0xa0
test_ahash_vec_cfg+0x894/0x9e0
test_hash_vec_cfg+0x6c/0x88
test_hash_vec+0xfc/0x1e0
__alg_test_hash+0x1ac/0x368
alg_test_hash+0xf8/0x1c8
alg_test.part.44+0x114/0x4a0
alg_test+0x1c/0x60
cryptomgr_test+0x34/0x58
kthread+0x1b8/0x1c0
ret_from_fork+0x10/0x18
Freed by task 0:
save_stack+0x24/0xb0
__kasan_slab_free+0x10c/0x188
kasan_slab_free+0x10/0x18
kfree+0x7c/0x298
ahash_done+0xd4/0x3b8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
The buggy address belongs to the object at
ffff00002303f000
which belongs to the cache dma-kmalloc-128 of size 128
The buggy address is located 16 bytes inside of
128-byte region [
ffff00002303f000,
ffff00002303f080)
The buggy address belongs to the page:
page:
fffffe00006c0fc0 refcount:1 mapcount:0 mapping:
ffff00093200c000 index:0x0
flags: 0xffff00000000200(slab)
raw:
0ffff00000000200 dead000000000100 dead000000000122 ffff00093200c000
raw:
0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00002303ef00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff00002303ef80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
ffff00002303f000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff00002303f080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff00002303f100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes:
21b014f038d3 ("crypto: caam - add crypto_engine support for HASH algorithms")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Iuliana Prodan [Mon, 6 Apr 2020 22:47:26 +0000 (01:47 +0300)]
crypto: caam - fix use-after-free KASAN issue for AEAD algorithms
Here's the KASAN report:
BUG: KASAN: use-after-free in aead_crypt_done+0x60/0xd8
Read of size 1 at addr
ffff00002303f014 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc1-00162-gfcb90d5 #58
Hardware name: LS1046A RDB Board (DT)
Call trace:
dump_backtrace+0x0/0x260
show_stack+0x14/0x20
dump_stack+0xe8/0x144
print_address_description.isra.11+0x64/0x348
__kasan_report+0x11c/0x230
kasan_report+0xc/0x18
__asan_load1+0x5c/0x68
aead_crypt_done+0x60/0xd8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
irq_exit+0x114/0x128
__handle_domain_irq+0x80/0xe0
gic_handle_irq+0x50/0xa0
el1_irq+0xb8/0x180
_raw_spin_unlock_irq+0x2c/0x78
finish_task_switch+0xa4/0x2f8
__schedule+0x3a4/0x890
schedule_idle+0x28/0x50
do_idle+0x22c/0x338
cpu_startup_entry+0x24/0x40
rest_init+0xf8/0x10c
arch_call_rest_init+0xc/0x14
start_kernel+0x774/0x7b4
Allocated by task 263:
save_stack+0x24/0xb0
__kasan_kmalloc.isra.10+0xc4/0xe0
kasan_kmalloc+0xc/0x18
__kmalloc+0x178/0x2b8
aead_edesc_alloc+0x1b4/0xbf0
ipsec_gcm_encrypt+0xd4/0x140
crypto_aead_encrypt+0x50/0x68
test_aead_vec_cfg+0x498/0xec0
test_aead_vec+0x110/0x200
alg_test_aead+0xfc/0x680
alg_test.part.44+0x114/0x4a0
alg_test+0x1c/0x60
cryptomgr_test+0x34/0x58
kthread+0x1b8/0x1c0
ret_from_fork+0x10/0x18
Freed by task 0:
save_stack+0x24/0xb0
__kasan_slab_free+0x10c/0x188
kasan_slab_free+0x10/0x18
kfree+0x7c/0x298
aead_crypt_done+0x58/0xd8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
The buggy address belongs to the object at
ffff00002303f000
which belongs to the cache dma-kmalloc-128 of size 128
The buggy address is located 20 bytes inside of
128-byte region [
ffff00002303f000,
ffff00002303f080)
The buggy address belongs to the page:
page:
fffffe00006c0fc0 refcount:1 mapcount:0 mapping:
ffff00093200c000 index:0x0
flags: 0xffff00000000200(slab)
raw:
0ffff00000000200 dead000000000100 dead000000000122 ffff00093200c000
raw:
0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00002303ef00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff00002303ef80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
ffff00002303f000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff00002303f080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff00002303f100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes:
1c2402266713 ("crypto: caam - add crypto_engine support for AEAD algorithms")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Iuliana Prodan [Mon, 6 Apr 2020 22:47:25 +0000 (01:47 +0300)]
crypto: caam - fix use-after-free KASAN issue for SKCIPHER algorithms
Here's the KASAN report:
BUG: KASAN: use-after-free in skcipher_crypt_done+0xe8/0x1a8
Read of size 1 at addr
ffff00002304001c by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc1-00162-gfcb90d5 #57
Hardware name: LS1046A RDB Board (DT)
Call trace:
dump_backtrace+0x0/0x260
show_stack+0x14/0x20
dump_stack+0xe8/0x144
print_address_description.isra.11+0x64/0x348
__kasan_report+0x11c/0x230
kasan_report+0xc/0x18
__asan_load1+0x5c/0x68
skcipher_crypt_done+0xe8/0x1a8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
irq_exit+0x114/0x128
__handle_domain_irq+0x80/0xe0
gic_handle_irq+0x50/0xa0
el1_irq+0xb8/0x180
_raw_spin_unlock_irq+0x2c/0x78
finish_task_switch+0xa4/0x2f8
__schedule+0x3a4/0x890
schedule_idle+0x28/0x50
do_idle+0x22c/0x338
cpu_startup_entry+0x24/0x40
rest_init+0xf8/0x10c
arch_call_rest_init+0xc/0x14
start_kernel+0x774/0x7b4
Allocated by task 263:
save_stack+0x24/0xb0
__kasan_kmalloc.isra.10+0xc4/0xe0
kasan_kmalloc+0xc/0x18
__kmalloc+0x178/0x2b8
skcipher_edesc_alloc+0x21c/0x1018
skcipher_encrypt+0x84/0x150
crypto_skcipher_encrypt+0x50/0x68
test_skcipher_vec_cfg+0x4d4/0xc10
test_skcipher_vec+0xf8/0x1d8
alg_test_skcipher+0xec/0x230
alg_test.part.44+0x114/0x4a0
alg_test+0x1c/0x60
cryptomgr_test+0x34/0x58
kthread+0x1b8/0x1c0
ret_from_fork+0x10/0x18
Freed by task 0:
save_stack+0x24/0xb0
__kasan_slab_free+0x10c/0x188
kasan_slab_free+0x10/0x18
kfree+0x7c/0x298
skcipher_crypt_done+0xe0/0x1a8
caam_jr_dequeue+0x390/0x608
tasklet_action_common.isra.13+0x1ec/0x230
tasklet_action+0x24/0x30
efi_header_end+0x1a4/0x370
The buggy address belongs to the object at
ffff000023040000
which belongs to the cache dma-kmalloc-512 of size 512
The buggy address is located 28 bytes inside of
512-byte region [
ffff000023040000,
ffff000023040200)
The buggy address belongs to the page:
page:
fffffe00006c1000 refcount:1 mapcount:0 mapping:
ffff00093200c400 index:0x0 compound_mapcount: 0
flags: 0xffff00000010200(slab|head)
raw:
0ffff00000010200 dead000000000100 dead000000000122 ffff00093200c400
raw:
0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00002303ff00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff00002303ff80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
ffff000023040000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff000023040080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff000023040100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes:
ee38767f152a ("crypto: caam - support crypto_engine framework for SKCIPHER algorithms")
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Linus Torvalds [Sun, 12 Apr 2020 19:35:55 +0000 (12:35 -0700)]
Linux 5.7-rc1
Linus Torvalds [Sun, 12 Apr 2020 18:04:58 +0000 (11:04 -0700)]
MAINTAINERS: sort field names for all entries
This sorts the actual field names too, potentially causing even more
chaos and confusion at merge time if you have edited the MAINTAINERS
file. But the end result is a more consistent layout, and hopefully
it's a one-time pain minimized by doing this just before the -rc1
release.
This was entirely scripted:
./scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS --order
Requested-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 12 Apr 2020 18:03:52 +0000 (11:03 -0700)]
MAINTAINERS: sort entries by entry name
They are all supposed to be sorted, but people who add new entries don't
always know the alphabet. Plus sometimes the entry names get edited,
and people don't then re-order the entry.
Let's see how painful this will be for merging purposes (the MAINTAINERS
file is often edited in various different trees), but Joe claims there's
relatively few patches in -next that touch this, and doing it just
before -rc1 is likely the best time. Fingers crossed.
This was scripted with
/scripts/parse-maintainers.pl --input=MAINTAINERS --output=MAINTAINERS
but then I also ended up manually upper-casing a few entry names that
stood out when looking at the end result.
Requested-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 12 Apr 2020 17:17:16 +0000 (10:17 -0700)]
Merge tag 'x86-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of three patches to fix the fallout of the newly added split
lock detection feature.
It addressed the case where a KVM guest triggers a split lock #AC and
KVM reinjects it into the guest which is not prepared to handle it.
Add proper sanity checks which prevent the unconditional injection
into the guest and handles the #AC on the host side in the same way as
user space detections are handled. Depending on the detection mode it
either warns and disables detection for the task or kills the task if
the mode is set to fatal"
* tag 'x86-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
KVM: x86: Emulate split-lock access as a write in emulator
x86/split_lock: Provide handle_guest_split_lock()
Linus Torvalds [Sun, 12 Apr 2020 17:13:14 +0000 (10:13 -0700)]
Merge tag 'timers-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull time(keeping) updates from Thomas Gleixner:
- Fix the time_for_children symlink in /proc/$PID/ so it properly
reflects that it part of the 'time' namespace
- Add the missing userns limit for the allowed number of time
namespaces, which was half defined but the actual array member was
not added. This went unnoticed as the array has an exessive empty
member at the end but introduced a user visible regression as the
output was corrupted.
- Prevent further silent ucount corruption by adding a BUILD_BUG_ON()
to catch half updated data.
* tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ucount: Make sure ucounts in /proc/sys/user don't regress again
time/namespace: Add max_time_namespaces ucount
time/namespace: Fix time_for_children symlink
Linus Torvalds [Sun, 12 Apr 2020 17:09:19 +0000 (10:09 -0700)]
Merge tag 'sched-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes/updates from Thomas Gleixner:
- Deduplicate the average computations in the scheduler core and the
fair class code.
- Fix a raise between runtime distribution and assignement which can
cause exceeding the quota by up to 70%.
- Prevent negative results in the imbalanace calculation
- Remove a stale warning in the workqueue code which can be triggered
since the call site was moved out of preempt disabled code. It's a
false positive.
- Deduplicate the print macros for procfs
- Add the ucmap values to the SCHED_DEBUG procfs output for completness
* tag 'sched-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Add task uclamp values to SCHED_DEBUG procfs
sched/debug: Factor out printing formats into common macros
sched/debug: Remove redundant macro define
sched/core: Remove unused rq::last_load_update_tick
workqueue: Remove the warning in wq_worker_sleeping()
sched/fair: Fix negative imbalance in imbalance calculation
sched/fair: Fix race between runtime distribution and assignment
sched/fair: Align rq->avg_idle and rq->avg_scan_cost
Linus Torvalds [Sun, 12 Apr 2020 17:05:24 +0000 (10:05 -0700)]
Merge tag 'perf-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"Three fixes/updates for perf:
- Fix the perf event cgroup tracking which tries to track the cgroup
even for disabled events.
- Add Ice Lake server support for uncore events
- Disable pagefaults when retrieving the physical address in the
sampling code"
* tag 'perf-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Disable page faults when getting phys address
perf/x86/intel/uncore: Add Ice Lake server uncore support
perf/cgroup: Correct indirection in perf_less_group_idx()
perf/core: Fix event cgroup tracking
Linus Torvalds [Sun, 12 Apr 2020 16:47:10 +0000 (09:47 -0700)]
Merge tag 'locking-urgent-2020-04-12' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"Three small fixes/updates for the locking core code:
- Plug a task struct reference leak in the percpu rswem
implementation.
- Document the refcount interaction with PID_MAX_LIMIT
- Improve the 'invalid wait context' data dump in lockdep so it
contains all information which is required to decode the problem"
* tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Improve 'invalid wait context' splat
locking/refcount: Document interaction with PID_MAX_LIMIT
locking/percpu-rwsem: Fix a task_struct refcount
Linus Torvalds [Sun, 12 Apr 2020 16:41:01 +0000 (09:41 -0700)]
Merge tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Ten cifs/smb fixes:
- five RDMA (smbdirect) related fixes
- add experimental support for swap over SMB3 mounts
- also a fix which improves performance of signed connections"
* tag '5.7-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
smb3: enable swap on SMB3 mounts
smb3: change noisy error message to FYI
smb3: smbdirect support can be configured by default
cifs: smbd: Do not schedule work to send immediate packet on every receive
cifs: smbd: Properly process errors on ib_post_send
cifs: Allocate crypto structures on the fly for calculating signatures of incoming packets
cifs: smbd: Update receive credits before sending and deal with credits roll back on failure before sending
cifs: smbd: Check send queue size before posting a send
cifs: smbd: Merge code to track pending packets
cifs: ignore cached share root handle closing errors
Linus Torvalds [Sun, 12 Apr 2020 16:39:47 +0000 (09:39 -0700)]
Merge tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfix from Trond Myklebust:
"Fix an RCU read lock leakage in pnfs_alloc_ds_commits_list()"
* tag 'nfs-for-5.7-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS: Fix RCU lock leakage
Linus Torvalds [Sat, 11 Apr 2020 18:38:44 +0000 (11:38 -0700)]
Merge tag 'nios2-v5.7-rc1' of git://git./linux/kernel/git/lftan/nios2
Pull nios2 updates from Ley Foon Tan:
- Remove nios2-dev@lists.rocketboards.org from MAINTAINERS
- remove 'resetvalue' property
- rename 'altr,gpio-bank-width' -> 'altr,ngpio'
- enable the common clk subsystem on Nios2
* tag 'nios2-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
MAINTAINERS: Remove nios2-dev@lists.rocketboards.org
arch: nios2: remove 'resetvalue' property
arch: nios2: rename 'altr,gpio-bank-width' -> 'altr,ngpio'
arch: nios2: Enable the common clk subsystem on Nios2
Linus Torvalds [Sat, 11 Apr 2020 18:34:36 +0000 (11:34 -0700)]
Merge tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- fix an integer truncation in dma_direct_get_required_mask
(Kishon Vijay Abraham)
- fix the display of dma mapping types (Grygorii Strashko)
* tag 'dma-mapping-5.7-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-debug: fix displaying of dma allocation type
dma-direct: fix data truncation in dma_direct_get_required_mask()
Linus Torvalds [Sat, 11 Apr 2020 16:46:12 +0000 (09:46 -0700)]
Merge tag 'kbuild-v5.7-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
- raise minimum supported binutils version to 2.23
- remove old CONFIG_AS_* macros that we know binutils >= 2.23 supports
- move remaining CONFIG_AS_* tests to Kconfig from Makefile
- enable -Wtautological-compare warnings to catch more issues
- do not support GCC plugins for GCC <= 4.7
- fix various breakages of 'make xconfig'
- include the linker version used for linking the kernel into
LINUX_COMPILER, which is used for the banner, and also exposed to
/proc/version
- link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y, which
allows us to remove the lib-ksyms.o workaround, and to solve the last
known issue of the LLVM linker
- add dummy tools in scripts/dummy-tools/ to enable all compiler tests
in Kconfig, which will be useful for distro maintainers
- support the single switch, LLVM=1 to use Clang and all LLVM utilities
instead of GCC and Binutils.
- support LLVM_IAS=1 to enable the integrated assembler, which is still
experimental
* tag 'kbuild-v5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (36 commits)
kbuild: fix comment about missing include guard detection
kbuild: support LLVM=1 to switch the default tools to Clang/LLVM
kbuild: replace AS=clang with LLVM_IAS=1
kbuild: add dummy toolchains to enable all cc-option etc. in Kconfig
kbuild: link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y
MIPS: fw: arc: add __weak to prom_meminit and prom_free_prom_memory
kbuild: remove -I$(srctree)/tools/include from scripts/Makefile
kbuild: do not pass $(KBUILD_CFLAGS) to scripts/mkcompile_h
Documentation/llvm: fix the name of llvm-size
kbuild: mkcompile_h: Include $LD version in /proc/version
kconfig: qconf: Fix a few alignment issues
kconfig: qconf: remove some old bogus TODOs
kconfig: qconf: fix support for the split view mode
kconfig: qconf: fix the content of the main widget
kconfig: qconf: Change title for the item window
kconfig: qconf: clean deprecated warnings
gcc-plugins: drop support for GCC <= 4.7
kbuild: Enable -Wtautological-compare
x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2
crypto: x86 - clean up poly1305-x86_64-cryptogams.S by 'make clean'
...
Sedat Dilek [Sat, 11 Apr 2020 13:29:43 +0000 (15:29 +0200)]
mailmap: Add Sedat Dilek (replacement for expired email address)
I do not longer work for credativ Germany.
Please, use my private email address instead.
This is for the case when people want to CC me on
patches sent from my old business email address.
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trond Myklebust [Sat, 11 Apr 2020 15:37:18 +0000 (11:37 -0400)]
pNFS: Fix RCU lock leakage
Another brown paper bag moment. pnfs_alloc_ds_commits_list() is leaking
the RCU lock.
Fixes:
a9901899b649 ("pNFS: Add infrastructure for cleaning up per-layout commit structures")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Xiaoyao Li [Fri, 10 Apr 2020 11:54:02 +0000 (13:54 +0200)]
KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest
Two types of #AC can be generated in Intel CPUs:
1. legacy alignment check #AC
2. split lock #AC
Reflect #AC back into the guest if the guest has legacy alignment checks
enabled or if split lock detection is disabled.
If the #AC is not a legacy one and split lock detection is enabled, then
invoke handle_guest_split_lock() which will either warn and disable split
lock detection for this task or force SIGBUS on it.
[ tglx: Switch it to handle_guest_split_lock() and rename the misnamed
helper function. ]
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.176308876@linutronix.de
Xiaoyao Li [Fri, 10 Apr 2020 11:54:01 +0000 (13:54 +0200)]
KVM: x86: Emulate split-lock access as a write in emulator
Emulate split-lock accesses as writes if split lock detection is on
to avoid #AC during emulation, which will result in a panic(). This
should never occur for a well-behaved guest, but a malicious guest can
manipulate the TLB to trigger emulation of a locked instruction[1].
More discussion can be found at [2][3].
[1] https://lkml.kernel.org/r/
8c5b11c9-58df-38e7-a514-
dc12d687b198@redhat.com
[2] https://lkml.kernel.org/r/
20200131200134.GD18946@linux.intel.com
[3] https://lkml.kernel.org/r/
20200227001117.GX9940@linux.intel.com
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115517.084300242@linutronix.de
Thomas Gleixner [Fri, 10 Apr 2020 11:54:00 +0000 (13:54 +0200)]
x86/split_lock: Provide handle_guest_split_lock()
Without at least minimal handling for split lock detection induced #AC,
VMX will just run into the same problem as the VMWare hypervisor, which
was reported by Kenneth.
It will inject the #AC blindly into the guest whether the guest is
prepared or not.
Provide a function for guest mode which acts depending on the host
SLD mode. If mode == sld_warn, treat it like user space, i.e. emit a
warning, disable SLD and mark the task accordingly. Otherwise force
SIGBUS.
[ bp: Add a !CPU_SUP_INTEL stub for handle_guest_split_lock(). ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200410115516.978037132@linutronix.de
Link: https://lkml.kernel.org/r/20200402123258.895628824@linutronix.de
Masahiro Yamada [Wed, 8 Apr 2020 18:29:19 +0000 (03:29 +0900)]
kbuild: fix comment about missing include guard detection
The keyword here is 'twice' to explain the trick.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sat, 11 Apr 2020 00:57:48 +0000 (17:57 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
- Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc,
gup, hugetlb, pagemap, memremap)
- Various other things (hfs, ocfs2, kmod, misc, seqfile)
* akpm: (34 commits)
ipc/util.c: sysvipc_find_ipc() should increase position index
kernel/gcov/fs.c: gcov_seq_next() should increase position index
fs/seq_file.c: seq_read(): add info message about buggy .next functions
drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
change email address for Pali Rohár
selftests: kmod: test disabling module autoloading
selftests: kmod: fix handling test numbers above 9
docs: admin-guide: document the kernel.modprobe sysctl
fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
kmod: make request_module() return an error when autoloading is disabled
mm/memremap: set caching mode for PCI P2PDMA memory to WC
mm/memory_hotplug: add pgprot_t to mhp_params
powerpc/mm: thread pgprot_t through create_section_mapping()
x86/mm: introduce __set_memory_prot()
x86/mm: thread pgprot_t through init_memory_mapping()
mm/memory_hotplug: rename mhp_restrictions to mhp_params
mm/memory_hotplug: drop the flags field from struct mhp_restrictions
mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
mm/vma: introduce VM_ACCESS_FLAGS
mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
...
Linus Torvalds [Sat, 11 Apr 2020 00:53:43 +0000 (17:53 -0700)]
Merge tag 'docs-5.7-2' of git://git.lwn.net/linux
Pull Documentation fixes from Jonathan Corbet:
"A handful of late-arriving fixes for the documentation tree"
* tag 'docs-5.7-2' of git://git.lwn.net/linux:
Documentation: android: binderfs: add 'stats' mount option
Documentation: driver-api/usb/writing_usb_driver.rst Updates documentation links
docs: driver-api: address duplicate label warning
Documentation: sysrq: fix RST formatting
docs: kernel-parameters.txt: Fix broken references
docs: kernel-parameters.txt: Remove nompx
docs: filesystems: fix typo in qnx6.rst
Linus Torvalds [Sat, 11 Apr 2020 00:50:01 +0000 (17:50 -0700)]
Merge tag 'for-linus-5.7-ofs1' of git://git./linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"A fix and two cleanups.
Fix:
- Christoph Hellwig noticed that some logic I added to
orangefs_file_read_iter introduced a race condition, so he sent a
reversion patch. I had to modify his patch since reverting at this
point broke Orangefs.
Cleanups:
- Christoph Hellwig noticed that we were doing some unnecessary work
in orangefs_flush, so he sent in a patch that removed the un-needed
code.
- Al Viro told me he had trouble building Orangefs. Orangefs should
be easy to build, even for Al :-).
I looked back at the test server build notes in orangefs.txt, just
in case that's where the trouble really is, and found a couple of
typos and made a couple of clarifications"
* tag 'for-linus-5.7-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: clarify build steps for test server in orangefs.txt
orangefs: don't mess with I_DIRTY_TIMES in orangefs_flush
orangefs: get rid of knob code...
Linus Torvalds [Sat, 11 Apr 2020 00:39:20 +0000 (17:39 -0700)]
Merge tag 'xtensa-
20200410' of git://github.com/jcmvbkbc/linux-xtensa
Pull xtensa updates from Max Filippov:
- replace setup_irq() by request_irq()
- cosmetic fixes in xtensa Kconfig and boot/Makefile
* tag 'xtensa-
20200410' of git://github.com/jcmvbkbc/linux-xtensa:
arch/xtensa: fix grammar in Kconfig help text
xtensa: remove meaningless export ccflags-y
xtensa: replace setup_irq() by request_irq()
Linus Torvalds [Sat, 11 Apr 2020 00:20:06 +0000 (17:20 -0700)]
Merge tag 'for-linus-5.7-rc1b-tag' of git://git./linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:
- two cleanups
- fix a boot regression introduced in this merge window
- fix wrong use of memory allocation flags
* tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: fix booting 32-bit pv guest
x86/xen: make xen_pvmmu_arch_setup() static
xen/blkfront: fix memory allocation flags in blkfront_setup_indirect()
xen: Use evtchn_type_t as a type for event channels
Vasily Averin [Fri, 10 Apr 2020 21:34:13 +0000 (14:34 -0700)]
ipc/util.c: sysvipc_find_ipc() should increase position index
If seq_file .next function does not change position index, read after
some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/b7a20945-e315-8bb0-21e6-3875c14a8494@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasily Averin [Fri, 10 Apr 2020 21:34:10 +0000 (14:34 -0700)]
kernel/gcov/fs.c: gcov_seq_next() should increase position index
If seq_file .next function does not change position index, read after
some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/f65c6ee7-bd00-f910-2f8a-37cc67e4ff88@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasily Averin [Fri, 10 Apr 2020 21:34:06 +0000 (14:34 -0700)]
fs/seq_file.c: seq_read(): add info message about buggy .next functions
Patch series "seq_file .next functions should increase position index".
In Aug 2018 NeilBrown noticed commit
1f4aace60b0e ("fs/seq_file.c:
simplify seq_file iteration code and interface")
"Some ->next functions do not increment *pos when they return NULL...
Note that such ->next functions are buggy and should be fixed. A simple
demonstration is dd if=/proc/swaps bs=1000 skip=1 Choose any block size
larger than the size of /proc/swaps. This will always show the whole
last line of /proc/swaps"
Described problem is still actual. If you make lseek into middle of
last output line following read will output end of last line and whole
last line once again.
$ dd if=/proc/swaps bs=1 # usual output
Filename Type Size Used Priority
/dev/dm-0 partition 4194812 97536 -2
104+0 records in
104+0 records out
104 bytes copied
$ dd if=/proc/swaps bs=40 skip=1 # last line was generated twice
dd: /proc/swaps: cannot skip to specified offset
v/dm-0 partition 4194812 97536 -2
/dev/dm-0 partition 4194812 97536 -2
3+1 records in
3+1 records out
131 bytes copied
There are lot of other affected files, I've found 30+ including
/proc/net/ip_tables_matches and /proc/sysvipc/*
I've sent patches into maillists of affected subsystems already, this
patch-set fixes the problem in files related to pstore, tracing, gcov,
sysvipc and other subsystems processed via linux-kernel@ mailing list
directly
https://bugzilla.kernel.org/show_bug.cgi?id=206283
This patch (of 4):
Add debug code to seq_read() to detect missed or out-of-tree incorrect
.next seq_file functions.
[akpm@linux-foundation.org: s/pr_info/pr_info_ratelimited/, per Qian Cai]
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/244674e5-760c-86bd-d08a-047042881748@virtuozzo.com
Link: http://lkml.kernel.org/r/7c24087c-e280-e580-5b0c-0cdaeb14cd18@virtuozzo.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kbuild test robot [Fri, 10 Apr 2020 21:34:03 +0000 (14:34 -0700)]
drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
Remove dev_err() messages after platform_get_irq*() failures.
platform_get_irq() already prints an error.
Generated by: scripts/coccinelle/api/platform_get_irq.cocci
Fixes:
6c41ac96ad92 ("dmaengine: tegra-apb: Support COMPILE_TEST")
Signed-off-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Vinod Koul <vinod.koul@linux.intel.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Jon Hunter <jonathanh@nvidia.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2002271133450.2973@hadrien
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pali Rohár [Fri, 10 Apr 2020 21:34:00 +0000 (14:34 -0700)]
change email address for Pali Rohár
For security reasons I stopped using gmail account and kernel address is
now up-to-date alias to my personal address.
People periodically send me emails to address which they found in source
code of drivers, so this change reflects state where people can contact
me.
[ Added .mailmap entry as per Joe Perches - Linus ]
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:57 +0000 (14:33 -0700)]
selftests: kmod: test disabling module autoloading
Test that request_module() fails with -ENOENT when
/proc/sys/kernel/modprobe contains (a) a nonexistent path, and (b) an
empty path.
Case (b) is a regression test for the patch "kmod: make request_module()
return an error when autoloading is disabled".
Tested with 'kmod.sh -t 0010 && kmod.sh -t 0011', and also simply with
'kmod.sh' to run all kmod tests.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-5-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:53 +0000 (14:33 -0700)]
selftests: kmod: fix handling test numbers above 9
get_test_count() and get_test_enabled() were broken for test numbers
above 9 due to awk interpreting a field specification like '$0010' as
octal rather than decimal. Fix it by stripping the leading zeroes.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200318230515.171692-5-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:50 +0000 (14:33 -0700)]
docs: admin-guide: document the kernel.modprobe sysctl
Document the kernel.modprobe sysctl in the same place that all the other
kernel.* sysctls are documented. Make sure to mention how to use this
sysctl to completely disable module autoloading, and how this sysctl
relates to CONFIG_STATIC_USERMODEHELPER.
[ebiggers@google.com: v5]
Link: http://lkml.kernel.org/r/20200318230515.171692-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-4-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:47 +0000 (14:33 -0700)]
fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
After request_module(), nothing is stopping the module from being
unloaded until someone takes a reference to it via try_get_module().
The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
running 'rmmod' concurrently.
Since WARN_ONCE() is for kernel bugs only, not for user-reachable
situations, downgrade this warning to pr_warn_once().
Keep it printed once only, since the intent of this warning is to detect
a bug in modprobe at boot time. Printing the warning more than once
wouldn't really provide any useful extra information.
Fixes:
41124db869b7 ("fs: warn in case userspace lied about modprobe return")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Cc: <stable@vger.kernel.org> [4.13+]
Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Biggers [Fri, 10 Apr 2020 21:33:43 +0000 (14:33 -0700)]
kmod: make request_module() return an error when autoloading is disabled
Patch series "module autoloading fixes and cleanups", v5.
This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via
'echo > /proc/sys/kernel/modprobe'.
It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/
20200310223731.126894-1-ebiggers@kernel.org/T/#u)
bydocumenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().
This patch (of 4):
It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.
This can be preferable to setting it to a nonexistent file since it
avoids the overhead of an attempted execve(), avoids potential
deadlocks, and avoids the call to security_kernel_module_request() and
thus on SELinux-based systems eliminates the need to write SELinux rules
to dontaudit module_request.
However, when module autoloading is disabled in this way,
request_module() returns 0. This is broken because callers expect 0 to
mean that the module was successfully loaded.
Apparently this was never noticed because this method of disabling
module autoloading isn't used much, and also most callers don't use the
return value of request_module() since it's always necessary to check
whether the module registered its functionality or not anyway.
But improperly returning 0 can indeed confuse a few callers, for example
get_fs_type() in fs/filesystems.c where it causes a WARNING to be hit:
if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
fs = __get_fs_type(name, len);
WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
}
This is easily reproduced with:
echo > /proc/sys/kernel/modprobe
mount -t NONEXISTENT none /
It causes:
request_module fs-NONEXISTENT succeeded, but still no fs?
WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
[...]
This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module.
Regardless, request_module() should correctly return an error when it
fails. So let's make it return -ENOENT, which matches the error when
the modprobe binary doesn't exist.
I've also sent patches to document and test this case.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Ben Hutchings <benh@debian.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:39 +0000 (14:33 -0700)]
mm/memremap: set caching mode for PCI P2PDMA memory to WC
PCI BAR IO memory should never be mapped as WB, however prior to this
the PAT bits were set WB and it was typically overridden by MTRR
registers set by the firmware.
Set PCI P2PDMA memory to be UC as this is what it currently, typically,
ends up being mapped as on x86 after the MTRR registers override the
cache setting.
Future use-cases may need to generalize this by adding flags to select
the caching type, as some P2PDMA cases may not want UC. However, those
use-cases are not upstream yet and this can be changed when they arrive.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-8-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:36 +0000 (14:33 -0700)]
mm/memory_hotplug: add pgprot_t to mhp_params
devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory. At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB.
However, on x86, an mtrr register will typically override this and force
the cache type to be UC-. In the case firmware doesn't set this
register it is effectively WB and will typically result in a machine
check exception when it's accessed.
Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.
To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().
Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables. For x86_32, set the page tables
explicitly using _set_memory_prot() (seeing they are already mapped).
For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
should be fine, for now, seeing these architectures don't support
ZONE_DEVICE.
A check in __add_pages() is also added to ensure the pgprot parameter
was set for all arches.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:32 +0000 (14:33 -0700)]
powerpc/mm: thread pgprot_t through create_section_mapping()
In prepartion to support a pgprot_t argument for arch_add_memory().
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-6-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:28 +0000 (14:33 -0700)]
x86/mm: introduce __set_memory_prot()
For use in the 32bit arch_add_memory() to set the pgprot type of the
memory to add.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-5-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:24 +0000 (14:33 -0700)]
x86/mm: thread pgprot_t through init_memory_mapping()
In preparation to support a pgprot_t argument for arch_add_memory().
It's required to move the prototype of init_memory_mapping() seeing the
original location came before the definition of pgprot_t.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-4-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:21 +0000 (14:33 -0700)]
mm/memory_hotplug: rename mhp_restrictions to mhp_params
The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore so rename it to be mhp_params as it is a list of
extended parameters.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 10 Apr 2020 21:33:17 +0000 (14:33 -0700)]
mm/memory_hotplug: drop the flags field from struct mhp_restrictions
Patch series "Allow setting caching mode in arch_add_memory() for
P2PDMA", v4.
Currently, the page tables created using memremap_pages() are always
created with the PAGE_KERNEL cacheing mode. However, the P2PDMA code is
creating pages for PCI BAR memory which should never be accessed through
the cache and instead use either WC or UC. This still works in most
cases, on x86, because the MTRR registers typically override the caching
settings in the page tables for all of the IO memory to be UC-.
However, this tends not to work so well on other arches or some rare x86
machines that have firmware which does not setup the MTRR registers in
this way.
Instead of this, this series proposes a change to arch_add_memory() to
take the pgprot required by the mapping which allows us to explicitly
set pagetable entries for P2PDMA memory to UC.
This changes is pretty routine for most of the arches: x86_64, arm64 and
powerpc simply need to thread the pgprot through to where the page
tables are setup. x86_32 unfortunately sets up the page tables at boot
so must use _set_memory_prot() to change their caching mode. ia64, s390
and sh don't appear to have an easy way to change the page tables so,
for now at least, we just return -EINVAL on such mappings and thus they
will not support P2PDMA memory until the work for this is done. This
should be fine as they don't yet support ZONE_DEVICE.
This patch (of 7):
This variable is not used anywhere and should therefore be removed from
the structure.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-2-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 10 Apr 2020 21:33:13 +0000 (14:33 -0700)]
mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
Currently there are many platforms that dont enable ARCH_HAS_PTE_SPECIAL
but required to define quite similar fallback stubs for special page
table entry helpers such as pte_special() and pte_mkspecial(), as they
get build in generic MM without a config check. This creates two
generic fallback stub definitions for these helpers, eliminating much
code duplication.
mips platform has a special case where pte_special() and pte_mkspecial()
visibility is wider than what ARCH_HAS_PTE_SPECIAL enablement requires.
This restricts those symbol visibility in order to avoid redefinitions
which is now exposed through this new generic stubs and subsequent build
failure. arm platform set_pte_at() definition needs to be moved into a
C file just to prevent a build failure.
[anshuman.khandual@arm.com: use defined(CONFIG_ARCH_HAS_PTE_SPECIAL) in mips per Thomas]
Link: http://lkml.kernel.org/r/1583851924-21603-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Guo Ren <guoren@kernel.org> [csky]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Stafford Horne <shorne@gmail.com> [openrisc]
Acked-by: Helge Deller <deller@gmx.de> [parisc]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: http://lkml.kernel.org/r/1583802551-15406-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 10 Apr 2020 21:33:09 +0000 (14:33 -0700)]
mm/vma: introduce VM_ACCESS_FLAGS
There are many places where all basic VMA access flags (read, write,
exec) are initialized or checked against as a group. One such example
is during page fault. Existing vma_is_accessible() wrapper already
creates the notion of VMA accessibility as a group access permissions.
Hence lets just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC) which
will not only reduce code duplication but also extend the VMA
accessibility concept in general.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Springer <rspringer@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anshuman Khandual [Fri, 10 Apr 2020 21:33:05 +0000 (14:33 -0700)]
mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
There are many platforms with exact same value for VM_DATA_DEFAULT_FLAGS
This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
existing VM_STACK_DEFAULT_FLAGS. While here, also define some more
macros with standard VMA access flag combinations that are used
frequently across many platforms. Apart from simplification, this
reduces code duplication as well.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:33:01 +0000 (14:33 -0700)]
mm/memory.c: add vm_insert_pages()
Add the ability to insert multiple pages at once to a user VM with lower
PTE spinlock operations.
The intention of this patch-set is to reduce atomic ops for tcp zerocopy
receives, which normally hits the same spinlock multiple times
consecutively.
[akpm@linux-foundation.org: pte_alloc() no longer takes the `addr' argument]
[arjunroy@google.com: add missing page_count() check to vm_insert_pages()]
Link: http://lkml.kernel.org/r/20200214005929.104481-1-arjunroy.kdev@gmail.com
[arjunroy@google.com: vm_insert_pages() checks if pte_index defined]
Link: http://lkml.kernel.org/r/20200228054714.204424-2-arjunroy.kdev@gmail.com
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200128025958.43490-2-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:32:58 +0000 (14:32 -0700)]
mm: define pte_index as macro for x86
pte_index() is either defined as a macro (e.g. sparc64) or as an
inlined function (e.g. x86). vm_insert_pages() depends on pte_index
but it is not defined on all platforms (e.g. m68k).
To fix compilation of vm_insert_pages() on architectures not providing
pte_index(), we perform the following fix:
0. For platforms where it is meaningful, and defined as a macro, no
change is needed.
1. For platforms where it is meaningful and defined as an inlined
function, and we want to use it with vm_insert_pages(), we define
a degenerate macro of the form: #define pte_index pte_index
2. vm_insert_pages() checks for the existence of a pte_index macro
definition. If found, it implements a batched insert. If not found,
it devolves to calling vm_insert_page() in a loop.
This patch implements step 1 for x86.
v3 of this patch fixes a compilation warning for an unused method.
v2 of this patch moved a macro definition to a more readable location.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200228054714.204424-1-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:32:54 +0000 (14:32 -0700)]
mm: bring sparc pte_index() semantics inline with other platforms
pte_index() on platforms other than sparc return a numerical index. On
sparc, it returns a pte_t*. This presents an issue for
vm_insert_pages(), which relies on pte_index() to find the offset for a
pte within a pmd, for batched inserts.
This patch:
1. Modifies pte_index() for sparc to return a numerical index, like
other platforms,
2. Defines pte_entry() for sparc which returns a pte_t*
(as pte_index() used to),
3. Converts existing sparc callers for pte_index() to use pte_entry().
[sfr@canb.auug.org.au: remove pte_entry and just directly modified pte_offset_kernel instead]
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Arjun Roy <arjunroy.kdev@gmail.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Link: http://lkml.kernel.org/r/20200227105045.6b421d9f@canb.auug.org.au
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjun Roy [Fri, 10 Apr 2020 21:32:51 +0000 (14:32 -0700)]
mm/memory.c: refactor insert_page to prepare for batched-lock insert
Add helper methods for vm_insert_page()/insert_page() to prepare for
vm_insert_pages(), which batch-inserts pages to reduce spinlock
operations when inserting multiple consecutive pages into the user page
table.
The intention of this patch-set is to reduce atomic ops for tcp zerocopy
receives, which normally hits the same spinlock multiple times
consecutively.
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200128025958.43490-1-arjunroy.kdev@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaewon Kim [Fri, 10 Apr 2020 21:32:48 +0000 (14:32 -0700)]
mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area
On passing requirement to vm_unmapped_area, arch_get_unmapped_area and
arch_get_unmapped_area_topdown did not set align_offset. Internally on
both unmapped_area and unmapped_area_topdown, if info->align_mask is 0,
then info->align_offset was meaningless.
But commit
df529cabb7a2 ("mm: mmap: add trace point of
vm_unmapped_area") always prints info->align_offset even though it is
uninitialized.
Fix this uninitialized value issue by setting it to 0 explicitly.
Before:
vm_unmapped_area: addr=0x755b155000 err=0 total_vm=0x15aaf0 flags=0x1 len=0x109000 lo=0x8000 hi=0x75eed48000 mask=0x0 ofs=0x4022
After:
vm_unmapped_area: addr=0x74a4ca1000 err=0 total_vm=0x168ab1 flags=0x1 len=0x9000 lo=0x8000 hi=0x753d94b000 mask=0x0 ofs=0x0
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20200409094035.19457-1-jaewon31.kim@samsung.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin [Fri, 10 Apr 2020 21:32:45 +0000 (14:32 -0700)]
mm: hugetlb: optionally allocate gigantic hugepages using cma
Commit
944d9fec8d7a ("hugetlb: add support for gigantic page allocation
at runtime") has added the run-time allocation of gigantic pages.
However it actually works only at early stages of the system loading,
when the majority of memory is free. After some time the memory gets
fragmented by non-movable pages, so the chances to find a contiguous 1GB
block are getting close to zero. Even dropping caches manually doesn't
help a lot.
At large scale rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex. At the same time keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.
The following solution can solve the problem:
1) On boot time a dedicated cma area* is reserved. The size is passed
as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
cma allocator and the dedicated cma area
In this case gigantic hugepages can be allocated successfully with a
high probability, however the memory isn't completely wasted if nobody
is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.
* On a multi-node machine a per-node cma area is allocated on each node.
Following gigantic hugetlb allocation are using the first available
numa node if the mask isn't specified by a user.
Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
pass hugetlb_cma=10G as a kernel argument
2) allocate hugetlb pages as usual, e.g.
echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.
x86 and arm-64 are covered by this patch, other architectures can be
trivially added later.
The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap. It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz. Thanks!
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aslan Bakirov [Fri, 10 Apr 2020 21:32:42 +0000 (14:32 -0700)]
mm: cma: NUMA node interface
I've noticed that there is no interface exposed by CMA which would let
me to declare contigous memory on particular NUMA node.
This patchset adds the ability to try to allocate contiguous memory on a
specific node. It will fallback to other nodes if the specified one
doesn't work.
Implement a new method for declaring contigous memory on particular node
and keep cma_declare_contiguous() as a wrapper.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Aslan Bakirov <aslan@fb.com>
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-2-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changwei Ge [Fri, 10 Apr 2020 21:32:38 +0000 (14:32 -0700)]
ocfs2: no need try to truncate file beyond i_size
Linux fallocate(2) with FALLOC_FL_PUNCH_HOLE mode set, its offset can
exceed the inode size. Ocfs2 now doesn't allow that offset beyond inode
size. This restriction is not necessary and violates fallocate(2)
semantics.
If fallocate(2) offset is beyond inode size, just return success and do
nothing further.
Otherwise, ocfs2 will crash the kernel.
kernel BUG at fs/ocfs2//alloc.c:7264!
ocfs2_truncate_inline+0x20f/0x360 [ocfs2]
ocfs2_remove_inode_range+0x23c/0xcb0 [ocfs2]
__ocfs2_change_file_space+0x4a5/0x650 [ocfs2]
ocfs2_fallocate+0x83/0xa0 [ocfs2]
vfs_fallocate+0x148/0x230
SyS_fallocate+0x48/0x80
do_syscall_64+0x79/0x170
Signed-off-by: Changwei Ge <chge@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200407082754.17565-1-chge@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jason Yan [Fri, 10 Apr 2020 21:32:32 +0000 (14:32 -0700)]
mm/page_alloc: make pcpu_drain_mutex and pcpu_drain static
Fix the following sparse warning:
mm/page_alloc.c:106:1: warning: symbol 'pcpu_drain_mutex' was not declared. Should it be static?
mm/page_alloc.c:107:1: warning: symbol '__pcpu_scope_pcpu_drain' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200407023925.46438-1-yanaijie@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Fri, 10 Apr 2020 21:32:29 +0000 (14:32 -0700)]
mm/page_alloc.c: fix kernel-doc warning
Add description of function parameter 'mt' to fix kernel-doc warning:
mm/page_alloc.c:3246: warning: Function parameter or member 'mt' not described in '__putback_isolated_page'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/02998bd4-0b82-2f15-2570-f86130304d1e@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mauro Carvalho Chehab [Fri, 10 Apr 2020 21:32:25 +0000 (14:32 -0700)]
docs: mm: slab.h: fix a broken cross-reference
There is a typo at the cross-reference link, causing this warning:
include/linux/slab.h:11: WARNING: undefined label: memory-allocation (if the link has no caption the label must precede a section header)
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/0aeac24235d356ebd935d11e147dcc6edbb6465c.1586359676.git.mchehab+huawei@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qiujun Huang [Fri, 10 Apr 2020 21:32:22 +0000 (14:32 -0700)]
mm, slab_common: fix a typo in comment "eariler"->"earlier"
There is a typo in comment, fix it.
s/eariler/earlier/
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christoph Lameter <cl@linux.com>
Link: http://lkml.kernel.org/r/20200405160544.1246-1-hqjagain@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jakub Kicinski [Fri, 10 Apr 2020 21:32:19 +0000 (14:32 -0700)]
mm, memcg: do not high throttle allocators based on wraparound
If a cgroup violates its memory.high constraints, we may end up unduly
penalising it. For example, for the following hierarchy:
A: max high, 20 usage
A/B: 9 high, 10 usage
A/C: max high, 10 usage
We would end up doing the following calculation below when calculating
high delay for A/B:
A/B: 10 - 9 = 1...
A: 20 - PAGE_COUNTER_MAX = 21, so set max_overage to 21.
This gets worse with higher disparities in usage in the parent.
I have no idea how this disappeared from the final version of the patch,
but it is certainly Not Good(tm). This wasn't obvious in testing because,
for a simple cgroup hierarchy with only one child, the result is usually
roughly the same. It's only in more complex hierarchies that things go
really awry (although still, the effects are limited to a maximum of 2
seconds in schedule_timeout_killable at a maximum).
[chris@chrisdown.name: changelog]
Fixes:
e26733e0d0ec ("mm, memcg: throttle allocators based on ancestral memory.high")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [5.4.x]
Link: http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon Gander [Fri, 10 Apr 2020 21:32:16 +0000 (14:32 -0700)]
hfsplus: fix crash and filesystem corruption when deleting files
When removing files containing extended attributes, the hfsplus driver may
remove the wrong entries from the attributes b-tree, causing major
filesystem damage and in some cases even kernel crashes.
To remove a file, all its extended attributes have to be removed as well.
The driver does this by looking up all keys in the attributes b-tree with
the cnid of the file. Each of these entries then gets deleted using the
key used for searching, which doesn't contain the attribute's name when it
should. Since the key doesn't contain the name, the deletion routine will
not find the correct entry and instead remove the one in front of it. If
parent nodes have to be modified, these become corrupt as well. This
causes invalid links and unsorted entries that not even macOS's fsck_hfs
is able to fix.
To fix this, modify the search key before an entry is deleted from the
attributes b-tree by copying the found entry's key into the search key,
therefore ensuring that the correct entry gets removed from the tree.
Signed-off-by: Simon Gander <simon@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200327155541.1521-1-simon@tuxera.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergey Senozhatsky [Tue, 3 Mar 2020 11:30:02 +0000 (20:30 +0900)]
printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
printk_deferred(), similarly to printk_safe/printk_nmi, does not
immediately attempt to print a new message on the consoles, avoiding
calls into non-reentrant kernel paths, e.g. scheduler or timekeeping,
which potentially can deadlock the system.
Those printk() flavors, instead, rely on per-CPU flush irq_work to print
messages from safer contexts. For same reasons (recursive scheduler or
timekeeping calls) printk() uses per-CPU irq_work in order to wake up
user space syslog/kmsg readers.
However, only printk_safe/printk_nmi do make sure that per-CPU areas
have been initialised and that it's safe to modify per-CPU irq_work.
This means that, for instance, should printk_deferred() be invoked "too
early", that is before per-CPU areas are initialised, printk_deferred()
will perform illegal per-CPU access.
Lech Perczak [0] reports that after commit
1b710b1b10ef ("char/random:
silence a lockdep splat with printk()") user-space syslog/kmsg readers
are not able to read new kernel messages.
The reason is printk_deferred() being called too early (as was pointed
out by Petr and John).
Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU
areas are initialized.
Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/
Reported-by: Lech Perczak <l.perczak@camlintechnologies.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Jann Horn <jannh@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 10 Apr 2020 19:59:56 +0000 (12:59 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull proc fix from Eric Biederman:
"A brown paper bag slipped through my proc changes, and syzcaller
caught it when the code ended up in your tree.
I have opted to fix it the simplest cleanest way I know how, so there
is no reasonable chance for the bug to repeat"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: Use a dedicated lock in struct pid
Linus Torvalds [Fri, 10 Apr 2020 19:55:20 +0000 (12:55 -0700)]
Merge tag 'pwm/for-5.7-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"There's quite a few changes this time around.
Most of these are fixes and cleanups, but there's also new chip
support for some drivers and a bit of rework"
* tag 'pwm/for-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (33 commits)
pwm: pca9685: Fix PWM/GPIO inter-operation
pwm: Make pwm_apply_state_debug() static
pwm: meson: Remove redundant assignment to variable fin_freq
pwm: jz4740: Allow selection of PWM channels 0 and 1
pwm: jz4740: Obtain regmap from parent node
pwm: jz4740: Improve algorithm of clock calculation
pwm: jz4740: Use clocks from TCU driver
pwm: sun4i: Remove redundant needs_delay
pwm: omap-dmtimer: Implement .apply callback
pwm: omap-dmtimer: Do not disable PWM before changing period/duty_cycle
pwm: omap-dmtimer: Fix PWM enabling sequence
pwm: omap-dmtimer: Update description for PWM OMAP DM timer
pwm: omap-dmtimer: Drop unused header file
pwm: renesas-tpu: Drop confusing registered message
pwm: renesas-tpu: Fix late Runtime PM enablement
pwm: rcar: Fix late Runtime PM enablement
dt-bindings: pwm: renesas-tpu: Document more R-Car Gen2 support
pwm: meson: Fix confusing indentation
pwm: pca9685: Use gpio core provided macro GPIO_LINE_DIRECTION_OUT
pwm: pca9685: Replace CONFIG_PM with __maybe_unused
...
Linus Torvalds [Fri, 10 Apr 2020 19:43:42 +0000 (12:43 -0700)]
Merge tag 'for-linus-5.7-1' of git://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
"Bug fixes for main IPMI driver, kcs updates
A couple of bug fixes for the main IPMI driver, one functional and two
annotations.
The kcs driver has some significant updates that have been pending for
a while, but I forgot to include in next until a week ago. But this
code is only used by the people who are sending it to me, really, so
it's not a big deal. I did want it to sit in next for at least a week,
and it did result in a fix"
* tag 'for-linus-5.7-1' of git://github.com/cminyard/linux-ipmi:
ipmi: kcs: Fix aspeed_kcs_probe_of_v1()
ipmi: Add missing annotation for ipmi_ssif_lock_cond() and ipmi_ssif_unlock_cond()
ipmi: kcs: aspeed: Implement v2 bindings
ipmi: kcs: Finish configuring ASPEED KCS device before enable
dt-bindings: ipmi: aspeed: Introduce a v2 binding for KCS
ipmi: fix hung processes in __get_guid()
drivers: char: ipmi: ipmi_msghandler: Pass lockdep expression to RCU lists
Linus Torvalds [Fri, 10 Apr 2020 19:38:28 +0000 (12:38 -0700)]
Merge tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
"As expected, more fixes did turn up in the latter part of the week.
The drm_local_map build regression fix is here, along with temporary
disabling of the hugepage work due to some amdgpu related crashes.
Otherwise it's just a bunch of i915, and amdgpu fixes.
legacy:
- fix drm_local_map.offset type
ttm:
- temporarily disable hugepages to debug amdgpu problems.
prime:
- fix sg extraction
amdgpu:
- Various Renoir fixes
- Fix gfx clockgating sequence on gfx10
- RAS fixes
- Avoid MST property creation after registration
- Various cursor/viewport fixes
- Fix a confusing log message about optional firmwares
i915:
- Flush all the reloc_gpu batch (Chris)
- Ignore readonly failures when updating relocs (Chris)
- Fill all the unused space in the GGTT (Chris)
- Return the right vswing table (Jose)
- Don't enable DDI IO power on a TypeC port in TBT mode for ICL+ (Imre)
analogix_dp:
- probe fix
virtio:
- oob fix in object create"
* tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm: (34 commits)
drm/ttm: Temporarily disable the huge_fault() callback
drm/bridge: analogix_dp: Split bind() into probe() and real bind()
drm/legacy: Fix type for drm_local_map.offset
drm/amdgpu/display: fix warning when compiling without debugfs
drm/amdgpu: unify fw_write_wait for new gfx9 asics
drm/amd/powerplay: error out on forcing clock setting not supported
drm/amdgpu: fix gfx hang during suspend with video playback (v2)
drm/amd/display: Check for null fclk voltage when parsing clock table
drm/amd/display: Acknowledge wm_optimized_required
drm/amd/display: Make cursor source translation adjustment optional
drm/amd/display: Calculate scaling ratios on every medium/full update
drm/amd/display: Program viewport when source pos changes for DCN20 hw seq
drm/amd/display: Fix incorrect cursor pos on scaled primary plane
drm/amd/display: change default pipe_split policy for DCN1
drm/amd/display: Translate cursor position by source rect
drm/amd/display: Update stream adjust in dc_stream_adjust_vmin_vmax
drm/amd/display: Avoid create MST prop after registration
drm/amdgpu/psp: dont warn on missing optional TA's
drm/amdgpu: update RAS related dmesg print
drm/amdgpu: resolve mGPU RAS query instability
...
Linus Torvalds [Fri, 10 Apr 2020 19:27:06 +0000 (12:27 -0700)]
Merge tag 'sound-fix-5.7-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes gathered since the previous update.
ALSA core:
- Regression fix for OSS PCM emulation
ASoC:
- Trivial fixes in reg bit mask ops, DAPM, DPCM and topology
- Lots of fixes for Intel-based devices
- Minor fixes for AMD, STM32, Qualcomm, Realtek
Others:
- Fixes for the bugs in mixer handling in HD-audio and ice1724
drivers that were caught by the recent kctl validator
- New quirks for HD-audio and USB-audio
Also this contains a fix for EDD firmware fix, which slipped from
anyone's hands"
* tag 'sound-fix-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
ALSA: hda: Add driver blacklist
ALSA: usb-audio: Add mixer workaround for TRX40 and co
ALSA: hda/realtek - Add quirk for MSI GL63
ALSA: ice1724: Fix invalid access for enumerated ctl items
ALSA: hda: Fix potential access overflow in beep helper
ASoC: cs4270: pull reset GPIO low then high
ALSA: hda/realtek - Add HP new mute led supported for ALC236
ALSA: hda/realtek - Add supported new mute Led for HP
ASoC: rt5645: Add platform-data for Medion E1239T
ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN MPWIN895CL tablet
ASoC: stm32: sai: Add missing cleanup
ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Alpha S
ASoC: Intel: atom: Fix uninitialized variable compiler warning
ASoC: Intel: atom: Check drv->lock is locked in sst_fill_and_send_cmd_unlocked
ASoC: Intel: atom: Take the drv->lock mutex before calling sst_send_slot_map()
ASoC: SOF: Turn "firmware boot complete" message into a dbg message
ALSA: usb-audio: Add Pioneer DJ DJM-250MK2 quirk
ALSA: pcm: oss: Fix regression by buffer overflow fix (again)
ALSA: pcm: oss: Fix regression by buffer overflow fix
edd: Use scnprintf() for avoiding potential buffer overflow
...
Linus Torvalds [Fri, 10 Apr 2020 19:21:11 +0000 (12:21 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is a batch of changes that didn't make it in the initial pull
request because the lpfc series had to be rebased to redo an incorrect
split.
It's basically driver updates to lpfc, target, bnx2fc and ufs with the
rest being minor updates except the sr_block_release one which fixes a
use after free introduced by the removal of the global mutex in the
first patch set"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (35 commits)
scsi: core: Add DID_ALLOC_FAILURE and DID_MEDIUM_ERROR to hostbyte_table
scsi: ufs: Use ufshcd_config_pwr_mode() when scaling gear
scsi: bnx2fc: fix boolreturn.cocci warnings
scsi: zfcp: use fallthrough;
scsi: aacraid: do not overwrite retval in aac_reset_adapter()
scsi: sr: Fix sr_block_release()
scsi: aic7xxx: Remove more FreeBSD-specific code
scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
scsi: ufs: set device as active power mode after resetting device
scsi: iscsi: Report unbind session event when the target has been removed
scsi: lpfc: Change default SCSI LUN QD to 64
scsi: libfc: rport state move to PLOGI if all PRLI retry exhausted
scsi: libfc: If PRLI rejected, move rport to PLOGI state
scsi: bnx2fc: Update the driver version to 2.12.13
scsi: bnx2fc: Fix SCSI command completion after cleanup is posted
scsi: bnx2fc: Process the RQE with CQE in interrupt context
scsi: target: use the stack for XCOPY passthrough cmds
scsi: target: increase XCOPY I/O size
scsi: target: avoid per-loop XCOPY buffer allocations
scsi: target: drop xcopy DISK BLOCK LENGTH debug
...
Steve French [Fri, 10 Apr 2020 02:42:18 +0000 (21:42 -0500)]
smb3: enable swap on SMB3 mounts
Add experimental support for allowing a swap file to be on an SMB3
mount. There are use cases where swapping over a secure network
filesystem is preferable. In some cases there are no local
block devices large enough, and network block devices can be
hard to setup and secure. And in some cases there are no
local block devices at all (e.g. with the recent addition of
remote boot over SMB3 mounts).
There are various enhancements that can be added later e.g.:
- doing a mandatory byte range lock over the swapfile (until
the Linux VFS is modified to notify the file system that an open
is for a swapfile, when the file can be opened "DENY_ALL" to prevent
others from opening it).
- pinning more buffers in the underlying transport to minimize memory
allocations in the TCP stack under the fs
- documenting how to create ACLs (on the server) to secure the
swapfile (or adding additional tools to cifs-utils to make it easier)
Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Ley Foon Tan [Thu, 16 Jan 2020 00:46:38 +0000 (08:46 +0800)]
MAINTAINERS: Remove nios2-dev@lists.rocketboards.org
nios2-dev@lists.rocketboards.org mailing list is no longer supported,
remove it from MAINTAINERS file.
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Alexandru Ardelean [Fri, 10 Apr 2020 15:41:18 +0000 (23:41 +0800)]
arch: nios2: remove 'resetvalue' property
The 'altr,pio-1.0' driver does not handle the 'resetvalue', so remove it.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Alexandru Ardelean [Fri, 10 Apr 2020 15:40:37 +0000 (23:40 +0800)]
arch: nios2: rename 'altr,gpio-bank-width' -> 'altr,ngpio'
There is no more 'altr,gpio-bank-width' in the 'altr,pio-1.0' driver.
There is a 'altr,ngpio' which is what the property wants to configure.
This change updates all occurrences of 'altr,gpio-bank-width' to
'altr,ngpio'.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Dragos Bogdan [Fri, 10 Apr 2020 15:38:23 +0000 (23:38 +0800)]
arch: nios2: Enable the common clk subsystem on Nios2
This patch adds support for common clock framework on Nios2. Clock
framework is commonly used in many drivers, and this patch makes it
available for the entire architecture, not just on a per-driver basis.
Signed-off-by: Beniamin Bia <beniamin.bia@analog.com>
Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Linus Torvalds [Fri, 10 Apr 2020 17:26:28 +0000 (10:26 -0700)]
Merge tag 'libata-5.7-2020-04-09' of git://git.kernel.dk/linux-block
Pull libata fixes from Jens Axboe:
"A few followup changes/fixes for libata:
- PMP removal fix (Kai-Heng)
- Add remapped NVMe device attribute to sysfs (Kai-Heng)
- Remove redundant assignment (Colin)
- Add yet another Comet Lake ID (Jian-Hong)"
* tag 'libata-5.7-2020-04-09' of git://git.kernel.dk/linux-block:
ahci: Add Intel Comet Lake PCH RAID PCI ID
ata: ahci: Add sysfs attribute to show remapped NVMe device count
ata: ahci-imx: remove redundant assignment to ret
libata: Return correct status in sata_pmp_eh_recover_pm() when ATA_DFLAG_DETACH is set
Linus Torvalds [Fri, 10 Apr 2020 17:06:54 +0000 (10:06 -0700)]
Merge tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Here's a set of fixes that should go into this merge window. This
contains:
- NVMe pull request from Christoph with various fixes
- Better discard support for loop (Evan)
- Only call ->commit_rqs() if we have queued IO (Keith)
- blkcg offlining fixes (Tejun)
- fix (and fix the fix) for busy partitions"
* tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-block:
block: fix busy device checking in blk_drop_partitions again
block: fix busy device checking in blk_drop_partitions
nvmet-rdma: fix double free of rdma queue
blk-mq: don't commit_rqs() if none were queued
nvme-fc: Revert "add module to ops template to allow module references"
nvme: fix deadlock caused by ANA update wrong locking
nvmet-rdma: fix bonding failover possible NULL deref
loop: Better discard support for block devices
loop: Report EOPNOTSUPP properly
nvmet: fix NULL dereference when removing a referral
nvme: inherit stable pages constraint in the mpath stack device
blkcg: don't offline parent blkcg first
blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it
nvme-tcp: fix possible crash in recv error flow
nvme-tcp: don't poll a non-live queue
nvme-tcp: fix possible crash in write_zeroes processing
nvmet-fc: fix typo in comment
nvme-rdma: Replace comma with a semicolon
nvme-fcloop: fix deallocation of working context
nvme: fix compat address handling in several ioctls
Linus Torvalds [Fri, 10 Apr 2020 17:02:21 +0000 (10:02 -0700)]
Merge tag 'io_uring-5.7-2020-04-09' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Here's a set of fixes that either weren't quite ready for the first,
or came about from some intensive testing on memcached with 350K+
sockets.
Summary:
- Fixes for races or deadlocks around poll handling
- Don't double account fixed files against RLIMIT_NOFILE
- IORING_OP_OPENAT LFS fix
- Poll retry handling (Bijan)
- Missing finish_wait() for SQPOLL (Hillf)
- Cleanup/split of io_kiocb alloc vs ctx references (Pavel)
- Fixed file unregistration and init fixes (Xiaoguang)
- Various little fixes (Xiaoguang, Pavel, Colin)"
* tag 'io_uring-5.7-2020-04-09' of git://git.kernel.dk/linux-block:
io_uring: punt final io_ring_ctx wait-and-free to workqueue
io_uring: fix fs cleanup on cqe overflow
io_uring: don't read user-shared sqe flags twice
io_uring: remove req init from io_get_req()
io_uring: alloc req only after getting sqe
io_uring: simplify io_get_sqring
io_uring: do not always copy iovec in io_req_map_rw()
io_uring: ensure openat sets O_LARGEFILE if needed
io_uring: initialize fixed_file_data lock
io_uring: remove redundant variable pointer nxt and io_wq_assign_next call
io_uring: fix ctx refcounting in io_submit_sqes()
io_uring: process requests completed with -EAGAIN on poll list
io_uring: remove bogus RLIMIT_NOFILE check in file registration
io_uring: use io-wq manager as backup task if task is exiting
io_uring: grab task reference for poll requests
io_uring: retry poll if we got woken with non-matching mask
io_uring: add missing finish_wait() in io_sq_thread()
io_uring: refactor file register/unregister/update handling
Linus Torvalds [Fri, 10 Apr 2020 16:54:26 +0000 (09:54 -0700)]
Merge tag 'xfs-5.7-merge-12' of git://git./fs/xfs/xfs-linux
Pull more xfs updates from Darrick Wong:
"As promised last week, this batch changes how xfs interacts with
memory reclaim; how the log batches and throttles log items; how hard
writes near ENOSPC will try to squeeze more space out of the
filesystem; and hopefully fix the last of the umount hangs after a
catastrophic failure.
Summary:
- Validate the realtime geometry in the superblock when mounting
- Refactor a bunch of tricky flag handling in the log code
- Flush the CIL more judiciously so that we don't wait until there
are millions of log items consuming a lot of memory.
- Throttle transaction commits to prevent the xfs frontend from
flooding the CIL with too many log items.
- Account metadata buffers correctly for memory reclaim.
- Mark slabs properly for memory reclaim. These should help reclaim
run more effectively when XFS is using a lot of memory.
- Don't write a garbage log record at unmount time if we're trying to
trigger summary counter recalculation at next mount.
- Don't block the AIL on locked dquot/inode buffers; instead trigger
its backoff mechanism to give the lock holder a chance to finish
up.
- Ratelimit writeback flushing when buffered writes encounter ENOSPC.
- Other minor cleanups.
- Make reflink a synchronous operation when the fs is mounted with
wsync or sync, which means that now we force the log to disk to
record the changes"
* tag 'xfs-5.7-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (26 commits)
xfs: reflink should force the log out if mounted with wsync
xfs: factor out a new xfs_log_force_inode helper
xfs: fix inode number overflow in ifree cluster helper
xfs: remove redundant variable assignment in xfs_symlink()
xfs: ratelimit inode flush on buffered write ENOSPC
xfs: return locked status of inode buffer on xfsaild push
xfs: trylock underlying buffer on dquot flush
xfs: remove unnecessary ternary from xfs_create
xfs: don't write a corrupt unmount record to force summary counter recalc
xfs: factor inode lookup from xfs_ifree_cluster
xfs: tail updates only need to occur when LSN changes
xfs: factor common AIL item deletion code
xfs: correctly acount for reclaimable slabs
xfs: Improve metadata buffer reclaim accountability
xfs: don't allow log IO to be throttled
xfs: Throttle commits on delayed background CIL push
xfs: Lower CIL flush limit for large logs
xfs: remove some stale comments from the log code
xfs: refactor unmount record writing
xfs: merge xlog_commit_record with xlog_write_done
...
Linus Torvalds [Fri, 10 Apr 2020 16:52:15 +0000 (09:52 -0700)]
Merge tag 'acpi-5.7-rc1-3' of git://git./linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These prevent a false-positive static checker warning from triggering
in the ACPI EC driver (Rafael Wysocki), fix white space in an ACPI
document (Vilhelm Prytz) and add static annotation to one variable
(Jason Yan)"
* tag 'acpi-5.7-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI, x86/boot: make acpi_nobgrt static
Documentation: firmware-guide: ACPI: fix table alignment in namespace.rst
ACPI: EC: Fix up fast path check in acpi_ec_add()
Linus Torvalds [Fri, 10 Apr 2020 16:50:00 +0000 (09:50 -0700)]
Merge tag 'pm-5.7-rc1-3' of git://git./linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"Rework compat ioctl handling in the user space hibernation interface
(Christoph Hellwig) and fix a typo in a function name in the cpuidle
haltpoll driver (Yihao Wu)"
* tag 'pm-5.7-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle-haltpoll: Fix small typo
PM / sleep: handle the compat case in snapshot_set_swap_area()
PM / sleep: move SNAPSHOT_SET_SWAP_AREA handling into a helper
Linus Torvalds [Fri, 10 Apr 2020 16:47:26 +0000 (09:47 -0700)]
Merge tag 's390-5.7-2' of git://git./linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:
"Second round of s390 fixes and features for 5.7:
- The rest of fallthrough; annotations conversion
- Couple of fixes for ADD uevents in the common I/O layer
- Minor refactoring of the queued direct I/O code"
* tag 's390-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: generate delayed uevent for vfio-ccw subchannels
s390/cio: avoid duplicated 'ADD' uevents
s390/qdio: clear DSCI early for polling drivers
s390/qdio: inline shared_ind()
s390/qdio: remove cdev from init_data
s390/qdio: allow for non-contiguous SBAL array in init_data
zfcp: inline zfcp_qdio_setup_init_data()
s390/qdio: cleanly split alloc and establish
s390/mm: use fallthrough;
Randy Dunlap [Wed, 8 Apr 2020 17:29:50 +0000 (10:29 -0700)]
Documentation: android: binderfs: add 'stats' mount option
Add documentation of the binderfs 'stats' mount option.
Description taken from the commit message.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/baa0aa81-007d-af46-16a5-91fead0bd1b9@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Christoph Hellwig [Fri, 10 Apr 2020 12:31:47 +0000 (14:31 +0200)]
block: fix busy device checking in blk_drop_partitions again
The previous fix had an off by one in the bd_openers checking, counting
the callers blkdev_get.
Fixes:
d3ef5536274f ("block: fix busy device checking in blk_drop_partitions")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Qian Cai <cai@lca.pw>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rafael J. Wysocki [Fri, 10 Apr 2020 09:32:22 +0000 (11:32 +0200)]
Merge branch 'pm-cpuidle'
* pm-cpuidle:
cpuidle-haltpoll: Fix small typo
Rafael J. Wysocki [Fri, 10 Apr 2020 09:31:43 +0000 (11:31 +0200)]
Merge branches 'acpi-ec' and 'acpi-x86'
* acpi-ec:
ACPI: EC: Fix up fast path check in acpi_ec_add()
* acpi-x86:
ACPI, x86/boot: make acpi_nobgrt static
Jens Axboe [Fri, 10 Apr 2020 00:14:00 +0000 (18:14 -0600)]
io_uring: punt final io_ring_ctx wait-and-free to workqueue
We can't reliably wait in io_ring_ctx_wait_and_kill(), since the
task_works list isn't ordered (in fact it's LIFO ordered). We could
either fix this with a separate task_works list for io_uring work, or
just punt the wait-and-free to async context. This ensures that
task_work that comes in while we're shutting down is processed
correctly. If we don't go async, we could have work past the fput()
work for the ring that depends on work that won't be executed until
after we're done with the wait-and-free. But as this operation is
blocking, it'll never get a chance to run.
This was reproduced with hundreds of thousands of sockets running
memcached, haven't been able to reproduce this synthetically.
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Dave Airlie [Thu, 9 Apr 2020 20:42:52 +0000 (06:42 +1000)]
Merge tag 'amd-drm-fixes-5.7-2020-04-08' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-fixes-5.7-2020-04-08:
amdgpu:
- Various Renoir fixes
- Fix gfx clockgating sequence on gfx10
- RAS fixes
- Avoid MST property creation after registration
- Various cursor/viewport fixes
- Fix a confusing log message about optional firmwares
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200408222240.3942-1-alexander.deucher@amd.com