Vincent Chen [Tue, 9 Jun 2020 14:14:49 +0000 (22:14 +0800)]
riscv: set the permission of vdso_data to read-only
The original vdso_data page is empty, so its permission can be the same
as that of the vdso text page. After introducing the vDSO common flow,
vdso_data is no longer empty, and its permission should be changed to
read-only.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Vincent Chen [Tue, 9 Jun 2020 14:14:48 +0000 (22:14 +0800)]
riscv: use vDSO common flow to reduce the latency of the time-related functions
Although RISC-V supports the vDSO feature, the functions that obtain the
system time still have high latency. This is because these functions
still trigger a corresponding system call, which slows down the response
time. To remove the system call and reduce the latency, the kernel must
be able to export the system clock information to userspace. This patch
introduces the vDSO common flow to enable this and uses the "rdtime"
instruction to obtain the current time in user space. As a result, the
latency caused by the ecall from U-mode to S-mode is eliminated. After
applying this patch, the latency of gettimeofday() measured on the
HiFive Unleashed board is reduced by 61%.
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
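The latency claim in "riscv: use vDSO common flow to reduce the latency of
the time-related functions" can be checked from userspace with a rough
micro-benchmark. The following is a minimal sketch, not the author's
benchmark: the helper name avg_gettimeofday_ns() is hypothetical, and the
numbers it produces depend on the board, the libc, and whether the vDSO
path is actually taken.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/*
 * Hypothetical helper: average cost of one gettimeofday() call in
 * nanoseconds, measured over `iterations` back-to-back calls.
 */
static double avg_gettimeofday_ns(long iterations)
{
    struct timespec start, end;
    struct timeval tv;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        gettimeofday(&tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* Elapsed wall time in nanoseconds, divided by the call count. */
    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
                        (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Running the helper with a large iteration count on kernels with and
without the patch gives a comparable per-call figure; a result in the tens
of nanoseconds rather than hundreds usually indicates that no system call
was made.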
Christoph Hellwig [Thu, 11 Jun 2020 01:42:10 +0000 (18:42 -0700)]
kernel: set USER_DS in kthread_use_mm
Some architectures like arm64 and s390 require USER_DS to be set for
kernel threads to access user address space, which is the whole purpose of
kthread_use_mm, but others like x86 don't. That has led to a huge mess
where some callers are fixed up once they are tested on said
architectures, while others linger around and yet others like io_uring try
to do "clever" optimizations for what usually is just a trivial assignment
to a member in the thread_struct for most architectures.
Make kthread_use_mm set USER_DS, and kthread_unuse_mm restore to the
previous value instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-7-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Thu, 11 Jun 2020 01:42:06 +0000 (18:42 -0700)]
kernel: better document the use_mm/unuse_mm API contract
Switch the function documentation to kerneldoc comments, and add
WARN_ON_ONCE asserts that the calling thread is a kernel thread and does
not have ->mm set (or has ->mm set in the case of unuse_mm).
Also give the functions a kthread_ prefix to better document the use case.
[hch@lst.de: fix a comment typo, cover the newly merged use_mm/unuse_mm caller in vfio]
Link: http://lkml.kernel.org/r/20200416053158.586887-3-hch@lst.de
[sfr@canb.auug.org.au: powerpc/vas: fix up for {un}use_mm() rename]
Link: http://lkml.kernel.org/r/20200422163935.5aa93ba5@canb.auug.org.au
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [usb]
Acked-by: Haren Myneni <haren@linux.ibm.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Link: http://lkml.kernel.org/r/20200404094101.672954-6-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Thu, 11 Jun 2020 01:42:03 +0000 (18:42 -0700)]
kernel: move use_mm/unuse_mm to kthread.c
cover the newly merged use_mm/unuse_mm caller in vfio
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Link: http://lkml.kernel.org/r/20200416053158.586887-2-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Thu, 11 Jun 2020 01:41:59 +0000 (18:41 -0700)]
kernel: move use_mm/unuse_mm to kthread.c
Patch series "improve use_mm / unuse_mm", v2.
This series improves the use_mm / unuse_mm interface by better documenting
the assumptions, and by taking the set_fs manipulations spread over the
callers into the core API.
This patch (of 3):
Use the proper API instead.
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
These helpers are only for use with kernel threads, and I will tie them
more into the kthread infrastructure going forward. Also move the
prototypes to kthread.h - mmu_context.h was a little weird to start with
as it otherwise contains very low-level MM bits.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200416053158.586887-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200404094101.672954-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Walter Wu [Thu, 11 Jun 2020 01:41:56 +0000 (18:41 -0700)]
stacktrace: cleanup inconsistent variable type
Modify the variable type of the 'skip' member of struct stack_trace.
In theory, the 'skip' variable type should be unsigned int.
There are two reasons:
- 'skip' only takes two kinds of values: 1) positive, 2) zero.
- The 'skip' member of struct stack_trace has a different type from the
one in struct stack_trace_data, which makes the relationship between
struct stack_trace and stack_trace_data a bit confusing.
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: http://lkml.kernel.org/r/20200421013511.5960-1-walter-zh.wu@mediatek.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wei Yang [Thu, 11 Jun 2020 01:41:53 +0000 (18:41 -0700)]
lib: test get_count_order/long in test_bitops.c
Add some tests for get_count_order/long in test_bitops.c.
[akpm@linux-foundation.org: define local `i']
[akpm@linux-foundation.org: enhancement, warning fix, cleanup per Geert]
[akpm@linux-foundation.org: fix loop bound, per Wei Yang]
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/20200602223728.32722-1-richard.weiyang@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
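To illustrate what tests like the ones in "lib: test
get_count_order/long in test_bitops.c" check, here is a userspace sketch.
It assumes that get_count_order() returns -1 for 0 and otherwise the
smallest order such that (1 << order) >= count; the *_model functions are
hypothetical stand-ins rather than the kernel code, and __builtin_clz() is
a GCC/Clang builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Model of fls(): 1-based index of the highest set bit, 0 when x == 0. */
static int fls_model(unsigned int x)
{
    return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* Model of get_count_order(): -1 for 0, otherwise ceil(log2(count)). */
static int get_count_order_model(unsigned int count)
{
    if (count == 0)
        return -1;
    return fls_model(count - 1);
}

/* Table-driven check in the spirit of a bitops self-test. */
static void check_count_order(void)
{
    static const struct {
        unsigned int count;
        int order;
    } cases[] = {
        { 0x00000001u, 0 },
        { 0x00000002u, 1 },
        { 0x00000003u, 2 },
        { 0x00000004u, 2 },
        { 0x00000005u, 3 },
        { 0x80000000u, 31 },
        { 0x80000001u, 32 },
    };

    for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
        assert(get_count_order_model(cases[i].count) == cases[i].order);
}
```

The table form keeps each expectation on one line, which makes boundary
cases around powers of two easy to audit.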
Ben Widawsky [Thu, 11 Jun 2020 01:41:50 +0000 (18:41 -0700)]
mm: add comments on pglist_data zones
While making other modifications it was easy to confuse the two struct
members node_zones and node_zonelists. For those already familiar with
the code, this might seem to be a silly patch, but it's quite helpful to
disambiguate the similar-sounding fields.
While here, add a small comment on why nr_zones isn't simply MAX_NR_ZONES.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200520205443.2757414-1-ben.widawsky@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Keyur Patel [Thu, 11 Jun 2020 01:41:47 +0000 (18:41 -0700)]
ocfs2: fix spelling mistake and grammar
./ocfs2/mmap.c:65: bebongs ==> belonging
Signed-off-by: Keyur Patel <iamkeyur96@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200608014818.102358-1-iamkeyur96@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aneesh Kumar K.V [Thu, 11 Jun 2020 01:41:44 +0000 (18:41 -0700)]
mm/debug_vm_pgtable: fix kernel crash by checking for THP support
Architectures can have CONFIG_TRANSPARENT_HUGEPAGE enabled while THP
support is not available on a given platform. For example, with 4K
PAGE_SIZE, ppc64 supports THP only with radix translation.
This results in the crash below when running with hash translation and 4K
PAGE_SIZE:
kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:140!
cpu 0x61: Vector: 700 (Program Check) at [c000000ff948f860]
pc: debug_vm_pgtable+0x480/0x8b0
lr: debug_vm_pgtable+0x474/0x8b0
...
debug_vm_pgtable+0x374/0x8b0 (unreliable)
do_one_initcall+0x98/0x4f0
kernel_init_freeable+0x330/0x3fc
kernel_init+0x24/0x148
Check for THP support correctly
Link: http://lkml.kernel.org/r/20200608125252.407659-1-aneesh.kumar@linux.ibm.com
Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Gordeev [Thu, 11 Jun 2020 01:41:41 +0000 (18:41 -0700)]
lib: fix bitmap_parse() on 64-bit big endian archs
Commit 2d6261583be0 ("lib: rework bitmap_parse()") does not take into
account the order of halfwords on 64-bit big endian architectures. As a
result, (at least) Receive Packet Steering, IRQ affinity masks and the
runtime kernel test "test_bitmap" are broken on s390.
[andriy.shevchenko@linux.intel.com: convert infinite while loop to a for loop]
Link: http://lkml.kernel.org/r/20200609140535.87160-1-andriy.shevchenko@linux.intel.com
Fixes: 2d6261583be0 ("lib: rework bitmap_parse()")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Amritha Nambiar <amritha.nambiar@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: "Tobin C . Harding" <tobin@kernel.org>
Cc: Vineet Gupta <vineet.gupta1@synopsys.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1591634471-17647-1-git-send-email-agordeev@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
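The halfword-order problem described in "lib: fix bitmap_parse() on 64-bit
big endian archs" can be pictured with a small userspace sketch. It is not
the kernel fix: the helper names are hypothetical, and the point is only
that placing each parsed 32-bit chunk with shifts on an unsigned long is
independent of byte order, whereas writing the chunks through a 32-bit
pointer into an array of 64-bit longs would put chunk 0 into the upper
half of the first word on a big endian machine.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Store 32-bit chunk number `chunk` (0 = least significant) in the bitmap. */
static void bitmap_set_chunk32(unsigned long *bitmap, unsigned int chunk,
                               uint32_t value)
{
    unsigned int bit = chunk * 32;
    unsigned long *word = &bitmap[bit / LONG_BITS];
    unsigned int shift = bit % LONG_BITS;

    *word &= ~((unsigned long)UINT32_MAX << shift);
    *word |= (unsigned long)value << shift;
}

/* Read 32-bit chunk number `chunk` back from the bitmap. */
static uint32_t bitmap_get_chunk32(const unsigned long *bitmap,
                                   unsigned int chunk)
{
    unsigned int bit = chunk * 32;

    return (uint32_t)(bitmap[bit / LONG_BITS] >> (bit % LONG_BITS));
}

static void check_chunk_order(void)
{
    unsigned long bitmap[2] = { 0, 0 };

    bitmap_set_chunk32(bitmap, 0, 0xdeadbeefu);
    bitmap_set_chunk32(bitmap, 1, 0x00000001u);

    assert(bitmap_get_chunk32(bitmap, 0) == 0xdeadbeefu);
    assert(bitmap_get_chunk32(bitmap, 1) == 0x00000001u);
    /* Chunk 0 always occupies the low 32 bits of word 0. */
    assert((bitmap[0] & 0xffffffffUL) == 0xdeadbeefUL);
}
```

Because only arithmetic on unsigned long values is used, the same checks
hold on 32-bit and 64-bit targets of either endianness.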
Tim Froidcoeur [Thu, 11 Jun 2020 01:41:38 +0000 (18:41 -0700)]
checkpatch: correct check for kernel parameters doc
Adding a new kernel parameter with documentation makes checkpatch complain:
__setup appears un-documented -- check Documentation/admin-guide/kernel-parameters.rst
The list of kernel parameters has moved to a separate txt file, but
checkpatch has not been updated for this.
Make checkpatch.pl look for the documentation for new kernel parameters
in kernel-parameters.txt instead of kernel-parameters.rst.
Fixes: e52347bd66f6 ("Documentation/admin-guide: split the kernel parameter list to a separate file")
Signed-off-by: Tim Froidcoeur <tim.froidcoeur@tessares.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Thu, 11 Jun 2020 01:41:35 +0000 (18:41 -0700)]
nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
After commit c3aab9a0bd91 ("mm/filemap.c: don't initiate writeback if
mapping has no dirty pages"), the following null pointer dereference has
been reported on nilfs2:
BUG: kernel NULL pointer dereference, address: 00000000000000a8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
...
RIP: 0010:percpu_counter_add_batch+0xa/0x60
...
Call Trace:
__test_set_page_writeback+0x2d3/0x330
nilfs_segctor_do_construct+0x10d3/0x2110 [nilfs2]
nilfs_segctor_construct+0x168/0x260 [nilfs2]
nilfs_segctor_thread+0x127/0x3b0 [nilfs2]
kthread+0xf8/0x130
...
This crash turned out to be caused by the set_page_writeback() call for
segment summary buffers in nilfs_segctor_prepare_write().
set_page_writeback() can call inc_wb_stat(inode_to_wb(inode),
WB_WRITEBACK), where inode_to_wb(inode) is NULL if the inode of the
underlying block device does not have an associated wb.
This fixes the issue by calling inode_attach_wb() in advance to ensure
that the bdev inode is associated with its wb.
Fixes: c3aab9a0bd91 ("mm/filemap.c: don't initiate writeback if mapping has no dirty pages")
Reported-by: Walton Hoops <me@waltonhoops.com>
Reported-by: Tomas Hlavaty <tom@logand.com>
Reported-by: ARAI Shun-ichi <hermes@ceres.dti.ne.jp>
Reported-by: Hideki EIRAKU <hdk1983@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org> [5.4+]
Link: http://lkml.kernel.org/r/20200608.011819.1399059588922299158.konishi.ryusuke@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 11 Jun 2020 01:41:32 +0000 (18:41 -0700)]
lib/lz4/lz4_decompress.c: document deliberate use of `&'
This operation was intentional, but tools such as smatch will warn that it
might not have been.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Yann Collet <cyan@fb.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Gao Xiang <hsiangkao@aol.com>
Link: http://lkml.kernel.org/r/3bf931c6ea0cae3e23f3485801986859851b4f04.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
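For readers wondering why a bitwise `&' between two conditions can be
intentional, as in "lib/lz4/lz4_decompress.c: document deliberate use of
`&'", here is a generic sketch. It is not the lz4 code; the helper name and
parameters are hypothetical. When both operands are side-effect-free
comparisons that evaluate to 0 or 1, `&' yields the same truth value as
`&&' but does not require a short-circuit branch between them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true when both the input and the output buffer
 * still have at least `need` bytes left.
 */
static bool both_have_room(size_t in_left, size_t out_left, size_t need)
{
    /*
     * Deliberate bitwise '&': each comparison yields 0 or 1 and has no
     * side effects, so the result equals '&&' without short-circuiting.
     */
    return (in_left >= need) & (out_left >= need);
}

static void check_both_have_room(void)
{
    assert(both_have_room(16, 16, 8));
    assert(!both_have_room(4, 16, 8));
    assert(!both_have_room(16, 4, 8));
    assert(!both_have_room(4, 4, 8));
}
```

A comment next to the operator, as the commit adds, tells both human
readers and static analyzers that the choice was not a typo for `&&'.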
Andrey Konovalov [Thu, 11 Jun 2020 01:41:28 +0000 (18:41 -0700)]
kcov: check kcov_softirq in kcov_remote_stop()
kcov_remote_stop() should check that the corresponding kcov_remote_start()
actually found the specified remote handle and started collecting
coverage. This is done by checking the per thread kcov_softirq flag.
A particular failure scenario where this was observed involved a softirq
with a remote coverage collection section coming between check_kcov_mode()
and the access to t->kcov_area in __sanitizer_cov_trace_pc(). In that
softirq kcov_remote_start() bailed out after kcov_remote_find() check, but
the matching kcov_remote_stop() didn't check if kcov_remote_start()
succeeded, and overwrote per thread kcov parameters with invalid (zero)
values.
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Link: http://lkml.kernel.org/r/fcd1cd16eac1d2c01a66befd8ea4afc6f8d09833.1591576806.git.andreyknvl@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Thu, 11 Jun 2020 01:41:25 +0000 (18:41 -0700)]
scripts/spelling: add a few more typos
This commit adds typos I found while working on something else.
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200605092502.18018-3-sjpark@amazon.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 11 Jun 2020 01:41:22 +0000 (18:41 -0700)]
khugepaged: selftests: fix timeout condition in wait_for_scan()
The loop exits with "timeout" set to -1 and not to 0 so the test needs to
be fixed.
Fixes: e7b592f6caca ("khugepaged: add self test")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Zi Yan <ziy@nvidia.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200605110736.GH978434@mwanda
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
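The off-by-one described in "khugepaged: selftests: fix timeout condition
in wait_for_scan()" is a common C pitfall with post-decrement loops. The
sketch below is a generic model with hypothetical names, not the selftest
itself: a `while (timeout--)' loop that runs out of attempts leaves the
counter at -1, while 0 means the condition became true on the very last
attempt.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Poll `ready` up to `timeout` times and return the counter value as it
 * is when the loop ends.
 */
static int poll_with_timeout(bool (*ready)(void), int timeout)
{
    while (timeout--) {
        if (ready())
            break;
    }
    return timeout;
}

static bool never_ready(void)
{
    return false;
}

static bool ready_on_first_try(void)
{
    return true;
}

static void check_timeout_value(void)
{
    /* All attempts used up: the post-decrement leaves -1, not 0. */
    assert(poll_with_timeout(never_ready, 3) == -1);
    /* Success on the last allowed attempt leaves 0, which is not a timeout. */
    assert(poll_with_timeout(ready_on_first_try, 1) == 0);
}
```

Testing the counter against 0 after such a loop therefore both misses real
timeouts and misreports a last-attempt success, which is why the check has
to compare against -1.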
Dave Airlie [Thu, 11 Jun 2020 01:49:03 +0000 (11:49 +1000)]
Merge tag 'drm-intel-next-fixes-2020-06-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Avoid use after free in cmdparser
- Avoid NULL dereference when probing all display encoders
- Fixup to module parameter type
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200610093700.GA8599@jlahtine-desk.ger.corp.intel.com
Linus Torvalds [Thu, 11 Jun 2020 01:09:13 +0000 (18:09 -0700)]
Merge branch 'work.epoll' of git://git./linux/kernel/git/viro/vfs
Pull epoll update from Al Viro:
"epoll conversion to read_iter from Jens; I thought there might be more
epoll stuff this cycle, but uaccess took too much time"
* 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
eventfd: convert to f_op->read_iter()
Jiufei Xue [Wed, 10 Jun 2020 05:41:59 +0000 (13:41 +0800)]
io_uring: check file O_NONBLOCK state for accept
If the socket is O_NONBLOCK, we should complete the accept request
with -EAGAIN when data is not ready.
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
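The behavior that "io_uring: check file O_NONBLOCK state for accept" asks
io_uring to honor is the one plain accept(2) already has. The following
userspace sketch uses only the regular socket API, not io_uring, and the
helper names are hypothetical: an O_NONBLOCK listening socket with no
pending connection fails immediately with EAGAIN (or EWOULDBLOCK) instead
of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns the errno of accept() on an idle non-blocking listener,
 * 0 if accept() unexpectedly succeeded, or -1 if the socket could not
 * be set up in this environment.
 */
static int idle_accept_errno(void)
{
    struct sockaddr_in addr;
    int result = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = 0; /* let the kernel pick a free port */

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags >= 0 &&
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0 &&
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(fd, 1) == 0) {
        int conn = accept(fd, NULL, NULL);

        if (conn >= 0) {
            close(conn);
            result = 0;
        } else {
            result = errno;
        }
    }

    close(fd);
    return result;
}

static void check_idle_accept(void)
{
    int err = idle_accept_errno();

    if (err != -1) /* -1: sockets are unavailable here, nothing to check */
        assert(err == EAGAIN || err == EWOULDBLOCK);
}
```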
Xiaoguang Wang [Wed, 10 Jun 2020 11:41:20 +0000 (19:41 +0800)]
io_uring: avoid unnecessary io_wq_work copy for fast poll feature
The IORING_OP_POLL_ADD command and the async armed poll handlers for
regular commands basically do not touch io_wq_work, so the io_wq_work
copy and restore only need to be done when REQ_F_WORK_INITIALIZED is set.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Xiaoguang Wang [Wed, 10 Jun 2020 11:41:19 +0000 (19:41 +0800)]
io_uring: avoid whole io_wq_work copy for requests completed inline
If requests can be submitted and completed inline, we don't need to
initialize the whole io_wq_work in io_init_req(), which is an expensive
operation. Add a new 'REQ_F_WORK_INITIALIZED' flag to determine whether
io_wq_work is initialized, and add a helper io_req_init_async(). Users
must call io_req_init_async() before touching any members of io_wq_work
for the first time.
I used /dev/nullb0 to evaluate the performance improvement on my physical
machine:
modprobe null_blk nr_devices=1 completion_nsec=0
sudo taskset -c 60 fio -name=fiotest -filename=/dev/nullb0 -iodepth=128
-thread -rw=read -ioengine=io_uring -direct=1 -bs=4k -size=100G -numjobs=1
-time_based -runtime=120
before this patch:
Run status group 0 (all jobs):
READ: bw=724MiB/s (759MB/s), 724MiB/s-724MiB/s (759MB/s-759MB/s),
io=84.8GiB (91.1GB), run=120001-120001msec
With this patch:
Run status group 0 (all jobs):
READ: bw=761MiB/s (798MB/s), 761MiB/s-761MiB/s (798MB/s-798MB/s),
io=89.2GiB (95.8GB), run=120001-120001msec
About 5% improvement.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
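The lazy-initialization idea in "io_uring: avoid whole io_wq_work copy for
requests completed inline" can be modeled in a few lines of userspace C.
This is only a sketch of the pattern: the struct layouts, the flag value
and the function names ending in _model are hypothetical and do not mirror
the io_uring definitions.

```c
#include <assert.h>
#include <string.h>

#define REQ_F_WORK_INITIALIZED_MODEL (1u << 0)

/* Stand-in for a work item that is expensive to clear. */
struct work_model {
    char state[256];
};

struct req_model {
    unsigned int flags;
    struct work_model work;
};

/* Fast path: set up the request without touching the embedded work item. */
static void req_init_model(struct req_model *req)
{
    req->flags = 0;
}

/* Slow path: initialize the work item on first use and remember that. */
static struct work_model *req_init_async_model(struct req_model *req)
{
    if (!(req->flags & REQ_F_WORK_INITIALIZED_MODEL)) {
        memset(&req->work, 0, sizeof(req->work));
        req->flags |= REQ_F_WORK_INITIALIZED_MODEL;
    }
    return &req->work;
}

static void check_lazy_init(void)
{
    struct req_model req;

    memset(&req, 0xff, sizeof(req)); /* simulate stale memory */
    req_init_model(&req);
    assert(!(req.flags & REQ_F_WORK_INITIALIZED_MODEL));

    assert(req_init_async_model(&req)->state[0] == 0);
    assert(req.flags & REQ_F_WORK_INITIALIZED_MODEL);
}
```

Requests that complete inline never reach the slow path, so they skip the
cost of clearing the work item entirely; every other user must go through
the accessor before reading or writing it.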
Linus Torvalds [Wed, 10 Jun 2020 23:09:11 +0000 (16:09 -0700)]
Merge branch 'work.misc' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of trivial patches that fell through the cracks last cycle"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: fix indentation in deactivate_super()
vfs: Remove duplicated d_mountpoint check in __is_local_mountpoint
Linus Torvalds [Wed, 10 Jun 2020 23:05:54 +0000 (16:05 -0700)]
Merge branch 'work.sysctl' of git://git./linux/kernel/git/viro/vfs
Pull sysctl fixes from Al Viro:
"Fixups to regressions in sysctl series"
* 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sysctl: reject gigantic reads/write to sysctl files
cdrom: fix an incorrect __user annotation on cdrom_sysctl_info
trace: fix an incorrect __user annotation on stack_trace_sysctl
random: fix an incorrect __user annotation on proc_do_entropy
net/sysctl: remove leftover __user annotations on neigh_proc_dointvec*
net/sysctl: use cpumask_parse in flow_limit_cpu_sysctl
Linus Torvalds [Wed, 10 Jun 2020 23:04:27 +0000 (16:04 -0700)]
Merge branch 'uaccess.i915' of git://git./linux/kernel/git/viro/vfs
Pull i915 uaccess updates from Al Viro:
"Low-hanging fruit in i915; there are several trickier followups, but
that'll wait for the next cycle"
* 'uaccess.i915' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
i915:get_engines(): get rid of pointless access_ok()
i915: alloc_oa_regs(): get rid of pointless access_ok()
i915 compat ioctl(): just use drm_ioctl_kernel()
i915: switch copy_perf_config_registers_or_number() to unsafe_put_user()
i915: switch query_{topology,engine}_info() to copy_to_user()
Linus Torvalds [Wed, 10 Jun 2020 23:02:54 +0000 (16:02 -0700)]
Merge branch 'uaccess.misc' of git://git./linux/kernel/git/viro/vfs
Pull misc uaccess updates from Al Viro:
"Assorted uaccess patches for this cycle - the stuff that didn't fit
into thematic series"
* 'uaccess.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
bpf: make bpf_check_uarg_tail_zero() use check_zeroed_user()
x86: kvm_hv_set_msr(): use __put_user() instead of 32bit __clear_user()
user_regset_copyout_zero(): use clear_user()
TEST_ACCESS_OK _never_ had been checked anywhere
x86: switch cp_stat64() to unsafe_put_user()
binfmt_flat: don't use __put_user()
binfmt_elf_fdpic: don't use __... uaccess primitives
binfmt_elf: don't bother with __{put,copy_to}_user()
pselect6() and friends: take handling the combined 6th/7th args into helper
Linus Torvalds [Wed, 10 Jun 2020 22:00:11 +0000 (15:00 -0700)]
Merge branch 'proc-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull proc fix from Eric Biederman:
"Syzbot found a NULL pointer dereference if kzalloc of s_fs_info fails"
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: s_fs_info may be NULL when proc_kill_sb is called
Linus Torvalds [Wed, 10 Jun 2020 21:46:54 +0000 (14:46 -0700)]
Merge branch 'rwonce/rework' of git://git./linux/kernel/git/will/linux
Pull READ/WRITE_ONCE rework from Will Deacon:
"This is the READ_ONCE rework I've been working on for a while, which
bumps the minimum GCC version and improves code-gen on arm64 when
stack protector is enabled"
[ Side note: I'm _really_ tempted to raise the minimum gcc version to
4.9, so that we can just say that we require _Generic() support.
That would allow us to more cleanly handle a lot of the cases where we
depend on very complex macros with 'sizeof' or __builtin_choose_expr()
with __builtin_types_compatible_p() etc.
This branch has a workaround for sparse not handling _Generic(),
either, but that was already fixed in the sparse development branch,
so it's really just gcc-4.9 that we'd require. - Linus ]
* 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse
compiler_types.h: Optimize __unqual_scalar_typeof compilation time
compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long)
compiler-types.h: Include naked type in __pick_integer_type() match
READ_ONCE: Fix comment describing 2x32-bit atomicity
gcov: Remove old GCC 3.4 support
arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
READ_ONCE: Drop pointer qualifiers when reading from scalar types
READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
arm64: csum: Disable KASAN for do_csum()
fault_inject: Don't rely on "return value" from WRITE_ONCE()
net: tls: Avoid assigning 'const' pointer to non-const pointer
netfilter: Avoid assigning 'const' pointer to non-const pointer
compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
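As background for the _Generic() discussion in the 'rwonce/rework' merge,
the sketch below shows how a C11 generic selection can strip qualifiers
from a scalar type. It is a simplified, hypothetical model in the spirit of
__unqual_scalar_typeof, covering only a few integer types, and it relies
on __typeof__, a GNU extension supported by GCC and Clang.

```c
#include <assert.h>

/*
 * The controlling expression of _Generic undergoes lvalue conversion, so
 * a 'const volatile int' lvalue selects the plain 'int' association.
 * Types that are not listed fall through to 'default' and keep their
 * original type.
 */
#define unqual_scalar_typeof_model(x) __typeof__(_Generic((x), \
        int: (int)0,                                            \
        unsigned int: (unsigned int)0,                          \
        long: (long)0,                                          \
        unsigned long: (unsigned long)0,                        \
        default: (x)))

/* Read through a qualified pointer into a modifiable temporary. */
static int load_and_increment(const volatile int *p)
{
    unqual_scalar_typeof_model(*p) tmp = *p; /* tmp is a plain int */

    tmp++; /* would not compile if tmp were still const-qualified */
    return tmp;
}

static void check_unqual(void)
{
    int value = 41;

    assert(load_and_increment(&value) == 42);
    assert(value == 41); /* the source object is left untouched */
}
```

Compared with chains of __builtin_choose_expr() and
__builtin_types_compatible_p(), the generic selection lists each type once
and stays readable as more types are added.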
Andrew Morton [Wed, 10 Jun 2020 21:34:02 +0000 (14:34 -0700)]
arch/powerpc/mm/pgtable.c: another missed conversion
Fixes: e05c7b1f2bc4b7 ("mm: pgtable: add shortcuts for accessing kernel PMD and PTE")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 10 Jun 2020 21:12:15 +0000 (14:12 -0700)]
Merge tag 'docs-5.8-2' of git://git.lwn.net/linux
Pull more documentation updates from Jonathan Corbet:
"A handful of late-arriving docs fixes, along with a patch changing a
lot of HTTP links to HTTPS that had to be yanked and redone before the
first pull"
* tag 'docs-5.8-2' of git://git.lwn.net/linux:
docs/memory-barriers.txt/kokr: smp_mb__{before,after}_atomic(): update Documentation
Documentation: devres: add missing entry for devm_platform_get_and_ioremap_resource()
Replace HTTP links with HTTPS ones: documentation
docs: it_IT: address invalid reference warnings
doc: zh_CN: use doc reference to resolve undefined label warning
docs: Update the location of the LF NDA program
docs: dev-tools: coccinelle: underlines
Linus Torvalds [Wed, 10 Jun 2020 21:09:08 +0000 (14:09 -0700)]
Merge tag 'acpi-5.8-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"Update the ACPICA code in the kernel to upstream revision 20200528
with the following changes:
- Remove some dead code from the acpidump utility (Bob Moore)
- Add new OperationRegion subtype keyword PlatformRtMechanism to the
compiler (Erik Kaneda)"
* tag 'acpi-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Update version to 20200528
ACPICA: iASL: add new OperationRegion subtype keyword PlatformRtMechanism
ACPICA: acpidump: Removed dead code from oslinuxtbl.c
Linus Torvalds [Wed, 10 Jun 2020 21:04:39 +0000 (14:04 -0700)]
Merge tag 'pm-5.8-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are operating performance points (OPP) framework updates mostly,
including support for interconnect bandwidth in the OPP core, plus a
few cpufreq changes, including boost support in the CPPC cpufreq
driver, an ACPI device power management fix and a hibernation code
cleanup.
Specifics:
- Add support for interconnect bandwidth to the OPP core (Georgi
Djakov, Saravana Kannan, Sibi Sankar, Viresh Kumar).
- Add support for regulator enable/disable to the OPP core (Kamil
Konieczny).
- Add boost support to the CPPC cpufreq driver (Xiongfeng Wang).
- Make the tegra186 cpufreq driver set the
CPUFREQ_NEED_INITIAL_FREQ_CHECK flag (Mian Yousaf Kaukab).
- Prevent the ACPI power management from using power resources with
devices where the list of power resources for power state D0 (full
power) is missing (Rafael Wysocki).
- Annotate a hibernation-related function with __init (Christophe
JAILLET)"
* tag 'pm-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Avoid using power resources if there are none for D0
cpufreq: CPPC: add SW BOOST support
cpufreq: change '.set_boost' to act on one policy
PM: hibernate: Add __init annotation to swsusp_header_init()
opp: Don't parse icc paths unnecessarily
opp: Remove bandwidth votes when target_freq is zero
opp: core: add regulators enable and disable
opp: Reorder the code for !target_freq case
opp: Expose bandwidth information via debugfs
cpufreq: dt: Add support for interconnect bandwidth scaling
opp: Update the bandwidth on OPP frequency changes
opp: Add sanity checks in _read_opp_key()
opp: Add support for parsing interconnect bandwidth
cpufreq: tegra186: add CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
OPP: Add helpers for reading the binding properties
dt-bindings: opp: Introduce opp-peak-kBps and opp-avg-kBps bindings
Linus Torvalds [Wed, 10 Jun 2020 20:51:47 +0000 (13:51 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a new driver for the Azoteq IQS269A capacitive touch controller
- a new driver for the Cypress CY8CTMA140 touchscreen
- updates to Elan and ft5x06 touchscreen drivers
- assorted driver fixes
- msm-vibrator has been removed as we have a more generic solution
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (28 commits)
Input: adi - work around module name confict
Input: iqs269a - add missing I2C dependency
Input: elants - refactor elants_i2c_execute_command()
Input: elants - override touchscreen info with DT properties
Input: elants - remove unused axes
Input: add support for Azoteq IQS269A
dt-bindings: input: Add bindings for Azoteq IQS269A
Input: imx_sc_key - use devm_add_action_or_reset() to handle all cleanups
Input: remove msm-vibrator driver
dt-bindings: Input: remove msm-vibrator
Input: elants_i2c - provide an attribute to show calibration count
Input: introduce input_mt_report_slot_inactive()
dt-bindings: input: touchscreen: elants_i2c: convert to YAML
Input: add driver for the Cypress CY8CTMA140 touchscreen
dt-bindings: touchscreen: Add CY8CTMA140 bindings
Input: edt-ft5x06 - prefer asynchronous probe
Input: edt-ft5x06 - improve power management operations
Input: edt-ft5x06 - move parameter restore into helper
Input: edt-ft5x06 - fix get_default register write access
Input: atkbd - receive and use physcode->keycode mapping from FW
...
Linus Torvalds [Wed, 10 Jun 2020 20:42:09 +0000 (13:42 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- virtio-mem: paravirtualized memory hotplug
- support doorbell mapping for vdpa
- config interrupt support in ifc
- fixes all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits)
vhost/test: fix up after API change
virtio_mem: convert device block size into 64bit
virtio-mem: drop unnecessary initialization
ifcvf: implement config interrupt in IFCVF
vhost: replace -1 with VHOST_FILE_UNBIND in ioctls
vhost_vdpa: Support config interrupt in vdpa
ifcvf: ignore continuous setting same status value
virtio-mem: Don't rely on implicit compiler padding for requests
virtio-mem: Try to unplug the complete online memory block first
virtio-mem: Use -ETXTBSY as error code if the device is busy
virtio-mem: Unplug subblocks right-to-left
virtio-mem: Drop manual check for already present memory
virtio-mem: Add parent resource for all added "System RAM"
virtio-mem: Better retry handling
virtio-mem: Offline and remove completely unplugged memory blocks
mm/memory_hotplug: Introduce offline_and_remove_memory()
virtio-mem: Allow to offline partially unplugged memory blocks
mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE
virtio-mem: Paravirtualized memory hotunplug part 2
virtio-mem: Paravirtualized memory hotunplug part 1
...
Linus Torvalds [Wed, 10 Jun 2020 20:25:40 +0000 (13:25 -0700)]
Merge tag 'for-linus-5.8-rc1' of git://git./linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- Use fdatasync() in ubd
- Add a generic "fd" vector transport
- Minor cleanups and fixes
* tag 'for-linus-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: virtio: Replace zero-length array with flexible-array
um: Use fdatasync() when mapping the UBD FSYNC command
um: Do not evaluate compiler's library path when cleaning
um: Neaten vu_err macro definition
um: Add a generic "fd" vector transport
um: Add include: memset() and memcpy() are in <string.h>
Linus Torvalds [Wed, 10 Jun 2020 20:24:40 +0000 (13:24 -0700)]
Merge tag 'for-linus-5.8-rc1' of git://git./linux/kernel/git/rw/ubifs
Pull UBI update from Richard Weinberger:
"This contains a single change for UBI:
- Select fastmap anchor PEBs considering wear level rules"
* tag 'for-linus-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubi: Select fastmap anchor PEBs considering wear level rules
Linus Torvalds [Wed, 10 Jun 2020 20:15:17 +0000 (13:15 -0700)]
Merge tag 'mtd/for-5.8' of git://git./linux/kernel/git/mtd/linux
Pull MTD updates from Richard Weinberger:
"MTD core changes:
- partition parser: Support MTD names containing one or more colons.
- mtdblock: clear cache_state to avoid writing to bad blocks
repeatedly.
Raw NAND core changes:
- Stop using nand_release(), patched all drivers.
- Give more information about the ECC weakness when not matching the
chip's requirement.
- MAINTAINERS updates.
- Support emulated SLC mode on MLC NANDs.
- Support "constrained" controllers, adapt the core and ONFI/JEDEC
table parsing and Micron's code.
- Take check_only into account.
- Add an invalid ECC mode to discriminate with valid ones.
- Return an enum from of_get_nand_ecc_algo().
- Drop OOB_FIRST placement scheme.
- Introduce nand_extract_bits().
- Ensure a consistent bitflips numbering.
- BCH lib:
- Allow easy bit swapping.
- Rework a little bit the exported function names.
- Fix nand_gpio_waitrdy().
- Propagate CS selection to sub operations.
- Add a NAND_NO_BBM_QUIRK flag.
- Give the possibility to verify a read operation is supported.
- Add a helper to check supported operations.
- Avoid indirect access to ->data_buf().
- Rename the use_bufpoi variables.
- Fix comments about the use of bufpoi.
- Rename a NAND chip option.
- Reorder the nand_chip->options flags.
- Translate obscure bitfields into readable macros.
- Timings:
- Fix default values.
- Add mode information to the timings structure.
Raw NAND controller driver changes:
- Fixed many error paths.
- Arasan
- New driver
- Au1550nd:
- Various cleanups
- Migration to ->exec_op()
- brcmnand:
- Misc cleanup.
- Support v2.1-v2.2 controllers.
- Remove unused including <linux/version.h>.
- Correctly verify erased pages.
- Fix Hamming OOB layout.
- Cadence
- Make cadence_nand_attach_chip static.
- Cafe:
- Set the NAND_NO_BBM_QUIRK flag
- cmx270:
- Remove this controller driver.
- cs553x:
- Misc cleanup
- Migration to ->exec_op()
- Davinci:
- Misc cleanup.
- Migration to ->exec_op()
- Denali:
- Add more delays before latching incoming data
- Diskonchip:
- Misc cleanup
- Migration to ->exec_op()
- Fsmc:
- Change to non-atomic bit operations.
- GPMI:
- Use nand_extract_bits()
- Fix runtime PM imbalance.
- Ingenic:
- Migration to exec_op()
- Fix the RB gpio active-high property on qi, lb60
- Make qi_lb60_ooblayout_ops static.
- Marvell:
- Misc cleanup and small fixes
- Nandsim:
- Fix the error paths, driver wide.
- Omap_elm:
- Fix runtime PM imbalance.
- STM32_FMC2:
- Misc cleanups (error cases, comments, timeout values, cosmetic
changes).
SPI NOR core changes:
- Add, update support and fix few flashes.
- Prepare BFPT parsing for JESD216 rev D.
- Kernel doc fixes.
CFI changes:
- Support the absence of protection registers for Intel CFI flashes.
- Replace zero-length array with flexible-arrays"
* tag 'mtd/for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (208 commits)
mtd: clear cache_state to avoid writing to bad blocks repeatedly
mtd: parser: cmdline: Support MTD names containing one or more colons
mtd: physmap_of_gemini: remove defined but not used symbol 'syscon_match'
mtd: rawnand: Add an invalid ECC mode to discriminate with valid ones
mtd: rawnand: Return an enum from of_get_nand_ecc_algo()
mtd: rawnand: Drop OOB_FIRST placement scheme
mtd: rawnand: Avoid a typedef
mtd: Fix typo in mtd_ooblayout_set_databytes() description
mtd: rawnand: Stop using nand_release()
mtd: rawnand: nandsim: Reorganize ns_cleanup_module()
mtd: rawnand: nandsim: Rename a label in ns_init_module()
mtd: rawnand: nandsim: Manage lists on error in ns_init_module()
mtd: rawnand: nandsim: Fix the label pointing on nand_cleanup()
mtd: rawnand: nandsim: Free erase_block_wear on error
mtd: rawnand: nandsim: Use an additional label when freeing the nandsim object
mtd: rawnand: nandsim: Stop using nand_release()
mtd: rawnand: nandsim: Free the partition names in ns_free()
mtd: rawnand: nandsim: Free the allocated device on error in ns_init()
mtd: rawnand: nandsim: Free partition names on error in ns_init()
mtd: rawnand: nandsim: Fix the two ns_alloc_device() error paths
...
Alexey Gladkov [Wed, 10 Jun 2020 18:35:49 +0000 (20:35 +0200)]
proc: s_fs_info may be NULL when proc_kill_sb is called
syzbot found that if proc_fill_super() fails before filling up sb->s_fs_info,
deactivate_locked_super() will be called and sb->s_fs_info will be NULL.
proc_kill_sb() does not expect fs_info to be NULL, which is wrong.
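The failure mode described above can be sketched in plain C. This is a minimal illustration, not the kernel code: struct fs_info_sketch, struct super_block_sketch, released_count and kill_sb_sketch() are hypothetical stand-ins, and the real fix may do additional teardown work on the early-return path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fs_info_sketch { int unused; };
struct super_block_sketch { struct fs_info_sketch *s_fs_info; };

/* Counts how many times private data was actually released. */
static int released_count;

/* Tear down a superblock whose fill step may have failed early. */
static void kill_sb_sketch(struct super_block_sketch *sb)
{
    assert(sb != NULL);

    /* The fill step may have failed before s_fs_info was assigned. */
    if (sb->s_fs_info == NULL)
        return;

    free(sb->s_fs_info);
    sb->s_fs_info = NULL;
    released_count++;
}
```

Calling kill_sb_sketch() on a superblock whose s_fs_info was never set returns without touching the pointer, which is the behaviour the commit asks of proc_kill_sb().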
Link: https://lore.kernel.org/lkml/0000000000002d7ca605a7b8b1c5@google.com
Reported-by: syzbot+4abac52934a48af5ff19@syzkaller.appspotmail.com
Fixes: fa10fed30f25 ("proc: allow to mount many instances of proc in one pid namespace")
Signed-off-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Linus Torvalds [Wed, 10 Jun 2020 18:42:19 +0000 (11:42 -0700)]
Merge tag 'clk-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"This time around we have four lines of diff in the core framework,
removing a function that isn't used anymore. Otherwise the main new
thing for the common clk framework is that it is selectable in the
Kconfig language now. Hopefully this will let clk drivers and clk
consumers be testable on more than the architectures that support the
clk framework. The goal is to introduce some Kunit tests for the
framework.
Outside of the core framework we have the usual set of various driver
updates and non-critical fixes. The dirstat shows that the new
Baikal-T1 driver is the largest addition this time around in terms of
lines of code. After that the x86 (Intel), Qualcomm, and Mediatek
drivers introduce many lines to support new or upcoming SoCs. After
that the dirstat shows the usual suspects working on their SoC support
by fixing minor bugs, correcting data and converting some of their DT
bindings to YAML.
Core:
- Allow the COMMON_CLK config to be selectable
New Drivers:
- Clk driver for Baikal-T1 SoCs
- Mediatek MT6765 clock support
- Support for Intel Agilex clks
- Add support for X1830 and X1000 Ingenic SoC clk controllers
- Add support for the new Renesas RZ/G1H (R8A7742) SoC
- Add support for Qualcomm's MSM8939 Generic Clock Controller
Updates:
- Support IDT VersaClock 5P49V5925
- Bunch of updates for HSDK clock generation unit (CGU) driver
- Start making audio and GPU clks work on Marvell MMP2/MMP3 SoCs
- Add some GPU, NPU, and UFS clks to Qualcomm SM8150 driver
- Enable supply regulators for GPU gdscs on Qualcomm SoCs
- Add support for Si5342, Si5344 and Si5345 chips
- Support custom flags in Xilinx zynq firmware
- Various small fixes to the Xilinx clk driver
- A single minor rounding fix for the legacy Allwinner clock support
- A few patches from Abel Vesa as preparation of adding audiomix
clock support on i.MX
- A couple of cleanups from Anson Huang for i.MX clk-sscg-pll and
clk-pllv3 drivers
- Drop dependency on ARM64 for i.MX8M clock driver, to support
aarch32 mode on aarch64 hardware
- A series from Peng Fan to improve i.MX8M clock drivers, using
composite clock for core and bus clk slice
- Set a better parent clock for flexcan on i.MX6UL to support CiA102
defined bit rates
- A couple changes for EMC frequency scaling on Tegra210
- Support for CPU frequency scaling on Tegra20/Tegra30
- New clk gate for CSI test pattern generator on Tegra210
- Regression fixes for Samsung exynos542x and exynos5433 SoCs
- Use of fallthrough; attribute for Samsung s3c24xx
- Updates and fixup HDMI and video clocks on Meson8b
- Fixup reset polarity on Meson8b
- Fix GPU glitch free mux switch on Meson gx and g12
- A minor fix for the currently unused suspend/resume handling on
Renesas RZ/A1 and RZ/A2
- Two more conversions of Renesas DT bindings to json-schema
- Add support for the USB 2.0 clock selector on Renesas R-Car M3-W+"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (155 commits)
clk: mediatek: Remove ifr{0,1}_cfg_regs structures
clk: baikal-t1: remove redundant assignment to variable 'divider'
clk: baikal-t1: fix spelling mistake "Uncompatible" -> "Incompatible"
dt-bindings: clock: Add a missing include to MMP Audio Clock binding
dt: Add bindings for IDT VersaClock 5P49V5925
clk: vc5: Add support for IDT VersaClock 5P49V6965
clk: Add Baikal-T1 CCU Dividers driver
clk: Add Baikal-T1 CCU PLLs driver
dt-bindings: clk: Add Baikal-T1 CCU Dividers binding
dt-bindings: clk: Add Baikal-T1 CCU PLLs binding
clk: mediatek: assign the initial value to clk_init_data of mtk_mux
clk: mediatek: Add MT6765 clock support
clk: mediatek: add mt6765 clock IDs
dt-bindings: clock: mediatek: document clk bindings vcodecsys for Mediatek MT6765 SoC
dt-bindings: clock: mediatek: document clk bindings mipi0a for Mediatek MT6765 SoC
dt-bindings: clock: mediatek: document clk bindings for Mediatek MT6765 SoC
CLK: HSDK: CGU: add support for 148.5MHz clock
CLK: HSDK: CGU: support PLL bypassing
CLK: HSDK: CGU: check if PLL is bypassed first
clk: clk-si5341: Add support for the Si5345 series
...
Linus Torvalds [Wed, 10 Jun 2020 18:28:35 +0000 (11:28 -0700)]
Merge tag 'for-v5.8' of git://git./linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
"This time there are lots of changes. Quite a few changes to the core,
lots of driver changes and one change to kobject core (with Ack from
Greg).
Summary:
kobject:
- Increase number of allowed uevent variables
power-supply core:
- Add power-supply type in uevent
- Cleanup property handling in core
- Make property and usb_type pointers const
- Convert core power-supply DT binding to YAML
- Cleanup HWMON code
- Add new health status "calibration required"
- Add new properties for manufacture date and capacity error margin
battery drivers:
- new cw2015 battery driver used by pine64 Pinebook Pro laptop
- axp22: blacklist on Meegopad T02
- sc27xx: support current/voltage reading
- max17042: support time-to-empty reading
- simple-battery: add more battery parameters
- bq27xxx: convert DT binding document to YAML
- sbs-battery: add TI BQ20Z65 support, fix technology property,
convert DT binding to YAML, add option to disable charger
broadcasts, add new properties: manufacture date, capacity
error margin, average current, charge current and voltage and
support calibration required health status
- misc fixes
charger drivers:
- bq25890: cleanup, implement charge type, precharge current and
input current limiting properties
- bd70528: use new linear range helper library
- bd99954: new charger driver
- mp2629: new charger driver
- misc fixes
reboot drivers:
- oxnas-restart: introduce new driver
- syscon-reboot: convert DT binding to YAML, add parent syscon device
support
- misc fixes"
* tag 'for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (85 commits)
power: supply: cw2015: Attach OF ID table to the driver
power: reset: gpio-poweroff: add missing '\n' in dev_err()
Revert "power: supply: sbs-battery: simplify read_read_string_data"
Revert "power: supply: sbs-battery: add PEC support"
dt-bindings: power: sbs-battery: Convert to yaml
power: supply: sbs-battery: constify power-supply property array
power: supply: sbs-battery: switch to i2c's probe_new
power: supply: sbs-battery: switch from of_property_* to device_property_*
power: supply: sbs-battery: add ability to disable charger broadcasts
power: supply: sbs-battery: fix idle battery status
power: supply: sbs-battery: add POWER_SUPPLY_HEALTH_CALIBRATION_REQUIRED support
power: supply: sbs-battery: add MANUFACTURE_DATE support
power: supply: sbs-battery: add POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT/VOLTAGE_MAX support
power: supply: sbs-battery: Improve POWER_SUPPLY_PROP_TECHNOLOGY support
power: supply: sbs-battery: add POWER_SUPPLY_PROP_CURRENT_AVG support
power: supply: sbs-battery: add PEC support
power: supply: sbs-battery: simplify read_read_string_data
power: supply: sbs-battery: add POWER_SUPPLY_PROP_CAPACITY_ERROR_MARGIN support
power: supply: sbs-battery: Add TI BQ20Z65 support
power: supply: core: add POWER_SUPPLY_HEALTH_CALIBRATION_REQUIRED
...
Christoph Hellwig [Tue, 9 Jun 2020 17:08:19 +0000 (19:08 +0200)]
sysctl: reject gigantic reads/write to sysctl files
Instead of triggering a WARN_ON deep down in the page allocator just
give up early on allocations that are way larger than the usual sysctl
values.
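The early-rejection idea can be sketched as follows. This is an illustration under stated assumptions, not the kernel implementation: SKETCH_MAX_IO is a hypothetical cap (the kernel uses its own allocator limit), and check_sysctl_count() and the choice of -ENOMEM are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cap for the sketch; the kernel uses its own allocator limit. */
#define SKETCH_MAX_IO ((size_t)4 << 20)

/*
 * Reject requests that are far larger than any plausible sysctl value
 * before an allocation is attempted. Returns 0 if the size is acceptable,
 * or a negative errno value otherwise.
 */
static long check_sysctl_count(size_t count)
{
    if (count >= SKETCH_MAX_IO)
        return -ENOMEM;
    return 0;
}
```

Checking the size up front means the caller gets a plain error return instead of a warning raised deep inside the allocator.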
Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Christoph Hellwig [Tue, 9 Jun 2020 17:08:18 +0000 (19:08 +0200)]
cdrom: fix an incorrect __user annotation on cdrom_sysctl_info
No user pointers for sysctls anymore.
Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
Reported-by: build test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Marc Zyngier [Wed, 10 Jun 2020 18:09:26 +0000 (19:09 +0100)]
Merge branch 'kvm-arm64/ptrauth-fixes' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Wed, 10 Jun 2020 15:27:46 +0000 (16:27 +0100)]
KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h
Recent refactoring of the arm64 code makes it awkward to have
hyp_symbol_addr() in kvm_mmu.h. Instead, move it next to its
main user, which is __hyp_this_cpu_ptr().
Signed-off-by: Marc Zyngier <maz@kernel.org>
Linus Torvalds [Wed, 10 Jun 2020 18:03:04 +0000 (11:03 -0700)]
Merge tag 'dmaengine-5.8-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
"A fairly small dmaengine update which includes mostly driver updates
(dmatest, dw-edma, ioat, mmp-tdma and k3-udma) along with Renesas
binding update to json-schema"
* tag 'dmaengine-5.8-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (39 commits)
dmaengine: imx-sdma: initialize all script addresses
dmaengine: ti: k3-udma: Use proper return code in alloc_chan_resources
dmaengine: ti: k3-udma: Remove udma_chan.in_ring_cnt
dmaengine: ti: k3-udma: Add missing dma_sync call for rx flush descriptor
dmaengine: at_xdmac: Replace zero-length array with flexible-array
dmaengine: at_hdmac: Replace zero-length array with flexible-array
dmaengine: qcom: bam_dma: Replace zero-length array with flexible-array
dmaengine: ti: k3-udma: Use PTR_ERR_OR_ZERO() to simplify code
dmaengine: moxart-dma: Drop pointless static qualifier in moxart_probe()
dmaengine: sf-pdma: Simplify the error handling path in 'sf_pdma_probe()'
dmaengine: qcom_hidma: use true,false for bool variable
dmaengine: dw-edma: support local dma device transfer semantics
dmaengine: Fix doc strings to satisfy validation script
dmaengine: Include dmaengine.h into dmaengine.c
dmaengine: dmatest: Describe members of struct dmatest_info
dmaengine: dmatest: Describe members of struct dmatest_params
dmaengine: dmatest: Allow negative timeout value to specify infinite wait
Revert "dmaengine: dmatest: timeout value of -1 should specify infinite wait"
dmaengine: stm32-dma: direct mode support through device tree
dt-bindings: dma: add direct mode support through device tree in stm32-dma
...
Geert Uytterhoeven [Mon, 1 Jun 2020 10:00:49 +0000 (12:00 +0200)]
Documentation/CodingStyle: Fix duplicate "are" typo
The improved paragraph about line lengths contains a sentence with a
duplicate word: there is one "are" at the end of a line, followed by a
second one at the beginning of the next line.
Drop the first one, as that one is part of the longest line.
Fixes: bdc48fa11e46f867 ("checkpatch/coding-style: deprecate 80-column warning")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Wed, 10 Jun 2020 01:46:16 +0000 (18:46 -0700)]
arch/sparc/mm/srmmu.c: fix build
"mm: consolidate pte_index() and pte_offset_*() definitions" was supposed
to remove arch/sparc/mm/srmmu.c:pte_offset_kernel().
Fixes: 974b9b2c68f3d35 ("mm: consolidate pte_index() and pte_offset_*() definitions")
Reported-by: kernel test robot <lkp@intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joerg Roedel [Tue, 9 Jun 2020 13:03:03 +0000 (15:03 +0200)]
iommu/vt-d: Move Intel IOMMU driver into subdirectory
Move all files related to the Intel IOMMU driver into its own
subdirectory.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200609130303.26974-3-joro@8bytes.org
Joerg Roedel [Tue, 9 Jun 2020 13:03:02 +0000 (15:03 +0200)]
iommu/amd: Move AMD IOMMU driver into subdirectory
Move all files related to the AMD IOMMU driver into its own
subdirectory.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20200609130303.26974-2-joro@8bytes.org
Rafael J. Wysocki [Wed, 10 Jun 2020 15:27:28 +0000 (17:27 +0200)]
Merge branch 'acpica'
* acpica:
ACPICA: Update version to 20200528
ACPICA: iASL: add new OperationRegion subtype keyword PlatformRtMechanism
ACPICA: acpidump: Removed dead code from oslinuxtbl.c
Rafael J. Wysocki [Wed, 10 Jun 2020 15:10:40 +0000 (17:10 +0200)]
Merge branches 'pm-cpufreq' and 'pm-acpi'
* pm-cpufreq:
cpufreq: CPPC: add SW BOOST support
cpufreq: change '.set_boost' to act on one policy
cpufreq: tegra186: add CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
* pm-acpi:
ACPI: PM: Avoid using power resources if there are none for D0
Rafael J. Wysocki [Wed, 10 Jun 2020 15:10:30 +0000 (17:10 +0200)]
Merge branch 'pm-opp'
* pm-opp:
opp: Don't parse icc paths unnecessarily
opp: Remove bandwidth votes when target_freq is zero
opp: core: add regulators enable and disable
opp: Reorder the code for !target_freq case
opp: Expose bandwidth information via debugfs
cpufreq: dt: Add support for interconnect bandwidth scaling
opp: Update the bandwidth on OPP frequency changes
opp: Add sanity checks in _read_opp_key()
opp: Add support for parsing interconnect bandwidth
interconnect: Remove unused module exit code from core
interconnect: Disallow interconnect core to be built as a module
interconnect: Add of_icc_get_by_index() helper function
OPP: Add helpers for reading the binding properties
dt-bindings: opp: Introduce opp-peak-kBps and opp-avg-kBps bindings
Marc Zyngier [Tue, 9 Jun 2020 07:50:29 +0000 (08:50 +0100)]
KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
On a VHE system, the EL1 state is left in the CPU most of the time,
and only synchronized back to memory when vcpu_put() is called (most
of the time on preemption).
Which means that when injecting an exception, we'd better have a way
to either:
(1) write directly to the EL1 sysregs
(2) synchronize the state back to memory, and do the changes there
For AArch64, we already do (1), so we are safe. Unfortunately,
doing the same thing for AArch32 would be pretty invasive. Instead,
we can easily implement (2) by calling the put/load architectural
backends, and keep preemption disabled. We can then reload the
state back into EL1.
Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Tue, 9 Jun 2020 07:40:35 +0000 (08:40 +0100)]
KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
AArch32 CP1x registers are overlaid on their AArch64 counterparts
in the vcpu struct. This leads to an interesting problem as they
are stored in their CPU-local format, and thus a CP1x register
doesn't "hit" the lower 32bit portion of the AArch64 register on
a BE host.
To work around this unfortunate situation, introduce a bias trick
in the vcpu_cp1x() accessors which picks the correct half of the
64bit register.
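The bias trick can be illustrated with a small stand-alone sketch. It assumes that 32-bit index r is meant to address the low half of 64-bit slot r/2 when r is even and the high half when r is odd; host_bias(), cp_write32() and cp_read32() are hypothetical names, and the kernel accessors are macros that may differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 1 on a big-endian host, 0 on a little-endian host (detected at run time). */
static unsigned int host_bias(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0 ? 1u : 0u;
}

/*
 * Store a 32-bit value into the array of 64-bit registers. XOR-ing the
 * index with the bias swaps the two halves on big-endian hosts, so an
 * even index always lands in the low 32 bits of its 64-bit register.
 */
static void cp_write32(uint64_t *regs64, unsigned int r, uint32_t val)
{
    memcpy((unsigned char *)regs64 + 4u * (r ^ host_bias()), &val, sizeof(val));
}

/* Load the 32-bit half selected by index r, using the same bias. */
static uint32_t cp_read32(const uint64_t *regs64, unsigned int r)
{
    uint32_t val;

    memcpy(&val, (const unsigned char *)regs64 + 4u * (r ^ host_bias()), sizeof(val));
    return val;
}
```

Without the XOR, index 0 on a big-endian host would address the upper 32 bits of the first 64-bit register, which is the mismatch the commit describes.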
Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Takashi Iwai [Wed, 10 Jun 2020 13:40:49 +0000 (15:40 +0200)]
Merge tag 'asoc-fix-v5.8' of https://git./linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.8
A small pile of fixes that came in during the merge window, the DPCM
fixes from Pierre are the most notable thing here.
Nick Desaulniers [Mon, 8 Jun 2020 20:38:17 +0000 (13:38 -0700)]
arm64: acpi: fix UBSAN warning
Will reported a UBSAN warning:
UBSAN: null-ptr-deref in arch/arm64/kernel/smp.c:596:6
member access within null pointer of type 'struct acpi_madt_generic_interrupt'
CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc6-00124-g96bc42ff0a82 #1
Call trace:
dump_backtrace+0x0/0x384
show_stack+0x28/0x38
dump_stack+0xec/0x174
handle_null_ptr_deref+0x134/0x174
__ubsan_handle_type_mismatch_v1+0x84/0xa4
acpi_parse_gic_cpu_interface+0x60/0xe8
acpi_parse_entries_array+0x288/0x498
acpi_table_parse_entries_array+0x178/0x1b4
acpi_table_parse_madt+0xa4/0x110
acpi_parse_and_init_cpus+0x38/0x100
smp_init_cpus+0x74/0x258
setup_arch+0x350/0x3ec
start_kernel+0x98/0x6f4
This is from the use of the ACPI_OFFSET in
arch/arm64/include/asm/acpi.h. Replace its use with offsetof from
include/linux/stddef.h which should implement the same logic using
__builtin_offsetof, so that UBSAN won't warn.
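The difference between the two approaches can be sketched in user-space C. struct madt_entry_sketch is a hypothetical layout chosen for illustration, not the real struct acpi_madt_generic_interrupt, and min_length_sketch() is an assumed helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; not the real ACPI structure. */
struct madt_entry_sketch {
    uint8_t  type;
    uint8_t  length;
    uint16_t reserved;
    uint32_t cpu_interface_number;
    uint64_t parked_address;
};

/*
 * A hand-rolled offset macro typically looks like
 *     ((size_t)&(((type *)0)->field))
 * which forms a member access through a null pointer. That access is what
 * a sanitizer reports. offsetof() yields the same number without
 * dereferencing anything.
 */
static size_t min_length_sketch(void)
{
    return offsetof(struct madt_entry_sketch, parked_address);
}
```

Both forms compute the same byte offset on common compilers; only the offsetof() form is guaranteed by the C standard to be well defined.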
Reported-by: Will Deacon <will@kernel.org>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/lkml/20200521100952.GA5360@willie-the-truck/
Link: https://lore.kernel.org/r/20200608203818.189423-1-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Nick Desaulniers [Mon, 8 Jun 2020 20:57:08 +0000 (13:57 -0700)]
arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
Allow the compat vdso (32b) to be compiled as either THUMB2 (default) or
ARM.
For THUMB2, the register r7 is reserved for the frame pointer, but
code in arch/arm64/include/asm/vdso/compat_gettimeofday.h
uses r7. Explicitly set -fomit-frame-pointer, since unwinding through
interworked THUMB2 and ARM is unreliable anyways. See also how
CONFIG_UNWINDER_FRAME_POINTER cannot be selected for
CONFIG_THUMB2_KERNEL for ARCH=arm.
This also helps toolchains that differ in their implicit value if the
choice of -f{no-}omit-frame-pointer is left unspecified, to not error on
the use of r7.
2019 Q4 ARM AAPCS seeks to standardize the use of r11 as the reserved
frame pointer register, but no production compiler that can compile the
Linux kernel currently implements this. We're actively discussing such
a transition with ARM toolchain developers currently.
Reported-by: Luis Lozano <llozano@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Manoj Gupta <manojgupta@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Stephen Boyd <swboyd@google.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Link: https://static.docs.arm.com/ihi0042/i/aapcs32.pdf
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1084372
Link: https://lore.kernel.org/r/20200608205711.109418-1-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Jernej Skrabec [Mon, 13 Apr 2020 09:54:57 +0000 (11:54 +0200)]
drm/sun4i: hdmi ddc clk: Fix size of m divider
The m divider in the DDC clock register is 4 bits wide. Fix that.
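Why the field width matters can be shown with a small mask sketch. Only the 4-bit width comes from the commit message; DDC_M_SHIFT_SKETCH, FIELD_MASK_SKETCH() and ddc_get_m_sketch() are hypothetical names, and the bit position is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask of `width` bits starting at bit `shift`. */
#define FIELD_MASK_SKETCH(width, shift) (((1u << (width)) - 1u) << (shift))

#define DDC_M_WIDTH_SKETCH 4u /* from the commit: the m divider is 4 bits wide */
#define DDC_M_SHIFT_SKETCH 3u /* hypothetical bit position, for illustration */

/* Extract the m divider from a register value. */
static uint32_t ddc_get_m_sketch(uint32_t reg)
{
    return (reg & FIELD_MASK_SKETCH(DDC_M_WIDTH_SKETCH, DDC_M_SHIFT_SKETCH))
           >> DDC_M_SHIFT_SKETCH;
}
```

A mask narrower than the hardware field would silently drop the top bit, so divider values above the narrower range could never be read back or programmed correctly.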
Fixes: 9c5681011a0c ("drm/sun4i: Add HDMI support")
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200413095457.1176754-1-jernej.skrabec@siol.net
Zong Li [Mon, 1 Jun 2020 07:10:58 +0000 (15:10 +0800)]
riscv: fix build warning of missing prototypes
Add the missing header to the file; it was lost in the original implementation.
The warning messages are as follows:
- no previous prototype for 'patch_text_nosync' [-Wmissing-prototypes]
- no previous prototype for 'patch_text' [-Wmissing-prototypes]
Changed in v2:
- Correct the typo of commit message.
Signed-off-by: Zong Li <zong.li@sifive.com>
Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Anup Patel [Mon, 1 Jun 2020 05:06:56 +0000 (10:36 +0530)]
RISC-V: Don't mark init section as non-executable
The head text section (i.e. _start, secondary_start_sbi, etc) and the
init section fall under the same page table level-1 mapping.
Currently, the runtime CPU hotplug is broken because we are marking
the init section as non-executable, which in turn marks the head text
section as non-executable.
Further investigating other architectures, it seems marking the init
section as non-executable is redundant because the init section pages
are anyway poisoned and freed.
To fix broken runtime CPU hotplug, we simply remove the code marking
the init section as non-executable.
Fixes: d27c3c90817e ("riscv: add STRICT_KERNEL_RWX support")
Cc: stable@vger.kernel.org
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Anup Patel [Mon, 1 Jun 2020 09:15:43 +0000 (14:45 +0530)]
RISC-V: Force select RISCV_INTC for CONFIG_RISCV
The RISC-V per-HART local interrupt controller driver is mandatory
for all RISC-V systems (with/without MMU), hence we force select it
for CONFIG_RISCV (just like RISCV_TIMER).
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Anup Patel [Mon, 1 Jun 2020 09:15:42 +0000 (14:45 +0530)]
RISC-V: Remove do_IRQ() function
The only thing do_IRQ() does is call handle_arch_irq function
pointer. We can very well call handle_arch_irq function pointer
directly from assembly and remove do_IRQ() function hence this
patch.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Anup Patel [Mon, 1 Jun 2020 09:15:41 +0000 (14:45 +0530)]
clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
Instead of directly calling RISC-V timer interrupt handler from
RISC-V local interrupt controller driver, this patch implements
RISC-V timer interrupt as a per-CPU interrupt using per-CPU APIs
of Linux IRQ subsystem.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Anup Patel [Mon, 1 Jun 2020 09:15:40 +0000 (14:45 +0530)]
irqchip: RISC-V per-HART local interrupt controller driver
The RISC-V per-HART local interrupt controller manages software
interrupts, timer interrupts, external interrupts (which are routed
via the platform level interrupt controller) and other per-HART
local interrupts.
We add a driver for the RISC-V local interrupt controller, which
eventually replaces the RISC-V architecture code, allowing for a
better split between arch code and drivers.
The driver is compliant with RISC-V Hart-Level Interrupt Controller
DT bindings located at:
Documentation/devicetree/bindings/interrupt-controller/riscv,cpu-intc.txt
Co-developed-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
[Palmer: Cleaned up warnings]
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Anup Patel [Mon, 1 Jun 2020 09:15:39 +0000 (14:45 +0530)]
RISC-V: Rename and move plic_find_hart_id() to arch directory
The plic_find_hart_id() can be useful to other interrupt controller
drivers (such as RISC-V local interrupt driver) so we rename this
function to riscv_of_parent_hartid() and place it in arch directory
along with riscv_of_processor_hartid().
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Anup Patel [Mon, 1 Jun 2020 09:15:38 +0000 (14:45 +0530)]
RISC-V: self-contained IPI handling routine
Currently, the IPI handling routine riscv_software_interrupt() does
not take any argument and also does not perform irq_enter()/irq_exit().
This patch makes IPI handling routine more self-contained by:
1. Passing "pt_regs *" argument
2. Explicitly doing irq_enter()/irq_exit()
3. Explicitly save/restore "pt_regs *" using set_irq_regs()
With the above changes, the IPI handling routine does not depend on the
caller function to perform irq_enter()/irq_exit() and save/restore of
"pt_regs *", hence it's more self-contained. This also enables us
to call the IPI handling routine from IRQCHIP drivers.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Palmer Dabbelt [Thu, 4 Jun 2020 20:55:14 +0000 (13:55 -0700)]
RISC-V: Sort select statements alphanumerically
Like patch b1b3f49 ("ARM: config: sort select statements alphanumerically"),
we sort all our select statements alphanumerically by using the perl
script in patch b1b3f49 as above.
As suggested by Andrew Morton:
This is a pet peeve of mine. Any time there's a long list of items
(header file inclusions, kconfig entries, array initialisers, etc) and
someone wants to add a new item, they *always* go and stick it at the
end of the list.
Guys, don't do this. Either put the new item into a randomly-chosen
position or, probably better, alphanumerically sort the list.
Suggested-by: Zong Li <zong.li@sifive.com>
[Palmer: Re-ran the script, as there were predictably a bunch of conflicts]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Jens Axboe [Wed, 10 Jun 2020 01:23:05 +0000 (19:23 -0600)]
io_uring: allow O_NONBLOCK async retry
We can assume that O_NONBLOCK is always honored, even if we don't
have a ->read/write_iter() for the file type. Also unify the read/write
checking for allowing async punt, having the write side factoring in the
REQ_F_NOWAIT flag as well.
Cc: stable@vger.kernel.org
Fixes: 490e89676a52 ("io_uring: only force async punt if poll based retry can't handle it")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Tue, 9 Jun 2020 22:48:24 +0000 (15:48 -0700)]
Merge tag 'fuse-update-5.8' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Fix a rare deadlock in virtiofs
- Fix st_blocks in writeback cache mode
- Fix wrong checks in splice move causing spurious warnings
- Fix a race between a GETATTR request and a FUSE_NOTIFY_INVAL_INODE
notification
- Use rb-tree instead of linear search for pages currently under
writeout by userspace
- Fix copy_file_range() inconsistencies
* tag 'fuse-update-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: copy_file_range should truncate cache
fuse: fix copy_file_range cache issues
fuse: optimize writepages search
fuse: update attr_version counter on fuse_notify_inval_inode()
fuse: don't check refcount after stealing page
fuse: fix weird page warning
fuse: use dump_page
virtiofs: do not use fuse_fill_super_common() for device installation
fuse: always allow query of st_dev
fuse: always flush dirty data on close(2)
fuse: invalidate inode attr in writeback cache mode
fuse: Update stale comment in queue_interrupt()
fuse: BUG_ON correction in fuse_dev_splice_write()
virtiofs: Add mount option and atime behavior to the doc
virtiofs: schedule blocking async replies in separate worker
Linus Torvalds [Tue, 9 Jun 2020 22:40:50 +0000 (15:40 -0700)]
Merge tag 'ovl-update-5.8' of git://git./linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"Fixes:
- Resolve mount option conflicts consistently
- Sync before remount R/O
- Fix file handle encoding corner cases
- Fix metacopy related issues
- Fix an uninitialized return value
- Add missing permission checks for underlying layers
Optimizations:
- Allow multiple whiteouts to share an inode
- Optimize small writes by inheriting SB_NOSEC from upper layer
- Do not call ->syncfs() multiple times for sync(2)
- Do not cache negative lookups on upper layer
- Make private internal mounts longterm"
* tag 'ovl-update-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (27 commits)
ovl: remove unnecessary lock check
ovl: make oip->index bool
ovl: only pass ->ki_flags to ovl_iocb_to_rwf()
ovl: make private mounts longterm
ovl: get rid of redundant members in struct ovl_fs
ovl: add accessor for ofs->upper_mnt
ovl: initialize error in ovl_copy_xattr
ovl: drop negative dentry in upper layer
ovl: check permission to open real file
ovl: call secutiry hook in ovl_real_ioctl()
ovl: verify permissions in ovl_path_open()
ovl: switch to mounter creds in readdir
ovl: pass correct flags for opening real directory
ovl: fix redirect traversal on metacopy dentries
ovl: initialize OVL_UPPERDATA in ovl_lookup()
ovl: use only uppermetacopy state in ovl_lookup()
ovl: simplify setting of origin for index lookup
ovl: fix out of bounds access warning in ovl_check_fb_len()
ovl: return required buffer size for file handles
ovl: sync dirty data when remounting to ro mode
...
Linus Torvalds [Tue, 9 Jun 2020 22:38:46 +0000 (15:38 -0700)]
Merge tag 'afs-fixes-20200609' of git://git./linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
"A set of small patches to fix some things, most of them minor.
- Fix a memory leak in afs_put_sysnames()
- Fix an oops in AFS file locking
- Fix new use of BUG()
- Fix debugging statements containing %px
- Remove afs_zero_fid as it's unused
- Make afs_zap_data() static"
* tag 'afs-fixes-20200609' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Make afs_zap_data() static
afs: Remove afs_zero_fid as it's not used
afs: Fix debugging statements with %px to be %p
afs: Fix use of BUG()
afs: Fix file locking
afs: Fix memory leak in afs_put_sysnames()
Stephen Boyd [Tue, 9 Jun 2020 21:18:47 +0000 (14:18 -0700)]
clk: mediatek: Remove ifr{0,1}_cfg_regs structures
These aren't used and the macros that reference them aren't used either.
Remove the dead code to avoid compile warnings.
Cc: Owen Chen <owen.chen@mediatek.com>
Cc: Mars Cheng <mars.cheng@mediatek.com>
Cc: Macpaul Lin <macpaul.lin@mediatek.com>
Fixes: 1aca9939bf72 ("clk: mediatek: Add MT6765 clock support")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20200609211847.27366-1-sboyd@kernel.org
Colin Ian King [Tue, 2 Jun 2020 17:24:35 +0000 (18:24 +0100)]
clk: baikal-t1: remove redundant assignment to variable 'divider'
The variable divider is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20200602172435.70282-1-colin.king@canonical.com
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Colin Ian King [Tue, 2 Jun 2020 12:10:30 +0000 (13:10 +0100)]
clk: baikal-t1: fix spelling mistake "Uncompatible" -> "Incompatible"
There is a spelling mistake in a pr_err error message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20200602121030.39132-1-colin.king@canonical.com
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Lubomir Rintel [Fri, 5 Jun 2020 06:52:58 +0000 (08:52 +0200)]
dt-bindings: clock: Add a missing include to MMP Audio Clock binding
The include file for input clock in the example was missing, breaking the
validation.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reported-by: Rob Herring <robh+dt@kernel.org>
Link: https://lore.kernel.org/r/20200605065258.567858-1-lkundrak@v3.sk
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Arnd Bergmann [Tue, 9 Jun 2020 19:51:53 +0000 (12:51 -0700)]
Input: adi - work around module name confict
Making module name conflicts a fatal error breaks sparc64 allmodconfig:
Error log:
error: the following would cause module name conflict:
drivers/char/adi.ko
drivers/input/joystick/adi.ko
Renaming one of the modules would solve the problem, but then cause other
problems because neither of them is automatically loaded and changing
the name is likely to break any setup that relies on manually loading
it by name.
As there is probably no sparc64 system with this kind of ancient joystick
attached, work around it by adding a Kconfig dependency that forbids
them from both being modules. It is still possible to build the joystick
driver if the sparc64 adi driver is built-in.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20200609100643.1245061-1-arnd@arndb.de
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Linus Torvalds [Tue, 9 Jun 2020 18:28:59 +0000 (11:28 -0700)]
Merge tag 'f2fs-for-5.8' of git://git./linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added some knobs to enhance compression feature
and harden testing environment. In addition, we've fixed several bugs
reported from Android devices such as long discarding latency, device
hanging during quota_sync, etc.
Enhancements:
- support lzo-rle algorithm
- add two ioctls to release and reserve blocks for compression
- support partial truncation/fiemap on compressed file
- introduce sysfs entries to attach IO flags explicitly
- add iostat trace point along with read io stat
Bug fixes:
- fix long discard latency
- flush quota data by f2fs_quota_sync correctly
- fix to recover parent inode number for power-cut recovery
- fix lz4/zstd output buffer budget
- parse checkpoint mount option correctly
- avoid infinite loop to wait for flushing node/meta pages
- manage discard space correctly
And some refactoring and clean up patches were added"
* tag 'f2fs-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
f2fs: attach IO flags to the missing cases
f2fs: add node_io_flag for bio flags likewise data_io_flag
f2fs: remove unused parameter of f2fs_put_rpages_mapping()
f2fs: handle readonly filesystem in f2fs_ioc_shutdown()
f2fs: avoid utf8_strncasecmp() with unstable name
f2fs: don't return vmalloc() memory from f2fs_kmalloc()
f2fs: fix retry logic in f2fs_write_cache_pages()
f2fs: fix wrong discard space
f2fs: compress: don't compress any datas after cp stop
f2fs: remove unneeded return value of __insert_discard_tree()
f2fs: fix wrong value of tracepoint parameter
f2fs: protect new segment allocation in expand_inode_data
f2fs: code cleanup by removing ifdef macro surrounding
f2fs: avoid inifinite loop to wait for flushing node pages at cp_error
f2fs: flush dirty meta pages when flushing them
f2fs: fix checkpoint=disable:%u%%
f2fs: compress: fix zstd data corruption
f2fs: add compressed/gc data read IO stat
f2fs: fix potential use-after-free issue
f2fs: compress: don't handle non-compressed data in workqueue
...
Linus Torvalds [Tue, 9 Jun 2020 18:24:59 +0000 (11:24 -0700)]
Merge tag 'exfat-for-5.8-rc1' of git://git./linux/kernel/git/linkinjeon/exfat
Pull exfat update from Namjae Jeon:
"Bug fixes:
- Fix memory leak on mount failure with iocharset= option
- Fix incorrect update of stream entry
- Fix cluster range validation error
Clean-ups:
- Remove unused code and unneeded assignment
- Rename variables in exfat structure as specification
- Reorganize boot sector analysis code
- Simplify exfat_utf8_d_hash and exfat_utf8_d_cmp()
- Optimize exfat entry cache functions
- Improve wording of EXFAT_DEFAULT_IOCHARSET config option
New Feature:
- Add boot region verification"
* tag 'exfat-for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: Fix potential use after free in exfat_load_upcase_table()
exfat: fix range validation error in alloc and free cluster
exfat: fix incorrect update of stream entry in __exfat_truncate()
exfat: fix memory leak in exfat_parse_param()
exfat: remove unnecessary reassignment of p_uniname->name_len
exfat: standardize checksum calculation
exfat: add boot region verification
exfat: separate the boot sector analysis
exfat: redefine PBR as boot_sector
exfat: optimize dir-cache
exfat: replace 'time_ms' with 'time_cs'
exfat: remove the assignment of 0 to bool variable
exfat: Remove unused functions exfat_high_surrogate() and exfat_low_surrogate()
exfat: Simplify exfat_utf8_d_hash() for code points above U+FFFF
exfat: Improve wording of EXFAT_DEFAULT_IOCHARSET config option
exfat: Use a more common logging style
exfat: Simplify exfat_utf8_d_cmp() for code points above U+FFFF
Linus Torvalds [Tue, 9 Jun 2020 17:39:33 +0000 (10:39 -0700)]
x86: use proper parentheses around new uaccess macro argument uses
__get_kernel_nofault() didn't have the parentheses around the use of
'src' and 'dst' macro arguments, making the casts potentially do the
wrong thing.
The parentheses aren't necessary with the current very limited use in
mm/access.c, but it's bad form, and future use-cases might have very
unexpected errors as a result.
Do the same for unsafe_copy_loop() while at it, although in that case it
is an entirely internal x86 uaccess helper macro that isn't used
anywhere else and any other use would be invalid anyway.
Fixes: fa94111d9435 ("x86: use non-set_fs based maccess routines")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
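The hazard described in the uaccess commit above is the usual one for a cast
applied to an unparenthesized macro argument. The sketch below shows it with
two hypothetical macros, LOAD_BYTE_UNSAFE() and LOAD_BYTE_SAFE(); they are not
the kernel's __get_kernel_nofault() or unsafe_copy_loop(), only a minimal
model of the same precedence problem.

```c
#include <assert.h>

/* Hypothetical macros for illustration; not the kernel's definitions. */
#define LOAD_BYTE_UNSAFE(src)	(*(const unsigned char *)src)
#define LOAD_BYTE_SAFE(src)	(*(const unsigned char *)(src))

static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Every byte within a word is identical, so the result is endian-independent. */
static const unsigned int words[2] = { 0x10101010u, 0x20202020u };

int load_second_word_unsafe(void)
{
	const unsigned int *p = words;

	/* Expands to *(const unsigned char *)p + 1: a byte of words[0], plus one. */
	return LOAD_BYTE_UNSAFE(p + 1);
}

int load_second_word_safe(void)
{
	const unsigned int *p = words;

	/* Expands to *(const unsigned char *)(p + 1): a byte of words[1]. */
	return LOAD_BYTE_SAFE(p + 1);
}
```

With a plain identifier as the argument both macros behave identically, which
is consistent with the commit's remark that the current callers are
unaffected; the difference only appears once an expression is passed in.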
David Howells [Mon, 10 Feb 2020 10:00:22 +0000 (10:00 +0000)]
afs: Make afs_zap_data() static
Make afs_zap_data() static as it's only used in the file in which it is
defined.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 6 Feb 2020 14:22:27 +0000 (14:22 +0000)]
afs: Remove afs_zero_fid as it's not used
Remove afs_zero_fid as it's not used.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Tue, 9 Jun 2020 15:25:02 +0000 (16:25 +0100)]
afs: Fix debugging statements with %px to be %p
Fix a couple of %px to be %p in debugging statements.
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Fixes: 8a070a964877 ("afs: Detect cell aliases 1 - Cells with root volumes")
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Linus Torvalds [Tue, 9 Jun 2020 17:06:18 +0000 (10:06 -0700)]
Merge tag 'trace-v5.8' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"No new features this release. Mostly clean ups, restructuring and
documentation.
- Have ftrace_bug() show ftrace errors before the WARN, as the WARN
will reboot the box before the error messages are printed if
panic_on_warn is set.
- Have traceoff_on_warn disable tracing sooner (before prints)
- Write a message to the trace buffer that it's being disabled when
disable_trace_on_warning() is set.
- Separate out synthetic events from histogram code to let it be used
by other parts of the kernel.
- More documentation on histogram design.
- Other small fixes and clean ups"
* tag 'trace-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Remove obsolete PREEMPTIRQ_EVENTS kconfig option
tracing/doc: Fix ascii-art in histogram-design.rst
tracing: Add a trace print when traceoff_on_warning is triggered
ftrace,bug: Improve traceoff_on_warn
selftests/ftrace: Distinguish between hist and synthetic event checks
tracing: Move synthetic events to a separate file
tracing: Fix events.rst section numbering
tracing/doc: Fix typos in histogram-design.rst
tracing: Add hist_debug trace event files for histogram debugging
tracing: Add histogram-design document
tracing: Check state.disabled in synth event trace functions
tracing/probe: reverse arguments to list_add
tools/bootconfig: Add a summary of test cases and return error
ftrace: show debugging information when panic_on_warn set
Linus Torvalds [Tue, 9 Jun 2020 17:04:47 +0000 (10:04 -0700)]
Merge tag 'linux-kselftest-kunit-5.8-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kunit updates from Shuah Khan:
"This consists of:
- Several config fragment fixes from Anders Roxell to improve test
coverage.
- Improvements to kunit run script to use defconfig as default and
restructure the code for config/build/exec/parse from Vitor Massaru
Iha and David Gow.
- Miscellaneous documentation warn fix"
* tag 'linux-kselftest-kunit-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
security: apparmor: default KUNIT_* fragments to KUNIT_ALL_TESTS
fs: ext4: default KUNIT_* fragments to KUNIT_ALL_TESTS
drivers: base: default KUNIT_* fragments to KUNIT_ALL_TESTS
lib: Kconfig.debug: default KUNIT_* fragments to KUNIT_ALL_TESTS
kunit: default KUNIT_* fragments to KUNIT_ALL_TESTS
kunit: Kconfig: enable a KUNIT_ALL_TESTS fragment
kunit: Fix TabError, remove defconfig code and handle when there is no kunitconfig
kunit: use KUnit defconfig by default
kunit: use --build_dir=.kunit as default
Documentation: test.h - fix warnings
kunit: kunit_tool: Separate out config/build/exec/parse
Linus Torvalds [Tue, 9 Jun 2020 17:03:12 +0000 (10:03 -0700)]
Merge tag 'linux-kselftest-5.8-rc1' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"This consists of:
- Several fixes from Masami Hiramatsu to improve coverage for lib and
sysctl tests.
- Clean up to vdso test and a new test for getcpu() from Mark Brown.
- Add new gen_tar selftests Makefile target to generate a selftest
package by running "make gen_tar" in the selftests directory, from
Veronika Kabatova.
- Other miscellaneous fixes to timens, exec, tpm2 tests"
* tag 'linux-kselftest-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/sysctl: Make sysctl test driver as a module
selftests/sysctl: Fix to load test_sysctl module
lib: Make test_sysctl initialized as module
lib: Make prime number generator independently selectable
selftests/ftrace: Return unsupported if no error_log file
selftests/ftrace: Use printf for backslash included command
selftests/timens: handle a case when alarm clocks are not supported
Kernel selftests: Add check if TPM devices are supported
selftests: vdso: Add a selftest for vDSO getcpu()
selftests: vdso: Use a header file to prototype parse_vdso API
selftests: vdso: Rename vdso_test to vdso_test_gettimeofday
selftests/exec: Verify execve of non-regular files fail
selftests: introduce gen_tar Makefile target
Linus Torvalds [Tue, 9 Jun 2020 16:54:46 +0000 (09:54 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge even more updates from Andrew Morton:
- a kernel-wide sweep of show_stack()
- pagetable cleanups
- abstract out accesses to mmap_sem - prep for mmap_sem scalability work
- hch's user access work
Subsystems affected by this patch series: debug, mm/pagemap, mm/maccess,
mm/documentation.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (93 commits)
include/linux/cache.h: expand documentation over __read_mostly
maccess: return -ERANGE when probe_kernel_read() fails
x86: use non-set_fs based maccess routines
maccess: allow architectures to provide kernel probing directly
maccess: move user access routines together
maccess: always use strict semantics for probe_kernel_read
maccess: remove strncpy_from_unsafe
tracing/kprobes: handle mixed kernel/userspace probes better
bpf: rework the compat kernel probe handling
bpf:bpf_seq_printf(): handle potentially unsafe format string better
bpf: handle the compat string in bpf_trace_copy_string better
bpf: factor out a bpf_trace_copy_string helper
maccess: unify the probe kernel arch hooks
maccess: remove probe_read_common and probe_write_common
maccess: rename strnlen_unsafe_user to strnlen_user_nofault
maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofault
maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofault
maccess: update the top of file comment
maccess: clarify kerneldoc comments
maccess: remove duplicate kerneldoc comments
...
Oleg Nesterov [Mon, 4 May 2020 16:47:25 +0000 (18:47 +0200)]
uprobes: ensure that uprobe->offset and ->ref_ctr_offset are properly aligned
uprobe_write_opcode() must not cross page boundary; prepare_uprobe()
relies on arch_uprobe_analyze_insn() which should validate "vaddr" but
some architectures (csky, s390, and sparc) don't do this.
We can remove the BUG_ON() check in prepare_uprobe() and validate the
offset early in __uprobe_register(). The new IS_ALIGNED() check matches
the alignment check in arch_prepare_kprobe() on supported architectures,
so I think that all insns must be aligned to UPROBE_SWBP_INSN_SIZE.
Another problem is __update_ref_ctr() which was wrong from the very
beginning, it can read/write outside of kmap'ed page unless "vaddr" is
aligned to sizeof(short), __uprobe_register() should check this too.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
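The early validation described in the uprobes commit above amounts to two
power-of-two alignment checks. The sketch below is a user-space model under
stated assumptions: SKETCH_IS_ALIGNED() imitates the kernel's IS_ALIGNED(),
SKETCH_SWBP_INSN_SIZE is a made-up value standing in for the
architecture-specific UPROBE_SWBP_INSN_SIZE, and both the function name and
the -EINVAL return value are assumptions, not taken from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Power-of-two alignment test, in the spirit of the kernel's IS_ALIGNED(). */
#define SKETCH_IS_ALIGNED(x, a)	(((x) & ((uint64_t)(a) - 1)) == 0)

/* Hypothetical value; the real UPROBE_SWBP_INSN_SIZE is architecture-specific. */
#define SKETCH_SWBP_INSN_SIZE	2

int sketch_check_uprobe_offsets(uint64_t offset, uint64_t ref_ctr_offset)
{
	static_assert((SKETCH_SWBP_INSN_SIZE & (SKETCH_SWBP_INSN_SIZE - 1)) == 0,
		      "alignment must be a power of two");

	/* A breakpoint instruction at an aligned offset cannot straddle a page. */
	if (!SKETCH_IS_ALIGNED(offset, SKETCH_SWBP_INSN_SIZE))
		return -EINVAL;

	/* The reference counter is accessed as a short, so align it likewise. */
	if (!SKETCH_IS_ALIGNED(ref_ctr_offset, sizeof(short)))
		return -EINVAL;

	return 0;
}
```

Rejecting a bad offset at registration time means the later write path can
rely on the alignment instead of re-checking it with a BUG_ON().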
Luis Chamberlain [Tue, 9 Jun 2020 04:35:07 +0000 (21:35 -0700)]
include/linux/cache.h: expand documentation over __read_mostly
__read_mostly can easily be misused by folks; it's not meant for just
read-only data. There are performance reasons for using it, but we also
don't provide any guidance about its use. Provide a bit more guidance
over its use.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200507161424.2584-1-mcgrof@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:35:04 +0000 (21:35 -0700)]
maccess: return -ERANGE when probe_kernel_read() fails
Allow the callers to distinguish a real unmapped address vs a range
that can't be probed.
Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-24-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
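The distinction drawn in the -ERANGE commit above can be modelled in user
space as follows. Everything here is hypothetical: the SKETCH_* address
window, the backing buffer and sketch_probe_read() are invented for the
illustration, and the use of -EFAULT for the "really unmapped" case is an
assumption rather than something the commit text states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical window of addresses that may be probed at all. */
#define SKETCH_PROBE_START	0x1000UL
#define SKETCH_PROBE_END	0x2000UL
/* Only the first bytes of that window are actually backed by memory. */
#define SKETCH_MAPPED_SIZE	64UL

static unsigned char sketch_backing[SKETCH_MAPPED_SIZE];

long sketch_probe_read(void *dst, unsigned long src, size_t size)
{
	unsigned long mapped_end = SKETCH_PROBE_START + SKETCH_MAPPED_SIZE;

	assert(dst != NULL);

	/* The requested range lies outside what may be probed at all. */
	if (src < SKETCH_PROBE_START || src > SKETCH_PROBE_END ||
	    size > SKETCH_PROBE_END - src)
		return -ERANGE;

	/* Probing is permitted, but nothing is mapped there. */
	if (src > mapped_end || size > mapped_end - src)
		return -EFAULT;

	memcpy(dst, sketch_backing + (src - SKETCH_PROBE_START), size);
	return 0;
}
```

A caller can then fall back to a different access method when it sees
-ERANGE, while treating the other error as a genuine bad address.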
Christoph Hellwig [Tue, 9 Jun 2020 04:35:01 +0000 (21:35 -0700)]
x86: use non-set_fs based maccess routines
Provide arch_kernel_read and arch_kernel_write routines to implement the
maccess routines without messing with set_fs and without stac/clac that
opens up access to user space.
[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-20-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:34:58 +0000 (21:34 -0700)]
maccess: allow architectures to provide kernel probing directly
Provide alternative versions of probe_kernel_read, probe_kernel_write
and strncpy_from_kernel_unsafe that don't need set_fs magic, but instead
use arch hooks that are modelled after unsafe_{get,put}_user to access
kernel memory in an exception safe way.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-19-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
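The hook style mentioned in the commit above, modelled after
unsafe_{get,put}_user, reports a fault by jumping to a caller-supplied label
instead of returning a value. The sketch below imitates that shape in user
space. It is not the kernel implementation: sketch_readable() replaces the
hardware exception table with an explicit bounds check on a small buffer, and
all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The only "mapped" memory in this model: 15 characters plus a NUL. */
static unsigned char sketch_mem[16] = "0123456789abcde";

/* Stand-in for a hardware fault: is [p, p + n) inside sketch_mem? */
static int sketch_readable(const void *p, size_t n)
{
	uintptr_t base = (uintptr_t)sketch_mem;
	uintptr_t addr = (uintptr_t)p;

	return n <= sizeof(sketch_mem) && addr >= base &&
	       addr - base <= sizeof(sketch_mem) - n;
}

/* Copy one object of the given type, or jump to err_label on a "fault". */
#define sketch_get_nofault(dst, src, type, err_label)		\
do {								\
	if (!sketch_readable((src), sizeof(type)))		\
		goto err_label;					\
	memcpy((dst), (src), sizeof(type));			\
} while (0)

/* Consume as many objects of the given type as still fit in len. */
#define sketch_read_loop(dst, src, len, type, err_label)	\
	while ((len) >= sizeof(type)) {				\
		sketch_get_nofault(dst, src, type, err_label);	\
		(dst) += sizeof(type);				\
		(src) += sizeof(type);				\
		(len) -= sizeof(type);				\
	}

long sketch_probe_read(void *dst_, const void *src_, size_t size)
{
	unsigned char *dst = dst_;
	const unsigned char *src = src_;

	assert(dst != NULL);

	/* Widest accesses first, then progressively narrower ones. */
	sketch_read_loop(dst, src, size, uint64_t, fault);
	sketch_read_loop(dst, src, size, uint32_t, fault);
	sketch_read_loop(dst, src, size, uint16_t, fault);
	sketch_read_loop(dst, src, size, uint8_t, fault);
	return 0;
fault:
	return -EFAULT;
}
```

The label-based error path keeps the copy loops free of per-access return
value checks, which is the main attraction of this style.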
Christoph Hellwig [Tue, 9 Jun 2020 04:34:55 +0000 (21:34 -0700)]
maccess: move user access routines together
Move kernel access vs user access routines together to ease upcoming
ifdefs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-18-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:34:50 +0000 (21:34 -0700)]
maccess: always use strict semantics for probe_kernel_read
Except for historical confusion in the kprobes/uprobes and bpf tracers,
which has been fixed now, there is no good reason to ever allow user
memory accesses from probe_kernel_read. Switch probe_kernel_read to only
read from kernel memory.
[akpm@linux-foundation.org: update it for "mm, dump_page(): do not crash with invalid mapping pointer"]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-17-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:34:47 +0000 (21:34 -0700)]
maccess: remove strncpy_from_unsafe
All users are gone now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-16-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:34:44 +0000 (21:34 -0700)]
tracing/kprobes: handle mixed kernel/userspace probes better
Instead of using the dangerous probe_kernel_read and strncpy_from_unsafe
helpers, rework probes to try a user probe based on the address if the
architecture has a common address space for kernel and userspace.
[svens@linux.ibm.com:use strncpy_from_kernel_nofault() in fetch_store_string()]
Link: http://lkml.kernel.org/r/20200606181903.49384-1-svens@linux.ibm.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-15-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:34:40 +0000 (21:34 -0700)]
bpf: rework the compat kernel probe handling
Instead of using the dangerous probe_kernel_read and strncpy_from_unsafe
helpers, rework the compat probes to check if an address is a kernel or
userspace one, and then use the low-level kernel or user probe helper
shared by the proper kernel and user probe helpers. This slightly
changes behavior as the compat probe on a user address doesn't check
the lockdown flags, just as the pure user probes do.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-14-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 9 Jun 2020 04:34:37 +0000 (21:34 -0700)]
bpf:bpf_seq_printf(): handle potentially unsafe format string better
Use the proper helper for kernel or userspace addresses based on
TASK_SIZE instead of the dangerous strncpy_from_unsafe function.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Tue, 9 Jun 2020 04:34:33 +0000 (21:34 -0700)]
bpf: handle the compat string in bpf_trace_copy_string better
Use the proper helper for kernel or userspace addresses based on
TASK_SIZE instead of the dangerous strncpy_from_unsafe function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-13-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
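Choosing a helper "based on TASK_SIZE", as the two bpf commits above put it,
can be pictured as a single address comparison. The sketch below is an
assumption-laden model: SKETCH_TASK_SIZE is an invented boundary, the
less-than comparison is a guess at the shape of the check, and
sketch_pick_helper() only returns the name of the helper that would be
called. The two helper names are the ones introduced by the maccess renames
in this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user/kernel boundary; the real TASK_SIZE is per-architecture. */
#define SKETCH_TASK_SIZE	0x0000800000000000ULL

static_assert(SKETCH_TASK_SIZE != 0, "boundary must leave room for user space");

/* Return the name of the string-copy helper suited to this address. */
const char *sketch_pick_helper(uint64_t unsafe_addr)
{
	/* Addresses below the boundary are treated as userspace pointers. */
	if (unsafe_addr < SKETCH_TASK_SIZE)
		return "strncpy_from_user_nofault";

	return "strncpy_from_kernel_nofault";
}
```

This only makes sense where kernel and user addresses do not overlap; on
architectures with separate address spaces a numeric comparison cannot tell
the two apart.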
Christoph Hellwig [Tue, 9 Jun 2020 04:34:30 +0000 (21:34 -0700)]
bpf: factor out a bpf_trace_copy_string helper
Split out a helper to do the fault free access to the string pointer
to get it out of a crazy indentation level.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200521152301.2587579-12-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>