Peter Zijlstra [Mon, 18 Jan 2021 14:12:18 +0000 (15:12 +0100)]
static_call: Pull some static_call declarations to the type headers
Some static call declarations are going to be needed in low-level header
files. Move the necessary material to the dedicated static call types
header to avoid inclusion dependency hell.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org
Dietmar Eggemann [Thu, 28 Jan 2021 13:10:40 +0000 (14:10 +0100)]
sched/core: Update task_prio() function header
The description of the RT offset and the values for 'normal' tasks need
an update. Moreover, there are DL tasks now.
task_prio() has to stay like it is to guarantee compatibility with the
/proc/<pid>/stat priority field:
# cat /proc/<pid>/stat | awk '{ print $18; }'
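The awk one-liner splits on whitespace, so it miscounts fields when the command name (field 2) contains spaces. A minimal sketch of a more careful reader follows, assuming the documented stat layout in which the command name is wrapped in parentheses and priority is field 18. The name stat_priority() is made up for this illustration and is not a kernel or libc function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract field 18 (priority) from one /proc/<pid>/stat line.
 * Returns 0 on success and stores the value in *prio, -1 on a parse error.
 * stat_priority() is a hypothetical helper for illustration only.
 */
int stat_priority(const char *line, long *prio)
{
	const char *p;
	char *end;
	long v;

	assert(line != NULL && prio != NULL);

	/* comm may contain spaces and ')', so start after the last ')'. */
	p = strrchr(line, ')');
	if (!p)
		return -1;
	p++;

	/* The first token after ')' is field 3; skip fields 3..17. */
	for (int field = 3; field < 18; field++) {
		while (*p == ' ')
			p++;
		if (*p == '\0')
			return -1;
		while (*p && *p != ' ')
			p++;
	}
	while (*p == ' ')
		p++;

	v = strtol(p, &end, 10);
	if (end == p)
		return -1;
	*prio = v;
	return 0;
}
```

Starting from the last ')' is the design choice that matters here: everything before it belongs to fields 1 and 2, whatever the command name looks like.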
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-4-dietmar.eggemann@arm.com
Dietmar Eggemann [Thu, 28 Jan 2021 13:10:39 +0000 (14:10 +0100)]
sched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO
The only remaining use of MAX_USER_PRIO (and USER_PRIO) is the
SCALE_PRIO() definition in the PowerPC Cell architecture's Synergistic
Processor Unit (SPU) scheduler. TASK_USER_PRIO isn't used anymore.
Commit fe443ef2ac42 ("[POWERPC] spusched: Dynamic timeslicing for
SCHED_OTHER") copied SCALE_PRIO() from the task scheduler in v2.6.23.
Commit a4ec24b48dde ("sched: tidy up SCHED_RR") removed it from the task
scheduler in v2.6.24.
Commit 3ee237dddcd8 ("sched/prio: Add 3 macros of MAX_NICE, MIN_NICE and
NICE_WIDTH in prio.h") introduced NICE_WIDTH much later.
With:
  MAX_USER_PRIO = USER_PRIO(MAX_PRIO)
                = MAX_PRIO - MAX_RT_PRIO
  MAX_PRIO      = MAX_RT_PRIO + NICE_WIDTH
  MAX_USER_PRIO = MAX_RT_PRIO + NICE_WIDTH - MAX_RT_PRIO
  MAX_USER_PRIO = NICE_WIDTH
MAX_USER_PRIO can be replaced by NICE_WIDTH to be able to remove all the
{*_}USER_PRIO defines.
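The identity can be checked at compile time. The sketch below assumes the macro values used in include/linux/sched/prio.h at the time (MAX_RT_PRIO of 100, nice range -20..19) and reproduces the removed USER_PRIO() helpers only for the comparison. max_user_prio() is a hypothetical wrapper added so the value can be inspected at run time.

```c
#include <assert.h>

/* Assumed values, mirroring include/linux/sched/prio.h around v5.11. */
#define MAX_NICE	19
#define MIN_NICE	-20
#define NICE_WIDTH	(MAX_NICE - MIN_NICE + 1)
#define MAX_RT_PRIO	100
#define MAX_PRIO	(MAX_RT_PRIO + NICE_WIDTH)

/* The helpers this commit removes, kept here only to check the identity. */
#define USER_PRIO(p)	((p) - MAX_RT_PRIO)
#define MAX_USER_PRIO	(USER_PRIO(MAX_PRIO))

/* Fails the build if the two expressions ever diverge. */
_Static_assert(MAX_USER_PRIO == NICE_WIDTH,
	       "MAX_USER_PRIO must equal NICE_WIDTH");

/* Hypothetical accessor, for illustration only. */
int max_user_prio(void)
{
	return MAX_USER_PRIO;
}
```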
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-3-dietmar.eggemann@arm.com
Dietmar Eggemann [Thu, 28 Jan 2021 13:10:38 +0000 (14:10 +0100)]
sched: Remove MAX_USER_RT_PRIO
Commit d46523ea32a7 ("[PATCH] fix MAX_USER_RT_PRIO and MAX_RT_PRIO")
was introduced due to a small time period in which the realtime patch
set was using different values for MAX_USER_RT_PRIO and MAX_RT_PRIO.
This is no longer true, i.e. now MAX_RT_PRIO == MAX_USER_RT_PRIO.
Get rid of MAX_USER_RT_PRIO and make everything use MAX_RT_PRIO
instead.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-2-dietmar.eggemann@arm.com
Dietmar Eggemann [Mon, 1 Feb 2021 09:53:53 +0000 (10:53 +0100)]
sched/topology: Fix sched_domain_topology_level alloc in sched_init_numa()
Commit "sched/topology: Make sched_init_numa() use a set for the
deduplicating sort" allocates 'i + nr_levels (level)' instead of
'i + nr_levels + 1' sched_domain_topology_level.
This led to an Oops (on Arm64 juno with CONFIG_SCHED_DEBUG):
sched_init_domains
build_sched_domains()
__free_domain_allocs()
__sdt_free() {
...
for_each_sd_topology(tl)
...
sd = *per_cpu_ptr(sdd->sd, j); <--
...
}
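The missing '+ 1' matters because the topology array is walked until an entry with a NULL mask is found, so the allocation has to include a zeroed terminator. A minimal user-space sketch of that pattern follows. struct topo_level, alloc_levels() and count_levels() are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_domain_topology_level. */
struct topo_level {
	const char *mask;	/* NULL marks the end of the array */
};

/* Allocate room for n levels plus one zeroed terminator entry. */
struct topo_level *alloc_levels(size_t n)
{
	return calloc(n + 1, sizeof(struct topo_level));
}

/* Walk the array the way a sentinel-terminated iterator would. */
size_t count_levels(const struct topo_level *tl)
{
	size_t n = 0;

	assert(tl != NULL);
	for (; tl->mask; tl++)
		n++;
	return n;
}
```

With only n entries allocated and all of them filled in, the loop in count_levels() would read past the end of the allocation while looking for the terminator, which is the kind of access the trace above points at.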
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Barry Song <song.bao.hua@hisilicon.com>
Link: https://lkml.kernel.org/r/6000e39e-7d28-c360-9cd6-8798fd22a9bf@arm.com
Peter Zijlstra [Wed, 29 Apr 2020 15:07:53 +0000 (17:07 +0200)]
rbtree, timerqueue: Use rb_add_cached()
Reduce rbtree boilerplate by using the new helpers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Peter Zijlstra [Wed, 29 Apr 2020 15:29:58 +0000 (17:29 +0200)]
rbtree, rtmutex: Use rb_add_cached()
Reduce rbtree boilerplate by using the new helpers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Peter Zijlstra [Wed, 29 Apr 2020 15:06:27 +0000 (17:06 +0200)]
rbtree, uprobes: Use rbtree helpers
Reduce rbtree boilerplate by using the new helpers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Peter Zijlstra [Wed, 29 Apr 2020 15:05:15 +0000 (17:05 +0200)]
rbtree, perf: Use new rbtree helpers
Reduce rbtree boilerplate by using the new helpers.
One noteworthy change is unification of the various (partial) compare
functions. We construct a subtree match by forcing the sub-order to
always match, see __group_cmp().
Due to 'const' we had to touch cgroup_id().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Peter Zijlstra [Wed, 29 Apr 2020 15:04:41 +0000 (17:04 +0200)]
rbtree, sched/deadline: Use rb_add_cached()
Reduce rbtree boilerplate by using the new helpers.
Make rb_add_cached() / rb_erase_cached() return a pointer to the
leftmost node to aid in updating additional state.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Peter Zijlstra [Wed, 29 Apr 2020 15:04:12 +0000 (17:04 +0200)]
rbtree, sched/fair: Use rb_add_cached()
Reduce rbtree boilerplate by using the new helper function.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Peter Zijlstra [Wed, 29 Apr 2020 15:03:22 +0000 (17:03 +0200)]
rbtree: Add generic add and find helpers
I've always been bothered by the endless (fragile) boilerplate for
rbtree, and I recently wrote some rbtree helpers for objtool and
figured I should lift them into the kernel and use them more widely.
Provide:
partial-order; less() based:
- rb_add(): add a new entry to the rbtree
- rb_add_cached(): like rb_add(), but for a rb_root_cached
total-order; cmp() based:
- rb_find(): find an entry in an rbtree
- rb_find_add(): find an entry, and add if not found
- rb_find_first(): find the first (leftmost) matching entry
- rb_next_match(): continue from rb_find_first()
- rb_for_each(): iterate a sub-tree using the previous two
Inlining and constant propagation should see the compiler inline the
whole thing, including the various compare functions.
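To illustrate the split between less()-based insertion and cmp()-based lookup, here is a user-space sketch over a plain intrusive binary search tree. It is not the kernel rbtree: there is no rebalancing, and tree_add(), tree_find(), struct node and struct item are hypothetical names chosen for this example. Only the calling pattern (caller supplies the ordering callback, the helper owns the descent) is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Intrusive link embedded in the user's object (hypothetical type). */
struct node {
	struct node *left, *right;
};

struct item {
	int key;
	struct node node;
};

/* Recover the containing item from its embedded node. */
#define item_of(n) \
	((const struct item *)((const char *)(n) - offsetof(struct item, node)))

typedef int (*less_fn)(const struct node *a, const struct node *b);
typedef int (*cmp_fn)(const void *key, const struct node *n);

/* Partial order: only less() is needed; equal keys go to the right. */
void tree_add(struct node **root, struct node *n, less_fn less)
{
	struct node **link = root;

	assert(root != NULL && n != NULL && less != NULL);
	n->left = n->right = NULL;
	while (*link)
		link = less(n, *link) ? &(*link)->left : &(*link)->right;
	*link = n;
}

/* Total order: cmp() returns <0, 0 or >0 for key versus node. */
struct node *tree_find(struct node *root, const void *key, cmp_fn cmp)
{
	assert(cmp != NULL);
	while (root) {
		int c = cmp(key, root);

		if (c < 0)
			root = root->left;
		else if (c > 0)
			root = root->right;
		else
			return root;
	}
	return NULL;
}

int item_less(const struct node *a, const struct node *b)
{
	return item_of(a)->key < item_of(b)->key;
}

int item_cmp(const void *key, const struct node *n)
{
	int k = *(const int *)key, v = item_of(n)->key;

	return (k > v) - (k < v);
}
```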
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Mel Gorman [Wed, 27 Jan 2021 13:52:03 +0000 (13:52 +0000)]
sched/fair: Merge select_idle_core/cpu()
Both select_idle_core() and select_idle_cpu() do a loop over the same
cpumask. Observe that by clearing the already visited CPUs, we can
fold the iteration and iterate a core at a time.
All we need to do is remember any idle CPU we encountered while
scanning for an idle core. This way we'll only iterate every CPU once.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210127135203.19633-5-mgorman@techsingularity.net
Mel Gorman [Mon, 25 Jan 2021 08:59:08 +0000 (08:59 +0000)]
sched/fair: Remove select_idle_smt()
In order to make the next patch more readable, and to quantify the
actual effectiveness of this pass, start by removing it.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210125085909.4600-4-mgorman@techsingularity.net
Ingo Molnar [Wed, 17 Feb 2021 13:04:39 +0000 (14:04 +0100)]
Merge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sun, 14 Feb 2021 22:32:24 +0000 (14:32 -0800)]
Linux 5.11
Linus Torvalds [Sun, 14 Feb 2021 19:50:31 +0000 (11:50 -0800)]
Merge branch 'for-rc8-5.11' of git://git./linux/kernel/git/pavel/linux-leds
Pull LED fix from Pavel Machek:
"One-liner fixing a build problem"
* 'for-rc8-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds:
leds: rt8515: add V4L2_FLASH_LED_CLASS dependency
Linus Torvalds [Sun, 14 Feb 2021 19:36:32 +0000 (11:36 -0800)]
Merge tag 'kbuild-fixes-v5.11-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix CONFIG_TRIM_UNUSED_KSYMS build for ppc64
- Use pkg-config for scripts/sign-file.c CFLAGS
* tag 'kbuild-fixes-v5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
scripts: set proper OpenSSL include dir also for sign-file
sparc: remove wrong comment from arch/sparc/include/asm/Kbuild
kbuild: fix CONFIG_TRIM_UNUSED_KSYMS build for ppc64
Linus Torvalds [Sun, 14 Feb 2021 19:10:55 +0000 (11:10 -0800)]
Merge tag 'x86_urgent_for_v5.11' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"I kinda knew while typing 'I hope this is the last batch of x86/urgent
updates' last week, Murphy was reading too and uttered 'Hold my
beer!'.
So here's more fixes... Thanks Murphy.
Anyway, three more x86/urgent fixes for 5.11 final. We should be
finally ready (famous last words). :-)
- An SGX use after free fix
- A fix for the fix to disable CET instrumentation generation for
kernel code. We forgot 32-bit, which we seem to do very often
nowadays
- A Xen PV fix to irqdomain init ordering"
* tag 'x86_urgent_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pci: Create PCI/MSI irqdomain after x86_init.pci.arch_init()
x86/build: Disable CET instrumentation in the kernel for 32-bit too
x86/sgx: Maintain encl->refcount for each encl->mm_list entry
Arnd Bergmann [Thu, 4 Feb 2021 15:39:44 +0000 (16:39 +0100)]
leds: rt8515: add V4L2_FLASH_LED_CLASS dependency
The leds-rt8515 driver can optionally use the v4l2 flash led class,
but it causes a link error when that class is in a loadable module
and the rt8515 driver itself is built-in:
ld.lld: error: undefined symbol: v4l2_flash_init
>>> referenced by leds-rt8515.c
>>> leds/flash/leds-rt8515.o:(rt8515_probe) in archive
drivers/built-in.a
Adding 'depends on V4L2_FLASH_LED_CLASS' in Kconfig would avoid that,
but it would make it impossible to use the driver without the
v4l2 support.
Add the same dependency that the other users of this class have
instead, which just prevents the broken configuration.
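The usual Kconfig idiom for an optional dependency of this kind is 'depends on FOO || !FOO'; assuming that is the dependency meant here, its effect can be worked out with Kconfig's tristate arithmetic (n = 0, m = 1, y = 2, '!' is 2 - x, '||' is max). The sketch below models only that arithmetic. dep_limit() is a hypothetical name, not part of any Kconfig tooling.

```c
#include <assert.h>

enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig expression semantics: !x is 2 - x, || is max. */
static int tri_not(int x)
{
	return 2 - x;
}

static int tri_or(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Upper bound that "depends on A || !A" places on the dependent symbol,
 * given the value of A.
 */
int dep_limit(int a)
{
	assert(a >= TRI_N && a <= TRI_Y);
	return tri_or(a, tri_not(a));
}
```

The only value of A that lowers the limit is m, which caps the dependent driver at m as well. That rules out exactly the built-in driver with modular class combination, while A = n and A = y leave the driver unrestricted.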
Fixes: e1c6edcbea13 ("leds: rt8515: Add Richtek RT8515 LED driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Rolf Eike Beer [Fri, 12 Feb 2021 07:22:27 +0000 (08:22 +0100)]
scripts: set proper OpenSSL include dir also for sign-file
Fixes: 2cea4a7a1885 ("scripts: use pkg-config to locate libcrypto")
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Cc: stable@vger.kernel.org # 5.6.x
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Thu, 28 Jan 2021 00:51:03 +0000 (09:51 +0900)]
sparc: remove wrong comment from arch/sparc/include/asm/Kbuild
These are NOT exported to userspace.
The headers listed in arch/sparc/include/uapi/asm/Kbuild are exported.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sat, 13 Feb 2021 22:25:22 +0000 (14:25 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One small fix for the Allwinner clk driver so that display clks figure
out the correct rate to use.
This fixes displays running 4k@60Hz and some other resolutions that
haven't been exercised and fully understood until now"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi-ng: mp: fix parent rate change flag check
Linus Torvalds [Sat, 13 Feb 2021 22:14:47 +0000 (14:14 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"One fix for scsi_debug that fixes a memory leak on module removal"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_debug: Fix a memory leak
Linus Torvalds [Sat, 13 Feb 2021 20:25:42 +0000 (12:25 -0800)]
Merge branch 'for-5.11-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
"Two cgroup fixes:
- fix a NULL deref when trying to poll PSI in the root cgroup
- fix confusing controller parsing corner case when mounting cgroup
v1 hierarchies
And doc / maintainer file updates"
* 'for-5.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: update PSI file description in docs
cgroup: fix psi monitor for root cgroup
MAINTAINERS: Update my email address
MAINTAINERS: Remove stale URLs for cpuset
cgroup-v1: add disabled controller check in cgroup1_parse_param()
Linus Torvalds [Sat, 13 Feb 2021 20:04:18 +0000 (12:04 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"6 patches.
Subsystems affected by this patch series: mm/pagemap, scripts,
MAINTAINERS, and h8300"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
h8300: fix PREEMPTION build, TI_PRE_COUNT undefined
MAINTAINERS: add Andrey Konovalov to KASAN reviewers
MAINTAINERS: update Andrey Konovalov's email address
MAINTAINERS: update KASAN file list
scripts/recordmcount.pl: support big endian for ARCH sh
m68k: make __pfn_to_phys() and __phys_to_pfn() available for !MMU
Linus Torvalds [Sat, 13 Feb 2021 19:59:10 +0000 (11:59 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"One more I2C driver bugfix"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: stm32f7: fix configuration of the digital filter
Linus Torvalds [Sat, 13 Feb 2021 19:55:29 +0000 (11:55 -0800)]
Merge tag 'for-5.11-rc7-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"A regression fix caused by a refactoring in 5.11.
A corrupted superblock wouldn't be detected by checksum verification
due to wrongly placed initialization of the checksum length, thus
making memcmp always work"
* tag 'for-5.11-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: initialize fs_info::csum_size earlier in open_ctree
Randy Dunlap [Sat, 13 Feb 2021 04:52:54 +0000 (20:52 -0800)]
h8300: fix PREEMPTION build, TI_PRE_COUNT undefined
Fix a build error for undefined 'TI_PRE_COUNT' by adding it to
asm-offsets.c.
h8300-linux-ld: arch/h8300/kernel/entry.o: in function `resume_kernel': (.text+0x29a): undefined reference to `TI_PRE_COUNT'
Link: https://lkml.kernel.org/r/20210212021650.22740-1-rdunlap@infradead.org
Fixes: df2078b8daa7 ("h8300: Low level entry")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Konovalov [Sat, 13 Feb 2021 04:52:50 +0000 (20:52 -0800)]
MAINTAINERS: add Andrey Konovalov to KASAN reviewers
Add my personal email address to KASAN reviewers list.
Link: https://lkml.kernel.org/r/c1ce89a7aae0e2d6852249c280b1eb59aeac30c0.1613150186.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Konovalov [Sat, 13 Feb 2021 04:52:47 +0000 (20:52 -0800)]
MAINTAINERS: update Andrey Konovalov's email address
Use my personal email address.
Link: https://lkml.kernel.org/r/b0ec98dabbc12336c162788f5ccde97045a0d65e.1613150186.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Konovalov [Sat, 13 Feb 2021 04:52:44 +0000 (20:52 -0800)]
MAINTAINERS: update KASAN file list
Account for the following files:
- lib/Kconfig.kasan
- lib/test_kasan_module.c
- arch/arm64/include/asm/mte-kasan.h
Link: https://lkml.kernel.org/r/7f9771d97b34d396bfdc4e288ad93486bb865a06.1613150186.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rong Chen [Sat, 13 Feb 2021 04:52:41 +0000 (20:52 -0800)]
scripts/recordmcount.pl: support big endian for ARCH sh
The kernel test robot reported the following issue:
CC [M] drivers/soc/litex/litex_soc_ctrl.o
sh4-linux-objcopy: Unable to change endianness of input file(s)
sh4-linux-ld: cannot find drivers/soc/litex/.tmp_gl_litex_soc_ctrl.o: No such file or directory
sh4-linux-objcopy: 'drivers/soc/litex/.tmp_mx_litex_soc_ctrl.o': No such file
The problem is that the format of the input file is elf32-shbig-linux, but
sh4-linux-objcopy wants to output a file whose format is elf32-sh-linux:
$ sh4-linux-objdump -d drivers/soc/litex/litex_soc_ctrl.o | grep format
drivers/soc/litex/litex_soc_ctrl.o: file format elf32-shbig-linux
Link: https://lkml.kernel.org/r/20210210150435.2171567-1-rong.a.chen@intel.com
Link: https://lore.kernel.org/linux-mm/202101261118.GbbYSlHu-lkp@intel.com
Signed-off-by: Rong Chen <rong.a.chen@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Sat, 13 Feb 2021 04:52:38 +0000 (20:52 -0800)]
m68k: make __pfn_to_phys() and __phys_to_pfn() available for !MMU
Recent changes that obsoleted DISCONTIGMEM on m68k switched the MMU
variant to use generic definitions of __pfn_to_phys() and __phys_to_pfn(),
but missed the !MMU variant which caused a build failure:
drivers/media/common/videobuf2/videobuf2-dma-contig.c: In function 'vb2_dc_get_userptr':
drivers/media/common/videobuf2/videobuf2-dma-contig.c:509:5: error: implicit declaration of function '__pfn_to_phys' [-Werror=implicit-function-declaration]
509 | __pfn_to_phys(nums[0]), size, buf->dma_dir, 0);
| ^~~~~~~~~~~~~
cc1: some warnings being treated as errors
Enable __pfn_to_phys() and __phys_to_pfn() on !MMU builds.
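For reference, the two conversions are plain shifts by the page shift. The sketch below is a user-space model assuming 4 KiB pages (PAGE_SHIFT of 12); the real value depends on the platform configuration, and pfn_to_phys() / phys_to_pfn() are illustrative names rather than the kernel macros themselves.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Page frame number to physical address of the start of that page. */
uint64_t pfn_to_phys(uint64_t pfn)
{
	assert(pfn <= (UINT64_MAX >> PAGE_SHIFT));	/* shift must not overflow */
	return pfn << PAGE_SHIFT;
}

/* Physical address to the number of the page frame containing it. */
uint64_t phys_to_pfn(uint64_t paddr)
{
	return paddr >> PAGE_SHIFT;
}
```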
Link: https://lkml.kernel.org/r/20210211232202.GS299309@linux.ibm.com
Fixes: 4bfc848e0981 ("m68k/mm: enable use of generic memory_model.h for !DISCONTIGMEM")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 12 Feb 2021 22:45:39 +0000 (14:45 -0800)]
Merge tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel
Pull cifs fixes from Steve French:
"Four small smb3 fixes to the new mount API (including a particularly
important one for DFS links).
These were found in testing this week of additional DFS scenarios, and
a user testing of an apache container problem"
* tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel:
cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
cifs: In the new mount api we get the full devname as source=
cifs: do not disable noperm if multiuser mount option is not provided
cifs: fix dfs-links
Linus Torvalds [Fri, 12 Feb 2021 19:48:02 +0000 (11:48 -0800)]
Merge tag 'io_uring-5.11-2021-02-12' of git://git.kernel.dk/linux-block
Pull io_uring fix from Jens Axboe:
"Revert of a patch from this release that caused a regression"
* tag 'io_uring-5.11-2021-02-12' of git://git.kernel.dk/linux-block:
Revert "io_uring: don't take fs for recvmsg/sendmsg"
Linus Torvalds [Fri, 12 Feb 2021 19:29:06 +0000 (11:29 -0800)]
Merge tag 'drm-fixes-2021-02-12' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular fixes for final, there is a ttm regression fix, dp-mst fix,
one amdgpu revert, two i915 fixes, and some misc fixes for sun4i,
xlnx, and vc4.
All pretty quiet and don't think we have any known outstanding
regressions.
ttm:
- page pool regression fix.
dp_mst:
- don't report un-attached ports as connected
amdgpu:
- blank screen fix
i915:
- ensure Type-C FIA is powered when initializing
- fix overlay frontbuffer tracking
sun4i:
- tcon1 sync polarity fix
- always set HDMI clock rate
- fix H6 HDMI PHY config
- fix H6 max frequency
vc4:
- fix buffer overflow
xlnx:
- fix memory leak"
* tag 'drm-fixes-2021-02-12' of git://anongit.freedesktop.org/drm/drm:
drm/ttm: make sure pool pages are cleared
drm/sun4i: dw-hdmi: Fix max. frequency for H6
drm/sun4i: Fix H6 HDMI PHY configuration
drm/sun4i: dw-hdmi: always set clock rate
drm/sun4i: tcon: set sync polarity for tcon1 channel
drm/i915: Fix overlay frontbuffer tracking
Revert "drm/amd/display: Update NV1x SR latency values"
drm/i915/tgl+: Make sure TypeC FIA is powered up when initializing it
drm/dp_mst: Don't report ports connected if nothing is attached to them
drm/xlnx: fix kmemleak by sending vblank_event in atomic_disable
drm/vc4: hvs: Fix buffer overflow with the dlist handling
Linus Torvalds [Fri, 12 Feb 2021 19:16:17 +0000 (11:16 -0800)]
Merge tag 'trace-v5.11-rc7-2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Fix buffer overflow in trace event filter.
It was reported that if a trace event was larger than a page and was
filtered, that it caused memory corruption. The reason is that
filtered events first go into a buffer to test the filter before being
written into the ring buffer. Unfortunately, this write did not check
the size"
* tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Check length before giving out the filter buffer
Linus Torvalds [Fri, 12 Feb 2021 19:12:58 +0000 (11:12 -0800)]
Merge tag 'for-linus-5.11-rc8-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single fix for an issue introduced this development cycle: when
running as a Xen guest on Arm systems the kernel will hang during
boot"
* tag 'for-linus-5.11-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
arm/xen: Don't probe xenbus as part of an early initcall
Linus Torvalds [Fri, 12 Feb 2021 19:07:29 +0000 (11:07 -0800)]
Merge tag 'riscv-for-linus-5.11-rc8' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fix from Palmer Dabbelt:
"A single fix this week: the removal of the GPIO reset method for the
Ethernet phy on the HiFive Unleashed.
This returns to relying on the bootloader's phy reset sequence, which
we'll have to continue doing until we can sort out how to get the
Linux phy driver to perform the special reset dance required for this
phy"
* tag 'riscv-for-linus-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
Revert "dts: phy: add GPIO number and active state used for phy reset"
Linus Torvalds [Fri, 12 Feb 2021 19:03:30 +0000 (11:03 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix PTRACE_PEEKMTETAGS access to an mmapped region before the first
write"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page
Catalin Marinas [Wed, 10 Feb 2021 18:03:16 +0000 (18:03 +0000)]
arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page
The ptrace(PTRACE_PEEKMTETAGS) implementation checks whether the user
page has valid tags (mapped with PROT_MTE) by testing the PG_mte_tagged
page flag. If this bit is cleared, ptrace(PTRACE_PEEKMTETAGS) returns
-EIO.
A newly created (PROT_MTE) mapping points to the zero page which had its
tags zeroed during cpu_enable_mte(). If there were no prior writes to
this mapping, ptrace(PTRACE_PEEKMTETAGS) fails with -EIO since the zero
page does not have the PG_mte_tagged flag set.
Set PG_mte_tagged on the zero page when its tags are cleared during
boot. In addition, to avoid ptrace(PTRACE_PEEKMTETAGS) succeeding on
!PROT_MTE mappings pointing to the zero page, change the
__access_remote_tags() check to (vm_flags & VM_MTE) instead of
PG_mte_tagged.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 34bfeea4a9e9 ("arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE")
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Will Deacon <will@kernel.org>
Reported-by: Luis Machado <luis.machado@linaro.org>
Tested-by: Luis Machado <luis.machado@linaro.org>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20210210180316.23654-1-catalin.marinas@arm.com
Su Yue [Thu, 11 Feb 2021 08:38:28 +0000 (16:38 +0800)]
btrfs: initialize fs_info::csum_size earlier in open_ctree
User reported that btrfs-progs misc-tests/028-superblock-recover fails:
[TEST/misc] 028-superblock-recover
unexpected success: mounted fs with corrupted superblock
test failed for case 028-superblock-recover
The test case expects the mount of a broken image with a bad superblock
to be rejected. However, the test image passed the superblock csum check
and was mounted successfully.
Commit 55fc29bed8dd ("btrfs: use cached value of fs_info::csum_size
everywhere") replaces all calls to btrfs_super_csum_size by
fs_info::csum_size. These include a call made before fs_info->csum_size
is initialized, so btrfs_check_super_csum() passes because memcmp()
with len 0 always returns 0.
Fix it by caching csum size in btrfs_fs_info::csum_size once we know the
csum type in superblock is valid in open_ctree().
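The failure mode is easy to reproduce outside the kernel: a comparison over zero bytes cannot find a difference. The sketch below uses a hypothetical csum_matches() helper, not the btrfs function, to show that a zero length makes any two checksums "match".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when the first csum_size bytes of the
 * stored and computed checksums are identical, 0 otherwise.
 */
int csum_matches(const unsigned char *stored, const unsigned char *computed,
		 size_t csum_size)
{
	assert(stored != NULL && computed != NULL);
	/* With csum_size == 0, memcmp() compares nothing and returns 0. */
	return memcmp(stored, computed, csum_size) == 0;
}
```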
Link: https://github.com/kdave/btrfs-progs/issues/250
Fixes: 55fc29bed8dd ("btrfs: use cached value of fs_info::csum_size everywhere")
Signed-off-by: Su Yue <l@damenly.su>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Alain Volmat [Fri, 5 Feb 2021 08:51:40 +0000 (09:51 +0100)]
i2c: stm32f7: fix configuration of the digital filter
The digital-filter-related computations are present in the driver;
however, the programming of the filter within the IP is missing.
The maximum value for the DNF is wrong and should be 15 instead of 16.
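A maximum of 15 is what a 4-bit register field can hold; 16 would need a fifth bit and would be truncated to 0 when masked into the field. The sketch below assumes a 4-bit DNF field at bit offset 8 of a 32-bit control register. The offset, the macro names and set_dnf() are assumptions made for illustration, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DNF_SHIFT	8	/* assumed field offset */
#define DNF_WIDTH	4	/* 4 bits, hence a maximum of 15 */
#define DNF_MAX		((1u << DNF_WIDTH) - 1)
#define DNF_MASK	(DNF_MAX << DNF_SHIFT)

/* Program the filter field, clamping the request to what the field holds. */
uint32_t set_dnf(uint32_t cr1, unsigned int dnf)
{
	if (dnf > DNF_MAX)
		dnf = DNF_MAX;
	return (cr1 & ~DNF_MASK) | ((uint32_t)dnf << DNF_SHIFT);
}
```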
Fixes: aeb068c57214 ("i2c: i2c-stm32f7: add driver")
Signed-off-by: Alain Volmat <alain.volmat@foss.st.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Dave Airlie [Fri, 12 Feb 2021 03:38:31 +0000 (13:38 +1000)]
Merge branch 'drm-misc-fixes' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
(I've pulled from a non-tag to get the ttm regression fix)
drm-misc-fixes-2021-02-10:
* dp_mst: Don't report un-attached ports as connected
* sun4i: tcon1 sync polarity fix; Always set HDMI clock rate; Fix
H6 HDMI PHY config; Fix H6 max frequency
* vc4: Fix buffer overflow
* xlnx: Fix memory leak
* ttm: page pool regression fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YCPo6g3gDxD3P//h@linux-uq9g
Jernej Skrabec [Tue, 9 Feb 2021 17:58:56 +0000 (18:58 +0100)]
clk: sunxi-ng: mp: fix parent rate change flag check
The CLK_SET_RATE_PARENT flag is checked on the parent clock instead of
the current one. Fix that.
Fixes: 3f790433c3cb ("clk: sunxi-ng: Adjust MP clock parent rate when allowed")
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Link: https://lore.kernel.org/r/20210209175900.7092-2-jernej.skrabec@siol.net
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Dave Airlie [Fri, 12 Feb 2021 00:16:58 +0000 (10:16 +1000)]
Merge tag 'drm-intel-fixes-2021-02-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.11 final:
- Ensure Type-C FIA is powered when initializing
- Fix overlay frontbuffer tracking
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87r1lnc78t.fsf@intel.com
Dave Airlie [Thu, 11 Feb 2021 23:51:15 +0000 (09:51 +1000)]
Merge tag 'amd-drm-fixes-5.11-2021-02-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.11-2021-02-10:
amdgpu:
- Blank screen fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210210223508.4428-1-alexander.deucher@amd.com
Linus Torvalds [Thu, 11 Feb 2021 23:41:07 +0000 (15:41 -0800)]
Merge tag 'powerpc-5.11-8' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"One fix for a regression seen in io_uring, introduced by our support
for KUAP (Kernel User Access Prevention) with the Hash MMU.
Thanks to Aneesh Kumar K.V, and Zorro Lang"
* tag 'powerpc-5.11-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/kuap: Allow kernel thread to access userspace after kthread_use_mm
Steven Rostedt (VMware) [Wed, 10 Feb 2021 16:53:22 +0000 (11:53 -0500)]
tracing: Check length before giving out the filter buffer
When filters are used by trace events, a page is allocated on each CPU and
used to copy the trace event fields to this page before writing to the ring
buffer. The reason to use the filter and not write directly into the ring
buffer is because a filter may discard the event and there's more overhead
on discarding from the ring buffer than the extra copy.
The problem here is that there is no check against the size being allocated
when using this page. If an event asks for more than a page size while being
filtered, it will get only a page, leading to the caller writing more than
what was allocated.
Check the length of the request, and if it is more than PAGE_SIZE minus the
header, fall back to allocating from the ring buffer directly. The ring
buffer may reject the event if it's too big anyway, but it won't overflow.
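The decision described above reduces to a single bounds check. The following sketch models it with assumed constants: a 4096-byte page and a 16-byte header are placeholders, and pick_buffer() with its enum is a hypothetical name, not the tracing code.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE	4096u	/* assumed page size */
#define HEADER_SIZE	16u	/* hypothetical per-event header size */

_Static_assert(HEADER_SIZE < PAGE_SIZE, "header must fit in a page");

enum dest { USE_TEMP_PAGE, USE_RING_BUFFER };

/*
 * Use the per-CPU temp page only when filtering is active and the event
 * fits in what remains of the page after the header; otherwise reserve
 * space in the ring buffer directly.
 */
enum dest pick_buffer(size_t len, int filtering)
{
	if (filtering && len <= PAGE_SIZE - HEADER_SIZE)
		return USE_TEMP_PAGE;
	return USE_RING_BUFFER;
}
```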
Link: https://lore.kernel.org/ath10k/1612839593-2308-1-git-send-email-wgong@codeaurora.org/
Cc: stable@vger.kernel.org
Fixes: 0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")
Reported-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Linus Torvalds [Thu, 11 Feb 2021 19:21:08 +0000 (11:21 -0800)]
Merge tag 'gpio-fixes-for-v5.11' of git://git./linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"This is hopefully the last batch of fixes for this release cycle. We
have a minor fix for a Kconfig regression as well as fixes for older
bugs in gpio-ep93xx:
- don't build gpio-mxs unconditionally with COMPILE_TEST enabled
- fix two problems with interrupt handling in gpio-ep93xx"
* tag 'gpio-fixes-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: ep93xx: Fix single irqchip with multi gpiochips
gpio: ep93xx: fix BUG_ON port F usage
gpio: mxs: GPIO_MXS should not default to y unconditionally
Masahiro Yamada [Thu, 11 Feb 2021 06:14:16 +0000 (15:14 +0900)]
kbuild: fix CONFIG_TRIM_UNUSED_KSYMS build for ppc64
Stephen Rothwell reported a build error on ppc64 when
CONFIG_TRIM_UNUSED_KSYMS is enabled.
Jessica Yu pointed out the cause of the error with the reference to the
ppc64 ELF ABI:
"Symbol names with a dot (.) prefix are reserved for holding entry
point addresses. The value of a symbol named ".FN", if it exists,
is the entry point of the function "FN"."
As it turned out, CONFIG_TRIM_UNUSED_KSYMS has never worked for ppc64,
but this issue has gone unnoticed until recently because this option
depends on !UNUSED_SYMBOLS hence is disabled by all{mod,yes}config.
(Then, it was uncovered by another patch removing UNUSED_SYMBOLS.)
Removing the dot prefix in scripts/gen_autoksyms.sh fixes the issue.
Please note it must be done before 'sort -u' because modules have
both ._mcount and _mcount undefined when CONFIG_FUNCTION_TRACER=y.
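Why the ordering matters can be shown with a small Python sketch. It only models the idea; the real change is in the shell pipeline of scripts/gen_autoksyms.sh, and the function names and the sample symbol list here are made up for illustration.

```python
def strip_dot_then_dedupe(symbols):
    """Drop a leading '.' from each name first, then sort and de-duplicate."""
    stripped = [s[1:] if s.startswith(".") else s for s in symbols]
    return sorted(set(stripped))


def dedupe_then_strip(symbols):
    """Sort and de-duplicate first, then drop the leading '.' (wrong order)."""
    unique = sorted(set(symbols))
    return [s[1:] if s.startswith(".") else s for s in unique]


# Hypothetical list of undefined symbols in a ppc64 module
undefined = ["._mcount", "_mcount", "printk"]
```

In this sketch strip_dot_then_dedupe(undefined) lists _mcount once, whereas dedupe_then_strip(undefined) lists it twice, because ._mcount and _mcount only become identical after the prefix is removed.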
Link: https://lore.kernel.org/lkml/20210209210843.3af66662@canb.auug.org.au/
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Shyam Prasad N [Thu, 11 Feb 2021 11:26:54 +0000 (03:26 -0800)]
cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.
While debugging another issue today, Steve and I noticed that if a
subdir for a file share is already mounted on the client, any new
mount of any other subdir (or the file share root) of the same share
results in sharing the cifs superblock, which can, e.g., result in an
incorrect device name.
While setting prefix path for the root of a cifs_sb,
CIFS_MOUNT_USE_PREFIX_PATH flag should also be set.
Without it, prepath is not even considered in some places,
and output of "mount" and various /proc/<>/*mount* related
options can be missing part of the device name.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Ronnie Sahlberg [Thu, 11 Feb 2021 06:06:16 +0000 (16:06 +1000)]
cifs: In the new mount api we get the full devname as source=
so we no longer need to handle or parse the UNC= and prefixpath=
options that mount.cifs is generating.
This also fixes a bug in the mount command option where the devname
would be truncated into just //server/share because we were looking
at the truncated UNC value and not the full path.
I.e. in the mount command output the device //server/share/path
would show up as just //server/share
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Christian König [Wed, 10 Feb 2021 13:24:27 +0000 (14:24 +0100)]
drm/ttm: make sure pool pages are cleared
The old implementation wasn't consistent on this.
But it looks like we depend on this so better bring it back.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210210160549.1462-1-christian.koenig@amd.com
Julien Grall [Wed, 10 Feb 2021 17:06:54 +0000 (17:06 +0000)]
arm/xen: Don't probe xenbus as part of an early initcall
After commit 3499ba8198cad ("xen: Fix event channel callback via
INTX/GSI"), xenbus_probe() will be called too early on Arm. This will
result in a guest hang during boot.
If the hang wasn't there, we would have ended up calling
xenbus_probe() twice (the second time is in xenbus_probe_initcall()).
We don't need to initialize xenbus_probe() early for Arm guests.
Therefore, the call in xen_guest_init() is now removed.
After this change, there is no more external caller for xenbus_probe().
So the function is turned into a static one. Interestingly there were two
prototypes for it.
Cc: stable@vger.kernel.org
Fixes: 3499ba8198cad ("xen: Fix event channel callback via INTX/GSI")
Reported-by: Ian Jackson <iwj@xenproject.org>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20210210170654.5377-1-julien@xen.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Palmer Dabbelt [Fri, 5 Feb 2021 03:41:12 +0000 (19:41 -0800)]
Revert "dts: phy: add GPIO number and active state used for phy reset"
VSC8541 phys need a special reset sequence, which the driver doesn't
currently support. As a result, enabling the reset via GPIO essentially
guarantees that the device won't work correctly. We've been relying on
bootloaders to reset the device for years, with this revert we'll go
back to doing so until we can sort out how to get the reset sequence
into the kernel.
This reverts commit a0fa9d727043da2238432471e85de0bdb8a8df65.
Fixes: a0fa9d727043 ("dts: phy: add GPIO number and active state used for phy reset")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Thomas Gleixner [Wed, 10 Feb 2021 15:27:41 +0000 (16:27 +0100)]
x86/pci: Create PCI/MSI irqdomain after x86_init.pci.arch_init()
Invoking x86_init.irqs.create_pci_msi_domain() before
x86_init.pci.arch_init() breaks XEN PV.
The XEN_PV specific pci.arch_init() function overrides the default
create_pci_msi_domain() which is obviously too late.
As a consequence the XEN PV PCI/MSI allocation goes through the native
path which runs out of vectors and causes malfunction.
Invoke it after x86_init.pci.arch_init().
Fixes: 6b15ffa07dc3 ("x86/irq: Initialize PCI/MSI domain at PCI init time")
Reported-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87pn18djte.fsf@nanos.tec.linutronix.de
Linus Torvalds [Wed, 10 Feb 2021 20:03:35 +0000 (12:03 -0800)]
Merge tag 'pm-5.11-rc8' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Address a performance regression related to scale-invariance on x86
that may prevent turbo CPU frequencies from being used in certain
workloads on systems using acpi-cpufreq as the CPU performance scaling
driver and schedutil as the scaling governor"
* tag 'pm-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not there
cpufreq: ACPI: Extend frequency tables to cover boost frequencies
Linus Torvalds [Wed, 10 Feb 2021 19:58:21 +0000 (11:58 -0800)]
Merge tag 'acpi-5.11-rc8' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert a problematic ACPICA commit that changed the code to attempt to
update memory regions which may be read-only on some systems (Ard
Biesheuvel)"
* tag 'acpi-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPICA: Interpreter: fix memory leak by using existing buffer"
Linus Torvalds [Wed, 10 Feb 2021 19:51:25 +0000 (11:51 -0800)]
Merge tag 'dmaengine-fix2-5.11' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"Some late fixes for dmaengine:
Core:
- fix channel device_node deletion
Driver fixes:
- dw: revert of runtime pm enabling
- idxd: device state fix, interrupt completion and list corruption
- ti: resource leak"
* tag 'dmaengine-fix2-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine dw: Revert "dmaengine: dw: Enable runtime PM"
dmaengine: idxd: check device state before issue command
dmaengine: ti: k3-udma: Fix a resource leak in an error handling path
dmaengine: move channel device_node deletion to driver
dmaengine: idxd: fix misc interrupt completion
dmaengine: idxd: Fix list corruption in description completion
Jens Axboe [Wed, 10 Feb 2021 19:37:58 +0000 (12:37 -0700)]
Revert "io_uring: don't take fs for recvmsg/sendmsg"
This reverts commit 10cad2c40dcb04bb46b2bf399e00ca5ea93d36b0.
Petr reports that with this commit in place, io_uring fails the chroot
test (CVE-2020-29373). We do need to retain ->fs for send/recvmsg, so
revert this commit.
Reported-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Wed, 10 Feb 2021 19:33:39 +0000 (11:33 -0800)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
"Another pile of networking fixes:
1) ath9k build error fix from Arnd Bergmann.
2) dma memory leak fix in mediatek driver from Lorenzo Bianconi.
3) bpf int3 kprobe fix from Alexei Starovoitov.
4) bpf stackmap integer overflow fix from Bui Quang Minh.
5) Add usb device ids for Cinterion MV31 to qmi_wwan driver, from
Christoph Schemmel.
6) Don't update deleted entry in xt_recent netfilter module, from
Jozsef Kadlecsik.
7) Use after free in nftables, fix from Pablo Neira Ayuso.
8) Header checksum fix in flowtable from Sven Auhagen.
9) Validate user controlled length in qrtr code, from Sabyrzhan
Tasbolatov.
10) Fix race in xen/netback, from Juergen Gross.
11) New device ID in cxgb4, from Raju Rangoju.
12) Fix ring locking in rxrpc release call, from David Howells.
13) Don't return LAPB error codes from x25_open(), from Xie He.
14) Missing error returns in gsi_channel_setup() from Alex Elder.
15) Get skb_copy_and_csum_datagram working properly with odd segment
sizes, from Willem de Bruijn.
16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean.
17) Do teardown on probe failure in DSA, from Vladimir Oltean.
18) Fix compilation failures of txtimestamp selftest, from Vadim
Fedorenko.
19) Limit rx per-napi gro queue size to fix latency regression, from
Eric Dumazet.
20) dpaa_eth xdp fixes from Camelia Groza.
21) Missing txq mode update when switching CBS off, in stmmac driver,
from Mohammad Athari Bin Ismail.
22) Failover pending logic fix in ibmvnic driver, from Sukadev
Bhattiprolu.
23) Null deref fix in vmw_vsock, from Norbert Slusarek.
24) Missing verdict update in xdp paths of ena driver, from Shay
Agroskin.
25) seq_file iteration fix in sctp from Neil Brown.
26) bpf 32-bit src register truncation fix on div/mod, from Daniel
Borkmann.
27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann.
28) Fix locking in vsock_shutdown(), from Stefano Garzarella.
29) Various missing index bound checks in hns3 driver, from Yufeng Mo.
30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from
Vladimir Oltean.
31) Don't mix up stp and mrp port states in bridge layer, from Horatiu
Vultur.
32) Fix locking during netif_tx_disable(), from Edwin Peer"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
bpf: Fix 32 bit src register truncation on div/mod
bpf: Fix verifier jmp32 pruning decision logic
bpf: Fix verifier jsgt branch analysis on max bound
vsock: fix locking in vsock_shutdown()
net: hns3: add a check for index in hclge_get_rss_key()
net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
net: hns3: add a check for queue_id in hclge_reset_vf_queue()
net: dsa: felix: implement port flushing on .phylink_mac_link_down
switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT
bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state
net: watchdog: hold device global xmit lock during tx disable
netfilter: nftables: relax check for stateful expressions in set definition
netfilter: conntrack: skip identical origin tuple in same zone only
vsock/virtio: update credit only if socket is not closed
net: fix iteration for sctp transport seq_files
net: ena: Update XDP verdict upon failure
net/vmw_vsock: improve locking in vsock_connect_timeout()
net/vmw_vsock: fix NULL pointer dereference
ibmvnic: Clear failover_pending if unable to schedule
net: stmmac: set TxQ mode back to DCB after disabling CBS
...
Linus Torvalds [Wed, 10 Feb 2021 19:22:41 +0000 (11:22 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"14 patches.
Subsystems affected by this patch series: mm (kasan, mremap, tmpfs,
selftests, memcg, and slub), MAINTAINERS, squashfs, nilfs2, and
firmware"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
nilfs2: make splice write available again
mm, slub: better heuristic for number of cpus when calculating slab order
Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
MAINTAINERS: update Andrey Ryabinin's email address
selftests/vm: rename file run_vmtests to run_vmtests.sh
tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
mm/mremap: fix BUILD_BUG_ON() error in get_extent
firmware_loader: align .builtin_fw to 8
kasan: fix stack traces dependency for HW_TAGS
squashfs: add more sanity checks in xattr id lookup
squashfs: add more sanity checks in inode lookup
squashfs: add more sanity checks in id lookup
squashfs: avoid out of bounds writes in decompressors
Joachim Henke [Tue, 9 Feb 2021 21:42:36 +0000 (13:42 -0800)]
nilfs2: make splice write available again
Since 5.10, splice() or sendfile() to NILFS2 return EINVAL. This was
caused by commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops").
This patch initializes the splice_write field in file_operations, like
most file systems do, to restore the functionality.
Link: https://lkml.kernel.org/r/1612784101-14353-1-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Joachim Henke <joachim.henke@t-systems.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org> [5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Tue, 9 Feb 2021 21:42:32 +0000 (13:42 -0800)]
mm, slub: better heuristic for number of cpus when calculating slab order
When creating a new kmem cache, SLUB determines how large the slab pages
will be based on a number of inputs, including the number of CPUs in the
system. Larger slab pages mean that more objects can be allocated/freed
from per-cpu slabs before accessing shared structures, but also
potentially more memory can be wasted due to low slab usage and
fragmentation. The rough idea of using number of CPUs is that larger
systems will be more likely to benefit from reduced contention, and also
should have enough memory to spare.
Number of CPUs used to be determined as nr_cpu_ids, which is number of
possible cpus, but on some systems many will never be onlined, thus
commit 045ab8c9487b ("mm/slub: let number of online CPUs determine the
slub page order") changed it to num_online_cpus(). However, for kmem
caches created early before CPUs are onlined, this may lead to
permanently low slab page sizes.
Vincent reports a regression [1] of hackbench on arm64 systems:
"I'm facing significant performances regression on a large arm64
server system (224 CPUs). Regressions is also present on small arm64
system (8 CPUs) but in a far smaller order of magnitude
On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
v5.11-rc4 : 9.135sec (+/- 0.45%)
v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
v5.10: 3.136sec (+/- 0.40%)"
Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
page allocator contention:
"i.e. the patch incurs a 7% to 32% performance penalty. This bisected
cleanly yesterday when I was looking for the regression and then
found the thread.
Numerous caches change size. For example, kmalloc-512 goes from
order-0 (vanilla) to order-2 with the revert.
So mostly this is down to the number of times SLUB calls into the
page allocator which only caches order-0 pages on a per-cpu basis"
Clearly num_online_cpus() doesn't work too early in bootup. We could
change the order dynamically in a memory hotplug callback, but runtime
order changing for existing kmem caches has been already shown as
dangerous, and removed in 32a6f409b693 ("mm, slub: remove runtime
allocation order changes").
It could be resurrected in a safe manner with some effort, but to fix
the regression we need something simpler.
We could use num_present_cpus() that should be the number of physically
present CPUs even before they are onlined. That would work for PowerPC
[3], which triggered the original commit, but that still doesn't work on
arm64 [4] as explained in [5].
So this patch tries to determine the best available value without
specific arch knowledge.
- num_present_cpus() if the number is larger than 1, as that means the
arch is likely setting it properly
- nr_cpu_ids otherwise
This should fix the reported regressions while also keeping the effect
of 045ab8c9487b for PowerPC systems. It's possible there are
configurations where num_present_cpus() is 1 during boot while
nr_cpu_ids is at the same time bloated, so these (if they exist) would
keep the large orders based on nr_cpu_ids as was before 045ab8c9487b.
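The selection rule described above can be sketched as a tiny Python function. This models only the choice of the CPU count, not how SLUB turns it into a slab order; the function name and the sample numbers are illustrative assumptions.

```python
def cpus_for_slab_order(num_present_cpus, nr_cpu_ids):
    """Pick the CPU count used as input for the slab order heuristic."""
    # Trust the present-CPU count only when it is larger than 1, which
    # suggests the architecture has populated it by the time of the call.
    if num_present_cpus > 1:
        return num_present_cpus
    # Otherwise fall back to the number of possible CPUs.
    return nr_cpu_ids
```

For example, a system reporting 8 present CPUs out of 256 possible ones would use 8, while an early-boot situation with only 1 present CPU and nr_cpu_ids of 224 would use 224.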
[1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com/
[2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
[3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03YsY18cmWv_g@mail.gmail.com/
[5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
Link: https://lkml.kernel.org/r/20210208134108.22286-1-vbabka@suse.cz
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jann Horn <jannh@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nikita Shubin [Tue, 9 Feb 2021 13:31:05 +0000 (16:31 +0300)]
gpio: ep93xx: Fix single irqchip with multi gpiochips
Fixes the following warnings which result in interrupts disabled on
port B/F:
gpio gpiochip1: (B): detected irqchip that is shared with multiple gpiochips: please fix the driver.
gpio gpiochip5: (F): detected irqchip that is shared with multiple gpiochips: please fix the driver.
- added separate irqchip for each interrupt capable gpiochip
- provided unique names for each irqchip
Fixes: d2b091961510 ("gpio: ep93xx: Pass irqchip when adding gpiochip")
Cc: <stable@vger.kernel.org>
Signed-off-by: Nikita Shubin <nikita.shubin@maquefel.me>
Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Nikita Shubin [Tue, 9 Feb 2021 13:31:04 +0000 (16:31 +0300)]
gpio: ep93xx: fix BUG_ON port F usage
Two index spaces and ep93xx_gpio_port are confusing.
Instead add a separate struct to store necessary data and remove
ep93xx_gpio_port.
- add struct to store IRQ related data for each IRQ capable chip
- replace offset array with defined offsets
- add IRQ registers offset for each IRQ capable chip into
ep93xx_gpio_banks
------------[ cut here ]------------
kernel BUG at drivers/gpio/gpio-ep93xx.c:64!
---[ end trace 3f6544e133e9f5ae ]---
Fixes: fd935fc421e74 ("gpio: ep93xx: Do not pingpong irq numbers")
Cc: <stable@vger.kernel.org>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Nikita Shubin <nikita.shubin@maquefel.me>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Geert Uytterhoeven [Mon, 8 Feb 2021 14:51:53 +0000 (15:51 +0100)]
gpio: mxs: GPIO_MXS should not default to y unconditionally
Merely enabling CONFIG_COMPILE_TEST should not enable additional code.
To fix this, restrict the automatic enabling of GPIO_MXS to ARCH_MXS,
and ask the user in case of compile-testing.
Fixes: 6876ca311bfca5d7 ("gpio: mxs: add COMPILE_TEST support for GPIO_MXS")
Cc: <stable@vger.kernel.org>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Jernej Skrabec [Tue, 9 Feb 2021 17:59:00 +0000 (18:59 +0100)]
drm/sun4i: dw-hdmi: Fix max. frequency for H6
It turns out that the reasoning for lowering the max. supported frequency is
wrong. Scrambling works just fine. Several now-fixed bugs prevented
proper functioning, even with rates lower than 340 MHz. Issues were just
more pronounced with higher frequencies.
Fix that by allowing the max. frequency supported in HW and fix the comment.
Fixes: cd9063757a22 ("drm/sun4i: DW HDMI: Lower max. supported rate for H6")
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209175900.7092-6-jernej.skrabec@siol.net
Jernej Skrabec [Tue, 9 Feb 2021 17:58:59 +0000 (18:58 +0100)]
drm/sun4i: Fix H6 HDMI PHY configuration
As it turns out, vendor HDMI PHY driver for H6 has a pretty big table
of predefined values for various pixel clocks. However, most of them are
not useful/tested because they come from reference driver code. The vendor
PHY driver is concerned with only a few of those, namely 27 MHz, 74.25
MHz, 148.5 MHz, 297 MHz and 594 MHz. These are all frequencies for
standard CEA modes.
Fix sun50i_h6_cur_ctr and sun50i_h6_phy_config with the values only for
aforementioned frequencies.
Table sun50i_h6_mpll_cfg doesn't need to be changed because values are
actually frequency dependent and not so much SoC dependent. See i.MX6
documentation for explanation of those values for similar PHY.
Fixes: c71c9b2fee17 ("drm/sun4i: Add support for Synopsys HDMI PHY")
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209175900.7092-5-jernej.skrabec@siol.net
Jernej Skrabec [Tue, 9 Feb 2021 17:58:58 +0000 (18:58 +0100)]
drm/sun4i: dw-hdmi: always set clock rate
As expected, HDMI controller clock should always match pixel clock. In
the past, changing the HDMI controller rate would seemingly worsen the
situation. However, that was the result of other bugs which are now
fixed.
Fix that by removing the set_rate quirk and always setting the clock rate.
Fixes: 40bb9d3147b2 ("drm/sun4i: Add support for H6 DW HDMI controller")
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209175900.7092-4-jernej.skrabec@siol.net
Jernej Skrabec [Tue, 9 Feb 2021 17:58:57 +0000 (18:58 +0100)]
drm/sun4i: tcon: set sync polarity for tcon1 channel
Channel 1 has polarity bits for the vsync and hsync signals, but the driver
never sets them. It turns out that with pre-HDMI2 controllers there is
seemingly no issue if polarity is not set. However, with HDMI2 controllers
(H6), de-synchronization due to phase shift often occurs. This
causes a flickering screen. It's safe to assume that similar issues might
also happen with pre-HDMI2 controllers.
Solve the issue by setting vsync and hsync polarity. Note that display
stacks with tcon top actually have the polarity bits in the tcon0 polarity
register.
Fixes: 9026e0d122ac ("drm: Add Allwinner A10 Display Engine support")
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209175900.7092-3-jernej.skrabec@siol.net
Ville Syrjälä [Tue, 9 Feb 2021 02:19:17 +0000 (04:19 +0200)]
drm/i915: Fix overlay frontbuffer tracking
We don't have a persistent fb holding a reference to the frontbuffer
object, so every time we do the get+put we throw the frontbuffer object
immediately away. And so the next time around we get a pristine
frontbuffer object with bits==0 even for the old vma. This confuses
the frontbuffer tracking code which understandably expects the old
frontbuffer to have the overlay's bit set.
Fix this by hanging on to the frontbuffer reference until the next
flip. And just to make this a bit more clear let's track the frontbuffer
explicitly instead of just grabbing it via the old vma.
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1136
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209021918.16234-2-ville.syrjala@linux.intel.com
Fixes: 8e7cb1799b4f ("drm/i915: Extract intel_frontbuffer active tracking")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 553c23bdb4775130f333f07a51b047276bc53f79)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Alex Deucher [Wed, 3 Feb 2021 19:03:50 +0000 (14:03 -0500)]
Revert "drm/amd/display: Update NV1x SR latency values"
This reverts commit 4a3dea8932d3b1199680d2056dd91d31d94d70b7.
This causes blank screens for some users.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1482
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Jun Lei <Jun.Lei@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
David S. Miller [Wed, 10 Feb 2021 02:55:17 +0000 (18:55 -0800)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2021-02-10
The following pull-request contains BPF updates for your *net* tree.
We've added 5 non-merge commits during the last 8 day(s) which contain
a total of 3 files changed, 22 insertions(+), 21 deletions(-).
The main changes are:
1) Fix missed execution of kprobes BPF progs when kprobe is firing via
int3, from Alexei Starovoitov.
2) Fix potential integer overflow in map max_entries for stackmap on
32 bit archs, from Bui Quang Minh.
3) Fix a verifier pruning and an insn rewrite issue related to 32 bit ops,
from Daniel Borkmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ronnie Sahlberg [Wed, 10 Feb 2021 01:55:47 +0000 (11:55 +1000)]
cifs: do not disable noperm if multiuser mount option is not provided
Fixes small regression in implementation of new mount API.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reported-by: Hyunchul Lee <hyc.lee@gmail.com>
Tested-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Johannes Weiner [Tue, 9 Feb 2021 21:42:28 +0000 (13:42 -0800)]
Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
This reverts commit 536d3bf261a2fc3b05b3e91e7eef7383443015cf, as it can
cause writers to memory.high to get stuck in the kernel forever,
performing page reclaim and consuming excessive amounts of CPU cycles.
Before the patch, a write to memory.high would first put the new limit
in place for the workload, and then reclaim the requested delta. After
the patch, the kernel tries to reclaim the delta before putting the new
limit into place, in order to not overwhelm the workload with a sudden,
large excess over the limit. However, if reclaim is actively racing
with new allocations from the uncurbed workload, it can keep the write()
working inside the kernel indefinitely.
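The difference between the two orderings can be illustrated with a deliberately simplified toy model in Python. It is not how the kernel reclaims memory: the per-round reclaim and allocation amounts, the round limit and the function name are all assumptions chosen only to show why reclaiming before the new limit is in place may never converge.

```python
def rounds_until_converged(usage, old_high, new_high, limit_first,
                           reclaim_per_round=50, alloc_per_round=50,
                           max_rounds=1000):
    """Toy model: count reclaim rounds until usage <= new_high.

    Returns None if the loop has not converged after max_rounds.
    """
    # The limit the workload is held to while the write() is reclaiming.
    high = new_high if limit_first else old_high
    for rounds in range(max_rounds):
        if usage <= new_high:
            return rounds
        # The writer reclaims part of the remaining delta.
        usage -= min(reclaim_per_round, usage - new_high)
        # The workload allocates again, but only up to the limit in effect.
        room = max(high - usage, 0)
        usage += min(alloc_per_round, room)
    return None
```

In this model, setting the limit first lets usage drop from 1000 to 500 in ten rounds, while reclaiming first lets the workload refill up to the old limit after every round, so the loop runs until the round limit is hit.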
This is causing problems in Facebook production. A privileged
system-level daemon that adjusts memory.high for various workloads
running on a host can get unexpectedly stuck in the kernel and
essentially turn into a sort of involuntary kswapd for one of the
workloads. We've observed that daemon busy-spin in a write() for
minutes at a time, neglecting its other duties on the system, and
expending privileged system resources on behalf of a workload.
To remedy this, we have first considered changing the reclaim logic to
break out after a couple of loops - whether the workload has converged
to the new limit or not - and bound the write() call this way. However,
the root cause that inspired the sequence change in the first place has
been fixed through other means, and so a revert back to the proven
limit-setting sequence, also used by memory.max, is preferable.
The sequence was changed to avoid extreme latencies in the workload when
the limit was lowered: the sudden, large excess created by the limit
lowering would erroneously trigger the penalty sleeping code that is
meant to throttle excessive growth from below. Allocating threads could
end up sleeping long after the write() had already reclaimed the delta
for which they were being punished.
However, erroneous throttling also caused problems in other scenarios at
around the same time. This resulted in commit b3ff92916af3 ("mm, memcg:
reclaim more aggressively before high allocator throttling"), included
in the same release as the offending commit. When allocating threads
now encounter large excess caused by a racing write() to memory.high,
instead of entering punitive sleeps, they will simply be tasked with
helping reclaim down the excess, and will be held no longer than it
takes to accomplish that. This is in line with regular limit
enforcement - i.e. if the workload allocates up against or over an
otherwise unchanged limit from below.
With the patch breaking userspace, and the root cause addressed by other
means already, revert it again.
Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: <stable@vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Tue, 9 Feb 2021 21:42:24 +0000 (13:42 -0800)]
MAINTAINERS: update Andrey Ryabinin's email address
Update my email, @virtuozzo.com will stop working shortly.
Link: https://lkml.kernel.org/r/20210204223904.3824-1-ryabinin.a.a@gmail.com
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rong Chen [Tue, 9 Feb 2021 21:42:21 +0000 (13:42 -0800)]
selftests/vm: rename file run_vmtests to run_vmtests.sh
Commit c2aa8afc36fa has renamed run_vmtests in Makefile, but the file
still uses the old name.
The kernel test robot reported the following issue:
# selftests: vm: run_vmtests.sh
# Warning: file run_vmtests.sh is missing!
not ok 1 selftests: vm: run_vmtests.sh
Link: https://lkml.kernel.org/r/20210205085507.1479894-1-rong.a.chen@intel.com
Fixes: c2aa8afc36fa (selftests/vm: rename run_vmtests --> run_vmtests.sh)
Signed-off-by: Rong Chen <rong.a.chen@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Seth Forshee [Tue, 9 Feb 2021 21:42:17 +0000 (13:42 -0800)]
tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t. With
CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, whereas passing "inode64" in the
mount options will fail. This leads to erroneous behaviours such as
this:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.
Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: <stable@vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Seth Forshee [Tue, 9 Feb 2021 21:42:14 +0000 (13:42 -0800)]
tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
Currently there is an assumption in tmpfs that 64-bit architectures also
have a 64-bit ino_t. This is not true on s390 which has a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers
and display "inode64" in the mount options, but passing the "inode64"
mount option will fail. This leads to the following behavior:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
This happens because mount sees "inode64" in the mount options and thus
passes it in the options for the remount.
So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: <stable@vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Tue, 9 Feb 2021 21:42:10 +0000 (13:42 -0800)]
mm/mremap: fix BUILD_BUG_ON() error in get_extent
clang can't evaluate this function argument at compile time when the
function is not inlined, which leads to a link time failure:
ld.lld: error: undefined symbol: __compiletime_assert_414
>>> referenced by mremap.c
>>> mremap.o:(get_extent) in archive mm/built-in.a
Mark the function as __always_inline to avoid it.
Link: https://lkml.kernel.org/r/20201230154104.522605-1-arnd@kernel.org
Fixes: 9ad9718bfa41 ("mm/mremap: calculate extent in one place")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Brian Geffon <bgeffon@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fangrui Song [Tue, 9 Feb 2021 21:42:07 +0000 (13:42 -0800)]
firmware_loader: align .builtin_fw to 8
arm64 references the start address of .builtin_fw (__start_builtin_fw)
with a pair of R_AARCH64_ADR_PREL_PG_HI21/R_AARCH64_LDST64_ABS_LO12_NC
relocations. The compiler is allowed to emit the
R_AARCH64_LDST64_ABS_LO12_NC relocation because struct builtin_fw in
include/linux/firmware.h is 8-byte aligned.
The R_AARCH64_LDST64_ABS_LO12_NC relocation requires the address to be a
multiple of 8, which may not be the case if .builtin_fw is empty.
Unconditionally align .builtin_fw to fix the linker error. 32-bit
architectures could use ALIGN(4) but that would add unnecessary
complexity, so just use ALIGN(8).
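The alignment requirement above can be illustrated with a small userspace sketch. The helper names align_up() and is_aligned() are hypothetical and only mirror what an ALIGN(8) directive achieves for a section start address; they are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* True if addr is a multiple of align (align must be a power of two). */
bool is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}
```

An empty section that happens to start at an address such as 0x1004 would not satisfy an 8-byte requirement, whereas the rounded-up address 0x1008 would.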
Link: https://lkml.kernel.org/r/20201208054646.2913063-1-maskray@google.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1204
Fixes: 5658c76 ("firmware: allow firmware files to be built into kernel image")
Signed-off-by: Fangrui Song <maskray@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Konovalov [Tue, 9 Feb 2021 21:42:03 +0000 (13:42 -0800)]
kasan: fix stack traces dependency for HW_TAGS
Currently, whether the alloc/free stack traces collection is enabled by
default for hardware tag-based KASAN depends on CONFIG_DEBUG_KERNEL.
The intention for this dependency was to only enable collection on slow
debug kernels due to a significant perf and memory impact.
As it turns out, CONFIG_DEBUG_KERNEL is not considered a debug option
and is enabled on many production kernels including Android and Ubuntu.
As a result, this dependency is pointless and only complicates the
code and documentation.
Having stack traces collection disabled by default would make the
hardware mode work differently to the software ones, which is
confusing.
This change removes the dependency and enables stack traces collection
by default.
Looking into the future, this default might make sense for production
kernels, assuming we implement a fast stack trace collection approach.
Link: https://lkml.kernel.org/r/6678d77ceffb71f1cff2cf61560e2ffe7bb6bfe9.1612808820.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Phillip Lougher [Tue, 9 Feb 2021 21:42:00 +0000 (13:42 -0800)]
squashfs: add more sanity checks in xattr id lookup
Syzbot has reported a warning where a kmalloc() attempt exceeds the
maximum limit. This has been identified as corruption of the xattr_ids
count when reading the xattr id lookup table.
This patch adds a number of additional sanity checks to detect this
corruption and others.
1. It checks for a corrupted xattr index read from the inode. This could
be because the metadata block is uncompressed, or because the
"compression" bit has been corrupted (turning a compressed block
into an uncompressed block). This would cause an out of bounds read.
2. It checks against corruption of the xattr_ids count. This can either
lead to the above kmalloc failure, or a smaller than expected
table to be read.
3. It checks the contents of the index table for corruption.
[phillip@squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/270245655.754655.1612770082682@webmail.123-reg.co.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-5-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: syzbot+2ccea6339d368360800d@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Phillip Lougher [Tue, 9 Feb 2021 21:41:56 +0000 (13:41 -0800)]
squashfs: add more sanity checks in inode lookup
Syzbot has reported a "slab-out-of-bounds read" error which has been
identified as being caused by a corrupted "ino_num" value read from the
inode. This could be because the metadata block is uncompressed, or
because the "compression" bit has been corrupted (turning a compressed
block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the
following corruption.
1. It checks against corruption of the inodes count. This can either
lead to a larger table to be read, or a smaller than expected
table to be read.
In the case of a too large inodes count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
[phillip@squashfs.org.uk: fix checkpatch issue]
Link: https://lkml.kernel.org/r/527909353.754618.1612769948607@webmail.123-reg.co.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-4-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: syzbot+04419e3ff19d2970ea28@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Phillip Lougher [Tue, 9 Feb 2021 21:41:53 +0000 (13:41 -0800)]
squashfs: add more sanity checks in id lookup
Syzbot has reported a number of "slab-out-of-bounds reads" and
"use-after-free read" errors which have been identified as being caused
by a corrupted index value read from the inode. This could be because
the metadata block is uncompressed, or because the "compression" bit has
been corrupted (turning a compressed block into an uncompressed block).
This patch adds additional sanity checks to detect this, and the
following corruption.
1. It checks against corruption of the ids count. This can either
lead to a larger table to be read, or a smaller than expected
table to be read.
In the case of a too large ids count, this would often have been
trapped by the existing sanity checks, but this patch introduces
a more exact check, which can identify too small values.
2. It checks the contents of the index table for corruption.
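The two checks above can be sketched in userspace C as follows. This is a minimal illustration, not the patch itself: the function names, the flat parameters and the MAX_BLOCK_SPAN limit are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on the distance between two consecutive index entries. */
#define MAX_BLOCK_SPAN 8194u

/*
 * Check 1 (sketch): the computed table length must exactly match the gap
 * between the table start and the start of the next table, which catches
 * counts that are too large as well as counts that are too small.
 */
bool table_length_ok(uint64_t len, uint64_t table_start, uint64_t next_table)
{
	return next_table >= table_start && len == next_table - table_start;
}

/*
 * Check 2 (sketch): every index entry must be strictly below the next one,
 * the last entry must be below table_start, and no gap may exceed
 * MAX_BLOCK_SPAN.
 */
bool index_table_ok(const uint64_t *table, size_t entries, uint64_t table_start)
{
	if (entries == 0)
		return false;	/* there should always be at least one entry */

	for (size_t n = 0; n < entries; n++) {
		uint64_t start = table[n];
		uint64_t end = (n + 1 < entries) ? table[n + 1] : table_start;

		if (start >= end || end - start > MAX_BLOCK_SPAN)
			return false;
	}
	return true;
}
```

Rejecting the table as a whole, rather than clamping individual entries, keeps a corrupted filesystem from being partially trusted.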
Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: syzbot+b06d57ba83f604522af2@syzkaller.appspotmail.com
Reported-by: syzbot+c021ba012da41ee9807c@syzkaller.appspotmail.com
Reported-by: syzbot+5024636e8b5fd19f0f19@syzkaller.appspotmail.com
Reported-by: syzbot+bcbc661df46657d0fa4f@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Phillip Lougher [Tue, 9 Feb 2021 21:41:50 +0000 (13:41 -0800)]
squashfs: avoid out of bounds writes in decompressors
Patch series "Squashfs: fix BIO migration regression and add sanity checks".
Patch [1/4] fixes a regression introduced by the "migrate from
ll_rw_block usage to BIO" patch, which has produced a number of
Syzbot/Syzkaller reports.
Patches [2/4], [3/4], and [4/4] fix a number of filesystem corruption
issues which have produced Syzbot reports in the id, inode and xattr
lookup code.
Each patch has been tested against the Syzbot reproducers using the
given kernel configuration. They have the appropriate "Reported-by:"
lines added.
Additionally, all of the reproducer filesystems are indirectly fixed by
patch [4/4] due to the fact they all have xattr corruption which is now
detected there.
Additional testing with other configurations and architectures (32bit,
big endian), and normal filesystems has also been done to trap any
inadvertent regressions caused by the additional sanity checks.
This patch (of 4):
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".
Syzbot/Syzkaller has reported a number of "out of bounds writes" and
"unable to handle kernel paging request in squashfs_decompress" errors
which have been identified as a regression introduced by the above
patch.
Specifically, the patch removed the following sanity check
if (length < 0 || length > output->length ||
(index + length) > msblk->bytes_used)
This check did two things:
1. It ensured any reads were not beyond the end of the filesystem
2. It ensured that the "length" field read from the filesystem
was within the expected maximum length. Without this any
corrupted values can over-run allocated buffers.
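As a rough userspace sketch of what the removed check guards against, the condition can be written as a standalone predicate. The function name read_length_ok() and its flat parameters are hypothetical; output_length and bytes_used stand in for output->length and msblk->bytes_used.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the sanity check with the kernel structures replaced by plain
 * parameters. Returns true only if the read is safe to perform.
 */
bool read_length_ok(int length, int output_length,
		    uint64_t index, uint64_t bytes_used)
{
	if (length < 0 || length > output_length)
		return false;	/* would over-run the output buffer */
	if (index + (uint64_t)length > bytes_used)
		return false;	/* would read past the end of the filesystem */
	return true;
}
```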
Link: https://lkml.kernel.org/r/20210204130249.4495-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20210204130249.4495-2-phillip@squashfs.org.uk
Fixes: 93e72b3c612adc ("squashfs: migrate from ll_rw_block usage to BIO")
Reported-by: syzbot+6fba78f99b9afd4b5634@syzkaller.appspotmail.com
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Philippe Liard <pliard@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 10 Feb 2021 01:19:56 +0000 (17:19 -0800)]
Merge tag 'i3c/fixes-for-5.11' of git://git./linux/kernel/git/i3c/linux
Pull i3c fix from Alexandre Belloni:
"A single build warning fix"
* tag 'i3c/fixes-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c/master/mipi-i3c-hci: Fix position of __maybe_unused in i3c_hci_of_match
Daniel Borkmann [Tue, 9 Feb 2021 18:46:10 +0000 (18:46 +0000)]
bpf: Fix 32 bit src register truncation on div/mod
While reviewing a different fix, John and I noticed an oddity in one of the
BPF program dumps that stood out, for example:
# bpftool p d x i 13
0: (b7) r0 = 808464450
1: (b4) w4 = 808464432
2: (bc) w0 = w0
3: (15) if r0 == 0x0 goto pc+1
4: (9c) w4 %= w0
[...]
In line 2 we noticed that the mov32 would 32 bit truncate the original src
register for the div/mod operation. While for the two operations the dst
register is typically marked unknown e.g. from adjust_scalar_min_max_vals()
the src register is not, and thus verifier keeps tracking original bounds,
simplified:
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (b7) r0 = -1
1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0
1: (b7) r1 = -1
2: R0_w=invP-1 R1_w=invP-1 R10=fp0
2: (3c) w0 /= w1
3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP-1 R10=fp0
3: (77) r1 >>= 32
4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP4294967295 R10=fp0
4: (bf) r0 = r1
5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0
5: (95) exit
processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Runtime result of r0 at exit is 0 instead of expected -1. Remove the
verifier mov32 src rewrite in div/mod and replace it with a jmp32 test
instead. After the fix, the following code is generated for
dividend r1 and divisor r6:
div, 64 bit:                      div, 32 bit:
0: (b7) r6 = 8                    0: (b7) r6 = 8
1: (b7) r1 = 8                    1: (b7) r1 = 8
2: (55) if r6 != 0x0 goto pc+2    2: (56) if w6 != 0x0 goto pc+2
3: (ac) w1 ^= w1                  3: (ac) w1 ^= w1
4: (05) goto pc+1                 4: (05) goto pc+1
5: (3f) r1 /= r6                  5: (3c) w1 /= w6
6: (b7) r0 = 0                    6: (b7) r0 = 0
7: (95) exit                      7: (95) exit
mod, 64 bit:                      mod, 32 bit:
0: (b7) r6 = 8                    0: (b7) r6 = 8
1: (b7) r1 = 8                    1: (b7) r1 = 8
2: (15) if r6 == 0x0 goto pc+1    2: (16) if w6 == 0x0 goto pc+1
3: (9f) r1 %= r6                  3: (9c) w1 %= w6
4: (b7) r0 = 0                    4: (b7) r0 = 0
5: (95) exit                      5: (95) exit
x86 in particular can throw a 'divide error' exception for div
instruction not only for divisor being zero, but also for the case
when the quotient is too large for the designated register. For the
edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT
since we always zero edx (rdx). Hence really the only protection
needed is against divisor being zero.
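The difference between the old and new rewrites can be modelled in plain C. This is a simplified userspace sketch, not verifier or JIT code: div32_old(), div32_new() and mod32_new() are hypothetical names, and the 64-bit source register is passed by pointer so that the in-place truncation becomes visible.

```c
#include <stdint.h>

/*
 * Old rewrite (sketch): "w_src = w_src" truncates the source register in
 * place before the zero test, so its upper 32 bits are lost afterwards.
 */
uint32_t div32_old(uint32_t dst, uint64_t *src)
{
	*src = (uint32_t)*src;		/* mov32 clobbers the upper half */
	if (*src == 0)
		return 0;		/* division by zero yields 0 */
	return dst / (uint32_t)*src;
}

/*
 * New rewrite (sketch): a jmp32-style test looks only at the low 32 bits
 * and leaves the source register untouched.
 */
uint32_t div32_new(uint32_t dst, const uint64_t *src)
{
	if ((uint32_t)*src == 0)
		return 0;
	return dst / (uint32_t)*src;
}

/* Modulo by zero leaves the destination unchanged. */
uint32_t mod32_new(uint32_t dst, const uint64_t *src)
{
	if ((uint32_t)*src == 0)
		return dst;
	return dst % (uint32_t)*src;
}
```

With a source register of all ones, a later "src >>= 32" yields 0 after the old rewrite but 0xffffffff after the new one, which matches the bounds the verifier keeps tracking for the untouched register.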
Fixes: 68fda450a7df ("bpf: fix 32-bit divide by zero")
Co-developed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Fri, 5 Feb 2021 19:48:21 +0000 (20:48 +0100)]
bpf: Fix verifier jmp32 pruning decision logic
Anatoly has been fuzzing with kBdysch harness and reported a hang in
one of the outcomes:
func#0 @0
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (b7) r0 = 808464450
1: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R10=fp0
1: (b4) w4 = 808464432
2: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP808464432 R10=fp0
2: (9c) w4 %= w0
3: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
3: (66) if w4 s> 0x30303030 goto pc+0
R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff),s32_max_value=808464432) R10=fp0
4: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff),s32_max_value=808464432) R10=fp0
4: (7f) r0 >>= r0
5: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff),s32_max_value=808464432) R10=fp0
5: (9c) w4 %= w0
6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
6: (66) if w0 s> 0x3030 goto pc+0
R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
7: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
7: (d6) if w0 s<= 0x303030 goto pc+1
9: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
9: (95) exit
propagating r0
from 6 to 7: safe
4: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umin_value=808464433,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R10=fp0
4: (7f) r0 >>= r0
5: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umin_value=808464433,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R10=fp0
5: (9c) w4 %= w0
6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
6: (66) if w0 s> 0x3030 goto pc+0
R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
propagating r0
7: safe
propagating r0
from 6 to 7: safe
processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
The underlying program was xlated as follows:
# bpftool p d x i 10
0: (b7) r0 = 808464450
1: (b4) w4 = 808464432
2: (bc) w0 = w0
3: (15) if r0 == 0x0 goto pc+1
4: (9c) w4 %= w0
5: (66) if w4 s> 0x30303030 goto pc+0
6: (7f) r0 >>= r0
7: (bc) w0 = w0
8: (15) if r0 == 0x0 goto pc+1
9: (9c) w4 %= w0
10: (66) if w0 s> 0x3030 goto pc+0
11: (d6) if w0 s<= 0x303030 goto pc+1
12: (05) goto pc-1
13: (95) exit
The verifier rewrote original instructions it recognized as dead code with
'goto pc-1', but reality differs from verifier simulation in that we are
actually able to trigger a hang due to hitting the 'goto pc-1' instructions.
Taking a closer look at the verifier analysis, the reason is that it misjudges
its pruning decision at the first 'from 6 to 7: safe' occasion. What happens
is that while both old/cur registers are marked as precise, they get misjudged
for the jmp32 case as range_within() yields true, meaning that the prior
verification path with a wider register bound could be verified successfully
and therefore the current path with a narrower register bound is deemed safe
as well whereas in reality it's not. R0 old/cur path's bounds compare as
follows:
old: smin_value=0x8000000000000000,smax_value=0x7fffffffffffffff,umin_value=0x0,umax_value=0xffffffffffffffff,var_off=(0x0; 0xffffffffffffffff)
cur: smin_value=0x8000000000000000,smax_value=0x7fffffff7fffffff,umin_value=0x0,umax_value=0xffffffff7fffffff,var_off=(0x0; 0xffffffff7fffffff)
old: s32_min_value=0x80000000,s32_max_value=0x00003030,u32_min_value=0x00000000,u32_max_value=0xffffffff
cur: s32_min_value=0x00003031,s32_max_value=0x7fffffff,u32_min_value=0x00003031,u32_max_value=0x7fffffff
The 64 bit bounds generally look okay and while the information that got
propagated from 32 to 64 bit looks correct as well, it's not precise enough
for judging a conditional jmp32. Given the latter only operates on subregisters,
we also need to take these into account for a range_within() probe
in order to be able to prune paths. Extending the range_within() constraint
to both bounds tells us that the old signed 32 bit bounds are
not wider than the cur signed 32 bit bounds.
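The effect of extending the comparison to the 32 bit bounds can be reproduced with the R0 values quoted above. The following is a simplified userspace sketch: struct reg_bounds, range_within_64() and range_within_all() are stand-ins invented for this example and only mirror the idea of the verifier's range_within() helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the verifier's per-register bounds. */
struct reg_bounds {
	int64_t  smin, smax;
	uint64_t umin, umax;
	int32_t  s32_min, s32_max;
	uint32_t u32_min, u32_max;
};

/* 64 bit comparison only: is cur's range contained in old's range? */
bool range_within_64(const struct reg_bounds *old, const struct reg_bounds *cur)
{
	return old->umin <= cur->umin && old->umax >= cur->umax &&
	       old->smin <= cur->smin && old->smax >= cur->smax;
}

/* Same test, extended to the 32 bit subregister bounds as well. */
bool range_within_all(const struct reg_bounds *old, const struct reg_bounds *cur)
{
	return range_within_64(old, cur) &&
	       old->u32_min <= cur->u32_min && old->u32_max >= cur->u32_max &&
	       old->s32_min <= cur->s32_min && old->s32_max >= cur->s32_max;
}
```

Fed with the old/cur bounds listed above, the 64 bit test accepts the state as already covered, while the extended test rejects it because the old s32_max_value of 0x3030 is below the cur s32_max_value of 0x7fffffff.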
With the fix in place, the program will now verify the 'goto' branch case as
it should have been:
[...]
6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
6: (66) if w0 s> 0x3030 goto pc+0
R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
7: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
7: (d6) if w0 s<= 0x303030 goto pc+1
9: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0
9: (95) exit
7: R0_w=invP(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min_value=12337,u32_min_value=12337,u32_max_value=2147483647) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
7: (d6) if w0 s<= 0x303030 goto pc+1
R0_w=invP(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min_value=3158065,u32_min_value=3158065,u32_max_value=2147483647) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
8: R0_w=invP(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min_value=3158065,u32_min_value=3158065,u32_max_value=2147483647) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0
8: (30) r0 = *(u8 *)skb[808464432]
BPF_LD_[ABS|IND] uses reserved fields
processed 11 insns (limit 1000000) max_states_per_insn 1 total_states 1 peak_states 1 mark_read 1
The bug is quite subtle in the sense that when verifier would determine that
a given branch is dead code, it would (here: wrongly) remove these instructions
from the program and hard-wire the taken branch for privileged programs instead
of the 'goto pc-1' rewrites which will cause hard to debug problems.
Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Fri, 5 Feb 2021 16:20:14 +0000 (17:20 +0100)]
bpf: Fix verifier jsgt branch analysis on max bound
Fix incorrect is_branch{32,64}_taken() analysis for the jsgt case. The return
code for both will tell the caller whether a given conditional jump is taken
or not, e.g. 1 means branch will be taken [for the involved registers] and the
goto target will be executed, 0 means branch will not be taken and instead we
fall-through to the next insn, and last but not least a -1 denotes that it is
not known at verification time whether a branch will be taken or not. Now while
the jsgt has the branch-taken case correct with reg->s32_min_value > sval, the
branch-not-taken case is off-by-one when testing for reg->s32_max_value < sval
since the branch is also not taken for reg->s32_max_value == sval. The jgt
branch analysis, for example, gets this right.
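The three-way result for the 32 bit jsgt case can be sketched as a standalone function. The name jsgt32_taken() and its flat parameters are hypothetical; the sketch only models the bounds comparison, on the reasoning that a register whose signed maximum equals sval can never be greater than sval, so the jump falls through.

```c
#include <stdint.h>

/*
 * Sketch of the jsgt ("jump if signed greater than") branch analysis on
 * 32 bit bounds. Returns 1 if the branch is always taken, 0 if it is never
 * taken, and -1 if the outcome is not known from the bounds alone.
 */
int jsgt32_taken(int32_t s32_min, int32_t s32_max, int32_t sval)
{
	if (s32_min > sval)
		return 1;	/* every possible value is greater than sval */
	else if (s32_max <= sval)
		return 0;	/* no possible value is greater than sval */
	return -1;
}
```

Using "<" instead of "<=" in the second test would report the s32_max == sval case as unknown rather than as never taken.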
Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Fixes: 4f7b3e82589e ("bpf: improve verifier branch analysis")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
David S. Miller [Tue, 9 Feb 2021 23:55:59 +0000 (15:55 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) nf_conntrack_tuple_taken() needs to recheck zone for
NAT clash resolution, from Florian Westphal.
2) Restore support for stateful expressions when set definition
specifies no stateful expressions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 9 Feb 2021 08:52:19 +0000 (09:52 +0100)]
vsock: fix locking in vsock_shutdown()
In vsock_shutdown() we touched some socket fields without holding the
socket lock, such as 'state' and 'sk_flags'.
Also, after the introduction of multi-transport, we are accessing
'vsk->transport' in vsock_send_shutdown() without holding the lock
and this call can be made while the connection is in progress, so
the transport can change in the meantime.
To avoid issues, we hold the socket lock when we enter
vsock_shutdown() and release it when we leave.
Among the transports that implement the 'shutdown' callback, only
hyperv_transport acquired the lock. Since the caller now holds it,
we no longer take it.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Feb 2021 23:20:43 +0000 (15:20 -0800)]
Merge branch 'hns3-fixes'
Huazhong Tan says:
====================
net: hns3: fixes for -net
The parameters sent from vf may be unreliable. If these
parameters are used directly, memory overwriting may occur.
So this series adds some checks for this case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Tue, 9 Feb 2021 09:03:07 +0000 (17:03 +0800)]
net: hns3: add a check for index in hclge_get_rss_key()
The index is received from the VF. If it is used directly,
an out-of-bounds access may occur, so add a check for
this index before using it in hclge_get_rss_key().
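The pattern of validating an untrusted index before using it can be sketched in userspace C. Everything below is hypothetical: get_key_chunk(), KEY_SIZE and CHUNK_LEN are illustration-only names and sizes, not the driver's actual symbols.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE  40	/* hypothetical size of the hash key in bytes */
#define CHUNK_LEN  8	/* hypothetical number of bytes returned per request */

/*
 * Copy chunk number "index" of key into resp. Returns 0 on success, or -1
 * if the requested chunk would extend past the end of the key.
 */
int get_key_chunk(const uint8_t key[KEY_SIZE], uint8_t index,
		  uint8_t resp[CHUNK_LEN])
{
	/* The index comes from an untrusted sender: check it before use. */
	if (((size_t)index + 1) * CHUNK_LEN > KEY_SIZE)
		return -1;

	memcpy(resp, &key[(size_t)index * CHUNK_LEN], CHUNK_LEN);
	return 0;
}
```

Checking the end of the requested chunk, rather than only its start, also rejects an index whose chunk would begin inside the key but run past it.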
Fixes: a638b1d8cc87 ("net: hns3: fix get VF RSS issue")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Tue, 9 Feb 2021 09:03:06 +0000 (17:03 +0800)]
net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
The tqp_index is received from the VF. If it is used directly,
an out-of-bounds access may occur, so add a check for
this tqp_index before using it in hclge_get_ring_chain_from_mbx().
Fixes: 84e095d64ed9 ("net: hns3: Change PF to add ring-vect binding & resetQ to mailbox")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Tue, 9 Feb 2021 09:03:05 +0000 (17:03 +0800)]
net: hns3: add a check for queue_id in hclge_reset_vf_queue()
The queue_id is received from the VF. If it is used directly,
an out-of-bounds access may occur, so add a check for
this queue_id before using it in hclge_reset_vf_queue().
Fixes: 1a426f8b40fc ("net: hns3: fix the VF queue reset flow error")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 8 Feb 2021 17:36:27 +0000 (19:36 +0200)]
net: dsa: felix: implement port flushing on .phylink_mac_link_down
There are several issues which may be seen when the link goes down while
forwarding traffic, all of which can be attributed to the fact that the
port flushing procedure from the reference manual was not closely
followed.
With flow control enabled on both the ingress port and the egress port,
it may happen when a link goes down that Ethernet packets are in flight.
In flow control mode, frames are held back and not dropped. When there
is enough traffic in flight (example: iperf3 TCP), then the ingress port
might enter congestion and never exit that state. This is a problem,
because it is the egress port's link that went down, and that has caused
the inability of the ingress port to send packets to any other port.
This is solved by flushing the egress port's queues when it goes down.
There is also a problem when performing stream splitting for
IEEE 802.1CB traffic (not yet upstream, but a sort of multicast,
basically). There, if one port from the destination ports mask goes
down, splitting the stream towards the other destinations will no longer
be performed. This can be traced down to this line:
ocelot_port_writel(ocelot_port, 0, DEV_MAC_ENA_CFG);
which should have been instead, as per the reference manual:
ocelot_port_rmwl(ocelot_port, 0, DEV_MAC_ENA_CFG_RX_ENA,
DEV_MAC_ENA_CFG);
Basically only DEV_MAC_ENA_CFG_RX_ENA should be disabled, but not
DEV_MAC_ENA_CFG_TX_ENA - I don't have further insight into why that is
the case, but apparently multicasting to several ports will cause issues
if at least one of them doesn't have DEV_MAC_ENA_CFG_TX_ENA set.
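The difference between the plain write and the read-modify-write can be modelled on a simulated register value. This is a userspace sketch under stated assumptions: reg_write() and reg_rmw() are hypothetical helpers, and the RX_ENA/TX_ENA bit positions are chosen for illustration only.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define RX_ENA (1u << 4)
#define TX_ENA (1u << 0)

/* Plain write: replaces the whole register value. */
uint32_t reg_write(uint32_t reg, uint32_t val)
{
	(void)reg;	/* the previous contents are discarded */
	return val;
}

/* Read-modify-write: only the bits in mask take their value from val. */
uint32_t reg_rmw(uint32_t reg, uint32_t val, uint32_t mask)
{
	return (reg & ~mask) | (val & mask);
}
```

Starting from a register with both bits set, writing 0 clears RX_ENA and TX_ENA alike, while a read-modify-write of 0 under the RX_ENA mask clears only the receive bit and leaves TX_ENA set.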
I am not sure what the state of the Ocelot VSC7514 driver is, but
probably not as bad as Felix/Seville, since VSC7514 uses phylib and has
the following in ocelot_adjust_link:
if (!phydev->link)
return;
therefore the port is not really put down when the link is lost, unlike
the DSA drivers which use .phylink_mac_link_down for that.
Nonetheless, I put ocelot_port_flush() in the common ocelot.c because it
needs to access some registers from drivers/net/ethernet/mscc/ocelot_rew.h
which are not exported in include/soc/mscc/ and a bugfix patch should
probably not move headers around.
Fixes: bdeced75b13f ("net: dsa: felix: Add PCS operations for PHYLINK")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>