Wei Yang [Fri, 12 Jun 2020 02:07:54 +0000 (10:07 +0800)]
rcu: grpnum just records group number
The ->grpnum field in the rcu_node structure contains the bit position
in this structure's parent's bitmasks, which is not the CPU number.
This commit therefore adjusts this field's comment accordingly.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Wei Yang [Fri, 12 Jun 2020 02:07:53 +0000 (10:07 +0800)]
rcu: grplo/grphi just records CPU number
The ->grplo and ->grphi fields store the lowest and highest CPU number
covered by to a rcu_node structure, which is not the group number.
This commit therefore adjusts these fields' comments to match reality.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Wei Yang [Fri, 12 Jun 2020 02:07:52 +0000 (10:07 +0800)]
rcu: gp_max is protected by root rcu_node's lock
Because gp_max is protected by root rcu_node's lock, this commit moves
the gp_max definition to the region of the rcu_node structure containing
fields protected by this lock.
Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Peter Enderborg [Thu, 4 Jun 2020 10:23:20 +0000 (12:23 +0200)]
rcu: Stop shrinker loop
The count and scan can be separated in time, and there is a fair chance
that all work is already done when the scan starts, which might in turn
result in a needless retry. This commit therefore avoids this retry by
returning SHRINK_STOP.
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Jules Irenge [Mon, 1 Jun 2020 18:45:49 +0000 (19:45 +0100)]
rcu: Replace 1 with true
Coccinelle reports a warning
WARNING: Assignment of 0/1 to bool variable
The root cause is that the variable lastphase is a bool, but is
initialised with integer 1. This commit therefore replaces the 1 with
a true.
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 28 May 2020 15:49:29 +0000 (08:49 -0700)]
lockdep: Complain only once about RCU in extended quiescent state
Currently, lockdep_rcu_suspicious() complains twice about RCU read-side
critical sections being invoked from within extended quiescent states,
for example:
RCU used illegally from idle CPU!
rcu_scheduler_active = 2, debug_locks = 1
RCU used illegally from extended quiescent state!
This commit therefore saves a couple lines of code and one line of
console-log output by eliminating the first of these two complaints.
Link: https://lore.kernel.org/lkml/87wo4wnpzb.fsf@nanos.tec.linutronix.de
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Wed, 20 May 2020 00:00:54 +0000 (17:00 -0700)]
rcu: Mark rcu_nmi_enter() call to rcu_cleanup_after_idle() noinstr
The objtool complains about the call to rcu_cleanup_after_idle() from
rcu_nmi_enter(), so this commit adds instrumentation_begin() before that
call and instrumentation_end() after it.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Tue, 19 May 2020 22:02:02 +0000 (15:02 -0700)]
rcu: Remove initialized but unused rnp from check_slow_task()
This commit removes the variable rnp from check_slow_task(), which
is defined, assigned to, but not otherwise used.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Frederic Weisbecker [Fri, 15 May 2020 00:34:29 +0000 (02:34 +0200)]
tick/nohz: Narrow down noise while setting current task's tick dependency
Setting a tick dependency on any task, including the case where a task
sets that dependency on itself, triggers an IPI to all CPUs. That is
of course suboptimal but it had previously not been an issue because it
was only used by POSIX CPU timers on nohz_full, which apparently never
occurs in latency-sensitive workloads in production. (Or users of such
systems are suffering in silence on the one hand or venting their ire
on the wrong people on the other.)
But RCU now sets a task tick dependency on the current task in order
to fix stall issues that can occur during RCU callback processing.
Thus, RCU callback processing triggers frequent system-wide IPIs from
nohz_full CPUs. This is quite counter-productive, after all, avoiding
IPIs is what nohz_full is supposed to be all about.
This commit therefore optimizes tasks' self-setting of a task tick
dependency by using tick_nohz_full_kick() to avoid the system-wide IPI.
Instead, only the execution of the one task is disturbed, which is
acceptable given that this disturbance is well down into the noise
compared to the degree to which the RCU callback processing itself
disturbs execution.
Fixes: 6a949b7af82d (rcu: Force on tick when invoking lots of callbacks)
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: stable@kernel.org
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Lihao Liang [Thu, 14 May 2020 20:34:34 +0000 (21:34 +0100)]
rcu: Update comment from rsp->rcu_gp_seq to rsp->gp_seq
Signed-off-by: Lihao Liang <lihaoliang@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 8 May 2020 21:15:37 +0000 (14:15 -0700)]
fs/btrfs: Add cond_resched() for try_release_extent_mapping() stalls
Very large I/Os can cause the following RCU CPU stall warning:
RIP: 0010:rb_prev+0x8/0x50
Code: 49 89 c0 49 89 d1 48 89 c2 48 89 f8 e9 e5 fd ff ff 4c 89 48 10 c3 4c =
89 06 c3 4c 89 40 10 c3 0f 1f 00 48 8b 0f 48 39 cf 74 38 <48> 8b 47 10 48 85 c0 74 22 48 8b 50 08 48 85 d2 74 0c 48 89 d0 48
RSP: 0018:
ffffc9002212bab0 EFLAGS:
00000287 ORIG_RAX:
ffffffffffffff13
RAX:
ffff888821f93630 RBX:
ffff888821f93630 RCX:
ffff888821f937e0
RDX:
0000000000000000 RSI:
0000000000102000 RDI:
ffff888821f93630
RBP:
0000000000103000 R08:
000000000006c000 R09:
0000000000000238
R10:
0000000000102fff R11:
ffffc9002212bac8 R12:
0000000000000001
R13:
ffffffffffffffff R14:
0000000000102000 R15:
ffff888821f937e0
__lookup_extent_mapping+0xa0/0x110
try_release_extent_mapping+0xdc/0x220
btrfs_releasepage+0x45/0x70
shrink_page_list+0xa39/0xb30
shrink_inactive_list+0x18f/0x3b0
shrink_lruvec+0x38e/0x6b0
shrink_node+0x14d/0x690
do_try_to_free_pages+0xc6/0x3e0
try_to_free_mem_cgroup_pages+0xe6/0x1e0
reclaim_high.constprop.73+0x87/0xc0
mem_cgroup_handle_over_high+0x66/0x150
exit_to_usermode_loop+0x82/0xd0
do_syscall_64+0xd4/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
On a PREEMPT=n kernel, the try_release_extent_mapping() function's
"while" loop might run for a very long time on a large I/O. This commit
therefore adds a cond_resched() to this loop, providing RCU any needed
quiescent states.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 7 May 2020 23:38:29 +0000 (16:38 -0700)]
rcu: Expedited grace-period sleeps to idle priority
This commit converts the schedule_timeout_uninterruptible() call used
by RCU's expedited grace-period processing to schedule_timeout_idle().
This conversion avoids polluting the load-average with RCU-related
sleeping.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 7 May 2020 23:36:10 +0000 (16:36 -0700)]
rcu: No-CBs-related sleeps to idle priority
This commit converts the schedule_timeout_interruptible() call used by
RCU's no-CBs grace-period kthreads to schedule_timeout_idle(). This
conversion avoids polluting the load-average with RCU-related sleeping.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 7 May 2020 23:34:38 +0000 (16:34 -0700)]
rcu: Priority-boost-related sleeps to idle priority
This commit converts the long-standing schedule_timeout_interruptible()
call used by RCU's priority-boosting kthreads to schedule_timeout_idle().
This conversion avoids polluting the load-average with RCU-related
sleeping.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 7 May 2020 22:44:46 +0000 (15:44 -0700)]
rcu: Grace-period-kthread related sleeps to idle priority
This commit converts the long-standing schedule_timeout_interruptible()
and schedule_timeout_uninterruptible() calls used by RCU's grace-period
kthread to schedule_timeout_idle(). This conversion avoids polluting
the load-average with RCU-related sleeping.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Madhuparna Bhowmik [Mon, 4 May 2020 12:05:05 +0000 (08:05 -0400)]
trace: events: rcu: Change description of rcu_dyntick trace event
The different strings used for describing the polarity are
Start, End and StillNonIdle. Since StillIdle is not used in any trace
point for rcu_dyntick, it can be removed and StillNonIdle can be added
in the description. Because StillNonIdle is used in a few tracepoints for
rcu_dyntick.
Similarly, USER, IDLE and IRQ are used for describing context in
the rcu_dyntick tracepoints. Since, "KERNEL" is not used for any
of the rcu_dyntick tracepoints, remove it from the description.
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Mon, 4 May 2020 02:16:09 +0000 (19:16 -0700)]
rcu: Add comment documenting rcu_callback_map's purpose
The rcu_callback_map lockdep_map structure was added back in 2013, but
its purpose has become obscure. This commit therefore documments that the
purpose of rcu_callback map is, in the words of commit
24ef659a857 ("rcu:
Provide better diagnostics for blocking in RCU callback functions"),
to help lockdep to tie an "inappropriate voluntary context switch back
to the fact that the function is being invoked from within a callback."
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Fri, 1 May 2020 23:49:48 +0000 (16:49 -0700)]
rcu: Add callbacks-invoked counters
This commit adds a count of the callbacks invoked to the per-CPU rcu_data
structure. This count is printed by the show_rcu_gp_kthreads() that
is invoked by rcutorture and the RCU CPU stall-warning code. It is also
intended for use by drgn.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Wei Yang [Sun, 19 Apr 2020 21:57:15 +0000 (21:57 +0000)]
rcu: Simplify the calculation of rcu_state.ncpus
There is only 1 bit set in mask, which means that the only difference
between oldmask and the new one will be at the position where the bit is
set in mask. This commit therefore updates rcu_state.ncpus by checking
whether the bit in mask is already set in rnp->expmaskinitnext.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Paul E. McKenney [Thu, 16 Apr 2020 23:46:10 +0000 (16:46 -0700)]
mm/mmap.c: Add cond_resched() for exit_mmap() CPU stalls
A large process running on a heavily loaded system can encounter the
following RCU CPU stall warning:
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 3-....: (20998 ticks this GP) idle=4ea/1/0x4000000000000002 softirq=556558/556558 fqs=5190
(t=21013 jiffies g=
1005461 q=132576)
NMI backtrace for cpu 3
CPU: 3 PID: 501900 Comm: aio-free-ring-w Kdump: loaded Not tainted 5.2.9-108_fbk12_rc3_3858_gb83b75af7909 #1
Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS HBM6.71 02/03/2016
Call Trace:
<IRQ>
dump_stack+0x46/0x60
nmi_cpu_backtrace.cold.3+0x13/0x50
? lapic_can_unplug_cpu.cold.27+0x34/0x34
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.87+0x1aa/0x397
? tick_sched_do_timer+0x60/0x60
update_process_times+0x28/0x60
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:kmem_cache_free+0x223/0x300
Code: 88 00 00 00 0f 85 ca 00 00 00 41 8b 55 18 31 f6 f7 da 41 f6 45 0a 02 40 0f 94 c6 83 c6 05 9c 41 5e fa e8 a0 a7 01 00 41 56 9d <49> 8b 47 08 a8 03 0f 85 87 00 00 00 65 48 ff 08 e9 3d fe ff ff 65
RSP: 0018:
ffffc9000e8e3da8 EFLAGS:
00000206 ORIG_RAX:
ffffffffffffff13
RAX:
0000000000020000 RBX:
ffff88861b9de960 RCX:
0000000000000030
RDX:
fffffffffffe41e8 RSI:
000060777fe3a100 RDI:
000000000001be18
RBP:
ffffea00186e7780 R08:
ffffffffffffffff R09:
ffffffffffffffff
R10:
ffff88861b9dea28 R11:
ffff88887ffde000 R12:
ffffffff81230a1f
R13:
ffff888854684dc0 R14:
0000000000000206 R15:
ffff8888547dbc00
? remove_vma+0x4f/0x60
remove_vma+0x4f/0x60
exit_mmap+0xd6/0x160
mmput+0x4a/0x110
do_exit+0x278/0xae0
? syscall_trace_enter+0x1d3/0x2b0
? handle_mm_fault+0xaa/0x1c0
do_group_exit+0x3a/0xa0
__x64_sys_exit_group+0x14/0x20
do_syscall_64+0x42/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
And on a PREEMPT=n kernel, the "while (vma)" loop in exit_mmap() can run
for a very long time given a large process. This commit therefore adds
a cond_resched() to this loop, providing RCU any needed quiescent states.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Wei Yang [Wed, 15 Apr 2020 22:26:55 +0000 (22:26 +0000)]
rcu: Initialize and destroy rcu_synchronize only when necessary
The __wait_rcu_gp() function unconditionally initializes and cleans up
each element of rs_array[], whether used or not. This is slightly
wasteful and rather confusing, so this commit skips both initialization
and cleanup for duplicate callback functions.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Linus Torvalds [Sun, 28 Jun 2020 22:00:24 +0000 (15:00 -0700)]
Linux 5.8-rc3
Linus Torvalds [Sun, 28 Jun 2020 21:57:14 +0000 (14:57 -0700)]
Merge tag 'arm-omap-fixes-5.8-1' of git://git./linux/kernel/git/soc/soc
Pull ARM OMAP fixes from Arnd Bergmann:
"The OMAP developers are particularly active at hunting down
regressions, so this is a separate branch with OMAP specific
fixes for v5.8:
As Tony explains
"The recent display subsystem (DSS) related platform data changes
caused display related regressions for suspend and resume. Looks
like I only tested suspend and resume before dropping the legacy
platform data, and forgot to test it after dropping it. Turns out
the main issue was that we no longer have platform code calling
pm_runtime_suspend for DSS like we did for the legacy platform data
case, and that fix is still being discussed on the dri-devel list
and will get merged separately. The DSS related testing exposed a
pile other other display related issues that also need fixing
though":
- Fix ti-sysc optional clock handling and reset status checks for
devices that reset automatically in idle like DSS
- Ignore ti-sysc clockactivity bit unless separately requested to
avoid unexpected performance issues
- Init ti-sysc framedonetv_irq to true and disable for am4
- Avoid duplicate DSS reset for legacy mode with dts data
- Remove LCD timings for am4 as they cause warnings now that we're
using generic panels
Other OMAP changes from Tony include:
- Fix omap_prm reset deassert as we still have drivers setting the
pm_runtime_irq_safe() flag
- Flush posted write for ti-sysc enable and disable
- Fix droid4 spi related errors with spi flags
- Fix am335x USB range and a typo for softreset
- Fix dra7 timer nodes for clocks for IPU and DSP
- Drop duplicate mailboxes after mismerge for dra7
- Prevent pocketgeagle header line signal from accidentally setting
micro-SD write protection signal by removing the default mux
- Fix NFSroot flakeyness after resume for duover by switching the
smsc911x gpio interrupt to back to level sensitive
- Fix regression for omap4 clockevent source after recent system
timer changes
- Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid"
phy-mode
- One patch to convert am3/am4 DT files to use the regular sdhci-omap
driver instead of the old hsmmc driver, this was meant for the
merge window but got lost in the process"
* tag 'arm-omap-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode
ARM: dts: Fix omap4 system timer source clocks
ARM: dts: Fix duovero smsc interrupt for suspend
ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect
Revert "bus: ti-sysc: Increase max softreset wait"
ARM: dts: am437x-epos-evm: remove lcd timings
ARM: dts: am437x-gp-evm: remove lcd timings
ARM: dts: am437x-sk-evm: remove lcd timings
ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes
ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks
ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag
ARM: dts: Fix am33xx.dtsi USB ranges length
bus: ti-sysc: Increase max softreset wait
ARM: OMAP2+: Fix legacy mode dss_reset
bus: ti-sysc: Fix uninitialized framedonetv_irq
bus: ti-sysc: Ignore clockactivity unless specified as a quirk
bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit
ARM: dts: omap4-droid4: Fix spi configuration and increase rate
bus: ti-sysc: Flush posted write on enable and disable
soc: ti: omap-prm: use atomic iopoll instead of sleeping one
...
Linus Torvalds [Sun, 28 Jun 2020 21:55:18 +0000 (14:55 -0700)]
Merge tag 'arm-fixes-5.8-1' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Here are a couple of bug fixes, mostly for devicetree files
NXP i.MX:
- Use correct voltage on some i.MX8M board device trees to avoid
hardware damage
- Code fixes for a compiler warning and incorrect reference counting,
both harmless.
- Fix the i.MX8M SoC driver to correctly identify imx8mp
- Fix watchdog configuration in imx6ul-kontron device tree.
Broadcom:
- A small regression fix for the Raspberry-Pi firmware driver
- A Kconfig change to use the correct timer driver on Northstar
- A DT fix for the Luxul XWC-2000 machine
- Two more DT fixes for NSP SoCs
STmicroelectronics STI
- Revert one broken patch for L2 cache configuration
ARM Versatile Express:
- Fix a regression by reverting a broken DT cleanup
TEE drivers:
- MAINTAINERS: change tee mailing list"
* tag 'arm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
Revert "ARM: sti: Implement dummy L2 cache's write_sec"
soc: imx8m: fix build warning
ARM: imx6: add missing put_device() call in imx6q_suspend_init()
ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram()
soc: imx8m: Correct i.MX8MP UID fuse offset
ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain
ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM
arm64: dts: imx8mm-beacon: Fix voltages on LDO1 and LDO2
arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range
arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range
ARM: dts: NSP: Correct FA2 mailbox node
ARM: bcm2835: Fix integer overflow in rpi_firmware_print_firmware_revision()
MAINTAINERS: change tee mailing list
ARM: dts: NSP: Disable PL330 by default, add dma-coherent property
ARM: bcm: Select ARM_TIMER_SP804 for ARCH_BCM_NSP
ARM: dts: BCM5301X: Add missing memory "device_type" for Luxul XWC-2000
arm: dts: vexpress: Move mcc node back into motherboard node
Linus Torvalds [Sun, 28 Jun 2020 18:59:08 +0000 (11:59 -0700)]
Merge tag 'timers-urgent-2020-06-28' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"A single DocBook fix"
* tag 'timers-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Fix kerneldoc system_device_crosststamp & al
Linus Torvalds [Sun, 28 Jun 2020 18:58:14 +0000 (11:58 -0700)]
Merge tag 'perf-urgent-2020-06-28' of git://git./linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
"A single Kbuild dependency fix"
* tag 'perf-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/rapl: Fix RAPL config variable bug
Linus Torvalds [Sun, 28 Jun 2020 18:42:16 +0000 (11:42 -0700)]
Merge tag 'efi-urgent-2020-06-28' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
- Fix build regression on v4.8 and older
- Robustness fix for TPM log parsing code
- kobject refcount fix for the ESRT parsing code
- Two efivarfs fixes to make it behave more like an ordinary file
system
- Style fixup for zero length arrays
- Fix a regression in path separator handling in the initrd loader
- Fix a missing prototype warning
- Add some kerneldoc headers for newly introduced stub routines
- Allow support for SSDT overrides via EFI variables to be disabled
- Report CPU mode and MMU state upon entry for 32-bit ARM
- Use the correct stack pointer alignment when entering from mixed mode
* tag 'efi-urgent-2020-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub: arm: Print CPU boot mode and MMU state at boot
efi/libstub: arm: Omit arch specific config table matching array on arm64
efi/x86: Setup stack correctly for efi_pe_entry
efi: Make it possible to disable efivar_ssdt entirely
efi/libstub: Descriptions for stub helper functions
efi/libstub: Fix path separator regression
efi/libstub: Fix missing-prototype warning for skip_spaces()
efi: Replace zero-length array and use struct_size() helper
efivarfs: Don't return -EINTR when rate-limiting reads
efivarfs: Update inode modification time for successful writes
efi/esrt: Fix reference count leak in esre_create_sysfs_entry.
efi/tpm: Verify event log header before parsing
efi/x86: Fix build with gcc 4
Linus Torvalds [Sun, 28 Jun 2020 17:37:39 +0000 (10:37 -0700)]
Merge tag 'sched_urgent_for_5.8_rc3' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
"The most anticipated fix in this pull request is probably the horrible
build fix for the RANDSTRUCT fail that didn't make -rc2. Also included
is the cleanup that removes those BUILD_BUG_ON()s and replaces it with
ugly unions.
Also included is the try_to_wake_up() race fix that was first
triggered by Paul's RCU-torture runs, but was independently hit by
Dave Chinner's fstest runs as well"
* tag 'sched_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/cfs: change initial value of runnable_avg
smp, irq_work: Continue smp_call_function*() and irq_work*() integration
sched/core: s/WF_ON_RQ/WQ_ON_CPU/
sched/core: Fix ttwu() race
sched/core: Fix PI boosting between RT and DEADLINE tasks
sched/deadline: Initialize ->dl_boosted
sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix mask corruption
sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail
Linus Torvalds [Sun, 28 Jun 2020 17:35:01 +0000 (10:35 -0700)]
Merge tag 'x86_urgent_for_5.8_rc3' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- AMD Memory bandwidth counter width fix, by Babu Moger.
- Use the proper length type in the 32-bit truncate() syscall variant,
by Jiri Slaby.
- Reinit IA32_FEAT_CTL during wakeup to fix the case where after
resume, VMXON would #GP due to VMX not being properly enabled, by
Sean Christopherson.
- Fix a static checker warning in the resctrl code, by Dan Carpenter.
- Add a CR4 pinning mask for bits which cannot change after boot, by
Kees Cook.
- Align the start of the loop of __clear_user() to 16 bytes, to improve
performance on AMD zen1 and zen2 microarchitectures, by Matt Fleming.
* tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/64: Align start of __clear_user() loop to 16-bytes
x86/cpu: Use pinning mask for CR4 bits needing to be 0
x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get()
x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup
syscalls: Fix offset type of ksys_ftruncate()
x86/resctrl: Fix memory bandwidth counter width for AMD
Linus Torvalds [Sun, 28 Jun 2020 17:29:38 +0000 (10:29 -0700)]
Merge tag 'rcu_urgent_for_5.8_rc3' of git://git./linux/kernel/git/tip/tip
Pull RCU-vs-KCSAN fixes from Borislav Petkov:
"A single commit that uses "arch_" atomic operations to avoid the
instrumentation that comes with the non-"arch_" versions.
In preparation for that commit, it also has another commit that makes
these "arch_" atomic operations available to generic code.
Without these commits, KCSAN uses can see pointless errors"
* tag 'rcu_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Fixup noinstr warnings
locking/atomics: Provide the arch_atomic_ interface to generic code
Linus Torvalds [Sun, 28 Jun 2020 17:16:15 +0000 (10:16 -0700)]
Merge tag 'objtool_urgent_for_5.8_rc3' of git://git./linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:
"Three fixes from Peter Zijlstra suppressing KCOV instrumentation in
noinstr sections.
Peter Zijlstra says:
"Address KCOV vs noinstr. There is no function attribute to
selectively suppress KCOV instrumentation, instead teach objtool
to NOP out the calls in noinstr functions"
This cures a bunch of KCOV crashes (as used by syzcaller)"
* tag 'objtool_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix noinstr vs KCOV
objtool: Provide elf_write_{insn,reloc}()
objtool: Clean up elf_write() condition
Linus Torvalds [Sun, 28 Jun 2020 16:42:47 +0000 (09:42 -0700)]
Merge tag 'x86_entry_for_5.8' of git://git./linux/kernel/git/tip/tip
Pull x86 entry fixes from Borislav Petkov:
"This is the x86/entry urgent pile which has accumulated since the
merge window.
It is not the smallest but considering the almost complete entry core
rewrite, the amount of fixes to follow is somewhat higher than usual,
which is to be expected.
Peter Zijlstra says:
'These patches address a number of instrumentation issues that were
found after the x86/entry overhaul. When combined with rcu/urgent
and objtool/urgent, these patches make UBSAN/KASAN/KCSAN happy
again.
Part of making this all work is bumping the minimum GCC version for
KASAN builds to gcc-8.3, the reason for this is that the
__no_sanitize_address function attribute is broken in GCC releases
before that.
No known GCC version has a working __no_sanitize_undefined, however
because the only noinstr violation that results from this happens
when an UB is found, we treat it like WARN. That is, we allow it to
violate the noinstr rules in order to get the warning out'"
* tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry: Fix #UD vs WARN more
x86/entry: Increase entry_stack size to a full page
x86/entry: Fixup bad_iret vs noinstr
objtool: Don't consider vmlinux a C-file
kasan: Fix required compiler version
compiler_attributes.h: Support no_sanitize_undefined check with GCC 4
x86/entry, bug: Comment the instrumentation_begin() usage for WARN()
x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*()
x86/entry, cpumask: Provide non-instrumented variant of cpu_is_offline()
compiler_types.h: Add __no_sanitize_{address,undefined} to noinstr
kasan: Bump required compiler version
x86, kcsan: Add __no_kcsan to noinstr
kcsan: Remove __no_kcsan_or_inline
x86, kcsan: Remove __no_kcsan_or_inline usage
Vincent Guittot [Wed, 24 Jun 2020 15:44:22 +0000 (17:44 +0200)]
sched/cfs: change initial value of runnable_avg
Some performance regression on reaim benchmark have been raised with
commit
070f5e860ee2 ("sched/fair: Take into account runnable_avg to classify group")
The problem comes from the init value of runnable_avg which is initialized
with max value. This can be a problem if the newly forked task is finally
a short task because the group of CPUs is wrongly set to overloaded and
tasks are pulled less agressively.
Set initial value of runnable_avg equals to util_avg to reflect that there
is no waiting time so far.
Fixes: 070f5e860ee2 ("sched/fair: Take into account runnable_avg to classify group")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200624154422.29166-1-vincent.guittot@linaro.org
Peter Zijlstra [Mon, 22 Jun 2020 10:01:25 +0000 (12:01 +0200)]
smp, irq_work: Continue smp_call_function*() and irq_work*() integration
Instead of relying on BUG_ON() to ensure the various data structures
line up, use a bunch of horrible unions to make it all automatic.
Much of the union magic is to ensure irq_work and smp_call_function do
not (yet) see the members of their respective data structures change
name.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lkml.kernel.org/r/20200622100825.844455025@infradead.org
Peter Zijlstra [Mon, 22 Jun 2020 10:01:24 +0000 (12:01 +0200)]
sched/core: s/WF_ON_RQ/WQ_ON_CPU/
Use a better name for this poorly named flag, to avoid confusion...
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20200622100825.785115830@infradead.org
Peter Zijlstra [Mon, 22 Jun 2020 10:01:23 +0000 (12:01 +0200)]
sched/core: Fix ttwu() race
Paul reported rcutorture occasionally hitting a NULL deref:
sched_ttwu_pending()
ttwu_do_wakeup()
check_preempt_curr() := check_preempt_wakeup()
find_matching_se()
is_same_group()
if (se->cfs_rq == pse->cfs_rq) <-- *BOOM*
Debugging showed that this only appears to happen when we take the new
code-path from commit:
2ebb17717550 ("sched/core: Offload wakee task activation if it the wakee is descheduling")
and only when @cpu == smp_processor_id(). Something which should not
be possible, because p->on_cpu can only be true for remote tasks.
Similarly, without the new code-path from commit:
c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
this would've unconditionally hit:
smp_cond_load_acquire(&p->on_cpu, !VAL);
and if: 'cpu == smp_processor_id() && p->on_cpu' is possible, this
would result in an instant live-lock (with IRQs disabled), something
that hasn't been reported.
The NULL deref can be explained however if the task_cpu(p) load at the
beginning of try_to_wake_up() returns an old value, and this old value
happens to be smp_processor_id(). Further assume that the p->on_cpu
load accurately returns 1, it really is still running, just not here.
Then, when we enqueue the task locally, we can crash in exactly the
observed manner because p->se.cfs_rq != rq->cfs_rq, because p's cfs_rq
is from the wrong CPU, therefore we'll iterate into the non-existant
parents and NULL deref.
The closest semi-plausible scenario I've managed to contrive is
somewhat elaborate (then again, actual reproduction takes many CPU
hours of rcutorture, so it can't be anything obvious):
X->cpu = 1
rq(1)->curr = X
CPU0 CPU1 CPU2
// switch away from X
LOCK rq(1)->lock
smp_mb__after_spinlock
dequeue_task(X)
X->on_rq = 9
switch_to(Z)
X->on_cpu = 0
UNLOCK rq(1)->lock
// migrate X to cpu 0
LOCK rq(1)->lock
dequeue_task(X)
set_task_cpu(X, 0)
X->cpu = 0
UNLOCK rq(1)->lock
LOCK rq(0)->lock
enqueue_task(X)
X->on_rq = 1
UNLOCK rq(0)->lock
// switch to X
LOCK rq(0)->lock
smp_mb__after_spinlock
switch_to(X)
X->on_cpu = 1
UNLOCK rq(0)->lock
// X goes sleep
X->state = TASK_UNINTERRUPTIBLE
smp_mb(); // wake X
ttwu()
LOCK X->pi_lock
smp_mb__after_spinlock
if (p->state)
cpu = X->cpu; // =? 1
smp_rmb()
// X calls schedule()
LOCK rq(0)->lock
smp_mb__after_spinlock
dequeue_task(X)
X->on_rq = 0
if (p->on_rq)
smp_rmb();
if (p->on_cpu && ttwu_queue_wakelist(..)) [*]
smp_cond_load_acquire(&p->on_cpu, !VAL)
cpu = select_task_rq(X, X->wake_cpu, ...)
if (X->cpu != cpu)
switch_to(Y)
X->on_cpu = 0
UNLOCK rq(0)->lock
However I'm having trouble convincing myself that's actually possible
on x86_64 -- after all, every LOCK implies an smp_mb() there, so if ttwu
observes ->state != RUNNING, it must also observe ->cpu != 1.
(Most of the previous ttwu() races were found on very large PowerPC)
Nevertheless, this fully explains the observed failure case.
Fix it by ordering the task_cpu(p) load after the p->on_cpu load,
which is easy since nothing actually uses @cpu before this.
Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200622125649.GC576871@hirez.programming.kicks-ass.net
Juri Lelli [Mon, 19 Nov 2018 15:32:01 +0000 (16:32 +0100)]
sched/core: Fix PI boosting between RT and DEADLINE tasks
syzbot reported the following warning:
WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628
enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
At deadline.c:628 we have:
623 static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
624 {
625 struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
626 struct rq *rq = rq_of_dl_rq(dl_rq);
627
628 WARN_ON(dl_se->dl_boosted);
629 WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline));
[...]
}
Which means that setup_new_dl_entity() has been called on a task
currently boosted. This shouldn't happen though, as setup_new_dl_entity()
is only called when the 'dynamic' deadline of the new entity
is in the past w.r.t. rq_clock and boosted tasks shouldn't verify this
condition.
Digging through the PI code I noticed that what above might in fact happen
if an RT tasks blocks on an rt_mutex hold by a DEADLINE task. In the
first branch of boosting conditions we check only if a pi_task 'dynamic'
deadline is earlier than mutex holder's and in this case we set mutex
holder to be dl_boosted. However, since RT 'dynamic' deadlines are only
initialized if such tasks get boosted at some point (or if they become
DEADLINE of course), in general RT 'dynamic' deadlines are usually equal
to 0 and this verifies the aforementioned condition.
Fix it by checking that the potential donor task is actually (even if
temporary because in turn boosted) running at DEADLINE priority before
using its 'dynamic' deadline value.
Fixes: 2d3d891d3344 ("sched/deadline: Add SCHED_DEADLINE inheritance logic")
Reported-by: syzbot+119ba87189432ead09b4@syzkaller.appspotmail.com
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
Link: https://lkml.kernel.org/r/20181119153201.GB2119@localhost.localdomain
Juri Lelli [Wed, 17 Jun 2020 07:29:19 +0000 (09:29 +0200)]
sched/deadline: Initialize ->dl_boosted
syzbot reported the following warning triggered via SYSC_sched_setattr():
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 setup_new_dl_entity /kernel/sched/deadline.c:594 [inline]
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_dl_entity /kernel/sched/deadline.c:1370 [inline]
WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_task_dl+0x1c17/0x2ba0 /kernel/sched/deadline.c:1441
This happens because the ->dl_boosted flag is currently not initialized by
__dl_clear_params() (unlike the other flags) and setup_new_dl_entity()
rightfully complains about it.
Initialize dl_boosted to 0.
Fixes: 2d3d891d3344 ("sched/deadline: Add SCHED_DEADLINE inheritance logic")
Reported-by: syzbot+5ac8bac25f95e8b221e7@syzkaller.appspotmail.com
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Daniel Wagner <dwagner@suse.de>
Link: https://lkml.kernel.org/r/20200617072919.818409-1-juri.lelli@redhat.com
Scott Wood [Wed, 17 Jun 2020 12:17:42 +0000 (14:17 +0200)]
sched/core: Check cpus_mask, not cpus_ptr in __set_cpus_allowed_ptr(), to fix mask corruption
This function is concerned with the long-term CPU mask, not the
transitory mask the task might have while migrate disabled. Before
this patch, if a task was migrate-disabled at the time
__set_cpus_allowed_ptr() was called, and the new mask happened to be
equal to the CPU that the task was running on, then the mask update
would be lost.
Signed-off-by: Scott Wood <swood@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200617121742.cpxppyi7twxmpin7@linutronix.de
Peter Zijlstra [Wed, 10 Jun 2020 10:14:09 +0000 (12:14 +0200)]
sched/core: Fix CONFIG_GCC_PLUGIN_RANDSTRUCT build fail
As a temporary build fix, the proper cleanup needs more work.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Eric Biggers <ebiggers@kernel.org>
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Fixes: a148866489fb ("sched: Replace rq::wake_list")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Arnd Bergmann [Sun, 28 Jun 2020 12:48:19 +0000 (14:48 +0200)]
Merge tag 'imx-fixes-5.8' of git://git./linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.8:
- Fix LDO1 and LDO2 voltage range for a couple of i.MX8M board device
trees.
- Fix i.MX8MP UID fuse offset in i.MX8M SoC driver.
- Fix watchdog configuration in imx6ul-kontron device tree.
- Fix one build warning seen on building soc-imx8m driver with
x86_64-randconfig.
- Add missing put_device() call for a couple of mach-imx PM functions.
* tag 'imx-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx8m: fix build warning
ARM: imx6: add missing put_device() call in imx6q_suspend_init()
ARM: imx5: add missing put_device() call in imx_suspend_alloc_ocram()
soc: imx8m: Correct i.MX8MP UID fuse offset
ARM: dts: imx6ul-kontron: Change WDOG_ANY signal from push-pull to open-drain
ARM: dts: imx6ul-kontron: Move watchdog from Kontron i.MX6UL/ULL board to SoM
arm64: dts: imx8mm-beacon: Fix voltages on LDO1 and LDO2
arm64: dts: imx8mn-ddr4-evk: correct ldo1/ldo2 voltage range
arm64: dts: imx8mm-evk: correct ldo1/ldo2 voltage range
Link: https://lore.kernel.org/r/20200624111725.GA24312@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Sun, 28 Jun 2020 12:48:06 +0000 (14:48 +0200)]
Merge tag 'arm-soc/for-5.8/drivers-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM/ARM64/MIPS SoCs drivers fixes
for 5.8, please pull the following:
- Andy provides a fix for the Raspberry Pi firmware driver to print the
correct time upon boot. This is a fallout from a converstion to use
the ptT format
* tag 'arm-soc/for-5.8/drivers-fixes' of https://github.com/Broadcom/stblinux:
ARM: bcm2835: Fix integer overflow in rpi_firmware_print_firmware_revision()
Link: https://lore.kernel.org/r/20200619202250.19029-2-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Sun, 28 Jun 2020 12:47:40 +0000 (14:47 +0200)]
Merge tag 'arm-soc/for-5.8/soc-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs machine/Kconfig fixes
for 5.8, please pull the following:
- Matthew adds a missing select to permit the use of the standard ARM
SP804 timers on Norsthstar Plus (NSP)
* tag 'arm-soc/for-5.8/soc-fixes' of https://github.com/Broadcom/stblinux:
ARM: bcm: Select ARM_TIMER_SP804 for ARCH_BCM_NSP
Link: https://lore.kernel.org/r/20200619202250.19029-3-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Sun, 28 Jun 2020 12:47:24 +0000 (14:47 +0200)]
Merge tag 'arm-soc/for-5.8/devicetree-fixes' of https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
5.8, please pull the following:
- Rafal adds a missing 'device_type' property to the Luxul XWC-2000
required for the memory nodes to be correctly parsed by Linux
- Matthew provides two fixes for the NSP SoCs, one to disable the PL330
DMA controller by default since it can be left in reset by the
bootloader and the second to correct the flow accelerator mailbox node
* tag 'arm-soc/for-5.8/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: dts: NSP: Correct FA2 mailbox node
ARM: dts: NSP: Disable PL330 by default, add dma-coherent property
ARM: dts: BCM5301X: Add missing memory "device_type" for Luxul XWC-2000
Link: https://lore.kernel.org/r/20200619202250.19029-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Patrice Chotard [Thu, 18 Jun 2020 17:24:56 +0000 (19:24 +0200)]
Revert "ARM: sti: Implement dummy L2 cache's write_sec"
This reverts commit
7b8e0188fa717cd9abc4fb52587445b421835c2a.
Initially, STiH410-B2260 was supposed to be secured, that's why
l2c_write_sec was stubbed to avoid secure register access from
non secure world.
But by default, STiH410-B2260 is running in non secure mode,
so L2 cache register accesses are authorized, l2c_write_sec stub
is not needed.
With this patch, L2 cache is configured and performance are enhanced.
Link: https://lore.kernel.org/r/20200618172456.29475-1-patrice.chotard@st.com
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Cc: Alain Volmat <alain.volmat@st.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Sun, 28 Jun 2020 12:45:05 +0000 (14:45 +0200)]
Merge tag 'omap-for-v5.8/fixes-rc1-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/omap-fixes
Few dts fixes for omaps for v5.8
Few fixes for various devices:
- Prevent pocketgeagle header line signal from accidentally setting
micro-SD write protection signal by removing the default mux
- Fix NFSroot flakeyness after resume for duover by switching the
smsc911x gpio interrupt to back to level sensitive
- Fix regression for omap4 clockevent source after recent system
timer changes
- Yet another ethernet regression fix for the "rgmii" vs "rgmii-rxid"
phy-mode
* tag 'omap-for-v5.8/fixes-rc1-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am5729: beaglebone-ai: fix rgmii phy-mode
ARM: dts: Fix omap4 system timer source clocks
ARM: dts: Fix duovero smsc interrupt for suspend
ARM: dts: am335x-pocketbeagle: Fix mmc0 Write Protect
Link: https://lore.kernel.org/r/pull-1592499282-121092@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Sun, 28 Jun 2020 12:44:39 +0000 (14:44 +0200)]
Merge tag 'omap-for-v5.8/dt-missed-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/omap-fixes
Missed sdhci patch for am3 and am4
I forgot to send a pull request earlier for converting am3 and am4 to
use sdhci-omap driver instead of the old omap_hsmmc driver.
There was a display subsystem related suspend and resume regression found
recently and looks like I forgot to send a pull request for this patch
while debugging the regression. This patch has been tested without the
display subsystem, and has been in Linux next for several weeks now, so
would be good to have merged for v5.8.
* tag 'omap-for-v5.8/dt-missed-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Move am33xx and am43xx mmc nodes to sdhci-omap driver
Link: https://lore.kernel.org/r/pull-1591637467-607254@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 26 Jun 2020 22:20:55 +0000 (00:20 +0200)]
Merge tag 'tee-ml-for-v5.8' of git://git.linaro.org/people/jens.wiklander/linux-tee into arm/fixes
Change the TEE mailing list in MAINTAINERS
* tag 'tee-ml-for-v5.8' of git://git.linaro.org/people/jens.wiklander/linux-tee:
MAINTAINERS: change tee mailing list
Link: https://lore.kernel.org/r/20200616075948.GA2288211@jade
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 26 Jun 2020 22:17:23 +0000 (00:17 +0200)]
Merge tag 'omap-for-v5.8/fixes-merge-window-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes
Fixes for omaps for v5.8
The recent display subsystem (DSS) related platform data changes caused
display related regressions for suspend and resume. Looks like I only
tested suspend and resume before dropping the legacy platform data, and
forgot to test it after dropping it. Turns out the main issue was that
we no longer have platform code calling pm_runtime_suspend for DSS like
we did for the legacy platform data case, and that fix is still being
discussed on the dri-devel list and will get merged separately. The DSS
related testing exposed a pile other other display related issues that
also need fixing though:
- Fix ti-sysc optional clock handling and reset status checks
for devices that reset automatically in idle like DSS
- Ignore ti-sysc clockactivity bit unless separately requested
to avoid unexpected performance issues
- Init ti-sysc framedonetv_irq to true and disable for am4
- Avoid duplicate DSS reset for legacy mode with dts data
- Remove LCD timings for am4 as they cause warnings now that we're
using generic panels
Then there is a pile of other fixes not related to the DSS:
- Fix omap_prm reset deassert as we still have drivers setting the
pm_runtime_irq_safe() flag
- Flush posted write for ti-sysc enable and disable
- Fix droid4 spi related errors with spi flags
- Fix am335x USB range and a typo for softreset
- Fix dra7 timer nodes for clocks for IPU and DSP
- Drop duplicate mailboxes after mismerge for dra7
* tag 'omap-for-v5.8/fixes-merge-window-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
Revert "bus: ti-sysc: Increase max softreset wait"
ARM: dts: am437x-epos-evm: remove lcd timings
ARM: dts: am437x-gp-evm: remove lcd timings
ARM: dts: am437x-sk-evm: remove lcd timings
ARM: dts: dra7-evm-common: Fix duplicate mailbox nodes
ARM: dts: dra7: Fix timer nodes properly for timer_sys_ck clocks
ARM: dts: Fix am33xx.dtsi ti,sysc-mask wrong softreset flag
ARM: dts: Fix am33xx.dtsi USB ranges length
bus: ti-sysc: Increase max softreset wait
ARM: OMAP2+: Fix legacy mode dss_reset
bus: ti-sysc: Fix uninitialized framedonetv_irq
bus: ti-sysc: Ignore clockactivity unless specified as a quirk
bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit
ARM: dts: omap4-droid4: Fix spi configuration and increase rate
bus: ti-sysc: Flush posted write on enable and disable
soc: ti: omap-prm: use atomic iopoll instead of sleeping one
Link: https://lore.kernel.org/r/pull-1591889257-410830@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
David Howells [Wed, 24 Jun 2020 16:00:24 +0000 (17:00 +0100)]
afs: Fix storage of cell names
The cell name stored in the afs_cell struct is a 64-char + NUL buffer -
when it needs to be able to handle up to AFS_MAXCELLNAME (256 chars) + NUL.
Fix this by changing the array to a pointer and allocating the string.
Found using Coverity.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 27 Jun 2020 22:24:04 +0000 (15:24 -0700)]
Merge tag '5.8-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Six cifs/smb3 fixes, three of them for stable.
Fixes xfstests 451, 313 and 316"
* tag '5.8-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: misc: Use array_size() in if-statement controlling expression
cifs: update ctime and mtime during truncate
cifs/smb3: Fix data inconsistent when punch hole
cifs/smb3: Fix data inconsistent when zero file range
cifs: Fix double add page to memcg when cifs_readpages
cifs: Fix cached_fid refcnt leak in open_shroot
Linus Torvalds [Sat, 27 Jun 2020 22:20:03 +0000 (15:20 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six small fixes, five in drivers and one to correct another minor
regression from
cc97923a5bcc ("block: move dma drain handling to
scsi") where we still need the drain stub to be built in to the kernel
for the modular libata, non-modular SAS driver case"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mptscsih: Fix read sense data size
scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action
scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()
scsi: libata: Fix the ata_scsi_dma_need_drain stub
scsi: qla2xxx: Keep initiator ports after RSCN
scsi: qla2xxx: Set NVMe status code for failed NVMe FCP request
Linus Torvalds [Sat, 27 Jun 2020 22:17:21 +0000 (15:17 -0700)]
Merge tag 'vfio-v5.8-rc3' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- Fix double free of eventfd ctx (Alex Williamson)
- Fix duplicate use of capability ID (Alex Williamson)
- Fix SR-IOV VF memory enable handling (Alex Williamson)
* tag 'vfio-v5.8-rc3' of git://github.com/awilliam/linux-vfio:
vfio/pci: Fix SR-IOV VF handling with MMIO blocking
vfio/type1: Fix migration info capability ID
vfio/pci: Clear error and request eventfd ctx after releasing
Linus Torvalds [Sat, 27 Jun 2020 22:15:37 +0000 (15:15 -0700)]
Merge branch 'i2c/for-next' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"This contains a 5.8 regression fix for the Designware driver, a
register bitfield fix for the fsi driver, and a missing sanity check
for the I2C core"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core: check returned size of emulated smbus block read
i2c: fsi: Fix the port number field in status register
i2c: designware: Adjust bus speed independently of ACPI
Linus Torvalds [Sat, 27 Jun 2020 20:14:15 +0000 (13:14 -0700)]
Merge tag 'staging-5.8-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are a small number of tiny staging driver fixes for 5.8-rc3.
Not much here, but there were some reported problems to be fixed:
- three wfx driver fixes
- rtl8723bs driver fix
All of these have been in linux-next with no reported issues"
* tag 'staging-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: rtl8723bs: prevent buffer overflow in update_sta_support_rate()
staging: wfx: fix coherency of hif_scan() prototype
staging: wfx: drop useless loop
staging: wfx: fix AC priority
Linus Torvalds [Sat, 27 Jun 2020 20:12:10 +0000 (13:12 -0700)]
Merge tag 'usb-5.8-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 5.8-rc3 to resolve some reported
issues.
Nothing major here:
- gadget driver fixes
- cdns3 driver fixes
- xhci fixes
- renesas_usbhs driver fixes
- some new device support with ids
- documentation update
All of these have been in linux-next with no reported issues"
* tag 'usb-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (27 commits)
usb: renesas_usbhs: getting residue from callback_result
Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
xhci: Poll for U0 after disabling USB2 LPM
xhci: Return if xHCI doesn't support LPM
usb: host: xhci-mtk: avoid runtime suspend when removing hcd
xhci: Fix enumeration issue when setting max packet size for FS devices.
xhci: Fix incorrect EP_STATE_MASK
usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
usb: cdns3: trace: using correct dir value
usb: cdns3: ep0: fix the test mode set incorrectly
Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
usb: gadget: udc: Potential Oops in error handling code
usb: phy: tegra: Fix unnecessary check in tegra_usb_phy_probe()
usb: dwc3: pci: Fix reference count leak in dwc3_pci_resume_work
usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
usb: cdns3: trace: using correct dir value
usb: cdns3: ep0: fix the test mode set incorrectly
usb: typec: tcpci_rt1711h: avoid screaming irq causing boot hangs
USB: ohci-sm501: Add missed iounmap() in remove
cdc-acm: Add DISABLE_ECHO quirk for Microchip/SMSC chip
...
Linus Torvalds [Sat, 27 Jun 2020 20:10:31 +0000 (13:10 -0700)]
Merge tag 'char-misc-5.8-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Some tiny char/misc driver fixes for 5.8-rc3.
The "largest" changes are in the mei driver, to resolve some reported
problems and add some new device ids. There's also a binder bugfix, an
fpga driver build fix, and some assorted habanalabs fixes.
All of these, except for the habanalabs fixes, have been in linux-next
with no reported issues. The habanalabs driver changes showed up in my
tree on Friday, but as they are totally self-contained, all should be
good there"
* tag 'char-misc-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
habanalabs: increase h/w timer when checking idle
habanalabs: Correct handling when failing to enqueue CB
habanalabs: increase GAUDI QMAN ARB WDT timeout
habanalabs: rename mmu_write() to mmu_asid_va_write()
habanalabs: use PI in MMU cache invalidation
habanalabs: block scalar load_and_exe on external queue
mei: me: add tiger lake point device ids for H platforms.
mei: me: disable mei interface on Mehlow server platforms
binder: fix null deref of proc->context
fpga: zynqmp: fix modular build
Linus Torvalds [Sat, 27 Jun 2020 20:08:14 +0000 (13:08 -0700)]
Merge tag 'edac_urgent_for_5.8' of git://git./linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
"A single fix for amd64_edac restoring the reporting of the DRAM scrub
rate on family 0x15 CPUs"
* tag 'edac_urgent_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/amd64: Read back the scrub rate PCI register on F15h
Linus Torvalds [Sat, 27 Jun 2020 20:06:22 +0000 (13:06 -0700)]
Merge tag 'dma-mapping-5.8-4' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- fix dma coherent mmap in nommu (me)
- more AMD SEV fallout (David Rientjes, me)
- fix alignment in dma_common_*_remap (Eric Auger)
* tag 'dma-mapping-5.8-4' of git://git.infradead.org/users/hch/dma-mapping:
dma-remap: align the size in dma_common_*_remap()
dma-mapping: DMA_COHERENT_POOL should select GENERIC_ALLOCATOR
dma-direct: add missing set_memory_decrypted() for coherent mapping
dma-direct: check return value when encrypting or decrypting memory
dma-direct: re-encrypt memory if dma_direct_alloc_pages() fails
dma-direct: always align allocation size in dma_direct_alloc_pages()
dma-direct: mark __dma_direct_alloc_pages static
dma-direct: re-enable mmap for !CONFIG_MMU
Linus Torvalds [Sat, 27 Jun 2020 16:35:47 +0000 (09:35 -0700)]
Merge tag 'nfs-for-5.8-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client bugfixes from Anna Schumaker:
"Stable Fixes:
- xprtrdma: Fix handling of RDMA_ERROR replies
- sunrpc: Fix rollback in rpc_gssd_dummy_populate()
- pNFS/flexfiles: Fix list corruption if the mirror count changes
- NFSv4: Fix CLOSE not waiting for direct IO completion
- SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
Other Fixes:
- xprtrdma: Fix a use-after-free with r_xprt->rx_ep
- Fix other xprtrdma races during disconnect
- NFS: Fix memory leak of export_path"
* tag 'nfs-for-5.8-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
NFSv4 fix CLOSE not waiting for direct IO compeletion
pNFS/flexfiles: Fix list corruption if the mirror count changes
nfs: Fix memory leak of export_path
sunrpc: fixed rollback in rpc_gssd_dummy_populate()
xprtrdma: Fix handling of RDMA_ERROR replies
xprtrdma: Clean up disconnect
xprtrdma: Clean up synopsis of rpcrdma_flush_disconnect()
xprtrdma: Use re_connect_status safely in rpcrdma_xprt_connect()
xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed
Linus Torvalds [Sat, 27 Jun 2020 16:02:49 +0000 (09:02 -0700)]
Merge tag 'io_uring-5.8-2020-06-26' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Three small fixes:
- Close a corner case for polled IO resubmission (Pavel)
- Toss commands when exiting (Pavel)
- Fix SQPOLL conditional reschedule on perpetually busy submit
(Xuan)"
* tag 'io_uring-5.8-2020-06-26' of git://git.kernel.dk/linux-block:
io_uring: fix current->mm NULL dereference on exit
io_uring: fix hanging iopoll in case of -EAGAIN
io_uring: fix io_sq_thread no schedule when busy
Linus Torvalds [Sat, 27 Jun 2020 15:59:32 +0000 (08:59 -0700)]
Merge tag 'block-5.8-2020-06-26' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request from Christoph:
- multipath deadlock fixes (Anton)
- NUMA fixes (Max)
- RDMA completion vector fix (Max)
- IO deadlock fix (Sagi)
- multipath reference fix (Sagi)
- NS mutation fix (Sagi)
- Use right allocator when freeing bip in error path (Chengguang)
* tag 'block-5.8-2020-06-26' of git://git.kernel.dk/linux-block:
nvme-multipath: fix bogus request queue reference put
nvme-multipath: fix deadlock due to head->lock
nvme: don't protect ns mutation with ns->head->lock
nvme-multipath: fix deadlock between ana_work and scan_work
nvme: fix possible deadlock when I/O is blocked
nvme-rdma: assign completion vector correctly
nvme-loop: initialize tagset numa value to the value of the ctrl
nvme-tcp: initialize tagset numa value to the value of the ctrl
nvme-pci: initialize tagset numa value to the value of the ctrl
nvme-pci: override the value of the controller's numa node
nvme: set initial value for controller's numa node
block: release bip in a right way in error path
Linus Torvalds [Sat, 27 Jun 2020 15:57:16 +0000 (08:57 -0700)]
Merge tag 'for-5.8/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Quite a few DM zoned target fixes and a Zone append fix in DM core.
Considering the amount of dm-zoned changes that went in during the
5.8 merge window these fixes are not that surprising.
- A few DM writecache target fixes.
- A fix to Documentation index to include DM ebs target docs.
- Small cleanup to use struct_size() in DM core's retrieve_deps().
* tag 'for-5.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm writecache: add cond_resched to loop in persistent_memory_claim()
dm zoned: Fix reclaim zone selection
dm zoned: Fix random zone reclaim selection
dm: update original bio sector on Zone Append
dm zoned: Fix metadata zone size check
docs: device-mapper: add dm-ebs.rst to an index file
dm ioctl: use struct_size() helper in retrieve_deps()
dm writecache: skip writecache_wait when using pmem mode
dm writecache: correct uncommitted_block when discarding uncommitted entry
dm zoned: assign max_io_len correctly
dm zoned: fix uninitialized pointer dereference
Linus Torvalds [Sat, 27 Jun 2020 15:53:49 +0000 (08:53 -0700)]
Merge tag 'kgdb-5.8-rc3' of git://git./linux/kernel/git/danielt/linux
Pull kgdb fixes from Daniel Thompson:
"The main change here is a fix for a number of unsafe interactions
between kdb and the console system. The fixes are specific to kdb
(pure kgdb debugging does not use the console system at all). On
systems with an NMI then kdb, if it is enabled, must get messages to
the user despite potentially running from some "difficult" calling
contexts. These fixes avoid using the console system where we have
been provided an alternative (safer) way to interact with the user
and, if using the console system in unavoidable, use oops_in_progress
for deadlock avoidance. These fixes also ensure kdb honours the
console enable flag.
Also included is a fix that wraps kgdb trap handling in an RCU read
lock to avoids triggering diagnostic warnings. This is a wide lock
scope but this is OK because kgdb is a stop-the-world debugger. When
we stop the world we put all the CPUs into holding pens and this
inhibits RCU update anyway"
* tag 'kgdb-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kgdb: Avoid suspicious RCU usage warning
kdb: Switch to use safer dbg_io_ops over console APIs
kdb: Make kdb_printf() console handling more robust
kdb: Check status of console prior to invoking handlers
kdb: Re-factor kdb_printf() message write code
Linus Torvalds [Sat, 27 Jun 2020 15:51:35 +0000 (08:51 -0700)]
Merge tag 'powerpc-5.8-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- A fix for a crash in nested KVM when CONFIG_DEBUG_VIRTUAL=y.
- Two minor build fixes.
Thanks to: Aneesh Kumar K.V, Arseny Solokha, Harish.
* tag 'powerpc-5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix build failure in ebb tests
powerpc/kvm/book3s64: Fix kernel crash with nested kvm & DEBUG_VIRTUAL
powerpc/fsl_booke/32: Fix build with CONFIG_RANDOMIZE_BASE
Linus Torvalds [Sat, 27 Jun 2020 15:49:12 +0000 (08:49 -0700)]
Merge tag 'riscv-for-linus-5.8-rc3' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a handful of fixes I'd like to target for rc3.
Most of them fix issues with the conversion of our vDSO to C. There is
also one fix to the SiFive PRCI driver that I picked up as it's
causing boot issues on the hardware.
- A fix to allow kernels with dynamic ftrace to use the vDSO.
- Some build fixes for the C vDSO functions.
- A fix to the PRCI driver's memory allocation, which was the cause
of some boot panics with FREELIST_RANDOM"
* tag 'riscv-for-linus-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fixup __vdso_gettimeofday broke dynamic ftrace
riscv: Add extern declarations for vDSO time-related functions
clk: sifive: allocate sufficient memory for struct __prci_data
riscv: Add -fPIC option to CFLAGS_vgettimeofday.o
Linus Torvalds [Sat, 27 Jun 2020 15:47:18 +0000 (08:47 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The big fix here is to our vDSO sigreturn trampoline as, after a
painfully long stint of debugging, it turned out that fixing some of
our CFI directives in the merge window lit up a bunch of logic in
libgcc which has been shown to SEGV in some cases during asynchronous
pthread cancellation.
It looks like we can fix this by extending the directives to restore
most of the interrupted register state from the sigcontext, but it's
risky and hard to test so we opted to remove the CFI directives for
now and rely on the unwinder fallback path like we used to.
- Fix unwinding through vDSO sigreturn trampoline
- Fix build warnings by raising minimum LD version for PAC
- Whitelist some Kryo Cortex-A55 derivatives for Meltdown and SSB
- Fix perf register PC reporting for compat tasks
- Fix 'make clean' warning for arm64 signal selftests
- Fix ftrace when BTI is compiled in
- Avoid building the compat vDSO using GCC plugins"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Add KRYO{3,4}XX silver CPU cores to SSB safelist
arm64: perf: Report the PC value in REGS_ABI_32 mode
kselftest: arm64: Remove redundant clean target
arm64: kpti: Add KRYO{3, 4}XX silver CPU cores to kpti safelist
arm64: Don't insert a BTI instruction at inner labels
arm64: vdso: Don't use gcc plugins for building vgettimeofday.c
arm64: vdso: Only pass --no-eh-frame-hdr when linker supports it
arm64: Depend on newer binutils when building PAC
arm64: compat: Remove 32-bit sigreturn code from the vDSO
arm64: compat: Always use sigpage for sigreturn trampoline
arm64: compat: Allow 32-bit vdso and sigpage to co-exist
arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline
Arnd Bergmann [Fri, 26 Jun 2020 22:16:42 +0000 (00:16 +0200)]
Merge tag 'juno-fix-5.8' of git://git./linux/kernel/git/sudeep.holla/linux into arm/fixes
ARMv8 Juno/Vexpress/Fast Models fix for v5.8
Partial revert of some recent fixes to silence DTC warning which broke
clocks on some Vexpress platforms resulting in boot issues.
* tag 'juno-fix-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
arm: dts: vexpress: Move mcc node back into motherboard node
Link: https://lore.kernel.org/r/20200609180447.GB5732@bogus
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Fri, 26 Jun 2020 19:33:48 +0000 (12:33 -0700)]
Merge tag 'acpi-5.8-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Prevent bypassing kernel lockdown via the ACPI tables loading
interface (Jason A. Donenfeld) and fix the handling of an ACPI sysfs
attribute (Nathan Chancellor)"
* tag 'acpi-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: sysfs: Fix pm_profile_attr type
ACPI: configfs: Disallow loading ACPI tables when locked down
Linus Torvalds [Fri, 26 Jun 2020 19:32:11 +0000 (12:32 -0700)]
Merge tag 'pm-5.8-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a recent regression that broke suspend-to-idle on some x86
systems, fix the intel_pstate driver to correctly let the platform
firmware control CPU performance in some cases and add __init
annotations to a couple of functions.
Specifics:
- Make sure that the _TIF_POLLING_NRFLAG is clear before entering the
last phase of suspend-to-idle to avoid wakeup issues on some x86
systems (Chen Yu, Rafael Wysocki).
- Cover one more case in which the intel_pstate driver should let the
platform firmware control the CPU frequency and refuse to load
(Srinivas Pandruvada).
- Add __init annotations to 2 functions in the power management core
(Christophe JAILLET)"
* tag 'pm-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: Rearrange s2idle-specific idle state entry code
PM: sleep: core: mark 2 functions as __init to save some memory
cpufreq: intel_pstate: Add one more OOB control bit
PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
Linus Torvalds [Fri, 26 Jun 2020 19:30:07 +0000 (12:30 -0700)]
Merge tag 'iommu-fixes-v5.8-rc2' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"A couple of Intel VT-d fixes:
- Make Intel SVM code 64bit only. The code uses pgd_t* and the IOMMU
only supports long-mode page-table formats, so its broken on 32bit
anyway.
- Make sure GFX quirks in for Intel VT-d are not applied to untrusted
devices. Those devices might gain full memory access otherwise.
- Identity mapping setup fix.
- Fix ACS enabling when Intel IOMMU is off and untrusted devices are
detected.
- Two smaller fixes for coherency and IO page-table setup"
* tag 'iommu-fixes-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Fix misuse of iommu_domain_identity_map()
iommu/vt-d: Update scalable mode paging structure coherency
iommu/vt-d: Enable PCI ACS for platform opt in hint
iommu/vt-d: Don't apply gfx quirks to untrusted devices
iommu/vt-d: Set U/S bit in first level page table by default
iommu/vt-d: Make Intel SVM code 64-bit only
Linus Torvalds [Fri, 26 Jun 2020 19:27:25 +0000 (12:27 -0700)]
Merge tag 'drm-fixes-2020-06-26' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Usual rc3 pickup, lots of little fixes all over.
The core VT registration regression fix is probably the largest,
otherwise ttm, amdgpu and tegra are the bulk, with some minor driver
fixes.
No i915 pull this week which may or may not mean I get 2x of it next
week, we'll see how it goes.
core:
- fix VT registration regression
ttm:
- fix two fence leaks
amdgpu:
- Fix missed mutex unlock in DC error path
- Fix firmware leak for sdma5
- DC bpc property fixes
amdkfd:
- Fix memleak in an error path
radeon:
- Fix copy paste typo in NI DPM spll validation
rcar-du:
- build fix
tegra:
- add missing zpos property
- child driver registeration fix
- debugfs cleanup fix
- doc fix
mcde:
- reorder fbdev setup
panel:
- fix connector type
- fix orienation for some panels
sun4i:
- fix dma/iommu configuration
uvesafb:
- respect blank flag"
* tag 'drm-fixes-2020-06-26' of git://anongit.freedesktop.org/drm/drm: (25 commits)
drm/amd: fix potential memleak in err branch
drm/amd/display: Fix ineffective setting of max bpc property
drm/amd/display: Enable output_bpc property on all outputs
drm/amdgpu: add fw release for sdma v5_0
drm/fb-helper: Fix vt restore
drm/radeon: fix fb_div check in ni_init_smc_spll_table()
drm/amdgpu/display: Unlock mutex on error
drm/sun4i: mixer: Call of_dma_configure if there's an IOMMU
drm: panel-orientation-quirks: Use generic orientation-data for Acer S1003
drm: panel-orientation-quirks: Add quirk for Asus T101HA panel
video: fbdev: uvesafb: fix "noblank" option handling
drm/panel-simple: fix connector type for newhaven_nhd_43_480272ef_atxl
drm/panel-simple: fix connector type for LogicPD Type28 Display
drm: rcar-du: Fix build error
drm: mcde: Fix forgotten user of drm->dev_private
drm: mcde: Fix display initialization problem
drm/tegra: Add zpos property for cursor planes
gpu: host1x: Detach driver on unregister
gpu: host1x: Correct trivial kernel-doc inconsistencies
drm/tegra: hub: Register child devices
...
Linus Torvalds [Fri, 26 Jun 2020 19:19:36 +0000 (12:19 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misx fixes from Andrew Morton:
"31 patches.
Subsystems affected by this patch series: hotfixes, mm/pagealloc,
kexec, ocfs2, lib, mm/slab, mm/slab, mm/slub, mm/swap, mm/pagemap,
mm/vmalloc, mm/memcg, mm/gup, mm/thp, mm/vmscan, x86,
mm/memory-hotplug, MAINTAINERS"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (31 commits)
MAINTAINERS: update info for sparse
mm/memory_hotplug.c: fix false softlockup during pfn range removal
mm: remove vmalloc_exec
arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page
x86/hyperv: allocate the hypercall page with only read and execute bits
mm/memory: fix IO cost for anonymous page
mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
mm: workingset: age nonresident information alongside anonymous pages
doc: THP CoW fault no longer allocate THP
docs: mm/gup: minor documentation update
mm/memcontrol.c: prevent missed memory.low load tears
mm/memcontrol.c: add missed css_put()
mm: memcontrol: handle div0 crash race condition in memory.low
mm/vmalloc.c: fix a warning while make xmldocs
media: omap3isp: remove cacheflush.h
make asm-generic/cacheflush.h more standalone
mm/debug_vm_pgtable: fix build failure with powerpc 8xx
mm/memory.c: properly pte_offset_map_lock/unlock in vm_insert_pages()
mm: fix swap cache node allocation mask
slub: cure list_slab_objects() from double fix
...
Greg Kroah-Hartman [Fri, 26 Jun 2020 15:26:31 +0000 (17:26 +0200)]
Merge tag 'fpga-fixes-for-5.8' of git://git./linux/kernel/git/mdf/linux-fpga into char-misc-next
FPGA Manager fixes for 5.8-rc1
Here is one (late) fix for 5.8-rc1 merge window.
Arnd's change addresses a missing build dependency.
All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my fixes branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-fixes-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: zynqmp: fix modular build
Greg Kroah-Hartman [Fri, 26 Jun 2020 15:24:20 +0000 (17:24 +0200)]
Merge tag 'misc-habanalabs-fixes-2020-06-24' of git://people.freedesktop.org/~gabbayo/linux into char-misc-linus
Oded writes:
This tag contains the following fixes for kernel 5.8-rc2:
- close security hole in GAUDI command buffer parsing by blocking an
instruction that might allow user to run command buffer that wasn't
parsed on a secured engine.
- Fix bug in GAUDI MMU cache invalidation code.
- Rename a function to resolve conflict with a static inline function in
arch/m68k/include/asm/mcfmmu.h
- Increase watchdog timeout of GAUDI QMAN arbitration H/W to prevent false
reports on timeouts
- Fix bug of dereferencing NULL pointer when an error occurs during command
submission
- Increase H/W timer for checking if PDMA engine is IDLE in GAUDI.
* tag 'misc-habanalabs-fixes-2020-06-24' of git://people.freedesktop.org/~gabbayo/linux:
habanalabs: increase h/w timer when checking idle
habanalabs: Correct handling when failing to enqueue CB
habanalabs: increase GAUDI QMAN ARB WDT timeout
habanalabs: rename mmu_write() to mmu_asid_va_write()
habanalabs: use PI in MMU cache invalidation
habanalabs: block scalar load_and_exe on external queue
Greg Kroah-Hartman [Fri, 26 Jun 2020 15:16:52 +0000 (17:16 +0200)]
Merge tag 'fixes-for-v5.8-rc2' of git://git./linux/kernel/git/balbi/usb into usb-linus
Felipe writes:
usb: fixes for v5.8-rc2
A revert of Exynos5422 suspend clock support, it turns out it wasn't
ready to be merged. CDNS3 got a fix for test mode initialization.
Signed-off-by: Felipe Balbi <balbi@kernel.org>
* tag 'fixes-for-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"
usb: gadget: udc: Potential Oops in error handling code
usb: phy: tegra: Fix unnecessary check in tegra_usb_phy_probe()
usb: dwc3: pci: Fix reference count leak in dwc3_pci_resume_work
usb: cdns3: ep0: add spinlock for cdns3_check_new_setup
usb: cdns3: trace: using correct dir value
usb: cdns3: ep0: fix the test mode set incorrectly
Rafael J. Wysocki [Fri, 26 Jun 2020 15:09:49 +0000 (17:09 +0200)]
Merge branches 'pm-cpufreq' and 'pm-cpuidle'
* pm-cpufreq:
cpufreq: intel_pstate: Add one more OOB control bit
* pm-cpuidle:
cpuidle: Rearrange s2idle-specific idle state entry code
PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
Rafael J. Wysocki [Fri, 26 Jun 2020 15:06:29 +0000 (17:06 +0200)]
Merge branch 'acpi-sysfs'
* acpi-sysfs:
ACPI: sysfs: Fix pm_profile_attr type
Greg Kroah-Hartman [Fri, 26 Jun 2020 14:51:14 +0000 (16:51 +0200)]
Merge 5.8-rc2 into usb-linus
Felipe has based his patches on that tag, so update my usb-linus branch
to it as well so that I can pull his patches in here easier.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Douglas Anderson [Tue, 2 Jun 2020 22:47:39 +0000 (15:47 -0700)]
kgdb: Avoid suspicious RCU usage warning
At times when I'm using kgdb I see a splat on my console about
suspicious RCU usage. I managed to come up with a case that could
reproduce this that looked like this:
WARNING: suspicious RCU usage
5.7.0-rc4+ #609 Not tainted
-----------------------------
kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by swapper/0/1:
#0:
ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c
#1:
ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac
#2:
ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac
stack backtrace:
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #609
Hardware name: Google Cheza (rev3+) (DT)
Call trace:
dump_backtrace+0x0/0x1b8
show_stack+0x1c/0x24
dump_stack+0xd4/0x134
lockdep_rcu_suspicious+0xf0/0x100
find_task_by_pid_ns+0x5c/0x80
getthread+0x8c/0xb0
gdb_serial_stub+0x9d4/0xd04
kgdb_cpu_enter+0x284/0x7ac
kgdb_handle_exception+0x174/0x20c
kgdb_brk_fn+0x24/0x30
call_break_hook+0x6c/0x7c
brk_handler+0x20/0x5c
do_debug_exception+0x1c8/0x22c
el1_sync_handler+0x3c/0xe4
el1_sync+0x7c/0x100
rpmh_rsc_probe+0x38/0x420
platform_drv_probe+0x94/0xb4
really_probe+0x134/0x300
driver_probe_device+0x68/0x100
__device_attach_driver+0x90/0xa8
bus_for_each_drv+0x84/0xcc
__device_attach+0xb4/0x13c
device_initial_probe+0x18/0x20
bus_probe_device+0x38/0x98
device_add+0x38c/0x420
If I understand properly we should just be able to blanket kgdb under
one big RCU read lock and the problem should go away. We'll add it to
the beast-of-a-function known as kgdb_cpu_enter().
With this I no longer get any splats and things seem to work fine.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Sumit Garg [Thu, 4 Jun 2020 10:01:19 +0000 (15:31 +0530)]
kdb: Switch to use safer dbg_io_ops over console APIs
In kgdb context, calling console handlers aren't safe due to locks used
in those handlers which could in turn lead to a deadlock. Although, using
oops_in_progress increases the chance to bypass locks in most console
handlers but it might not be sufficient enough in case a console uses
more locks (VT/TTY is good example).
Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.
In order to achieve this we need to reverse the order of preference to
use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
store "struct console" that represents debugger I/O in dbg_io_ops and
while emitting kdb messages, skip console that matches dbg_io_ops
console in order to avoid duplicate messages. After this change,
"is_console" param becomes redundant and hence removed.
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/1591264879-25920-5-git-send-email-sumit.garg@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Chuck Lever [Thu, 25 Jun 2020 15:32:34 +0000 (11:32 -0400)]
SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()
@subbuf is an output parameter of xdr_buf_subsegment(). A survey of
call sites shows that @subbuf is always uninitialized before
xdr_buf_segment() is invoked by callers.
There are some execution paths through xdr_buf_subsegment() that do
not set all of the fields in @subbuf, leaving some pointer fields
containing garbage addresses. Subsequent processing of that buffer
then results in a page fault.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Olga Kornievskaia [Wed, 24 Jun 2020 17:54:08 +0000 (13:54 -0400)]
NFSv4 fix CLOSE not waiting for direct IO compeletion
Figuring out the root case for the REMOVE/CLOSE race and
suggesting the solution was done by Neil Brown.
Currently what happens is that direct IO calls hold a reference
on the open context which is decremented as an asynchronous task
in the nfs_direct_complete(). Before reference is decremented,
control is returned to the application which is free to close the
file. When close is being processed, it decrements its reference
on the open_context but since directIO still holds one, it doesn't
sent a close on the wire. It returns control to the application
which is free to do other operations. For instance, it can delete a
file. Direct IO is finally releasing its reference and triggering
an asynchronous close. Which races with the REMOVE. On the server,
REMOVE can be processed before the CLOSE, failing the REMOVE with
EACCES as the file is still opened.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Suggested-by: Neil Brown <neilb@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Trond Myklebust [Mon, 22 Jun 2020 19:04:15 +0000 (15:04 -0400)]
pNFS/flexfiles: Fix list corruption if the mirror count changes
If the mirror count changes in the new layout we pick up inside
ff_layout_pg_init_write(), then we can end up adding the
request to the wrong mirror and corrupting the mirror->pg_list.
Fixes: d600ad1f2bdb ("NFS41: pop some layoutget errors to application")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Tom Rix [Fri, 12 Jun 2020 22:45:49 +0000 (15:45 -0700)]
nfs: Fix memory leak of export_path
The try_location function is called within a loop by nfs_follow_referral.
try_location calls nfs4_pathname_string to created the export_path.
nfs4_pathname_string allocates the memory. export_path is stored in the
nfs_fs_context/fs_context structure similarly as hostname and source.
But whereas the ctx hostname and source are freed before assignment,
export_path is not. So if there are multiple loops, the new export_path
will overwrite the old without the old being freed.
So call kfree for export_path.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Vasily Averin [Mon, 1 Jun 2020 08:54:57 +0000 (11:54 +0300)]
sunrpc: fixed rollback in rpc_gssd_dummy_populate()
__rpc_depopulate(gssd_dentry) was lost on error path
cc: stable@vger.kernel.org
Fixes: commit 4b9a445e3eeb ("sunrpc: create a new dummy pipe for gssd to hold open")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Ingo Molnar [Fri, 26 Jun 2020 10:24:42 +0000 (12:24 +0200)]
Merge branch 'linus' into x86/entry, to resolve conflicts
Conflicts:
arch/x86/kernel/traps.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mans Rullgard [Sat, 13 Jun 2020 10:41:09 +0000 (11:41 +0100)]
i2c: core: check returned size of emulated smbus block read
If the i2c bus driver ignores the I2C_M_RECV_LEN flag (as some of
them do), it is possible for an I2C_SMBUS_BLOCK_DATA read issued
on some random device to return an arbitrary value in the first
byte (and nothing else). When this happens, i2c_smbus_xfer_emulated()
will happily write past the end of the supplied data buffer, thus
causing Bad Things to happen. To prevent this, check the size
before copying the data block and return an error if it is too large.
Fixes: 209d27c3b167 ("i2c: Emulate SMBus block read over I2C")
Signed-off-by: Mans Rullgard <mans@mansr.com>
[wsa: use better errno]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Luc Van Oostenryck [Fri, 26 Jun 2020 03:30:54 +0000 (20:30 -0700)]
MAINTAINERS: update info for sparse
Update the info for sparse. More specifically:
- change W entry to point to sparse.docs.kernel.org
- add Q & B entry (patchwork & bugzilla)
Link: http://lkml.kernel.org/r/20200621144204.53938-1-luc.vanoostenryck@gmail.com
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Widawsky [Fri, 26 Jun 2020 03:30:51 +0000 (20:30 -0700)]
mm/memory_hotplug.c: fix false softlockup during pfn range removal
When working with very large nodes, poisoning the struct pages (for which
there will be very many) can take a very long time. If the system is
using voluntary preemptions, the software watchdog will not be able to
detect forward progress. This patch addresses this issue by offering to
give up time like __remove_pages() does. This behavior was introduced in
v5.6 with: commit
d33695b16a9f ("mm/memory_hotplug: poison memmap in
remove_pfn_range_from_zone()")
Alternately, init_page_poison could do this cond_resched(), but it seems
to me that the caller of init_page_poison() is what actually knows whether
or not it should relax its own priority.
Based on Dan's notes, I think this is perfectly safe: commit
f931ab479dd2
("mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}")
Aside from fixing the lockup, it is also a friendlier thing to do on lower
core systems that might wipe out large chunks of hotplug memory (probably
not a very common case).
Fixes this kind of splat:
watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [daxctl:9922]
irq event stamp: 138450
hardirqs last enabled at (138449): [<
ffffffffa1001f26>] trace_hardirqs_on_thunk+0x1a/0x1c
hardirqs last disabled at (138450): [<
ffffffffa1001f42>] trace_hardirqs_off_thunk+0x1a/0x1c
softirqs last enabled at (138448): [<
ffffffffa1e00347>] __do_softirq+0x347/0x456
softirqs last disabled at (138443): [<
ffffffffa10c416d>] irq_exit+0x7d/0xb0
CPU: 46 PID: 9922 Comm: daxctl Not tainted
5.7.0-BEN-14238-g373c6049b336 #30
Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYXCRB1.86B.0578.D07.
1902280810 02/28/2019
RIP: 0010:memset_erms+0x9/0x10
Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
Call Trace:
remove_pfn_range_from_zone+0x3a/0x380
memunmap_pages+0x17f/0x280
release_nodes+0x22a/0x260
__device_release_driver+0x172/0x220
device_driver_detach+0x3e/0xa0
unbind_store+0x113/0x130
kernfs_fop_write+0xdc/0x1c0
vfs_write+0xde/0x1d0
ksys_write+0x58/0xd0
do_syscall_64+0x5a/0x120
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Built 2 zonelists, mobility grouping on. Total pages:
49050381
Policy zone: Normal
Built 3 zonelists, mobility grouping on. Total pages:
49312525
Policy zone: Normal
David said: "It really only is an issue for devmem. Ordinary
hotplugged system memory is not affected (onlined/offlined in memory
block granularity)."
Link: http://lkml.kernel.org/r/20200619231213.1160351-1-ben.widawsky@intel.com
Fixes: commit d33695b16a9f ("mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()")
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: "Scargall, Steve" <steve.scargall@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Fri, 26 Jun 2020 03:30:47 +0000 (20:30 -0700)]
mm: remove vmalloc_exec
Merge vmalloc_exec into its only caller. Note that for !CONFIG_MMU
__vmalloc_node_range maps to __vmalloc, which directly clears the
__GFP_HIGHMEM added by the vmalloc_exec stub anyway.
Link: http://lkml.kernel.org/r/20200618064307.32739-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Fri, 26 Jun 2020 03:30:43 +0000 (20:30 -0700)]
arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page
Use PAGE_KERNEL_ROX directly instead of allocating RWX and setting the
page read-only just after the allocation.
Link: http://lkml.kernel.org/r/20200618064307.32739-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Fri, 26 Jun 2020 03:30:40 +0000 (20:30 -0700)]
x86/hyperv: allocate the hypercall page with only read and execute bits
Patch series "fix a hyperv W^X violation and remove vmalloc_exec"
Dexuan reported a W^X violation due to the fact that the hyper hypercall
page due switching it to be allocated using vmalloc_exec.
The problem is that PAGE_KERNEL_EXEC as used by vmalloc_exec actually
sets writable permissions in the pte. This series fixes the issue by
switching to the low-level __vmalloc_node_range interface that allows
specifing more detailed permissions instead. It then also open codes
the other two callers and removes the somewhat confusing vmalloc_exec
interface.
Peter noted that the hyper hypercall page allocation also has another
long standing issue in that it shouldn't use the full vmalloc but just
the module space. This issue is so far theoretical as the allocation is
done early in the boot process. I plan to fix it with another bigger
series for 5.9.
This patch (of 3):
Avoid a W^X violation cause by the fact that PAGE_KERNEL_EXEC includes
the writable bit.
For this resurrect the removed PAGE_KERNEL_RX definition, but as
PAGE_KERNEL_ROX to match arm64 and powerpc.
Link: http://lkml.kernel.org/r/20200618064307.32739-2-hch@lst.de
Fixes: 78bb17f76edc ("x86/hyperv: use vmalloc_exec for the hypercall page")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Fri, 26 Jun 2020 03:30:37 +0000 (20:30 -0700)]
mm/memory: fix IO cost for anonymous page
With synchronous IO swap device, swap-in is directly handled in fault
code. Since IO cost notation isn't added there, with synchronous IO
swap device, LRU balancing could be wrongly biased. Fix it to count it
in fault code.
Link: http://lkml.kernel.org/r/1592288204-27734-4-git-send-email-iamjoonsoo.kim@lge.com
Fixes: 314b57fb0460001 ("mm: balance LRU lists based on relative thrashing cache sizing")
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joonsoo Kim [Fri, 26 Jun 2020 03:30:34 +0000 (20:30 -0700)]
mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
Non-file-lru page could also be activated in mark_page_accessed() and we
need to count this activation for nonresident_age.
Note that it's better for this patch to be squashed into the patch "mm:
workingset: age nonresident information alongside anonymous pages".
Link: http://lkml.kernel.org/r/1592288204-27734-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 26 Jun 2020 03:30:31 +0000 (20:30 -0700)]
mm: workingset: age nonresident information alongside anonymous pages
Patch series "fix for "mm: balance LRU lists based on relative
thrashing" patchset"
This patchset fixes some problems of the patchset, "mm: balance LRU
lists based on relative thrashing", which is now merged on the mainline.
Patch "mm: workingset: let cache workingset challenge anon fix" is the
result of discussion with Johannes. See following link.
http://lkml.kernel.org/r/
20200520232525.798933-6-hannes@cmpxchg.org
And, the other two are minor things which are found when I try to rebase
my patchset.
This patch (of 3):
After ("mm: workingset: let cache workingset challenge anon fix"), we
compare refault distances to active_file + anon. But age of the
non-resident information is only driven by the file LRU. As a result,
we may overestimate the recency of any incoming refaults and activate
them too eagerly, causing unnecessary LRU churn in certain situations.
Make anon aging drive nonresident age as well to address that.
Link: http://lkml.kernel.org/r/1592288204-27734-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1592288204-27734-2-git-send-email-iamjoonsoo.kim@lge.com
Fixes: 34e58cac6d8f2a ("mm: workingset: let cache workingset challenge anon")
Reported-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Shi [Fri, 26 Jun 2020 03:30:28 +0000 (20:30 -0700)]
doc: THP CoW fault no longer allocate THP
Since commit
3917c80280c9 ("thp: change CoW semantics for anon-THP"),
THP CoW page fault is rewritten. Now it just splits pmd then fallback
to base page fault, it doesn't try to allocate THP anymore. So it is no
longer counted in THP_FAULT_ALLOC.
Remove the obsolete statement in documentation about THP CoW allocation
to avoid confusion.
Link: http://lkml.kernel.org/r/1592424895-5421-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Fri, 26 Jun 2020 03:30:25 +0000 (20:30 -0700)]
docs: mm/gup: minor documentation update
Now there are 5 cases. Updated the same.
Link: http://lkml.kernel.org/r/1592422023-7401-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Down [Fri, 26 Jun 2020 03:30:22 +0000 (20:30 -0700)]
mm/memcontrol.c: prevent missed memory.low load tears
Looks like one of these got missed when massaging in
f86b810c2610 ("mm,
memcg: prevent memory.low load/store tearing") with other linux-mm
changes.
Link: http://lkml.kernel.org/r/20200612174437.GA391453@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Reported-by: Michal Koutny <mkoutny@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Muchun Song [Fri, 26 Jun 2020 03:30:19 +0000 (20:30 -0700)]
mm/memcontrol.c: add missed css_put()
We should put the css reference when memory allocation failed.
Link: http://lkml.kernel.org/r/20200614122653.98829-1-songmuchun@bytedance.com
Fixes: f0a3a24b532d ("mm: memcg/slab: rework non-root kmem_cache lifecycle management")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>