Russell King [Fri, 4 Jan 2019 14:34:46 +0000 (14:34 +0000)]
Merge commit 'smp-hotplug^{/omap2}' into for-linus
Russell King [Wed, 2 Jan 2019 10:37:05 +0000 (10:37 +0000)]
Merge branches 'misc', 'sa1100-for-next' and 'spectre' into for-linus
Russell King [Thu, 13 Dec 2018 12:54:26 +0000 (12:54 +0000)]
ARM: omap2: remove unnecessary boot_lock
The boot_lock is something that was required for ARM development
platforms to ensure that the delay calibration worked properly. This
is not necessary for modern platforms that have better bus bandwidth
and do not need to calibrate the delay loop for secondary cores.
Remove the boot_lock entirely.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 13 Dec 2018 12:54:26 +0000 (12:54 +0000)]
ARM: versatile: rename and comment SMP implementation
Rename pen_release and boot_lock in the Versatile specific SMP
implementation, describe why these exist and state clearly that they
should not be used in production implementations.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Sebastian Andrzej Siewior [Tue, 11 Dec 2018 21:23:22 +0000 (22:23 +0100)]
ARM: versatile: convert boot_lock to raw
The arm boot_lock is used by the secondary processor startup code. The locking
task is the idle thread, which has idle->sched_class == &idle_sched_class.
idle_sched_class->enqueue_task == NULL, so if the idle task blocks on the
lock, the attempt to wake it when the lock becomes available will fail:
try_to_wake_up()
...
activate_task()
enqueue_task()
p->sched_class->enqueue_task(rq, p, flags)
Fix by converting boot_lock to a raw spin lock.
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Link: http://lkml.kernel.org/r/4E77B952.3010606@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 13 Dec 2018 12:54:26 +0000 (12:54 +0000)]
ARM: vexpress/realview: consolidate immitation CPU hotplug
The only difference between the hotplug implementation for Realview
and Versatile Express are the bit in the auxiliary control register
to disable coherency. Combine the two implentations accounting for
that difference.
Rename the functions to try to discourage cargo-cult copying of this
code.
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Fri, 7 Dec 2018 13:11:15 +0000 (13:11 +0000)]
ARM: fix the cockup in the previous patch
The intention in the previous patch was to only place the processor
tables in the .rodata section if big.Little was being built and we
wanted the branch target hardening, but instead (due to the way it
was tested) it ended up always placing the tables into the .rodata
section.
Although harmless, let's correct this anyway.
Fixes:
3a4d0c2172bc ("ARM: ensure that processor vtables is not lost after boot")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 6 Dec 2018 16:36:38 +0000 (16:36 +0000)]
ARM: ensure that processor vtables is not lost after boot
Marek Szyprowski reported problems with CPU hotplug in current kernels.
This was tracked down to the processor vtables being located in an
init section, and therefore discarded after kernel boot, despite being
required after boot to properly initialise the non-boot CPUs.
Arrange for these tables to end up in .rodata when required.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Fixes:
383fb3ee8024 ("ARM: spectre-v2: per-CPU vtables to work around big.Little systems")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 31 Aug 2016 07:49:50 +0000 (08:49 +0100)]
ARM: sa1100/cerf: switch to using gpio_led_register_device()
Rather than statically declaring the leds-gpio device, use the helper
function provided.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 31 Aug 2016 07:49:50 +0000 (08:49 +0100)]
ARM: sa1100/assabet: switch to using gpio leds
Switch over to using gpio leds now that we have the gpio driver for
the assabet board register in place.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 31 Aug 2016 07:49:50 +0000 (08:49 +0100)]
ARM: sa1100/assabet: add gpio keys support for right-hand two buttons
Add gpio keys support for the right-hand two buttons on the Assabet,
which can be used to wake up the CPU after PM.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
ARM: sa1111: remove legacy GPIO interfaces
Now that we have migrated all users of the legacy private SA1111 gpio
interfaces, we can remove these redundant GPIO interfaces.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
pcmcia: sa1100*: remove redundant bvd1/bvd2 setting
bvd1 and bvd2 both default to 1 in the generic soc_common code, so
having a driver repeat this is redundant. Remove it.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
ARM: pxa/lubbock: switch PCMCIA to MAX1600 library
As Lubbock now provides GPIOs via gpiolib for controlling the socket
power, we can use the MAX1600 driver. Switch Lubbock to use this
driver, which simplifies the code.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
ARM: pxa/mainstone: switch PCMCIA to MAX1600 library and gpiod APIs
Convert mainstone to use the MAX1600 library and gpiod APIs for socket
status and control signals.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
ARM: sa1100/neponset: switch PCMCIA to MAX1600 library and gpiod APIs
Convert Neponset to use the gpiod API to specify which GPIOs are used
for PCMCIA, and use the MAX1600 power switch library for Neponset,
simplifying the neponset pcmcia driver as a result.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
ARM: sa1100/jornada720: switch PCMCIA to gpiod APIs
Convert the low level PCMCIA driver to gpiod APIs for controlling
the socket power.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
pcmcia: add MAX1600 library
Add a library for the MAX1600 PCMCIA power switch device. This is a
dual-channel device, controlled via four GPIO signals per channel.
Two signals control the Vcc output, and two control the Vpp output.
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 28 Nov 2018 13:57:23 +0000 (13:57 +0000)]
ARM: sa1100: explicitly register sa11x0-pcmcia devices
Simplify the code by getting rid of the conditional automatic
registration of the sa11x0 PCMCIA interfaces in sa1100_init(), and
require all platforms to explicitly call sa11x0_register_pcmcia().
Only one platform (iPAQ) is affected by this change.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Vincent Whitchurch [Fri, 9 Nov 2018 09:12:30 +0000 (10:12 +0100)]
ARM: 8813/1: Make aligned 2-byte getuser()/putuser() atomic on ARMv6+
getuser() and putuser() (and there underscored variants) use two
strb[t]/ldrb[t] instructions when they are asked to get/put 16-bits.
This means that the read/write is not atomic even when performed to a
16-bit-aligned address.
This leads to problems with vhost: vhost uses __getuser() to read the
vring's 16-bit avail.index field, and if it happens to observe a partial
update of the index, wrong descriptors will be used which will lead to a
breakdown of the virtio communication. A similar problem exists for
__putuser() which is used to write to the vring's used.index field.
The reason these functions use strb[t]/ldrb[t] is because strht/ldrht
instructions did not exist until ARMv6T2/ARMv7. So we should be easily
able to fix this on ARMv7. Also, since all ARMv6 processors also don't
actually use the unprivileged instructions anymore for uaccess (since
CONFIG_CPU_USE_DOMAINS is not used) we can easily fix them too.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Vincent Whitchurch [Fri, 9 Nov 2018 09:09:48 +0000 (10:09 +0100)]
ARM: 8812/1: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS
ARMv6+ processors do not use CONFIG_CPU_USE_DOMAINS and use privileged
ldr/str instructions in copy_{from/to}_user. They are currently
unnecessarily using single ldr/str instructions and can use ldm/stm
instructions instead like memcpy does (but with appropriate fixup
tables).
This speeds up a "dd if=foo of=bar bs=32k" on a tmpfs filesystem by
about 4% on my Cortex-A9.
before:
134217728 bytes (128.0MB) copied, 0.543848 seconds, 235.4MB/s
before:
134217728 bytes (128.0MB) copied, 0.538610 seconds, 237.6MB/s
before:
134217728 bytes (128.0MB) copied, 0.544356 seconds, 235.1MB/s
before:
134217728 bytes (128.0MB) copied, 0.544364 seconds, 235.1MB/s
before:
134217728 bytes (128.0MB) copied, 0.537130 seconds, 238.3MB/s
before:
134217728 bytes (128.0MB) copied, 0.533443 seconds, 240.0MB/s
before:
134217728 bytes (128.0MB) copied, 0.545691 seconds, 234.6MB/s
before:
134217728 bytes (128.0MB) copied, 0.534695 seconds, 239.4MB/s
before:
134217728 bytes (128.0MB) copied, 0.540561 seconds, 236.8MB/s
before:
134217728 bytes (128.0MB) copied, 0.541025 seconds, 236.6MB/s
after:
134217728 bytes (128.0MB) copied, 0.520445 seconds, 245.9MB/s
after:
134217728 bytes (128.0MB) copied, 0.527846 seconds, 242.5MB/s
after:
134217728 bytes (128.0MB) copied, 0.519510 seconds, 246.4MB/s
after:
134217728 bytes (128.0MB) copied, 0.527231 seconds, 242.8MB/s
after:
134217728 bytes (128.0MB) copied, 0.525030 seconds, 243.8MB/s
after:
134217728 bytes (128.0MB) copied, 0.524236 seconds, 244.2MB/s
after:
134217728 bytes (128.0MB) copied, 0.523659 seconds, 244.4MB/s
after:
134217728 bytes (128.0MB) copied, 0.525018 seconds, 243.8MB/s
after:
134217728 bytes (128.0MB) copied, 0.519249 seconds, 246.5MB/s
after:
134217728 bytes (128.0MB) copied, 0.518527 seconds, 246.9MB/s
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Nicolas Pitre [Fri, 9 Nov 2018 03:26:39 +0000 (04:26 +0100)]
ARM: 8811/1: always list both ldrd/strd registers explicitly
The ldrd and strd instructions work on a pair of consecutive registers.
It is possible to specify either the first register in the pair, or both
registers explicitly. Let's always do the later to make things clearer.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 19 Jul 2018 11:21:31 +0000 (12:21 +0100)]
ARM: spectre-v2: per-CPU vtables to work around big.Little systems
In big.Little systems, some CPUs require the Spectre workarounds in
paths such as the context switch, but other CPUs do not. In order
to handle these differences, we need per-CPU vtables.
We are unable to use the kernel's per-CPU variables to support this
as per-CPU is not initialised at times when we need access to the
vtables, so we have to use an array indexed by logical CPU number.
We use an array-of-pointers to avoid having function pointers in
the kernel's read/write .data section.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 19 Jul 2018 11:17:38 +0000 (12:17 +0100)]
ARM: add PROC_VTABLE and PROC_TABLE macros
Allow the way we access members of the processor vtable to be changed
at compile time. We will need to move to per-CPU vtables to fix the
Spectre variant 2 issues on big.Little systems.
However, we have a couple of calls that do not need the vtable
treatment, and indeed cause a kernel warning due to the (later) use
of smp_processor_id(), so also introduce the PROC_TABLE macro for
these which always use CPU 0's function pointers.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 19 Jul 2018 11:43:03 +0000 (12:43 +0100)]
ARM: clean up per-processor check_bugs method call
Call the per-processor type check_bugs() method in the same way as we
do other per-processor functions - move the "processor." detail into
proc-fns.h.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 19 Jul 2018 10:59:56 +0000 (11:59 +0100)]
ARM: split out processor lookup
Split out the lookup of the processor type and associated error handling
from the rest of setup_processor() - we will need to use this in the
secondary CPU bringup path for big.Little Spectre variant 2 mitigation.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Thu, 19 Jul 2018 10:42:36 +0000 (11:42 +0100)]
ARM: make lookup_processor_type() non-__init
Move lookup_processor_type() out of the __init section so it is callable
from (eg) the secondary startup code during hotplug.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Julien Thierry [Thu, 8 Nov 2018 16:25:28 +0000 (17:25 +0100)]
ARM: 8810/1: vfp: Fix wrong assignement to ufp_exc
In vfp_preserve_user_clear_hwstate, ufp_exc->fpinst2 gets assigned to
itself. It should actually be hwstate->fpinst2 that gets assigned to the
ufp_exc field.
Fixes commit
3aa2df6ec2ca6bc143a65351cca4266d03a8bc41 ("ARM: 8791/1:
vfp: use __copy_to_user() when saving VFP state").
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Yufen Wang [Fri, 2 Nov 2018 10:51:31 +0000 (11:51 +0100)]
ARM: 8808/1: kexec:offline panic_smp_self_stop CPU
In case panic() and panic() called at the same time on different CPUS.
For example:
CPU 0:
panic()
__crash_kexec
machine_crash_shutdown
crash_smp_send_stop
machine_kexec
BUG_ON(num_online_cpus() > 1);
CPU 1:
panic()
local_irq_disable
panic_smp_self_stop
If CPU 1 calls panic_smp_self_stop() before crash_smp_send_stop(), kdump
fails. CPU1 can't receive the ipi irq, CPU1 will be always online.
To fix this problem, this patch split out the panic_smp_self_stop()
and add set_cpu_online(smp_processor_id(), false).
Signed-off-by: Yufen Wang <wangyufen@huawei.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Florian Fainelli [Wed, 31 Oct 2018 22:53:12 +0000 (23:53 +0100)]
ARM: 8807/1: mm: Facilitate debugging CONFIG_KUSER_HELPERS disabled
Some software such as perf makes unconditional use of the special
[vectors] page which is only provided when CONFIG_KUSER_HELPERS is
enabled in the kernel.
Facilitate the debugging of such situations by printing a debug message
to the kernel log showing the task name and the faulting address.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Nicolas Pitre [Wed, 7 Nov 2018 16:49:00 +0000 (17:49 +0100)]
ARM: 8805/2: remove unneeded naked function usage
The naked attribute is known to confuse some old gcc versions when
function arguments aren't explicitly listed as inline assembly operands
despite the gcc documentation. That resulted in commit
9a40ac86152c
("ARM: 6164/1: Add kto and kfrom to input operands list.").
Yet that commit has problems of its own by having assembly operand
constraints completely wrong. If the generated code has been OK since
then, it is due to luck rather than correctness. So this patch also
provides proper assembly operand constraints, and removes two instances
of redundant register usages in the implementation while at it.
Inspection of the generated code with this patch doesn't show any
obvious quality degradation either, so not relying on __naked at all
will make the code less fragile, and avoid some issues with clang.
The only remaining __naked instances (excluding the kprobes test cases)
are exynos_pm_power_up_setup(), tc2_pm_power_up_setup() and
cci_enable_port_for_self(. But in the first two cases, only the function
address is used by the compiler with no chance of inlining it by
mistake, and the third case is called from assembly code only. And the
fact that no stack is available when the corresponding code is executed
does warrant the __naked usage in those cases.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Tested-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Ben Dooks [Fri, 12 Oct 2018 08:12:01 +0000 (09:12 +0100)]
ARM: 8804/1: zImage: atags_to_fdt: add serial-number for ATAG_SERIAL
If the system passes an ATAG_SERIAL, convert that into a /serial-number
node so that the system serial number will be passed through the FDT and
be present under the kernel.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 31 Oct 2018 09:56:24 +0000 (09:56 +0000)]
ARM: Kconfig: remove useless "default n"
The default for Kconfig options is always n, so there's no need to
explicitly state a "n" default.
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 24 Oct 2018 09:20:16 +0000 (10:20 +0100)]
ARM: Kconfig: remove useless parenthesis
Remove useless parenthesis from the Kconfig - Kconfig is not C, and
can cope without parenthesising e.g. single symbols to 'if'.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Ard Biesheuvel [Mon, 5 Nov 2018 13:54:56 +0000 (14:54 +0100)]
ARM: 8809/1: proc-v7: fix Thumb annotation of cpu_v7_hvc_switch_mm
Due to what appears to be a copy/paste error, the opening ENTRY()
of cpu_v7_hvc_switch_mm() lacks a matching ENDPROC(), and instead,
the one for cpu_v7_smc_switch_mm() is duplicated.
Given that it is ENDPROC() that emits the Thumb annotation, the
cpu_v7_hvc_switch_mm() routine will be called in ARM mode on a
Thumb2 kernel, resulting in the following splat:
Internal error: Oops - undefined instruction: 0 [#1] SMP THUMB2
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc1-00030-g4d28ad89189d-dirty #488
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
PC is at cpu_v7_hvc_switch_mm+0x12/0x18
LR is at flush_old_exec+0x31b/0x570
pc : [<
c0316efe>] lr : [<
c04117c7>] psr:
00000013
sp :
ee899e50 ip :
00000000 fp :
00000001
r10:
eda28f34 r9 :
eda31800 r8 :
c12470e0
r7 :
eda1fc00 r6 :
eda53000 r5 :
00000000 r4 :
ee88c000
r3 :
c0316eec r2 :
00000001 r1 :
eda53000 r0 :
6da6c000
Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Note the 'ISA ARM' in the last line.
Fix this by using the correct name in ENDPROC().
Cc: <stable@vger.kernel.org>
Fixes:
10115105cb3a ("ARM: spectre-v2: add firmware based hardening")
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Linus Torvalds [Sun, 4 Nov 2018 23:37:52 +0000 (15:37 -0800)]
Linux 4.20-rc1
Linus Torvalds [Sun, 4 Nov 2018 22:46:04 +0000 (14:46 -0800)]
Merge tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs
Pull UBIFS updates from Richard Weinberger:
- Full filesystem authentication feature, UBIFS is now able to have the
whole filesystem structure authenticated plus user data encrypted and
authenticated.
- Minor cleanups
* tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits)
ubifs: Remove unneeded semicolon
Documentation: ubifs: Add authentication whitepaper
ubifs: Enable authentication support
ubifs: Do not update inode size in-place in authenticated mode
ubifs: Add hashes and HMACs to default filesystem
ubifs: authentication: Authenticate super block node
ubifs: Create hash for default LPT
ubfis: authentication: Authenticate master node
ubifs: authentication: Authenticate LPT
ubifs: Authenticate replayed journal
ubifs: Add auth nodes to garbage collector journal head
ubifs: Add authentication nodes to journal
ubifs: authentication: Add hashes to index nodes
ubifs: Add hashes to the tree node cache
ubifs: Create functions to embed a HMAC in a node
ubifs: Add helper functions for authentication support
ubifs: Add separate functions to init/crc a node
ubifs: Format changes for authentication support
ubifs: Store read superblock node
ubifs: Drop write_node
...
Linus Torvalds [Sun, 4 Nov 2018 16:20:09 +0000 (08:20 -0800)]
Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Bugfix:
- Fix build issues on architectures that don't provide 64-bit cmpxchg
Cleanups:
- Fix a spelling mistake"
* tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: fix spelling mistake, EACCESS -> EACCES
SUNRPC: Use atomic(64)_t for seq_send(64)
Linus Torvalds [Sun, 4 Nov 2018 16:15:15 +0000 (08:15 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull more timer updates from Thomas Gleixner:
"A set of commits for the new C-SKY architecture timers"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings: timer: gx6605s SOC timer
clocksource/drivers/c-sky: Add gx6605s SOC system timer
dt-bindings: timer: C-SKY Multi-processor timer
clocksource/drivers/c-sky: Add C-SKY SMP timer
Linus Torvalds [Sun, 4 Nov 2018 16:12:44 +0000 (08:12 -0800)]
Merge tag 'ntb-4.20' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"Fairly minor changes and bug fixes:
NTB IDT thermal changes and hook into hwmon, ntb_netdev clean-up of
private struct, and a few bug fixes"
* tag 'ntb-4.20' of git://github.com/jonmason/ntb:
ntb: idt: Alter the driver info comments
ntb: idt: Discard temperature sensor IRQ handler
ntb: idt: Add basic hwmon sysfs interface
ntb: idt: Alter temperature read method
ntb_netdev: Simplify remove with client device drvdata
NTB: transport: Try harder to alloc an aligned MW buffer
ntb: ntb_transport: Mark expected switch fall-throughs
ntb: idt: Set PCIe bus address to BARLIMITx
NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks
ntb: intel: fix return value for ndev_vec_mask()
ntb_netdev: fix sleep time mismatch
Linus Torvalds [Sun, 4 Nov 2018 01:37:09 +0000 (18:37 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A memory (under-)allocation fix and a comment fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/topology: Fix off by one bug
sched/rt: Update comment in pick_next_task_rt()
Linus Torvalds [Sun, 4 Nov 2018 01:25:17 +0000 (18:25 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A number of fixes and some late updates:
- make in_compat_syscall() behavior on x86-32 similar to other
platforms, this touches a number of generic files but is not
intended to impact non-x86 platforms.
- objtool fixes
- PAT preemption fix
- paravirt fixes/cleanups
- cpufeatures updates for new instructions
- earlyprintk quirk
- make microcode version in sysfs world-readable (it is already
world-readable in procfs)
- minor cleanups and fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
compat: Cleanup in_compat_syscall() callers
x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
objtool: Support GCC 9 cold subfunction naming scheme
x86/numa_emulation: Fix uniform-split numa emulation
x86/paravirt: Remove unused _paravirt_ident_32
x86/mm/pat: Disable preemption around __flush_tlb_all()
x86/paravirt: Remove GPL from pv_ops export
x86/traps: Use format string with panic() call
x86: Clean up 'sizeof x' => 'sizeof(x)'
x86/cpufeatures: Enumerate MOVDIR64B instruction
x86/cpufeatures: Enumerate MOVDIRI instruction
x86/earlyprintk: Add a force option for pciserial device
objtool: Support per-function rodata sections
x86/microcode: Make revision and processor flags world-readable
Linus Torvalds [Sun, 4 Nov 2018 01:13:43 +0000 (18:13 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf updates and fixes from Ingo Molnar:
"These are almost all tooling updates: 'perf top', 'perf trace' and
'perf script' fixes and updates, an UAPI header sync with the merge
window versions, license marker updates, much improved Sparc support
from David Miller, and a number of fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
perf intel-pt/bts: Calculate cpumode for synthesized samples
perf intel-pt: Insert callchain context into synthesized callchains
perf tools: Don't clone maps from parent when synthesizing forks
perf top: Start display thread earlier
tools headers uapi: Update linux/if_link.h header copy
tools headers uapi: Update linux/netlink.h header copy
tools headers: Sync the various kvm.h header copies
tools include uapi: Update linux/mmap.h copy
perf trace beauty: Use the mmap flags table generated from headers
perf beauty: Wire up the mmap flags table generator to the Makefile
perf beauty: Add a generator for MAP_ mmap's flag constants
tools include uapi: Update asound.h copy
tools arch uapi: Update asm-generic/unistd.h and arm64 unistd.h copies
tools include uapi: Update linux/fs.h copy
perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}
perf cs-etm: Correct CPU mode for samples
perf unwind: Take pgoff into account when reporting elf to libdwfl
perf top: Do not use overwrite mode by default
perf top: Allow disabling the overwrite mode
perf trace: Beautify mount's first pathname arg
...
Linus Torvalds [Sun, 4 Nov 2018 01:12:09 +0000 (18:12 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"An irqchip driver fix and a memory (over-)allocation fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-mvebu-sei: Fix a NULL vs IS_ERR() bug in probe function
irq/matrix: Fix memory overallocation
Peter Zijlstra [Fri, 2 Nov 2018 13:22:25 +0000 (14:22 +0100)]
sched/topology: Fix off by one bug
With the addition of the NUMA identity level, we increased @level by
one and will run off the end of the array in the distance sort loop.
Fixed:
051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Sat, 3 Nov 2018 22:42:16 +0000 (23:42 +0100)]
Merge branch 'core/urgent' into x86/urgent, to pick up objtool fix
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sat, 3 Nov 2018 19:13:57 +0000 (12:13 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A few fixes who have come in near or during the merge window:
- Removal of a VLA usage in Marvell mpp platform code
- Enable some IPMI options for ARM64 servers by default, helps
testing
- Enable PREEMPT on 32-bit ARMv7 defconfig
- Minor fix for stm32 DT (removal of an unused DMA property)
- Bugfix for TI OMAP1-based ams-delta (-EINVAL -> IRQ_NOTCONNECTED)"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: stm32: update HASH1 dmas property on stm32mp157c
ARM: orion: avoid VLA in orion_mpp_conf
ARM: defconfig: Update multi_v7 to use PREEMPT
arm64: defconfig: Enable some IPMI configs
soc: ti: QMSS: Fix usage of irq_set_affinity_hint
ARM: OMAP1: ams-delta: Fix impossible .irq < 0
Linus Torvalds [Sat, 3 Nov 2018 17:55:23 +0000 (10:55 -0700)]
Merge tag 'arm64-upstream' of git://git./linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas:
- fix W+X page (mark RO) allocated by the arm64 kprobes code
- Makefile fix for .i files in out of tree modules
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kprobe: make page to RO mode when allocate it
arm64: kdump: fix small typo
arm64: makefile fix build of .i file in external module case
Linus Torvalds [Sat, 3 Nov 2018 17:53:33 +0000 (10:53 -0700)]
Merge tag 'dma-mapping-4.20-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Avoid compile warnings on non-default arm64 configs"
* tag 'dma-mapping-4.20-2' of git://git.infradead.org/users/hch/dma-mapping:
arm64: fix warnings without CONFIG_IOMMU_DMA
Linus Torvalds [Sat, 3 Nov 2018 17:47:33 +0000 (10:47 -0700)]
Merge tag 'kbuild-v4.20-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- clean-up leftovers in Kconfig files
- remove stale oldnoconfig and silentoldconfig targets
- remove unneeded cc-fullversion and cc-name variables
- improve merge_config script to allow overriding option prefix
* tag 'kbuild-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: remove cc-name variable
kbuild: replace cc-name test with CONFIG_CC_IS_CLANG
merge_config.sh: Allow to define config prefix
kbuild: remove unused cc-fullversion variable
kconfig: remove silentoldconfig target
kconfig: remove oldnoconfig target
powerpc: PCI_MSI needs PCI
powerpc: remove CONFIG_MCA leftovers
powerpc: remove CONFIG_PCI_QSPAN
scsi: aha152x: rename the PCMCIA define
Linus Torvalds [Sat, 3 Nov 2018 17:45:55 +0000 (10:45 -0700)]
Merge tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes and updates from Steve French:
"Three small fixes (one Kerberos related, one for stable, and another
fixes an oops in xfstest 377), two helpful debugging improvements,
three patches for cifs directio and some minor cleanup"
* tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix signed/unsigned mismatch on aio_read patch
cifs: don't dereference smb_file_target before null check
CIFS: Add direct I/O functions to file_operations
CIFS: Add support for direct I/O write
CIFS: Add support for direct I/O read
smb3: missing defines and structs for reparse point handling
smb3: allow more detailed protocol info on open files for debugging
smb3: on kerberos mount if server doesn't specify auth type use krb5
smb3: add trace point for tree connection
cifs: fix spelling mistake, EACCESS -> EACCES
cifs: fix return value for cifs_listxattr
Linus Torvalds [Sat, 3 Nov 2018 17:35:52 +0000 (10:35 -0700)]
Merge branch 'work.afs' of git://git./linux/kernel/git/viro/vfs
Pull 9p fix from Al Viro:
"Regression fix for net/9p handling of iov_iter; broken by braino when
switching to iov_iter_is_kvec() et.al., spotted and fixed by Marc"
* 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
iov_iter: Fix 9p virtio breakage
Linus Torvalds [Sat, 3 Nov 2018 17:34:03 +0000 (10:34 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is a set of minor small (and safe changes) that didn't make the
initial pull request plus some bug fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mvsas: Remove set but not used variable 'id'
scsi: qla2xxx: Remove two arguments from qlafx00_error_entry()
scsi: qla2xxx: Make sure that qlafx00_ioctl_iosb_entry() initializes 'res'
scsi: qla2xxx: Remove a set-but-not-used variable
scsi: qla2xxx: Make qla2x00_sysfs_write_nvram() easier to analyze
scsi: qla2xxx: Declare local functions 'static'
scsi: qla2xxx: Improve several kernel-doc headers
scsi: qla2xxx: Modify fall-through annotations
scsi: 3w-sas: 3w-9xxx: Use unsigned char for cdb
scsi: mvsas: Use dma_pool_zalloc
scsi: target: Don't request modules that aren't even built
scsi: target: Set response length for REPORT TARGET PORT GROUPS
Linus Torvalds [Sat, 3 Nov 2018 17:21:43 +0000 (10:21 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
- more ocfs2 work
- various leftovers
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
memory_hotplug: cond_resched in __remove_pages
bfs: add sanity check at bfs_fill_super()
kernel/sysctl.c: remove duplicated include
kernel/kexec_file.c: remove some duplicated includes
mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask
ocfs2: fix clusters leak in ocfs2_defrag_extent()
ocfs2: dlmglue: clean up timestamp handling
ocfs2: don't put and assigning null to bh allocated outside
ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
ocfs2: don't use iocb when EIOCBQUEUED returns
ocfs2: without quota support, avoid calling quota recovery
ocfs2: remove ocfs2_is_o2cb_active()
mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
include/linux/notifier.h: SRCU: fix ctags
mm: handle no memcg case in memcg_kmem_charge() properly
Michal Hocko [Fri, 2 Nov 2018 22:48:46 +0000 (15:48 -0700)]
memory_hotplug: cond_resched in __remove_pages
We have received a bug report that unbinding a large pmem (>1TB) can
result in a soft lockup:
NMI watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [ndctl:4365]
[...]
Supported: Yes
CPU: 9 PID: 4365 Comm: ndctl Not tainted 4.12.14-94.40-default #1 SLE12-SP4
Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0833.
051120182255 05/11/2018
task:
ffff9cce7d4410c0 task.stack:
ffffbe9eb1bc4000
RIP: 0010:__put_page+0x62/0x80
Call Trace:
devm_memremap_pages_release+0x152/0x260
release_nodes+0x18d/0x1d0
device_release_driver_internal+0x160/0x210
unbind_store+0xb3/0xe0
kernfs_fop_write+0x102/0x180
__vfs_write+0x26/0x150
vfs_write+0xad/0x1a0
SyS_write+0x42/0x90
do_syscall_64+0x74/0x150
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fd13166b3d0
It has been reported on an older (4.12) kernel but the current upstream
code doesn't cond_resched in the hot remove code at all and the given
range to remove might be really large. Fix the issue by calling
cond_resched once per memory section.
Link: http://lkml.kernel.org/r/20181031125840.23982-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Dan Williams <dan.j.williams@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa [Fri, 2 Nov 2018 22:48:42 +0000 (15:48 -0700)]
bfs: add sanity check at bfs_fill_super()
syzbot is reporting too large memory allocation at bfs_fill_super() [1].
Since file system image is corrupted such that bfs_sb->s_start == 0,
bfs_fill_super() is trying to allocate 8MB of continuous memory. Fix
this by adding a sanity check on bfs_sb->s_start, __GFP_NOWARN and
printf().
[1] https://syzkaller.appspot.com/bug?id=
16a87c236b951351374a84c8a32f40edbc034e96
Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+71c6b5d68e91149fc8a4@syzkaller.appspotmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tigran Aivazian <aivazian.tigran@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Schupikov [Fri, 2 Nov 2018 22:48:38 +0000 (15:48 -0700)]
kernel/sysctl.c: remove duplicated include
Remove one include of <linux/pipe_fs_i.h>.
No functional changes.
Link: http://lkml.kernel.org/r/20181004134223.17735-1-michael@schupikov.de
Signed-off-by: Michael Schupikov <michael@schupikov.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
zhong jiang [Fri, 2 Nov 2018 22:48:35 +0000 (15:48 -0700)]
kernel/kexec_file.c: remove some duplicated includes
We include kexec.h and slab.h twice in kexec_file.c. It's unnecessary.
hence just remove them.
Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Fri, 2 Nov 2018 22:48:31 +0000 (15:48 -0700)]
mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask
THP allocation mode is quite complex and it depends on the defrag mode.
This complexity is hidden in alloc_hugepage_direct_gfpmask from a large
part currently. The NUMA special casing (namely __GFP_THISNODE) is
however independent and placed in alloc_pages_vma currently. This both
adds an unnecessary branch to all vma based page allocation requests and
it makes the code more complex unnecessarily as well. Not to mention
that e.g. shmem THP used to do the node reclaiming unconditionally
regardless of the defrag mode until recently. This was not only
unexpected behavior but it was also hardly a good default behavior and I
strongly suspect it was just a side effect of the code sharing more than
a deliberate decision which suggests that such a layering is wrong.
Get rid of the thp special casing from alloc_pages_vma and move the
logic to alloc_hugepage_direct_gfpmask. __GFP_THISNODE is applied to the
resulting gfp mask only when the direct reclaim is not requested and
when there is no explicit numa binding to preserve the current logic.
Please note that there's also a slight difference wrt MPOL_BIND now. The
previous code would avoid using __GFP_THISNODE if the local node was
outside of policy_nodemask(). After this patch __GFP_THISNODE is avoided
for all MPOL_BIND policies. So there's a difference that if local node
is actually allowed by the bind policy's nodemask, previously
__GFP_THISNODE would be added, but now it won't be. From the behavior
POV this is still correct because the policy nodemask is used.
Link: http://lkml.kernel.org/r/20180925120326.24392-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Larry Chen [Fri, 2 Nov 2018 22:48:27 +0000 (15:48 -0700)]
ocfs2: fix clusters leak in ocfs2_defrag_extent()
ocfs2_defrag_extent() might leak allocated clusters. When the file
system has insufficient space, the number of claimed clusters might be
less than the caller wants. If that happens, the original code might
directly commit the transaction without returning clusters.
This patch is based on code in ocfs2_add_clusters_in_btree().
[akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac]
Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com
Signed-off-by: Larry Chen <lchen@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 2 Nov 2018 22:48:23 +0000 (15:48 -0700)]
ocfs2: dlmglue: clean up timestamp handling
The handling of timestamps outside of the 1970..2038 range in the dlm
glue is rather inconsistent: on 32-bit architectures, this has always
wrapped around to negative timestamps in the 1902..1969 range, while on
64-bit kernels all timestamps are interpreted as positive 34 bit numbers
in the 1970..2514 year range.
Now that the VFS code handles 64-bit timestamps on all architectures, we
can make the behavior more consistent here, and return the same result
that we had on 64-bit already, making the file system y2038 safe in the
process. Outside of dlmglue, it already uses 64-bit on-disk timestamps
anway, so that part is fine.
For consistency, I'm changing ocfs2_pack_timespec() to clamp anything
outside of the supported range to the minimum and maximum values. This
avoids a possible ambiguity of values before 1970 in particular, which
used to be interpreted as times at the end of the 2514 range previously.
Link: http://lkml.kernel.org/r/20180619155826.4106487-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changwei Ge [Fri, 2 Nov 2018 22:48:19 +0000 (15:48 -0700)]
ocfs2: don't put and assigning null to bh allocated outside
ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read
several blocks from disk. Currently, the input argument *bhs* can be
NULL or NOT. It depends on the caller's behavior. If the function
fails in reading blocks from disk, the corresponding bh will be assigned
to NULL and put.
Obviously, above process for non-NULL input bh is not appropriate.
Because the caller doesn't even know its bhs are put and re-assigned.
If buffer head is managed by caller, ocfs2_read_blocks and
ocfs2_read_blocks_sync() should not evaluate it to NULL. It will cause
caller accessing illegal memory, thus crash.
Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changwei Ge [Fri, 2 Nov 2018 22:48:15 +0000 (15:48 -0700)]
ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But
there is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), current code just moves to
next position and uses the problematic buffer head again and again
during which the problematic buffer head is released for multiple times.
I suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into linux
-stable.
Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Suggested-by: Changkuo Shi <shi.changkuo@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changwei Ge [Fri, 2 Nov 2018 22:48:11 +0000 (15:48 -0700)]
ocfs2: don't use iocb when EIOCBQUEUED returns
When -EIOCBQUEUED returns, it means that aio_complete() will be called
from dio_complete(), which is an asynchronous progress against
write_iter. Generally, IO is a very slow progress than executing
instruction, but we still can't take the risk to access a freed iocb.
And we do face a BUG crash issue. Using the crash tool, iocb is
obviously freed already.
crash> struct -x kiocb
ffff881a350f5900
struct kiocb {
ki_filp = 0xffff881a350f5a80,
ki_pos = 0x0,
ki_complete = 0x0,
private = 0x0,
ki_flags = 0x0
}
And the backtrace shows:
ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
aio_run_iocb+0x229/0x2f0
do_io_submit+0x291/0x540
SyS_io_submit+0x10/0x20
system_call_fastpath+0x16/0x75
Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Guozhonghua [Fri, 2 Nov 2018 22:48:07 +0000 (15:48 -0700)]
ocfs2: without quota support, avoid calling quota recovery
During one dead node's recovery by other node, quota recovery work will
be queued. We should avoid calling quota when it is not supported, so
check the quota flags.
Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gang He [Fri, 2 Nov 2018 22:48:03 +0000 (15:48 -0700)]
ocfs2: remove ocfs2_is_o2cb_active()
Remove ocfs2_is_o2cb_active(). We have similar functions to identify
which cluster stack is being used via osb->osb_cluster_stack.
Secondly, the current implementation of ocfs2_is_o2cb_active() is not
totally safe. Based on the design of stackglue, we need to get
ocfs2_stack_lock before using ocfs2_stack related data structures, and
that active_stack pointer can be NULL in the case of mount failure.
Link: http://lkml.kernel.org/r/1495441079-11708-1-git-send-email-ghe@suse.com
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Reviewed-by: Eric Ren <zren@suse.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Fri, 2 Nov 2018 22:47:59 +0000 (15:47 -0700)]
mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
THP allocation might be really disruptive when allocated on NUMA system
with the local node full or hard to reclaim. Stefan has posted an
allocation stall report on 4.12 based SLES kernel which suggests the
same issue:
kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
kvm cpuset=/ mems_allowed=0-1
CPU: 10 PID: 84752 Comm: kvm Tainted: G W 4.12.0+98-ph <a href="/view.php?id=1" title="[geschlossen] Integration Ramdisk" class="resolved">0000001</a> SLE15 (unreleased)
Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017
Call Trace:
dump_stack+0x5c/0x84
warn_alloc+0xe0/0x180
__alloc_pages_slowpath+0x820/0xc90
__alloc_pages_nodemask+0x1cc/0x210
alloc_pages_vma+0x1e5/0x280
do_huge_pmd_wp_page+0x83f/0xf00
__handle_mm_fault+0x93d/0x1060
handle_mm_fault+0xc6/0x1b0
__do_page_fault+0x230/0x430
do_page_fault+0x2a/0x70
page_fault+0x7b/0x80
[...]
Mem-Info:
active_anon:
126315487 inactive_anon:1612476 isolated_anon:5
active_file:60183 inactive_file:245285 isolated_file:0
unevictable:15657 dirty:286 writeback:1 unstable:0
slab_reclaimable:75543 slab_unreclaimable:2509111
mapped:81814 shmem:31764 pagetables:370616 bounce:0
free:
32294031 free_pcp:6233 free_cma:0
Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
The defrag mode is "madvise" and from the above report it is clear that
the THP has been allocated for MADV_HUGEPAGA vma.
Andrea has identified that the main source of the problem is
__GFP_THISNODE usage:
: The problem is that direct compaction combined with the NUMA
: __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very
: hard the local node, instead of failing the allocation if there's no
: THP available in the local node.
:
: Such logic was ok until __GFP_THISNODE was added to the THP allocation
: path even with MPOL_DEFAULT.
:
: The idea behind the __GFP_THISNODE addition, is that it is better to
: provide local memory in PAGE_SIZE units than to use remote NUMA THP
: backed memory. That largely depends on the remote latency though, on
: threadrippers for example the overhead is relatively low in my
: experience.
:
: The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in
: extremely slow qemu startup with vfio, if the VM is larger than the
: size of one host NUMA node. This is because it will try very hard to
: unsuccessfully swapout get_user_pages pinned pages as result of the
: __GFP_THISNODE being set, instead of falling back to PAGE_SIZE
: allocations and instead of trying to allocate THP on other nodes (it
: would be even worse without vfio type1 GUP pins of course, except it'd
: be swapping heavily instead).
Fix this by removing __GFP_THISNODE for THP requests which are
requesting the direct reclaim. This effectivelly reverts
5265047ac301
on the grounds that the zone/node reclaim was known to be disruptive due
to premature reclaim when there was memory free. While it made sense at
the time for HPC workloads without NUMA awareness on rare machines, it
was ultimately harmful in the majority of cases. The existing behaviour
is similar, if not as widespare as it applies to a corner case but
crucially, it cannot be tuned around like zone_reclaim_mode can. The
default behaviour should always be to cause the least harm for the
common case.
If there are specialised use cases out there that want zone_reclaim_mode
in specific cases, then it can be built on top. Longterm we should
consider a memory policy which allows for the node reclaim like behavior
for the specific memory ranges which would allow a
[1] http://lkml.kernel.org/r/
20180820032204.9591-1-aarcange@redhat.com
Mel said:
: Both patches look correct to me but I'm responding to this one because
: it's the fix. The change makes sense and moves further away from the
: severe stalling behaviour we used to see with both THP and zone reclaim
: mode.
:
: I put together a basic experiment with usemem configured to reference a
: buffer multiple times that is 80% the size of main memory on a 2-socket
: box with symmetric node sizes and defrag set to "always". The defrag
: setting is not the default but it would be functionally similar to
: accessing a buffer with madvise(MADV_HUGEPAGE). Usemem is configured to
: reference the buffer multiple times and while it's not an interesting
: workload, it would be expected to complete reasonably quickly as it fits
: within memory. The results were;
:
: usemem
: vanilla noreclaim-v1
: Amean Elapsd-1 42.78 ( 0.00%) 26.87 ( 37.18%)
: Amean Elapsd-3 27.55 ( 0.00%) 7.44 ( 73.00%)
: Amean Elapsd-4 5.72 ( 0.00%) 5.69 ( 0.45%)
:
: This shows the elapsed time in seconds for 1 thread, 3 threads and 4
: threads referencing buffers 80% the size of memory. With the patches
: applied, it's 37.18% faster for the single thread and 73% faster with two
: threads. Note that 4 threads showing little difference does not indicate
: the problem is related to thread counts. It's simply the case that 4
: threads gets spread so their workload mostly fits in one node.
:
: The overall view from /proc/vmstats is more startling
:
: 4.19.0-rc1 4.19.0-rc1
: vanillanoreclaim-v1r1
: Minor Faults
35593425 708164
: Major Faults 484088 36
: Swap Ins 3772837 0
: Swap Outs 3932295 0
:
: Massive amounts of swap in/out without the patch
:
: Direct pages scanned 6013214 0
: Kswapd pages scanned 0 0
: Kswapd pages reclaimed 0 0
: Direct pages reclaimed 4033009 0
:
: Lots of reclaim activity without the patch
:
: Kswapd efficiency 100% 100%
: Kswapd velocity 0.000 0.000
: Direct efficiency 67% 100%
: Direct velocity 11191.956 0.000
:
: Mostly from direct reclaim context as you'd expect without the patch.
:
: Page writes by reclaim 3932314.000 0.000
: Page writes file 19 0
: Page writes anon 3932295 0
: Page reclaim immediate 42336 0
:
: Writes from reclaim context is never good but the patch eliminates it.
:
: We should never have default behaviour to thrash the system for such a
: basic workload. If zone reclaim mode behaviour is ever desired but on a
: single task instead of a global basis then the sensible option is to build
: a mempolicy that enforces that behaviour.
This was a severe regression compared to previous kernels that made
important workloads unusable and it starts when __GFP_THISNODE was
added to THP allocations under MADV_HUGEPAGE. It is not a significant
risk to go to the previous behavior before __GFP_THISNODE was added, it
worked like that for years.
This was simply an optimization to some lucky workloads that can fit in
a single node, but it ended up breaking the VM for others that can't
possibly fit in a single node, so going back is safe.
[mhocko@suse.com: rewrote the changelog based on the one from Andrea]
Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org
Fixes:
5265047ac301 ("mm, thp: really limit transparent hugepage allocation to local node")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Debugged-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: <stable@vger.kernel.org> [4.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sam Protsenko [Fri, 2 Nov 2018 22:47:53 +0000 (15:47 -0700)]
include/linux/notifier.h: SRCU: fix ctags
ctags indexing ("make tags" command) throws this warning:
ctags: Warning: include/linux/notifier.h:125:
null expansion of name pattern "\1"
This is the result of DEFINE_PER_CPU() macro expansion. Fix that by
getting rid of line break.
Similar fix was already done in commit
25528213fe9f ("tags: Fix
DEFINE_PER_CPU expansions"), but this one probably wasn't noticed.
Link: http://lkml.kernel.org/r/20181030202808.28027-1-semen.protsenko@linaro.org
Fixes:
9c80172b902d ("kernel/SRCU: provide a static initializer")
Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin [Fri, 2 Nov 2018 22:47:49 +0000 (15:47 -0700)]
mm: handle no memcg case in memcg_kmem_charge() properly
Mike Galbraith reported a regression caused by the commit
9b6f7e163cd0
("mm: rework memcg kernel stack accounting") on a system with
"cgroup_disable=memory" boot option: the system panics with the following
stack trace:
BUG: unable to handle kernel NULL pointer dereference at
00000000000000f8
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 0 PID: 1 Comm: systemd Not tainted 4.19.0-preempt+ #410
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fed4
RIP: 0010:page_counter_try_charge+0x22/0xc0
Code: 41 5d c3 c3 0f 1f 40 00 0f 1f 44 00 00 48 85 ff 0f 84 a7 00 00 00 41 56 48 89 f8 49 89 fe 49
Call Trace:
try_charge+0xcb/0x780
memcg_kmem_charge_memcg+0x28/0x80
memcg_kmem_charge+0x8b/0x1d0
copy_process.part.41+0x1ca/0x2070
_do_fork+0xd7/0x3d0
do_syscall_64+0x5a/0x180
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The problem occurs because get_mem_cgroup_from_current() returns the NULL
pointer if memory controller is disabled. Let's check if this is a case
at the beginning of memcg_kmem_charge() and just return 0 if
mem_cgroup_disabled() returns true. This is how we handle this case in
many other places in the memory controller code.
Link: http://lkml.kernel.org/r/20181029215123.17830-1-guro@fb.com
Fixes:
9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Olof Johansson [Sat, 3 Nov 2018 05:31:40 +0000 (22:31 -0700)]
Merge tag 'omap-for-v4.20/omap1-fix-signed' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Fix for omap1 ams-delta irq
We need to use IRQ_NOTCONNECTED instead of -EINVAL for
ams_delta_modem_ports irq.
* tag 'omap-for-v4.20/omap1-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP1: ams-delta: Fix impossible .irq < 0
Signed-off-by: Olof Johansson <olof@lixom.net>
Alexandre Torgue [Thu, 20 Sep 2018 16:34:17 +0000 (18:34 +0200)]
ARM: dts: stm32: update HASH1 dmas property on stm32mp157c
Remove unused parameter from HASH1 dmas property on stm32mp157c SoC.
Fixes:
1e726a40e067 ("ARM: dts: stm32: Add HASH support on stm32mp157c")
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
[Olof: Bug doesn't cause any harm, so shouldn't need stable backport]
Signed-off-by: Olof Johansson <olof@lixom.net>
Arnd Bergmann [Fri, 5 Oct 2018 16:15:49 +0000 (18:15 +0200)]
ARM: orion: avoid VLA in orion_mpp_conf
Testing randconfig builds found an instance of a VLA that was
missed when determining that we have removed them all:
arch/arm/plat-orion/mpp.c: In function 'orion_mpp_conf':
arch/arm/plat-orion/mpp.c:31:2: error: ISO C90 forbids variable length array 'mpp_ctrl' [-Werror=vla]
This one is fairly straightforward: we know what all three
callers are, and the maximum length is not very long.
Fixes:
68664695ae57 ("Makefile: Globally enable VLA warning")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Marc Zyngier [Fri, 2 Nov 2018 17:16:51 +0000 (17:16 +0000)]
iov_iter: Fix 9p virtio breakage
When switching to the new iovec accessors, a negation got subtly
dropped, leading to 9p being remarkably broken (here with kvmtool):
[ 7.430941] VFS: Mounted root (9p filesystem) on device 0:15.
[ 7.432080] devtmpfs: mounted
[ 7.432717] Freeing unused kernel memory: 1344K
[ 7.433658] Run /virt/init as init process
Warning: unable to translate guest address 0x7e00902ff000 to host
Warning: unable to translate guest address 0x7e00902fefc0 to host
Warning: unable to translate guest address 0x7e00902ff000 to host
Warning: unable to translate guest address 0x7e008febef80 to host
Warning: unable to translate guest address 0x7e008febf000 to host
Warning: unable to translate guest address 0x7e008febef00 to host
Warning: unable to translate guest address 0x7e008febf000 to host
[ 7.436376] Kernel panic - not syncing: Requested init /virt/init failed (error -8).
[ 7.437554] CPU: 29 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-02267-g00e23707442a #291
[ 7.439006] Hardware name: linux,dummy-virt (DT)
[ 7.439902] Call trace:
[ 7.440387] dump_backtrace+0x0/0x148
[ 7.441104] show_stack+0x14/0x20
[ 7.441768] dump_stack+0x90/0xb4
[ 7.442425] panic+0x120/0x27c
[ 7.443036] kernel_init+0xa4/0x100
[ 7.443725] ret_from_fork+0x10/0x18
[ 7.444444] SMP: stopping secondary CPUs
[ 7.445391] Kernel Offset: disabled
[ 7.446169] CPU features: 0x0,
23000438
[ 7.446974] Memory Limit: none
[ 7.447645] ---[ end Kernel panic - not syncing: Requested init /virt/init failed (error -8). ]---
Restoring the missing "!" brings the guest back to life.
Fixes:
00e23707442a ("iov_iter: Use accessor function")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Thomas Gleixner [Fri, 2 Nov 2018 20:58:39 +0000 (21:58 +0100)]
Merge branch 'clockevents/4.20-rc1' of https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull clockevent update from Daniel Lezcano:
- Add the per cpu timer for the c-sky architecture (Guo Ren)
- Add the global timer for the c-sky architecture (Guo Ren)
Steve French [Thu, 1 Nov 2018 15:54:32 +0000 (10:54 -0500)]
cifs: fix signed/unsigned mismatch on aio_read patch
The patch "CIFS: Add support for direct I/O read" had
a signed/unsigned mismatch (ssize_t vs. size_t) in the
return from one function. Similar trivial change
in aio_write
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Colin Ian King [Thu, 1 Nov 2018 13:14:30 +0000 (13:14 +0000)]
cifs: don't dereference smb_file_target before null check
There is a null check on dst_file->private data which suggests
it can be potentially null. However, before this check, pointer
smb_file_target is derived from dst_file->private and dereferenced
in the call to tlink_tcon, hence there is a potential null pointer
deference.
Fix this by assigning smb_file_target and target_tcon after the
null pointer sanity checks.
Detected by CoverityScan, CID#1475302 ("Dereference before null check")
Fixes:
04b38d601239 ("vfs: pull btrfs clone API to vfs layer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Long Li [Wed, 31 Oct 2018 22:13:11 +0000 (22:13 +0000)]
CIFS: Add direct I/O functions to file_operations
With direct read/write functions implemented, add them to file_operations.
Dircet I/O is used under two conditions:
1. When mounting with "cache=none", CIFS uses direct I/O for all user file
data transfer.
2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data
transfer on this file.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Long Li [Wed, 31 Oct 2018 22:13:10 +0000 (22:13 +0000)]
CIFS: Add support for direct I/O write
With direct I/O write, user supplied buffers are pinned to the memory and data
are transferred directly from user buffers to the transport layer.
Change in v3: add support for kernel AIO
Change in v4:
Refactor common write code to __cifs_writev for direct and non-direct I/O.
Retry on direct I/O failure.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Long Li [Wed, 31 Oct 2018 22:13:09 +0000 (22:13 +0000)]
CIFS: Add support for direct I/O read
With direct I/O read, we transfer the data directly from transport layer to
the user data buffer.
Change in v3: add support for kernel AIO
Change in v4:
Refactor common read code to __cifs_readv for direct and non-direct I/O.
Retry on direct I/O failure.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Steve French [Wed, 31 Oct 2018 16:24:33 +0000 (11:24 -0500)]
smb3: missing defines and structs for reparse point handling
We were missing some structs from MS-FSCC relating to
reparse point handling. Add them to protocol defines
in smb2pdu.h
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Steve French [Wed, 31 Oct 2018 00:50:31 +0000 (19:50 -0500)]
smb3: allow more detailed protocol info on open files for debugging
In order to debug complex problems it is often helpful to
have detailed information on the client and server view
of the open file information. Add the ability for root to
view the list of smb3 open files and dump the persistent
handle and other info so that it can be more easily
correlated with server logs.
Sample output from "cat /proc/fs/cifs/open_files"
# Version:1
# Format:
# <tree id> <persistent fid> <flags> <count> <pid> <uid> <filename> <mid>
0x5 0x800000378 0x8000 1 7704 0 some-file 0x14
0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d
0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b
0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Steve French [Sun, 28 Oct 2018 18:13:23 +0000 (13:13 -0500)]
smb3: on kerberos mount if server doesn't specify auth type use krb5
Some servers (e.g. Azure) do not include a spnego blob in the SMB3
negotiate protocol response, so on kerberos mounts ("sec=krb5")
we can fail, as we expected the server to list its supported
auth types (OIDs in the spnego blob in the negprot response).
Change this so that on krb5 mounts we default to trying krb5 if the
server doesn't list its supported protocol mechanisms.
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Steve French [Sun, 28 Oct 2018 05:47:11 +0000 (00:47 -0500)]
smb3: add trace point for tree connection
In debugging certain scenarios, especially reconnect cases,
it can be helpful to have a dynamic trace point for the
result of tree connect. See sample output below
from a reconnect event. The new event is 'smb3_tcon'
TASK-PID CPU# |||| TIMESTAMP FUNCTION
| | | |||| | |
cifsd-6071 [001] .... 2659.897923: smb3_reconnect: server=localhost current_mid=0xa
kworker/1:1-71 [001] .... 2666.026342: smb3_cmd_done: sid=0x0 tid=0x0 cmd=0 mid=0
kworker/1:1-71 [001] .... 2666.026576: smb3_cmd_err: sid=0xc49e1787 tid=0x0 cmd=1 mid=1 status=0xc0000016 rc=-5
kworker/1:1-71 [001] .... 2666.031677: smb3_cmd_done: sid=0xc49e1787 tid=0x0 cmd=1 mid=2
kworker/1:1-71 [001] .... 2666.031921: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=3 mid=3
kworker/1:1-71 [001] .... 2666.031923: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\test rc=0
kworker/1:1-71 [001] .... 2666.032097: smb3_cmd_done: sid=0xc49e1787 tid=0x6e78f05f cmd=11 mid=4
kworker/1:1-71 [001] .... 2666.032265: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=3 mid=5
kworker/1:1-71 [001] .... 2666.032266: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\IPC$ rc=0
kworker/1:1-71 [001] .... 2666.032386: smb3_cmd_done: sid=0xc49e1787 tid=0x7912332f cmd=11 mid=6
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Colin Ian King [Fri, 26 Oct 2018 18:07:21 +0000 (19:07 +0100)]
cifs: fix spelling mistake, EACCESS -> EACCES
Trivial fix to a spelling mistake of the error access name EACCESS,
rename to EACCES
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Ronnie Sahlberg [Thu, 25 Oct 2018 05:43:36 +0000 (15:43 +1000)]
cifs: fix return value for cifs_listxattr
If the application buffer was too small to fit all the names
we would still count the number of bytes and return this for
listxattr. This would then trigger a BUG in usercopy.c
Fix the computation of the size so that we return -ERANGE
correctly when the buffer is too small.
This fixes the kernel BUG for xfstest generic/377
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Guo Ren [Fri, 2 Nov 2018 16:51:31 +0000 (00:51 +0800)]
dt-bindings: timer: gx6605s SOC timer
Dt-bindings doc for gx6605s SOC's system timer.
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Guo Ren [Fri, 2 Nov 2018 16:51:30 +0000 (00:51 +0800)]
clocksource/drivers/c-sky: Add gx6605s SOC system timer
The driver is for gx6605s SOC system timer and there are two
same timers in gx6605s. We use one for clkevt and another one for
clksrc.
The timer is mmio map to access, so we need give mmio address in dts.
The counter at 0x0 offset is clock event.
The counter at 0x40 offset is clock source.
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Guo Ren [Fri, 2 Nov 2018 16:51:29 +0000 (00:51 +0800)]
dt-bindings: timer: C-SKY Multi-processor timer
Dt-bingdings doc for C-SKY SMP system setting.
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Guo Ren [Fri, 2 Nov 2018 16:51:28 +0000 (00:51 +0800)]
clocksource/drivers/c-sky: Add C-SKY SMP timer
The driver is for C-SKY SMP timer. It only supports oneshot event
and 32bit overflow for clocksource. Per cpu core has one timer and
all timers share one clock-counter-input from the same clocksource.
This use mfcr&mtcr instructions to access the regs.
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Linus Walleij [Fri, 19 Oct 2018 09:24:26 +0000 (11:24 +0200)]
ARM: defconfig: Update multi_v7 to use PREEMPT
Using CONFIG_PREEMPT as preemption model for ARMv7 systems
appear to be the most reasonable default.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
John Garry [Thu, 18 Oct 2018 16:17:07 +0000 (00:17 +0800)]
arm64: defconfig: Enable some IPMI configs
The arm64 port now runs on servers which use IPMI. This patch enables
relevant core configs to save manually enabling them when testing
mainline.
Signed-off-by: John Garry <john.garry@huawei.com>
[olof: Switched to =m instead of =y]
Signed-off-by: Olof Johansson <olof@lixom.net>
Christoph Hellwig [Tue, 30 Oct 2018 07:41:29 +0000 (09:41 +0200)]
arm64: fix warnings without CONFIG_IOMMU_DMA
__swiotlb_get_sgtable_page and __swiotlb_mmap_pfn are not only misnamed
but also only used if CONFIG_IOMMU_DMA is set. Just add a simple ifdef
for now, given that we plan to remove them entirely for the next merge
window.
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Fri, 2 Nov 2018 18:25:48 +0000 (11:25 -0700)]
Merge tag 'for-linus-
20181102' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"The biggest part of this pull request is the revert of the blkcg
cleanup series. It had one fix earlier for a stacked device issue, but
another one was reported. Rather than play whack-a-mole with this,
revert the entire series and try again for the next kernel release.
Apart from that, only small fixes/changes.
Summary:
- Indentation fixup for mtip32xx (Colin Ian King)
- The blkcg cleanup series revert (Dennis Zhou)
- Two NVMe fixes. One fixing a regression in the nvme request
initialization in this merge window, causing nvme-fc to not work.
The other is a suspend/resume p2p resource issue (James, Keith)
- Fix sg discard merge, allowing us to merge in cases where we didn't
before (Jianchao Wang)
- Call rq_qos_exit() after the queue is frozen, preventing a hang
(Ming)
- Fix brd queue setup, fixing an oops if we fail setting up all
devices (Ming)"
* tag 'for-linus-
20181102' of git://git.kernel.dk/linux-block:
nvme-pci: fix conflicting p2p resource adds
nvme-fc: fix request private initialization
blkcg: revert blkcg cleanups series
block: brd: associate with queue until adding disk
block: call rq_qos_exit() after queue is frozen
mtip32xx: clean an indentation issue, remove extraneous tabs
block: fix the DISCARD request merge
Linus Torvalds [Fri, 2 Nov 2018 18:22:45 +0000 (11:22 -0700)]
Merge tag 'pwm/for-4.20-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"This series contains a number of improvements to existing drivers,
such as LPSS. Some drivers, such as renesas-tpu and rcar get support
for more SoC generations. To round things off this fixes an issue with
the sysfs interface"
* tag 'pwm/for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: lpss: Only set update bit if we are actually changing the settings
pwm: lpss: Force runtime-resume on suspend on Cherry Trail
pwm: Enable TI ECAP driver for ARCH_K3
dt-bindings: pwm: tiecap: Add TI AM654 SoC specific compatible
dt-bindings: pwm: rcar: Add r8a774a1 support
pwm: Send a uevent on the pwmchip device upon channel sysfs (un)export
Revert "pwm: Set class for exported channels in sysfs"
dt-bindings: pwm: renesas-tpu: Document r8a7744 support
dt-bindings: pwm: rcar: Add r8a7744 support
dt-bindings: pwm: renesas: tpu: Document R8A779{7|8}0 bindings
dt-bindings: pwm: renesas: pwm-rcar: Document R8A779{7|8}0 bindings
dt-bindings: pwm: renesas: tpu: Fix "compatible" prop description
pwm: Use SPDX identifier for Renesas drivers
pwm: lpss: Add get_state callback
pwm: lpss: Release runtime-pm reference from the driver's remove callback
pwm: lpss: Check PWM powerstate after resume on Cherry Trail devices
pwm: lpss: Move struct pwm_lpss_chip definition to the header file
pwm: lpss: Add ACPI HID for second PWM controller on Cherry Trail devices
ACPI / PM: Export acpi_device_get_power() for use by modular build drivers
pwm: tegra: Remove gratuituous blank line
Marc Zyngier [Wed, 31 Oct 2018 08:41:34 +0000 (08:41 +0000)]
soc: ti: QMSS: Fix usage of irq_set_affinity_hint
The Keystone QMSS driver is pretty damaged, in the sense that it
does things like this:
irq_set_affinity_hint(irq, to_cpumask(&cpu_map));
where cpu_map is a local variable. As we leave the function, this
will point to nowhere-land, and things will end-up badly.
Instead, let's use a proper cpumask that gets allocated, giving
the driver a chance to actually work with things like irqbalance
as well as have a hypothetical 64bit future.
Cc: stable@vger.kernel.org
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Linus Torvalds [Fri, 2 Nov 2018 18:17:22 +0000 (11:17 -0700)]
Merge tag 'edac_for_4.20_2' of git://git./linux/kernel/git/bp/bp
Pull more EDAC updates from Borislav Petkov:
"The second part of the EDAC pile which contains the ADXL user and a
build fix which addresses a not-so-sensical .config but fixes
randconfig builds people do:
- skx_edac: Address translation for NVDIMMs (Tony Luck and Qiuxu Zhuo)
- ACPI_ADXL build fix"
[ I don't think "sensical" is a word, particularly when used in the
context of actually meaning "nonsensical", but I like it - Linus ]
* tag 'edac_for_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, skx: Fix randconfig builds
EDAC, skx_edac: Add address translation for non-volatile DIMMs
Anders Roxell [Tue, 30 Oct 2018 11:38:50 +0000 (12:38 +0100)]
arm64: kprobe: make page to RO mode when allocate it
Commit
1404d6f13e47 ("arm64: dump: Add checking for writable and exectuable pages")
has successfully identified code that leaves a page with W+X
permissions.
[ 3.245140] arm64/mm: Found insecure W+X mapping at address (____ptrval____)/0xffff000000d90000
[ 3.245771] WARNING: CPU: 0 PID: 1 at ../arch/arm64/mm/dump.c:232 note_page+0x410/0x420
[ 3.246141] Modules linked in:
[ 3.246653] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5-next-
20180928-00001-ge70ae259b853-dirty #62
[ 3.247008] Hardware name: linux,dummy-virt (DT)
[ 3.247347] pstate:
80000005 (Nzcv daif -PAN -UAO)
[ 3.247623] pc : note_page+0x410/0x420
[ 3.247898] lr : note_page+0x410/0x420
[ 3.248071] sp :
ffff00000804bcd0
[ 3.248254] x29:
ffff00000804bcd0 x28:
ffff000009274000
[ 3.248578] x27:
ffff00000921a000 x26:
ffff80007dfff000
[ 3.248845] x25:
ffff0000093f5000 x24:
ffff000009526f6a
[ 3.249109] x23:
0000000000000004 x22:
ffff000000d91000
[ 3.249396] x21:
ffff000000d90000 x20:
0000000000000000
[ 3.249661] x19:
ffff00000804bde8 x18:
0000000000000400
[ 3.249924] x17:
0000000000000000 x16:
0000000000000000
[ 3.250271] x15:
ffffffffffffffff x14:
295f5f5f5f6c6176
[ 3.250594] x13:
7274705f5f5f5f28 x12:
2073736572646461
[ 3.250941] x11:
20746120676e6970 x10:
70616d20582b5720
[ 3.251252] x9 :
6572756365736e69 x8 :
3039643030303030
[ 3.251519] x7 :
306666666678302f x6 :
ffff0000095467b2
[ 3.251802] x5 :
0000000000000000 x4 :
0000000000000000
[ 3.252060] x3 :
0000000000000000 x2 :
ffffffffffffffff
[ 3.252323] x1 :
4d151327adc50b00 x0 :
0000000000000000
[ 3.252664] Call trace:
[ 3.252953] note_page+0x410/0x420
[ 3.253186] walk_pgd+0x12c/0x238
[ 3.253417] ptdump_check_wx+0x68/0xf8
[ 3.253637] mark_rodata_ro+0x68/0x98
[ 3.253847] kernel_init+0x38/0x160
[ 3.254103] ret_from_fork+0x10/0x18
kprobes allocates a writable executable page with module_alloc() in
order to store executable code.
Reworked to that when allocate a page it sets mode RO. Inspired by
commit
63fef14fc98a ("kprobes/x86: Make insn buffer always ROX and use text_poke()").
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
[catalin.marinas@arm.com: removed unnecessary casts]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Fri, 2 Nov 2018 18:02:52 +0000 (11:02 -0700)]
Merge tag 'sound-fix-4.20-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few device-specific fixes: a fix for SPDIF on old Creative PCI
board, and two additional fixes for the recent changes in FireWire
audio stack"
* tag 'sound-fix-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: firewire-lib: fix insufficient PCM rule for period/buffer size
ALSA: ca0106: Disable IZD on SB0570 DAC to fix audio pops
ALSA: dice: fix to wait for releases of all ALSA character devices
Linus Torvalds [Fri, 2 Nov 2018 17:58:20 +0000 (10:58 -0700)]
Merge tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Pretty much a normal fixes pull pre-rc1, mostly amdgpu fixes, one i915
link training regression fix, and a couple of minor panel/bridge fixes
and a panel quirk"
* tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm: (37 commits)
drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"
drm/amd/pp: Print warning if od_sclk/mclk out of range
drm/amd/pp: Fix pp_sclk/mclk_od not work on Vega10
drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7
drm/amd/powerplay: no MGPU fan boost enablement on DPM disabled
drm/amdgpu: Fix skipping hangged job reset during gpu recover.
drm/amd/powerplay: revise Vega20 pptable version check
drm/amd/display: set backlight level limit to 1
drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1
dt-bindings: drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1
drm/bridge: ti-sn65dsi86: Remove the mystery delay
drm/panel: simple: Add "no-hpd" delay for Innolux TV123WAM
drm/panel: simple: Support panels with HPD where HPD isn't connected
dt-bindings: drm/panel: simple: Add no-hpd property
drm/edid: Add 6 bpc quirk for BOE panel.
drm/amdgpu: fix reporting of failed msg sent to SMU (v2)
drm/amdgpu: Fix compute ring 1.0.0 failure after reset
drm/amdgpu: fix VM leaf walking
drm/amdgpu: fix amdgpu_vm_fini
drm/amd/powerplay: commonize the API for retrieving current clocks
...
Yangtao Li [Fri, 2 Nov 2018 12:36:19 +0000 (08:36 -0400)]
arm64: kdump: fix small typo
This brings the kernel doc in line with the function signature.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>