platform/kernel/linux-starfive.git
17 months agos390/mem_detect: do not truncate online memory ranges info
Vasily Gorbik [Fri, 10 Feb 2023 14:47:06 +0000 (15:47 +0100)]
s390/mem_detect: do not truncate online memory ranges info

Commit bf64f0517e5d ("s390/mem_detect: handle online memory limit
just once") introduced truncation of mem_detect online ranges
based on identity mapping size. For kdump case however the full
set of online memory ranges has to be feed into memblock_physmem_add
so that crashed system memory could be extracted.

Instead of truncating introduce a "usable limit" which is respected by
mem_detect api. Also add extra online memory ranges iterator which still
provides full set of online memory ranges disregarding the "usable limit".

Fixes: bf64f0517e5d ("s390/mem_detect: handle online memory limit just once")
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/vx: remove __uint128_t type from __vector128 struct again
Heiko Carstens [Fri, 10 Feb 2023 13:09:38 +0000 (14:09 +0100)]
s390/vx: remove __uint128_t type from __vector128 struct again

The __uint128_t member was only added for future convenience to the
__vector128 struct. However this is a uapi header file, 31/32 bit (aka
compat layer) is still supported, but doesn't know anything about this
type:

/usr/include/asm/types.h:27:17: error: unknown type name __uint128_t
   27 |                 __uint128_t v;

Therefore remove it again.

Fixes: b0b7b43fcc46 ("s390/vx: add 64 and 128 bit members to __vector128 struct")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/mm: add support for RDP (Reset DAT-Protection)
Gerald Schaefer [Mon, 6 Feb 2023 16:48:21 +0000 (17:48 +0100)]
s390/mm: add support for RDP (Reset DAT-Protection)

RDP instruction allows to reset DAT-protection bit in a PTE, with less
CPU synchronization overhead than IPTE instruction. In particular, IPTE
can cause machine-wide synchronization overhead, and excessive IPTE usage
can negatively impact machine performance.

RDP can be used instead of IPTE, if the new PTE only differs in SW bits
and _PAGE_PROTECT HW bit, for PTE protection changes from RO to RW.
SW PTE bit changes are allowed, e.g. for dirty and young tracking, but none
of the other HW-defined part of the PTE must change. This is because the
architecture forbids such changes to an active and valid PTE, which
is why invalidation with IPTE is always used first, before writing a new
entry.

The RDP optimization helps mainly for fault-driven SW dirty-bit tracking.
Writable PTEs are initially always mapped with HW _PAGE_PROTECT bit set,
to allow SW dirty-bit accounting on first write protection fault, where
the DAT-protection would then be reset. The reset is now done with RDP
instead of IPTE, if RDP instruction is available.

RDP cannot always guarantee that the DAT-protection reset is propagated
to all CPUs immediately. This means that spurious TLB protection faults
on other CPUs can now occur. For this, common code provides a
flush_tlb_fix_spurious_fault() handler, which will now be used to do a
CPU-local TLB flush. However, this will clear the whole TLB of a CPU, and
not just the affected entry. For more fine-grained flushing, by simply
doing a (local) RDP again, flush_tlb_fix_spurious_fault() would need to
also provide the PTE pointer.

Note that spurious TLB protection faults cannot really be distinguished
from racing pagetable updates, where another thread already installed the
correct PTE. In such a case, the local TLB flush would be unnecessary
overhead, but overall reduction of CPU synchronization overhead by not
using IPTE is still expected to be beneficial.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/mm: define private VM_FAULT_* reasons from top bits
Peter Xu [Sun, 5 Feb 2023 23:17:04 +0000 (18:17 -0500)]
s390/mm: define private VM_FAULT_* reasons from top bits

The current definition already collapse with the generic definition of
vm_fault_reason.  Move the private definitions to allocate bits from the
top of uint so they won't collapse anymore.

Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20230205231704.909536-4-peterx@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agoDocumentation: s390: correct spelling
Randy Dunlap [Thu, 9 Feb 2023 07:13:51 +0000 (23:13 -0800)]
Documentation: s390: correct spelling

Correct spelling problems for Documentation/s390/ as reported
by codespell.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20230209071400.31476-16-rdunlap@infradead.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/ap: fix status returned by ap_qact()
Halil Pasic [Wed, 8 Feb 2023 23:00:24 +0000 (00:00 +0100)]
s390/ap: fix status returned by ap_qact()

Since commit 159491f3b509 ("s390/ap: rework assembler functions to use
unions for in/out register variables") the  function ap_qact() tries to
grab the status from the wrong part of the register. Thus we always end
up with zeros. Which is wrong, among others, because we detect failures
via status.response_code.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Harald Freudenberger <freude@linux.ibm.com>
Fixes: 159491f3b509 ("s390/ap: rework assembler functions to use unions for in/out register variables")
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/ap: fix status returned by ap_aqic()
Halil Pasic [Wed, 8 Feb 2023 23:00:23 +0000 (00:00 +0100)]
s390/ap: fix status returned by ap_aqic()

There function ap_aqic() tries to grab the status from the
wrong part of the register. Thus we always end up with
zeros. Which is wrong, among others, because we detect
failures via status.response_code.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: 159491f3b509 ("s390/ap: rework assembler functions to use unions for in/out register variables")
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390: vfio-ap: tighten the NIB validity check
Halil Pasic [Wed, 8 Feb 2023 23:00:22 +0000 (00:00 +0100)]
s390: vfio-ap: tighten the NIB validity check

The NIB is architecturally invalid if the address designates a
storage location that is not installed or if it is zero.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: ec89b55e3bce ("s390: ap: implement PAPQ AQIC interception in kernel")
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agoRevert "s390/mem_detect: do not update output parameters on failure"
Heiko Carstens [Wed, 8 Feb 2023 18:16:45 +0000 (19:16 +0100)]
Revert "s390/mem_detect: do not update output parameters on failure"

This reverts commit cbc29f107e51b1cc7d1e7b0bbe0691a1224205f1.

Get rid of the following smatch warnings:

arch/s390/include/asm/mem_detect.h:86 get_mem_detect_end() error: uninitialized symbol 'end'.
arch/s390/include/asm/mem_detect.h:86 get_mem_detect_end() error: uninitialized symbol 'end'.
arch/s390/boot/vmem.c:256 setup_vmem() error: uninitialized symbol 'start'.
arch/s390/boot/vmem.c:258 setup_vmem() error: uninitialized symbol 'end'.

Note that there is no bug in the code. This is purely to silence smatch.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/idle: remove arch_cpu_idle_time() and corresponding code
Heiko Carstens [Tue, 7 Feb 2023 16:39:42 +0000 (17:39 +0100)]
s390/idle: remove arch_cpu_idle_time() and corresponding code

arch_cpu_idle_time() returns the idle time of any given cpu if it is in
idle, or zero if not. All if this is racy and partially incorrect. Time
stamps taken with store clock extended and store clock fast from different
cpus are compared, while the architecture states that this is nothing which
can be relied on (see Principles of Operation; Chapter 4, "Setting and
Inspecting the Clock").

A more fundamental problem is that the timestamp when a cpu is leaving idle
is taken early in the assembler part of the interrupt handler, and this
value is only transferred many cycles later to the cpu's per-cpu idle data
structure.

This per cpu data structure is read by arch_cpu_idle() to tell for which
period of time a remote cpu is idle: if only an idle_enter value is
present, the assumed idle time of the cpu is calculated by taking a local
timestamp and returning the difference of the local timestamp and the
idle_enter value. This is potentially incorrect, since the remote cpu may
have already left idle, but the taken timestamp may not have been
transferred to the per-cpu data structure. This in turn means that too much
idle time may be reported for a cpu, and a subsequent calculation of system
idle time may result in a smaller value.

Instead of coming up with even more complex code trying to fix this, just
remove this code, and only account idle time of a cpu, after idle state is
left.

Another minor bug is that it is assumed that timestamps are non-zero, which
is not necessarily the case for timestamps taken with store clock
fast. This however is just a very minor problem, since this can only happen
when the epoch increases.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/vx: use simple assignments to access __vector128 members
Heiko Carstens [Thu, 2 Feb 2023 14:47:39 +0000 (15:47 +0100)]
s390/vx: use simple assignments to access __vector128 members

Use simple assignments to access __vector128 members instead of hard
to read casts.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/vx: add 64 and 128 bit members to __vector128 struct
Heiko Carstens [Thu, 2 Feb 2023 14:47:38 +0000 (15:47 +0100)]
s390/vx: add 64 and 128 bit members to __vector128 struct

Add 64 and 128 bit members to __vector128 struct in order to allow reading
of the complete value, or the higher or lower part of vector register
contents instead of having to use casts.

Add an explicit __aligned(4) statement to avoid that the alignment of the
structure changes from 4 to 8. This should make sure that no breakage
happens because of this change.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agoMAINTAINERS: add diag288_wdt driver to s390 maintained files
Heiko Carstens [Mon, 6 Feb 2023 12:56:19 +0000 (13:56 +0100)]
MAINTAINERS: add diag288_wdt driver to s390 maintained files

The diag288_wdt watchdog driver is s390 specific.
Document who is responsible for this driver.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agoMAINTAINERS: add entry for s390 SCM driver
Vineeth Vijayan [Mon, 6 Feb 2023 13:26:31 +0000 (14:26 +0100)]
MAINTAINERS: add entry for s390 SCM driver

Storage Class Memory driver support for s390 architecture has been there
for a while. The original author of this work, Sebastian Ott has left IBM
and I am taking over this module. Adding myself as the upstream maintainer
for SCM on s390 architecture.

Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/processor: always inline cpu flag helper functions
Heiko Carstens [Mon, 6 Feb 2023 13:49:41 +0000 (14:49 +0100)]
s390/processor: always inline cpu flag helper functions

arch_cpu_idle() is marked noinstr and therefore must only call functions
which are also not instrumented.

Make sure that cpu flag helper functions are always inlined to avoid that
the compiler generates an out-of-line function for e.g. the call within
arch_cpu_idle().

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/idle: mark arch_cpu_idle() noinstr
Heiko Carstens [Mon, 6 Feb 2023 13:49:40 +0000 (14:49 +0100)]
s390/idle: mark arch_cpu_idle() noinstr

linux-next commit ("cpuidle: tracing: Warn about !rcu_is_watching()")
adds a new warning which hits on s390's arch_cpu_idle() function:

RCU not on for: arch_cpu_idle+0x0/0x28
WARNING: CPU: 2 PID: 0 at include/linux/trace_recursion.h:162 arch_ftrace_ops_list_func+0x24c/0x258
Modules linked in:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.2.0-rc6-next-20230202 #4
Hardware name: IBM 8561 T01 703 (z/VM 7.3.0)
Krnl PSW : 0404d00180000000 00000000002b55c0 (arch_ftrace_ops_list_func+0x250/0x258)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: c0000000ffffbfff 0000000080000002 0000000000000026 0000000000000000
           0000037ffffe3a28 0000037ffffe3a20 0000000000000000 0000000000000000
           0000000000000000 0000000000f4acf6 00000000001044f0 0000037ffffe3cb0
           0000000000000000 0000000000000000 00000000002b55bc 0000037ffffe3bb8
Krnl Code: 00000000002b55b0c02000840051        larl    %r2,0000000001335652
           00000000002b55b6c0e5fff512d1        brasl   %r14,0000000000157b58
          #00000000002b55bcaf000000            mc      0,0
          >00000000002b55c0a7f4ffe7            brc     15,00000000002b558e
           00000000002b55c4: 0707                bcr     0,%r7
           00000000002b55c6: 0707                bcr     0,%r7
           00000000002b55c8eb6ff0480024        stmg    %r6,%r15,72(%r15)
           00000000002b55ceb90400ef            lgr     %r14,%r15
Call Trace:
 [<00000000002b55c0>] arch_ftrace_ops_list_func+0x250/0x258
([<00000000002b55bc>] arch_ftrace_ops_list_func+0x24c/0x258)
 [<0000000000f5f0fc>] ftrace_common+0x1c/0x20
 [<00000000001044f6>] arch_cpu_idle+0x6/0x28
 [<0000000000f4acf6>] default_idle_call+0x76/0x128
 [<00000000001cc374>] do_idle+0xf4/0x1b0
 [<00000000001cc6ce>] cpu_startup_entry+0x36/0x40
 [<0000000000119d00>] smp_start_secondary+0x140/0x150
 [<0000000000f5d2ae>] restart_int_handler+0x6e/0x90

Mark arch_cpu_idle() noinstr like all other architectures with
CONFIG_ARCH_WANTS_NO_INSTR (should) have it to fix this.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/idle: move idle time accounting to account_idle_time_irq()
Heiko Carstens [Mon, 6 Feb 2023 13:49:39 +0000 (14:49 +0100)]
s390/idle: move idle time accounting to account_idle_time_irq()

There is no reason to do idle time accounting in arch_cpu_idle().
Do idle time accounting in account_idle_time_irq(), where it belongs
to. The accounted values don't change between account_idle_time_irq() and
arch_cpu_idle(); so the result is the same.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agoMerge branch 'cmpxchg_user_key' into features
Heiko Carstens [Thu, 9 Feb 2023 19:10:35 +0000 (20:10 +0100)]
Merge branch 'cmpxchg_user_key' into features

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agowatchdog: diag288_wdt: unify lpar and zvm diag288 helpers
Alexander Egorenkov [Fri, 3 Feb 2023 07:39:58 +0000 (08:39 +0100)]
watchdog: diag288_wdt: unify lpar and zvm diag288 helpers

Change naming of the internal diag288 helper functions
to improve overall readability and reduce confusion:
* Rename __diag288() to diag288().
* Get rid of the misnamed helper __diag288_lpar() that was used not only
  on LPARs but also zVM and KVM systems.
* Rename __diag288_vm() to diag288_str().

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230203073958.1585738-6-egorenar@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agowatchdog: diag288_wdt: de-duplicate diag_stat_inc() calls
Alexander Egorenkov [Fri, 3 Feb 2023 07:39:57 +0000 (08:39 +0100)]
watchdog: diag288_wdt: de-duplicate diag_stat_inc() calls

Call diag_stat_inc() from __diag288() to reduce code duplication.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230203073958.1585738-5-egorenar@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agowatchdog: diag288_wdt: unify command buffer handling for diag288 zvm
Alexander Egorenkov [Fri, 3 Feb 2023 07:39:56 +0000 (08:39 +0100)]
watchdog: diag288_wdt: unify command buffer handling for diag288 zvm

Simplify and de-duplicate code by introducing a common single command
buffer allocated once at initialization. Moreover, simplify the interface
of __diag288_vm() by accepting ASCII strings as the command parameter
and converting it to the EBCDIC format within the function itself.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230203073958.1585738-4-egorenar@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agowatchdog: diag288_wdt: remove power management
Alexander Egorenkov [Fri, 3 Feb 2023 07:39:55 +0000 (08:39 +0100)]
watchdog: diag288_wdt: remove power management

Remove power management because s390 no longer supports hibernation since
commit 394216275c7d ("s390: remove broken hibernate / power management
support").

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230203073958.1585738-3-egorenar@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agowatchdog: diag288_wdt: get rid of register asm
Alexander Egorenkov [Fri, 3 Feb 2023 07:39:54 +0000 (08:39 +0100)]
watchdog: diag288_wdt: get rid of register asm

Using register asm statements has been proven to be very error prone,
especially when using code instrumentation where gcc may add function
calls, which clobbers register contents in an unexpected way.

Therefore, get rid of register asm statements in watchdog code, and make
sure this bug class cannot happen.

Moreover, remove the register r1 from the clobber list because this
register is not changed by DIAG 288.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230203073958.1585738-2-egorenar@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agoMerge branch 'fixes' into features
Heiko Carstens [Mon, 6 Feb 2023 14:13:45 +0000 (15:13 +0100)]
Merge branch 'fixes' into features

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/boot: avoid potential amode31 truncation
Vasily Gorbik [Thu, 2 Feb 2023 18:22:35 +0000 (19:22 +0100)]
s390/boot: avoid potential amode31 truncation

Fixes: bb1520d581a3 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/boot: move detect_facilities() after cmd line parsing
Vasily Gorbik [Thu, 2 Feb 2023 18:21:38 +0000 (19:21 +0100)]
s390/boot: move detect_facilities() after cmd line parsing

Facilities setup has to be done after "facilities" command line option
parsing, it might set extra or remove existing facilities bits for
testing purposes.

Fixes: bb1520d581a3 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/kasan: avoid mapping KASAN shadow for standby memory
Vasily Gorbik [Mon, 30 Jan 2023 01:11:39 +0000 (02:11 +0100)]
s390/kasan: avoid mapping KASAN shadow for standby memory

KASAN common code is able to handle memory hotplug and create KASAN shadow
memory on a fly. Online memory ranges are available from mem_detect,
use this information to avoid mapping KASAN shadow for standby memory.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/boot: avoid page tables memory in kaslr
Vasily Gorbik [Fri, 27 Jan 2023 16:08:29 +0000 (17:08 +0100)]
s390/boot: avoid page tables memory in kaslr

If kernel is build without KASAN support there is a chance that kernel
image is going to be positioned by KASLR code to overlap with identity
mapping page tables.

When kernel is build with KASAN support enabled memory which
is potentially going to be used for page tables and KASAN
shadow mapping is accounted for in KASLR with the use of
kasan_estimate_memory_needs(). Split this function and introduce
vmem_estimate_memory_needs() to cover decompressor's vmem identity
mapping page tables.

Fixes: bb1520d581a3 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/mem_detect: add get_mem_detect_online_total()
Vasily Gorbik [Sun, 29 Jan 2023 23:57:12 +0000 (00:57 +0100)]
s390/mem_detect: add get_mem_detect_online_total()

Add a function to get online memory in total. It is supposed to be used
in the decompressor as well as during early kernel startup.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/mem_detect: handle online memory limit just once
Vasily Gorbik [Sat, 28 Jan 2023 22:55:04 +0000 (23:55 +0100)]
s390/mem_detect: handle online memory limit just once

Introduce mem_detect_truncate() to cut any online memory ranges above
established identity mapping size, so that mem_detect users wouldn't
have to do it over and over again.

Suggested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/boot: fix mem_detect extended area allocation
Vasily Gorbik [Mon, 23 Jan 2023 11:49:47 +0000 (12:49 +0100)]
s390/boot: fix mem_detect extended area allocation

Allocation of mem_detect extended area was not considered neither
in commit 9641b8cc733f ("s390/ipl: read IPL report at early boot")
nor in commit b2d24b97b2a9 ("s390/kernel: add support for kernel address
space layout randomization (KASLR)"). As a result mem_detect extended
theoretically may overlap with ipl report or randomized kernel image
position. But as mem_detect code will allocate extended area only
upon exceeding 255 online regions (which should alternate with offline
memory regions) it is not seen in practice.

To make sure mem_detect extended area does not overlap with ipl report
or randomized kernel position extend usage of "safe_addr". Make initrd
handling and mem_detect extended area allocation code move it further
right and make KASLR takes in into consideration as well.

Fixes: 9641b8cc733f ("s390/ipl: read IPL report at early boot")
Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/mem_detect: rely on diag260() if sclp_early_get_memsize() fails
Vasily Gorbik [Fri, 27 Jan 2023 13:57:43 +0000 (14:57 +0100)]
s390/mem_detect: rely on diag260() if sclp_early_get_memsize() fails

In case sclp_early_get_memsize() fails but diag260() succeeds make sure
some sane value is returned. This error scenario is highly unlikely,
but this change makes system able to boot in such case.

Suggested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/diag: make __diag8c_tmp_amode31 static
Heiko Carstens [Thu, 2 Feb 2023 20:47:11 +0000 (21:47 +0100)]
s390/diag: make __diag8c_tmp_amode31 static

Get rid of this sparse warning:

arch/s390/kernel/diag.c:69:29: warning: symbol '__diag8c_tmp_amode31' was not declared. Should it be static?

Fixes: fbaee7464fbb ("s390/tty3270: add support for diag 8c")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/rethook: add local rethook header file
Heiko Carstens [Thu, 2 Feb 2023 20:36:07 +0000 (21:36 +0100)]
s390/rethook: add local rethook header file

Compiling the kernel with CONFIG_KPROBES disabled, but CONFIG_RETHOOK
enabled, results in this sparse warning:

arch/s390/kernel/rethook.c:26:15: warning: no previous prototype for 'arch_rethook_trampoline_callback' [-Wmissing-prototypes]
    26 | unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs)
       |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add a local rethook header file similar to riscv to address this.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1a280f48c0e4 ("s390/kprobes: replace kretprobe with rethook")
Link: https://lore.kernel.org/all/202302030102.69dZIuJk-lkp@intel.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/vmem: remove unnecessary KASAN checks
Vasily Gorbik [Sat, 28 Jan 2023 17:06:40 +0000 (18:06 +0100)]
s390/vmem: remove unnecessary KASAN checks

Kasan shadow memory area has been moved to the end of kernel address
space since commit 9a39abb7c9aa ("s390/boot: simplify and fix kernel
memory layout setup"), therefore skipping any memory ranges above
VMALLOC_START in empty page tables cleanup code already handles
KASAN shadow memory intersection case and explicit checks could be
removed.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/vmem: fix empty page tables cleanup under KASAN
Vasily Gorbik [Sat, 28 Jan 2023 16:35:12 +0000 (17:35 +0100)]
s390/vmem: fix empty page tables cleanup under KASAN

Commit b9ff81003cf1 ("s390/vmem: cleanup empty page tables") introduced
empty page tables cleanup in vmem code, but when the kernel is built
with KASAN enabled the code has no effect due to wrong KASAN shadow
memory intersection condition, which effectively ignores any memory
range below KASAN shadow. Fix intersection condition to make code
work as anticipated.

Fixes: b9ff81003cf1 ("s390/vmem: cleanup empty page tables")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/kasan: update kasan memory layout note
Vasily Gorbik [Fri, 27 Jan 2023 22:27:18 +0000 (23:27 +0100)]
s390/kasan: update kasan memory layout note

Kasan shadow memory area has been moved to the end of kernel address
space since commit 9a39abb7c9aa ("s390/boot: simplify and fix kernel
memory layout setup"). Change kasan memory layout note accordingly.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/mem_detect: fix detect_memory() error handling
Vasily Gorbik [Fri, 27 Jan 2023 13:03:07 +0000 (14:03 +0100)]
s390/mem_detect: fix detect_memory() error handling

Currently if for some reason sclp_early_read_info() fails,
sclp_early_get_memsize() will not set max_physmem_end and it
will stay uninitialized. Any garbage value other than 0 will lead
to detect_memory() taking wrong path or returning a garbage value
as max_physmem_end. To avoid that simply initialize max_physmem_end.

Fixes: 73045a08cf55 ("s390: unify identity mapping limits handling")
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/hmcdrv: use strscpy() instead of strlcpy()
Heiko Carstens [Tue, 31 Jan 2023 18:50:25 +0000 (19:50 +0100)]
s390/hmcdrv: use strscpy() instead of strlcpy()

Given that strlcpy() is deprecated use strscpy() instead.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/ipl: add loadparm parameter to eckd ipl/reipl data
Sven Schnelle [Sun, 29 Jan 2023 15:39:19 +0000 (16:39 +0100)]
s390/ipl: add loadparm parameter to eckd ipl/reipl data

commit 87fd22e0ae92 ("s390/ipl: add eckd support") missed to add the
loadparm attribute to the new eckd ipl/reipl data.

Fixes: 87fd22e0ae92 ("s390/ipl: add eckd support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
17 months agos390/ipl: add DEFINE_GENERIC_LOADPARM()
Sven Schnelle [Sun, 29 Jan 2023 15:36:03 +0000 (16:36 +0100)]
s390/ipl: add DEFINE_GENERIC_LOADPARM()

In the current code each reipl type implements its own pair of loadparm
show/store functions. Add a macro to deduplicate the code a bit.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Fixes: 87fd22e0ae92 ("s390/ipl: add eckd support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/mem_detect: do not update output parameters on failure
Alexander Gordeev [Wed, 25 Jan 2023 18:16:10 +0000 (19:16 +0100)]
s390/mem_detect: do not update output parameters on failure

Function __get_mem_detect_block() resets start and end
output parameters in case of invalid mem_detect array
index is provided. That violates the rule of sparing
the output on fail path and leads e.g to a below anomaly:

for_each_mem_detect_block(i, &start, &end)
continue;

One would expect start and end contain addresses of the
last memory block (if available), but in fact the two
will be reset to zeroes. That is not how an iterator is
expected to work.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cio: introduce locking for register/unregister functions
Vineeth Vijayan [Fri, 11 Nov 2022 12:46:33 +0000 (13:46 +0100)]
s390/cio: introduce locking for register/unregister functions

Unbinding an I/O subchannel with a child-CCW device in disconnected
state sometimes causes a kernel-panic. The race condition was seen
mostly during testing, when setting all the CHPIDs of a device to
offline and at the same time, the unbinding the I/O subchannel driver.

The kernel-panic occurs because of double delete, the I/O subchannel
driver calls device_del on the CCW device while another device_del
invocation for the same device is in-flight.  For instance, disabling
all the CHPIDs will trigger the ccw_device_remove function, which will
call a ccw_device_unregister(), which ends up calling the device_del()
which is asynchronous via cdev's todo workqueue. And unbinding the I/O
subchannel driver calls io_subchannel_remove() function which calls the
ccw_device_unregister() and device_del().

This double delete can be prevented by serializing all CCW device
registration/unregistration calls into the driver core. This patch
introduces a mutex which will be used for this purpose.

Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/mm,ptdump: avoid Kasan vs Memcpy Real markers swapping
Vasily Gorbik [Tue, 24 Jan 2023 17:08:38 +0000 (18:08 +0100)]
s390/mm,ptdump: avoid Kasan vs Memcpy Real markers swapping

---[ Real Memory Copy Area Start ]---
0x001bfffffffff000-0x001c000000000000         4K PTE I
---[ Kasan Shadow Start ]---
---[ Real Memory Copy Area End ]---
0x001c000000000000-0x001c000200000000         8G PMD RW NX
...
---[ Kasan Shadow End ]---

ptdump does a stable sort of markers. Move kasan markers after
memcpy real to avoid swapping.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/boot: remove pgtable_populate_end
Vasily Gorbik [Tue, 24 Jan 2023 16:03:24 +0000 (17:03 +0100)]
s390/boot: remove pgtable_populate_end

setup_vmem() already calls populate for all online memory regions.
pgtable_populate_end() could be removed.

Also rename pgtable_populate_begin() to pgtable_populate_init().

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/boot: avoid mapping standby memory
Vasily Gorbik [Mon, 23 Jan 2023 14:24:17 +0000 (15:24 +0100)]
s390/boot: avoid mapping standby memory

Commit bb1520d581a3 ("s390/mm: start kernel with DAT enabled")
doesn't consider online memory holes due to potential memory offlining
and erroneously creates pgtables for stand-by memory, which bear RW+X
attribute and trigger a warning:

RANGE                                 SIZE   STATE REMOVABLE BLOCK
0x0000000000000000-0x0000000c3fffffff  49G  online       yes  0-48
0x0000000c40000000-0x0000000c7fffffff   1G offline              49
0x0000000c80000000-0x0000000fffffffff  14G  online       yes 50-63
0x0000001000000000-0x00000013ffffffff  16G offline           64-79

    s390/mm: Found insecure W+X mapping at address 0xc40000000
    WARNING: CPU: 14 PID: 1 at arch/s390/mm/dump_pagetables.c:142 note_page+0x2cc/0x2d8

Map only online memory ranges which fit within identity mapping limit.

Fixes: bb1520d581a3 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/decompressor: specify __decompress() buf len to avoid overflow
Vasily Gorbik [Sun, 29 Jan 2023 22:47:23 +0000 (23:47 +0100)]
s390/decompressor: specify __decompress() buf len to avoid overflow

Historically calls to __decompress() didn't specify "out_len" parameter
on many architectures including s390, expecting that no writes beyond
uncompressed kernel image are performed. This has changed since commit
2aa14b1ab2c4 ("zstd: import usptream v1.5.2") which includes zstd library
commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer
(#2751)"). Now zstd decompression code might store literal buffer in
the unwritten portion of the destination buffer. Since "out_len" is
not set, it is considered to be unlimited and hence free to use for
optimization needs. On s390 this might corrupt initrd or ipl report
which are often placed right after the decompressor buffer. Luckily the
size of uncompressed kernel image is already known to the decompressor,
so to avoid the problem simply specify it in the "out_len" parameter.

Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-here.call-01675030179-ext-9637@work.hours
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agowatchdog: diag288_wdt: fix __diag288() inline assembly
Alexander Egorenkov [Fri, 27 Jan 2023 13:52:42 +0000 (14:52 +0100)]
watchdog: diag288_wdt: fix __diag288() inline assembly

The DIAG 288 statement consumes an EBCDIC string the address of which is
passed in a register. Use a "memory" clobber to tell the compiler that
memory is accessed within the inline assembly.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agowatchdog: diag288_wdt: do not use stack buffers for hardware data
Alexander Egorenkov [Fri, 27 Jan 2023 13:52:41 +0000 (14:52 +0100)]
watchdog: diag288_wdt: do not use stack buffers for hardware data

With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data passed to a hardware or a hypervisor interface that
requires V=R can no longer be allocated on the stack.

Use kmalloc() to get memory for a diag288 command.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/syscalls: get rid of system call alias functions
Heiko Carstens [Mon, 23 Jan 2023 13:30:46 +0000 (14:30 +0100)]
s390/syscalls: get rid of system call alias functions

bpftrace and friends only consider functions present in
/sys/kernel/tracing/available_filter_functions.

For system calls there is the s390 specific problem that the system call
function itself is present via __se_sys##name() while the system call
itself is wired up via an __s390x_sys##name() alias. The required DWARF
debug information however is only available for the original function, not
the alias, but within available_filter_functions only the functions with
__s390x_ prefix are available. Which means the required DWARF debug
information cannot be found.
While this could be solved via tooling, it is easier to change the s390
specific system call wrapper handling.

Therefore get rid of this alias handling and implement system call wrappers
like most other architectures are doing. In result the implementation
generates the following functions:

long __s390x_sys##name(struct pt_regs *regs)
static inline long __se_sys##name(...)
static inline long __do_sys##name(...)

__s390x_sys##name() is the visible system call function which is also wired
up in the system call table. Its only parameter is a pt_regs variable.

This function calls the corresponding __se_sys##name() function, which has
as many parameters like the system call definition. This function in turn
performs all zero and sign extensions of all system call parameters, taken
from the pt_regs structure, and finally calls __do_sys##name().

__do_sys##name() is the actual inlined system call function implementation.

For all 64 bit system calls there is a 31/32 bit system call function
__s390_sys##name() generated, which handles all system call parameters
correctly as required by compat handling. This function may be wired
up within the compat system call table, unless there exists an
explicit compat system call function, which is then used instead.

Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/syscalls: remove trailing semicolon
Heiko Carstens [Mon, 23 Jan 2023 13:30:45 +0000 (14:30 +0100)]
s390/syscalls: remove trailing semicolon

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/syscalls: move __S390_SYS_STUBx() macro
Heiko Carstens [Mon, 23 Jan 2023 13:30:44 +0000 (14:30 +0100)]
s390/syscalls: move __S390_SYS_STUBx() macro

Move __S390_SYS_STUBx() the end of the CONFIG_COMPAT section, so both
variants (compat and non-compat) are close together and can be easily
compared.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/syscalls: remove __SC_COMPAT_TYPE define
Heiko Carstens [Mon, 23 Jan 2023 13:30:43 +0000 (14:30 +0100)]
s390/syscalls: remove __SC_COMPAT_TYPE define

Remove __SC_COMPAT_TYPE define which is an unused leftover.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/syscalls: remove SYSCALL_METADATA() from compat syscalls
Heiko Carstens [Mon, 23 Jan 2023 13:30:42 +0000 (14:30 +0100)]
s390/syscalls: remove SYSCALL_METADATA() from compat syscalls

SYSCALL_METADATA() is only supposed to be used for non-compat system
calls. Otherwise there would be a name clash.

This also removes the inconsistency that s390 is the only architecture
which uses SYSCALL_METADATA() for compat system calls, and even that only
for compat system calls without parameters. Only two such compat system
calls exist.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390: discard .interp section
Ilya Leoshkevich [Mon, 23 Jan 2023 21:50:32 +0000 (22:50 +0100)]
s390: discard .interp section

When debugging vmlinux with QEMU + GDB, the following GDB error may
occur:

    (gdb) c
    Continuing.
    Warning:
    Cannot insert breakpoint -1.
    Cannot access memory at address 0xffffffffffff95c0

    Command aborted.
    (gdb)

The reason is that, when .interp section is present, GDB tries to
locate the file specified in it in memory and put a number of
breakpoints there (see enable_break() function in gdb/solib-svr4.c).
Sometimes GDB finds a bogus location that matches its heuristics,
fails to set a breakpoint and stops. This makes further debugging
impossible.

The .interp section contains misleading information anyway (vmlinux
does not need ld.so), so fix by discarding it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_cf: simplify PMC_INIT and PMC_RELEASE usage
Thomas Richter [Tue, 24 Jan 2023 11:20:55 +0000 (12:20 +0100)]
s390/cpum_cf: simplify PMC_INIT and PMC_RELEASE usage

Simplify the use of constants PMC_INIT and PMC_RELEASE.

Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_cf: merge source files for CPU Measurement counter facility
Thomas Richter [Tue, 24 Jan 2023 11:20:54 +0000 (12:20 +0100)]
s390/cpum_cf: merge source files for CPU Measurement counter facility

With no in-kernel user, the source files can be merged.

Move all functions and the variable definitions to file perf_cpum_cf.c
This file now contains all the necessary functions and definitions
for the CPU Measurement counter facility device driver.

The files cpu_mcf.h and perf_cpum_cf_common.c are deleted.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_cf: remove in-kernel counting facility interface
Thomas Richter [Tue, 24 Jan 2023 11:20:53 +0000 (12:20 +0100)]
s390/cpum_cf: remove in-kernel counting facility interface

Commit 17bebcc68eee ("s390/cpum_cf: Add minimal in-kernel interface for
counter measurements") introduced a small in-kernel interface for CPU
Measurement counter facility.
There are no users of this interface, therefore remove it.

The following functions are removed:
 kernel_cpumcf_alert(),
 kernel_cpumcf_begin(),
 kernel_cpumcf_end(),
 kernel_cpumcf_avail()
there is no need for them anymore.
With the removal of function kernel_cpumcf_alert(), also remove
member alert in struct cpu_cf_events. Its purpose was to counter
measurement alert interrupts for the in-kernel interface.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_cf: move stccm_avail()
Thomas Richter [Tue, 24 Jan 2023 11:20:52 +0000 (12:20 +0100)]
s390/cpum_cf: move stccm_avail()

Function stccm_avail() is defined in a header file and the
only user is one single source file. Move this function to the source
file where it is also used and remove it from the header file.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_cf: move cpum_cf_ctrset_size()
Thomas Richter [Tue, 24 Jan 2023 11:20:51 +0000 (12:20 +0100)]
s390/cpum_cf: move cpum_cf_ctrset_size()

Function cpum_cf_ctrset_size() is defined in one source file and the
only user is in another source file. Move this function to the source
file where it is used and remove its prototype from the header file.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_cf: simplify hw_perf_event_destroy()
Thomas Richter [Tue, 24 Jan 2023 11:20:50 +0000 (12:20 +0100)]
s390/cpum_cf: simplify hw_perf_event_destroy()

To remove an event from the CPU Measurement counter facility
use the lock/unlock scheme as done in event creation. Remove
the atomic_add_unless function to make the code easier.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cache: change type from unsigned long long to unsigned long
Heiko Carstens [Sun, 22 Jan 2023 18:13:03 +0000 (19:13 +0100)]
s390/cache: change type from unsigned long long to unsigned long

The unsigned long long type is a leftover of the 31 bit area.
Get rid of it.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio_ap: increase max wait time for reset verification
Tony Krowiak [Wed, 18 Jan 2023 20:31:11 +0000 (15:31 -0500)]
s390/vfio_ap: increase max wait time for reset verification

Increase the maximum time to wait for verification of a queue reset
operation to 200ms.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-7-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio_ap: fix handling of error response codes
Tony Krowiak [Wed, 18 Jan 2023 20:31:10 +0000 (15:31 -0500)]
s390/vfio_ap: fix handling of error response codes

Some response codes returned from the queue reset function are not being
handled correctly; this patch fixes them:

1. Response code 3, AP queue deconfigured: Deconfiguring an AP adapter
   resets all of its queues, so this is handled by indicating the reset
   verification completed successfully.

2. For all response codes other than 0 (normal reset completion), 2
   (queue reset in progress) and 3 (AP deconfigured), the -EIO error will
   be returned from the vfio_ap_mdev_reset_queue() function. In all cases,
   all fields of the status word other than the response code will be
   set to zero, so it makes no sense to check status bits.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-6-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio_ap: verify ZAPQ completion after return of response code zero
Tony Krowiak [Wed, 18 Jan 2023 20:31:09 +0000 (15:31 -0500)]
s390/vfio_ap: verify ZAPQ completion after return of response code zero

Verification that the asynchronous ZAPQ function has completed only needs
to be done when the response code indicates the function was successfully
initiated; so, let's call the apq_reset_check function immediately after
the response code zero is returned from the ZAPQ.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-5-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio_ap: use TAPQ to verify reset in progress completes
Tony Krowiak [Wed, 18 Jan 2023 20:31:08 +0000 (15:31 -0500)]
s390/vfio_ap: use TAPQ to verify reset in progress completes

To eliminate the repeated calls to the PQAP(ZAPQ) function to verify that
a reset in progress completed successfully and ensure that error response
codes get appropriately logged, let's call the apq_reset_check() function
when the ZAPQ response code indicates that a reset that is already in
progress.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-4-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio_ap: check TAPQ response code when waiting for queue reset
Tony Krowiak [Wed, 18 Jan 2023 20:31:07 +0000 (15:31 -0500)]
s390/vfio_ap: check TAPQ response code when waiting for queue reset

The vfio_ap_mdev_reset_queue() function does not check the status
response code returned form the PQAP(TAPQ) function when verifying the
queue's status; consequently, there is no way of knowing whether
verification failed because the wait time was exceeded, or because the
PQAP(TAPQ) failed.

This patch adds a function to check the status response code from the
PQAP(TAPQ) instruction and logs an appropriate message if it fails.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-3-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio-ap: verify reset complete in separate function
Tony Krowiak [Wed, 18 Jan 2023 20:31:06 +0000 (15:31 -0500)]
s390/vfio-ap: verify reset complete in separate function

The vfio_ap_mdev_reset_queue() function contains a loop to verify that the
reset successfully completes within 40ms. This patch moves that loop into
a separate function.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-2-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/kprobes: replace kretprobe with rethook
Vasily Gorbik [Tue, 17 Jan 2023 13:37:10 +0000 (14:37 +0100)]
s390/kprobes: replace kretprobe with rethook

That's an adaptation of commit f3a112c0c40d ("x86,rethook,kprobes:
Replace kretprobe with rethook on x86") to s390.

Replaces the kretprobe code with rethook on s390. With this patch,
kretprobe on s390 uses the rethook instead of kretprobe specific
trampoline code.

Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_sf: diagnostic sampling buffer setup to handle virtual addresses
Thomas Richter [Fri, 13 Jan 2023 11:29:55 +0000 (12:29 +0100)]
s390/cpum_sf: diagnostic sampling buffer setup to handle virtual addresses

The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers
to save samples collected by hardware. These buffers are organized
as Sample Data Buffer Tables (SDBT) and Sample Data Buffers (SDB).
SDBs contain the samples which are extracted and saved in the perf
ring buffer.
The SDBTs are chained using real addresses and refer to SDBs using
real addresses.

The diagnostic sampling setup uses buffers provided by the process
which invokes perf_event_open system call. The buffers are memory
mapped. The buffers have been allocated by the kernel event
subsystem.

Add proper virtual to phyiscal address translation to the buffer
chaining. The current constraint which requires virtual equals
real address layout is removed.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_sf: rework macro AUX_SDB_NUM_xxx
Thomas Richter [Fri, 13 Jan 2023 11:12:07 +0000 (12:12 +0100)]
s390/cpum_sf: rework macro AUX_SDB_NUM_xxx

Macro AUX_SDB_NUM() has three parameters. The first one is not used.
Remove the first parameter. Also convert the macros to inline functions.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_sf: sampling buffer setup to handle virtual addresses
Thomas Richter [Mon, 16 Jan 2023 13:58:22 +0000 (14:58 +0100)]
s390/cpum_sf: sampling buffer setup to handle virtual addresses

The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers
to save samples collected by hardware. These buffers are organized
as Sample Data Buffer Tables (SDBT) and Sample Data Buffers (SDB).
SDBs contain the samples which are extracted and saved in the perf
ring buffer.
The SDBTs are chained using real addresses and refer to SDBs using
real addresses.

Adds proper virtual to phyiscal address translation to the buffer
chaining. The current constraint which requires virtual equals real
address layout is removed.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_sf: remove debug statements from function setup_pmc_cpu
Thomas Richter [Thu, 5 Jan 2023 13:11:33 +0000 (14:11 +0100)]
s390/cpum_sf: remove debug statements from function setup_pmc_cpu

Remove debug statements from function setup_pmc_cpu().
The debug statement displays a pointer value to a per cpu variable.
This pointer value is printed nowhere else, so it has no use for
cross reference.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_sf: move functions from header file to source file
Thomas Richter [Thu, 12 Jan 2023 14:09:21 +0000 (15:09 +0100)]
s390/cpum_sf: move functions from header file to source file

Some inline helper functions are defined in a header file but used
in only one source file. Move these functions to the source file.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vmem: use swap() instead of open coding it
Jiapeng Chong [Tue, 17 Jan 2023 06:02:23 +0000 (14:02 +0800)]
s390/vmem: use swap() instead of open coding it

Swap is a function interface that provides exchange function. To avoid
code duplication, we can use swap function.

./arch/s390/mm/vmem.c:680:10-11: WARNING opportunity for swap().

[hca@linux.ibm.com: get rid of all temp variables]
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3786
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230117060223.58583-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cio: evaluate devices with non-operational paths
Vineeth Vijayan [Thu, 17 Nov 2022 06:31:50 +0000 (07:31 +0100)]
s390/cio: evaluate devices with non-operational paths

css_schedule_reprobe() function calls the evaluation for CSS_EVAL_UNREG
which is specific to the idset of unregistered subchannels. This
evaluation was introduced because, previously, if the underlying device
become not-accessible, the subchannel was unregistered. But, in the recent
changes in cio,with the commit '2297791c92d0 s390/cio: dont unregister
subchannel from child-drivers', we no  longer unregister the subchannels
just because of a non-operational device. This allows to have subchannels
without any operational device connected on it. So, a css_schedule_reprobe
function on unregistered subchannel does not have any effect.

Change this functionality to evaluate the subchannels which does not
have a working path to the device. This could be due the erroneous
device or due to the erraneous path. Evaluate based on the values of OPM
and PAM&POM.
Here we introduced a new idset function,to keep I/O subchannels in the
idset when the last seen status indicates that the device has no working
path. A device has no working path if all available paths have been tried
without success.A failed I/O attempt on a path is indicated as a 0 bit
value in the POM mask. By looking at the POM mask bit values of available
paths (1 in PAM) that Linux is supposed to use (1 in vary mask OPM), we
can identify a non-working device as a device where the bit-wise and of
the PAM, POM and OPM mask return 0.

css_schedule_reprobe() is being used by dasd-driver and chsc-cio
component. dasd driver, when it detects a change in the pathgroup, invokes
the re-evaluation of the subchannel. And chsc-cio component upon a CRW
event, (resource accessibility event). In both the cases, it makes much
better sense to re-evalute the subchannel with no-valid path.

Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reported-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/ipl: use kstrtobool() instead of strtobool()
Christophe JAILLET [Sat, 14 Jan 2023 15:08:22 +0000 (16:08 +0100)]
s390/ipl: use kstrtobool() instead of strtobool()

strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.

In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/58a3ed2e21903a93dfd742943b1e6936863ca037.1673708887.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agoMerge branch 'fixes' into features
Heiko Carstens [Tue, 17 Jan 2023 18:12:39 +0000 (19:12 +0100)]
Merge branch 'fixes' into features

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390: workaround invalid gcc-11 out of bounds read warning
Heiko Carstens [Tue, 17 Jan 2023 18:00:59 +0000 (19:00 +0100)]
s390: workaround invalid gcc-11 out of bounds read warning

GCC 11.1.0 and 11.2.0 generate a wrong warning when compiling the
kernel e.g. with allmodconfig:

arch/s390/kernel/setup.c: In function ‘setup_lowcore_dat_on’:
./include/linux/fortify-string.h:57:33: error: ‘__builtin_memcpy’ reading 128 bytes from a region of size 0 [-Werror=stringop-overread]
...
arch/s390/kernel/setup.c:526:9: note: in expansion of macro ‘memcpy’
  526 |         memcpy(abs_lc->cregs_save_area, S390_lowcore.cregs_save_area,
      |         ^~~~~~

This could be addressed by using absolute_pointer() with the
S390_lowcore macro, but this is not a good idea since this generates
worse code for performance critical paths.

Therefore simply use a for loop to copy the array in question and get
rid of the warning.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agodocs/ABI: use linux-s390 list as the main contact
Vineeth Vijayan [Mon, 2 Jan 2023 19:02:53 +0000 (20:02 +0100)]
docs/ABI: use linux-s390 list as the main contact

Remove Cornelia's email address from the file as suggested by her. List
linux-s390 mailing-list address as the primary contact instead.

Link: https://lore.kernel.org/linux-s390/8735d0oiq6.fsf@redhat.com/
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/vfio-ap: fix an error handling path in vfio_ap_mdev_probe_queue()
Christophe JAILLET [Sun, 31 Jul 2022 16:09:14 +0000 (18:09 +0200)]
s390/vfio-ap: fix an error handling path in vfio_ap_mdev_probe_queue()

The commit in Fixes: has switch the order of a sysfs_create_group() and a
kzalloc().

It correctly removed the now useless kfree() but forgot to add a
sysfs_remove_group() in case of (unlikely) memory allocation failure.

Add it now.

Fixes: 260f3ea14138 ("s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/r/d0c0a35eec4fa87cb7f3910d8ac4dc0f7dc9008a.1659283738.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390: move __amode31_base declaration to proper header file
Heiko Carstens [Sun, 8 Jan 2023 18:08:55 +0000 (19:08 +0100)]
s390: move __amode31_base declaration to proper header file

Move __amode31_base declaration to proper header file to get rid of

arch/s390/boot/startup.c:24:15:
 warning: symbol '__amode31_base' was not declared. Should it be static?

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/mm: allocate Absolute Lowcore Area in decompressor
Alexander Gordeev [Mon, 19 Dec 2022 20:08:27 +0000 (21:08 +0100)]
s390/mm: allocate Absolute Lowcore Area in decompressor

Move Absolute Lowcore Area allocation to the decompressor.
As result, get_abs_lowcore() and put_abs_lowcore() access
brackets become really straight and do not require complex
execution context analysis and LAP and interrupts tackling.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/mm: allocate Real Memory Copy Area in decompressor
Alexander Gordeev [Sun, 11 Dec 2022 07:18:57 +0000 (08:18 +0100)]
s390/mm: allocate Real Memory Copy Area in decompressor

Move Real Memory Copy Area allocation to the decompressor.
As result, memcpy_real() and memcpy_real_iter() movers
become usable since the very moment the kernel starts.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/boot: allow setup of different virtual address types
Alexander Gordeev [Thu, 15 Dec 2022 09:33:52 +0000 (10:33 +0100)]
s390/boot: allow setup of different virtual address types

Currently the decompressor sets up only identity mapping.
Allow adding more address range types as a prerequisite
for allocation of kernel fixed mappings.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/kasan: remove identity mapping support
Alexander Gordeev [Fri, 16 Dec 2022 18:49:23 +0000 (19:49 +0100)]
s390/kasan: remove identity mapping support

The identity mapping is created in the decompressor,
there is no need to have the same functionality in
the kasan setup code. Thus, remove it.

Remove the 4KB pages check for first 1MB since there
is no need to take care of the lowcore pages.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/maccess: remove dead DAT-off code
Alexander Gordeev [Fri, 2 Dec 2022 18:23:11 +0000 (19:23 +0100)]
s390/maccess: remove dead DAT-off code

As the kernel is executed in DAT-on mode only, remove
unnecessary DAT bit check together with the dead code.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/mm: start kernel with DAT enabled
Alexander Gordeev [Tue, 13 Dec 2022 10:35:11 +0000 (11:35 +0100)]
s390/mm: start kernel with DAT enabled

The setup of the kernel virtual address space is spread
throughout the sources, boot stages and config options
like this:

1. The available physical memory regions are queried
   and stored as mem_detect information for later use
   in the decompressor.

2. Based on the physical memory availability the virtual
   memory layout is established in the decompressor;

3. If CONFIG_KASAN is disabled the kernel paging setup
   code populates kernel pgtables and turns DAT mode on.
   It uses the information stored at step [1].

4. If CONFIG_KASAN is enabled the kernel early boot
   kasan setup populates kernel pgtables and turns DAT
   mode on. It uses the information stored at step [1].

   The kasan setup creates early_pg_dir directory and
   directly overwrites swapper_pg_dir entries to make
   shadow memory pages available.

Move the kernel virtual memory setup to the decompressor
and start the kernel with DAT turned on right from the
very first istruction. That completely eliminates the
boot phase when the kernel runs in DAT-off mode, simplies
the overall design and consolidates pgtables setup.

The identity mapping is created in the decompressor, while
kasan shadow mappings are still created by the early boot
kernel code.

Share with decompressor the existing kasan memory allocator.
It decreases the size of a newly requested memory block from
pgalloc_pos and ensures that kernel image is not overwritten.
pgalloc_low and pgalloc_pos pointers are made preserved boot
variables for that.

Use the bootdata infrastructure to setup swapper_pg_dir
and invalid_pg_dir directories used by the kernel later.
The interim early_pg_dir directory established by the
kasan initialization code gets eliminated as result.

As the kernel runs in DAT-on mode only the PSW_KERNEL_BITS
define gets PSW_MASK_DAT bit by default. Additionally, the
setup_lowcore_dat_off() and setup_lowcore_dat_on() routines
get merged, since there is no DAT-off mode stage anymore.

The memory mappings are created with RW+X protection that
allows the early boot code setting up all necessary data
and services for the kernel being booted. Just before the
paging is enabled the memory protection is changed to
RO+X for text, RO+NX for read-only data and RW+NX for
kernel data and the identity mapping.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/boot: detect and enable memory facilities
Alexander Gordeev [Sun, 4 Dec 2022 20:15:41 +0000 (21:15 +0100)]
s390/boot: detect and enable memory facilities

Detect and enable memory facilities which is a
prerequisite for pgtables setup in the decompressor.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/pgtable: add REGION3_KERNEL_EXEC protection
Alexander Gordeev [Fri, 2 Dec 2022 18:07:07 +0000 (19:07 +0100)]
s390/pgtable: add REGION3_KERNEL_EXEC protection

Similar to existing PAGE_KERNEL_EXEC and SEGMENT_KERNEL_EXEC
memory protection add REGION3_KERNEL_EXEC attribute that
could be set on PUD pgtable entries.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/kasan: use set_pXe_bit() for pgtable entries setup
Alexander Gordeev [Fri, 16 Dec 2022 18:07:38 +0000 (19:07 +0100)]
s390/kasan: use set_pXe_bit() for pgtable entries setup

Convert setup of pgtable entries to use set_pXe_bit()
helpers as the preferred way in MM code.

Locally introduce pgprot_clear_bit() helper, which is
strictly speaking a generic function. However, it is
only x86 pgprot_clear_protnone_bits() helper, which
does a similar thing, so do not make it public.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/kasan: cleanup setup of untracked memory pgtables
Alexander Gordeev [Sat, 10 Dec 2022 08:49:04 +0000 (09:49 +0100)]
s390/kasan: cleanup setup of untracked memory pgtables

Avoid duplicate IS_ENABLED(CONFIG_KASAN_VMALLOC) condition check.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/kasan: cleanup setup of zero pgtable
Alexander Gordeev [Fri, 9 Dec 2022 21:09:44 +0000 (22:09 +0100)]
s390/kasan: cleanup setup of zero pgtable

Fix variables initialization coding style and setup zero
pgtable same way region and segment pgtables are set up.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/kasan: sort out physical vs virtual memory confusion
Alexander Gordeev [Tue, 13 Dec 2022 10:31:39 +0000 (11:31 +0100)]
s390/kasan: sort out physical vs virtual memory confusion

The kasan early boot memory allocators operate on pgalloc_pos
and segment_pos physical address pointers, but fail to convert
it to the corresponding virtual pointers.

Currently it is not a problem, since virtual and physical
addresses on s390 are the same. Nevertheless, should they
ever differ, this would cause an invalid pointer access.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/early: fix sclp_early_sccb variable lifetime
Alexander Gordeev [Thu, 15 Dec 2022 07:00:34 +0000 (08:00 +0100)]
s390/early: fix sclp_early_sccb variable lifetime

Commit ada1da31ce34 ("s390/sclp: sort out physical vs
virtual pointers usage") fixed the notion of virtual
address for sclp_early_sccb pointer. However, it did
not take into account that kasan_early_init() can also
output messages and sclp_early_sccb should be adjusted
by the time kasan_early_init() is called.

Currently it is not a problem, since virtual and physical
addresses on s390 are the same. Nevertheless, should they
ever differ, this would cause an invalid pointer access.

Fixes: ada1da31ce34 ("s390/sclp: sort out physical vs virtual pointers usage")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/boot: cleanup decompressor header files
Alexander Gordeev [Thu, 5 May 2022 14:54:54 +0000 (16:54 +0200)]
s390/boot: cleanup decompressor header files

Move declarations to appropriate header files. Instead of cryptic
casting directly assign struct vmlinux_info type to _vmlinux_info
linker script variable - wich it actually is.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390: update defconfigs
Heiko Carstens [Wed, 11 Jan 2023 14:04:47 +0000 (15:04 +0100)]
s390: update defconfigs

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agoKVM: s390: interrupt: use READ_ONCE() before cmpxchg()
Heiko Carstens [Mon, 9 Jan 2023 14:54:56 +0000 (15:54 +0100)]
KVM: s390: interrupt: use READ_ONCE() before cmpxchg()

Use READ_ONCE() before cmpxchg() to prevent that the compiler generates
code that fetches the to be compared old value several times from memory.

Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20230109145456.2895385-1-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple()
Heiko Carstens [Mon, 9 Jan 2023 10:51:20 +0000 (11:51 +0100)]
s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple()

Make sure that *ptr__ within arch_this_cpu_to_op_simple() is only
dereferenced once by using READ_ONCE(). Otherwise the compiler could
generate incorrect code.

Cc: <stable@vger.kernel.org>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
18 months agos390/cpum_sf: add READ_ONCE() semantics to compare and swap loops
Heiko Carstens [Thu, 5 Jan 2023 14:44:20 +0000 (15:44 +0100)]
s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops

The current cmpxchg_double() loops within the perf hw sampling code do not
have READ_ONCE() semantics to read the old value from memory. This allows
the compiler to generate code which reads the "old" value several times
from memory, which again allows for inconsistencies.

For example:

        /* Reset trailer (using compare-double-and-swap) */
        do {
                te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK;
                te_flags |= SDB_TE_ALERT_REQ_MASK;
        } while (!cmpxchg_double(&te->flags, &te->overflow,
                 te->flags, te->overflow,
                 te_flags, 0ULL));

The compiler could generate code where te->flags used within the
cmpxchg_double() call may be refetched from memory and which is not
necessarily identical to the previous read version which was used to
generate te_flags. Which in turn means that an incorrect update could
happen.

Fix this by adding READ_ONCE() semantics to all cmpxchg_double()
loops. Given that READ_ONCE() cannot generate code on s390 which atomically
reads 16 bytes, use a private compare-and-swap-double implementation to
achieve that.

Also replace cmpxchg_double() with the private implementation to be able to
re-use the old value within the loops.

As a side effect this converts the whole code to only use bit fields
to read and modify bits within the hws trailer header.

Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/linux-s390/Y71QJBhNTIatvxUT@osiris/T/#ma14e2a5f7aa8ed4b94b6f9576799b3ad9c60f333
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>