Tobias Huschle [Mon, 21 Feb 2022 11:55:52 +0000 (12:55 +0100)]
s390/kprobes: Avoid additional kprobe in kretprobe handling
So far, s390 registered a krobe on __kretprobe_trampoline which is
called everytime a kretprobe fires. This kprobe would then determine
the correct return address and adjust the psw accordingly, such that
the kretprobe would branch to the appropriate address after completion.
Some other archs handle kretprobes without such an additional kprobe.
This approach is adopted to s390 with this patch.
Furthermore, the __kretprobe_trampoline now uses an assembler function
to correctly gather the register and psw content to be passed to the
registered kretprobe handler as struct pt_regs. After completion, the
register content and the psw are set based on the contents of said
pt_regs struct.
Note that a change to the psw address in struct pt_regs will not have
an impact, as the probe will still return to the original return
address of the probed function.
The return address is now recovered by using the appropriate function
arch_kretprobe_fixup_return.
The no longer needed kprobe is removed.
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Tobias Huschle <huschle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Fri, 25 Feb 2022 09:39:02 +0000 (10:39 +0100)]
s390: convert ".insn" encoding to instruction names
With z10 as minimum supported machine generation many ".insn" encodings
could be now converted to instruction names. There are couple of exceptions
- stfle is used from the als code built for z900 and cannot be converted
- few ".insn" directives encode unsupported instruction formats
The generated code is identical before/after this change.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Fri, 25 Feb 2022 09:38:23 +0000 (10:38 +0100)]
s390: assume stckf is always present
With z10 as minimum supported machine generation the store-clock-fast
facility (25) is always present and checked in als code.
Drop alternatives and always use stckf.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Fri, 25 Feb 2022 09:18:14 +0000 (10:18 +0100)]
s390/nospec: move to single register thunks
Assembler generated expoline thunks were in a form
__s390_indirect_jump_rXuse_rX when exrl instruction has not been available.
Now with z10 as minimum supported machine generation there
is no need for 2 register thunks, always generate
__s390_indirect_jump_rX versions.
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Thu, 24 Feb 2022 21:43:31 +0000 (22:43 +0100)]
s390: raise minimum supported machine generation to z10
Machine generations up to z9 (released in May 2006) have been officially
out of service for several years now (z9 end of service - January 31, 2019).
No distributions build kernels supporting those old machine generations
anymore, except Debian, which seems to pick the oldest supported
generation. The team supporting Debian on s390 has been notified about
the change.
Raising minimum supported machine generation to z10 helps to reduce
maintenance cost and effectively remove code, which is not getting
enough testing coverage due to lack of older hardware and distributions
support. Besides that this unblocks some optimization opportunities and
allows to use wider instruction set in asm files for future features
implementation. Due to this change spectre mitigation and usercopy
implementations could be drastically simplified and many newer instructions
could be converted from ".insn" encoding to instruction names.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Janis Schoetterl-Glausch [Fri, 11 Feb 2022 18:22:06 +0000 (19:22 +0100)]
s390/uaccess: Add copy_from/to_user_key functions
Add copy_from/to_user_key functions, which perform storage key checking.
These functions can be used by KVM for emulating instructions that need
to be key checked.
These functions differ from their non _key counterparts in
include/linux/uaccess.h only in the additional key argument and must be
kept in sync with those.
Since the existing uaccess implementation on s390 makes use of move
instructions that support having an additional access key supplied,
we can implement raw_copy_from/to_user_key by enhancing the
existing implementation.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20220211182215.2730017-2-scgl@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Sun, 6 Mar 2022 21:30:42 +0000 (22:30 +0100)]
s390/nospec: align and size extern thunks
Kernel has full control over how extern thunks generated by
arch/s390/lib/expoline.S look like. Align them to 16 bytes like other
symbols. Also set proper symbols size which is important for tooling.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Sun, 6 Mar 2022 19:56:07 +0000 (20:56 +0100)]
s390/nospec: add an option to use thunk-extern
Currently with -mindirect-branch=thunk and -mfunction-return=thunk compiler
options expoline thunks are put into individual COMDAT group sections. s390
is the only architecture which has group sections and it has implications
for kpatch and objtool tools support.
Using -mindirect-branch=thunk-extern and -mfunction-return=thunk-extern
is an alternative, which comes with a need to generate all required
expoline thunks manually. Unfortunately modules area is too far away from
the kernel image, and expolines from the kernel image cannon be used.
But since all new distributions (except Debian) build kernels for machine
generations newer than z10, where "exrl" instruction is available, that
leaves only 16 expolines thunks possible.
Provide an option to build the kernel with
-mindirect-branch=thunk-extern and -mfunction-return=thunk-extern for
z10 or newer. This also requires to postlink expoline thunks into all
modules explicitly. Currently modules already contain most expolines
anyhow.
Unfortunately -mindirect-branch=thunk-extern and
-mfunction-return=thunk-extern options support is broken in gcc <= 11.2.
Additional compile test is required to verify proper gcc support.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Co-developed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Mon, 28 Feb 2022 12:15:59 +0000 (13:15 +0100)]
s390/nospec: generate single register thunks if possible
Currently assembler generated expoline thunks are always in a form
__s390_indirect_jump_rXuse_rX even when exrl instruction is available
and no additional register is utilized.
Generate __s390_indirect_jump_rX versions using a single register if the
kernel is built for z10 or newer machine, which have exrl instruction
available. Thunks generated are identical to the ones generated by the
compiler.
This helps to reduce the number of thunks for newer machines generations.
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Niklas Schnelle [Tue, 8 Mar 2022 09:49:58 +0000 (10:49 +0100)]
s390/pci: make zpci_set_irq()/zpci_clear_irq() static
Commit
c1e18c17bda68 ("s390/pci: add zpci_set_irq()/zpci_clear_irq()")
made zpci_set_irq()/zpci_clear_irq() non-static in preparation for using
them in zpci_hot_reset_device(). The version of zpci_hot_reset_device()
that was finally merged however exploits the fact that IRQs and DMA is
implicitly disabled by clp_disable_fh() so the call to zpci_clear_irq()
was never added. There are no other calls outside pci_irq.c so lets make
both functions static.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Thu, 24 Feb 2022 15:43:23 +0000 (16:43 +0100)]
s390: remove unused expoline to BC instructions
This reverts commit
6deaa3bbca80 ("s390: extend expoline to BC
instructions"). Expolines to BC instructions were added to be utilized
by commit
de5cb6eb514e ("s390: use expoline thunks in the BPF JIT"). But
corresponding code has been removed by commit
e1cf4befa297 ("bpf, s390x:
remove ld_abs/ld_ind"). And compiler does not generate such expolines as
well.
Compared to regular expolines, expolines to BC instructions contain
displacement and all possible variations cannot be generated in advance,
making kpatch support more complicated. So, remove those to avoid
future usages.
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Sun, 6 Mar 2022 10:11:05 +0000 (11:11 +0100)]
s390/irq: use assignment instead of cast
Change struct ext_code to contain a union which allows to simply
assign the int_code instead of using a cast.
In order to keep the patch small the anonymous union is embedded
within the existing struct instead of changing the struct ext_code to
a union.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Sun, 6 Mar 2022 09:59:05 +0000 (10:59 +0100)]
s390/traps: get rid of magic cast for per code
Add a proper union in lowcore to reflect architecture and get rid of a
"magic" cast in order to read the full per code.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Sun, 6 Mar 2022 10:15:27 +0000 (11:15 +0100)]
s390/traps: get rid of magic cast for program interruption code
Add a proper union in lowcore to reflect architecture and get rid of a
"magic" cast in order to read the full program interruption code.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Fri, 4 Mar 2022 14:15:33 +0000 (15:15 +0100)]
s390/signal: fix typo in comments
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Fri, 4 Mar 2022 14:14:06 +0000 (15:14 +0100)]
s390/asm-offsets: remove unused defines
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Thu, 3 Mar 2022 15:38:34 +0000 (16:38 +0100)]
s390/test_unwind: avoid build warning with W=1
Fix the following build warning with W=1
arch/s390/lib/test_unwind.c:172:21: warning: variable 'fops' set but not used [-Wunused-but-set-variable]
struct ftrace_ops *fops;
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 17:36:46 +0000 (18:36 +0100)]
s390: remove .fixup section
The only user is gone. Remove the section.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Sun, 27 Feb 2022 20:32:54 +0000 (21:32 +0100)]
s390/bpf: encode register within extable entry
Instead of decoding the instruction that faulted to get the register
which needs to be zeroed, simply encode its number into the extable
entries during code generation. This allows to get rid of a bit of
code, and is also what other architectures are doing.
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 14:02:46 +0000 (15:02 +0100)]
s390/extable: add dedicated uaccess handler
This is more or less a combination of commit
2e77a62cb3a6 ("arm64:
extable: add a dedicated uaccess handler") and commit
4b5305decc84
("x86/extable: Extend extable functionality").
To describe the problem that needs to solved let's cite the full arm64
commit message:
------
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:
* Since the fixup code is anonymous, backtraces will symbolize fixups
as offsets from the nearest prior symbol, currently
`__entry_tramp_text_end`. This is confusing, and painful to debug
without access to the relevant vmlinux.
* Since the exception handler adjusts the PC to execute the fixup, and
the fixup uses a direct branch back into the function it fixes,
backtraces of fixups miss the original function. This is confusing,
and violates requirements for RELIABLE_STACKTRACE (and therefore
LIVEPATCH).
* Inline assembly and associated fixups are generated from templates,
and we have many copies of logically identical fixups which only
differ in which specific registers are written to and which address
is branched to at the end of the fixup. This is potentially wasteful
of I-cache resources, and makes it hard to add additional logic to
fixups without significant bloat.
This patch address all three concerns for inline uaccess fixups by
adding a dedicated exception handler which updates registers in
exception context and subsequent returns back into the function which
faulted, removing the need for fixups specialized to each faulting
instruction.
Other than backtracing, there should be no functional change as a result
of this patch.
------
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 13:52:42 +0000 (14:52 +0100)]
s390/extable: convert to relative table with data
Follow arm64, riscv, and x86 and change extable layout to common
"relative table with data". This allows to get rid of s390 specific
code in sorttable.c.
The main difference to before is that extable entries do not contain a
relative function pointer anymore. Instead data and type fields are
added.
The type field is used to indicate which exception handler needs to be
called, while the data field is currently unused.
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 13:29:25 +0000 (14:29 +0100)]
s390/extable: add and use fixup_exception helper function
Add and use fixup_exception helper function in order to remove the
duplicated exception handler fixup code at several places.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 12:31:33 +0000 (13:31 +0100)]
s390/base: pass pt_regs to early program check handler
Pass pt_regs to early program check handler like it is done for every
other interrupt and exception handler.
Also the passed pt_regs can be changed by the called function and the
changes register contents and psw contents will be taken into account
when returning. In addition the return psw will not be copied to the
program check old psw in lowcore, but to the usual return psw
location, like it is also done by the regular program check handler.
This allows also to get rid of the code that disabled lowcore
protection when changing the return address.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 10:37:52 +0000 (11:37 +0100)]
s390/extable: move extable related functions to mm/extable.c
Just like arm64, riscv, and x86 move extable related functions to
mm/extable.c. This is currently only one function, but this will
change with subsequent changes.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 10:22:12 +0000 (11:22 +0100)]
s390/extable: move EX_TABLE define to asm-extable.h
Follow arm64 and riscv and move the EX_TABLE define to asm-extable.h
which is a lot less generic than the current linkage.h.
Also make sure that all files which contain EX_TABLE usages actually
include the new header file. This should make sure that the files
always compile and there won't be any random compile breakage due to
other header file dependencies.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 09:53:34 +0000 (10:53 +0100)]
s390/extable: search amode31 extable last
It is very unlikely that an exception happens within the amode31 text
section, therefore safe a couple of cycles for the common case, and
search the amode31 extable last.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 28 Feb 2022 09:45:43 +0000 (10:45 +0100)]
s390/extable: sort amode31 extable early
The early program check handler is active before the amode31 extable
is sorted. Therefore in case a program check happens early within the
amode31 code the extable entry might not be found.
Fix this by sorting the amode31 extable early.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Halil Pasic [Tue, 14 Dec 2021 14:54:16 +0000 (15:54 +0100)]
s390/airq: use DMA memory for summary indicators
Protected virtualization guests have to use shared pages for airq
notifier bit vectors and summary bytes or bits, thus these need to be
allocated as DMA coherent memory. Commit
b50623e5db80 ("s390/airq: use
DMA memory for adapter interrupts") took care of the notifier bit
vectors, but omitted to take care of the summary bytes/bits.
In practice this omission is not a big deal, because the summary ain't
necessarily allocated here, but can be supplied by the driver. Currently
all the I/O we have for SE guests is virtio-ccw, and virtio-ccw uses a
self-allocated array of summary indicators.
Let us cover all our bases nevertheless!
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Jürgen Christ [Mon, 10 Jan 2022 12:33:30 +0000 (13:33 +0100)]
s390/zcrypt: Provide target domain for EP11 cprbs to scheduling function
The scheduling function will get an extension which will
process the target_id value from an EP11 cprb. This patch
extracts the value during preparation of the ap message.
Signed-off-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Harald Freudenberger [Tue, 23 Nov 2021 15:02:47 +0000 (16:02 +0100)]
s390/zcrypt: change reply buffer size offering
Instead of offering the user space given receive buffer size to
the crypto card firmware as limit for the reply message offer
the internal per queue reply buffer size. As the queue's reply
buffer is always adjusted to the max message size possible for
this card this may offer more buffer space. However, now it is
important to check the user space reply buffer on pushing back
the reply. If the reply does not fit into the user space provided
buffer the ioctl will fail with errno EMSGSIZE.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Harald Freudenberger [Tue, 23 Nov 2021 14:16:06 +0000 (15:16 +0100)]
s390/zcrypt: Support CPRB minor version T7
There is a new CPRB minor version T7 to be supported with
this patch. Together with this the functions which extract
the CPRB data from userspace and prepare the AP message do
now check the CPRB minor version and provide some info in
the flag field of the ap message struct for further processing.
The 3 functions doing this job have been renamed to
prep_cca_ap_msg, prep_ep11_ap_msg and prep_rng_ap_msg to
reflect their job better (old was get..fc).
This patch also introduces two new flags to be used internal
with the flag field of the struct ap_message:
AP_MSG_FLAG_USAGE is set when prep_cca_ap_msg or prep_ep11_ap_msg
come to the conclusion that this is a ordinary crypto load CPRB
(which means T2 for CCA CPRBs and no admin bit for EP11 CPRBs).
AP_MSG_FLAG_ADMIN is set when prep_cca_ap_msg or prep_ep11_ap_msg
think, this is an administrative (control) crypto load CPRB
(which means T3, T5, T6 or T7 for CCA CPRBs and admin bit set
for EP11 CPRBs).
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Harald Freudenberger [Wed, 17 Nov 2021 14:38:39 +0000 (15:38 +0100)]
s390/zcrypt: handle checkstopped cards with new state
A crypto card may be in checkstopped state. With this
patch this is handled as a new state in the ap card and
ap queue structs. There is also a new card sysfs attribute
/sys/devices/ap/cardxx/chkstop
and a new queue sysfs attribute
/sys/devices/ap/cardxx/xx.yyyy/chkstop
displaying the checkstop state of the card or queue. Please
note that the queue's checkstop state is only a copy of the
card's checkstop state but makes maintenance much easier.
The checkstop state expressed here is the result of an
RC 0x04 (CHECKSTOP) during an AP command, mostly the
PQAP(TAPQ) command which is 'testing' the queue.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Harald Freudenberger [Tue, 16 Nov 2021 13:54:19 +0000 (14:54 +0100)]
s390/zcrypt: CEX8S exploitation support
This patch adds CEX8 exploitation support for the AP bus code,
the zcrypt device driver zoo and the vfio device driver.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Harald Freudenberger [Thu, 11 Nov 2021 13:31:46 +0000 (14:31 +0100)]
s390/ap/zcrypt: debug feature improvements
This patch adds some debug feature improvements related
to some failures happened in the past. With CEX8 the max
request and response sizes have been extended but the
user space applications did not rework their code and
thus ran into receive buffer issues. This ffdc patch
here helps with additional checks and debug feature
messages in debugging and pointing to the root cause of
some failures related to wrong buffer sizes.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 20:25:19 +0000 (21:25 +0100)]
s390/mm: convert pte_val()/pXd_val() into functions
Disallow constructs like this:
pte_val(*pte) = __pa(addr) | prot;
which would directly write into a page table. Users are supposed to
use the set_pte()/set_pXd() primitives, which guarantee block
concurrent (aka atomic) writes.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 20:25:09 +0000 (21:25 +0100)]
s390/mm,gmap: don't use pte_val()/pXd_val() as lvalue
Convert pgtable code so pte_val()/pXd_val() aren't used as lvalue
anymore. This allows in later step to convert pte_val()/pXd_val() to
functions, which in turn makes it impossible to use these macros to
modify page table entries like they have been used before.
Therefore a construct like this:
pte_val(*pte) = __pa(addr) | prot;
which would directly write into a page table, isn't possible anymore
with the last step of this series.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 20:24:51 +0000 (21:24 +0100)]
s390/mm,hugetlb: don't use pte_val()/pXd_val() as lvalue
Convert pgtable code so pte_val()/pXd_val() aren't used as lvalue
anymore. This allows in later step to convert pte_val()/pXd_val() to
functions, which in turn makes it impossible to use these macros to
modify page table entries like they have been used before.
Therefore a construct like this:
pte_val(*pte) = __pa(addr) | prot;
which would directly write into a page table, isn't possible anymore
with the last step of this series.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 20:24:35 +0000 (21:24 +0100)]
s390/mm,pageattr: don't use pte_val()/pXd_val() as lvalue
Convert pgtable code so pte_val()/pXd_val() aren't used as lvalue
anymore. This allows in later step to convert pte_val()/pXd_val() to
functions, which in turn makes it impossible to use these macros to
modify page table entries like they have been used before.
Therefore a construct like this:
pte_val(*pte) = __pa(addr) | prot;
which would directly write into a page table, isn't possible anymore
with the last step of this series.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 20:24:01 +0000 (21:24 +0100)]
s390/mm,pgtable: don't use pte_val()/pXd_val() as lvalue
Convert pgtable code so pte_val()/pXd_val() aren't used as lvalue
anymore. This allows in later step to convert pte_val()/pXd_val() to
functions, which in turn makes it impossible to use these macros to
modify page table entries like they have been used before.
Therefore a construct like this:
pte_val(*pte) = __pa(addr) | prot;
which would directly write into a page table, isn't possible anymore
with the last step of this series.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 19:50:07 +0000 (20:50 +0100)]
s390/mm: use set_pXd()/set_pte() helper functions everywhere
Use the new set_pXd()/set_pte() helper functions at all places where
page table entries are modified.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 21 Feb 2022 20:18:29 +0000 (21:18 +0100)]
s390/mm: add set_pte_bit()/clear_pte_bit() helper functions
Add set_pte_bit()/clear_pte_bit() and set_pXd_bit()/clear_pXd_bit
helper functions which are supposed to be used if bits within
ptes/pXds are set/cleared.
The only point of these helper functions is to get more readable
code. This is quite similar to what arm64 has.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Thu, 10 Feb 2022 15:08:29 +0000 (16:08 +0100)]
s390/mm: add set_pXd()/set_pte() helper functions
Add set_pXd()/set_pte() helper functions which must be used to update
page table entries. The new helpers use WRITE_ONCE() to make sure that
a page table entry is written to only once.
Without this the compiler could otherwise generate code which writes
several times to a page table entry when updating its contents from
invalid to valid, which could lead to surprising results especially
for multithreaded processes...
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Fri, 25 Feb 2022 08:41:24 +0000 (09:41 +0100)]
s390/entry: remove unused expoline thunk
Remove __s390_indirect_jump_r13use_r14 expoline thunk unused since
commit
fbbdfca5c553 ("s390/entry.S: factor out SIEEXIT macro").
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Tue, 22 Feb 2022 14:27:52 +0000 (15:27 +0100)]
s390/ftrace: make use of epsw to get psw mask
Finally use epsw to create a complete psw mask within pt_regs. Without
this only some bits are correct, while other bits are (incorrectly)
always zero.
The epsw instruction is quite heavy weight, however given that this
only effects ftrace_regs_caller this seems to be the right thing, so
we finally get a complete psw mask for ftrace kprobed functions.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Thu, 17 Feb 2022 14:46:01 +0000 (15:46 +0100)]
s390/ptrace: remove opencoded offsetof
Remove opencoded offsetof and use offsetof instead.
The generated code is identical before/after this change.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Gordeev [Tue, 15 Feb 2022 13:10:48 +0000 (14:10 +0100)]
s390/smp: sort out physical vs virtual pointers usage
With commit
5789284710aa ("s390/smp: reallocate IPL CPU lowcore")
virtual addresses are wrongly passed to memblock_free_late() and
SPX instructions on IPL CPU reinitialization.
Note: this does not fix a bug currently, since virtual and
physical addresses are identical.
Fixes:
5789284710aa ("s390/smp: reallocate IPL CPU lowcore")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Harald Freudenberger [Wed, 16 Feb 2022 11:30:34 +0000 (12:30 +0100)]
s390/ap: enable sysfs attribute scans to force AP bus rescan
This patch switches the sysfs attribute /sys/bus/ap/scans
from read-only to read-write. If there is something written
to this attribute, an AP bus rescan is forced. If an AP
bus scan is triggered this way a debug feature entry line
reports this in /sys/kernel/debug/s390dbf/ap/sprintf.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jakob Naucke <naucke@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tony Krowiak [Fri, 1 Oct 2021 17:39:13 +0000 (13:39 -0400)]
s390/ap: notify drivers on config changed and scan complete callbacks
This patch introduces an extension to the ap bus to notify device drivers
when the host AP configuration changes - i.e., adapters, domains or
control domains are added or removed. When an adapter or domain is added to
the host's AP configuration, the AP bus will create the associated queue
devices in the linux sysfs device model. Each new type 10 (i.e., CEX4) or
newer queue device with an APQN that is not reserved for the default device
driver will get bound to the vfio_ap device driver. Likewise, whan an
adapter or domain is removed from the host's AP configuration, the AP bus
will remove the associated queue devices from the sysfs device model. Each
of the queues that is bound to the vfio_ap device driver will get unbound.
With the introduction of hot plug support, binding or unbinding of a
queue device will result in plugging or unplugging one or more queues from
a guest that is using the queue. If there are multiple changes to the
host's AP configuration, it could result in the probe and remove callbacks
getting invoked multiple times. Each time queues are plugged into or
unplugged from a guest, the guest's VCPUs must be taken out of SIE.
If this occurs multiple times due to changes in the host's AP
configuration, that can have an undesirable negative affect on the guest's
performance.
To alleviate this problem, this patch introduces two new callbacks: one to
notify the vfio_ap device driver when the AP bus scan routine detects a
change to the host's AP configuration; and, one to notify the driver when
the AP bus is done scanning. This will allow the vfio_ap driver to do
bulk processing of all affected adapters, domains and control domains for
affected guests rather than plugging or unplugging them one at a time when
the probe or remove callback is invoked. The two new callbacks are:
void (*on_config_changed)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
This callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan. This is done by storing
an old and current QCI info struct and comparing them. If there is any
difference, the callback is invoked.
void (*on_scan_complete)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
The on_scan_complete callback is invoked after the ap bus scan is
completed if the host AP configuration data has changed.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tony Krowiak [Fri, 1 Feb 2019 21:21:11 +0000 (16:21 -0500)]
s390/ap: driver callback to indicate resource in use
Introduces a new driver callback to prevent a root user from re-assigning
the APQN of a queue that is in use by a non-default host device driver to
a default host device driver and vice versa. The callback will be invoked
whenever a change to the AP bus's sysfs apmask or aqmask attributes would
result in one or more APQNs being re-assigned. If the callback responds
in the affirmative for any driver queried, the change to the apmask or
aqmask will be rejected with a device busy error.
For this patch, only non-default drivers will be queried. Currently,
there is only one non-default driver, the vfio_ap device driver. The
vfio_ap device driver facilitates pass-through of an AP queue to a
guest. The idea here is that a guest may be administered by a different
sysadmin than the host and we don't want AP resources to unexpectedly
disappear from a guest's AP configuration (i.e., adapters and domains
assigned to the matrix mdev). This will enforce the proper procedure for
removing AP resources intended for guest usage which is to
first unassign them from the matrix mdev, then unbind them from the
vfio_ap device driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Wed, 2 Feb 2022 23:49:41 +0000 (00:49 +0100)]
s390/test_unwind: fix and extend kprobes test
Running kprobe test on a kernel built with clang 14 didn't actually
trigger pgm_pre_handler() and no unwinder code was called. Even though
do_report_trap() is a global symbol, clang inlined it in several local
functions including illegal_op() handler, so that kprobbing a global
symbol didn't have a desired effect.
To achieve the same test result (unwinding from a program check
handler) introduce a local function and probe an instruction in the
middle, so that kprobe doesn't take KPROBE_ON_FTRACE path.
While at it, add another test for KPROBE_ON_FTRACE.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Mon, 31 Jan 2022 18:06:52 +0000 (19:06 +0100)]
s390/test_unwind: add ftrace test
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Tue, 1 Feb 2022 20:04:16 +0000 (21:04 +0100)]
s390/test_unwind: add "backtrace" module parameter
By default no backtraces are printed when a test succeeds, but sometimes
it is useful to spot issues automated test doesn't cover. Add "backtrace"
module parameter to force it.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Mon, 31 Jan 2022 18:00:56 +0000 (19:00 +0100)]
s390/test_unwind: minor cleanup
- make current_test static
- use current_test consistently
- add TEST_WITH_FLAGS macro to contract parametrized tests definition
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Tue, 1 Feb 2022 18:54:22 +0000 (19:54 +0100)]
s390/test_unwind: show tests as skipped if unsupported
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Fri, 28 Jan 2022 23:34:13 +0000 (00:34 +0100)]
s390: always use the packed stack layout
-mpacked-stack option has been supported by both minimum
gcc and clang versions for a while. With commit
e2bc3e91d91e
("scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0
for s390") minimum clang version now also supports a combination
of flags -mpacked-stack -mbackchain -pg -mfentry and fulfills
all requirements to always enable the packed stack layout.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vasily Gorbik [Tue, 1 Mar 2022 20:02:48 +0000 (21:02 +0100)]
Merge branch 'fixes' into features
This helps to avoid several merge conflicts later.
* fixes:
s390/extable: fix exception table sorting
s390/ftrace: fix arch_ftrace_get_regs implementation
s390/ftrace: fix ftrace_caller/ftrace_regs_caller generation
s390/setup: preserve memory at OLDMEM_BASE and OLDMEM_SIZE
s390/cio: verify the driver availability for path_event call
s390/module: fix building test_modules_helpers.o with clang
MAINTAINERS: downgrade myself to Reviewer for s390
MAINTAINERS: add Alexander Gordeev as maintainer for s390
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Thu, 24 Feb 2022 21:03:29 +0000 (22:03 +0100)]
s390/extable: fix exception table sorting
s390 has a swap_ex_entry_fixup function, however it is not being used
since common code expects a swap_ex_entry_fixup define. If it is not
defined the default implementation will be used. So fix this by adding
a proper define.
However also the implementation of the function must be fixed, since a
NULL value for handler has a special meaning and must not be adjusted.
Luckily all of this doesn't fix a real bug currently: the main extable
is correctly sorted during build time, and for runtime sorting there
is currently no case where the handler field is not NULL.
Fixes:
05a68e892e89 ("s390/kernel: expand exception table logic to allow new handling options")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Tue, 22 Feb 2022 13:53:47 +0000 (14:53 +0100)]
s390/ftrace: fix arch_ftrace_get_regs implementation
arch_ftrace_get_regs is supposed to return a struct pt_regs pointer
only if the pt_regs structure contains all register contents, which
means it must have been populated when created via ftrace_regs_caller.
If it was populated via ftrace_caller the contents are not complete
(the psw mask part is missing), and therefore a NULL pointer needs be
returned.
The current code incorrectly always returns a struct pt_regs pointer.
Fix this by adding another pt_regs flag which indicates if the
contents are complete, and fix arch_ftrace_get_regs accordingly.
Fixes:
894979689d3a ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Wed, 23 Feb 2022 12:02:59 +0000 (13:02 +0100)]
s390/ftrace: fix ftrace_caller/ftrace_regs_caller generation
ftrace_caller was used for both ftrace_caller and ftrace_regs_caller,
which means that the target address of the hotpatch trampoline was
never updated.
With commit
894979689d3a ("s390/ftrace: provide separate
ftrace_caller/ftrace_regs_caller implementations") a separate
ftrace_regs_caller entry point was implemeted, however it was
forgotten to implement the necessary changes for ftrace_modify_call
and ftrace_make_call, where the branch target has to be modified
accordingly.
Therefore add the missing code now.
Fixes:
894979689d3a ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Egorenkov [Wed, 9 Feb 2022 10:25:09 +0000 (11:25 +0100)]
s390/setup: preserve memory at OLDMEM_BASE and OLDMEM_SIZE
We need to preserve the values at OLDMEM_BASE and OLDMEM_SIZE which are
used by zgetdump in case when kdump crashes. In that case zgetdump will
attempt to read OLDMEM_BASE and OLDMEM_SIZE in order to find out where
the memory range [0 - OLDMEM_SIZE] belonging to the production kernel is.
Fixes:
f1a546947431 ("s390/setup: don't reserve memory that occupied decompressor's head")
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Mon, 7 Feb 2022 13:02:18 +0000 (14:02 +0100)]
s390/mm: use CRST_ALLOC_ORDER instead of number
Use CRST_ALLOC_ORDER to make it more obvious what the order means,
and also to be consistent with other code, e.g. the vmemmap code.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Gordeev [Sat, 29 Jan 2022 08:24:50 +0000 (09:24 +0100)]
s390/maccess: fix semantics of memcpy_real() and its callers
There is a confusion with regard to the source address of
memcpy_real() and calling functions. While the declared
type for a source assumes a virtual address, in fact it
always called with physical address of the source.
This confusion led to bugs in copy_oldmem_kernel() and
copy_oldmem_user() functions, where __pa() macro applied
mistakenly to physical addresses. It does not lead to a
real issue, since virtual and physical addresses are
currently the same.
Fix both the bugs and memcpy_real() prototype by making
type of source address consistent to the function name
and the way it actually used.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Gordeev [Sat, 29 Jan 2022 07:38:56 +0000 (08:38 +0100)]
s390/dump: fix old lowcore virtual vs physical address confusion
Virtual addresses of vmcore_info and os_info members are
wrongly passed to copy_oldmem_kernel(), while the function
expects physical address of the source. Instead, __pa()
macro should have been applied.
Yet, use of __pa() macro could be somehow confusing, since
copy_oldmem_kernel() may treat the source as an offset, not
as a direct physical address (that depens from the oldmem
availability and location).
Fix the virtual vs physical address confusion and make the
way the old lowcore is read consistent across all sources.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Vineeth Vijayan [Wed, 2 Feb 2022 20:45:56 +0000 (21:45 +0100)]
s390/cio: verify the driver availability for path_event call
If no driver is attached to a device or the driver does not provide the
path_event function, an FCES path-event on this device could end up in a
kernel-panic. Verify the driver availability before the path_event
function call.
Fixes:
32ef938815c1 ("s390/cio: Add support for FCES status notification")
Cc: stable@vger.kernel.org
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Suggested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Thu, 3 Feb 2022 09:56:07 +0000 (10:56 +0100)]
s390/lgr: use simple assignment instead of memcpy
It is quite pointless to use memcpy to copy two bytes, besides that
this construct will also partially remove type and size sanity checks.
Therefore simply use an assignment.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Gordeev [Wed, 26 Jan 2022 12:47:59 +0000 (13:47 +0100)]
s390/dump: fix os_info virtual vs physical address confusion
Due to historical reasons os_info handling functions misuse
the notion of physical vs virtual addresses difference.
Note: this does not fix a bug currently, since virtual
and physical addresses are identical.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Gordeev [Wed, 26 Jan 2022 12:47:58 +0000 (13:47 +0100)]
s390/sclp_sdias: fix sclp_sdias_copy() virtual vs physical address confusion
Due to historical reasons sclp_sdias_copy() misuses
the notion of physical vs virtual addresses difference.
Note: this does not fix a bug currently, since virtual
and physical addresses are identical.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Alexander Gordeev [Fri, 21 Jan 2022 09:44:25 +0000 (10:44 +0100)]
s390/maccess: fix absolute lowcore virtual vs physical address confusion
Due to historical reasons memcpy_absolute() and friend functions
misuse the notion of physical vs virtual addresses difference.
Note: this does not fix a bug currently, since virtual and physical
addresses are identical.
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Sven Schnelle [Sun, 23 Jan 2022 19:20:09 +0000 (20:20 +0100)]
s390/ftrace: verify opcode before applying patch
commit
72b3942a173c ("scripts: ftrace - move the
sort-processing in ftrace_init") had the unexpected
side effect that wrong code locations were patched.
To prevent this from happening again, verify the
opcode before patching it.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Fri, 28 Jan 2022 11:10:57 +0000 (12:10 +0100)]
s390: remove invalid email address of Heiko Carstens
Remove my old invalid email address which can be found in a couple of
files. Instead of updating it, just remove my contact data completely
from source files.
We have git and other tools which allow to figure out who is responsible
for what with recent contact data.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tony Krowiak [Tue, 4 Jan 2022 20:44:13 +0000 (15:44 -0500)]
s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function
This patch adds s390dbf logging to the function that executes the
PQAP(AQIC) instruction on behalf of the guest to which the queue for which
interrupts are being enabled or disabled is attached.
Currently, the vfio_ap_irq_enable function sets status response code 06
(notification indicator byte address (nib) invalid) in the status word
when the vfio_pin_pages function - called to pin the page containing the
nib - returns an error or a different number of pages pinned than
requested.
Setting the response code returned to userspace without also logging a
message in the kernel makes it impossible to determine whether the response
was due to an error detected by the vfio_ap device driver or because the
response code was returned by the firmware in response to the PQAP(AQIC)
instruction.
In addition to logging a warning for the situation above, this patch adds
the following:
* A function to validate the nib address invoked prior to calling the
vfio_pin_pages function. This allows for logging a message informing
the reader of the reason the page containing the nib can not be pinned
if the nib address is not valid. Response code 06 (invalid nib address)
will be set in the status word returned to the guest from the
instruction.
* Checks the return value from the kvm_s390_gisc_register and logs a
message informing the reader of the failure. Status response code 08
(invalid gisa) will be set in the status word returned to the guest from
the PQAP(AQIC) instruction.
* Checks the status response code returned from execution of the PQAP(AQIC)
instruction and if it indicates an error, logs a message informing the
reader.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tony Krowiak [Thu, 4 Nov 2021 20:41:37 +0000 (16:41 -0400)]
s390/vfio-ap: add s390dbf logging to the handle_pqap function
This patch adds s390dbf logging to the function that handles interception
of the PQAP(AQIC) instruction. Several items of data are validated before
ultimately calling the functions that execute the PQAP(AQIC) instruction on
behalf of the guest to which the queue for which interrupts are being
enabled or disabled is attached.
Currently, the handle_pqap function sets status response code 01 (queue not
available) in the status word that is normally returned from the
PQAP(AQIC) instruction under the following conditions:
* Set when the function pointer to the handler is not set in the
kvm_s390_crypto object (i.e., the PQAP hook is not registered).
* Set when the KVM pointer is not set in the ap_matrix_mdev object
(i.e., the matrix mdev is not passed through to a guest).
* Set when the queue for which interrupts are being enabled or
disabled is either not bound to the vfio_ap device driver or not assigned
to the matrix mdev.
Setting the response code returned to userspace without also logging a
message in the kernel makes it impossible to determine whether the response
was due to an error detected by the vfio_ap device driver or because the
response code was returned by the firmware in response to the PQAP(AQIC)
instruction, so this patch logs a message to the s390dbf log for the
vfio_ap device driver for each of the situations described above.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tony Krowiak [Tue, 11 Jan 2022 15:19:16 +0000 (10:19 -0500)]
MAINTAINERS: update file path for S390 VFIO AP DRIVER
Changed the MAINTAINERS file to include the new
drivers/s390/crypto/vfio_ap_debug.h file path.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tony Krowiak [Tue, 26 Oct 2021 20:58:31 +0000 (16:58 -0400)]
s390-vfio-ap: introduces s390 kernel debug feature for vfio_ap device driver
Sets up an s390dbf debug log for the vfio_ap device driver for logging
events occurring during the lifetime of the driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Ilya Leoshkevich [Mon, 31 Jan 2022 13:17:11 +0000 (14:17 +0100)]
s390/module: fix building test_modules_helpers.o with clang
Move test_modules_return_* prototypes into a header file in order to
placate -Wmissing-prototypes.
Fixes:
90c5318795ee ("s390/module: test loading modules with a lot of relocations")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Christian Borntraeger [Thu, 27 Jan 2022 14:24:49 +0000 (15:24 +0100)]
MAINTAINERS: downgrade myself to Reviewer for s390
Now that Alexander Gordeev has volunteered to be a co-maintainer for
s390, I can act as a reviewer instead of being a maintainer for s390.
With Alexander, Heiko, and Vasily we are in really good shape.
I will continue to act as the maintainer for KVM on s390 together with
Janosch.
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Thu, 27 Jan 2022 14:06:31 +0000 (15:06 +0100)]
MAINTAINERS: add Alexander Gordeev as maintainer for s390
Change Alexander Gordeev's status so he is maintainer
instead of reviewer for s390.
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Linus Torvalds [Sun, 6 Feb 2022 20:20:50 +0000 (12:20 -0800)]
Linux 5.17-rc3
Linus Torvalds [Sun, 6 Feb 2022 18:34:45 +0000 (10:34 -0800)]
Merge tag 'ext4_for_linus_stable' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Various bug fixes for ext4 fast commit and inline data handling.
Also fix regression introduced as part of moving to the new mount API"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
fs/ext4: fix comments mentioning i_mutex
ext4: fix incorrect type issue during replay_del_range
jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}()
ext4: fix potential NULL pointer dereference in ext4_fill_super()
jbd2: refactor wait logic for transaction updates into a common function
jbd2: cleanup unused functions declarations from jbd2.h
ext4: fix error handling in ext4_fc_record_modified_inode()
ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()
ext4: fix error handling in ext4_restore_inline_data()
ext4: fast commit may miss file actions
ext4: fast commit may not fallback for ineligible commit
ext4: modify the logic of ext4_mb_new_blocks_simple
ext4: prevent used blocks from being allocated during fast commit replay
Linus Torvalds [Sun, 6 Feb 2022 18:18:23 +0000 (10:18 -0800)]
Merge tag 'perf-tools-fixes-for-v5.17-2022-02-06' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix display of grouped aliased events in 'perf stat'.
- Add missing branch_sample_type to perf_event_attr__fprintf().
- Apply correct label to user/kernel symbols in branch mode.
- Fix 'perf ftrace' system_wide tracing, it has to be set before
creating the maps.
- Return error if procfs isn't mounted for PID namespaces when
synthesizing records for pre-existing processes.
- Set error stream of objdump process for 'perf annotate' TUI, to avoid
garbling the screen.
- Add missing arm64 support to perf_mmap__read_self(), the kernel part
got into 5.17.
- Check for NULL pointer before dereference writing debug info about a
sample.
- Update UAPI copies for asound, perf_event, prctl and kvm headers.
- Fix a typo in bpf_counter_cgroup.c.
* tag 'perf-tools-fixes-for-v5.17-2022-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf ftrace: system_wide collection is not effective by default
libperf: Add arm64 support to perf_mmap__read_self()
tools include UAPI: Sync sound/asound.h copy with the kernel sources
perf stat: Fix display of grouped aliased events
perf tools: Apply correct label to user/kernel symbols in branch mode
perf bpf: Fix a typo in bpf_counter_cgroup.c
perf synthetic-events: Return error if procfs isn't mounted for PID namespaces
perf session: Check for NULL pointer before dereference
perf annotate: Set error stream of objdump process for TUI
perf tools: Add missing branch_sample_type to perf_event_attr__fprintf()
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers UAPI: Sync linux/prctl.h with the kernel sources
perf beauty: Make the prctl arg regexp more strict to cope with PR_SET_VMA
tools headers cpufeatures: Sync with the kernel sources
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
tools include UAPI: Sync sound/asound.h copy with the kernel sources
Linus Torvalds [Sun, 6 Feb 2022 18:11:14 +0000 (10:11 -0800)]
Merge tag 'perf_urgent_for_v5.17_rc3' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Intel/PT: filters could crash the kernel
- Intel: default disable the PMU for SMM, some new-ish EFI firmware has
started using CPL3 and the PMU CPL filters don't discriminate against
SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
cycles.
- Fixup for perf_event_attr::sig_data
* tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/pt: Fix crash with stop filters in single-range mode
perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
selftests/perf_events: Test modification of perf_event_attr::sig_data
perf: Copy perf_event_attr::sig_data on modification
x86/perf: Default set FREEZE_ON_SMI for all
Linus Torvalds [Sun, 6 Feb 2022 18:04:43 +0000 (10:04 -0800)]
Merge tag 'objtool_urgent_for_v5.17_rc3' of git://git./linux/kernel/git/tip/tip
Pull objtool fix from Borislav Petkov:
"Fix a potential truncated string warning triggered by gcc12"
* tag 'objtool_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix truncated string warning
Linus Torvalds [Sun, 6 Feb 2022 18:00:40 +0000 (10:00 -0800)]
Merge tag 'irq_urgent_for_v5.17_rc3' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
"Remove a bogus warning introduced by the recent PCI MSI irq affinity
overhaul"
* tag 'irq_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Remove bogus warning in pci_irq_get_affinity()
Linus Torvalds [Sun, 6 Feb 2022 17:57:39 +0000 (09:57 -0800)]
Merge tag 'edac_urgent_for_v5.17_rc3' of git://git./linux/kernel/git/ras/ras
Pull EDAC fixes from Borislav Petkov:
"Fix altera and xgene EDAC drivers to propagate the correct error code
from platform_get_irq() so that deferred probing still works"
* tag 'edac_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/xgene: Fix deferred probing
EDAC/altera: Fix deferred probing
Changbin Du [Thu, 27 Jan 2022 13:20:10 +0000 (21:20 +0800)]
perf ftrace: system_wide collection is not effective by default
The ftrace.target.system_wide must be set before invoking
evlist__create_maps(), otherwise it has no effect.
Fixes:
53be50282269b46c ("perf ftrace: Add 'latency' subcommand")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127132010.4836-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rob Herring [Tue, 1 Feb 2022 21:40:56 +0000 (15:40 -0600)]
libperf: Add arm64 support to perf_mmap__read_self()
Add the arm64 variants for read_perf_counter() and read_timestamp().
Unfortunately the counter number is encoded into the instruction, so the
code is a bit verbose to enumerate all possible counters.
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220201214056.702854-1-robh@kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Arnaldo Carvalho de Melo [Wed, 12 Feb 2020 14:04:23 +0000 (11:04 -0300)]
tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:
06feec6005c9d950 ("ASoC: hdmi-codec: Fix OOB memory accesses")
Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.
To silence this perf tools build warning:
Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/lkml/Yf+6OT+2eMrYDEeX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 5 Feb 2022 01:09:41 +0000 (17:09 -0800)]
perf stat: Fix display of grouped aliased events
An event may have a number of uncore aliases that when added to the
evlist are consecutive.
If there are multiple uncore events in a group then
parse_events__set_leader_for_uncore_aliase will reorder the evlist so
that events on the same PMU are adjacent.
The collect_all_aliases function assumes that aliases are in blocks so
that only the first counter is printed and all others are marked merged.
The reordering for groups breaks the assumption and so all counts are
printed.
This change removes the assumption from collect_all_aliases
that the events are in blocks and instead processes the entire evlist.
Before:
```
$ perf stat -e '{UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE,UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE},duration_time' -a -A -- sleep 1
Performance counter stats for 'system wide':
CPU0 256,866 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 494,413 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 967 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,738 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 285,161 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 429,920 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 955 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,443 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 310,753 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 416,657 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,231 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,573 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 416,067 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 405,966 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,481 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,447 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 312,911 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 408,154 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,086 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,380 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 333,994 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 370,349 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,287 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,335 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 188,107 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 302,423 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 701 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,070 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 307,221 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 383,642 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,036 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,158 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 318,479 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 821,545 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,028 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,550 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 227,618 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 372,272 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 903 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,456 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 376,783 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 419,827 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,406 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,453 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 286,583 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 429,956 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 999 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,436 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 313,867 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 370,159 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,114 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,291 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 342,083 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 409,111 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,399 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,684 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 365,828 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 376,037 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,378 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,411 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 382,456 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 621,743 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,232 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,955 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 342,316 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 385,067 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,176 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,268 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 373,588 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 386,163 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,394 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,464 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 381,206 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 546,891 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,266 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,712 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 221,176 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 392,069 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 831 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,456 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 355,401 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 705,595 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,235 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,216 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 371,436 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 428,103 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,306 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,442 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 384,352 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 504,200 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,468 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,860 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 228,856 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 287,976 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 832 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,060 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 215,121 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 334,162 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 681 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,026 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 296,179 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 436,083 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,084 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,525 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 262,296 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 416,573 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 986 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,533 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 285,852 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 359,842 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,073 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,326 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 303,379 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 367,222 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,008 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,156 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 273,487 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 425,449 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 932 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,367 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 297,596 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 414,793 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,140 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,601 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 342,365 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 360,422 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,291 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,342 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 327,196 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 580,858 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,122 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,014 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 296,564 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 452,817 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,087 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,694 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 375,002 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 389,393 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,478 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,540 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 365,213 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 594,685 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,401 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,222 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 1,000,749,060 ns duration_time
1.
000749060 seconds time elapsed
```
After:
```
Performance counter stats for 'system wide':
CPU0 20,547,434 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 45,202,862 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 82,001 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 159,688 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 1,000,464,828 ns duration_time
1.
000464828 seconds time elapsed
```
Fixes:
3cdc5c2cb924acb4 ("perf parse-events: Handle uncore event aliases in small groups properly")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Asaf Yaffe <asaf.yaffe@intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220205010941.1065469-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
German Gomez [Wed, 26 Jan 2022 10:59:26 +0000 (10:59 +0000)]
perf tools: Apply correct label to user/kernel symbols in branch mode
In branch mode, the branch symbols were being displayed with incorrect
cpumode labels. So fix this.
For example, before:
# perf record -b -a -- sleep 1
# perf report -b
Overhead Command Source Shared Object Source Symbol Target Symbol
0.08% swapper [kernel.kallsyms] [k] rcu_idle_enter [k] cpuidle_enter_state
==> 0.08% cmd0 [kernel.kallsyms] [.] psi_group_change [.] psi_group_change
0.08% cmd1 [kernel.kallsyms] [k] psi_group_change [k] psi_group_change
After:
# perf report -b
Overhead Command Source Shared Object Source Symbol Target Symbol
0.08% swapper [kernel.kallsyms] [k] rcu_idle_enter [k] cpuidle_enter_state
0.08% cmd0 [kernel.kallsyms] [k] psi_group_change [k] pei_group_change
0.08% cmd1 [kernel.kallsyms] [k] psi_group_change [k] psi_group_change
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220126105927.3411216-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masanari Iida [Sat, 25 Dec 2021 00:55:58 +0000 (09:55 +0900)]
perf bpf: Fix a typo in bpf_counter_cgroup.c
This patch fixes a spelling typo in error message.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211225005558.503935-1-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Fri, 24 Dec 2021 12:40:13 +0000 (20:40 +0800)]
perf synthetic-events: Return error if procfs isn't mounted for PID namespaces
For perf recording, it retrieves process info by iterating nodes in proc
fs. If we run perf in a non-root PID namespace with command:
# unshare --fork --pid perf record -e cycles -a -- test_program
... in this case, unshare command creates a child PID namespace and
launches perf tool in it, but the issue is the proc fs is not mounted
for the non-root PID namespace, this leads to the perf tool gathering
process info from its parent PID namespace.
We can use below command to observe the process nodes under proc fs:
# unshare --pid --fork ls /proc
1 137 1968 2128 3 342 48 62 78 crypto kcore net uptime
10 138 2 2142 30 35 49 63 8 devices keys pagetypeinfo version
11 139 20 2143 304 36 50 64 82 device-tree key-users partitions vmallocinfo
12 14 2011 22 305 37 51 65 83 diskstats kmsg self vmstat
128 140 2038 23 307 39 52 656 84 driver kpagecgroup slabinfo zoneinfo
129 15 2074 24 309 4 53 67 9 execdomains kpagecount softirqs
13 16 2094 241 31 40 54 68 asound fb kpageflags stat
130 164 2096 242 310 41 55 69 buddyinfo filesystems loadavg swaps
131 17 2098 25 317 42 56 70 bus fs locks sys
132 175 21 26 32 43 57 71 cgroups interrupts meminfo sysrq-trigger
133 179 2102 263 329 44 58 75 cmdline iomem misc sysvipc
134 1875 2103 27 330 45 59 76 config.gz ioports modules thread-self
135 19 2117 29 333 46 6 77 consoles irq mounts timer_list
136 1941 2121 298 34 47 60 773 cpuinfo kallsyms mtd tty
So it shows many existed tasks, since unshared command has not mounted
the proc fs for the new created PID namespace, it still accesses the
proc fs of the root PID namespace. This leads to two prominent issues:
- Firstly, PID values are mismatched between thread info and samples.
The gathered thread info are coming from the proc fs of the root PID
namespace, but samples record its PID from the child PID namespace.
- The second issue is profiled program 'test_program' returns its forked
PID number from the child PID namespace, perf tool wrongly uses this
PID number to retrieve the process info via the proc fs of the root
PID namespace.
To avoid issues, we need to mount proc fs for the child PID namespace
with the option '--mount-proc' when use unshare command:
# unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
Conversely, when the proc fs of the root PID namespace is used by child
namespace, perf tool can detect the multiple PID levels and
nsinfo__is_in_root_namespace() returns false, this patch reports error
for this case:
# unshare --fork --pid perf record -e cycles -a -- test_program
Couldn't synthesize bpf events.
Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20211224124014.2492751-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ameer Hamza [Tue, 25 Jan 2022 12:11:41 +0000 (17:11 +0500)]
perf session: Check for NULL pointer before dereference
Move NULL pointer check before dereferencing the variable.
Addresses-Coverity: 1497622 ("Derereference before null check")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/r/20220125121141.18347-1-amhamza.mgc@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 2 Feb 2022 07:08:25 +0000 (23:08 -0800)]
perf annotate: Set error stream of objdump process for TUI
The stderr should be set to a pipe when using TUI. Otherwise it'd
print to stdout and break TUI windows with an error message.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220202070828.143303-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Anshuman Khandual [Wed, 2 Feb 2022 10:57:23 +0000 (16:27 +0530)]
perf tools: Add missing branch_sample_type to perf_event_attr__fprintf()
This updates branch sample type with missing PERF_SAMPLE_BRANCH_TYPE_SAVE.
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/1643799443-15109-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Sun, 9 May 2021 12:39:02 +0000 (09:39 -0300)]
tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:
f6c6804c43fa18d3 ("kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h")
That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.
This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/Yf+4k5Fs5Q3HdSG9@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Sun, 6 Feb 2022 11:28:34 +0000 (08:28 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf/urgent
To check if more kernel API sync is needed and also to see if the perf
build tests continue to pass.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Sat, 5 Feb 2022 18:40:17 +0000 (10:40 -0800)]
Merge tag 'for-linus-5.17a-rc3-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- documentation fixes related to Xen
- enable x2apic mode when available when running as hardware
virtualized guest under Xen
- cleanup and fix a corner case of vcpu enumeration when running a
paravirtualized Xen guest
* tag 'for-linus-5.17a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/Xen: streamline (and fix) PV CPU enumeration
xen: update missing ioctl magic numers documentation
Improve docs for IOCTL_GNTDEV_MAP_GRANT_REF
xen: xenbus_dev.h: delete incorrect file name
xen/x2apic: enable x2apic mode when supported for HVM
Linus Torvalds [Sat, 5 Feb 2022 17:55:59 +0000 (09:55 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- A couple of fixes when handling an exception while a SError has
been delivered
- Workaround for Cortex-A510's single-step erratum
RISC-V:
- Make CY, TM, and IR counters accessible in VU mode
- Fix SBI implementation version
x86:
- Report deprecation of x87 features in supported CPUID
- Preparation for fixing an interrupt delivery race on AMD hardware
- Sparse fix
All except POWER and s390:
- Rework guest entry code to correctly mark noinstr areas and fix
vtime' accounting (for x86, this was already mostly correct but not
entirely; for ARM, MIPS and RISC-V it wasn't)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
KVM: x86: Report deprecated x87 features in supported CPUID
KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
KVM: arm64: Avoid consuming a stale esr value when SError occur
RISC-V: KVM: Fix SBI implementation version
RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
kvm/riscv: rework guest entry logic
kvm/arm64: rework guest entry logic
kvm/x86: rework guest entry logic
kvm/mips: rework guest entry logic
kvm: add guest_state_{enter,exit}_irqoff()
KVM: x86: Move delivery of non-APICv interrupt into vendor code
kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
Linus Torvalds [Sat, 5 Feb 2022 17:21:55 +0000 (09:21 -0800)]
Merge tag 'xfs-5.17-fixes-1' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"I was auditing operations in XFS that clear file privileges, and
realized that XFS' fallocate implementation drops suid/sgid but
doesn't clear file capabilities the same way that file writes and
reflink do.
There are VFS helpers that do it correctly, so refactor XFS to use
them. I also noticed that we weren't flushing the log at the correct
point in the fallocate operation, so that's fixed too.
Summary:
- Fix fallocate so that it drops all file privileges when files are
modified instead of open-coding that incompletely.
- Fix fallocate to flush the log if the caller wanted synchronous
file updates"
* tag 'xfs-5.17-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: ensure log flush at the end of a synchronous fallocate call
xfs: move xfs_update_prealloc_flags() to xfs_pnfs.c
xfs: set prealloc flag in xfs_alloc_file_space()
xfs: fallocate() should call file_modified()
xfs: remove XFS_PREALLOC_SYNC
xfs: reject crazy array sizes being fed to XFS_IOC_GETBMAP*
Linus Torvalds [Sat, 5 Feb 2022 17:13:51 +0000 (09:13 -0800)]
Merge tag 'vfs-5.17-fixes-2' of git://git./fs/xfs/xfs-linux
Pull vfs fixes from Darrick Wong:
"I was auditing the sync_fs code paths recently and noticed that most
callers of ->sync_fs ignore its return value (and many implementations
never return nonzero even if the fs is broken!), which means that
internal fs errors and corruption are not passed up to userspace
callers of syncfs(2) or FIFREEZE. Hence fixing the common code and
XFS, and I'll start working on the ext4/btrfs folks if this is merged.
Summary:
- Fix a bug where callers of ->sync_fs (e.g. sync_filesystem and
syncfs(2)) ignore the return value.
- Fix a bug where callers of sync_filesystem (e.g. fs freeze) ignore
the return value.
- Fix a bug in XFS where xfs_fs_sync_fs never passed back error
returns"
* tag 'vfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: return errors in xfs_fs_sync_fs
quota: make dquot_quota_sync return errors from ->sync_fs
vfs: make sync_filesystem return errors from ->sync_fs
vfs: make freeze_super abort when sync_filesystem returns error