platform/kernel/linux-rpi.git
2 years agoKVM: selftests: Fix target thread to be migrated in rseq_test
Gavin Shan [Tue, 19 Jul 2022 02:08:30 +0000 (10:08 +0800)]
KVM: selftests: Fix target thread to be migrated in rseq_test

In rseq_test, there are two threads, which are vCPU thread and migration
worker separately. Unfortunately, the test has the wrong PID passed to
sched_setaffinity() in the migration worker. It forces migration on the
migration worker because zeroed PID represents the calling thread, which
is the migration worker itself. It means the vCPU thread is never enforced
to migration and it can migrate at any time, which eventually leads to
failure as the following logs show.

  host# uname -r
  5.19.0-rc6-gavin+
  host# # cat /proc/cpuinfo | grep processor | tail -n 1
  processor    : 223
  host# pwd
  /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
  host# for i in `seq 1 100`; do \
        echo "--------> $i"; ./rseq_test; done
  --------> 1
  --------> 2
  --------> 3
  --------> 4
  --------> 5
  --------> 6
  ==== Test Assertion Failure ====
    rseq_test.c:265: rseq_cpu == cpu
    pid=3925 tid=3925 errno=4 - Interrupted system call
       1  0x0000000000401963: main at rseq_test.c:265 (discriminator 2)
       2  0x0000ffffb044affb: ?? ??:0
       3  0x0000ffffb044b0c7: ?? ??:0
       4  0x0000000000401a6f: _start at ??:?
    rseq CPU = 4, sched CPU = 27

Fix the issue by passing correct parameter, TID of the vCPU thread, to
sched_setaffinity() in the migration worker.

Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Message-Id: <20220719020830.3479482-1-gshan@redhat.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats
Oliver Upton [Tue, 19 Jul 2022 12:52:29 +0000 (12:52 +0000)]
KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats

commit 1b870fa5573e ("kvm: stats: tell userspace which values are
boolean") added a new stat unit (boolean) but failed to raise
KVM_STATS_UNIT_MAX.

Fix by pointing UNIT_MAX at the new max value of UNIT_BOOLEAN.

Fixes: 1b870fa5573e ("kvm: stats: tell userspace which values are boolean")
Reported-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20220719125229.2934273-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: emulate: do not adjust size of fastop and setcc subroutines
Paolo Bonzini [Fri, 15 Jul 2022 11:34:55 +0000 (07:34 -0400)]
KVM: emulate: do not adjust size of fastop and setcc subroutines

Instead of doing complicated calculations to find the size of the subroutines
(which are even more complicated because they need to be stringified into
an asm statement), just hardcode to 16.

It is less dense for a few combinations of IBT/SLS/retbleed, but it has
the advantage of being really simple.

Cc: stable@vger.kernel.org # 5.15.x: 84e7051c0bc1: x86/kvm: fix FASTOP_SIZE when return thunks are enabled
Cc: stable@vger.kernel.org
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: x86: Fully initialize 'struct kvm_lapic_irq' in kvm_pv_kick_cpu_op()
Vitaly Kuznetsov [Fri, 8 Jul 2022 12:51:47 +0000 (14:51 +0200)]
KVM: x86: Fully initialize 'struct kvm_lapic_irq' in kvm_pv_kick_cpu_op()

'vector' and 'trig_mode' fields of 'struct kvm_lapic_irq' are left
uninitialized in kvm_pv_kick_cpu_op(). While these fields are normally
not needed for APIC_DM_REMRD, they're still referenced by
__apic_accept_irq() for trace_kvm_apic_accept_irq(). Fully initialize
the structure to avoid consuming random stack memory.

Fixes: a183b638b61c ("KVM: x86: make apic_accept_irq tracepoint more generic")
Reported-by: syzbot+d6caa905917d353f0d07@syzkaller.appspotmail.com
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220708125147.593975-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoMerge commit 'kvm-vmx-nested-tsc-fix' into kvm-master
Paolo Bonzini [Thu, 14 Jul 2022 14:04:12 +0000 (10:04 -0400)]
Merge commit 'kvm-vmx-nested-tsc-fix' into kvm-master

Merge bugfix needed in both 5.19 (because it's bad) and 5.20 (because
it is a prerequisite to test new features).

2 years agoDocumentation: kvm: clarify histogram units
Paolo Bonzini [Thu, 14 Jul 2022 11:29:57 +0000 (07:29 -0400)]
Documentation: kvm: clarify histogram units

In the case of histogram statistics, the values are always sample
counts; the unit instead applies to the bucket range.  For example,
halt_poll_success_hist is a nanosecond statistic because the buckets are
for 0ns, 1ns, 2-3ns, 4-7ns etc.  There isn't really any other sensible
interpretation, but clarify this anyway in the Documentation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agokvm: stats: tell userspace which values are boolean
Paolo Bonzini [Thu, 14 Jul 2022 11:27:31 +0000 (07:27 -0400)]
kvm: stats: tell userspace which values are boolean

Some of the statistics values exported by KVM are always only 0 or 1.
It can be useful to export this fact to userspace so that it can track
them specially (for example by polling the value every now and then to
compute a % of time spent in a specific state).

Therefore, add "boolean value" as a new "unit".  While it is not exactly
a unit, it walks and quacks like one.  In particular, using the type
would be wrong because boolean values could be instantaneous or peak
values (e.g. "is the rmap allocated?") or even two-bucket histograms
(e.g. "number of posted vs. non-posted interrupt injections").

Suggested-by: Amneesh Singh <natto@weirdnatto.in>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agox86/kvm: fix FASTOP_SIZE when return thunks are enabled
Thadeu Lima de Souza Cascardo [Wed, 13 Jul 2022 17:12:41 +0000 (14:12 -0300)]
x86/kvm: fix FASTOP_SIZE when return thunks are enabled

The return thunk call makes the fastop functions larger, just like IBT
does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled.

Otherwise, functions will be incorrectly aligned and when computing their
position for differently sized operators, they will executed in the middle
or end of a function, which may as well be an int3, leading to a crash
like:

[   36.091116] int3: 0000 [#1] SMP NOPTI
[   36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44
[   36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[   36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.091191] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.091192] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.091195] PKRU: 55555554
[   36.091195] Call Trace:
[   36.091197]  <TASK>
[   36.091198]  ? fastop+0x5a/0xa0 [kvm]
[   36.091222]  x86_emulate_insn+0x7b8/0xe90 [kvm]
[   36.091244]  x86_emulate_instruction+0x2f4/0x630 [kvm]
[   36.091263]  ? kvm_arch_vcpu_load+0x7c/0x230 [kvm]
[   36.091283]  ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel]
[   36.091290]  complete_emulated_mmio+0x297/0x320 [kvm]
[   36.091310]  kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm]
[   36.091330]  kvm_vcpu_ioctl+0x29e/0x6d0 [kvm]
[   36.091344]  ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm]
[   36.091357]  ? __fget_files+0x86/0xc0
[   36.091362]  ? __fget_files+0x86/0xc0
[   36.091363]  __x64_sys_ioctl+0x92/0xd0
[   36.091366]  do_syscall_64+0x59/0xc0
[   36.091369]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091370]  ? do_syscall_64+0x69/0xc0
[   36.091371]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091372]  ? __x64_sys_writev+0x1c/0x30
[   36.091374]  ? do_syscall_64+0x69/0xc0
[   36.091374]  ? exit_to_user_mode_prepare+0x37/0xb0
[   36.091378]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091380]  ? do_syscall_64+0x69/0xc0
[   36.091381]  ? do_syscall_64+0x69/0xc0
[   36.091381]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   36.091384] RIP: 0033:0x7efdfe6d1aff
[   36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[   36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff
[   36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
[   36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff
[   36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000
[   36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[   36.091396]  </TASK>
[   36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover
[   36.123271] ---[ end trace db3c0ab5a48fabcc ]---
[   36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.123323] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.123324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.123327] PKRU: 55555554
[   36.123328] Kernel panic - not syncing: Fatal exception in interrupt
[   36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Fixes: aa3d480315ba ("x86: Use return-thunk in asm code")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Message-Id: <20220713171241.184026-1-cascardo@canonical.com>
Tested-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1
Vitaly Kuznetsov [Tue, 12 Jul 2022 13:50:09 +0000 (15:50 +0200)]
KVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1

Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
hang upon boot or shortly after when a non-default TSC frequency was
set for L1. The issue is observed on a host where TSC scaling is
supported. The problem appears to be that Windows doesn't use TSC
frequency for its guests even when the feature is advertised and KVM
filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
L1's. This leads to L2 running with the default frequency (matching
host's) while L1 is running with an altered one.

Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
it was set for L1. TSC_MULTIPLIER is already correctly computed and
written by prepare_vmcs02().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220712135009.952805-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoMerge tag 'kvm-riscv-fixes-5.19-2' of https://github.com/kvm-riscv/linux into HEAD
Paolo Bonzini [Thu, 14 Jul 2022 10:01:53 +0000 (06:01 -0400)]
Merge tag 'kvm-riscv-fixes-5.19-2' of https://github.com/kvm-riscv/linux into HEAD

 KVM/riscv fixes for 5.19, take #2

- Fix missing PAGE_PFN_MASK

- Fix SRCU deadlock caused by kvm_riscv_check_vcpu_requests()

2 years agovf/remap: return the amount of bytes actually deduplicated
Ansgar Lößer [Wed, 13 Jul 2022 18:51:44 +0000 (20:51 +0200)]
vf/remap: return the amount of bytes actually deduplicated

When using the FIDEDUPRANGE ioctl, in case of success the requested size
is returned. In some cases this might not be the actual amount of bytes
deduplicated.

This change modifies vfs_dedupe_file_range() to report the actual amount
of bytes deduplicated, instead of the requested amount.

Link: https://lore.kernel.org/linux-fsdevel/5548ef63-62f9-4f46-5793-03165ceccacc@tu-darmstadt.de/
Reported-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
Reported-by: Max Schlecht <max.schlecht@informatik.hu-berlin.de>
Reported-by: Björn Scheuermann <scheuermann@kom.tu-darmstadt.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Darrick J Wong <djwong@kernel.org>
Signed-off-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 13 Jul 2022 18:47:01 +0000 (11:47 -0700)]
Merge tag 'cgroup-for-5.19-rc6-fixes' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
 "Fix an old and subtle bug in the migration path.

  css_sets are used to track tasks and migrations are tasks moving from
  a group of css_sets to another group of css_sets. The migration path
  pins all source and destination css_sets in the prep stage.

  Unfortunately, it was overloading the same list_head entry to track
  sources and destinations, which got confused for migrations which are
  partially identity leading to use-after-frees.

  Fixed by using dedicated list_heads for tracking sources and
  destinations"

* tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Use separate src/dst nodes when preloading css_sets for migration

2 years agofs/remap: constrain dedupe of EOF blocks
Dave Chinner [Wed, 13 Jul 2022 07:49:15 +0000 (17:49 +1000)]
fs/remap: constrain dedupe of EOF blocks

If dedupe of an EOF block is not constrainted to match against only
other EOF blocks with the same EOF offset into the block, it can
match against any other block that has the same matching initial
bytes in it, even if the bytes beyond EOF in the source file do
not match.

Fix this by constraining the EOF block matching to only match
against other EOF blocks that have identical EOF offsets and data.
This allows "whole file dedupe" to continue to work without allowing
eof blocks to randomly match against partial full blocks with the
same data.

Reported-by: Ansgar Lößer <ansgar.loesser@tu-darmstadt.de>
Fixes: 1383a7ed6749 ("vfs: check file ranges before cloning files")
Link: https://lore.kernel.org/linux-fsdevel/a7c93559-4ba1-df2f-7a85-55a143696405@tu-darmstadt.de/
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'trace-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 12 Jul 2022 23:17:40 +0000 (16:17 -0700)]
Merge tag 'trace-v5.19-rc5' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Fixes and minor clean ups for tracing:

   - Fix memory leak by reverting what was thought to be a double free.

     A static tool had gave a false positive that a double free was
     possible in the error path, but it was actually a different
     location that confused the static analyzer (and those of us that
     reviewed it).

   - Move use of static buffers by ftrace_dump() to a location that can
     be used by kgdb's ftdump(), as it needs it for the same reasons.

   - Clarify in the Kconfig description that function tracing has
     negligible impact on x86, but may have a bit bigger impact on other
     architectures.

   - Remove unnecessary extra semicolon in trace event.

   - Make a local variable static that is used in the fprobes sample

   - Use KSYM_NAME_LEN for length of function in kprobe sample and get
     rid of unneeded macro for the same purpose"

* tag 'trace-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  samples: Use KSYM_NAME_LEN for kprobes
  fprobe/samples: Make sample_probe static
  blk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra semicolon
  ftrace: Be more specific about arch impact when function tracer is enabled
  tracing: Fix sleeping while atomic in kdb ftdump
  tracing/histograms: Fix memory leak problem

2 years agosamples: Use KSYM_NAME_LEN for kprobes
Tiezhu Yang [Wed, 8 Jun 2022 01:23:22 +0000 (09:23 +0800)]
samples: Use KSYM_NAME_LEN for kprobes

It is better and enough to use KSYM_NAME_LEN for kprobes
in samples, no need to define and use the other values.

Link: https://lkml.kernel.org/r/1654651402-21552-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agofprobe/samples: Make sample_probe static
sunliming [Mon, 6 Jun 2022 07:56:59 +0000 (15:56 +0800)]
fprobe/samples: Make sample_probe static

This symbol is not used outside of fprobe_example.c, so marks it static.

Fixes the following warning:

sparse warnings: (new ones prefixed by >>)
>> samples/fprobe/fprobe_example.c:23:15: sparse: sparse: symbol 'sample_probe'
was not declared. Should it be static?

Link: https://lkml.kernel.org/r/20220606075659.674556-1-sunliming@kylinos.cn
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agoblk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra semicolon
Li kunyu [Wed, 29 Jun 2022 03:00:13 +0000 (11:00 +0800)]
blk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra semicolon

Remove extra semicolon.

Link: https://lkml.kernel.org/r/20220629030013.10362-1-kunyu@nfschina.com
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agoftrace: Be more specific about arch impact when function tracer is enabled
Steven Rostedt (Google) [Wed, 6 Jul 2022 20:12:31 +0000 (16:12 -0400)]
ftrace: Be more specific about arch impact when function tracer is enabled

It was brought up that on ARMv7, that because the FUNCTION_TRACER does not
use nops to keep function tracing disabled because of the use of a link
register, it does have some performance impact.

The start of functions when -pg is used to compile the kernel is:

push    {lr}
bl      8010e7c0 <__gnu_mcount_nc>

When function tracing is tuned off, it becomes:

push    {lr}
add   sp, sp, #4

Which just puts the stack back to its normal location. But these two
instructions at the start of every function does incur some overhead.

Be more honest in the Kconfig FUNCTION_TRACER description and specify that
the overhead being in the noise was x86 specific, but other architectures
may vary.

Link: https://lore.kernel.org/all/20220705105416.GE5208@pengutronix.de/
Link: https://lkml.kernel.org/r/20220706161231.085a83da@gandalf.local.home
Reported-by: Sascha Hauer <sha@pengutronix.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agotracing: Fix sleeping while atomic in kdb ftdump
Douglas Anderson [Sat, 9 Jul 2022 00:09:52 +0000 (17:09 -0700)]
tracing: Fix sleeping while atomic in kdb ftdump

If you drop into kdb and type "ftdump" you'll get a sleeping while
atomic warning from memory allocation in trace_find_next_entry().

This appears to have been caused by commit ff895103a84a ("tracing:
Save off entry when peeking at next entry"), which added the
allocation in that path. The problematic commit was already fixed by
commit 8e99cf91b99b ("tracing: Do not allocate buffer in
trace_find_next_entry() in atomic") but that fix missed the kdb case.

The fix here is easy: just move the assignment of the static buffer to
the place where it should have been to begin with:
trace_init_global_iter(). That function is called in two places, once
is right before the assignment of the static buffer added by the
previous fix and once is in kdb.

Note that it appears that there's a second static buffer that we need
to assign that was added in commit efbbdaa22bb7 ("tracing: Show real
address for trace event arguments"), so we'll move that too.

Link: https://lkml.kernel.org/r/20220708170919.1.I75844e5038d9425add2ad853a608cb44bb39df40@changeid
Fixes: ff895103a84a ("tracing: Save off entry when peeking at next entry")
Fixes: efbbdaa22bb7 ("tracing: Show real address for trace event arguments")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agotracing/histograms: Fix memory leak problem
Zheng Yejian [Mon, 11 Jul 2022 01:47:31 +0000 (09:47 +0800)]
tracing/histograms: Fix memory leak problem

This reverts commit 46bbe5c671e06f070428b9be142cc4ee5cedebac.

As commit 46bbe5c671e0 ("tracing: fix double free") said, the
"double free" problem reported by clang static analyzer is:
  > In parse_var_defs() if there is a problem allocating
  > var_defs.expr, the earlier var_defs.name is freed.
  > This free is duplicated by free_var_defs() which frees
  > the rest of the list.

However, if there is a problem allocating N-th var_defs.expr:
  + in parse_var_defs(), the freed 'earlier var_defs.name' is
    actually the N-th var_defs.name;
  + then in free_var_defs(), the names from 0th to (N-1)-th are freed;

                        IF ALLOCATING PROBLEM HAPPENED HERE!!! -+
                                                                 \
                                                                  |
          0th           1th                 (N-1)-th      N-th    V
          +-------------+-------------+-----+-------------+-----------
var_defs: | name | expr | name | expr | ... | name | expr | name | ///
          +-------------+-------------+-----+-------------+-----------

These two frees don't act on same name, so there was no "double free"
problem before. Conversely, after that commit, we get a "memory leak"
problem because the above "N-th var_defs.name" is not freed.

If enable CONFIG_DEBUG_KMEMLEAK and inject a fault at where the N-th
var_defs.expr allocated, then execute on shell like:
  $ echo 'hist:key=call_site:val=$v1,$v2:v1=bytes_req,v2=bytes_alloc' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger

Then kmemleak reports:
  unreferenced object 0xffff8fb100ef3518 (size 8):
    comm "bash", pid 196, jiffies 4295681690 (age 28.538s)
    hex dump (first 8 bytes):
      76 31 00 00 b1 8f ff ff                          v1......
    backtrace:
      [<0000000038fe4895>] kstrdup+0x2d/0x60
      [<00000000c99c049a>] event_hist_trigger_parse+0x206f/0x20e0
      [<00000000ae70d2cc>] trigger_process_regex+0xc0/0x110
      [<0000000066737a4c>] event_trigger_write+0x75/0xd0
      [<000000007341e40c>] vfs_write+0xbb/0x2a0
      [<0000000087fde4c2>] ksys_write+0x59/0xd0
      [<00000000581e9cdf>] do_syscall_64+0x3a/0x80
      [<00000000cf3b065c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Link: https://lkml.kernel.org/r/20220711014731.69520-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes: 46bbe5c671e0 ("tracing: fix double free")
Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 years agoMerge tag 'ovl-fixes-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszer...
Linus Torvalds [Tue, 12 Jul 2022 15:59:35 +0000 (08:59 -0700)]
Merge tag 'ovl-fixes-5.19-rc7' of git://git./linux/kernel/git/mszeredi/vfs

Pull overlayfs fix from Miklos Szeredi:
 "Add a temporary fix for posix acls on idmapped mounts introduced in
  this cycle. A proper fix will be added in the next cycle"

* tag 'ovl-fixes-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: turn off SB_POSIXACL with idmapped layers temporarily

2 years agoMerge tag 'drm-fixes-2022-07-12' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Tue, 12 Jul 2022 15:52:15 +0000 (08:52 -0700)]
Merge tag 'drm-fixes-2022-07-12' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "I see you picked up one of the fbdev fixes, this is the other stuff
  that was queued up last week.

  A bit of a scattering of fixes, three for i915, one amdgpu, and a
  couple of panfrost, rockchip, panel and bridge ones.

  amdgpu:
   - Hibernation fix

  dma-buf:
   - fix use after free of fence

  i915:
   - Fix a possible refcount leak in DP MST connector (Hangyu)
   - Fix on loading guc on ADL-N (Daniele)
   - Fix vm use-after-free in vma destruction (Thomas)

  bridge:
   - fsl-ldb : 3 LVDS modesetting fixes

  rockchip:
   - iommu domain fix

  panfrost:
   - fix memory corruption
   - error path fix

  panel:
   - orientation quirk fix for Yoga tablet 2

  ssd130x:
   - fix pre-charge period setting"

* tag 'drm-fixes-2022-07-12' of git://anongit.freedesktop.org/drm/drm:
  drm/ssd130x: Fix pre-charge period setting
  dma-buf: Fix one use-after-free of fence
  drm/i915: Fix vm use-after-free in vma destruction
  drm/i915/guc: ADL-N should use the same GuC FW as ADL-S
  drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
  drm/amdgpu/display: disable prefer_shadow for generic fb helpers
  drm/amdgpu: keep fbdev buffers pinned during suspend
  drm/panfrost: Fix shrinker list corruption by madvise IOCTL
  drm/panfrost: Put mapping instead of shmem obj on panfrost_mmu_map_fault_addr() error
  drm/rockchip: Detach from ARM DMA domain in attach_device
  drm/bridge: fsl-ldb: Drop DE signal polarity inversion
  drm/bridge: fsl-ldb: Enable split mode for LVDS dual link
  drm/bridge: fsl-ldb: Fix mode clock rate validation
  drm/aperture: Run fbdev removal before internal helpers
  drm: panel-orientation-quirks: Add quirk for the Lenovo Yoga Tablet 2 830

2 years agoMerge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 12 Jul 2022 15:40:09 +0000 (08:40 -0700)]
Merge tag 'x86_bugs_retbleed' of git://git./linux/kernel/git/tip/tip

Pull lockdep fix for x86 retbleed from Borislav Petkov:

 - Fix lockdep complaint for __static_call_fixup()

* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/static_call: Serialize __static_call_fixup() properly

2 years agox86/static_call: Serialize __static_call_fixup() properly
Thomas Gleixner [Tue, 12 Jul 2022 12:01:06 +0000 (14:01 +0200)]
x86/static_call: Serialize __static_call_fixup() properly

__static_call_fixup() invokes __static_call_transform() without holding
text_mutex, which causes lockdep to complain in text_poke_bp().

Adding the proper locking cures that, but as this is either used during
early boot or during module finalizing, it's not required to use
text_poke_bp(). Add an argument to __static_call_transform() which tells
it to use text_poke_early() for it.

Fixes: ee88d363d156 ("x86,static_call: Use alternative RET encoding")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agoMerge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Tue, 12 Jul 2022 01:15:25 +0000 (18:15 -0700)]
Merge tag 'x86_bugs_retbleed' of git://git./linux/kernel/git/tip/tip

Pull x86 retbleed fixes from Borislav Petkov:
 "Just when you thought that all the speculation bugs were addressed and
  solved and the nightmare is complete, here's the next one: speculating
  after RET instructions and leaking privileged information using the
  now pretty much classical covert channels.

  It is called RETBleed and the mitigation effort and controlling
  functionality has been modelled similar to what already existing
  mitigations provide"

* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  x86/speculation: Disable RRSBA behavior
  x86/kexec: Disable RET on kexec
  x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
  x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
  x86/bugs: Add Cannon lake to RETBleed affected CPU list
  x86/retbleed: Add fine grained Kconfig knobs
  x86/cpu/amd: Enumerate BTC_NO
  x86/common: Stamp out the stepping madness
  KVM: VMX: Prevent RSB underflow before vmenter
  x86/speculation: Fill RSB on vmexit for IBRS
  KVM: VMX: Fix IBRS handling after vmexit
  KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
  KVM: VMX: Convert launched argument to flags
  KVM: VMX: Flatten __vmx_vcpu_run()
  objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
  x86/speculation: Remove x86_spec_ctrl_mask
  x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
  x86/speculation: Fix SPEC_CTRL write on SMT state change
  x86/speculation: Fix firmware entry SPEC_CTRL handling
  x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
  ...

2 years agoMerge tag 'drm-misc-fixes-2022-07-07-1' of ssh://git.freedesktop.org/git/drm/drm...
Dave Airlie [Tue, 12 Jul 2022 00:43:49 +0000 (10:43 +1000)]
Merge tag 'drm-misc-fixes-2022-07-07-1' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes

Three mode setting fixes for fsl-ldb, a fbdev removal use-after-free fix,
a dma-buf fence use-after-free fix, a DMA setup fix for rockchip, an error
path fix and memory corruption fix for panfrost and one more orientation
quirk

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708054306.wr6jcfdunuypftbq@houat
2 years agoMerge tag 'drm-intel-fixes-2022-07-07' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Tue, 12 Jul 2022 00:40:24 +0000 (10:40 +1000)]
Merge tag 'drm-intel-fixes-2022-07-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix a possible refcount leak in DP MST connector (Hangyu)
- Fix on loading guc on ADL-N (Daniele)
- Fix vm use-after-free in vma destruction (Thomas)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YsbbgWnLTR8fr4lj@intel.com
2 years agoMerge tag 'amd-drm-fixes-5.19-2022-07-06' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Tue, 12 Jul 2022 00:34:42 +0000 (10:34 +1000)]
Merge tag 'amd-drm-fixes-5.19-2022-07-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.19-2022-07-06:

amdgpu:
- Hibernation fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707024421.5773-1-alexander.deucher@amd.com
2 years agoMerge tag 'for-5.19-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Mon, 11 Jul 2022 21:41:44 +0000 (14:41 -0700)]
Merge tag 'for-5.19-rc6-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A more fixes that seem to me to be important enough to get merged
  before release:

   - in zoned mode, fix leak of a structure when reading zone info, this
     happens on normal path so this can be significant

   - in zoned mode, revert an optimization added in 5.19-rc1 to finish a
     zone when the capacity is full, but this is not reliable in all
     cases

   - try to avoid short reads for compressed data or inline files when
     it's a NOWAIT read, applications should handle that but there are
     two, qemu and mariadb, that are affected"

* tag 'for-5.19-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: drop optimization of zone finish
  btrfs: zoned: fix a leaked bioc in read_zone_info
  btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents

2 years agoMerge tags 'free-mq_sysctls-for-v5.19' and 'ptrace_unfreeze_fix-for-v5.19' of git...
Linus Torvalds [Mon, 11 Jul 2022 21:33:41 +0000 (14:33 -0700)]
Merge tags 'free-mq_sysctls-for-v5.19' and 'ptrace_unfreeze_fix-for-v5.19' of git://git./linux/kernel/git/ebiederm/user-namespace

Pull ipc namespace fix from Eric Biederman:
 "This fixes a bug with error handling if ipc creation fails that was
  reported by syzbot"

For completeness, this also pulls the ptrace_unfreeze_fix tag that
contains the original version of one of the hotfixes that I manually
applied earlier so that it would be fixed in rc6.

* tag 'free-mq_sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ipc: Free mq_sysctls if ipc namespace creation failed

* tag 'ptrace_unfreeze_fix-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()

2 years agoMerge tag 'mm-hotfixes-stable-2022-07-11' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Mon, 11 Jul 2022 19:49:56 +0000 (12:49 -0700)]
Merge tag 'mm-hotfixes-stable-2022-07-11' of git://git./linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "Mainly MM fixes. About half for issues which were introduced after
  5.18 and the remainder for longer-term issues"

* tag 'mm-hotfixes-stable-2022-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm: split huge PUD on wp_huge_pud fallback
  nilfs2: fix incorrect masking of permission flags for symlinks
  mm/rmap: fix dereferencing invalid subpage pointer in try_to_migrate_one()
  riscv/mm: fix build error while PAGE_TABLE_CHECK enabled without MMU
  Documentation: highmem: use literal block for code example in highmem.h comment
  mm: sparsemem: fix missing higher order allocation splitting
  mm/damon: use set_huge_pte_at() to make huge pte old
  sh: convert nommu io{re,un}map() to static inline functions
  mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages

2 years agoMerge tag 'modules-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof...
Linus Torvalds [Mon, 11 Jul 2022 19:39:12 +0000 (12:39 -0700)]
Merge tag 'modules-5.19-rc7' of git://git./linux/kernel/git/mcgrof/linux

Pull module fixes from Luis Chamberlain:
 "Although most of the move of code in in v5.19-rc1 should have not
  introduced a regression patch review on one of the file changes
  captured a checkpatch warning which advised to use strscpy() and it
  caused a buffer overflow when an incorrect length is passed.

  Another change which checkpatch complained about was an odd RCU usage,
  but that was properly addressed in a separate patch to the move by
  Aaron. That caused a regression with PREEMPT_RT=y due to an unbounded
  latency.

  This series fixes both and adjusts documentation which we forgot to do
  for the move"

* tag 'modules-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module: kallsyms: Ensure preemption in add_kallsyms() with PREEMPT_RT
  doc: module: update file references
  module: Fix "warning: variable 'exit' set but not used"
  module: Fix selfAssignment cppcheck warning
  modules: Fix corruption of /proc/kallsyms

2 years agomodule: kallsyms: Ensure preemption in add_kallsyms() with PREEMPT_RT
Aaron Tomlin [Mon, 11 Jul 2022 17:17:19 +0000 (18:17 +0100)]
module: kallsyms: Ensure preemption in add_kallsyms() with PREEMPT_RT

The commit 08126db5ff73 ("module: kallsyms: Fix suspicious rcu usage")
under PREEMPT_RT=y, disabling preemption introduced an unbounded
latency since the loop is not fixed. This change caused a regression
since previously preemption was not disabled and we would dereference
RCU-protected pointers explicitly. That being said, these pointers
cannot change.

Before kallsyms-specific data is prepared/or set-up, we ensure that
the unformed module is known to be unique i.e. does not already exist
(see load_module()). Therefore, we can fix this by using the common and
more appropriate RCU flavour as this section of code can be safely
preempted.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Fixes: 08126db5ff73 ("module: kallsyms: Fix suspicious rcu usage")
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2 years agoMerge tag 'vfio-v5.19-rc7' of https://github.com/awilliam/linux-vfio
Linus Torvalds [Mon, 11 Jul 2022 17:02:03 +0000 (10:02 -0700)]
Merge tag 'vfio-v5.19-rc7' of https://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:

 - Move IOMMU test to unbreak no-iommu support (Jason Gunthorpe)

* tag 'vfio-v5.19-rc7' of https://github.com/awilliam/linux-vfio:
  vfio: Move IOMMU_CAP_CACHE_COHERENCY test to after we know we have a group

2 years agofix race between exit_itimers() and /proc/pid/timers
Oleg Nesterov [Mon, 11 Jul 2022 16:16:25 +0000 (18:16 +0200)]
fix race between exit_itimers() and /proc/pid/timers

As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal->posix_timers with ->siglock held.

Cc: <stable@vger.kernel.org>
Reported-by: chris@accessvector.net
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoRISC-V: KVM: Fix SRCU deadlock caused by kvm_riscv_check_vcpu_requests()
Anup Patel [Mon, 11 Jul 2022 04:06:32 +0000 (09:36 +0530)]
RISC-V: KVM: Fix SRCU deadlock caused by kvm_riscv_check_vcpu_requests()

The kvm_riscv_check_vcpu_requests() is called with SRCU read lock held
and for KVM_REQ_SLEEP request it will block the VCPU without releasing
SRCU read lock. This causes KVM ioctls (such as KVM_IOEVENTFD) from
other VCPUs of the same Guest/VM to hang/deadlock if there is any
synchronize_srcu() or synchronize_srcu_expedited() in the path.

To fix the above in kvm_riscv_check_vcpu_requests(), we should do SRCU
read unlock before blocking the VCPU and do SRCU read lock after VCPU
wakeup.

Fixes: cce69aff689e ("RISC-V: KVM: Implement VCPU interrupts and requests handling")
Reported-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2 years agoriscv: Fix missing PAGE_PFN_MASK
Alexandre Ghiti [Mon, 11 Jul 2022 03:59:51 +0000 (09:29 +0530)]
riscv: Fix missing PAGE_PFN_MASK

There are a bunch of functions that use the PFN from a page table entry
that end up with the svpbmt upper-bits because they are missing the newly
introduced PAGE_PFN_MASK which leads to wrong addresses conversions and
then crash: fix this by adding this mask.

Fixes: 100631b48ded ("riscv: Fix accessing pfn bits in PTEs for non-32bit variants")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2 years agoLinux 5.19-rc6
Linus Torvalds [Sun, 10 Jul 2022 21:40:51 +0000 (14:40 -0700)]
Linux 5.19-rc6

2 years agoMerge branch 'hot-fixes' (fixes for rc6)
Linus Torvalds [Sun, 10 Jul 2022 21:26:49 +0000 (14:26 -0700)]
Merge branch 'hot-fixes' (fixes for rc6)

This is a collection of three fixes for small annoyances.

Two of these are already pending in other trees, but I really don't want
to release another -rc with these issues pending, so I picked up the
patches for these things directly.  We'll end up with duplicate commits
eventually, I prefer that over having these issues pending.

The third one is just me getting rid of another BUG_ON() just because it
was reported and I dislike those things so much.

* merge 'hot-fixes' branch:
  ida: don't use BUG_ON() for debugging
  drm/aperture: Run fbdev removal before internal helpers
  ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()

2 years agoida: don't use BUG_ON() for debugging
Linus Torvalds [Sun, 10 Jul 2022 20:55:49 +0000 (13:55 -0700)]
ida: don't use BUG_ON() for debugging

This is another old BUG_ON() that just shouldn't exist (see also commit
a382f8fee42c: "signal handling: don't use BUG_ON() for debugging").

In fact, as Matthew Wilcox points out, this condition shouldn't really
even result in a warning, since a negative id allocation result is just
a normal allocation failure:

  "I wonder if we should even warn here -- sure, the caller is trying to
   free something that wasn't allocated, but we don't warn for
   kfree(NULL)"

and goes on to point out how that current error check is only causing
people to unnecessarily do their own index range checking before freeing
it.

This was noted by Itay Iellin, because the bluetooth HCI socket cookie
code does *not* do that range checking, and ends up just freeing the
error case too, triggering the BUG_ON().

The HCI code requires CAP_NET_RAW, and seems to just result in an ugly
splat, but there really is no reason to BUG_ON() here, and we have
generally striven for allocation models where it's always ok to just do

    free(alloc());

even if the allocation were to fail for some random reason (usually
obviously that "random" reason being some resource limit).

Fixes: 88eca0207cf1 ("ida: simplified functions for id allocation")
Reported-by: Itay Iellin <ieitayie@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'dmaengine-fix-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Sun, 10 Jul 2022 18:23:01 +0000 (11:23 -0700)]
Merge tag 'dmaengine-fix-5.19' of git://git./linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
 "One core fix for DMA_INTERRUPT and rest driver fixes.

  Core:

   - Revert verification of DMA_INTERRUPT capability as that was
     incorrect

  Bunch of driver fixes for:

   - ti: refcount and put_device leak

   - qcom_bam: runtime pm overflow

   - idxd: force wq context cleanup and call idxd_enable_system_pasid()
     on success

   - dw-axi-dmac: RMW on channel suspend register

   - imx-sdma: restart cyclic channel when enabled

   - at_xdma: error handling for at_xdmac_alloc_desc

   - pl330: lockdep warning

   - lgm: error handling path in probe

   - allwinner: Fix min/max typo in binding"

* tag 'dmaengine-fix-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dt-bindings: dma: allwinner,sun50i-a64-dma: Fix min/max typo
  dmaengine: lgm: Fix an error handling path in intel_ldma_probe()
  dmaengine: pl330: Fix lockdep warning about non-static key
  dmaengine: idxd: Only call idxd_enable_system_pasid() if succeeded in enabling SVA feature
  dmaengine: at_xdma: handle errors of at_xdmac_alloc_desc() correctly
  dmaengine: imx-sdma: only restart cyclic channel when enabled
  dmaengine: dw-axi-dmac: Fix RMW on channel suspend register
  dmaengine: idxd: force wq context cleanup on device disable path
  dmaengine: qcom: bam_dma: fix runtime PM underflow
  dmaengine: imx-sdma: Allow imx8m for imx7 FW revs
  dmaengine: Revert "dmaengine: add verification of DMA_INTERRUPT capability for dmatest"
  dmaengine: ti: Add missing put_device in ti_dra7_xbar_route_allocate
  dmaengine: ti: Fix refcount leak in ti_dra7_xbar_route_allocate

2 years agoMerge tag 'staging-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 10 Jul 2022 16:51:56 +0000 (09:51 -0700)]
Merge tag 'staging-5.19-rc6' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix for a reported problem that showed
  up in 5.19-rc1 in the wlan-ng driver. It has been in linux-next for a
  week with no reported problems"

* tag 'staging-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging/wlan-ng: get the correct struct hfa384x in work callback

2 years agoMerge tag 'char-misc-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Sun, 10 Jul 2022 16:45:29 +0000 (09:45 -0700)]
Merge tag 'char-misc-5.19-rc6' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are four small char/misc driver fixes for 5.19-rc6 to resolve
  some reported issues. They only affect two drivers:

   - rtsx_usb: fix for of-reported DMA warning error, the driver was
     handling memory buffers in odd ways, it has now been fixed up to be
     much simpler and correct by Shuah.

   - at25 eeprom driver bugfix for reported problem

  All of these have been in linux-next for a week with no reported
  problems"

* tag 'char-misc-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  misc: rtsx_usb: set return value in rsp_buf alloc err path
  misc: rtsx_usb: use separate command and response buffers
  misc: rtsx_usb: fix use of dma mapped buffer for usb bulk transfer
  eeprom: at25: Rework buggy read splitting

2 years agoMerge tag 'io_uring-5.19-2022-07-09' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 10 Jul 2022 16:14:54 +0000 (09:14 -0700)]
Merge tag 'io_uring-5.19-2022-07-09' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "A single fix for an issue that came up yesterday that we should plug
  for -rc6.

  This is a regression introduced in this cycle"

* tag 'io_uring-5.19-2022-07-09' of git://git.kernel.dk/linux-block:
  io_uring: check that we have a file table when allocating update slots

2 years agoMerge tag 'kbuild-fixes-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 10 Jul 2022 15:59:02 +0000 (08:59 -0700)]
Merge tag 'kbuild-fixes-v5.19-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Adjust gen_compile_commands.py to the format change of *.mod files

 - Remove unused macro in scripts/Makefile.modinst

* tag 'kbuild-fixes-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: remove unused cmd_none in scripts/Makefile.modinst
  gen_compile_commands: handle multiple lines per .mod file

2 years agoMerge tag 'irq_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 10 Jul 2022 15:52:12 +0000 (08:52 -0700)]
Merge tag 'irq_urgent_for_v5.19_rc6' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Gracefully handle failure to request MMIO resources in the GICv3
   driver

 - Make a static key static in the Apple AIC driver

 - Fix the Xilinx intc driver dependency on OF_ADDRESS

* tag 'irq_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/apple-aic: Make symbol 'use_fast_ipi' static
  irqchip/xilinx: Add explicit dependency on OF_ADDRESS
  irqchip/gicv3: Handle resource request failure consistently

2 years agoMerge tag 'x86_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 10 Jul 2022 15:43:52 +0000 (08:43 -0700)]
Merge tag 'x86_urgent_for_v5.19_rc6' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Prepare for and clear .brk early in order to address XenPV guests
   failures where the hypervisor verifies page tables and uninitialized
   data in that range leads to bogus failures in those checks

 - Add any potential setup_data entries supplied at boot to the identity
   pagetable mappings to prevent kexec kernel boot failures. Usually,
   this is not a problem for the normal kernel as those mappings are
   part of the initially mapped 2M pages but if kexec gets to allocate
   the second kernel somewhere else, those setup_data entries need to be
   mapped there too.

 - Fix objtool not to discard text references from the __tracepoints
   section so that ENDBR validation still works

 - Correct the setup_data types limit as it is user-visible, before 5.19
   releases

* tag 'x86_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Fix the setup data types max limit
  x86/ibt, objtool: Don't discard text references from tracepoint section
  x86/compressed/64: Add identity mappings for setup_data entries
  x86: Fix .brk attribute in linker script
  x86: Clear .brk area at early boot
  x86/xen: Use clear_bss() for Xen PV guests

2 years agokbuild: remove unused cmd_none in scripts/Makefile.modinst
Masahiro Yamada [Thu, 30 Jun 2022 08:09:35 +0000 (17:09 +0900)]
kbuild: remove unused cmd_none in scripts/Makefile.modinst

Commit 65ce9c38326e ("kbuild: move module strip/compression code into
scripts/Makefile.modinst") added this unused code.

Perhaps, I thought cmd_none was useful for CONFIG_MODULE_COMPRESS_NONE,
but I did not use it after all.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2 years agox86/boot: Fix the setup data types max limit
Borislav Petkov [Sun, 10 Jul 2022 09:15:47 +0000 (11:15 +0200)]
x86/boot: Fix the setup data types max limit

Commit in Fixes forgot to change the SETUP_TYPE_MAX definition which
contains the highest valid setup data type.

Correct that.

Fixes: 5ea98e01ab52 ("x86/boot: Add Confidential Computing type to setup_data")
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/ddba81dd-cc92-699c-5274-785396a17fb5@zytor.com
2 years agoMerge tag 'i2c-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 9 Jul 2022 18:20:15 +0000 (11:20 -0700)]
Merge tag 'i2c-for-5.19-rc6' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Two I2C driver bugfixes preventing resource leaks"

* tag 'i2c-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: cadence: Unregister the clk notifier in error path
  i2c: piix4: Fix a memory leak in the EFCH MMIO support

2 years agodrm/aperture: Run fbdev removal before internal helpers
Thomas Zimmermann [Fri, 17 Jun 2022 12:10:27 +0000 (14:10 +0200)]
drm/aperture: Run fbdev removal before internal helpers

Always run fbdev removal first to remove simpledrm via sysfb_disable().
This clears the internal state.

The later call to drm_aperture_detach_drivers() then does nothing.
Otherwise, with drm_aperture_detach_drivers() running first, the call to
sysfb_disable() uses inconsistent state.

Example backtrace show below:

  BUG: KASAN: use-after-free in device_del+0x79/0x5f0
  Read of size 8 at addr ffff888108185050 by task systemd-udevd/311
  CPU: 0 PID: 311 Comm: systemd-udevd Tainted: G            E     5.19.0-rc2-1-default+ #1689
  Hardware name: HP ProLiant DL120 G7, BIOS J01 04/21/2011
  Call Trace:
    device_del+0x79/0x5f0
    platform_device_del.part.0+0x19/0xe0
    platform_device_unregister+0x1c/0x30
    sysfb_disable+0x2d/0x70
    remove_conflicting_framebuffers+0x1c/0xf0
    remove_conflicting_pci_framebuffers+0x130/0x1a0
    drm_aperture_remove_conflicting_pci_framebuffers+0x86/0xb0
    mgag200_pci_probe+0x2d/0x140 [mgag200]

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 873eb3b11860 ("fbdev: Disable sysfb device registration when removing conflicting FBs")
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Changcheng Deng <deng.changcheng@zte.com.cn>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()
Sven Schnelle [Wed, 6 Jul 2022 10:16:25 +0000 (12:16 +0200)]
ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()

CI reported the following splat while running the strace testsuite:

  WARNING: CPU: 1 PID: 3570031 at kernel/ptrace.c:272 ptrace_check_attach+0x12e/0x178
  CPU: 1 PID: 3570031 Comm: strace Tainted: G           OE     5.19.0-20220624.rc3.git0.ee819a77d4e7.300.fc36.s390x #1
  Hardware name: IBM 3906 M04 704 (z/VM 7.1.0)
  Call Trace:
   [<00000000ab4b645a>] ptrace_check_attach+0x132/0x178
  ([<00000000ab4b6450>] ptrace_check_attach+0x128/0x178)
   [<00000000ab4b6cde>] __s390x_sys_ptrace+0x86/0x160
   [<00000000ac03fcec>] __do_syscall+0x1d4/0x200
   [<00000000ac04e312>] system_call+0x82/0xb0
  Last Breaking-Event-Address:
   [<00000000ab4ea3c8>] wait_task_inactive+0x98/0x190

This is because JOBCTL_TRACED is set, but the task is not in TASK_TRACED
state. Caused by ptrace_unfreeze_traced() which does:

task->jobctl &= ~TASK_TRACED

but it should be:

task->jobctl &= ~JOBCTL_TRACED

Fixes: 31cae1eaae4f ("sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'powerpc-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 9 Jul 2022 17:34:08 +0000 (10:34 -0700)]
Merge tag 'powerpc-5.19-5' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:

 - On Power8 bare metal, fix creation of RNG platform devices, which are
   needed for the /dev/hwrng driver to probe correctly.

Thanks to Jason A. Donenfeld, and Sachin Sant.

* tag 'powerpc-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv: delay rng platform device creation until later in boot

2 years agoio_uring: check that we have a file table when allocating update slots
Jens Axboe [Sat, 9 Jul 2022 13:02:10 +0000 (07:02 -0600)]
io_uring: check that we have a file table when allocating update slots

If IORING_FILE_INDEX_ALLOC is set asking for an allocated slot, the
helper doesn't check if we actually have a file table or not. The non
alloc path does do that correctly, and returns -ENXIO if we haven't set
one up.

Do the same for the allocated path, avoiding a NULL pointer dereference
when trying to find a free bit.

Fixes: a7c41b4687f5 ("io_uring: let IORING_OP_FILES_UPDATE support choosing fixed file slots")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agox86/speculation: Disable RRSBA behavior
Pawan Gupta [Fri, 8 Jul 2022 20:36:09 +0000 (13:36 -0700)]
x86/speculation: Disable RRSBA behavior

Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.

Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB-underflow, RET target prediction may
fallback to alternate predictors. As a result, RET's predicted target
may get influenced by branch history.

A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2).

For spectre v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, set
RRSBA_DIS_S also to protect RETs for RSB-underflow case.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agox86/kexec: Disable RET on kexec
Konrad Rzeszutek Wilk [Fri, 8 Jul 2022 17:10:11 +0000 (19:10 +0200)]
x86/kexec: Disable RET on kexec

All the invocations unroll to __x86_return_thunk and this file
must be PIC independent.

This fixes kexec on 64-bit AMD boxes.

  [ bp: Fix 32-bit build. ]

Reported-by: Edward Tran <edward.tran@oracle.com>
Reported-by: Awais Tanveer <awais.tanveer@oracle.com>
Suggested-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agoMerge tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jul 2022 23:08:48 +0000 (16:08 -0700)]
Merge tag 'fscache-fixes-20220708' of git://git./linux/kernel/git/dhowells/linux-fs

Pull fscache fixes from David Howells:

 - Fix a check in fscache_wait_on_volume_collision() in which the
   polarity is reversed. It should complain if a volume is still marked
   acquisition-pending after 20s, but instead complains if the mark has
   been cleared (ie. the condition has cleared).

   Also switch an open-coded test of the ACQUIRE_PENDING volume flag to
   use the helper function for consistency.

 - Not a fix per se, but neaten the code by using a helper to check for
   the DROPPED state.

 - Fix cachefiles's support for erofs to only flush requests associated
   with a released control file, not all requests.

 - Fix a race between one process invalidating an object in the cache
   and another process trying to look it up.

* tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: Fix invalidation/lookup race
  cachefiles: narrow the scope of flushed requests when releasing fd
  fscache: Introduce fscache_cookie_is_dropped()
  fscache: Fix if condition in fscache_wait_on_volume_collision()

2 years agoptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()
Sven Schnelle [Wed, 6 Jul 2022 10:16:25 +0000 (12:16 +0200)]
ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()

CI reported the following splat while running the strace testsuite:

[ 3976.640309] WARNING: CPU: 1 PID: 3570031 at kernel/ptrace.c:272 ptrace_check_attach+0x12e/0x178
[ 3976.640391] CPU: 1 PID: 3570031 Comm: strace Tainted: G           OE     5.19.0-20220624.rc3.git0.ee819a77d4e7.300.fc36.s390x #1
[ 3976.640410] Hardware name: IBM 3906 M04 704 (z/VM 7.1.0)
[ 3976.640452] Call Trace:
[ 3976.640454]  [<00000000ab4b645a>] ptrace_check_attach+0x132/0x178
[ 3976.640457] ([<00000000ab4b6450>] ptrace_check_attach+0x128/0x178)
[ 3976.640460]  [<00000000ab4b6cde>] __s390x_sys_ptrace+0x86/0x160
[ 3976.640463]  [<00000000ac03fcec>] __do_syscall+0x1d4/0x200
[ 3976.640468]  [<00000000ac04e312>] system_call+0x82/0xb0
[ 3976.640470] Last Breaking-Event-Address:
[ 3976.640471]  [<00000000ab4ea3c8>] wait_task_inactive+0x98/0x190

This is because JOBCTL_TRACED is set, but the task is not in TASK_TRACED
state. Caused by ptrace_unfreeze_traced() which does:

task->jobctl &= ~TASK_TRACED

but it should be:

task->jobctl &= ~JOBCTL_TRACED

Fixes: 31cae1eaae4f ("sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Link: https://lkml.kernel.org/r/20220706101625.2100298-1-svens@linux.ibm.com
Link: https://lkml.kernel.org/r/YrHA5UkJLornOdCz@li-4a3a4a4c-28e5-11b2-a85c-a8d192c6f089.ibm.com
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2101641
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Tested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2 years agoMerge tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 8 Jul 2022 20:05:56 +0000 (13:05 -0700)]
Merge tag 'acpi-5.19-rc6' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix two recent regressions related to CPPC support.

  Specifics:

   - Prevent _CPC from being used if the platform firmware does not
     confirm CPPC v2 support via _OSC (Mario Limonciello)

   - Allow systems with X86_FEATURE_CPPC set to use _CPC even if CPPC
     support cannot be agreed on via _OSC (Mario Limonciello)"

* tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported
  ACPI: CPPC: Only probe for _CPC if CPPC v2 is acked

2 years agoMerge tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 8 Jul 2022 20:01:04 +0000 (13:01 -0700)]
Merge tag 'pm-5.19-rc6' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a NULL pointer dereference in a devfreq driver and a runtime
  PM framework issue that may cause a supplier device to be suspended
  before its consumer.

  Specifics:

   - Fix NULL pointer dereference related to printing a diagnostic
     message in the exynos-bus devfreq driver (Christian Marangi)

   - Fix race condition in the runtime PM framework which in some cases
     may cause a supplier device to be suspended when its consumer is
     still active (Rafael Wysocki)"

* tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / devfreq: exynos-bus: Fix NULL pointer dereference
  PM: runtime: Fix supplier device management during consumer probe
  PM: runtime: Redefine pm_runtime_release_supplier()

2 years agoMerge tag 'cxl-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jul 2022 19:55:25 +0000 (12:55 -0700)]
Merge tag 'cxl-fixes-for-5.19-rc6' of git://git./linux/kernel/git/cxl/cxl

Pull cxl fixes from Vishal Verma:

 - Update MAINTAINERS for Ben's email

 - Fix cleanup of port devices on failure to probe driver

 - Fix endianness in get/set LSA mailbox command structures

 - Fix memregion_free() fallback definition

 - Fix missing variable payload checks in CXL cmd size validation

* tag 'cxl-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/mbox: Fix missing variable payload checks in cmd size validation
  memregion: Fix memregion_free() fallback definition
  cxl/mbox: Use __le32 in get,set_lsa mailbox structures
  cxl/core: Use is_endpoint_decoder
  cxl: Fix cleanup of port devices on failure to probe driver.
  MAINTAINERS: Update Ben's email address

2 years agoMerge tag 'iommu-fixes-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 8 Jul 2022 19:49:00 +0000 (12:49 -0700)]
Merge tag 'iommu-fixes-v5.19-rc5' of git://git./linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - fix device setup failures in the Intel VT-d driver when the PASID
   table is shared

 - fix Intel VT-d device hot-add failure due to wrong device notifier
   order

 - remove the old IOMMU mailing list from the MAINTAINERS file now that
   it has been retired

* tag 'iommu-fixes-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  MAINTAINERS: Remove iommu@lists.linux-foundation.org
  iommu/vt-d: Fix RID2PASID setup/teardown failure
  iommu/vt-d: Fix PCI bus rescan device hot add

2 years agoMerge tag 'gpio-fixes-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 8 Jul 2022 19:39:52 +0000 (12:39 -0700)]
Merge tag 'gpio-fixes-for-v5.19-rc6' of git://git./linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a build error in gpio-vf610

 - fix a null-pointer dereference in the GPIO character device code

* tag 'gpio-fixes-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: cdev: fix null pointer dereference in linereq_free()
  gpio: vf610: fix compilation error

2 years agoMerge branch 'pm-core'
Rafael J. Wysocki [Fri, 8 Jul 2022 18:38:51 +0000 (20:38 +0200)]
Merge branch 'pm-core'

Merge a runtime PM framework cleanup and fix related to device links.

* pm-core:
  PM: runtime: Fix supplier device management during consumer probe
  PM: runtime: Redefine pm_runtime_release_supplier()

2 years agoMerge tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 8 Jul 2022 18:32:23 +0000 (11:32 -0700)]
Merge tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "NVMe pull request with another id quirk addition, and a tracing fix"

* tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
  nvme: use struct group for generic command dwords
  nvme-pci: phison e16 has bogus namespace ids

2 years agoMerge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 8 Jul 2022 18:25:01 +0000 (11:25 -0700)]
Merge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block

Pull io_uring tweak from Jens Axboe:
 "Just a minor tweak to an addition made in this release cycle: padding
  a 32-bit value that's in a 64-bit union to avoid any potential
  funkiness from that"

* tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
  io_uring: explicit sqe padding for ioctl commands

2 years agoMerge tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Fri, 8 Jul 2022 18:03:26 +0000 (11:03 -0700)]
Merge tag 'for-5.19/fbdev-3' of git://git./linux/kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:

 - fbcon now prevents switching to screen resolutions which are smaller
   than the font size, and prevents enabling a font which is bigger than
   the current screen resolution. This fixes vmalloc-out-of-bounds
   accesses found by KASAN.

 - Guiling Deng fixed a bug where the centered fbdev logo wasn't
   displayed correctly if the screen size matched the logo size.

 - Hsin-Yi Wang provided a patch to include errno.h to fix build when
   CONFIG_OF isn't enabled.

* tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
  fbmem: Check virtual screen sizes in fb_set_var()
  fbcon: Prevent that screen size is smaller than font size
  fbcon: Disallow setting font bigger than screen size
  video: of_display_timing.h: include errno.h
  fbdev: fbmem: Fix logo center image dx issue

2 years agobtrfs: zoned: drop optimization of zone finish
Naohiro Aota [Wed, 29 Jun 2022 02:00:38 +0000 (11:00 +0900)]
btrfs: zoned: drop optimization of zone finish

We have an optimization in do_zone_finish() to send REQ_OP_ZONE_FINISH only
when necessary, i.e. we don't send REQ_OP_ZONE_FINISH when we assume we
wrote fully into the zone.

The assumption is determined by "alloc_offset == capacity". This condition
won't work if the last ordered extent is canceled due to some errors. In
that case, we consider the zone is deactivated without sending the finish
command while it's still active.

This inconstancy results in activating another block group while we cannot
really activate the underlying zone, which causes the active zone exceeds
errors like below.

    BTRFS error (device nvme3n2): allocation failed flags 1, wanted 520192 tree-log 0, relocation: 0
    nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
    active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
    nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
    active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0

Fix the issue by removing the optimization for now.

Fixes: 8376d9e1ed8f ("btrfs: zoned: finish superblock zone once no space left for new SB")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agobtrfs: zoned: fix a leaked bioc in read_zone_info
Christoph Hellwig [Thu, 30 Jun 2022 16:03:19 +0000 (18:03 +0200)]
btrfs: zoned: fix a leaked bioc in read_zone_info

The bioc would leak on the normal completion path and also on the RAID56
check (but that one won't happen in practice due to the invalid
combination with zoned mode).

Fixes: 7db1c5d14dcd ("btrfs: zoned: support dev-replace in zoned filesystems")
CC: stable@vger.kernel.org # 5.16+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ update changelog ]
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agobtrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents
Filipe Manana [Mon, 4 Jul 2022 11:42:03 +0000 (12:42 +0100)]
btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents

When doing a direct IO read or write, we always return -ENOTBLK when we
find a compressed extent (or an inline extent) so that we fallback to
buffered IO. This however is not ideal in case we are in a NOWAIT context
(io_uring for example), because buffered IO can block and we currently
have no support for NOWAIT semantics for buffered IO, so if we need to
fallback to buffered IO we should first signal the caller that we may
need to block by returning -EAGAIN instead.

This behaviour can also result in short reads being returned to user
space, which although it's not incorrect and user space should be able
to deal with partial reads, it's somewhat surprising and even some popular
applications like QEMU (Link tag #1) and MariaDB (Link tag #2) don't
deal with short reads properly (or at all).

The short read case happens when we try to read from a range that has a
non-compressed and non-inline extent followed by a compressed extent.
After having read the first extent, when we find the compressed extent we
return -ENOTBLK from btrfs_dio_iomap_begin(), which results in iomap to
treat the request as a short read, returning 0 (success) and waiting for
previously submitted bios to complete (this happens at
fs/iomap/direct-io.c:__iomap_dio_rw()). After that, and while at
btrfs_file_read_iter(), we call filemap_read() to use buffered IO to
read the remaining data, and pass it the number of bytes we were able to
read with direct IO. Than at filemap_read() if we get a page fault error
when accessing the read buffer, we return a partial read instead of an
-EFAULT error, because the number of bytes previously read is greater
than zero.

So fix this by returning -EAGAIN for NOWAIT direct IO when we find a
compressed or an inline extent.

Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Link: https://lore.kernel.org/linux-btrfs/YrrFGO4A1jS0GI0G@atmark-techno.com/
Link: https://jira.mariadb.org/browse/MDEV-27900?focusedCommentId=216582&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-216582
Tested-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agoovl: turn of SB_POSIXACL with idmapped layers temporarily
Christian Brauner [Wed, 6 Jul 2022 13:56:11 +0000 (15:56 +0200)]
ovl: turn of SB_POSIXACL with idmapped layers temporarily

This cycle we added support for mounting overlayfs on top of idmapped
mounts.  Recently I've started looking into potential corner cases when
trying to add additional tests and I noticed that reporting for POSIX ACLs
is currently wrong when using idmapped layers with overlayfs mounted on top
of it.

I have sent out an patch that fixes this and makes POSIX ACLs work
correctly but the patch is a bit bigger and we're already at -rc5 so I
recommend we simply don't raise SB_POSIXACL when idmapped layers are
used. Then we can fix the VFS part described below for the next merge
window so we can have good exposure in -next.

I'm going to give a rather detailed explanation to both the origin of the
problem and mention the solution so people know what's going on.

Let's assume the user creates the following directory layout and they have
a rootfs /var/lib/lxc/c1/rootfs. The files in this rootfs are owned as you
would expect files on your host system to be owned. For example, ~/.bashrc
for your regular user would be owned by 1000:1000 and /root/.bashrc would
be owned by 0:0. IOW, this is just regular boring filesystem tree on an
ext4 or xfs filesystem.

The user chooses to set POSIX ACLs using the setfacl binary granting the
user with uid 4 read, write, and execute permissions for their .bashrc
file:

        setfacl -m u:4:rwx /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc

Now they to expose the whole rootfs to a container using an idmapped
mount. So they first create:

        mkdir -pv /vol/contpool/{ctrover,merge,lowermap,overmap}
        mkdir -pv /vol/contpool/ctrover/{over,work}
        chown 10000000:10000000 /vol/contpool/ctrover/{over,work}

The user now creates an idmapped mount for the rootfs:

        mount-idmapped/mount-idmapped --map-mount=b:0:10000000:65536 \
                                      /var/lib/lxc/c2/rootfs \
                                      /vol/contpool/lowermap

This for example makes it so that
/var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc which is owned by uid and gid
1000 as being owned by uid and gid 10001000 at
/vol/contpool/lowermap/home/ubuntu/.bashrc.

Assume the user wants to expose these idmapped mounts through an overlayfs
mount to a container.

       mount -t overlay overlay                      \
             -o lowerdir=/vol/contpool/lowermap,     \
                upperdir=/vol/contpool/overmap/over, \
                workdir=/vol/contpool/overmap/work   \
             /vol/contpool/merge

The user can do this in two ways:

(1) Mount overlayfs in the initial user namespace and expose it to the
    container.

(2) Mount overlayfs on top of the idmapped mounts inside of the container's
    user namespace.

Let's assume the user chooses the (1) option and mounts overlayfs on the
host and then changes into a container which uses the idmapping
0:10000000:65536 which is the same used for the two idmapped mounts.

Now the user tries to retrieve the POSIX ACLs using the getfacl command

        getfacl -n /vol/contpool/lowermap/home/ubuntu/.bashrc

and to their surprise they see:

        # file: vol/contpool/merge/home/ubuntu/.bashrc
        # owner: 1000
        # group: 1000
        user::rw-
        user:4294967295:rwx
        group::r--
        mask::rwx
        other::r--

indicating the uid wasn't correctly translated according to the idmapped
mount. The problem is how we currently translate POSIX ACLs. Let's inspect
the callchain in this example:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  -> __vfs_getxattr()
                  |     -> handler->get == ovl_posix_acl_xattr_get()
                  |        -> ovl_xattr_get()
                  |           -> vfs_getxattr()
                  |              -> __vfs_getxattr()
                  |                 -> handler->get() /* lower filesystem callback */
                  |> posix_acl_fix_xattr_to_user()
                     {
                              4 = make_kuid(&init_user_ns, 4);
                              4 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 4);
                              /* FAILURE */
                             -1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4);
                     }

If the user chooses to use option (2) and mounts overlayfs on top of
idmapped mounts inside the container things don't look that much better:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:10000000:65536

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  -> __vfs_getxattr()
                  |     -> handler->get == ovl_posix_acl_xattr_get()
                  |        -> ovl_xattr_get()
                  |           -> vfs_getxattr()
                  |              -> __vfs_getxattr()
                  |                 -> handler->get() /* lower filesystem callback */
                  |> posix_acl_fix_xattr_to_user()
                     {
                              4 = make_kuid(&init_user_ns, 4);
                              4 = mapped_kuid_fs(&init_user_ns, 4);
                              /* FAILURE */
                             -1 = from_kuid(0:10000000:65536 /* caller's idmapping */, 4);
                     }

As is easily seen the problem arises because the idmapping of the lower
mount isn't taken into account as all of this happens in do_gexattr(). But
do_getxattr() is always called on an overlayfs mount and inode and thus
cannot possible take the idmapping of the lower layers into account.

This problem is similar for fscaps but there the translation happens as
part of vfs_getxattr() already. Let's walk through an fscaps overlayfs
callchain:

        setcap 'cap_net_raw+ep' /var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc

The expected outcome here is that we'll receive the cap_net_raw capability
as we are able to map the uid associated with the fscap to 0 within our
container.  IOW, we want to see 0 as the result of the idmapping
translations.

If the user chooses option (1) we get the following callchain for fscaps:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                   -> vfs_getxattr()
                      -> xattr_getsecurity()
                         -> security_inode_getsecurity()                                       ________________________________
                            -> cap_inode_getsecurity()                                         |                              |
                               {                                                               V                              |
                                        10000000 = make_kuid(0:0:4k /* overlayfs idmapping */, 10000000);                     |
                                        10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000);                  |
                                               /* Expected result is 0 and thus that we own the fscap. */                     |
                                               0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000);            |
                               }                                                                                              |
                               -> vfs_getxattr_alloc()                                                                        |
                                  -> handler->get == ovl_other_xattr_get()                                                    |
                                     -> vfs_getxattr()                                                                        |
                                        -> xattr_getsecurity()                                                                |
                                           -> security_inode_getsecurity()                                                    |
                                              -> cap_inode_getsecurity()                                                      |
                                                 {                                                                            |
                                                                0 = make_kuid(0:0:4k /* lower s_user_ns */, 0);               |
                                                         10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0); |
                                                         10000000 = from_kuid(0:0:4k /* overlayfs idmapping */, 10000000);    |
                                                         |____________________________________________________________________|
                                                 }
                                                 -> vfs_getxattr_alloc()
                                                    -> handler->get == /* lower filesystem callback */

And if the user chooses option (2) we get:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:10000000:65536

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                   -> vfs_getxattr()
                      -> xattr_getsecurity()
                         -> security_inode_getsecurity()                                                _______________________________
                            -> cap_inode_getsecurity()                                                  |                             |
                               {                                                                        V                             |
                                       10000000 = make_kuid(0:10000000:65536 /* overlayfs idmapping */, 0);                           |
                                       10000000 = mapped_kuid_fs(0:0:4k /* no idmapped mount */, 10000000);                           |
                                               /* Expected result is 0 and thus that we own the fscap. */                             |
                                              0 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000000);                     |
                               }                                                                                                      |
                               -> vfs_getxattr_alloc()                                                                                |
                                  -> handler->get == ovl_other_xattr_get()                                                            |
                                    |-> vfs_getxattr()                                                                                |
                                        -> xattr_getsecurity()                                                                        |
                                           -> security_inode_getsecurity()                                                            |
                                              -> cap_inode_getsecurity()                                                              |
                                                 {                                                                                    |
                                                                 0 = make_kuid(0:0:4k /* lower s_user_ns */, 0);                      |
                                                          10000000 = mapped_kuid_fs(0:10000000:65536 /* idmapped mount */, 0);        |
                                                                 0 = from_kuid(0:10000000:65536 /* overlayfs idmapping */, 10000000); |
                                                                 |____________________________________________________________________|
                                                 }
                                                 -> vfs_getxattr_alloc()
                                                    -> handler->get == /* lower filesystem callback */

We can see how the translation happens correctly in those cases as the
conversion happens within the vfs_getxattr() helper.

For POSIX ACLs we need to do something similar. However, in contrast to
fscaps we cannot apply the fix directly to the kernel internal posix acl
data structure as this would alter the cached values and would also require
a rework of how we currently deal with POSIX ACLs in general which almost
never take the filesystem idmapping into account (the noteable exception
being FUSE but even there the implementation is special) and instead
retrieve the raw values based on the initial idmapping.

The correct values are then generated right before returning to
userspace. The fix for this is to move taking the mount's idmapping into
account directly in vfs_getxattr() instead of having it be part of
posix_acl_fix_xattr_to_user().

To this end we simply move the idmapped mount translation into a separate
step performed in vfs_{g,s}etxattr() instead of in
posix_acl_fix_xattr_{from,to}_user().

To see how this fixes things let's go back to the original example. Assume
the user chose option (1) and mounted overlayfs on top of idmapped mounts
on the host:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:0:4k /* initial idmapping */

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  |> __vfs_getxattr()
                  |  |  -> handler->get == ovl_posix_acl_xattr_get()
                  |  |     -> ovl_xattr_get()
                  |  |        -> vfs_getxattr()
                  |  |           |> __vfs_getxattr()
                  |  |           |  -> handler->get() /* lower filesystem callback */
                  |  |           |> posix_acl_getxattr_idmapped_mnt()
                  |  |              {
                  |  |                              4 = make_kuid(&init_user_ns, 4);
                  |  |                       10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4);
                  |  |                       10000004 = from_kuid(&init_user_ns, 10000004);
                  |  |                       |_______________________
                  |  |              }                               |
                  |  |                                              |
                  |  |> posix_acl_getxattr_idmapped_mnt()           |
                  |     {                                           |
                  |                                                 V
                  |             10000004 = make_kuid(&init_user_ns, 10000004);
                  |             10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004);
                  |             10000004 = from_kuid(&init_user_ns, 10000004);
                  |     }       |_________________________________________________
                  |                                                              |
                  |                                                              |
                  |> posix_acl_fix_xattr_to_user()                               |
                     {                                                           V
                                 10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004);
                                        /* SUCCESS */
                                        4 = from_kuid(0:10000000:65536 /* caller's idmapping */, 10000004);
                     }

And similarly if the user chooses option (1) and mounted overayfs on top of
idmapped mounts inside the container:

        idmapped mount /vol/contpool/merge:      0:10000000:65536
        caller's idmapping:                      0:10000000:65536
        overlayfs idmapping (ofs->creator_cred): 0:10000000:65536

        sys_getxattr()
        -> path_getxattr()
           -> getxattr()
              -> do_getxattr()
                  |> vfs_getxattr()
                  |  |> __vfs_getxattr()
                  |  |  -> handler->get == ovl_posix_acl_xattr_get()
                  |  |     -> ovl_xattr_get()
                  |  |        -> vfs_getxattr()
                  |  |           |> __vfs_getxattr()
                  |  |           |  -> handler->get() /* lower filesystem callback */
                  |  |           |> posix_acl_getxattr_idmapped_mnt()
                  |  |              {
                  |  |                              4 = make_kuid(&init_user_ns, 4);
                  |  |                       10000004 = mapped_kuid_fs(0:10000000:65536 /* lower idmapped mount */, 4);
                  |  |                       10000004 = from_kuid(&init_user_ns, 10000004);
                  |  |                       |_______________________
                  |  |              }                               |
                  |  |                                              |
                  |  |> posix_acl_getxattr_idmapped_mnt()           |
                  |     {                                           V
                  |             10000004 = make_kuid(&init_user_ns, 10000004);
                  |             10000004 = mapped_kuid_fs(&init_user_ns /* no idmapped mount */, 10000004);
                  |             10000004 = from_kuid(0(&init_user_ns, 10000004);
                  |             |_________________________________________________
                  |     }                                                        |
                  |                                                              |
                  |> posix_acl_fix_xattr_to_user()                               |
                     {                                                           V
                                 10000004 = make_kuid(0:0:4k /* init_user_ns */, 10000004);
                                        /* SUCCESS */
                                        4 = from_kuid(0:10000000:65536 /* caller's idmappings */, 10000004);
                     }

The last remaining problem we need to fix here is ovl_get_acl(). During
ovl_permission() overlayfs will call:

        ovl_permission()
        -> generic_permission()
           -> acl_permission_check()
              -> check_acl()
                 -> get_acl()
                    -> inode->i_op->get_acl() == ovl_get_acl()
                        > get_acl() /* on the underlying filesystem)
                          ->inode->i_op->get_acl() == /*lower filesystem callback */
                 -> posix_acl_permission()

passing through the get_acl request to the underlying filesystem. This will
retrieve the acls stored in the lower filesystem without taking the
idmapping of the underlying mount into account as this would mean altering
the cached values for the lower filesystem. The simple solution is to have
ovl_get_acl() simply duplicate the ACLs, update the values according to the
idmapped mount and return it to acl_permission_check() so it can be used in
posix_acl_permission(). Since overlayfs doesn't cache ACLs they'll be
released right after.

Link: https://github.com/brauner/mount-idmapped/issues/9
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: linux-unionfs@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Fixes: bc70682a497c ("ovl: support idmapped layers")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2 years agox86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
Thadeu Lima de Souza Cascardo [Thu, 7 Jul 2022 16:41:52 +0000 (13:41 -0300)]
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported

There are some VM configurations which have Skylake model but do not
support IBPB. In those cases, when using retbleed=ibpb, userspace is going
to be killed and kernel is going to panic.

If the CPU does not support IBPB, warn and proceed with the auto option. Also,
do not fallback to IBPB on AMD/Hygon systems if it is not supported.

Fixes: 3ebc17006888 ("x86/bugs: Add retbleed=ibpb")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agoMAINTAINERS: Remove iommu@lists.linux-foundation.org
Joerg Roedel [Wed, 6 Jul 2022 10:33:31 +0000 (12:33 +0200)]
MAINTAINERS: Remove iommu@lists.linux-foundation.org

The IOMMU mailing list has moved to iommu@lists.linux.dev
and the old list should bounce by now. Remove it from the
MAINTAINERS file.

Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220706103331.10215-1-joro@8bytes.org
2 years agoMerge tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme into block-5.19
Jens Axboe [Thu, 7 Jul 2022 23:38:19 +0000 (17:38 -0600)]
Merge tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme into block-5.19

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.19

 - another bogus identifier quirk (Keith Busch)
 - use struct group in the tracer to avoid a gcc warning (Keith Busch)"

* tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme:
  nvme: use struct group for generic command dwords
  nvme-pci: phison e16 has bogus namespace ids

2 years agoio_uring: explicit sqe padding for ioctl commands
Pavel Begunkov [Thu, 7 Jul 2022 14:00:38 +0000 (15:00 +0100)]
io_uring: explicit sqe padding for ioctl commands

32 bit sqe->cmd_op is an union with 64 bit values. It's always a good
idea to do padding explicitly. Also zero check it in prep, so it can be
used in the future if needed without compatibility concerns.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e6b95a05e970af79000435166185e85b196b2ba2.1657202417.git.asml.silence@gmail.com
[axboe: turn bitwise OR into logical variant]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoi2c: cadence: Unregister the clk notifier in error path
Satish Nagireddy [Tue, 28 Jun 2022 19:12:16 +0000 (12:12 -0700)]
i2c: cadence: Unregister the clk notifier in error path

This patch ensures that the clock notifier is unregistered
when driver probe is returning error.

Fixes: df8eb5691c48 ("i2c: Add driver for Cadence I2C controller")
Signed-off-by: Satish Nagireddy <satish.nagireddy@getcruise.com>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2 years agoMerge tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Rafael J. Wysocki [Thu, 7 Jul 2022 19:46:05 +0000 (21:46 +0200)]
Merge tag 'devfreq-fixes-for-5.19-rc6' of git://git./linux/kernel/git/chanwoo/linux

Pull a devfreq fix for 5.19-rc6 from Chanwoo Choi:

"- Fix exynos-bus NULL pointer dereference by correctly using the local
   generated freq_table to output the debug values instead of using the
   profile freq_table that is not used in the driver."

* tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: exynos-bus: Fix NULL pointer dereference

2 years agoPM / devfreq: exynos-bus: Fix NULL pointer dereference
Christian Marangi [Fri, 1 Jul 2022 13:31:26 +0000 (15:31 +0200)]
PM / devfreq: exynos-bus: Fix NULL pointer dereference

Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: b5d281f6c16d ("PM / devfreq: Rework freq_table to be local to devfreq struct")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2 years agoMerge tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 7 Jul 2022 17:41:27 +0000 (10:41 -0700)]
Merge tag 'loongarch-fixes-5.19-4' of git://git./linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "A fix for tinyconfig build error, a fix for section mismatch warning,
  and two cleanups of obsolete code"

* tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Fix section mismatch warning
  LoongArch: Fix build errors for tinyconfig
  LoongArch: Remove obsolete mentions of vcsr
  LoongArch: Drop these obsolete selects in Kconfig

2 years agoMerge tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 7 Jul 2022 17:08:20 +0000 (10:08 -0700)]
Merge tag 'net-5.19-rc6' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf, netfilter, can, and bluetooth.

  Current release - regressions:

   - bluetooth: fix deadlock on hci_power_on_sync

  Previous releases - regressions:

   - sched: act_police: allow 'continue' action offload

   - eth: usbnet: fix memory leak in error case

   - eth: ibmvnic: properly dispose of all skbs during a failover

  Previous releases - always broken:

   - bpf:
       - fix insufficient bounds propagation from
         adjust_scalar_min_max_vals
       - clear page contiguity bit when unmapping pool

   - netfilter: nft_set_pipapo: release elements in clone from
     abort path

   - mptcp: netlink: issue MP_PRIO signals from userspace PMs

   - can:
       - rcar_canfd: fix data transmission failed on R-Car V3U
       - gs_usb: gs_usb_open/close(): fix memory leak

  Misc:

   - add Wenjia as SMC maintainer"

* tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
  wireguard: Kconfig: select CRYPTO_CHACHA_S390
  crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
  wireguard: selftests: use microvm on x86
  wireguard: selftests: always call kernel makefile
  wireguard: selftests: use virt machine on m68k
  wireguard: selftests: set fake real time in init
  r8169: fix accessing unset transport header
  net: rose: fix UAF bug caused by rose_t0timer_expiry
  usbnet: fix memory leak in error case
  Revert "tls: rx: move counting TlsDecryptErrors for sync"
  mptcp: update MIB_RMSUBFLOW in cmd_sf_destroy
  mptcp: fix local endpoint accounting
  selftests: mptcp: userspace PM support for MP_PRIO signals
  mptcp: netlink: issue MP_PRIO signals from userspace PMs
  mptcp: Acquire the subflow socket lock before modifying MP_PRIO flags
  mptcp: Avoid acquiring PM lock for subflow priority changes
  mptcp: fix locking in mptcp_nl_cmd_sf_destroy()
  net/mlx5e: Fix matchall police parameters validation
  net/sched: act_police: allow 'continue' action offload
  net: lan966x: hardcode the number of external ports
  ...

2 years agoMerge tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 7 Jul 2022 17:02:38 +0000 (10:02 -0700)]
Merge tag 'pinctrl-v5.19-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

 - Tag Intel pin control as supported in MAINTAINERS

 - Fix a NULL pointer exception in the Aspeed driver

 - Correct some NAND functions in the Sunxi A83T driver

 - Use the right offset for some Sunxi pins

 - Fix a zero base offset in the Freescale (NXP) i.MX93

 - Fix the IRQ support in the STM32 driver

* tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: stm32: fix optional IRQ support to gpios
  pinctrl: imx: Add the zero base flag for imx93
  pinctrl: sunxi: sunxi_pconf_set: use correct offset
  pinctrl: sunxi: a83t: Fix NAND function name for some pins
  pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
  MAINTAINERS: Update Intel pin control to Supported

2 years agosignal handling: don't use BUG_ON() for debugging
Linus Torvalds [Wed, 6 Jul 2022 19:20:59 +0000 (12:20 -0700)]
signal handling: don't use BUG_ON() for debugging

These are indeed "should not happen" situations, but it turns out recent
changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
exists, is pending more testing), and the BUG_ON() makes it
unnecessarily hard to actually debug for no good reason.

It's been that way for a long time, but let's make it clear: BUG_ON() is
not good for debugging, and should never be used in situations where you
could just say "this shouldn't happen, but we can continue".

Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
continue running.  Instead of making the system basically unusuable
because you crashed the machine while potentially holding some very core
locks (eg this function is commonly called while holding 'tasklist_lock'
for writing).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agox86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
Peter Zijlstra [Wed, 6 Jul 2022 13:33:30 +0000 (15:33 +0200)]
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry

Commit

  ee774dac0da1 ("x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()")

moved PUSH_AND_CLEAR_REGS out of error_entry, into its own function, in
part to avoid calling error_entry() for XenPV.

However, commit

  7c81c0c9210c ("x86/entry: Avoid very early RET")

had to change that because the 'ret' was too early and moved it into
idtentry, bloating the text size, since idtentry is expanded for every
exception vector.

However, with the advent of xen_error_entry() in commit

  d147553b64bad ("x86/xen: Add UNTRAIN_RET")

it became possible to remove PUSH_AND_CLEAR_REGS from idtentry, back
into *error_entry().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agox86/ibt, objtool: Don't discard text references from tracepoint section
Peter Zijlstra [Tue, 28 Jun 2022 10:57:42 +0000 (12:57 +0200)]
x86/ibt, objtool: Don't discard text references from tracepoint section

On Tue, Jun 28, 2022 at 04:28:58PM +0800, Pengfei Xu wrote:

> # ./ftracetest
> === Ftrace unit tests ===
> [1] Basic trace file check      [PASS]
> [2] Basic test for tracers      [PASS]
> [3] Basic trace clock test      [PASS]
> [4] Basic event tracing check   [PASS]
> [5] Change the ringbuffer size  [PASS]
> [6] Snapshot and tracing setting        [PASS]
> [7] trace_pipe and trace_marker [PASS]
> [8] Test ftrace direct functions against tracers        [UNRESOLVED]
> [9] Test ftrace direct functions against kprobes        [UNRESOLVED]
> [10] Generic dynamic event - add/remove eprobe events   [FAIL]
> [11] Generic dynamic event - add/remove kprobe events
>
> It 100% reproduced in step 11 and then missing ENDBR BUG generated:
> "
> [ 9332.752836] mmiotrace: enabled CPU7.
> [ 9332.788612] mmiotrace: disabled.
> [ 9337.103426] traps: Missing ENDBR: syscall_regfunc+0x0/0xb0

It turns out that while syscall_regfunc() does have an ENDBR when
generated, it gets sealed by objtool's .ibt_endbr_seal list.

Since the only text references to this function:

  $ git grep syscall_regfunc
  include/linux/tracepoint.h:extern int syscall_regfunc(void);
  include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
  include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
  kernel/tracepoint.c:int syscall_regfunc(void)

appear in the __tracepoint section which is excluded by objtool.

Fixes: 3c6f9f77e618 ("objtool: Rework ibt and extricate from stack validation")
Reported-by: Pengfei Xu <pengfei.xu@intel.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yrrepdaow4F5kqG0@hirez.programming.kicks-ass.net
2 years agox86/bugs: Add Cannon lake to RETBleed affected CPU list
Pawan Gupta [Wed, 6 Jul 2022 22:01:15 +0000 (15:01 -0700)]
x86/bugs: Add Cannon lake to RETBleed affected CPU list

Cannon lake is also affected by RETBleed, add it to the list.

Fixes: 6ad0ad2bf8a6 ("x86/bugs: Report Intel retbleed vulnerability")
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agogpiolib: cdev: fix null pointer dereference in linereq_free()
Kent Gibson [Wed, 6 Jul 2022 08:45:07 +0000 (16:45 +0800)]
gpiolib: cdev: fix null pointer dereference in linereq_free()

Fix a kernel NULL pointer dereference reported by gpio kselftests.

linereq_free() can be called as part of the cleanup of a failed request,
at which time the desc for a line may not have been determined, so it
is unsafe to dereference without a check.

Add a check prior to dereferencing the line desc.

Fixes: 2068339a6c35 ("gpiolib: cdev: Add hardware timestamp clock type")
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agoLoongArch: Fix section mismatch warning
Tiezhu Yang [Mon, 27 Jun 2022 06:57:35 +0000 (14:57 +0800)]
LoongArch: Fix section mismatch warning

init_numa_memory() is annotated __init and not used by any module,
thus don't export it.

Remove not needed EXPORT_SYMBOL for init_numa_memory() to fix the
following section mismatch warning:

  MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab+init_numa_memory+0x0): Section mismatch in reference
from the variable __ksymtab_init_numa_memory to the function .init.text:init_numa_memory()
The symbol init_numa_memory is exported and annotated __init
Fix this by removing the __init annotation of init_numa_memory or drop the export.

This is build on Linux 5.19-rc4.

Fixes: d4b6f1562a3c ("LoongArch: Add Non-Uniform Memory Access (NUMA) support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Fix build errors for tinyconfig
Huacai Chen [Wed, 6 Jul 2022 03:03:09 +0000 (11:03 +0800)]
LoongArch: Fix build errors for tinyconfig

Building loongarch:tinyconfig fails with the following error.

./arch/loongarch/include/asm/page.h: In function 'pfn_valid':
./arch/loongarch/include/asm/page.h:42:32: error: 'PHYS_OFFSET' undeclared

Add the missing include file and fix succeeding vdso errors.

Fixes: 09cfefb7fa70 ("LoongArch: Add memory management")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Remove obsolete mentions of vcsr
Qi Hu [Wed, 6 Jul 2022 11:29:37 +0000 (19:29 +0800)]
LoongArch: Remove obsolete mentions of vcsr

The `vcsr` only exists in the old hardware design, it isn't used in any
shipped hardware from Loongson-3A5000 on. Both scalar FP and LSX/LASX
instructions use the `fcsr` as their control and status registers now.
For example, the RM control bit in fcsr0 is shared by FP, LSX and LASX
instructions.

Particularly, fcsr16 to fcsr31 are reserved for LSX/LASX now, access to
these registers has no visible effect if LSX/LASX is enabled, and will
cause SXD/ASXD exceptions if LSX/LASX is not enabled.

So, mentions of vcsr are obsolete in the first place (it was just used
for debugging), let's remove them.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Drop these obsolete selects in Kconfig
Lukas Bulwahn [Tue, 5 Jul 2022 07:34:05 +0000 (09:34 +0200)]
LoongArch: Drop these obsolete selects in Kconfig

Commit fa96b57c1490 ("LoongArch: Add build infrastructure") adds the new
file arch/loongarch/Kconfig.

As the work on LoongArch was probably quite some time under development,
various config symbols have changed and disappeared from the time of
initial writing of the Kconfig file and its inclusion in the repository.

The following four commits:

  commit c126a53c2760 ("arch: remove GENERIC_FIND_FIRST_BIT entirely")
  commit 140c8180eb7c ("arch: remove HAVE_COPY_THREAD_TLS")
  commit aca52c398389 ("mm: remove CONFIG_HAVE_MEMBLOCK")
  commit 3f08a302f533 ("mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option")

remove the mentioned config symbol, and enable the intended setup by
default without configuration.

Drop these obsolete selects in loongarch's Kconfig.

Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agofbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
Helge Deller [Sat, 25 Jun 2022 11:00:34 +0000 (13:00 +0200)]
fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()

Use the fbcon_info_from_console() wrapper which was added to kernel
v5.19 with commit 409d6c95f9c6 ("fbcon: Introduce wrapper for console->fb_info lookup").

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2 years agofbmem: Check virtual screen sizes in fb_set_var()
Helge Deller [Wed, 29 Jun 2022 13:53:55 +0000 (15:53 +0200)]
fbmem: Check virtual screen sizes in fb_set_var()

Verify that the fbdev or drm driver correctly adjusted the virtual
screen sizes. On failure report the failing driver and reject the screen
size change.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v5.4+
2 years agodrm/ssd130x: Fix pre-charge period setting
Ezequiel Garcia [Wed, 6 Jul 2022 18:41:33 +0000 (15:41 -0300)]
drm/ssd130x: Fix pre-charge period setting

Fix small typo which causes the mask for the 'precharge1' setting
to be used with the 'precharge2' value.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220706184133.210888-1-ezequiel@vanguardiasur.com.ar
2 years agofbcon: Prevent that screen size is smaller than font size
Helge Deller [Sat, 25 Jun 2022 11:00:34 +0000 (13:00 +0200)]
fbcon: Prevent that screen size is smaller than font size

We need to prevent that users configure a screen size which is smaller than the
currently selected font size. Otherwise rendering chars on the screen will
access memory outside the graphics memory region.

This patch adds a new function fbcon_modechange_possible() which
implements this check and which later may be extended with other checks
if necessary.  The new function is called from the FBIOPUT_VSCREENINFO
ioctl handler in fbmem.c, which will return -EINVAL if userspace asked
for a too small screen size.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v5.4+
2 years agofbcon: Disallow setting font bigger than screen size
Helge Deller [Sat, 25 Jun 2022 10:56:49 +0000 (12:56 +0200)]
fbcon: Disallow setting font bigger than screen size

Prevent that users set a font size which is bigger than the physical screen.
It's unlikely this may happen (because screens are usually much larger than the
fonts and each font char is limited to 32x32 pixels), but it may happen on
smaller screens/LCD displays.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v4.14+
2 years agodma-buf: Fix one use-after-free of fence
xinhui pan [Thu, 7 Jul 2022 08:02:41 +0000 (16:02 +0800)]
dma-buf: Fix one use-after-free of fence

Need get the new fence when we replace the old one.

Fixes: 047a1b877ed48 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707080241.20060-1-xinhui.pan@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2 years agodrm/i915: Fix vm use-after-free in vma destruction
Thomas Hellström [Mon, 20 Jun 2022 12:36:59 +0000 (14:36 +0200)]
drm/i915: Fix vm use-after-free in vma destruction

In vma destruction, the following race may occur:

Thread 1:        Thread 2:
i915_vma_destroy();

  ...
  list_del_init(vma->vm_link);
  ...
  mutex_unlock(vma->vm->mutex);
  __i915_vm_release();
release_references();

And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.

However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().

v2: Fix a typo in the commit message (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count")
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.con>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 1926a6b75954fc1a8b44d10bd0c67db957b78cf7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agodrm/i915/guc: ADL-N should use the same GuC FW as ADL-S
Daniele Ceraolo Spurio [Tue, 21 Jun 2022 23:30:05 +0000 (16:30 -0700)]
drm/i915/guc: ADL-N should use the same GuC FW as ADL-S

The only difference between the ADL S and P GuC FWs is the HWConfig
support. ADL-N does not support HWConfig, so we should use the same
binary as ADL-S, otherwise the GuC might attempt to fetch a config
table that does not exist. ADL-N is internally identified as an ADL-P,
so we need to special-case it in the FW selection code.

Fixes: 7e28d0b26759 ("drm/i915/adl-n: Enable ADL-N platform")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621233005.3952293-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 971e4a9781742aaad1587e25fd5582b2dd595ef8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agodrm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
Hangyu Hua [Fri, 24 Jun 2022 13:04:06 +0000 (06:04 -0700)]
drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()

If drm_connector_init fails, intel_connector_free will be called to take
care of proper free. So it is necessary to drop the refcount of port
before intel_connector_free.

Fixes: 091a4f91942a ("drm/i915: Handle drm-layer errors in intel_dp_add_mst_connector")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624130406.17996-1-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit cea9ed611e85d36a05db52b6457bf584b7d969e2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agoMerge branch 'wireguard-patches-for-5-19-rc6'
Jakub Kicinski [Thu, 7 Jul 2022 03:04:09 +0000 (20:04 -0700)]
Merge branch 'wireguard-patches-for-5-19-rc6'

Jason A. Donenfeld says:

====================
wireguard patches for 5.19-rc6

1) A few small fixups to the selftests, per usual. Of particular note is
   a fix for a test flake that occurred on especially fast systems that
   boot in less than a second.

2) An addition during this cycle of some s390 crypto interacted with the
   way wireguard selects dependencies, resulting in linker errors
   reported by the kernel test robot. So Vladis sent in a patch for
   that, which also required a small preparatory fix moving some Kconfig
   symbols around.
====================

Link: https://lore.kernel.org/r/20220707003157.526645-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>