platform/kernel/linux-rpi.git
3 years agoLinux 5.16-rc6
Linus Torvalds [Sun, 19 Dec 2021 22:14:33 +0000 (14:14 -0800)]
Linux 5.16-rc6

3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 19 Dec 2021 20:44:03 +0000 (12:44 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Two small fixes, one of which was being worked around in selftests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Retry page fault if MMU reload is pending and root has no sp
  KVM: selftests: vmx_pmu_msrs_test: Drop tests mangling guest visible CPUIDs
  KVM: x86: Drop guest CPUID check for host initiated writes to MSR_IA32_PERF_CAPABILITIES

3 years agoMerge tag 'block-5.16-2021-12-19' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 19 Dec 2021 20:38:53 +0000 (12:38 -0800)]
Merge tag 'block-5.16-2021-12-19' of git://git.kernel.dk/linux-block

Pull block revert from Jens Axboe:
 "It turns out that the fix for not hammering on the delayed work timer
  too much caused a performance regression for BFQ, so let's revert the
  change for now.

  I've got some ideas on how to fix it appropriately, but they should
  wait for 5.17"

* tag 'block-5.16-2021-12-19' of git://git.kernel.dk/linux-block:
  Revert "block: reduce kblockd_mod_delayed_work_on() CPU consumption"

3 years agoMerge tag 'irq_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Dec 2021 20:28:46 +0000 (12:28 -0800)]
Merge tag 'irq_urgent_for_v5.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Clear the PCI_MSIX_FLAGS_MASKALL bit too on the error path so that it
   is restored to its reset state

 - Mask MSI-X vectors late on the init path in order to handle
   out-of-spec Marvell NVME devices which apparently look at the MSI-X
   mask even when MSI-X is disabled

* tag 'irq_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Clear PCI_MSIX_FLAGS_MASKALL on error
  PCI/MSI: Mask MSI-X vectors only on success

3 years agoMerge tag 'timers_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Dec 2021 20:23:18 +0000 (12:23 -0800)]
Merge tag 'timers_urgent_for_v5.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Borislav Petkov:

 - Make sure the CLOCK_REALTIME to CLOCK_MONOTONIC offset is never
   positive

* tag 'timers_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Really make sure wall_to_monotonic isn't positive

3 years agoMerge tag 'locking_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Dec 2021 20:17:26 +0000 (12:17 -0800)]
Merge tag 'locking_urgent_for_v5.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:

 - Fix the rtmutex condition checking when the optimistic spinning of a
   waiter needs to be terminated

* tag 'locking_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner()

3 years agoMerge tag 'core_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 19 Dec 2021 19:46:54 +0000 (11:46 -0800)]
Merge tag 'core_urgent_for_v5.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull signal handlign fix from Borislav Petkov:

 - Prevent lock contention on the new sigaltstack lock on the
   common-case path, when no changes have been made to the alternative
   signal stack.

* tag 'core_urgent_for_v5.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  signal: Skip the altstack update when not needed

3 years agoMerge tag 'mips-fixes_5.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips...
Linus Torvalds [Sun, 19 Dec 2021 19:40:11 +0000 (11:40 -0800)]
Merge tag 'mips-fixes_5.16_3' of git://git./linux/kernel/git/mips/linux

Pull MIPS fix from Thomas Bogendoerfer:

 - only enable pci_remap_iospace() for Ralink devices

* tag 'mips-fixes_5.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Only define pci_remap_iospace() for Ralink

3 years agoMerge tag 'powerpc-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 19 Dec 2021 19:31:14 +0000 (11:31 -0800)]
Merge tag 'powerpc-5.16-4' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Fix a recently introduced oops at boot on 85xx in some configurations.

  Fix crashes when loading some livepatch modules with
  STRICT_MODULE_RWX.

  Thanks to Joe Lawrence, Russell Currey, and Xiaoming Ni"

* tag 'powerpc-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/module_64: Fix livepatching for RO modules
  powerpc/85xx: Fix oops when CONFIG_FSL_PMC=n

3 years agoMerge tag '5.16-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 19 Dec 2021 19:23:02 +0000 (11:23 -0800)]
Merge tag '5.16-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Two cifs/smb3 fixes, one fscache related, and one mount parsing
  related for stable"

* tag '5.16-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: sanitize multiple delimiters in prepath
  cifs: ignore resource_id while getting fscache super cookie

3 years agoKVM: x86: Retry page fault if MMU reload is pending and root has no sp
Sean Christopherson [Thu, 9 Dec 2021 06:05:46 +0000 (06:05 +0000)]
KVM: x86: Retry page fault if MMU reload is pending and root has no sp

Play nice with a NULL shadow page when checking for an obsolete root in
the page fault handler by flagging the page fault as stale if there's no
shadow page associated with the root and KVM_REQ_MMU_RELOAD is pending.
Invalidating memslots, which is the only case where _all_ roots need to
be reloaded, requests all vCPUs to reload their MMUs while holding
mmu_lock for lock.

The "special" roots, e.g. pae_root when KVM uses PAE paging, are not
backed by a shadow page.  Running with TDP disabled or with nested NPT
explodes spectaculary due to dereferencing a NULL shadow page pointer.

Skip the KVM_REQ_MMU_RELOAD check if there is a valid shadow page for the
root.  Zapping shadow pages in response to guest activity, e.g. when the
guest frees a PGD, can trigger KVM_REQ_MMU_RELOAD even if the current
vCPU isn't using the affected root.  I.e. KVM_REQ_MMU_RELOAD can be seen
with a completely valid root shadow page.  This is a bit of a moot point
as KVM currently unloads all roots on KVM_REQ_MMU_RELOAD, but that will
be cleaned up in the future.

Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211209060552.2956723-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: selftests: vmx_pmu_msrs_test: Drop tests mangling guest visible CPUIDs
Vitaly Kuznetsov [Thu, 16 Dec 2021 16:52:12 +0000 (17:52 +0100)]
KVM: selftests: vmx_pmu_msrs_test: Drop tests mangling guest visible CPUIDs

Host initiated writes to MSR_IA32_PERF_CAPABILITIES should not depend
on guest visible CPUIDs and (incorrect) KVM logic implementing it is
about to change. Also, KVM_SET_CPUID{,2} after KVM_RUN is now forbidden
and causes test to fail.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211216165213.338923-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: x86: Drop guest CPUID check for host initiated writes to MSR_IA32_PERF_CAPABILITIES
Vitaly Kuznetsov [Thu, 16 Dec 2021 16:52:13 +0000 (17:52 +0100)]
KVM: x86: Drop guest CPUID check for host initiated writes to MSR_IA32_PERF_CAPABILITIES

The ability to write to MSR_IA32_PERF_CAPABILITIES from the host should
not depend on guest visible CPUID entries, even if just to allow
creating/restoring guest MSRs and CPUIDs in any sequence.

Fixes: 27461da31089 ("KVM: x86/pmu: Support full width counting")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211216165213.338923-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoRevert "block: reduce kblockd_mod_delayed_work_on() CPU consumption"
Jens Axboe [Sun, 19 Dec 2021 14:58:44 +0000 (07:58 -0700)]
Revert "block: reduce kblockd_mod_delayed_work_on() CPU consumption"

This reverts commit cb2ac2912a9ca7d3d26291c511939a41361d2d83.

Alex and the kernel test robot report that this causes a significant
performance regression with BFQ. I can reproduce that result, so let's
revert this one as we're close to -rc6 and we there's no point in trying
to rush a fix.

Link: https://lore.kernel.org/linux-block/1639853092.524jxfaem2.none@localhost/
Link: https://lore.kernel.org/lkml/20211219141852.GH14057@xsang-OptiPlex-9020/
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoMerge tag 'tty-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sat, 18 Dec 2021 21:23:55 +0000 (13:23 -0800)]
Merge tag 'tty-5.16-rc6' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are two small tty/serial fixes for 5.16-rc6.  They include:

   - n_hdlc fix for syzbot reported problem that you were previously
     copied on.

   - 8250_fintek driver fix that resolved a console problem by removing
     a previous change.

  Both have been in linux-next with no reported issues"

* tag 'tty-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250_fintek: Fix garbled text for console
  tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous

3 years agoMerge tag 'usb-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sat, 18 Dec 2021 21:16:43 +0000 (13:16 -0800)]
Merge tag 'usb-5.16-rc6' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a number of small USB driver fixes for reported problems.
  They include:

   - dwc2 driver fixes

   - xhci driver fixes

   - cdnsp driver fixes

   - typec driver fix

   - gadget u_ether driver fix

   - new quirk additions

   - usb gadget endpoint calculation fix

   - usb serial new device ids

   - revert of a xhci-dbg change that broke early debug booting

  All changes, except for the revert, have been in linux-next with no
  reported problems. The revert was from yesterday, and it was reported
  by the developers affected that it resolved their problem"

* tag 'usb-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  Revert "usb: early: convert to readl_poll_timeout_atomic()"
  usb: typec: tcpm: fix tcpm unregister port but leave a pending timer
  usb: cdnsp: Fix lack of spin_lock_irqsave/spin_lock_restore
  USB: NO_LPM quirk Lenovo USB-C to Ethernet Adapher(RTL8153-04)
  usb: xhci: Extend support for runtime power management for AMD's Yellow carp.
  usb: dwc2: fix STM ID/VBUS detection startup delay in dwc2_driver_probe
  USB: gadget: bRequestType is a bitfield, not a enum
  USB: serial: option: add Telit FN990 compositions
  USB: serial: cp210x: fix CP2105 GPIO registration
  usb: cdnsp: Fix incorrect status for control request
  usb: cdnsp: Fix issue in cdnsp_log_ep trace event
  usb: cdnsp: Fix incorrect calling of cdnsp_died function
  usb: xhci-mtk: fix list_del warning when enable list debug
  usb: gadget: u_ether: fix race in setting MAC address in setup phase

3 years agoMerge tag 'perf-tools-fixes-for-v5.16-2021-12-18' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sat, 18 Dec 2021 19:53:14 +0000 (11:53 -0800)]
Merge tag 'perf-tools-fixes-for-v5.16-2021-12-18' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix segfaults in 'perf inject' related to usage of unopened files

 - The return value of hashmap__new() should be checked using IS_ERR()

* tag 'perf-tools-fixes-for-v5.16-2021-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf inject: Fix segfault due to perf_data__fd() without open
  perf inject: Fix segfault due to close without open
  perf expr: Fix missing check for return value of hashmap__new()

3 years agoperf inject: Fix segfault due to perf_data__fd() without open
Adrian Hunter [Mon, 13 Dec 2021 08:48:29 +0000 (10:48 +0200)]
perf inject: Fix segfault due to perf_data__fd() without open

The fixed commit attempts to get the output file descriptor even if the
file was never opened e.g.

  $ perf record uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
  $ perf inject -i perf.data --vm-time-correlation=dry-run
  Segmentation fault (core dumped)
  $ gdb --quiet perf
  Reading symbols from perf...
  (gdb) r inject -i perf.data --vm-time-correlation=dry-run
  Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

  Program received signal SIGSEGV, Segmentation fault.
  __GI___fileno (fp=0x0) at fileno.c:35
  35      fileno.c: No such file or directory.
  (gdb) bt
  #0  __GI___fileno (fp=0x0) at fileno.c:35
  #1  0x00005621e48dd987 in perf_data__fd (data=0x7fff4c68bd08) at util/data.h:72
  #2  perf_data__fd (data=0x7fff4c68bd08) at util/data.h:69
  #3  cmd_inject (argc=<optimized out>, argv=0x7fff4c69c1f0) at builtin-inject.c:1017
  #4  0x00005621e4936783 in run_builtin (p=0x5621e4ee6878 <commands+600>, argc=4, argv=0x7fff4c69c1f0) at perf.c:313
  #5  0x00005621e4897d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
  #6  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
  #7  main (argc=4, argv=0x7fff4c69c1f0) at perf.c:539
  (gdb)

Fixes: 0ae03893623dd1dd ("perf tools: Pass a fd to perf_file_header__read_pipe()")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211213084829.114772-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf inject: Fix segfault due to close without open
Adrian Hunter [Mon, 13 Dec 2021 08:48:28 +0000 (10:48 +0200)]
perf inject: Fix segfault due to close without open

The fixed commit attempts to close inject.output even if it was never
opened e.g.

  $ perf record uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
  $ perf inject -i perf.data --vm-time-correlation=dry-run
  Segmentation fault (core dumped)
  $ gdb --quiet perf
  Reading symbols from perf...
  (gdb) r inject -i perf.data --vm-time-correlation=dry-run
  Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

  Program received signal SIGSEGV, Segmentation fault.
  0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
  48      iofclose.c: No such file or directory.
  (gdb) bt
  #0  0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
  #1  0x0000557fc7b74f92 in perf_data__close (data=data@entry=0x7ffcdafa6578) at util/data.c:376
  #2  0x0000557fc7a6b807 in cmd_inject (argc=<optimized out>, argv=<optimized out>) at builtin-inject.c:1085
  #3  0x0000557fc7ac4783 in run_builtin (p=0x557fc8074878 <commands+600>, argc=4, argv=0x7ffcdafb6a60) at perf.c:313
  #4  0x0000557fc7a25d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
  #5  run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
  #6  main (argc=4, argv=0x7ffcdafb6a60) at perf.c:539
  (gdb)

Fixes: 02e6246f5364d526 ("perf inject: Close inject.output on exit")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211213084829.114772-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf expr: Fix missing check for return value of hashmap__new()
Miaoqian Lin [Sun, 12 Dec 2021 06:25:02 +0000 (06:25 +0000)]
perf expr: Fix missing check for return value of hashmap__new()

The hashmap__new() function may return ERR_PTR(-ENOMEM) when malloc()
fails, add IS_ERR() checking for ctx->ids.

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211212062504.25841-1-linmq006@gmail.com
[ s/kfree()/free()/ and add missing linux/err.h include ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agolocking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner()
Zqiang [Fri, 17 Dec 2021 07:42:07 +0000 (15:42 +0800)]
locking/rtmutex: Fix incorrect condition in rtmutex_spin_on_owner()

Optimistic spinning needs to be terminated when the spinning waiter is not
longer the top waiter on the lock, but the condition is negated. It
terminates if the waiter is the top waiter, which is defeating the whole
purpose.

Fixes: c3123c431447 ("locking/rtmutex: Dont dereference waiter lockless")
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211217074207.77425-1-qiang1.zhang@intel.com
3 years agoMerge tag 'libata-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sat, 18 Dec 2021 01:24:53 +0000 (17:24 -0800)]
Merge tag 'libata-5.16-rc6' of git://git./linux/kernel/git/dlemoal/libata

Pull libata fix from Damien Le Moal:
 "A single fix for this cycle:

   - Check that ATA16 passthrough commands that do not transfer any data
     have a DMA direction set to DMA_NONE (From George)"

* tag 'libata-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  libata: if T_LENGTH is zero, dma direction should be DMA_NONE

3 years agoMerge tag 'zonefs-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sat, 18 Dec 2021 01:19:51 +0000 (17:19 -0800)]
Merge tag 'zonefs-5.16-rc6' of git://git./linux/kernel/git/dlemoal/zonefs

Pull zonefs fixes from Damien Le Moal:
 "One fix and one trivial update for rc6:

   - Add MODULE_ALIAS_FS to get automatic module loading on mount
     (Naohiro)

   - Update Damien's email address in the MAINTAINERS file (me)"

* tag 'zonefs-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  MAITAINERS: Change zonefs maintainer email address
  zonefs: add MODULE_ALIAS_FS

3 years agocifs: sanitize multiple delimiters in prepath
Thiago Rafael Becker [Fri, 17 Dec 2021 18:20:22 +0000 (15:20 -0300)]
cifs: sanitize multiple delimiters in prepath

mount.cifs can pass a device with multiple delimiters in it. This will
cause rename(2) to fail with ENOENT.

V2:
  - Make sanitize_path more readable.
  - Fix multiple delimiters between UNC and prepath.
  - Avoid a memory leak if a bad user starts putting a lot of delimiters
    in the path on purpose.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2031200
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api")
Cc: stable@vger.kernel.org # 5.11+
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Thiago Rafael Becker <trbecker@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 years agocifs: ignore resource_id while getting fscache super cookie
Shyam Prasad N [Wed, 8 Dec 2021 16:33:19 +0000 (16:33 +0000)]
cifs: ignore resource_id while getting fscache super cookie

We have a cyclic dependency between fscache super cookie
and root inode cookie. The super cookie relies on
tcon->resource_id, which gets populated from the root inode
number. However, fetching the root inode initializes inode
cookie as a child of super cookie, which is yet to be populated.

resource_id is only used as auxdata to check the validity of
super cookie. We can completely avoid setting resource_id to
remove the circular dependency. Since vol creation time and
vol serial numbers are used for auxdata, we should be fine.
Additionally, there will be auxiliary data check for each
inode cookie as well.

Fixes: 5bf91ef03d98 ("cifs: wait for tcon resource_id before getting fscache super")
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
3 years agotimekeeping: Really make sure wall_to_monotonic isn't positive
Yu Liao [Mon, 13 Dec 2021 13:57:27 +0000 (21:57 +0800)]
timekeeping: Really make sure wall_to_monotonic isn't positive

Even after commit e1d7ba873555 ("time: Always make sure wall_to_monotonic
isn't positive") it is still possible to make wall_to_monotonic positive
by running the following code:

    int main(void)
    {
        struct timespec time;

        clock_gettime(CLOCK_MONOTONIC, &time);
        time.tv_nsec = 0;
        clock_settime(CLOCK_REALTIME, &time);
        return 0;
    }

The reason is that the second parameter of timespec64_compare(), ts_delta,
may be unnormalized because the delta is calculated with an open coded
substraction which causes the comparison of tv_sec to yield the wrong
result:

  wall_to_monotonic = { .tv_sec = -10, .tv_nsec =  900000000 }
  ts_delta      = { .tv_sec =  -9, .tv_nsec = -900000000 }

That makes timespec64_compare() claim that wall_to_monotonic < ts_delta,
but actually the result should be wall_to_monotonic > ts_delta.

After normalization, the result of timespec64_compare() is correct because
the tv_sec comparison is not longer misleading:

  wall_to_monotonic = { .tv_sec = -10, .tv_nsec =  900000000 }
  ts_delta      = { .tv_sec = -10, .tv_nsec =  100000000 }

Use timespec64_sub() to ensure that ts_delta is normalized, which fixes the
issue.

Fixes: e1d7ba873555 ("time: Always make sure wall_to_monotonic isn't positive")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211213135727.1656662-1-liaoyu15@huawei.com
3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Fri, 17 Dec 2021 21:55:03 +0000 (13:55 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
 "One driver fix: the pm8001 has never actually worked on a system with
  an IOMMU and this fixes that use case"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: pm8001: Fix phys_to_virt() usage on dma_addr_t

3 years agoMerge tag 'for-5.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 17 Dec 2021 21:50:58 +0000 (13:50 -0800)]
Merge tag 'for-5.16-rc5-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more fixes, almost all error handling one-liners and for stable.

   - regression fix in directory logging items

   - regression fix of extent buffer status bits handling after an error

   - fix memory leak in error handling path in tree-log

   - fix freeing invalid anon device number when handling errors during
     subvolume creation

   - fix warning when freeing leaf after subvolume creation failure

   - fix missing blkdev put in device scan error handling

   - fix invalid delayed ref after subvolume creation failure"

* tag 'for-5.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix missing blkdev_put() call in btrfs_scan_one_device()
  btrfs: fix warning when freeing leaf after subvolume creation failure
  btrfs: fix invalid delayed ref after subvolume creation failure
  btrfs: check WRITE_ERR when trying to read an extent buffer
  btrfs: fix missing last dir item offset update when logging directory
  btrfs: fix double free of anon_dev after failure to create subvolume
  btrfs: fix memory leak in __add_inode_ref()

3 years agoMerge tag 'selinux-pr-20211217' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 17 Dec 2021 20:03:14 +0000 (12:03 -0800)]
Merge tag 'selinux-pr-20211217' of git://git./linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "Another small SELinux fix for v5.16 to ensure that we don't block on
  memory allocations while holding a spinlock.

  This passes all our tests without problem"

* tag 'selinux-pr-20211217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix sleeping function called from invalid context

3 years agoMerge tag 'riscv-for-linus-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 17 Dec 2021 19:55:35 +0000 (11:55 -0800)]
Merge tag 'riscv-for-linus-5.16-rc6' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A handful of DT updates for the SiFive HiFive Unmatched, that fix the
   regulator handling. These should stop some warning spew.

 - A pair of fixes for both the SiFive Hifive Unleashed and Unmatched,
   that correctly hook up the MMC card detect signal.

* tag 'riscv-for-linus-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: dts: sifive unmatched: Link the tmp451 with its power supply
  riscv: dts: sifive unmatched: Fix regulator for board rev3
  riscv: dts: sifive unmatched: Expose the PMIC sub-functions
  riscv: dts: sifive unmatched: Expose the board ID eeprom
  riscv: dts: sifive unmatched: Name gpio lines
  riscv: dts: unmatched: Add gpio card detect to mmc-spi-slot
  riscv: dts: unleashed: Add gpio card detect to mmc-spi-slot

3 years agoMerge tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 17 Dec 2021 19:46:07 +0000 (11:46 -0800)]
Merge tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Fix for hammering on the delayed run queue timer (me)

 - bcache regression fix for this merge window (Lin)

 - Fix a divide-by-zero in the blk-iocost code (Tejun)

* tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block:
  bcache: fix NULL pointer reference in cached_dev_detach_finish
  block: reduce kblockd_mod_delayed_work_on() CPU consumption
  iocost: Fix divide-by-zero on donation from low hweight cgroup

3 years agoMerge tag 'io_uring-5.16-2021-12-17' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 17 Dec 2021 19:31:46 +0000 (11:31 -0800)]
Merge tag 'io_uring-5.16-2021-12-17' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "Just a single fix, fixing an issue with the worker creation change
  that was merged last week"

* tag 'io_uring-5.16-2021-12-17' of git://git.kernel.dk/linux-block:
  io-wq: drop wqe lock before creating new worker

3 years agoMerge tag 'dmaengine-fix-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Fri, 17 Dec 2021 19:25:50 +0000 (11:25 -0800)]
Merge tag 'dmaengine-fix-5.16' of git://git./linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
 "A bunch of driver fixes, notably:

   - uninit variable fix for dw-axi-dmac driver

   - return value check dw-edma driver

   - calling wq quiesce inside spinlock and missed completion for idxd
     driver

   - mod alias fix for st_fdma driver"

* tag 'dmaengine-fix-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: st_fdma: fix MODULE_ALIAS
  dmaengine: idxd: fix missed completion on abort path
  dmaengine: ti: k3-udma: Fix smatch warnings
  dmaengine: idxd: fix calling wq quiesce inside spinlock
  dmaengine: dw-edma: Fix return value check for dma_set_mask_and_coherent()
  dmaengine: dw-axi-dmac: Fix uninitialized variable in axi_chan_block_xfer_start()

3 years agoMerge tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 17 Dec 2021 19:07:13 +0000 (11:07 -0800)]
Merge tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Mostly amdgpu fixes this week scattered around the driver, otherwise
  one i915, one ast, one simpledrm. There is a revert in the fb-helper
  for places userspace was using a string that we tried to change.

  i915:
   - Fix a bound check in the DMC fw load.

  ast:
   - NULL ptr deref fix

  simpledrm:
   - pixel clock units fix

  fb-helper:
   - userspace regression revert

  amdgpu:
   - Fix RLC register offset
   - GMC fix
   - Properly cache SMU FW version on Yellow Carp
   - Fix missing callback on DCN3.1
   - Reset DMCUB before HW init
   - Fix for GMC powergating on PCO
   - Fix a possible memory leak in GPU metrics table handling on RN"

* tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/pm: fix a potential gpu_metrics_table memory leak
  drm/amdgpu: correct the wrong cached state for GMC on PICASSO
  drm/amd/display: Reset DMCUB before HW init
  drm/amd/display: Set exit_optimized_pwr_state for DCN31
  drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC
  drm/amdgpu: don't override default ECO_BITs setting
  drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
  drm/i915/display: Fix an unsigned subtraction which can never be negative.
  drm/ast: potential dereference of null pointer
  drm: simpledrm: fix wrong unit with pixel clock
  Revert "drm/fb-helper: improve DRM fbdev emulation device names"

3 years agoRevert "usb: early: convert to readl_poll_timeout_atomic()"
Greg Kroah-Hartman [Fri, 17 Dec 2021 15:24:30 +0000 (16:24 +0100)]
Revert "usb: early: convert to readl_poll_timeout_atomic()"

This reverts commit 796eed4b2342c9d6b26c958e92af91253a2390e1.

This change causes boot lockups when using "arlyprintk=xdbc" because
ktime can not be used at this point in time in the boot process.  Also,
it is not needed for very small delays like this.

Reported-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Chunfeng Yun <chunfeng.yun@mediatek.com>
Fixes: 796eed4b2342 ("usb: early: convert to readl_poll_timeout_atomic()")
Link: https://lore.kernel.org/r/c2b5c9bb-1b75-bf56-3754-b5b18812d65e@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoMerge tag 'usb-serial-5.16-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Fri, 17 Dec 2021 09:37:02 +0000 (10:37 +0100)]
Merge tag 'usb-serial-5.16-rc6' of https://git./linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for 5.16-rc6

Here's a fix for a reported problem in the cp210x gpio-registration code
and some more modem device ids.

All have been in linux-next with no reported issues.

* tag 'usb-serial-5.16-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: option: add Telit FN990 compositions
  USB: serial: cp210x: fix CP2105 GPIO registration

3 years agoMAITAINERS: Change zonefs maintainer email address
Damien Le Moal [Fri, 17 Dec 2021 07:41:17 +0000 (16:41 +0900)]
MAITAINERS: Change zonefs maintainer email address

Update my email address from damien.lemoal@wdc.com to
damien.lemoal@opensource.wdc.com.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
3 years agozonefs: add MODULE_ALIAS_FS
Naohiro Aota [Fri, 17 Dec 2021 06:15:45 +0000 (15:15 +0900)]
zonefs: add MODULE_ALIAS_FS

Add MODULE_ALIAS_FS() to load the module automatically when you do "mount
-t zonefs".

Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable <stable@vger.kernel.org> # 5.6+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
3 years agoriscv: dts: sifive unmatched: Link the tmp451 with its power supply
Vincent Pelletier [Tue, 16 Nov 2021 23:57:42 +0000 (23:57 +0000)]
riscv: dts: sifive unmatched: Link the tmp451 with its power supply

Fixes the following probe warning:
  lm90 0-004c: Looking up vcc-supply from device tree
  lm90 0-004c: Looking up vcc-supply property in node /soc/i2c@10030000/temperature-sensor@4c failed
  lm90 0-004c: supply vcc not found, using dummy regulator

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
3 years agoriscv: dts: sifive unmatched: Fix regulator for board rev3
Vincent Pelletier [Tue, 16 Nov 2021 23:57:41 +0000 (23:57 +0000)]
riscv: dts: sifive unmatched: Fix regulator for board rev3

The existing values are rejected by the da9063 regulator driver, as they
are unachievable with the declared chip setup (non-merged vcore and bmem
are unable to provide the declared curent).

Fix voltages to match rev3 schematics, which also matches their boot-up
configuration within the chip's available precision.
Declare bcore1/bcore2 and bmem/bio as merged.
Set ldo09 and ldo10 as always-on as their consumers are not declared but
exist.
Drop ldo current limits as there is no current limit feature for these
regulators in the DA9063. Fixes warnings like:
  DA9063_LDO3: Operation of current configuration missing

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
3 years agoriscv: dts: sifive unmatched: Expose the PMIC sub-functions
Vincent Pelletier [Tue, 16 Nov 2021 23:57:39 +0000 (23:57 +0000)]
riscv: dts: sifive unmatched: Expose the PMIC sub-functions

These sub-functions are available in the chip revision on this board, so
expose them.

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
3 years agoriscv: dts: sifive unmatched: Expose the board ID eeprom
Vincent Pelletier [Tue, 16 Nov 2021 23:57:38 +0000 (23:57 +0000)]
riscv: dts: sifive unmatched: Expose the board ID eeprom

Mark it as read-only as it is factory-programmed with identifying
information, and no executable nor configuration:
- eth MAC address
- board model (PCB version, BoM version)
- board serial number
Accidental modification would cause misidentification which could brick
the board, so marking read-only seem like both a safe and non-constraining
choice.

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
3 years agoriscv: dts: sifive unmatched: Name gpio lines
Vincent Pelletier [Tue, 16 Nov 2021 23:57:37 +0000 (23:57 +0000)]
riscv: dts: sifive unmatched: Name gpio lines

Follow the pin descriptions given in the version 3 of the board schematics.

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
3 years agoMerge tag 'amd-drm-fixes-5.16-2021-12-15' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 17 Dec 2021 05:01:00 +0000 (15:01 +1000)]
Merge tag 'amd-drm-fixes-5.16-2021-12-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.16-2021-12-15:

amdgpu:
- Fix RLC register offset
- GMC fix
- Properly cache SMU FW version on Yellow Carp
- Fix missing callback on DCN3.1
- Reset DMCUB before HW init
- Fix for GMC powergating on PCO
- Fix a possible memory leak in GPU metrics table handling on RN

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216035239.5787-1-alexander.deucher@amd.com
3 years agoMerge tag 'drm-misc-fixes-2021-12-16-1' of ssh://git.freedesktop.org/git/drm/drm...
Dave Airlie [Fri, 17 Dec 2021 04:16:57 +0000 (14:16 +1000)]
Merge tag 'drm-misc-fixes-2021-12-16-1' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes

One null pointer dereference fix for ast, a pixel clock unit fix for
simpledrm and a user-space regression revert for fb-helper

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216082603.pm6yzlckmxvwnqyv@houat
3 years agolibata: if T_LENGTH is zero, dma direction should be DMA_NONE
George Kennedy [Tue, 14 Dec 2021 14:45:10 +0000 (09:45 -0500)]
libata: if T_LENGTH is zero, dma direction should be DMA_NONE

Avoid data corruption by rejecting pass-through commands where
T_LENGTH is zero (No data is transferred) and the dma direction
is not DMA_NONE.

Cc: <stable@vger.kernel.org>
Reported-by: syzkaller<syzkaller@googlegroups.com>
Signed-off-by: George Kennedy<george.kennedy@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
3 years agoMerge tag 'audit-pr-20211216' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoor...
Linus Torvalds [Thu, 16 Dec 2021 23:24:46 +0000 (15:24 -0800)]
Merge tag 'audit-pr-20211216' of git://git./linux/kernel/git/pcmoore/audit

Pull audit fix from Paul Moore:
 "A single patch to fix a problem where the audit queue could grow
  unbounded when the audit daemon is forcibly stopped"

* tag 'audit-pr-20211216' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: improve robustness of the audit queue handling

3 years agoMerge tag 'net-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 16 Dec 2021 23:02:14 +0000 (15:02 -0800)]
Merge tag 'net-5.16-rc6' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from mac80211, wifi, bpf.

  Relatively large batches of fixes from BPF and the WiFi stack, calm in
  general networking.

  Current release - regressions:

   - dpaa2-eth: fix buffer overrun when reporting ethtool statistics

  Current release - new code bugs:

   - bpf: fix incorrect state pruning for <8B spill/fill

   - iavf:
       - add missing unlocks in iavf_watchdog_task()
       - do not override the adapter state in the watchdog task (again)

   - mlxsw: spectrum_router: consolidate MAC profiles when possible

  Previous releases - regressions:

   - mac80211 fixes:
       - rate control, avoid driver crash for retransmitted frames
       - regression in SSN handling of addba tx
       - a memory leak where sta_info is not freed
       - marking TX-during-stop for TX in in_reconfig, prevent stall

   - cfg80211: acquire wiphy mutex on regulatory work

   - wifi drivers: fix build regressions and LED config dependency

   - virtio_net: fix rx_drops stat for small pkts

   - dsa: mv88e6xxx: unforce speed & duplex in mac_link_down()

  Previous releases - always broken:

   - bpf fixes:
       - kernel address leakage in atomic fetch
       - kernel address leakage in atomic cmpxchg's r0 aux reg
       - signed bounds propagation after mov32
       - extable fixup offset
       - extable address check

   - mac80211:
       - fix the size used for building probe request
       - send ADDBA requests using the tid/queue of the aggregation
         session
       - agg-tx: don't schedule_and_wake_txq() under sta->lock, avoid
         deadlocks
       - validate extended element ID is present

   - mptcp:
       - never allow the PM to close a listener subflow (null-defer)
       - clear 'kern' flag from fallback sockets, prevent crash
       - fix deadlock in __mptcp_push_pending()

   - inet_diag: fix kernel-infoleak for UDP sockets

   - xsk: do not sleep in poll() when need_wakeup set

   - smc: avoid very long waits in smc_release()

   - sch_ets: don't remove idle classes from the round-robin list

   - netdevsim:
       - zero-initialize memory for bpf map's value, prevent info leak
       - don't let user space overwrite read only (max) ethtool parms

   - ixgbe: set X550 MDIO speed before talking to PHY

   - stmmac:
       - fix null-deref in flower deletion w/ VLAN prio Rx steering
       - dwmac-rk: fix oob read in rk_gmac_setup

   - ice: time stamping fixes

   - systemport: add global locking for descriptor life cycle"

* tag 'net-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (89 commits)
  bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
  selftest/bpf: Add a test that reads various addresses.
  bpf: Fix extable address check.
  bpf: Fix extable fixup offset.
  bpf, selftests: Add test case trying to taint map value pointer
  bpf: Make 32->64 bounds propagation slightly more robust
  bpf: Fix signed bounds propagation after mov32
  sit: do not call ipip6_dev_free() from sit_init_net()
  net: systemport: Add global locking for descriptor lifecycle
  net/smc: Prevent smc_release() from long blocking
  net: Fix double 0x prefix print in SKB dump
  virtio_net: fix rx_drops stat for small pkts
  dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED
  sfc_ef100: potential dereference of null pointer
  net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
  net: usb: lan78xx: add Allied Telesis AT29M2-AF
  net/packet: rx_owner_map depends on pg_vec
  netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
  dpaa2-eth: fix ethtool statistics
  ixgbe: set X550 MDIO speed before talking to PHY
  ...

3 years agoMerge tag 'soc-fixes-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Thu, 16 Dec 2021 22:48:57 +0000 (14:48 -0800)]
Merge tag 'soc-fixes-5.16-3' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "There are a number of DT fixes, mostly for mistakes found through
  static checking of the dts files again, as well as a couple of minor
  changes to address incorrect DT settings.

  For i.MX, there is yet another series of devitree changes to update
  RGMII delay settings for ethernet, which is an ongoing problem after
  some driver changes.

  For SoC specific device drivers, a number of smaller fixes came up:

   - i.MX SoC identification was incorrectly registered non-i.MX
     machines when the driver is built-in

   - One fix on imx8m-blk-ctrl driver to get i.MX8MM MIPI reset work
     properly

   - a few compile fixes for warnings that get in the way of -Werror

   - a string overflow in the scpi firmware driver

   - a boot failure with FORTIFY_SOURCE on Rockchips machines

   - broken error handling in the AMD TEE driver

   - a revert for a tegra reset driver commit that broke HDA"

* tag 'soc-fixes-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
  soc/tegra: fuse: Fix bitwise vs. logical OR warning
  firmware: arm_scpi: Fix string overflow in SCPI genpd driver
  soc: imx: Register SoC device only on i.MX boards
  soc: imx: imx8m-blk-ctrl: Fix imx8mm mipi reset
  ARM: dts: imx6ull-pinfunc: Fix CSI_DATA07__ESAI_TX0 pad name
  arm64: dts: imx8mq: remove interconnect property from lcdif
  ARM: socfpga: dts: fix qspi node compatible
  arm64: dts: apple: add #interrupt-cells property to pinctrl nodes
  dt-bindings: i2c: apple,i2c: allow multiple compatibles
  arm64: meson: remove COMMON_CLK
  arm64: meson: fix dts for JetHub D1
  tee: amdtee: fix an IS_ERR() vs NULL bug
  arm64: dts: apple: change ethernet0 device type to ethernet
  arm64: dts: ten64: remove redundant interrupt declaration for gpio-keys
  arm64: dts: rockchip: fix poweroff on helios64
  arm64: dts: rockchip: fix audio-supply for Rock Pi 4
  arm64: dts: rockchip: fix rk3399-leez-p710 vcc3v3-lan supply
  arm64: dts: rockchip: fix rk3308-roc-cc vcc-sd supply
  arm64: dts: rockchip: remove mmc-hs400-enhanced-strobe from rk3399-khadas-edge
  ARM: rockchip: Use memcpy_toio instead of memcpy on smp bring-up
  ...

3 years agoselinux: fix sleeping function called from invalid context
Scott Mayhew [Wed, 15 Dec 2021 21:28:40 +0000 (16:28 -0500)]
selinux: fix sleeping function called from invalid context

selinux_sb_mnt_opts_compat() is called via sget_fc() under the sb_lock
spinlock, so it can't use GFP_KERNEL allocations:

[  868.565200] BUG: sleeping function called from invalid context at
               include/linux/sched/mm.h:230
[  868.568246] in_atomic(): 1, irqs_disabled(): 0,
               non_block: 0, pid: 4914, name: mount.nfs
[  868.569626] preempt_count: 1, expected: 0
[  868.570215] RCU nest depth: 0, expected: 0
[  868.570809] Preemption disabled at:
[  868.570810] [<0000000000000000>] 0x0
[  868.571848] CPU: 1 PID: 4914 Comm: mount.nfs Kdump: loaded
               Tainted: G        W         5.16.0-rc5.2585cf9dfa #1
[  868.573273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
               BIOS 1.14.0-4.fc34 04/01/2014
[  868.574478] Call Trace:
[  868.574844]  <TASK>
[  868.575156]  dump_stack_lvl+0x34/0x44
[  868.575692]  __might_resched.cold+0xd6/0x10f
[  868.576308]  slab_pre_alloc_hook.constprop.0+0x89/0xf0
[  868.577046]  __kmalloc_track_caller+0x72/0x420
[  868.577684]  ? security_context_to_sid_core+0x48/0x2b0
[  868.578569]  kmemdup_nul+0x22/0x50
[  868.579108]  security_context_to_sid_core+0x48/0x2b0
[  868.579854]  ? _nfs4_proc_pathconf+0xff/0x110 [nfsv4]
[  868.580742]  ? nfs_reconfigure+0x80/0x80 [nfs]
[  868.581355]  security_context_str_to_sid+0x36/0x40
[  868.581960]  selinux_sb_mnt_opts_compat+0xb5/0x1e0
[  868.582550]  ? nfs_reconfigure+0x80/0x80 [nfs]
[  868.583098]  security_sb_mnt_opts_compat+0x2a/0x40
[  868.583676]  nfs_compare_super+0x113/0x220 [nfs]
[  868.584249]  ? nfs_try_mount_request+0x210/0x210 [nfs]
[  868.584879]  sget_fc+0xb5/0x2f0
[  868.585267]  nfs_get_tree_common+0x91/0x4a0 [nfs]
[  868.585834]  vfs_get_tree+0x25/0xb0
[  868.586241]  fc_mount+0xe/0x30
[  868.586605]  do_nfs4_mount+0x130/0x380 [nfsv4]
[  868.587160]  nfs4_try_get_tree+0x47/0xb0 [nfsv4]
[  868.587724]  vfs_get_tree+0x25/0xb0
[  868.588193]  do_new_mount+0x176/0x310
[  868.588782]  __x64_sys_mount+0x103/0x140
[  868.589388]  do_syscall_64+0x3b/0x90
[  868.589935]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  868.590699] RIP: 0033:0x7f2b371c6c4e
[  868.591239] Code: 48 8b 0d dd 71 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e
                     0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00
                     00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 71
                     0e 00 f7 d8 64 89 01 48
[  868.593810] RSP: 002b:00007ffc83775d88 EFLAGS: 00000246
               ORIG_RAX: 00000000000000a5
[  868.594691] RAX: ffffffffffffffda RBX: 00007ffc83775f10 RCX: 00007f2b371c6c4e
[  868.595504] RDX: 0000555d517247a0 RSI: 0000555d51724700 RDI: 0000555d51724540
[  868.596317] RBP: 00007ffc83775f10 R08: 0000555d51726890 R09: 0000555d51726890
[  868.597162] R10: 0000000000000000 R11: 0000000000000246 R12: 0000555d51726890
[  868.598005] R13: 0000000000000003 R14: 0000555d517246e0 R15: 0000555d511ac925
[  868.598826]  </TASK>

Cc: stable@vger.kernel.org
Fixes: 69c4a42d72eb ("lsm,selinux: add new hook to compare new mount to an existing mount")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
[PM: cleanup/line-wrap the backtrace]
Signed-off-by: Paul Moore <paul@paul-moore.com>
3 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Thu, 16 Dec 2021 21:06:49 +0000 (13:06 -0800)]
Merge https://git./linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2021-12-16

We've added 15 non-merge commits during the last 7 day(s) which contain
a total of 12 files changed, 434 insertions(+), 30 deletions(-).

The main changes are:

1) Fix incorrect verifier state pruning behavior for <8B register spill/fill,
   from Paul Chaignon.

2) Fix x86-64 JIT's extable handling for fentry/fexit when return pointer
   is an ERR_PTR(), from Alexei Starovoitov.

3) Fix 3 different possibilities that BPF verifier missed where unprivileged
   could leak kernel addresses, from Daniel Borkmann.

4) Fix xsk's poll behavior under need_wakeup flag, from Magnus Karlsson.

5) Fix an oob-write in test_verifier due to a missed MAX_NR_MAPS bump,
   from Kumar Kartikeya Dwivedi.

6) Fix a race in test_btf_skc_cls_ingress selftest, from Martin KaFai Lau.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
  selftest/bpf: Add a test that reads various addresses.
  bpf: Fix extable address check.
  bpf: Fix extable fixup offset.
  bpf, selftests: Add test case trying to taint map value pointer
  bpf: Make 32->64 bounds propagation slightly more robust
  bpf: Fix signed bounds propagation after mov32
  bpf, selftests: Update test case for atomic cmpxchg on r0 with pointer
  bpf: Fix kernel address leakage in atomic cmpxchg's r0 aux reg
  bpf, selftests: Add test case for atomic fetch on spilled pointer
  bpf: Fix kernel address leakage in atomic fetch
  selftests/bpf: Fix OOB write in test_verifier
  xsk: Do not sleep in poll() when need_wakeup set
  selftests/bpf: Tests for state pruning with u32 spill/fill
  bpf: Fix incorrect state pruning for <8B spill/fill
====================

Link: https://lore.kernel.org/r/20211216210005.13815-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobpf, selftests: Fix racing issue in btf_skc_cls_ingress test
Martin KaFai Lau [Thu, 16 Dec 2021 19:16:30 +0000 (11:16 -0800)]
bpf, selftests: Fix racing issue in btf_skc_cls_ingress test

The libbpf CI reported occasional failure in btf_skc_cls_ingress:

  test_syncookie:FAIL:Unexpected syncookie states gen_cookie:80326634 recv_cookie:0
  bpf prog error at line 97

"error at line 97" means the bpf prog cannot find the listening socket
when the final ack is received.  It then skipped processing
the syncookie in the final ack which then led to "recv_cookie:0".

The problem is the userspace program did not do accept() and went
ahead to close(listen_fd) before the kernel (and the bpf prog) had
a chance to process the final ack.

The fix is to add accept() call so that the userspace will wait for
the kernel to finish processing the final ack first before close()-ing
everything.

Fixes: 9a856cae2217 ("bpf: selftest: Add test_btf_skc_cls_ingress")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211216191630.466151-1-kafai@fb.com
3 years agoselftest/bpf: Add a test that reads various addresses.
Alexei Starovoitov [Wed, 15 Dec 2021 20:35:34 +0000 (12:35 -0800)]
selftest/bpf: Add a test that reads various addresses.

Add a function to bpf_testmod that returns invalid kernel and user addresses.
Then attach an fexit program to that function that tries to read
memory through these addresses.

This logic checks that bpf_probe_read_kernel and BPF_PROBE_MEM logic is sane.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agobpf: Fix extable address check.
Alexei Starovoitov [Wed, 15 Dec 2021 03:25:13 +0000 (19:25 -0800)]
bpf: Fix extable address check.

The verifier checks that PTR_TO_BTF_ID pointer is either valid or NULL,
but it cannot distinguish IS_ERR pointer from valid one.

When offset is added to IS_ERR pointer it may become small positive
value which is a user address that is not handled by extable logic
and has to be checked for at the runtime.

Tighten BPF_PROBE_MEM pointer check code to prevent this case.

Fixes: 4c5de127598e ("bpf: Emit explicit NULL pointer checks for PROBE_LDX instructions.")
Reported-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agobpf: Fix extable fixup offset.
Alexei Starovoitov [Thu, 16 Dec 2021 02:38:30 +0000 (18:38 -0800)]
bpf: Fix extable fixup offset.

The prog - start_of_ldx is the offset before the faulting ldx to the location
after it, so this will be used to adjust pt_regs->ip for jumping over it and
continuing, and with old temp it would have been fixed up to the wrong offset,
causing crash.

Fixes: 4c5de127598e ("bpf: Emit explicit NULL pointer checks for PROBE_LDX instructions.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 16 Dec 2021 19:48:59 +0000 (11:48 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "A single fix for the clk framework that needed some more bake time in
  linux-next.

  The problem is that two clks being registered at the same time can
  lead to a busted clk tree if the parent isn't fully registered by the
  time the child finds the parent. We rejigger the place where we mark
  the parent as fully registered so that the child can't find the parent
  until things are proper"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: Don't parent clks until the parent is fully registered

3 years agobpf, selftests: Add test case trying to taint map value pointer
Daniel Borkmann [Wed, 15 Dec 2021 23:48:54 +0000 (23:48 +0000)]
bpf, selftests: Add test case trying to taint map value pointer

Add a test case which tries to taint map value pointer arithmetic into a
unknown scalar with subsequent export through the map.

Before fix:

  # ./test_verifier 1186
  #1186/u map access: trying to leak tained dst reg FAIL
  Unexpected success to load!
  verification time 24 usec
  stack depth 8
  processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
  #1186/p map access: trying to leak tained dst reg FAIL
  Unexpected success to load!
  verification time 8 usec
  stack depth 8
  processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
  Summary: 0 PASSED, 0 SKIPPED, 2 FAILED

After fix:

  # ./test_verifier 1186
  #1186/u map access: trying to leak tained dst reg OK
  #1186/p map access: trying to leak tained dst reg OK
  Summary: 2 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Make 32->64 bounds propagation slightly more robust
Daniel Borkmann [Wed, 15 Dec 2021 22:28:48 +0000 (22:28 +0000)]
bpf: Make 32->64 bounds propagation slightly more robust

Make the bounds propagation in __reg_assign_32_into_64() slightly more
robust and readable by aligning it similarly as we did back in the
__reg_combine_64_into_32() counterpart. Meaning, only propagate or
pessimize them as a smin/smax pair.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf: Fix signed bounds propagation after mov32
Daniel Borkmann [Wed, 15 Dec 2021 22:02:19 +0000 (22:02 +0000)]
bpf: Fix signed bounds propagation after mov32

For the case where both s32_{min,max}_value bounds are positive, the
__reg_assign_32_into_64() directly propagates them to their 64 bit
counterparts, otherwise it pessimises them into [0,u32_max] universe and
tries to refine them later on by learning through the tnum as per comment
in mentioned function. However, that does not always happen, for example,
in mov32 operation we call zext_32_to_64(dst_reg) which invokes the
__reg_assign_32_into_64() as is without subsequent bounds update as
elsewhere thus no refinement based on tnum takes place.

Thus, not calling into the __update_reg_bounds() / __reg_deduce_bounds() /
__reg_bound_offset() triplet as we do, for example, in case of ALU ops via
adjust_scalar_min_max_vals(), will lead to more pessimistic bounds when
dumping the full register state:

Before fix:

  0: (b4) w0 = -1
  1: R0_w=invP4294967295
     (id=0,imm=ffffffff,
      smin_value=4294967295,smax_value=4294967295,
      umin_value=4294967295,umax_value=4294967295,
      var_off=(0xffffffff; 0x0),
      s32_min_value=-1,s32_max_value=-1,
      u32_min_value=-1,u32_max_value=-1)

  1: (bc) w0 = w0
  2: R0_w=invP4294967295
     (id=0,imm=ffffffff,
      smin_value=0,smax_value=4294967295,
      umin_value=4294967295,umax_value=4294967295,
      var_off=(0xffffffff; 0x0),
      s32_min_value=-1,s32_max_value=-1,
      u32_min_value=-1,u32_max_value=-1)

Technically, the smin_value=0 and smax_value=4294967295 bounds are not
incorrect, but given the register is still a constant, they break assumptions
about const scalars that smin_value == smax_value and umin_value == umax_value.

After fix:

  0: (b4) w0 = -1
  1: R0_w=invP4294967295
     (id=0,imm=ffffffff,
      smin_value=4294967295,smax_value=4294967295,
      umin_value=4294967295,umax_value=4294967295,
      var_off=(0xffffffff; 0x0),
      s32_min_value=-1,s32_max_value=-1,
      u32_min_value=-1,u32_max_value=-1)

  1: (bc) w0 = w0
  2: R0_w=invP4294967295
     (id=0,imm=ffffffff,
      smin_value=4294967295,smax_value=4294967295,
      umin_value=4294967295,umax_value=4294967295,
      var_off=(0xffffffff; 0x0),
      s32_min_value=-1,s32_max_value=-1,
      u32_min_value=-1,u32_max_value=-1)

Without the smin_value == smax_value and umin_value == umax_value invariant
being intact for const scalars, it is possible to leak out kernel pointers
from unprivileged user space if the latter is enabled. For example, when such
registers are involved in pointer arithmtics, then adjust_ptr_min_max_vals()
will taint the destination register into an unknown scalar, and the latter
can be exported and stored e.g. into a BPF map value.

Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Kuee K1r0a <liulin063@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
3 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 16 Dec 2021 18:44:20 +0000 (10:44 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix missing error code on kexec failure path"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kexec: Fix missing error code 'ret' warning in load_other_segments()

3 years agoMerge tag 'for-5.16/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Thu, 16 Dec 2021 18:05:49 +0000 (10:05 -0800)]
Merge tag 'for-5.16/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix use after free in DM btree remove's rebalance_children()

 - Fix DM integrity data corruption, introduced during 5.16 merge, due
   to improper use of bvec_kmap_local()

* tag 'for-5.16/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm integrity: fix data corruption due to improper use of bvec_kmap_local
  dm btree remove: fix use after free in rebalance_children()

3 years agoarm64: kexec: Fix missing error code 'ret' warning in load_other_segments()
Lakshmi Ramasubramanian [Fri, 10 Dec 2021 01:01:21 +0000 (17:01 -0800)]
arm64: kexec: Fix missing error code 'ret' warning in load_other_segments()

Since commit ac10be5cdbfa ("arm64: Use common
of_kexec_alloc_and_setup_fdt()"), smatch reports the following warning:

  arch/arm64/kernel/machine_kexec_file.c:152 load_other_segments()
  warn: missing error code 'ret'

Return code is not set to an error code in load_other_segments() when
of_kexec_alloc_and_setup_fdt() call returns a NULL dtb. This results
in status success (return code set to 0) being returned from
load_other_segments().

Set return code to -EINVAL if of_kexec_alloc_and_setup_fdt() returns
NULL dtb.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ac10be5cdbfa ("arm64: Use common of_kexec_alloc_and_setup_fdt()")
Link: https://lore.kernel.org/r/20211210010121.101823-1-nramas@linux.microsoft.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
3 years agoafs: Fix mmap
David Howells [Tue, 14 Dec 2021 09:22:12 +0000 (09:22 +0000)]
afs: Fix mmap

Fix afs_add_open_map() to check that the vnode isn't already on the list
when it adds it.  It's possible that afs_drop_open_mmap() decremented
the cb_nr_mmap counter, but hadn't yet got into the locked section to
remove it.

Also vnode->cb_mmap_link should be initialised, so fix that too.

Fixes: 6e0e99d58a65 ("afs: Fix mmap coherency vs 3rd-party changes")
Reported-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com
Suggested-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing+fedora34_64checkkafs-build-300@auristor.com
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/686465.1639435380@warthog.procyon.org.uk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agosit: do not call ipip6_dev_free() from sit_init_net()
Eric Dumazet [Thu, 16 Dec 2021 11:17:41 +0000 (03:17 -0800)]
sit: do not call ipip6_dev_free() from sit_init_net()

ipip6_dev_free is sit dev->priv_destructor, already called
by register_netdevice() if something goes wrong.

Alternative would be to make ipip6_dev_free() robust against
multiple invocations, but other drivers do not implement this
strategy.

syzbot reported:

dst_release underflow
WARNING: CPU: 0 PID: 5059 at net/core/dst.c:173 dst_release+0xd8/0xe0 net/core/dst.c:173
Modules linked in:
CPU: 1 PID: 5059 Comm: syz-executor.4 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:dst_release+0xd8/0xe0 net/core/dst.c:173
Code: 4c 89 f2 89 d9 31 c0 5b 41 5e 5d e9 da d5 44 f9 e8 1d 90 5f f9 c6 05 87 48 c6 05 01 48 c7 c7 80 44 99 8b 31 c0 e8 e8 67 29 f9 <0f> 0b eb 85 0f 1f 40 00 53 48 89 fb e8 f7 8f 5f f9 48 83 c3 a8 48
RSP: 0018:ffffc9000aa5faa0 EFLAGS: 00010246
RAX: d6894a925dd15a00 RBX: 00000000ffffffff RCX: 0000000000040000
RDX: ffffc90005e19000 RSI: 000000000003ffff RDI: 0000000000040000
RBP: 0000000000000000 R08: ffffffff816a1f42 R09: ffffed1017344f2c
R10: ffffed1017344f2c R11: 0000000000000000 R12: 0000607f462b1358
R13: 1ffffffff1bfd305 R14: ffffe8ffffcb1358 R15: dffffc0000000000
FS:  00007f66c71a2700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f88aaed5058 CR3: 0000000023e0f000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 dst_cache_destroy+0x107/0x1e0 net/core/dst_cache.c:160
 ipip6_dev_free net/ipv6/sit.c:1414 [inline]
 sit_init_net+0x229/0x550 net/ipv6/sit.c:1936
 ops_init+0x313/0x430 net/core/net_namespace.c:140
 setup_net+0x35b/0x9d0 net/core/net_namespace.c:326
 copy_net_ns+0x359/0x5c0 net/core/net_namespace.c:470
 create_new_namespaces+0x4ce/0xa00 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0x11e/0x180 kernel/nsproxy.c:226
 ksys_unshare+0x57d/0xb50 kernel/fork.c:3075
 __do_sys_unshare kernel/fork.c:3146 [inline]
 __se_sys_unshare kernel/fork.c:3144 [inline]
 __x64_sys_unshare+0x34/0x40 kernel/fork.c:3144
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f66c882ce99
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f66c71a2168 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f66c893ff60 RCX: 00007f66c882ce99
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000048040200
RBP: 00007f66c8886ff1 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff6634832f R14: 00007f66c71a2300 R15: 0000000000022000
 </TASK>

Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20211216111741.1387540-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: systemport: Add global locking for descriptor lifecycle
Florian Fainelli [Wed, 15 Dec 2021 20:24:49 +0000 (12:24 -0800)]
net: systemport: Add global locking for descriptor lifecycle

The descriptor list is a shared resource across all of the transmit queues, and
the locking mechanism used today only protects concurrency across a given
transmit queue between the transmit and reclaiming. This creates an opportunity
for the SYSTEMPORT hardware to work on corrupted descriptors if we have
multiple producers at once which is the case when using multiple transmit
queues.

This was particularly noticeable when using multiple flows/transmit queues and
it showed up in interesting ways in that UDP packets would get a correct UDP
header checksum being calculated over an incorrect packet length. Similarly TCP
packets would get an equally correct checksum computed by the hardware over an
incorrect packet length.

The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges
when the driver produces a new descriptor anytime it writes to the
WRITE_PORT_{HI,LO} registers, there is however some delay in the hardware to
re-organize its descriptors and it is possible that concurrent TX queues
eventually break this internal allocation scheme to the point where the
length/status part of the descriptor gets used for an incorrect data buffer.

The fix is to impose a global serialization for all TX queues in the short
section where we are writing to the WRITE_PORT_{HI,LO} registers which solves
the corruption even with multiple concurrent TX queues being used.

Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211215202450.4086240-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet/smc: Prevent smc_release() from long blocking
D. Wythe [Wed, 15 Dec 2021 12:29:21 +0000 (20:29 +0800)]
net/smc: Prevent smc_release() from long blocking

In nginx/wrk benchmark, there's a hung problem with high probability
on case likes that: (client will last several minutes to exit)

server: smc_run nginx

client: smc_run wrk -c 10000 -t 1 http://server

Client hangs with the following backtrace:

0 [ffffa7ce8Of3bbf8] __schedule at ffffffff9f9eOd5f
1 [ffffa7ce8Of3bc88] schedule at ffffffff9f9eløe6
2 [ffffa7ce8Of3bcaO] schedule_timeout at ffffffff9f9e3f3c
3 [ffffa7ce8Of3bd2O] wait_for_common at ffffffff9f9el9de
4 [ffffa7ce8Of3bd8O] __flush_work at ffffffff9fOfeOl3
5 [ffffa7ce8øf3bdfO] smc_release at ffffffffcO697d24 [smc]
6 [ffffa7ce8Of3be2O] __sock_release at ffffffff9f8O2e2d
7 [ffffa7ce8Of3be4ø] sock_close at ffffffff9f8ø2ebl
8 [ffffa7ce8øf3be48] __fput at ffffffff9f334f93
9 [ffffa7ce8Of3be78] task_work_run at ffffffff9flOlff5
10 [ffffa7ce8Of3beaO] do_exit at ffffffff9fOe5Ol2
11 [ffffa7ce8Of3bflO] do_group_exit at ffffffff9fOe592a
12 [ffffa7ce8Of3bf38] __x64_sys_exit_group at ffffffff9fOe5994
13 [ffffa7ce8Of3bf4O] do_syscall_64 at ffffffff9f9d4373
14 [ffffa7ce8Of3bfsO] entry_SYSCALL_64_after_hwframe at ffffffff9fa0007c

This issue dues to flush_work(), which is used to wait for
smc_connect_work() to finish in smc_release(). Once lots of
smc_connect_work() was pending or all executing work dangling,
smc_release() has to block until one worker comes to free, which
is equivalent to wait another smc_connnect_work() to finish.

In order to fix this, There are two changes:

1. For those idle smc_connect_work(), cancel it from the workqueue; for
   executing smc_connect_work(), waiting for it to finish. For that
   purpose, replace flush_work() with cancel_work_sync().

2. Since smc_connect() hold a reference for passive closing, if
   smc_connect_work() has been cancelled, release the reference.

Fixes: 24ac3a08e658 ("net/smc: rebuild nonblocking connect")
Reported-by: Tony Lu <tonylu@linux.alibaba.com>
Tested-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/1639571361-101128-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'tegra-for-5.16-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Thu, 16 Dec 2021 14:02:25 +0000 (15:02 +0100)]
Merge tag 'tegra-for-5.16-soc-fixes' of git://git./linux/kernel/git/tegra/linux into arm/fixes

soc/tegra: Fixes for v5.16-rc6

This contains a single build fix without which ARM allmodconfig builds
are broken if -Werror is enabled.

* tag 'tegra-for-5.16-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  soc/tegra: fuse: Fix bitwise vs. logical OR warning

Link: https://lore.kernel.org/r/20211215162618.3568474-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agonet: Fix double 0x prefix print in SKB dump
Gal Pressman [Thu, 16 Dec 2021 09:28:25 +0000 (11:28 +0200)]
net: Fix double 0x prefix print in SKB dump

When printing netdev features %pNF already takes care of the 0x prefix,
remove the explicit one.

Fixes: 6413139dfc64 ("skbuff: increase verbosity when dumping skb data")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agovirtio_net: fix rx_drops stat for small pkts
Wenliang Wang [Thu, 16 Dec 2021 03:11:35 +0000 (11:11 +0800)]
virtio_net: fix rx_drops stat for small pkts

We found the stat of rx drops for small pkts does not increment when
build_skb fail, it's not coherent with other mode's rx drops stat.

Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodsa: mv88e6xxx: fix debug print for SPEED_UNFORCED
Andrey Eremeev [Wed, 15 Dec 2021 17:30:32 +0000 (20:30 +0300)]
dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED

Debug print uses invalid check to detect if speed is unforced:
(speed != SPEED_UNFORCED) should be used instead of (!speed).

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Andrey Eremeev <Axtone4all@yandex.ru>
Fixes: 96a2b40c7bd3 ("net: dsa: mv88e6xxx: add port's MAC speed setter")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc_ef100: potential dereference of null pointer
Jiasheng Jiang [Wed, 15 Dec 2021 14:37:31 +0000 (22:37 +0800)]
sfc_ef100: potential dereference of null pointer

The return value of kmalloc() needs to be checked.
To avoid use in efx_nic_update_stats() in case of the failure of alloc.

Fixes: b593b6f1b492 ("sfc_ef100: statistics gathering")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
John Keeping [Tue, 14 Dec 2021 19:10:09 +0000 (19:10 +0000)]
net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup

KASAN reports an out-of-bounds read in rk_gmac_setup on the line:

while (ops->regs[i]) {

This happens for most platforms since the regs flexible array member is
empty, so the memory after the ops structure is being read here.  It
seems that mostly this happens to contain zero anyway, so we get lucky
and everything still works.

To avoid adding redundant data to nearly all the ops structures, add a
new flag to indicate whether the regs field is valid and avoid this loop
when it is not.

Fixes: 3bb3d6b1c195 ("net: stmmac: Add RK3566/RK3568 SoC support")
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
David S. Miller [Thu, 16 Dec 2021 10:27:12 +0000 (10:27 +0000)]
Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-15

This series contains updates to igb, igbvf, igc and ixgbe drivers.

Karen moves checks for invalid VF MAC filters to occur earlier for
igb.

Letu Ren fixes a double free issue in igbvf probe.

Sasha fixes incorrect min value being used when calculating for max for
igc.

Robert Schlabbach adds documentation on enabling NBASE-T support for
ixgbe.

Cyril Novikov adds missing initialization of MDIO bus speed for ixgbe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: usb: lan78xx: add Allied Telesis AT29M2-AF
Greg Jesionowski [Tue, 14 Dec 2021 22:10:27 +0000 (15:10 -0700)]
net: usb: lan78xx: add Allied Telesis AT29M2-AF

This adds the vendor and product IDs for the AT29M2-AF which is a
lan7801-based device.

Signed-off-by: Greg Jesionowski <jesionowskigreg@gmail.com>
Link: https://lore.kernel.org/r/20211214221027.305784-1-jesionowskigreg@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet/packet: rx_owner_map depends on pg_vec
Willem de Bruijn [Wed, 15 Dec 2021 14:39:37 +0000 (09:39 -0500)]
net/packet: rx_owner_map depends on pg_vec

Packet sockets may switch ring versions. Avoid misinterpreting state
between versions, whose fields share a union. rx_owner_map is only
allocated with a packet ring (pg_vec) and both are swapped together.
If pg_vec is NULL, meaning no packet ring was allocated, then neither
was rx_owner_map. And the field may be old state from a tpacket_v3.

Fixes: 61fad6816fc1 ("net/packet: tpacket_rcv: avoid a producer race condition")
Reported-by: Syzbot <syzbot+1ac0994a0a0c55151121@syzkaller.appspotmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211215143937.106178-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonetdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
Haimin Zhang [Wed, 15 Dec 2021 11:15:30 +0000 (19:15 +0800)]
netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc

Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
since it may cause a potential kernel information leak issue, as follows:
1. nsim_bpf_map_alloc calls nsim_map_alloc_elem to allocate elements for
a new map.
2. nsim_map_alloc_elem uses kmalloc to allocate map's value, but doesn't
zero it.
3. A user application can use IOCTL BPF_MAP_LOOKUP_ELEM to get specific
element's information in the map.
4. The kernel function map_lookup_elem will call bpf_map_copy_value to get
the information allocated at step-2, then use copy_to_user to copy to the
user buffer.
This can only leak information for an array map.

Fixes: 395cacb5f1a0 ("netdevsim: bpf: support fake map offload")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Haimin Zhang <tcs.kernel@gmail.com>
Link: https://lore.kernel.org/r/20211215111530.72103-1-tcs.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-eth: fix ethtool statistics
Ioana Ciornei [Wed, 15 Dec 2021 10:58:31 +0000 (12:58 +0200)]
dpaa2-eth: fix ethtool statistics

Unfortunately, with the blamed commit I also added a side effect in the
ethtool stats shown. Because I added two more fields in the per channel
structure without verifying if its size is used in any way, part of the
ethtool statistics were off by 2.
Fix this by not looking up the size of the structure but instead on a
fixed value kept in a macro.

Fixes: fc398bec0387 ("net: dpaa2: add adaptive interrupt coalescing")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20211215105831.290070-1-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'drm-intel-fixes-2021-12-15' of ssh://git.freedesktop.org/git/drm/drm-intel...
Dave Airlie [Thu, 16 Dec 2021 00:22:23 +0000 (10:22 +1000)]
Merge tag 'drm-intel-fixes-2021-12-15' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-fixes

Fix a bound check in the DMC fw load.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YbnGvnsX/H3rKAqO@intel.com
3 years agousb: typec: tcpm: fix tcpm unregister port but leave a pending timer
Xu Yang [Thu, 9 Dec 2021 10:15:07 +0000 (18:15 +0800)]
usb: typec: tcpm: fix tcpm unregister port but leave a pending timer

In current design, when the tcpm port is unregisterd, the kthread_worker
will be destroyed in the last step. Inside the kthread_destroy_worker(),
the worker will flush all the works and wait for them to end. However, if
one of the works calls hrtimer_start(), this hrtimer will be pending until
timeout even though tcpm port is removed. Once the hrtimer timeout, many
strange kernel dumps appear.

Thus, we can first complete kthread_destroy_worker(), then cancel all the
hrtimers. This will guarantee that no hrtimer is pending at the end.

Fixes: 3ed8e1c2ac99 ("usb: typec: tcpm: Migrate workqueue to RT priority for processing events")
cc: <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20211209101507.499096-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: cdnsp: Fix lack of spin_lock_irqsave/spin_lock_restore
Pawel Laszczak [Tue, 14 Dec 2021 04:55:27 +0000 (05:55 +0100)]
usb: cdnsp: Fix lack of spin_lock_irqsave/spin_lock_restore

Patch puts content of cdnsp_gadget_pullup function inside
spin_lock_irqsave and spin_lock_restore section.
This construction is required here to keep the data consistency,
otherwise some data can be changed e.g. from interrupt context.

Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Reported-by: Ken (Jian) He <jianhe@ambarella.com>
cc: <stable@vger.kernel.org>
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
--

Changelog:
v2:
- added disable_irq/enable_irq as sugester by Peter Chen

drivers/usb/cdns3/cdnsp-gadget.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Reviewed-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/20211214045527.26823-1-pawell@gli-login.cadence.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: NO_LPM quirk Lenovo USB-C to Ethernet Adapher(RTL8153-04)
Jimmy Wang [Tue, 14 Dec 2021 01:26:50 +0000 (09:26 +0800)]
USB: NO_LPM quirk Lenovo USB-C to Ethernet Adapher(RTL8153-04)

This device doesn't work well with LPM, losing connectivity intermittently.
Disable LPM to resolve the issue.

Reviewed-by: <markpearson@lenovo.com>
Signed-off-by: Jimmy Wang <wangjm221@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211214012652.4898-1-wangjm221@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousb: xhci: Extend support for runtime power management for AMD's Yellow carp.
Nehal Bakulchandra Shah [Wed, 15 Dec 2021 09:32:16 +0000 (15:02 +0530)]
usb: xhci: Extend support for runtime power management for AMD's Yellow carp.

AMD's Yellow Carp platform has few more XHCI controllers,
enable the runtime power management support for the same.

Signed-off-by: Nehal Bakulchandra Shah <Nehal-Bakulchandra.shah@amd.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211215093216.1839065-1-Nehal-Bakulchandra.shah@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoserial: 8250_fintek: Fix garbled text for console
Ji-Ze Hong (Peter Hong) [Wed, 15 Dec 2021 07:58:35 +0000 (15:58 +0800)]
serial: 8250_fintek: Fix garbled text for console

Commit fab8a02b73eb ("serial: 8250_fintek: Enable high speed mode on Fintek F81866")
introduced support to use high baudrate with Fintek SuperIO UARTs. It'll
change clocksources when the UART probed.

But when user add kernel parameter "console=ttyS0,115200 console=tty0" to make
the UART as console output, the console will output garbled text after the
following kernel message.

[    3.681188] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled

The issue is occurs in following step:
probe_setup_port() -> fintek_8250_goto_highspeed()

It change clocksource from 115200 to 921600 with wrong time, it should change
clocksource in set_termios() not in probed. The following 3 patches are
implemented change clocksource in fintek_8250_set_termios().

Commit 58178914ae5b ("serial: 8250_fintek: UART dynamic clocksource on Fintek F81216H")
Commit 195638b6d44f ("serial: 8250_fintek: UART dynamic clocksource on Fintek F81866")
Commit 423d9118c624 ("serial: 8250_fintek: Add F81966 Support")

Due to the high baud rate had implemented above 3 patches and the patch
Commit fab8a02b73eb ("serial: 8250_fintek: Enable high speed mode on Fintek F81866")
is bugged, So this patch will remove it.

Fixes: fab8a02b73eb ("serial: 8250_fintek: Enable high speed mode on Fintek F81866")
Signed-off-by: Ji-Ze Hong (Peter Hong) <hpeter+linux_kernel@gmail.com>
Link: https://lore.kernel.org/r/20211215075835.2072-1-hpeter+linux_kernel@gmail.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agotty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous
Tetsuo Handa [Wed, 15 Dec 2021 11:52:40 +0000 (20:52 +0900)]
tty: n_hdlc: make n_hdlc_tty_wakeup() asynchronous

syzbot is reporting that an unprivileged user who logged in from tty
console can crash the system using a reproducer shown below [1], for
n_hdlc_tty_wakeup() is synchronously calling n_hdlc_send_frames().

----------
  #include <sys/ioctl.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
    const int disc = 0xd;

    ioctl(1, TIOCSETD, &disc);
    while (1) {
      ioctl(1, TCXONC, 0);
      write(1, "", 1);
      ioctl(1, TCXONC, 1); /* Kernel panic - not syncing: scheduling while atomic */
    }
  }
----------

Linus suspected that "struct tty_ldisc"->ops->write_wakeup() must not
sleep, and Jiri confirmed it from include/linux/tty_ldisc.h. Thus, defer
n_hdlc_send_frames() from n_hdlc_tty_wakeup() to a WQ context like
net/nfc/nci/uart.c does.

Link: https://syzkaller.appspot.com/bug?extid=5f47a8cea6a12b77a876
Reported-by: syzbot <syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com>
Cc: stable <stable@vger.kernel.org>
Analyzed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Confirmed-by: Jiri Slaby <jirislaby@kernel.org>
Reviewed-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/40de8b7e-a3be-4486-4e33-1b1d1da452f8@i-love.sakura.ne.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoixgbe: set X550 MDIO speed before talking to PHY
Cyril Novikov [Tue, 2 Nov 2021 01:39:36 +0000 (18:39 -0700)]
ixgbe: set X550 MDIO speed before talking to PHY

The MDIO bus speed must be initialized before talking to the PHY the first
time in order to avoid talking to it using a speed that the PHY doesn't
support.

This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on
Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network
plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined
with the 10Gb network results in a 24MHz MDIO speed, which is apparently
too fast for the connected PHY. PHY register reads over MDIO bus return
garbage, leading to initialization failure.

Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using
the following setup:

* Use an Atom C3000 family system with at least one X552 LAN on the SoC
* Disable PXE or other BIOS network initialization if possible
  (the interface must not be initialized before Linux boots)
* Connect a live 10Gb Ethernet cable to an X550 port
* Power cycle (not reset, doesn't always work) the system and boot Linux
* Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17

Fixes: e84db7272798 ("ixgbe: Introduce function to control MDIO speed")
Signed-off-by: Cyril Novikov <cnovikov@lynx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agodm integrity: fix data corruption due to improper use of bvec_kmap_local
Mike Snitzer [Wed, 15 Dec 2021 17:31:51 +0000 (12:31 -0500)]
dm integrity: fix data corruption due to improper use of bvec_kmap_local

Commit 25058d1c725c ("dm integrity: use bvec_kmap_local in
__journal_read_write") didn't account for __journal_read_write() later
adding the biovec's bv_offset. As such using bvec_kmap_local() caused
the start of the biovec to be skipped.

Trivial test that illustrates data corruption:

  # integritysetup format /dev/pmem0
  # integritysetup open /dev/pmem0 integrityroot
  # mkfs.xfs /dev/mapper/integrityroot
  ...
  bad magic number
  bad magic number
  Metadata corruption detected at xfs_sb block 0x0/0x1000
  libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
  releasing dirty buffer (bulk) to free list!

Fix this by using kmap_local_page() instead of bvec_kmap_local() in
__journal_read_write().

Fixes: 25058d1c725c ("dm integrity: use bvec_kmap_local in __journal_read_write")
Reported-by: Tony Asleson <tasleson@redhat.com>
Reviewed-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
3 years agoixgbe: Document how to enable NBASE-T support
Robert Schlabbach [Tue, 26 Oct 2021 00:24:48 +0000 (02:24 +0200)]
ixgbe: Document how to enable NBASE-T support

Commit a296d665eae1 ("ixgbe: Add ethtool support to enable 2.5 and 5.0
Gbps support") introduced suppression of the advertisement of NBASE-T
speeds by default, according to Todd Fujinaka to accommodate customers
with network switches which could not cope with advertised NBASE-T
speeds, as posted in the E1000-devel mailing list:

https://sourceforge.net/p/e1000/mailman/message/37106269/

However, the suppression was not documented at all, nor was how to
enable NBASE-T support.

Properly document the NBASE-T suppression and how to enable NBASE-T
support.

Fixes: a296d665eae1 ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support")
Reported-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Fix typo in i225 LTR functions
Sasha Neftin [Tue, 2 Nov 2021 07:20:06 +0000 (09:20 +0200)]
igc: Fix typo in i225 LTR functions

The LTR maximum value was incorrectly written using the scale from
the LTR minimum value. This would cause incorrect values to be sent,
in cases where the initial calculation lead to different min/max scales.

Fixes: 707abf069548 ("igc: Add initial LTR support")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigbvf: fix double free in `igbvf_probe`
Letu Ren [Sat, 13 Nov 2021 03:42:34 +0000 (11:42 +0800)]
igbvf: fix double free in `igbvf_probe`

In `igbvf_probe`, if register_netdev() fails, the program will go to
label err_hw_init, and then to label err_ioremap. In free_netdev() which
is just below label err_ioremap, there is `list_for_each_entry_safe` and
`netif_napi_del` which aims to delete all entries in `dev->napi_list`.
The program has added an entry `adapter->rx_ring->napi` which is added by
`netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has
been freed below label err_hw_init. So this a UAF.

In terms of how to patch the problem, we can refer to igbvf_remove() and
delete the entry before `adapter->rx_ring`.

The KASAN logs are as follows:

[   35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450
[   35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366
[   35.128360]
[   35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ #14
[   35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   35.131749] Call Trace:
[   35.132199]  dump_stack_lvl+0x59/0x7b
[   35.132865]  print_address_description+0x7c/0x3b0
[   35.133707]  ? free_netdev+0x1fd/0x450
[   35.134378]  __kasan_report+0x160/0x1c0
[   35.135063]  ? free_netdev+0x1fd/0x450
[   35.135738]  kasan_report+0x4b/0x70
[   35.136367]  free_netdev+0x1fd/0x450
[   35.137006]  igbvf_probe+0x121d/0x1a10 [igbvf]
[   35.137808]  ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf]
[   35.138751]  local_pci_probe+0x13c/0x1f0
[   35.139461]  pci_device_probe+0x37e/0x6c0
[   35.165526]
[   35.165806] Allocated by task 366:
[   35.166414]  ____kasan_kmalloc+0xc4/0xf0
[   35.167117]  foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf]
[   35.168078]  igbvf_probe+0x9c5/0x1a10 [igbvf]
[   35.168866]  local_pci_probe+0x13c/0x1f0
[   35.169565]  pci_device_probe+0x37e/0x6c0
[   35.179713]
[   35.179993] Freed by task 366:
[   35.180539]  kasan_set_track+0x4c/0x80
[   35.181211]  kasan_set_free_info+0x1f/0x40
[   35.181942]  ____kasan_slab_free+0x103/0x140
[   35.182703]  kfree+0xe3/0x250
[   35.183239]  igbvf_probe+0x1173/0x1a10 [igbvf]
[   35.184040]  local_pci_probe+0x13c/0x1f0

Fixes: d4e0fe01a38a0 (igbvf: add new driver to support 82576 virtual functions)
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigb: Fix removal of unicast MAC filters of VFs
Karen Sornek [Tue, 31 Aug 2021 11:16:35 +0000 (13:16 +0200)]
igb: Fix removal of unicast MAC filters of VFs

Move checking condition of VF MAC filter before clearing
or adding MAC filter to VF to prevent potential blackout caused
by removal of necessary and working VF's MAC filter.

Fixes: 1b8b062a99dc ("igb: add VF trust infrastructure")
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoMerge tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client
Linus Torvalds [Wed, 15 Dec 2021 19:06:41 +0000 (11:06 -0800)]
Merge tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "An SGID directory handling fix (marked for stable), a metrics
  accounting fix and two fixups to appease static checkers"

* tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client:
  ceph: fix up non-directory creation in SGID directories
  ceph: initialize pathlen variable in reconnect_caps_cb
  ceph: initialize i_size variable in ceph_sync_read
  ceph: fix duplicate increment of opened_inodes metric

3 years agoMerge tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Wed, 15 Dec 2021 18:52:01 +0000 (10:52 -0800)]
Merge tag 's390-5.16-5' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - Add missing handling of R_390_PLT32DBL relocation type in
   arch_kexec_apply_relocations_add(). Clang and the upcoming gcc 11.3
   generate such relocation entries, which our relocation code silently
   ignores, and which finally will result in an endless loop within the
   purgatory code in case of kexec.

 - Add proper handling of errors and print error messages when applying
   relocations

 - Fix duplicate tracking of irq nesting level in entry code

 - Let recordmcount.pl also look for jgnop mnemonic. Starting with
   binutils 2.37 objdump emits a jgnop mnemonic instead of brcl, which
   breaks mcount location detection. This is only a problem if used with
   compilers older than gcc 9, since with gcc 9 and newer compilers
   recordmcount.pl is not used anymore.

 - Remove preempt_disable()/preempt_enable() pair in
   kprobe_ftrace_handler() which was done for all architectures except
   for s390.

 - Update defconfig

* tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  recordmcount.pl: look for jgnop instruction as well as bcrl on s390
  s390/entry: fix duplicate tracking of irq nesting level
  s390: enable switchdev support in defconfig
  s390/kexec: handle R_390_PLT32DBL rela in arch_kexec_apply_relocations_add()
  s390/ftrace: remove preempt_disable()/preempt_enable() pair
  s390/kexec_file: fix error handling when applying relocations
  s390/kexec_file: print some more error messages

3 years agoMerge tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 15 Dec 2021 18:46:03 +0000 (10:46 -0800)]
Merge tag 'hyperv-fixes-signed-20211214' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv fix from Wei Liu:
 "Build fix from Randy Dunlap"

* tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv: utils: add PTP_1588_CLOCK to Kconfig to fix build

3 years agoaudit: improve robustness of the audit queue handling
Paul Moore [Thu, 9 Dec 2021 16:46:07 +0000 (11:46 -0500)]
audit: improve robustness of the audit queue handling

If the audit daemon were ever to get stuck in a stopped state the
kernel's kauditd_thread() could get blocked attempting to send audit
records to the userspace audit daemon.  With the kernel thread
blocked it is possible that the audit queue could grow unbounded as
certain audit record generating events must be exempt from the queue
limits else the system enter a deadlock state.

This patch resolves this problem by lowering the kernel thread's
socket sending timeout from MAX_SCHEDULE_TIMEOUT to HZ/10 and tweaks
the kauditd_send_queue() function to better manage the various audit
queues when connection problems occur between the kernel and the
audit daemon.  With this patch, the backlog may temporarily grow
beyond the defined limits when the audit daemon is stopped and the
system is under heavy audit pressure, but kauditd_thread() will
continue to make progress and drain the queues as it would for other
connection problems.  For example, with the audit daemon put into a
stopped state and the system configured to audit every syscall it
was still possible to shutdown the system without a kernel panic,
deadlock, etc.; granted, the system was slow to shutdown but that is
to be expected given the extreme pressure of recording every syscall.

The timeout value of HZ/10 was chosen primarily through
experimentation and this developer's "gut feeling".  There is likely
no one perfect value, but as this scenario is limited in scope (root
privileges would be needed to send SIGSTOP to the audit daemon), it
is likely not worth exposing this as a tunable at present.  This can
always be done at a later date if it proves necessary.

Cc: stable@vger.kernel.org
Fixes: 5b52330bbfe63 ("audit: fix auditd/kernel connection state tracking")
Reported-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Tested-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
3 years agousb: dwc2: fix STM ID/VBUS detection startup delay in dwc2_driver_probe
Amelie Delaunay [Tue, 7 Dec 2021 12:45:10 +0000 (13:45 +0100)]
usb: dwc2: fix STM ID/VBUS detection startup delay in dwc2_driver_probe

When activate_stm_id_vb_detection is enabled, ID and Vbus detection relies
on sensing comparators. This detection needs time to stabilize.
A delay was already applied in dwc2_resume() when reactivating the
detection, but it wasn't done in dwc2_probe().
This patch adds delay after enabling STM ID/VBUS detection. Then, ID state
is good when initializing gadget and host, and avoid to get a wrong
Connector ID Status Change interrupt.

Fixes: a415083a11cc ("usb: dwc2: add support for STM32MP15 SoCs USB OTG HS and FS")
Cc: stable <stable@vger.kernel.org>
Acked-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211207124510.268841-1-amelie.delaunay@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoUSB: gadget: bRequestType is a bitfield, not a enum
Greg Kroah-Hartman [Tue, 14 Dec 2021 18:46:21 +0000 (19:46 +0100)]
USB: gadget: bRequestType is a bitfield, not a enum

Szymon rightly pointed out that the previous check for the endpoint
direction in bRequestType was not looking at only the bit involved, but
rather the whole value.  Normally this is ok, but for some request
types, bits other than bit 8 could be set and the check for the endpoint
length could not stall correctly.

Fix that up by only checking the single bit.

Fixes: 153a2d7e3350 ("USB: gadget: detect too-big endpoint 0 requests")
Cc: Felipe Balbi <balbi@kernel.org>
Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Link: https://lore.kernel.org/r/20211214184621.385828-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agosoc/tegra: fuse: Fix bitwise vs. logical OR warning
Nathan Chancellor [Fri, 10 Dec 2021 16:55:29 +0000 (09:55 -0700)]
soc/tegra: fuse: Fix bitwise vs. logical OR warning

A new warning in clang points out two instances where boolean
expressions are being used with a bitwise OR instead of logical OR:

drivers/soc/tegra/fuse/speedo-tegra20.c:72:9: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
                reg = tegra_fuse_read_spare(i) |
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
                                               ||
drivers/soc/tegra/fuse/speedo-tegra20.c:72:9: note: cast one or both operands to int to silence this warning
drivers/soc/tegra/fuse/speedo-tegra20.c:87:9: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
                reg = tegra_fuse_read_spare(i) |
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
                                               ||
drivers/soc/tegra/fuse/speedo-tegra20.c:87:9: note: cast one or both operands to int to silence this warning
2 warnings generated.

The motivation for the warning is that logical operations short circuit
while bitwise operations do not.

In this instance, tegra_fuse_read_spare() is not semantically returning
a boolean, it is returning a bit value. Use u32 for its return type so
that it can be used with either bitwise or boolean operators without any
warnings.

Fixes: 25cd5a391478 ("ARM: tegra: Add speedo-based process identification")
Link: https://github.com/ClangBuiltLinux/linux/issues/1488
Suggested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
3 years agobtrfs: fix missing blkdev_put() call in btrfs_scan_one_device()
Shin'ichiro Kawasaki [Wed, 15 Dec 2021 10:38:43 +0000 (19:38 +0900)]
btrfs: fix missing blkdev_put() call in btrfs_scan_one_device()

The function btrfs_scan_one_device() calls blkdev_get_by_path() and
blkdev_put() to get and release its target block device. However, when
btrfs_sb_log_location_bdev() fails, blkdev_put() is not called and the
block device is left without clean up. This triggered failure of fstests
generic/085. Fix the failure path of btrfs_sb_log_location_bdev() to
call blkdev_put().

Fixes: 12659251ca5df ("btrfs: implement log-structured superblock for ZONED mode")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agobtrfs: fix warning when freeing leaf after subvolume creation failure
Filipe Manana [Mon, 13 Dec 2021 08:45:13 +0000 (08:45 +0000)]
btrfs: fix warning when freeing leaf after subvolume creation failure

When creating a subvolume, at ioctl.c:create_subvol(), if we fail to
insert the root item for the new subvolume into the root tree, we can
trigger the following warning:

[78961.741046] WARNING: CPU: 0 PID: 4079814 at fs/btrfs/extent-tree.c:3357 btrfs_free_tree_block+0x2af/0x310 [btrfs]
[78961.743344] Modules linked in:
[78961.749440]  dm_snapshot dm_thin_pool (...)
[78961.773648] CPU: 0 PID: 4079814 Comm: fsstress Not tainted 5.16.0-rc4-btrfs-next-108 #1
[78961.775198] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[78961.777266] RIP: 0010:btrfs_free_tree_block+0x2af/0x310 [btrfs]
[78961.778398] Code: 17 00 48 85 (...)
[78961.781067] RSP: 0018:ffffaa4001657b28 EFLAGS: 00010202
[78961.781877] RAX: 0000000000000213 RBX: ffff897f8a796910 RCX: 0000000000000000
[78961.782780] RDX: 0000000000000000 RSI: 0000000011004000 RDI: 00000000ffffffff
[78961.783764] RBP: ffff8981f490e800 R08: 0000000000000001 R09: 0000000000000000
[78961.784740] R10: 0000000000000000 R11: 0000000000000001 R12: ffff897fc963fcc8
[78961.785665] R13: 0000000000000001 R14: ffff898063548000 R15: ffff898063548000
[78961.786620] FS:  00007f31283c6b80(0000) GS:ffff8982ace00000(0000) knlGS:0000000000000000
[78961.787717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[78961.788598] CR2: 00007f31285c3000 CR3: 000000023fcc8003 CR4: 0000000000370ef0
[78961.789568] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[78961.790585] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[78961.791684] Call Trace:
[78961.792082]  <TASK>
[78961.792359]  create_subvol+0x5d1/0x9a0 [btrfs]
[78961.793054]  btrfs_mksubvol+0x447/0x4c0 [btrfs]
[78961.794009]  ? preempt_count_add+0x49/0xa0
[78961.794705]  __btrfs_ioctl_snap_create+0x123/0x190 [btrfs]
[78961.795712]  ? _copy_from_user+0x66/0xa0
[78961.796382]  btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs]
[78961.797392]  btrfs_ioctl+0xd1e/0x35c0 [btrfs]
[78961.798172]  ? __slab_free+0x10a/0x360
[78961.798820]  ? rcu_read_lock_sched_held+0x12/0x60
[78961.799664]  ? lock_release+0x223/0x4a0
[78961.800321]  ? lock_acquired+0x19f/0x420
[78961.800992]  ? rcu_read_lock_sched_held+0x12/0x60
[78961.801796]  ? trace_hardirqs_on+0x1b/0xe0
[78961.802495]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
[78961.803358]  ? kmem_cache_free+0x321/0x3c0
[78961.804071]  ? __x64_sys_ioctl+0x83/0xb0
[78961.804711]  __x64_sys_ioctl+0x83/0xb0
[78961.805348]  do_syscall_64+0x3b/0xc0
[78961.805969]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[78961.806830] RIP: 0033:0x7f31284bc957
[78961.807517] Code: 3c 1c 48 f7 d8 (...)

This is because we are calling btrfs_free_tree_block() on an extent
buffer that is dirty. Fix that by cleaning the extent buffer, with
btrfs_clean_tree_block(), before freeing it.

This was triggered by test case generic/475 from fstests.

Fixes: 67addf29004c5b ("btrfs: fix metadata extent leak after failure to create subvolume")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agobtrfs: fix invalid delayed ref after subvolume creation failure
Filipe Manana [Mon, 13 Dec 2021 08:45:12 +0000 (08:45 +0000)]
btrfs: fix invalid delayed ref after subvolume creation failure

When creating a subvolume, at ioctl.c:create_subvol(), if we fail to
insert the new root's root item into the root tree, we are freeing the
metadata extent we reserved for the new root to prevent a metadata
extent leak, as we don't abort the transaction at that point (since
there is nothing at that point that is irreversible).

However we allocated the metadata extent for the new root which we are
creating for the new subvolume, so its delayed reference refers to the
ID of this new root. But when we free the metadata extent we pass the
root of the subvolume where the new subvolume is located to
btrfs_free_tree_block() - this is incorrect because this will generate
a delayed reference that refers to the ID of the parent subvolume's root,
and not to ID of the new root.

This results in a failure when running delayed references that leads to
a transaction abort and a trace like the following:

[3868.738042] RIP: 0010:__btrfs_free_extent+0x709/0x950 [btrfs]
[3868.739857] Code: 68 0f 85 e6 fb ff (...)
[3868.742963] RSP: 0018:ffffb0e9045cf910 EFLAGS: 00010246
[3868.743908] RAX: 00000000fffffffe RBX: 00000000fffffffe RCX: 0000000000000002
[3868.745312] RDX: 00000000fffffffe RSI: 0000000000000002 RDI: ffff90b0cd793b88
[3868.746643] RBP: 000000000e5d8000 R08: 0000000000000000 R09: ffff90b0cd793b88
[3868.747979] R10: 0000000000000002 R11: 00014ded97944d68 R12: 0000000000000000
[3868.749373] R13: ffff90b09afe4a28 R14: 0000000000000000 R15: ffff90b0cd793b88
[3868.750725] FS:  00007f281c4a8b80(0000) GS:ffff90b3ada00000(0000) knlGS:0000000000000000
[3868.752275] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3868.753515] CR2: 00007f281c6a5000 CR3: 0000000108a42006 CR4: 0000000000370ee0
[3868.754869] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[3868.756228] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[3868.757803] Call Trace:
[3868.758281]  <TASK>
[3868.758655]  ? btrfs_merge_delayed_refs+0x178/0x1c0 [btrfs]
[3868.759827]  __btrfs_run_delayed_refs+0x2b1/0x1250 [btrfs]
[3868.761047]  btrfs_run_delayed_refs+0x86/0x210 [btrfs]
[3868.762069]  ? lock_acquired+0x19f/0x420
[3868.762829]  btrfs_commit_transaction+0x69/0xb20 [btrfs]
[3868.763860]  ? _raw_spin_unlock+0x29/0x40
[3868.764614]  ? btrfs_block_rsv_release+0x1c2/0x1e0 [btrfs]
[3868.765870]  create_subvol+0x1d8/0x9a0 [btrfs]
[3868.766766]  btrfs_mksubvol+0x447/0x4c0 [btrfs]
[3868.767669]  ? preempt_count_add+0x49/0xa0
[3868.768444]  __btrfs_ioctl_snap_create+0x123/0x190 [btrfs]
[3868.769639]  ? _copy_from_user+0x66/0xa0
[3868.770391]  btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs]
[3868.771495]  btrfs_ioctl+0xd1e/0x35c0 [btrfs]
[3868.772364]  ? __slab_free+0x10a/0x360
[3868.773198]  ? rcu_read_lock_sched_held+0x12/0x60
[3868.774121]  ? lock_release+0x223/0x4a0
[3868.774863]  ? lock_acquired+0x19f/0x420
[3868.775634]  ? rcu_read_lock_sched_held+0x12/0x60
[3868.776530]  ? trace_hardirqs_on+0x1b/0xe0
[3868.777373]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
[3868.778280]  ? kmem_cache_free+0x321/0x3c0
[3868.779011]  ? __x64_sys_ioctl+0x83/0xb0
[3868.779718]  __x64_sys_ioctl+0x83/0xb0
[3868.780387]  do_syscall_64+0x3b/0xc0
[3868.781059]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[3868.781953] RIP: 0033:0x7f281c59e957
[3868.782585] Code: 3c 1c 48 f7 d8 4c (...)
[3868.785867] RSP: 002b:00007ffe1f83e2b8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[3868.787198] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f281c59e957
[3868.788450] RDX: 00007ffe1f83e2c0 RSI: 0000000050009418 RDI: 0000000000000003
[3868.789748] RBP: 00007ffe1f83f300 R08: 0000000000000000 R09: 00007ffe1f83fe36
[3868.791214] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[3868.792468] R13: 0000000000000003 R14: 00007ffe1f83e2c0 R15: 00000000000003cc
[3868.793765]  </TASK>
[3868.794037] irq event stamp: 0
[3868.794548] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[3868.795670] hardirqs last disabled at (0): [<ffffffff98294214>] copy_process+0x934/0x2040
[3868.797086] softirqs last  enabled at (0): [<ffffffff98294214>] copy_process+0x934/0x2040
[3868.798309] softirqs last disabled at (0): [<0000000000000000>] 0x0
[3868.799284] ---[ end trace be24c7002fe27747 ]---
[3868.799928] BTRFS info (device dm-0): leaf 241188864 gen 1268 total ptrs 214 free space 469 owner 2
[3868.801133] BTRFS info (device dm-0): refs 2 lock_owner 225627 current 225627
[3868.802056]  item 0 key (237436928 169 0) itemoff 16250 itemsize 33
[3868.802863]          extent refs 1 gen 1265 flags 2
[3868.803447]          ref#0: tree block backref root 1610
(...)
[3869.064354]  item 114 key (241008640 169 0) itemoff 12488 itemsize 33
[3869.065421]          extent refs 1 gen 1268 flags 2
[3869.066115]          ref#0: tree block backref root 1689
(...)
[3869.403834] BTRFS error (device dm-0): unable to find ref byte nr 241008640 parent 0 root 1622  owner 0 offset 0
[3869.405641] BTRFS: error (device dm-0) in __btrfs_free_extent:3076: errno=-2 No such entry
[3869.407138] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2159: errno=-2 No such entry

Fix this by passing the new subvolume's root ID to btrfs_free_tree_block().
This requires changing the root argument of btrfs_free_tree_block() from
struct btrfs_root * to a u64, since at this point during the subvolume
creation we have not yet created the struct btrfs_root for the new
subvolume, and btrfs_free_tree_block() only needs a root ID and nothing
else from a struct btrfs_root.

This was triggered by test case generic/475 from fstests.

Fixes: 67addf29004c5b ("btrfs: fix metadata extent leak after failure to create subvolume")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>