Vlad Buslov [Tue, 24 Oct 2023 19:58:57 +0000 (21:58 +0200)]
net/sched: act_ct: additional checks for outdated flows
Current nf_flow_is_outdated() implementation considers any flow table flow
which state diverged from its underlying CT connection status for teardown
which can be problematic in the following cases:
- Flow has never been offloaded to hardware in the first place either
because flow table has hardware offload disabled (flag
NF_FLOWTABLE_HW_OFFLOAD is not set) or because it is still pending on 'add'
workqueue to be offloaded for the first time. The former is incorrect, the
later generates excessive deletions and additions of flows.
- Flow is already pending to be updated on the workqueue. Tearing down such
flows will also generate excessive removals from the flow table, especially
on highly loaded system where the latency to re-offload a flow via 'add'
workqueue can be quite high.
When considering a flow for teardown as outdated verify that it is both
offloaded to hardware and doesn't have any pending updates.
Fixes:
41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 24 Oct 2023 19:09:47 +0000 (21:09 +0200)]
netfilter: flowtable: GC pushes back packets to classic path
Since
41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded
unreplied tuple"), flowtable GC pushes back flows with IPS_SEEN_REPLY
back to classic path in every run, ie. every second. This is because of
a new check for NF_FLOW_HW_ESTABLISHED which is specific of sched/act_ct.
In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED never gets set on
and IPS_SEEN_REPLY is unreliable since users decide when to offload the
flow before, such bit might be set on at a later stage.
Fix it by adding a custom .gc handler that sched/act_ct can use to
deal with its NF_FLOW_HW_ESTABLISHED bit.
Fixes:
41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
Reported-by: Vladimir Smelhaus <vl.sm@email.cz>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Aneesh Kumar K.V [Tue, 24 Oct 2023 14:36:04 +0000 (20:06 +0530)]
powerpc/mm: Avoid calling arch_enter/leave_lazy_mmu() in set_ptes
With commit
9fee28baa601 ("powerpc: implement the new page table range
API") we added set_ptes to powerpc architecture. The implementation
included calling arch_enter/leave_lazy_mmu() calls.
The patch removes the usage of arch_enter/leave_lazy_mmu() because
set_pte is not supposed to be used when updating a pte entry. Powerpc
architecture uses this rule to skip the expensive tlb invalidate which
is not needed when you are setting up the pte for the first time. See
commit
56eecdb912b5 ("mm: Use ptep/pmdp_set_numa() for updating
_PAGE_NUMA bit") for more details
The patch also makes sure we are not using the interface to update a
valid/present pte entry by adding VM_WARN_ON check all the ptes we
are setting up. Furthermore, we add a comment to set_pte_filter to
clarify it can only update folio-related flags and cannot filter
pfn specific details in pte filtering.
Removal of arch_enter/leave_lazy_mmu() also will avoid nesting of
these functions that are not supported. For ex:
remap_pte_range()
-> arch_enter_lazy_mmu()
-> set_ptes()
-> arch_enter_lazy_mmu()
-> arch_leave_lazy_mmu()
-> arch_leave_lazy_mmu()
Fixes:
9fee28baa601 ("powerpc: implement the new page table range API")
Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231024143604.16749-1-aneesh.kumar@linux.ibm.com
Ivan Vecera [Mon, 23 Oct 2023 21:27:14 +0000 (14:27 -0700)]
i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR
The I40E_TXR_FLAGS_WB_ON_ITR is i40e_ring flag and not i40e_pf one.
Fixes:
8e0764b4d6be42 ("i40e/i40evf: Add support for writeback on ITR feature for X722")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023212714.178032-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Oct 2023 20:10:53 +0000 (13:10 -0700)]
Merge tag 'wireless-2023-10-24' of git://git./linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Three more fixes:
- don't drop all unprotected public action frames since
some don't have a protected dual
- fix pointer confusion in scanning code
- fix warning in some connections with multiple links
* tag 'wireless-2023-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: don't drop all unprotected public action frames
wifi: cfg80211: fix assoc response warning on failed links
wifi: cfg80211: pass correct pointer to rdev_inform_bss()
====================
Link: https://lore.kernel.org/r/20231024103540.19198-2-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Tue, 24 Oct 2023 19:52:16 +0000 (09:52 -1000)]
Merge tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 12 are cc:stable and the remainder address post-6.5
issues or aren't considered necessary for earlier kernel versions"
* tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()
selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier
mailmap: correct email aliasing for Oleksij Rempel
mailmap: map Bartosz's old address to the current one
mm/damon/sysfs: check DAMOS regions update progress from before_terminate()
MAINTAINERS: Ondrej has moved
kasan: disable kasan_non_canonical_hook() for HW tags
kasan: print the original fault addr when access invalid shadow
hugetlbfs: close race between MADV_DONTNEED and page fault
hugetlbfs: extend hugetlb_vma_lock to private VMAs
hugetlbfs: clear resv_map pointer if mmap fails
mm: zswap: fix pool refcount bug around shrink_worker()
mm/migrate: fix do_pages_move for compat pointers
riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set
riscv: handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of panicking
mmap: fix error paths with dup_anon_vma()
mmap: fix vma_iterator in error path of vma_merge()
mm: fix vm_brk_flags() to not bail out while holding lock
mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer
mm/page_alloc: correct start page when guard page debug is enabled
Jinjie Ruan [Mon, 23 Oct 2023 03:28:57 +0000 (11:28 +0800)]
fpga: Fix memory leak for fpga_region_test_class_find()
fpga_region_class_find() in fpga_region_test_class_find() will call
get_device() if the data is matched, which will increment refcount for
dev->kobj, so it should call put_device() to decrement refcount for
dev->kobj to free the region, because fpga_region_unregister() will call
fpga_region_dev_release() only when the refcount for dev->kobj is zero
but fpga_region_test_init() call device_register() in
fpga_region_register_full(), which also increment refcount.
So call put_device() after calling fpga_region_class_find() in
fpga_region_test_class_find(). After applying this patch, the following
memory leak is never detected.
unreferenced object 0xffff88810c8ef000 (size 1024):
comm "kunit_try_catch", pid 1875, jiffies
4294715298 (age 836.836s)
hex dump (first 32 bytes):
b8 d1 fb 05 81 88 ff ff 08 f0 8e 0c 81 88 ff ff ................
08 f0 8e 0c 81 88 ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<
ffffffff817ebad7>] kmalloc_trace+0x27/0xa0
[<
ffffffffa02385e1>] fpga_region_register_full+0x51/0x430 [fpga_region]
[<
ffffffffa0228e47>] 0xffffffffa0228e47
[<
ffffffff829c479d>] kunit_try_run_case+0xdd/0x250
[<
ffffffff829c9f2a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
[<
ffffffff81238b85>] kthread+0x2b5/0x380
[<
ffffffff81097ded>] ret_from_fork+0x2d/0x70
[<
ffffffff810034d1>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff888105fbd1b8 (size 8):
comm "kunit_try_catch", pid 1875, jiffies
4294715298 (age 836.836s)
hex dump (first 8 bytes):
72 65 67 69 6f 6e 30 00 region0.
backtrace:
[<
ffffffff817ec023>] __kmalloc_node_track_caller+0x53/0x150
[<
ffffffff82995590>] kvasprintf+0xb0/0x130
[<
ffffffff83f713b1>] kobject_set_name_vargs+0x41/0x110
[<
ffffffff8304ac1b>] dev_set_name+0xab/0xe0
[<
ffffffffa02388a2>] fpga_region_register_full+0x312/0x430 [fpga_region]
[<
ffffffffa0228e47>] 0xffffffffa0228e47
[<
ffffffff829c479d>] kunit_try_run_case+0xdd/0x250
[<
ffffffff829c9f2a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
[<
ffffffff81238b85>] kthread+0x2b5/0x380
[<
ffffffff81097ded>] ret_from_fork+0x2d/0x70
[<
ffffffff810034d1>] ret_from_fork_asm+0x11/0x20
unreferenced object 0xffff88810b3b8a00 (size 256):
comm "kunit_try_catch", pid 1875, jiffies
4294715298 (age 836.836s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 08 8a 3b 0b 81 88 ff ff ..........;.....
08 8a 3b 0b 81 88 ff ff e0 ac 04 83 ff ff ff ff ..;.............
backtrace:
[<
ffffffff817ebad7>] kmalloc_trace+0x27/0xa0
[<
ffffffff83056d7a>] device_add+0xa2a/0x15e0
[<
ffffffffa02388b1>] fpga_region_register_full+0x321/0x430 [fpga_region]
[<
ffffffffa0228e47>] 0xffffffffa0228e47
[<
ffffffff829c479d>] kunit_try_run_case+0xdd/0x250
[<
ffffffff829c9f2a>] kunit_generic_run_threadfn_adapter+0x4a/0x90
[<
ffffffff81238b85>] kthread+0x2b5/0x380
[<
ffffffff81097ded>] ret_from_fork+0x2d/0x70
[<
ffffffff810034d1>] ret_from_fork_asm+0x11/0x20
Fixes:
64a5f972c93d ("fpga: add an initial KUnit suite for the FPGA Region")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Marco Pagani <marpagan@redhat.com>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Link: https://lore.kernel.org/r/20231007094321.3447084-1-ruanjinjie@huawei.com
[yilun.xu@intel.com: slightly changes the commit message]
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Link: https://lore.kernel.org/r/20231023032857.902699-3-yilun.xu@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russ Weight [Mon, 23 Oct 2023 03:28:56 +0000 (11:28 +0800)]
fpga: m10bmc-sec: Change contact for secure update driver
Change the maintainer for the Intel MAX10 BMC Secure Update driver from
Russ Weight to Peter Colberg. Update the ABI documentation contact
information as well.
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Peter Colberg <peter.colberg@intel.com>
Link: https://lore.kernel.org/r/20230928164753.278684-1-russell.h.weight@intel.com
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Link: https://lore.kernel.org/r/20231023032857.902699-2-yilun.xu@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Umesh Nerlige Ramappa [Wed, 2 Aug 2023 20:28:54 +0000 (13:28 -0700)]
drm/i915/perf: Determine context valid in OA reports
When supporting OA for TGL, it was seen that the context valid bit in
the report ID was not defined, however revisiting the spec seems to have
this bit defined. The bit is used to determine if a context is valid on
a context switch and is essential to determine active and idle periods
for a context. Re-enable the context valid bit for gen12 platforms.
BSpec: 52196 (description of report_id)
v2: Include BSpec reference (Ashutosh)
Fixes:
00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230802202854.1224547-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit
7eeaedf79989a8f131939782832e21e9218ed2a0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Peter Zijlstra [Tue, 24 Oct 2023 09:42:21 +0000 (11:42 +0200)]
perf/core: Fix potential NULL deref
Smatch is awesome.
Fixes:
32671e3799ca ("perf: Disallow mis-matched inherited group reads")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Paolo Abeni [Tue, 24 Oct 2023 10:02:03 +0000 (12:02 +0200)]
Merge branch 'gtp-tunnel-driver-fixes'
Pablo Neira Ayuso says:
====================
GTP tunnel driver fixes
The following patchset contains two fixes for the GTP tunnel driver:
1) Incorrect GTPA_MAX definition in UAPI headers. This is updating an
existing UAPI definition but for a good reason, this is certainly
broken. Similar fixes for incorrect _MAX definition in netlink
headers were applied in the past too.
2) Fix GTP driver PMTU with GRO packets, add missing call to
skb_gso_validate_network_len() to handle GRO packets.
====================
Link: https://lore.kernel.org/r/20231022202519.659526-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pablo Neira Ayuso [Sun, 22 Oct 2023 20:25:18 +0000 (22:25 +0200)]
gtp: fix fragmentation needed check with gso
Call skb_gso_validate_network_len() to check if packet is over PMTU.
Fixes:
459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pablo Neira Ayuso [Sun, 22 Oct 2023 20:25:17 +0000 (22:25 +0200)]
gtp: uapi: fix GTPA_MAX
Subtract one to __GTPA_MAX, otherwise GTPA_MAX is off by 2.
Fixes:
459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Anjali Kulkarni [Fri, 20 Oct 2023 23:40:58 +0000 (16:40 -0700)]
Fix NULL pointer dereference in cn_filter()
Check that sk_user_data is not NULL, else return from cn_filter().
Could not reproduce this issue, but Oliver Sang verified it has fixed
the "Closes" problem below.
Fixes:
2aa1f7a1f47c ("connector/cn_proc: Add filtering to fix some bugs")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/
202309201456.
84c19e27-oliver.sang@intel.com/
Signed-off-by: Anjali Kulkarni <anjali.k.kulkarni@oracle.com>
Link: https://lore.kernel.org/r/20231020234058.2232347-1-anjali.k.kulkarni@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Tue, 24 Oct 2023 06:40:04 +0000 (20:40 -1000)]
Merge tag 'pull-nfsd-fix' of git://git./linux/kernel/git/viro/vfs
Pull nfsd fix from Al Viro:
"Catch from lock_rename() audit; nfsd_rename() checked that both
directories belonged to the same filesystem, but only after having
done lock_rename().
Trivial fix, tested and acked by nfs folks"
* tag 'pull-nfsd-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
nfsd: lock_rename() needs both directories to live on the same fs
Linus Torvalds [Tue, 24 Oct 2023 00:19:11 +0000 (14:19 -1000)]
Merge tag 'urgent/nolibc.2023.10.16a' of git://git./linux/kernel/git/paulmck/linux-rcu
Pull nolibc fixes from Paul McKenney:
- tools/nolibc: i386: Fix a stack misalign bug on _start
- MAINTAINERS: nolibc: update tree location
- tools/nolibc: mark start_c as weak to avoid linker errors
* tag 'urgent/nolibc.2023.10.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
tools/nolibc: mark start_c as weak
MAINTAINERS: nolibc: update tree location
tools/nolibc: i386: Fix a stack misalign bug on _start
Pieter Jansen van Vuuren [Fri, 20 Oct 2023 14:01:49 +0000 (15:01 +0100)]
sfc: cleanup and reduce netlink error messages
Reduce the length of netlink error messages as they are likely to be
truncated anyway. Additionally, reword netlink error messages so they
are more consistent with previous messages.
Fixes:
9dbc8d2b9a02 ("sfc: add decrement ipv6 hop limit by offloading set hop limit actions")
Fixes:
3c9561c0a5b9 ("sfc: support TC decap rules matching on enc_ip_tos")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202310202136.4u7bv0hp-lkp@intel.com/
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20231020140149.30490-1-pieter.jansen-van-vuuren@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arnd Bergmann [Mon, 23 Oct 2023 19:01:45 +0000 (21:01 +0200)]
Merge tag 'mvebu-fixes-6.6-1' of git://git./linux/kernel/git/gclement/mvebu into arm/fixes
mvebu fixes for 6.6 (part 1)
Update MAINTAINERS for eDPU board
* tag 'mvebu-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
MAINTAINERS: uDPU: add remaining Methode boards
MAINTAINERS: uDPU: make myself maintainer of it
Link: https://lore.kernel.org/r/875y32abqe.fsf@BL-laptop
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Mon, 23 Oct 2023 17:59:13 +0000 (07:59 -1000)]
Merge tag 'for-6.6-rc7-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more fix for a problem with snapshot of a newly created subvolume
that can lead to inconsistent data under some circumstances. Kernel
6.5 added a performance optimization to skip transaction commit for
subvolume creation but this could end up with newer data on disk but
not linked to other structures.
The fix itself is an added condition, the rest of the patch is a
parameter added to several functions"
* tag 'for-6.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix unwritten extent buffer after snapshotting a new subvolume
Linus Torvalds [Mon, 23 Oct 2023 17:42:48 +0000 (07:42 -1000)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"A collection of small fixes that look like worth having in this
release"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_pci: fix the common cfg map size
virtio-crypto: handle config changed by work queue
vhost: Allow null msg.size on VHOST_IOTLB_INVALIDATE
vdpa/mlx5: Fix firmware error on creation of 1k VQs
virtio_balloon: Fix endless deflation and inflation on arm64
vdpa/mlx5: Fix double release of debugfs entry
virtio-mmio: fix memory leak of vm_dev
vdpa_sim_blk: Fix the potential leak of mgmt_dev
tools/virtio: Add dma sync api for virtio test
Moritz Wanzenböck [Thu, 19 Oct 2023 12:58:47 +0000 (14:58 +0200)]
net/handshake: fix file ref count in handshake_nl_accept_doit()
If req->hr_proto->hp_accept() fail, we call fput() twice:
Once in the error path, but also a second time because sock->file
is at that point already associated with the file descriptor. Once
the task exits, as it would probably do after receiving an error
reading from netlink, the fd is closed, calling fput() a second time.
To fix, we move installing the file after the error path for the
hp_accept() call. In the case of errors we simply put the unused fd.
In case of success we can use fd_install() to link the sock->file
to the reserved fd.
Fixes:
7ea9c1ec66bc ("net/handshake: Fix handshake_dup() ref counting")
Signed-off-by: Moritz Wanzenböck <moritz.wanzenboeck@linbit.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/20231019125847.276443-1-moritz.wanzenboeck@linbit.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Filipe Manana [Thu, 19 Oct 2023 12:19:28 +0000 (13:19 +0100)]
btrfs: fix unwritten extent buffer after snapshotting a new subvolume
When creating a snapshot of a subvolume that was created in the current
transaction, we can end up not persisting a dirty extent buffer that is
referenced by the snapshot, resulting in IO errors due to checksum failures
when trying to read the extent buffer later from disk. A sequence of steps
that leads to this is the following:
1) At ioctl.c:create_subvol() we allocate an extent buffer, with logical
address
36007936, for the leaf/root of a new subvolume that has an ID
of 291. We mark the extent buffer as dirty, and at this point the
subvolume tree has a single node/leaf which is also its root (level 0);
2) We no longer commit the transaction used to create the subvolume at
create_subvol(). We used to, but that was recently removed in
commit
1b53e51a4a8f ("btrfs: don't commit transaction for every subvol
create");
3) The transaction used to create the subvolume has an ID of 33, so the
extent buffer
36007936 has a generation of 33;
4) Several updates happen to subvolume 291 during transaction 33, several
files created and its tree height changes from 0 to 1, so we end up with
a new root at level 1 and the extent buffer
36007936 is now a leaf of
that new root node, which is extent buffer
36048896.
The commit root remains as
36007936, since we are still at transaction
33;
5) Creation of a snapshot of subvolume 291, with an ID of 292, starts at
ioctl.c:create_snapshot(). This triggers a commit of transaction 33 and
we end up at transaction.c:create_pending_snapshot(), in the critical
section of a transaction commit.
There we COW the root of subvolume 291, which is extent buffer
36048896.
The COW operation returns extent buffer
36048896, since there's no need
to COW because the extent buffer was created in this transaction and it
was not written yet.
The we call btrfs_copy_root() against the root node
36048896. During
this operation we allocate a new extent buffer to turn into the root
node of the snapshot, copy the contents of the root node
36048896 into
this snapshot root extent buffer, set the owner to 292 (the ID of the
snapshot), etc, and then we call btrfs_inc_ref(). This will create a
delayed reference for each leaf pointed by the root node with a
reference root of 292 - this includes a reference for the leaf
36007936.
After that we set the bit BTRFS_ROOT_FORCE_COW in the root's state.
Then we call btrfs_insert_dir_item(), to create the directory entry in
in the tree of subvolume 291 that points to the snapshot. This ends up
needing to modify leaf
36007936 to insert the respective directory
items. Because the bit BTRFS_ROOT_FORCE_COW is set for the root's state,
we need to COW the leaf. We end up at btrfs_force_cow_block() and then
at update_ref_for_cow().
At update_ref_for_cow() we call btrfs_block_can_be_shared() which
returns false, despite the fact the leaf
36007936 is shared - the
subvolume's root and the snapshot's root point to that leaf. The
reason that it incorrectly returns false is because the commit root
of the subvolume is extent buffer
36007936 - it was the initial root
of the subvolume when we created it. So btrfs_block_can_be_shared()
which has the following logic:
int btrfs_block_can_be_shared(struct btrfs_root *root,
struct extent_buffer *buf)
{
if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
buf != root->node && buf != root->commit_root &&
(btrfs_header_generation(buf) <=
btrfs_root_last_snapshot(&root->root_item) ||
btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC)))
return 1;
return 0;
}
Returns false (0) since 'buf' (extent buffer
36007936) matches the
root's commit root.
As a result, at update_ref_for_cow(), we don't check for the number
of references for extent buffer
36007936, we just assume it's not
shared and therefore that it has only 1 reference, so we set the local
variable 'refs' to 1.
Later on, in the final if-else statement at update_ref_for_cow():
static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct extent_buffer *buf,
struct extent_buffer *cow,
int *last_ref)
{
(...)
if (refs > 1) {
(...)
} else {
(...)
btrfs_clear_buffer_dirty(trans, buf);
*last_ref = 1;
}
}
So we mark the extent buffer
36007936 as not dirty, and as a result
we don't write it to disk later in the transaction commit, despite the
fact that the snapshot's root points to it.
Attempting to access the leaf or dumping the tree for example shows
that the extent buffer was not written:
$ btrfs inspect-internal dump-tree -t 292 /dev/sdb
btrfs-progs v6.2.2
file tree key (292 ROOT_ITEM 33)
node
36110336 level 1 items 2 free space 119 generation 33 owner 292
node
36110336 flags 0x1(WRITTEN) backref revision 1
checksum stored
a8103e3e
checksum calced
a8103e3e
fs uuid
90c9a46f-ae9f-4626-9aff-
0cbf3e2e3a79
chunk uuid
e8c9c885-78f4-4d31-85fe-
89e5f5fd4a07
key (256 INODE_ITEM 0) block
36007936 gen 33
key (257 EXTENT_DATA 0) block
36052992 gen 33
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
total bytes
107374182400
bytes used
38572032
uuid
90c9a46f-ae9f-4626-9aff-
0cbf3e2e3a79
The respective on disk region is full of zeroes as the device was
trimmed at mkfs time.
Obviously 'btrfs check' also detects and complains about this:
$ btrfs check /dev/sdb
Opening filesystem to check...
Checking filesystem on /dev/sdb
UUID:
90c9a46f-ae9f-4626-9aff-
0cbf3e2e3a79
generation: 33 (33)
[1/7] checking root items
[2/7] checking extents
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
bad tree block
36007936, bytenr mismatch, want=
36007936, have=0
owner ref check failed [
36007936 4096]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
[4/7] checking fs roots
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
checksum verify failed on
36007936 wanted 0x00000000 found 0x86005f29
bad tree block
36007936, bytenr mismatch, want=
36007936, have=0
The following tree block(s) is corrupted in tree 292:
tree block bytenr:
36110336, level: 1, node key: (256, 1, 0)
root 292 root dir 256 not found
ERROR: errors found in fs roots
found
38572032 bytes used, error(s) found
total csum bytes: 16048
total tree bytes: 1265664
total fs tree bytes: 1118208
total extent tree bytes: 65536
btree space waste bytes: 562598
file data blocks allocated:
65978368
referenced
36569088
Fix this by updating btrfs_block_can_be_shared() to consider that an
extent buffer may be shared if it matches the commit root and if its
generation matches the current transaction's generation.
This can be reproduced with the following script:
$ cat test.sh
#!/bin/bash
MNT=/mnt/sdi
DEV=/dev/sdi
# Use a filesystem with a 64K node size so that we have the same node
# size on every machine regardless of its page size (on x86_64 default
# node size is 16K due to the 4K page size, while on PPC it's 64K by
# default). This way we can make sure we are able to create a btree for
# the subvolume with a height of 2.
mkfs.btrfs -f -n 64K $DEV
mount $DEV $MNT
btrfs subvolume create $MNT/subvol
# Create a few empty files on the subvolume, this bumps its btree
# height to 2 (root node at level 1 and 2 leaves).
for ((i = 1; i <= 300; i++)); do
echo -n > $MNT/subvol/file_$i
done
btrfs subvolume snapshot -r $MNT/subvol $MNT/subvol/snap
umount $DEV
btrfs check $DEV
Running it on a 6.5 kernel (or any 6.6-rc kernel at the moment):
$ ./test.sh
Create subvolume '/mnt/sdi/subvol'
Create a readonly snapshot of '/mnt/sdi/subvol' in '/mnt/sdi/subvol/snap'
Opening filesystem to check...
Checking filesystem on /dev/sdi
UUID:
bbdde2ff-7d02-45ca-8a73-
3c36f23755a1
[1/7] checking root items
[2/7] checking extents
parent transid verify failed on
30539776 wanted 7 found 5
parent transid verify failed on
30539776 wanted 7 found 5
parent transid verify failed on
30539776 wanted 7 found 5
Ignoring transid failure
owner ref check failed [
30539776 65536]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
[4/7] checking fs roots
parent transid verify failed on
30539776 wanted 7 found 5
Ignoring transid failure
Wrong key of child node/leaf, wanted: (256, 1, 0), have: (2, 132, 0)
Wrong generation of child node/leaf, wanted: 5, have: 7
root 257 root dir 256 not found
ERROR: errors found in fs roots
found 917504 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 851968
total fs tree bytes: 393216
total extent tree bytes: 65536
btree space waste bytes: 736550
file data blocks allocated: 0
referenced 0
A test case for fstests will follow soon.
Fixes:
1b53e51a4a8f ("btrfs: don't commit transaction for every subvol create")
CC: stable@vger.kernel.org # 6.5+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Christian König [Fri, 20 Oct 2023 12:28:56 +0000 (14:28 +0200)]
drm/amdkfd: reserve a fence slot while locking the BO
Looks like the KFD still needs this.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes:
8abc1eb2987a ("drm/amdkfd: switch over to using drm_exec v3")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231020123306.43978-1-christian.koenig@amd.com
Michael Ellerman [Mon, 23 Oct 2023 11:25:00 +0000 (22:25 +1100)]
powerpc/mm: Fix boot crash with FLATMEM
Erhard reported that his G5 was crashing with v6.6-rc kernels:
mpic: Setting up HT PICs workarounds for U3/U4
BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
Faulting instruction address: 0xc00000000005dc40
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G T 6.6.0-rc3-PMacGS #1
Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
NIP:
c00000000005dc40 LR:
c000000000066660 CTR:
c000000000007730
REGS:
c0000000022bf510 TRAP: 0380 Tainted: G T (6.6.0-rc3-PMacGS)
MSR:
9000000000001032 <SF,HV,ME,IR,DR,RI> CR:
44004242 XER:
00000000
IRQMASK: 3
GPR00:
0000000000000000 c0000000022bf7b0 c0000000010c0b00 00000000000001ac
GPR04:
0000000003c80000 0000000000000300 c0000000f20001ae 0000000000000300
GPR08:
0000000000000006 feffbb62ffec65ff 0000000000000001 0000000000000000
GPR12:
9000000000001032 c000000002362000 c000000000f76b80 000000000349ecd8
GPR16:
0000000002367ba8 0000000002367f08 0000000000000006 0000000000000000
GPR20:
00000000000001ac c000000000f6f920 c0000000022cd985 000000000000000c
GPR24:
0000000000000300 00000003b0a3691d c0003e008030000e 0000000000000000
GPR28:
c00000000000000c c0000000f20001ee feffbb62ffec65fe 00000000000001ac
NIP hash_page_do_lazy_icache+0x50/0x100
LR __hash_page_4K+0x420/0x590
Call Trace:
hash_page_mm+0x364/0x6f0
do_hash_fault+0x114/0x2b0
data_access_common_virt+0x198/0x1f0
--- interrupt: 300 at mpic_init+0x4bc/0x10c4
NIP:
c000000002020a5c LR:
c000000002020a04 CTR:
0000000000000000
REGS:
c0000000022bf9f0 TRAP: 0300 Tainted: G T (6.6.0-rc3-PMacGS)
MSR:
9000000000001032 <SF,HV,ME,IR,DR,RI> CR:
24004248 XER:
00000000
DAR:
c0003e008030000e DSISR:
40000000 IRQMASK: 1
...
NIP mpic_init+0x4bc/0x10c4
LR mpic_init+0x464/0x10c4
--- interrupt: 300
pmac_setup_one_mpic+0x258/0x2dc
pmac_pic_init+0x28c/0x3d8
init_IRQ+0x90/0x140
start_kernel+0x57c/0x78c
start_here_common+0x1c/0x20
A bisect pointed to the breakage beginning with commit
9fee28baa601 ("powerpc:
implement the new page table range API").
Analysis of the oops pointed to a struct page with a corrupted
compound_head being loaded via page_folio() -> _compound_head() in
hash_page_do_lazy_icache().
The access by the mpic code is to an MMIO address, so the expectation
is that the struct page for that address would be initialised by
init_unavailable_range(), as pointed out by Aneesh.
Instrumentation showed that was not the case, which eventually lead to
the realisation that pfn_valid() was returning false for that address,
causing the struct page to not be initialised.
Because the system is using FLATMEM, the version of pfn_valid() in
memory_model.h is used:
static inline int pfn_valid(unsigned long pfn)
{
...
return pfn >= pfn_offset && (pfn - pfn_offset) < max_mapnr;
}
Which relies on max_mapnr being initialised. Early in boot max_mapnr is
zero meaning no PFNs are valid.
max_mapnr is initialised in mem_init() called via:
start_kernel()
mm_core_init() # init/main.c:928
mem_init()
But that is too late for the usage in init_unavailable_range() called via:
start_kernel()
setup_arch() # init/main.c:893
paging_init()
free_area_init()
init_unavailable_range()
Although max_mapnr is currently set in mem_init(), the value is actually
already available much earlier, as soon as mem_topology_setup() has
completed, which is also before paging_init() is called. So move the
initialisation there, which causes paging_init() to correctly initialise
the struct page and fixes the bug.
This bug seems to have been lurking for years, but went unnoticed
because the pre-folio code was inspecting the uninitialised page->flags
but not dereferencing it.
Thanks to Erhard and Aneesh for help debugging.
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://lore.kernel.org/all/
20230929132750.
3cd98452@yea/
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231023112500.1550208-1-mpe@ellerman.id.au
Avraham Stern [Mon, 16 Oct 2023 11:52:48 +0000 (14:52 +0300)]
wifi: mac80211: don't drop all unprotected public action frames
Not all public action frames have a protected variant. When MFP is
enabled drop only public action frames that have a dual protected
variant.
Fixes:
76a3059cf124 ("wifi: mac80211: drop some unprotected action frames")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20231016145213.2973e3c8d3bb.I6198b8d3b04cf4a97b06660d346caec3032f232a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 18 Oct 2023 09:42:51 +0000 (11:42 +0200)]
wifi: cfg80211: fix assoc response warning on failed links
The warning here shouldn't be done before we even set the
bss field (or should've used the input data). Move the
assignment before the warning to fix it.
We noticed this now because of Wen's bugfix, where the bug
fixed there had previously hidden this other bug.
Fixes:
53ad07e9823b ("wifi: cfg80211: support reporting failed links")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ben Greear [Sat, 21 Oct 2023 15:48:27 +0000 (08:48 -0700)]
wifi: cfg80211: pass correct pointer to rdev_inform_bss()
Confusing struct member names here resulted in passing
the wrong pointer, causing crashes. Pass the correct one.
Fixes:
eb142608e2c4 ("wifi: cfg80211: use a struct for inform_single_bss data")
Signed-off-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20231021154827.1142734-1-greearb@candelatech.com
[rewrite commit message, add fixes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Kunwu Chan [Mon, 23 Oct 2023 06:37:58 +0000 (14:37 +0800)]
isdn: mISDN: hfcsusb: Spelling fix in comment
protocoll -> protocol
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 22 Oct 2023 22:11:21 +0000 (12:11 -1000)]
Linux 6.6-rc7
Linus Torvalds [Sun, 22 Oct 2023 17:11:10 +0000 (07:11 -1000)]
Merge tag 'phy-fixes-6.6' of git://git./linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
- mapphone-mdm6600 runtime pm & pinctrl handling fixes
- Qualcomm qmp usb pcs register fixes, qmp pcie register size warning
fix, m31 fixes for wrong pointer in PTR_ERR and dropping wrong vreg
check, qmp combo fix for 8550 power config register
- realtek usb fix for debugfs_create_dir() and kconfig dependency
* tag 'phy-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: realtek: Realtek PHYs should depend on ARCH_REALTEK
phy: qualcomm: Fix typos in comments
phy: qcom-qmp-combo: initialize PCS_USB registers
phy: qcom-qmp-combo: Square out 8550 POWER_STATE_CONFIG1
phy: qcom: m31: Remove unwanted qphy->vreg is NULL check
phy: realtek: usb: Drop unnecessary error check for debugfs_create_dir()
phy: qcom: phy-qcom-m31: change m31_ipq5332_regs to static
phy: qcom: phy-qcom-m31: fix wrong pointer pass to PTR_ERR()
dt-bindings: phy: qcom,ipq8074-qmp-pcie: fix warning regarding reg size
phy: qcom-qmp-usb: split PCS_USB init table for sc8280xp and sa8775p
phy: qcom-qmp-usb: initialize PCS_USB registers
phy: mapphone-mdm6600: Fix pinctrl_pm handling for sleep pins
phy: mapphone-mdm6600: Fix runtime PM for remove
phy: mapphone-mdm6600: Fix runtime disable on probe
Linus Torvalds [Sun, 22 Oct 2023 17:05:28 +0000 (07:05 -1000)]
Merge tag 'efi-fixes-for-v6.6-3' of git://git./linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
"The boot_params pointer fix uses a somewhat ugly extern struct
declaration but this will be cleaned up the next cycle.
- don't try to print warnings to the console when it is no longer
available
- fix theoretical memory leak in SSDT override handling
- make sure that the boot_params global variable is set before the
KASLR code attempts to hash it for 'randomness'
- avoid soft lockups in the memory acceptance code"
* tag 'efi-fixes-for-v6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/unaccepted: Fix soft lockups caused by parallel memory acceptance
x86/boot: efistub: Assign global boot_params variable
efi: fix memory leak in krealloc failure handling
x86/efistub: Don't try to print after ExitBootService()
Fred Chen [Sat, 21 Oct 2023 00:19:47 +0000 (08:19 +0800)]
tcp: fix wrong RTO timeout when received SACK reneging
This commit fix wrong RTO timeout when received SACK reneging.
When an ACK arrived pointing to a SACK reneging, tcp_check_sack_reneging()
will rearm the RTO timer for min(1/2*srtt, 10ms) into to the future.
But since the commit
62d9f1a6945b ("tcp: fix TLP timer not set when
CA_STATE changes from DISORDER to OPEN") merged, the tcp_set_xmit_timer()
is moved after tcp_fastretrans_alert()(which do the SACK reneging check),
so the RTO timeout will be overwrited by tcp_set_xmit_timer() with
icsk_rto instead of 1/2*srtt.
Here is a packetdrill script to check this bug:
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
// simulate srtt to 100ms
+0 < S 0:0(0) win 32792 <mss 1000, sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
+.1 < . 1:1(0) ack 1 win 1024
+0 accept(3, ..., ...) = 4
+0 write(4, ..., 10000) = 10000
+0 > P. 1:10001(10000) ack 1
// inject sack
+.1 < . 1:1(0) ack 1 win 257 <sack 1001:10001,nop,nop>
+0 > . 1:1001(1000) ack 1
// inject sack reneging
+.1 < . 1:1(0) ack 1001 win 257 <sack 9001:10001,nop,nop>
// we expect rto fired in 1/2*srtt (50ms)
+.05 > . 1001:2001(1000) ack 1
This fix remove the FLAG_SET_XMIT_TIMER from ack_flag when
tcp_check_sack_reneging() set RTO timer with 1/2*srtt to avoid
being overwrited later.
Fixes:
62d9f1a6945b ("tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN")
Signed-off-by: Fred Chen <fred.chenchen03@gmail.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiang Chen [Thu, 19 Oct 2023 13:01:21 +0000 (21:01 +0800)]
ACPI: NFIT: Install Notify() handler before getting NFIT table
If there is no NFIT at startup, it will return 0 immediately in function
acpi_nfit_add() and will not install Notify() handler. If hotplugging
a nvdimm device later, it will not be identified as there is no Notify()
handler.
Install the handler before getting NFI table in function acpi_nfit_add()
to avoid above issue.
Fixes:
dcca12ab62a2 ("ACPI: NFIT: Install Notify() handler directly")
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
David S. Miller [Sun, 22 Oct 2023 10:46:18 +0000 (11:46 +0100)]
Merge branch 'r8152-reg-garbage'
Douglas Anderson says:
====================
r8152: Avoid writing garbage to the adapter's registers
This series is the result of a cooperative debug effort between
Realtek and the ChromeOS team. On ChromeOS, we've noticed that Realtek
Ethernet adapters can sometimes get so wedged that even a reboot of
the host can't get them to enumerate again, assuming that the adapter
was on a powered hub and din't lose power when the host rebooted. This
is sometimes seen in the ChromeOS automated testing lab. The only way
to recover adapters in this state is to manually power cycle them.
I managed to reproduce one instance of this wedging (unknown if this
is truly related to what the test lab sees) by doing this:
1. Start a flood ping from a host to the device.
2. Drop the device into kdb.
3. Wait 90 seconds.
4. Resume from kdb (the "g" command).
5. Wait another 45 seconds.
Upon analysis, Realtek realized this was happening:
1. The Linux driver was getting a "Tx timeout" after resuming from kdb
and then trying to reset itself.
2. As part of the reset, the Linux driver was attempting to do a
read-modify-write of the adapter's registers.
3. The read would fail (due to a timeout) and the driver pretended
that the register contained all 0xFFs. See commit
f53a7ad18959
("r8152: Set memory to all 0xFFs on failed reg reads")
4. The driver would take this value of all 0xFFs, modify it, and
attempt to write it back to the adapter.
5. By this time the USB channel seemed to recover and thus we'd
successfully write a value that was mostly 0xFFs to the adpater.
6. The adapter didn't like this and would wedge itself.
Another Engineer also managed to reproduce wedging of the Realtek
Ethernet adpater during a reboot test on an AMD Chromebook. In that
case he was sometimes seeing -EPIPE returned from the control
transfers.
This patch series fixes both issues.
Changes in v5:
- ("Run the unload routine if we have errors during probe") new for v5.
- ("Cancel hw_phy_work if we have an error in probe") new for v5.
- ("Release firmware if we have an error in probe") new for v5.
- Removed extra mutex_unlock() left over in v4.
- Fixed minor typos.
- Don't do queue an unbind/bind reset if probe fails; just retry probe.
Changes in v4:
- Took out some unnecessary locks/unlocks of the control mutex.
- Added comment about reading version causing probe fail if 3 fails.
- Added text to commit msg about the potential unbind/bind loop.
Changes in v3:
- Fixed v2 changelog ending up in the commit message.
- farmework -> framework in comments.
Changes in v2:
- ("Check for unplug in rtl_phy_patch_request()") new for v2.
- ("Check for unplug in r8153b_ups_en() / r8153c_ups_en()") new for v2.
- ("Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE") new for v2.
- Reset patch no longer based on retry patch, since that was dropped.
- Reset patch should be robust even if failures happen in probe.
- Switched booleans to bits in the "flags" variable.
- Check for -ENODEV instead of "udev->state == USB_STATE_NOTATTACHED"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:59 +0000 (14:06 -0700)]
r8152: Block future register access if register access fails
Even though the functions to read/write registers can fail, most of
the places in the r8152 driver that read/write register values don't
check error codes. The lack of error code checking is problematic in
at least two ways.
The first problem is that the r8152 driver often uses code patterns
similar to this:
x = read_register()
x = x | SOME_BIT;
write_register(x);
...with the above pattern, if the read_register() fails and returns
garbage then we'll end up trying to write modified garbage back to the
Realtek adapter. If the write_register() succeeds that's bad. Note
that as of commit
f53a7ad18959 ("r8152: Set memory to all 0xFFs on
failed reg reads") the "garbage" returned by read_register() will at
least be consistent garbage, but it is still garbage.
It turns out that this problem is very serious. Writing garbage to
some of the hardware registers on the Ethernet adapter can put the
adapter in such a bad state that it needs to be power cycled (fully
unplugged and plugged in again) before it can enumerate again.
The second problem is that the r8152 driver generally has functions
that are long sequences of register writes. Assuming everything will
be OK if a random register write fails in the middle isn't a great
assumption.
One might wonder if the above two problems are real. You could ask if
we would really have a successful write after a failed read. It turns
out that the answer appears to be "yes, this can happen". In fact,
we've seen at least two distinct failure modes where this happens.
On a sc7180-trogdor Chromebook if you drop into kdb for a while and
then resume, you can see:
1. We get a "Tx timeout"
2. The "Tx timeout" queues up a USB reset.
3. In rtl8152_pre_reset() we try to reinit the hardware.
4. The first several (2-9) register accesses fail with a timeout, then
things recover.
The above test case was actually fixed by the patch ("r8152: Increase
USB control msg timeout to 5000ms as per spec") but at least shows
that we really can see successful calls after failed ones.
On a different (AMD) based Chromebook with a particular adapter, we
found that during reboot tests we'd also sometimes get a transitory
failure. In this case we saw -EPIPE being returned sometimes. Retrying
worked, but retrying is not always safe for all register accesses
since reading/writing some registers might have side effects (like
registers that clear on read).
Let's fully lock out all register access if a register access fails.
When we do this, we'll try to queue up a USB reset and try to unlock
register access after the reset. This is slightly tricker than it
sounds since the r8152 driver has an optimized reset sequence that
only works reliably after probe happens. In order to handle this, we
avoid the optimized reset if probe didn't finish. Instead, we simply
retry the probe routine in this case.
When locking out access, we'll use the existing infrastructure that
the driver was using when it detected we were unplugged. This keeps us
from getting stuck in delay loops in some parts of the driver.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:58 +0000 (14:06 -0700)]
r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
Whenever the RTL8152_UNPLUG is set that just tells the driver that all
accesses will fail and we should just immediately bail. A future patch
will use this same concept at a time when the driver hasn't actually
been unplugged but is about to be reset. Rename the flag in
preparation for the future patch.
This is a no-op change and just a search and replace.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:57 +0000 (14:06 -0700)]
r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en()
If the adapter is unplugged while we're looping in r8153b_ups_en() /
r8153c_ups_en() we could end up looping for 10 seconds (20 ms * 500
loops). Add code similar to what's done in other places in the driver
to check for unplug and bail.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:56 +0000 (14:06 -0700)]
r8152: Check for unplug in rtl_phy_patch_request()
If the adapter is unplugged while we're looping in
rtl_phy_patch_request() we could end up looping for 10 seconds (2 ms *
5000 loops). Add code similar to what's done in other places in the
driver to check for unplug and bail.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:55 +0000 (14:06 -0700)]
r8152: Release firmware if we have an error in probe
The error handling in rtl8152_probe() is missing a call to release
firmware. Add it in to match what's in the cleanup code in
rtl8152_disconnect().
Fixes:
9370f2d05a2a ("r8152: support request_firmware for RTL8153")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:54 +0000 (14:06 -0700)]
r8152: Cancel hw_phy_work if we have an error in probe
The error handling in rtl8152_probe() is missing a call to cancel the
hw_phy_work. Add it in to match what's in the cleanup code in
rtl8152_disconnect().
Fixes:
a028a9e003f2 ("r8152: move the settings of PHY to a work queue")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:53 +0000 (14:06 -0700)]
r8152: Run the unload routine if we have errors during probe
The rtl8152_probe() function lacks a call to the chip-specific
unload() routine when it sees an error in probe. Add it in to match
the cleanup code in rtl8152_disconnect().
Fixes:
ac718b69301c ("net/usb: new driver for RTL8152")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Douglas Anderson [Fri, 20 Oct 2023 21:06:52 +0000 (14:06 -0700)]
r8152: Increase USB control msg timeout to 5000ms as per spec
According to the comment next to USB_CTRL_GET_TIMEOUT and
USB_CTRL_SET_TIMEOUT, although sending/receiving control messages is
usually quite fast, the spec allows them to take up to 5 seconds.
Let's increase the timeout in the Realtek driver from 500ms to 5000ms
(using the #defines) to account for this.
This is not just a theoretical change. The need for the longer timeout
was seen in testing. Specifically, if you drop a sc7180-trogdor based
Chromebook into the kdb debugger and then "go" again after sitting in
the debugger for a while, the next USB control message takes a long
time. Out of ~40 tests the slowest USB control message was 4.5
seconds.
While dropping into kdb is not exactly an end-user scenario, the above
is similar to what could happen due to an temporary interrupt storm,
what could happen if there was a host controller (HW or SW) issue, or
what could happen if the Realtek device got into a confused state and
needed time to recover.
This change is fairly critical since the r8152 driver in Linux doesn't
expect register reads/writes (which are backed by USB control
messages) to fail.
Fixes:
ac718b69301c ("net/usb: new driver for RTL8152")
Suggested-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shigeru Yoshida [Fri, 20 Oct 2023 17:03:44 +0000 (02:03 +0900)]
net: usb: smsc95xx: Fix uninit-value access in smsc95xx_read_reg
syzbot reported the following uninit-value access issue [1]:
smsc95xx 1-1:0.0 (unnamed net_device) (uninitialized): Failed to read reg index 0x00000030: -32
smsc95xx 1-1:0.0 (unnamed net_device) (uninitialized): Error reading E2P_CMD
=====================================================
BUG: KMSAN: uninit-value in smsc95xx_reset+0x409/0x25f0 drivers/net/usb/smsc95xx.c:896
smsc95xx_reset+0x409/0x25f0 drivers/net/usb/smsc95xx.c:896
smsc95xx_bind+0x9bc/0x22e0 drivers/net/usb/smsc95xx.c:1131
usbnet_probe+0x100b/0x4060 drivers/net/usb/usbnet.c:1750
usb_probe_interface+0xc75/0x1210 drivers/usb/core/driver.c:396
really_probe+0x506/0xf40 drivers/base/dd.c:658
__driver_probe_device+0x2a7/0x5d0 drivers/base/dd.c:800
driver_probe_device+0x72/0x7b0 drivers/base/dd.c:830
__device_attach_driver+0x55a/0x8f0 drivers/base/dd.c:958
bus_for_each_drv+0x3ff/0x620 drivers/base/bus.c:457
__device_attach+0x3bd/0x640 drivers/base/dd.c:1030
device_initial_probe+0x32/0x40 drivers/base/dd.c:1079
bus_probe_device+0x3d8/0x5a0 drivers/base/bus.c:532
device_add+0x16ae/0x1f20 drivers/base/core.c:3622
usb_set_configuration+0x31c9/0x38c0 drivers/usb/core/message.c:2207
usb_generic_driver_probe+0x109/0x2a0 drivers/usb/core/generic.c:238
usb_probe_device+0x290/0x4a0 drivers/usb/core/driver.c:293
really_probe+0x506/0xf40 drivers/base/dd.c:658
__driver_probe_device+0x2a7/0x5d0 drivers/base/dd.c:800
driver_probe_device+0x72/0x7b0 drivers/base/dd.c:830
__device_attach_driver+0x55a/0x8f0 drivers/base/dd.c:958
bus_for_each_drv+0x3ff/0x620 drivers/base/bus.c:457
__device_attach+0x3bd/0x640 drivers/base/dd.c:1030
device_initial_probe+0x32/0x40 drivers/base/dd.c:1079
bus_probe_device+0x3d8/0x5a0 drivers/base/bus.c:532
device_add+0x16ae/0x1f20 drivers/base/core.c:3622
usb_new_device+0x15f6/0x22f0 drivers/usb/core/hub.c:2589
hub_port_connect drivers/usb/core/hub.c:5440 [inline]
hub_port_connect_change drivers/usb/core/hub.c:5580 [inline]
port_event drivers/usb/core/hub.c:5740 [inline]
hub_event+0x53bc/0x7290 drivers/usb/core/hub.c:5822
process_one_work kernel/workqueue.c:2630 [inline]
process_scheduled_works+0x104e/0x1e70 kernel/workqueue.c:2703
worker_thread+0xf45/0x1490 kernel/workqueue.c:2784
kthread+0x3e8/0x540 kernel/kthread.c:388
ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
Local variable buf.i225 created at:
smsc95xx_read_reg drivers/net/usb/smsc95xx.c:90 [inline]
smsc95xx_reset+0x203/0x25f0 drivers/net/usb/smsc95xx.c:892
smsc95xx_bind+0x9bc/0x22e0 drivers/net/usb/smsc95xx.c:1131
CPU: 1 PID: 773 Comm: kworker/1:2 Not tainted 6.6.0-rc1-syzkaller-00125-ge42bebf6db29 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023
Workqueue: usb_hub_wq hub_event
=====================================================
Similar to
e9c65989920f ("net: usb: smsc75xx: Fix uninit-value access in
__smsc75xx_read_reg"), this issue is caused because usbnet_read_cmd() reads
less bytes than requested (zero byte in the reproducer). In this case,
'buf' is not properly filled.
This patch fixes the issue by returning -ENODATA if usbnet_read_cmd() reads
less bytes than requested.
sysbot reported similar uninit-value access issue [2]. The root cause is
the same as mentioned above, and this patch addresses it as well.
Fixes:
2f7ca802bdae ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver")
Reported-and-tested-by: syzbot+c74c24b43c9ae534f0e0@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+2c97a98a5ba9ea9c23bd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=
c74c24b43c9ae534f0e0 [1]
Closes: https://syzkaller.appspot.com/bug?extid=
2c97a98a5ba9ea9c23bd [2]
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Su Hui [Fri, 20 Oct 2023 09:27:59 +0000 (17:27 +0800)]
net: chelsio: cxgb4: add an error code check in t4_load_phy_fw
t4_set_params_timeout() can return -EINVAL if failed, add check
for this.
Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sat, 21 Oct 2023 18:03:53 +0000 (20:03 +0200)]
net: ieee802154: adf7242: Fix some potential buffer overflow in adf7242_stats_show()
strncat() usage in adf7242_debugfs_init() is wrong.
The size given to strncat() is the maximum number of bytes that can be
written, excluding the trailing NULL.
Here, the size that is passed, DNAME_INLINE_LEN, does not take into account
the size of "adf7242-" that is already in the array.
In order to fix it, use snprintf() instead.
Fixes:
7302b9d90117 ("ieee802154/adf7242: Driver for ADF7242 MAC IEEE802154")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 22 Oct 2023 01:46:47 +0000 (18:46 -0700)]
Merge tag 'powerpc-6.6-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix stale propagated yield_cpu in qspinlocks leading to lockups
- Fix broken hugepages on some configs due to ARCH_FORCE_MAX_ORDER
- Fix a spurious warning when copros are in use at exit time
Thanks to Nicholas Piggin, Christophe Leroy, Nysal Jan K.A Sachin Sant,
and Shrikanth Hegde.
* tag 'powerpc-6.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/qspinlock: Fix stale propagated yield_cpu
powerpc/64s/radix: Don't warn on copros in radix__tlb_flush()
powerpc/mm: Allow ARCH_FORCE_MAX_ORDER up to 12
Luben Tuikov [Wed, 18 Oct 2023 00:56:57 +0000 (20:56 -0400)]
drm/amdgpu: Remove redundant call to priority_is_valid()
Remove a redundant call to amdgpu_ctx_priority_is_valid() from
amdgpu_ctx_priority_permit(), which is called from amdgpu_ctx_init() which is
called from amdgpu_ctx_alloc() which is called from amdgpu_ctx_ioctl(), where
we've called amdgpu_ctx_priority_is_valid() already first thing in the
function.
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231018010359.30393-1-luben.tuikov@amd.com
Linus Torvalds [Sat, 21 Oct 2023 19:59:18 +0000 (12:59 -0700)]
Merge tag 'gpio-fixes-for-v6.6-rc7' of git://git./linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix interrupt handling in suspend and wakeup in gpio-vf610
- fix a bug on setting direction to output in gpio-vf610
- add a missing memset() in gpio ACPI code
* tag 'gpio-fixes-for-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: acpi: Add missing memset(0) to acpi_get_gpiod_from_data()
gpio: vf610: set value before the direction to avoid a glitch
gpio: vf610: mask the gpio irq in system suspend and support wakeup
Linus Torvalds [Sat, 21 Oct 2023 19:54:58 +0000 (12:54 -0700)]
Merge tag 'rust-fixes-6.6' of https://github.com/Rust-for-Linux/linux
Pull rust fixes from Miguel Ojeda:
- GCC build: fix bindgen build error with '-fstrict-flex-arrays'
- Error module: fix the description for 'ECHILD' and fix Markdown
style nit
- Code docs: fix logo replacement
- Docs: update docs output path
- Kbuild: remove old docs output path in 'cleandocs' target
* tag 'rust-fixes-6.6' of https://github.com/Rust-for-Linux/linux:
rust: docs: fix logo replacement
kbuild: remove old Rust docs output path
docs: rust: update Rust docs output path
rust: fix bindgen build error with fstrict-flex-arrays
rust: error: Markdown style nit
rust: error: fix the description for `ECHILD`
Alain Volmat [Tue, 10 Oct 2023 08:44:54 +0000 (10:44 +0200)]
i2c: stm32f7: Fix PEC handling in case of SMBUS transfers
In case of SMBUS byte read with PEC enabled, the whole transfer
is split into two commands. A first write command, followed by
a read command. The write command does not have any PEC byte
and a PEC byte is appended at the end of the read command.
(cf Read byte protocol with PEC in SMBUS specification)
Within the STM32 I2C controller, handling (either sending
or receiving) of the PEC byte is done via the PECBYTE bit in
register CR2.
Currently, the PECBYTE is set at the beginning of a transfer,
which lead to sending a PEC byte at the end of the write command
(hence losing the real last byte), and also does not check the
PEC byte received during the read command.
This patch corrects the function stm32f7_i2c_smbus_xfer_msg
in order to only set the PECBYTE during the read command.
Fixes:
9e48155f6bfe ("i2c: i2c-stm32f7: Add initial SMBus protocols support")
Signed-off-by: Alain Volmat <alain.volmat@foss.st.com>
Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com>
Acked-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Linus Torvalds [Sat, 21 Oct 2023 18:19:07 +0000 (11:19 -0700)]
Merge tag 'sched-urgent-2023-10-21' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
"Fix a recently introduced use-after-free bug"
* tag 'sched-urgent-2023-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/eevdf: Fix heap corruption more
Linus Torvalds [Sat, 21 Oct 2023 18:09:29 +0000 (11:09 -0700)]
Merge tag 'perf-urgent-2023-10-21' of git://git./linux/kernel/git/tip/tip
Pull perf events fix from Ingo Molnar:
"Fix group event semantics"
* tag 'perf-urgent-2023-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Disallow mis-matched inherited group reads
Linus Torvalds [Sat, 21 Oct 2023 18:00:36 +0000 (11:00 -0700)]
Merge tag 'probes-fixes-v6.6-rc6.2' of git://git./linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- kprobe-events: Fix kprobe events to reject if the attached symbol is
not unique name because it may not the function which the user want
to attach to. (User can attach a probe to such symbol using the
nearest unique symbol + offset.)
- selftest: Add a testcase to ensure the kprobe event rejects non
unique symbol correctly.
* tag 'probes-fixes-v6.6-rc6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
selftests/ftrace: Add new test case which checks non unique symbol
tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
Herve Codina [Fri, 20 Oct 2023 15:30:13 +0000 (17:30 +0200)]
i2c: muxes: i2c-mux-gpmux: Use of_get_i2c_adapter_by_node()
i2c-mux-gpmux uses the pair of_find_i2c_adapter_by_node() /
i2c_put_adapter(). These pair alone is not correct to properly lock the
I2C parent adapter.
Indeed, i2c_put_adapter() decrements the module refcount while
of_find_i2c_adapter_by_node() does not increment it. This leads to an
underflow of the parent module refcount.
Use the dedicated function, of_get_i2c_adapter_by_node(), to handle
correctly the module refcount.
Fixes:
ac8498f0ce53 ("i2c: i2c-mux-gpmux: new driver")
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Herve Codina [Fri, 20 Oct 2023 15:30:12 +0000 (17:30 +0200)]
i2c: muxes: i2c-demux-pinctrl: Use of_get_i2c_adapter_by_node()
i2c-demux-pinctrl uses the pair of_find_i2c_adapter_by_node() /
i2c_put_adapter(). These pair alone is not correct to properly lock the
I2C parent adapter.
Indeed, i2c_put_adapter() decrements the module refcount while
of_find_i2c_adapter_by_node() does not increment it. This leads to an
underflow of the parent module refcount.
Use the dedicated function, of_get_i2c_adapter_by_node(), to handle
correctly the module refcount.
Fixes:
50a5ba876908 ("i2c: mux: demux-pinctrl: add driver")
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Herve Codina [Fri, 20 Oct 2023 15:30:11 +0000 (17:30 +0200)]
i2c: muxes: i2c-mux-pinctrl: Use of_get_i2c_adapter_by_node()
i2c-mux-pinctrl uses the pair of_find_i2c_adapter_by_node() /
i2c_put_adapter(). These pair alone is not correct to properly lock the
I2C parent adapter.
Indeed, i2c_put_adapter() decrements the module refcount while
of_find_i2c_adapter_by_node() does not increment it. This leads to an
underflow of the parent module refcount.
Use the dedicated function, of_get_i2c_adapter_by_node(), to handle
correctly the module refcount.
Fixes:
c4aee3e1b0de ("i2c: mux: pinctrl: remove platform_data")
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Linus Torvalds [Sat, 21 Oct 2023 17:11:11 +0000 (10:11 -0700)]
Merge tag 's390-6.6-4' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix IOMMU bitmap allocation in s390 PCI to avoid out of bounds access
when IOMMU pages aren't a multiple of 64
- Fix kasan crashes when accessing DCSS mapping in memory holes by
adding corresponding kasan zero shadow mappings
- Fix a memory leak in css_alloc_subchannel in case
dma_set_coherent_mask fails
* tag 's390-6.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: fix iommu bitmap allocation
s390/kasan: handle DCSS mapping in memory holes
s390/cio: fix a memleak in css_alloc_subchannel
Linus Torvalds [Sat, 21 Oct 2023 17:02:46 +0000 (10:02 -0700)]
Merge tag 'platform-drivers-x86-v6.6-5' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
- Fix spurious brightness down presses on newer Asus laptop models
- Fix backlight control not working on T2 Mac Pro all-in-ones
- Add Armin Wolf as new maintainer for the WMI bus driver and change
its status from orphaned to maintained
- A few other small fixes
* tag 'platform-drivers-x86-v6.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/mellanox: mlxbf-tmfifo: Fix a warning message
apple-gmux: Hard Code max brightness for MMIO gmux
platform/surface: platform_profile: Propagate error if profile registration fails
platform/x86: asus-wmi: Map 0x2a code, Ignore 0x2b and 0x2c events
platform/x86: asus-wmi: Only map brightness codes when using asus-wmi backlight control
platform/x86: asus-wmi: Change ASUS_WMI_BRN_DOWN code from 0x20 to 0x2e
platform/x86: wmi: Update MAINTAINERS entry
platform/x86: msi-ec: Fix the 3rd config
platform/x86: intel-uncore-freq: Conditionally create attribute for read frequency
platform: mellanox: Fix a resource leak in an error handling path in probing flow
Linus Torvalds [Sat, 21 Oct 2023 16:57:34 +0000 (09:57 -0700)]
Merge tag 'usb-6.6-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt fixes and ids from Greg KH:
"Here are four small patches for USB and Thunderbolt for 6.6-rc7 that
do the following:
- new usb-serial device ids
- thunderbolt driver fix for reported issue
All of these have been in linux-next with no reported problems"
* tag 'usb-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Fibocom to DELL custom modem FM101R-GL
USB: serial: option: add entry for Sierra EM9191 with new firmware
USB: serial: option: add Telit LE910C4-WWX 0x1035 composition
thunderbolt: Call tb_switch_put() once DisplayPort bandwidth request is finished
Linus Torvalds [Sat, 21 Oct 2023 16:49:13 +0000 (09:49 -0700)]
Merge tag 'v6.6-p5' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a 6.5 regression in crypto/asymmetric_keys"
* tag 'v6.6-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
KEYS: asymmetric: Fix sign/verify on pkcs1pad without a hash
Linus Torvalds [Sat, 21 Oct 2023 16:43:09 +0000 (09:43 -0700)]
Merge tag 'iomap-6.6-fixes-5' of git://git./fs/xfs/xfs-linux
Pull iomap fix from Darrick Wong:
- Fix a bug where a writev consisting of a bunch of sub-fsblock writes
where the last buffer address is invalid could lead to an infinite
loop
* tag 'iomap-6.6-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: fix short copy in iomap_write_iter()
Greg Kroah-Hartman [Sat, 21 Oct 2023 15:35:49 +0000 (17:35 +0200)]
Merge tag 'iio-fixes-for-6.6b' of https://git./linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:
2nd set of IIO fixes for the 6.6 cycle.
Note, given timing my expectation is these will be queued for the
6.7 merge window but they could go quicker if the 6.6 cycle ends up
being extended.
afe:
* Allow for channels with offset but no scale. In this case the scale
can be assumed to be 1.
adi,ad74115:
* Add missing dt-binding constraint on number of reset-gpios.
samsung,exynos:
* Don't request touchscreen interrupt if it is not going to be used,
getting rid of an incorrect resulting warning message.
xilinx,xadc:
* Avoid changing preset voltage and themperature thresholds as they
are typicaly set as part of FPGA image building so should be left
alone.
* Fix wrong temperature offset and scale for some devices.
* tag 'iio-fixes-for-6.6b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: afe: rescale: Accept only offset channels
iio: exynos-adc: request second interupt only when touchscreen mode is used
iio: adc: xilinx-xadc: Correct temperature offset/scale for UltraScale
iio: adc: xilinx-xadc: Don't clobber preset voltage/temperature thresholds
dt-bindings: iio: add missing reset-gpios constrain
Dell Jin [Fri, 20 Oct 2023 06:20:53 +0000 (09:20 +0300)]
net: ethernet: adi: adin1110: Fix uninitialized variable
The spi_transfer struct has to have all it's fields initialized to 0 in
this case, since not all of them are set before starting the transfer.
Otherwise, spi_sync_transfer() will sometimes return an error.
Fixes:
a526a3cc9c8d ("net: ethernet: adi: adin1110: Fix SPI transfers")
Signed-off-by: Dell Jin <dell.jin.code@outlook.com>
Signed-off-by: Ciprian Regus <ciprian.regus@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Sit Wei Hong [Fri, 20 Oct 2023 03:25:35 +0000 (11:25 +0800)]
net: stmmac: update MAC capabilities when tx queues are updated
Upon boot up, the driver will configure the MAC capabilities based on
the maximum number of tx and rx queues. When the user changes the
tx queues to single queue, the MAC should be capable of supporting Half
Duplex, but the driver does not update the MAC capabilities when it is
configured so.
Using the stmmac_reinit_queues() to check the number of tx queues
and set the MAC capabilities accordingly.
Fixes:
0366f7e06a6b ("net: stmmac: add ethtool support for get/set channels")
Cc: <stable@vger.kernel.org> # 5.17+
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: Gan, Yi Fang <yi.fang.gan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Herring [Thu, 19 Oct 2023 18:23:37 +0000 (13:23 -0500)]
net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF
Commit
b0377116decd ("net: ethernet: Use device_get_match_data()") dropped
the unconditional use of xgene_enet_of_match resulting in this warning:
drivers/net/ethernet/apm/xgene/xgene_enet_main.c:2004:34: warning: unused variable 'xgene_enet_of_match' [-Wunused-const-variable]
The fix is to drop of_match_ptr() which is not necessary because DT is
always used for this driver (well, it could in theory support ACPI only,
but CONFIG_OF is always enabled for arm64).
Fixes:
b0377116decd ("net: ethernet: Use device_get_match_data()")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202310170627.2Kvf6ZHY-lkp@intel.com/
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marco Pagani [Wed, 18 Oct 2023 16:38:13 +0000 (18:38 +0200)]
fpga: disable KUnit test suites when module support is enabled
The fpga core currently assumes that all manager, bridge, and region
devices have a parent device associated with a driver that can be used
to take the module's refcount. This behavior causes the fpga test suites
to crash with a null-ptr-deref since parent fake devices do not have a
driver. This patch disables all fpga KUnit test suites when loadable
module support is enabled until the fpga core is fixed. Test suites
can still be run using the KUnit default UML kernel.
Signed-off-by: Marco Pagani <marpagan@redhat.com>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Fixes:
ccbc1c302115 ("fpga: add an initial KUnit suite for the FPGA Manager")
Link: https://lore.kernel.org/r/20231018163814.100803-1-marpagan@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tirthendu Sarkar [Thu, 19 Oct 2023 20:38:52 +0000 (13:38 -0700)]
i40e: sync next_to_clean and next_to_process for programming status desc
When a programming status desc is encountered on the rx_ring,
next_to_process is bumped along with cleaned_count but next_to_clean is
not. This causes I40E_DESC_UNUSED() macro to misbehave resulting in
overwriting whole ring with new buffers.
Update next_to_clean to point to next_to_process on seeing a programming
status desc if not in the middle of handling a multi-frag packet. Also,
bump cleaned_count only for such case as otherwise next_to_clean buffer
may be returned to hardware on reaching clean_threshold.
Fixes:
e9031f2da1ae ("i40e: introduce next_to_process to i40e_ring")
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reported-by: hq.dev+kernel@msdfc.xyz
Reported by: Solomon Peachy <pizza@shaftnet.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217678
Tested-by: hq.dev+kernel@msdfc.xyz
Tested by: Indrek Järve <incx@dustbite.net>
Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20231019203852.3663665-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sasha Neftin [Thu, 19 Oct 2023 20:36:41 +0000 (13:36 -0700)]
igc: Fix ambiguity in the ethtool advertising
The 'ethtool_convert_link_mode_to_legacy_u32' method does not allow us to
advertise 2500M speed support and TP (twisted pair) properly. Convert to
'ethtool_link_ksettings_test_link_mode' to advertise supported speed and
eliminate ambiguity.
Fixes:
8c5ad0dae93c ("igc: Add ethtool support")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Suggested-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231019203641.3661960-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 19 Oct 2023 12:21:04 +0000 (12:21 +0000)]
neighbour: fix various data-races
1) tbl->gc_thresh1, tbl->gc_thresh2, tbl->gc_thresh3 and tbl->gc_interval
can be written from sysfs.
2) tbl->last_flush is read locklessly from neigh_alloc()
3) tbl->proxy_queue.qlen is read locklessly from neightbl_fill_info()
4) neightbl_fill_info() reads cpu stats that can be changed concurrently.
Fixes:
c7fb64db001f ("[NETLINK]: Neighbour table configuration and statistics via rtnetlink")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231019122104.1448310-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 19 Oct 2023 11:24:57 +0000 (11:24 +0000)]
net: do not leave an empty skb in write queue
Under memory stress conditions, tcp_sendmsg_locked()
might call sk_stream_wait_memory(), thus releasing the socket lock.
If a fresh skb has been allocated prior to this,
we should not leave it in the write queue otherwise
tcp_write_xmit() could panic.
This apparently does not happen often, but a future change
in __sk_mem_raise_allocated() that Shakeel and others are
considering would increase chances of being hurt.
Under discussion is to remove this controversial part:
/* Fail only if socket is _under_ its sndbuf.
* In this case we cannot block, so that we have to fail.
*/
if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) {
/* Force charge with __GFP_NOFAIL */
if (memcg_charge && !charged) {
mem_cgroup_charge_skmem(sk->sk_memcg, amt,
gfp_memcg_charge() | __GFP_NOFAIL);
}
return 1;
}
Fixes:
fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Link: https://lore.kernel.org/r/20231019112457.1190114-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Khazhismel Kumykov [Fri, 20 Oct 2023 22:36:17 +0000 (15:36 -0700)]
blk-throttle: check for overflow in calculate_bytes_allowed
Inexact, we may reject some not-overflowing values incorrectly, but
they'll be on the order of exabytes allowed anyways.
This fixes divide error crash on x86 if bps_limit is not configured or
is set too high in the rare case that jiffy_elapsed is greater than HZ.
Fixes:
e8368b57c006 ("blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()")
Fixes:
8d6bbaada2e0 ("blk-throttle: prevent overflow while calculating wait time")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20231020223617.2739774-1-khazhy@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 20 Oct 2023 21:49:24 +0000 (14:49 -0700)]
Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' of git://git./linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix regression in reading scale and unit files from sysfs for PMU
events, so that we can use that info to pretty print instead of
printing raw numbers:
# perf stat -e power/energy-ram/,power/energy-gpu/ sleep 2
Performance counter stats for 'system wide':
1.64 Joules power/energy-ram/
0.20 Joules power/energy-gpu/
2.
001228914 seconds time elapsed
#
# grep -m1 "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
#
- The small llvm.cpp file used to check if the llvm devel files are
present was incorrectly deleted when removing the BPF event in 'perf
trace', put it back as it is also used by tools/bpf/bpftool, that
uses llvm routines to do disassembly of BPF object files.
- Fix use of addr_location__exit() in dlfilter__object_code(), making
sure that it is only used to pair a previous addr_location__init()
call.
* tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools build: Fix llvm feature detection, still used by bpftool
perf dlfilter: Add a test for object_code()
perf dlfilter: Fix use of addr_location__exit() in dlfilter__object_code()
perf pmu: Fix perf stat output with correct scale and unit
Linus Torvalds [Fri, 20 Oct 2023 21:45:41 +0000 (14:45 -0700)]
Merge tag 'linux_kselftest_active-fixes-6.6-rc7' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
"One single fix to assert check in user_events abi_test to properly
check bit value on Big Endian architectures. The code treated the bit
values as Little Endian and the check failed on Big Endian"
* tag 'linux_kselftest_active-fixes-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/user_events: Fix abi_test for BE archs
Linus Torvalds [Fri, 20 Oct 2023 21:04:53 +0000 (14:04 -0700)]
Merge tag 'nfs-for-6.6-4' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
"Stable Fix:
- Fix a pNFS hang in nfs4_evict_inode()
Fixes:
- Force update of suid/sgid bits after an NFS v4.2 ALLOCATE op
- Fix a potential oops in nfs_inode_remove_request()
- Check the validity of the layout pointer in ff_layout_mirror_prepare_stats()
- Fix incorrectly marking the pNFS MDS with USE_PNFS_DS in some cases"
* tag 'nfs-for-6.6-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFSv4.1: fixup use EXCHGID4_FLAG_USE_PNFS_DS for DS server
pNFS/flexfiles: Check the layout validity in ff_layout_mirror_prepare_stats
pNFS: Fix a hang in nfs4_evict_inode()
NFS: Fix potential oops in nfs_inode_remove_request()
nfs42: client needs to strip file mode's suid/sgid bit after ALLOCATE op
Linus Torvalds [Fri, 20 Oct 2023 21:00:05 +0000 (14:00 -0700)]
Merge tag 'fsnotify_for_v6.6-rc7' of git://git./linux/kernel/git/jack/linux-fs
Pull fanotify fix from Jan Kara:
"Disable superblock / mount marks for filesystems that can encode file
handles but not open them (currently only overlayfs).
It is not clear the functionality is useful in any way so let's better
disable it before someone comes up with some creative misuse"
* tag 'fsnotify_for_v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: limit reporting of event with non-decodeable file handles
Linus Torvalds [Fri, 20 Oct 2023 20:47:05 +0000 (13:47 -0700)]
Merge tag 'acpi-6.6-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix the ACPI initialization ordering on ARM and ACPI IRQ
management in the cases when irq_create_fwspec_mapping() fails.
Specifics:
- Fix ACPI initialization ordering on ARM that was changed
incorrectly during the 6.5 development cycle (Hanjun Guo)
- Make acpi_register_gsi() return an error code as appropriate when
irq_create_fwspec_mapping() returns 0 on failure (Sunil V L)"
* tag 'acpi-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: bus: Move acpi_arm_init() to the place of after acpi_ghes_init()
ACPI: irq: Fix incorrect return value in acpi_register_gsi()
Linus Torvalds [Fri, 20 Oct 2023 20:24:50 +0000 (13:24 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small fixes, both in drivers.
The mptsas one is really fixing an error path issue where it can leave
the misc driver loaded even though the sas driver fails to initialize"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix double free of dsd_list during driver load
scsi: mpt3sas: Fix in error path
Linus Torvalds [Fri, 20 Oct 2023 20:21:46 +0000 (13:21 -0700)]
Merge tag 'pinctrl-v6.6-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Concurrent register updates in the Qualcomm LPASS pin controller gets
a proper lock.
- revert a mutex fix that was causing problems: contention on the mutex
or something of the sort lead to probe reordering and MMC block
devices start to register in a different order, which unsuspecting
userspace is not ready to handle
* tag 'pinctrl-v6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
Revert "pinctrl: avoid unsafe code pattern in find_pinctrl()"
pinctrl: qcom: lpass-lpi: fix concurrent register updates
Linus Torvalds [Fri, 20 Oct 2023 20:12:34 +0000 (13:12 -0700)]
Merge tag 'mtd/fixes-for-6.6-rc7' of git://git./linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
"In the raw NAND subsystem, the major fix prevents using cached reads
with devices not supporting it. There was two bug reports about this.
Apart from that, three drivers (pl353, arasan and marvell) could
sometimes hide page program failures due to their their own program
page helper not being fully compliant with the specification (many
drivers use the default helpers shared by the core). Adding a missing
check prevents these situation.
Finally, the Qualcomm driver had a broken error path.
In the SPI-NAND subsystem one Micron device used a wrong bitmak
reporting possibly corrupted ECC status.
Finally, the physmap-core got stripped from its map_rom fallback by
mistake, this feature is added back"
* tag 'mtd/fixes-for-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: Ensure the nand chip supports cached reads
mtd: rawnand: qcom: Unmap the right resource upon probe failure
mtd: rawnand: pl353: Ensure program page operations are successful
mtd: rawnand: arasan: Ensure program page operations are successful
mtd: spinand: micron: correct bitmask for ecc status
mtd: physmap-core: Restore map_rom fallback
mtd: rawnand: marvell: Ensure program page operations are successful
Linus Torvalds [Fri, 20 Oct 2023 20:09:19 +0000 (13:09 -0700)]
Merge tag 'mmc-v6.6-rc3' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Capture correct oemid-bits for eMMC cards
- Fix error propagation for some ioctl commands
- Hold retuning if SDIO is in 1-bit mode
MMC host:
- mtk-sd: Use readl_poll_timeout_atomic to not "schedule while atomic"
- sdhci-msm: Correct minimum number of clocks
- sdhci-pci-gli: Fix LPM negotiation so x86/S0ix SoCs can suspend
- sdhci-sprd: Fix error code in sdhci_sprd_tuning()"
* tag 'mmc-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: Capture correct oemid-bits for eMMC cards
mmc: mtk-sd: Use readl_poll_timeout_atomic in msdc_reset_hw
mmc: core: Fix error propagation for some ioctl commands
mmc: sdhci-sprd: Fix error code in sdhci_sprd_tuning()
mmc: sdhci-pci-gli: fix LPM negotiation so x86/S0ix SoCs can suspend
mmc: core: sdio: hold retuning if sdio in 1-bit mode
dt-bindings: mmc: sdhci-msm: correct minimum number of clocks
Linus Torvalds [Fri, 20 Oct 2023 17:31:06 +0000 (10:31 -0700)]
Merge tag 'block-6.6-2023-10-20' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"A fix for a regression with sed-opal and saved keys, and outside of
that an NVMe pull request fixing a few minor issues on that front"
* tag 'block-6.6-2023-10-20' of git://git.kernel.dk/linux:
nvme-pci: add BOGUS_NID for Intel 0a54 device
nvmet-auth: complete a request only after freeing the dhchap pointers
nvme: sanitize metadata bounce buffer for reads
block: Fix regression in sed-opal for a saved key.
nvme-auth: use chap->s2 to indicate bidirectional authentication
nvmet-tcp: Fix a possible UAF in queue intialization setup
nvme-rdma: do not try to stop unallocated queues
Linus Torvalds [Fri, 20 Oct 2023 17:28:46 +0000 (10:28 -0700)]
Merge tag 'io_uring-6.6-2023-10-20' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single fix for a bug report that came in, fixing a case where
failure to init a ring with IORING_SETUP_NO_MMAP can trigger a NULL
pointer dereference"
* tag 'io_uring-6.6-2023-10-20' of git://git.kernel.dk/linux:
io_uring: fix crash with IORING_SETUP_NO_MMAP and invalid SQ ring address
Linus Torvalds [Fri, 20 Oct 2023 17:05:10 +0000 (10:05 -0700)]
Merge tag 'sound-6.6-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Still higher volume than wished, but all are driver-specific small
fixes and look safe for this late RC.
The majority of changes are for ASoC, especially for wcd938x driver
and Cirrus codec drivers, while there are other random fixes including
usual HD-audio quirks"
* tag 'sound-6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (22 commits)
ASoC: da7219: Correct the process of setting up Gnd switch in AAD
ALSA: hda/realtek - Fixed ASUS platform headset Mic issue
ALSA: hda/realtek: Add quirk for ASUS ROG GU603ZV
ALSA: hda/relatek: Enable Mute LED on HP Laptop 15s-fq5xxx
ASoC: dwc: Fix non-DT instantiation
ASoC: codecs: tas2780: Fix log of failed reset via I2C.
ASoC: rt5650: fix the wrong result of key button
ASoC: cs42l42: Fix missing include of gpio/consumer.h
ASoC: cs42l43: Update values for bias sense
ASoC: dt-bindings: cirrus,cs42l43: Update values for bias sense
ASoC: cs35l56: ASP1 DOUT must default to Hi-Z when not transmitting
ASoC: pxa: fix a memory leak in probe()
ASoC: cs35l56: Fix illegal use of init_completion()
ASoC: codecs: wcd938x-sdw: fix runtime PM imbalance on probe errors
ASoC: codecs: wcd938x-sdw: fix use after free on driver unbind
ASoC: codecs: wcd938x: fix runtime PM imbalance on remove
ASoC: codecs: wcd938x: fix regulator leaks on probe errors
ASoC: codecs: wcd938x: fix resource leaks on bind errors
ASoC: codecs: wcd938x: fix unbind tear down order
ASoC: codecs: wcd938x: drop bogus bind error handling
...
Linus Torvalds [Fri, 20 Oct 2023 16:55:31 +0000 (09:55 -0700)]
Merge tag 'drm-fixes-2023-10-20' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular fixes for the week, amdgpu, i915, nouveau, with some other
scattered around, nothing major.
amdgpu:
- Fix possible NULL pointer dereference
- Avoid possible BUG_ON in GPUVM updates
- Disable AMD_CTX_PRIORITY_UNSET
i915:
- Fix display issue that was blocking S0ix
- Retry gtt fault when out of fence registers
bridge:
- ti-sn65dsi86: Fix device lifetime
edid:
- Add quirk for BenQ GW2765
ivpu:
- Extend address range for MMU mmap
nouveau:
- DP-connector fixes
- Documentation fixes
panel:
- Move AUX B116XW03 into panel-simple
scheduler:
- Eliminate DRM_SCHED_PRIORITY_UNSET
ttm:
- Fix possible NULL-ptr deref in cleanup
mediatek:
- Correctly free sg_table in gem prime vmap"
* tag 'drm-fixes-2023-10-20' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: Reserve fences for VM update
drm/amdgpu: Fix possible null pointer dereference
accel/ivpu: Extend address range for MMU mmap
Revert "accel/ivpu: Use cached buffers for FW loading"
accel/ivpu: Don't enter d0i3 during FLR
drm/i915: Retry gtt fault when out of fence registers
drm/i915/cx0: Only clear/set the Pipe Reset bit of the PHY Lanes Owned
gpu/drm: Eliminate DRM_SCHED_PRIORITY_UNSET
drm/amdgpu: Unset context priority is now invalid
drm/mediatek: Correctly free sg_table in gem prime vmap
drm/edid: add 8 bpc quirk to the BenQ GW2765
drm/ttm: Reorder sys manager cleanup step
drm/nouveau/disp: fix DP capable DSM connectors
drm/nouveau: exec: fix ioctl kernel-doc warning
drm/panel: Move AUX B116XW03 out of panel-edp back to panel-simple
drm/bridge: ti-sn65dsi86: Associate DSI device lifetime with auxiliary device
Ard Biesheuvel [Fri, 20 Oct 2023 16:11:06 +0000 (18:11 +0200)]
Merge 3rd batch of EFI fixes into efi/urgent
Kirill A. Shutemov [Mon, 16 Oct 2023 16:31:22 +0000 (19:31 +0300)]
efi/unaccepted: Fix soft lockups caused by parallel memory acceptance
Michael reported soft lockups on a system that has unaccepted memory.
This occurs when a user attempts to allocate and accept memory on
multiple CPUs simultaneously.
The root cause of the issue is that memory acceptance is serialized with
a spinlock, allowing only one CPU to accept memory at a time. The other
CPUs spin and wait for their turn, leading to starvation and soft lockup
reports.
To address this, the code has been modified to release the spinlock
while accepting memory. This allows for parallel memory acceptance on
multiple CPUs.
A newly introduced "accepting_list" keeps track of which memory is
currently being accepted. This is necessary to prevent parallel
acceptance of the same memory block. If a collision occurs, the lock is
released and the process is retried.
Such collisions should rarely occur. The main path for memory acceptance
is the page allocator, which accepts memory in MAX_ORDER chunks. As long
as MAX_ORDER is equal to or larger than the unit_size, collisions will
never occur because the caller fully owns the memory block being
accepted.
Aside from the page allocator, only memblock and deferered_free_range()
accept memory, but this only happens during boot.
The code has been tested with unit_size == 128MiB to trigger collisions
and validate the retry codepath.
Fixes:
2053bc57f367 ("efi: Add unaccepted memory support")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Michael Roth <michael.roth@amd.com
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Michael Roth <michael.roth@amd.com>
[ardb: drop unnecessary cpu_relax() call]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Rafael J. Wysocki [Fri, 20 Oct 2023 15:31:15 +0000 (17:31 +0200)]
Merge branch 'acpi-irq'
Merge ACPI IRQ management fix for 6.6-rc7 (Sunil V L).
* acpi-irq:
ACPI: irq: Fix incorrect return value in acpi_register_gsi()
Francis Laniel [Fri, 20 Oct 2023 10:42:50 +0000 (13:42 +0300)]
selftests/ftrace: Add new test case which checks non unique symbol
If name_show() is non unique, this test will try to install a kprobe on this
function which should fail returning EADDRNOTAVAIL.
On kernel where name_show() is not unique, this test is skipped.
Link: https://lore.kernel.org/all/20231020104250.9537-3-flaniel@linux.microsoft.com/
Cc: stable@vger.kernel.org
Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Francis Laniel [Fri, 20 Oct 2023 10:42:49 +0000 (13:42 +0300)]
tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols
When a kprobe is attached to a function that's name is not unique (is
static and shares the name with other functions in the kernel), the
kprobe is attached to the first function it finds. This is a bug as the
function that it is attaching to is not necessarily the one that the
user wants to attach to.
Instead of blindly picking a function to attach to what is ambiguous,
error with EADDRNOTAVAIL to let the user know that this function is not
unique, and that the user must use another unique function with an
address offset to get to the function they want to attach to.
Link: https://lore.kernel.org/all/20231020104250.9537-2-flaniel@linux.microsoft.com/
Cc: stable@vger.kernel.org
Fixes:
413d37d1eb69 ("tracing: Add kprobe-based event tracer")
Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
Link: https://lore.kernel.org/lkml/20230819101105.b0c104ae4494a7d1f2eea742@kernel.org/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Mateusz Palczewski [Thu, 19 Oct 2023 20:40:35 +0000 (13:40 -0700)]
igb: Fix potential memory leak in igb_add_ethtool_nfc_entry
Add check for return of igb_update_ethtool_nfc_entry so that in case
of any potential errors the memory alocated for input will be freed.
Fixes:
0e71def25281 ("igb: add support of RX network flow classification")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kunwu Chan [Fri, 20 Oct 2023 09:31:56 +0000 (17:31 +0800)]
treewide: Spelling fix in comment
reques -> request
Fixes:
09dde54c6a69 ("PS3: gelic: Add wireless support for PS3")
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 19 Oct 2023 16:37:20 +0000 (18:37 +0200)]
i40e: Fix I40E_FLAG_VF_VLAN_PRUNING value
Commit
c87c938f62d8f1 ("i40e: Add VF VLAN pruning") added new
PF flag I40E_FLAG_VF_VLAN_PRUNING but its value collides with
existing I40E_FLAG_TOTAL_PORT_SHUTDOWN_ENABLED flag.
Move the affected flag at the end of the flags and fix its value.
Reproducer:
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close on
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning on
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off
[ 6323.142585] i40e 0000:02:00.0: Setting link-down-on-close not supported on this port (because total-port-shutdown is enabled)
netlink error: Operation not supported
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning off
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off
The link-down-on-close flag cannot be modified after setting vf-vlan-pruning
because vf-vlan-pruning shares the same bit with total-port-shutdown flag
that prevents any modification of link-down-on-close flag.
Fixes:
c87c938f62d8 ("i40e: Add VF VLAN pruning")
Cc: Mateusz Palczewski <mateusz.palczewski@intel.com>
Cc: Simon Horman <horms@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Thu, 19 Oct 2023 07:13:46 +0000 (09:13 +0200)]
iavf: initialize waitqueues before starting watchdog_task
It is not safe to initialize the waitqueues after queueing the
watchdog_task. It will be using them.
The chance of this causing a real problem is very small, because
there will be some sleeping before any of the waitqueues get used.
I got a crash only after inserting an artificial sleep in iavf_probe.
Queue the watchdog_task as the last step in iavf_probe. Add a comment to
prevent repeating the mistake.
Fixes:
fe2647ab0c99 ("i40evf: prevent VF close returning before state transitions to DOWN")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mirsad Goran Todorovac [Wed, 18 Oct 2023 19:34:38 +0000 (21:34 +0200)]
r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1
KCSAN reported the following data-race bug:
==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
value changed: 0x80003fff -> 0x3402805f
Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================
drivers/net/ethernet/realtek/r8169_main.c:
==========================================
4429
→ 4430 status = le32_to_cpu(desc->opts1);
4431 if (status & DescOwn)
4432 break;
4433
4434 /* This barrier is needed to keep us from reading
4435 * any other fields out of the Rx descriptor until
4436 * we know the status of DescOwn
4437 */
4438 dma_rmb();
4439
4440 if (unlikely(status & RxRES)) {
4441 if (net_ratelimit())
4442 netdev_warn(dev, "Rx ERROR. status = %08x\n",
Marco Elver explained that dma_rmb() doesn't prevent the compiler to tear up the access to
desc->opts1 which can be written to concurrently. READ_ONCE() should prevent that from
happening:
4429
→ 4430 status = le32_to_cpu(READ_ONCE(desc->opts1));
4431 if (status & DescOwn)
4432 break;
4433
As the consequence of this fix, this KCSAN warning was eliminated.
Fixes:
6202806e7c03a ("r8169: drop member opts1_mask from struct rtl8169_private")
Suggested-by: Marco Elver <elver@google.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: nic_swsd@realtek.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mirsad Goran Todorovac [Wed, 18 Oct 2023 19:34:36 +0000 (21:34 +0200)]
r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1
KCSAN reported the following data-race:
==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
value changed: 0xb0000042 -> 0x00000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================
The read side is in
drivers/net/ethernet/realtek/r8169_main.c
=========================================
4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
4356 int budget)
4357 {
4358 unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
4359 struct sk_buff *skb;
4360
4361 dirty_tx = tp->dirty_tx;
4362
4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) {
4364 unsigned int entry = dirty_tx % NUM_TX_DESC;
4365 u32 status;
4366
→ 4367 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
4368 if (status & DescOwn)
4369 break;
4370
4371 skb = tp->tx_skb[entry].skb;
4372 rtl8169_unmap_tx_skb(tp, entry);
4373
4374 if (skb) {
4375 pkts_compl++;
4376 bytes_compl += skb->len;
4377 napi_consume_skb(skb, budget);
4378 }
4379 dirty_tx++;
4380 }
4381
4382 if (tp->dirty_tx != dirty_tx) {
4383 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
4384 WRITE_ONCE(tp->dirty_tx, dirty_tx);
4385
4386 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
4387 rtl_tx_slots_avail(tp),
4388 R8169_TX_START_THRS);
4389 /*
4390 * 8168 hack: TxPoll requests are lost when the Tx packets are
4391 * too close. Let's kick an extra TxPoll request when a burst
4392 * of start_xmit activity is detected (if it is not detected,
4393 * it is slow enough). -- FR
4394 * If skb is NULL then we come here again once a tx irq is
4395 * triggered after the last fragment is marked transmitted.
4396 */
4397 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
4398 rtl8169_doorbell(tp);
4399 }
4400 }
tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes
this KCSAN warning.
4366
→ 4367 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
4368 if (status & DescOwn)
4369 break;
4370
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: nic_swsd@realtek.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Marco Elver <elver@google.com>
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: David S. Miller <davem@davemloft.net>
Mirsad Goran Todorovac [Wed, 18 Oct 2023 19:34:34 +0000 (21:34 +0200)]
r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx
KCSAN reported the following data-race:
==================================================================
BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169]
write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29:
rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169
dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560)
sch_direct_xmit (net/sched/sch_generic.c:342)
__dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306)
ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233)
__ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293)
ip_finish_output (net/ipv4/ip_output.c:328)
ip_output (net/ipv4/ip_output.c:435)
ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486)
udp_send_skb (net/ipv4/udp.c:963)
udp_sendmsg (net/ipv4/udp.c:1246)
inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4))
sock_sendmsg (net/socket.c:730 net/socket.c:753)
__sys_sendto (net/socket.c:2177)
__x64_sys_sendto (net/socket.c:2185)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
value changed: 0x002f4815 -> 0x002f4816
Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G L 6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================
The write side of drivers/net/ethernet/realtek/r8169_main.c is:
==================
4251 /* rtl_tx needs to see descriptor changes before updated tp->cur_tx */
4252 smp_wmb();
4253
→ 4254 WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
4255
4256 stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
4257 R8169_TX_STOP_THRS,
4258 R8169_TX_START_THRS);
The read side is the function rtl_tx():
4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
4356 int budget)
4357 {
4358 unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
4359 struct sk_buff *skb;
4360
4361 dirty_tx = tp->dirty_tx;
4362
4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) {
4364 unsigned int entry = dirty_tx % NUM_TX_DESC;
4365 u32 status;
4366
4367 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
4368 if (status & DescOwn)
4369 break;
4370
4371 skb = tp->tx_skb[entry].skb;
4372 rtl8169_unmap_tx_skb(tp, entry);
4373
4374 if (skb) {
4375 pkts_compl++;
4376 bytes_compl += skb->len;
4377 napi_consume_skb(skb, budget);
4378 }
4379 dirty_tx++;
4380 }
4381
4382 if (tp->dirty_tx != dirty_tx) {
4383 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
4384 WRITE_ONCE(tp->dirty_tx, dirty_tx);
4385
4386 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
4387 rtl_tx_slots_avail(tp),
4388 R8169_TX_START_THRS);
4389 /*
4390 * 8168 hack: TxPoll requests are lost when the Tx packets are
4391 * too close. Let's kick an extra TxPoll request when a burst
4392 * of start_xmit activity is detected (if it is not detected,
4393 * it is slow enough). -- FR
4394 * If skb is NULL then we come here again once a tx irq is
4395 * triggered after the last fragment is marked transmitted.
4396 */
→ 4397 if (tp->cur_tx != dirty_tx && skb)
4398 rtl8169_doorbell(tp);
4399 }
4400 }
Obviously from the code, an earlier detected data-race for tp->cur_tx was fixed in the
line 4363:
4363 while (READ_ONCE(tp->cur_tx) != dirty_tx) {
but the same solution is required for protecting the other access to tp->cur_tx:
→ 4397 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
4398 rtl8169_doorbell(tp);
The write in the line 4254 is protected with WRITE_ONCE(), but the read in the line 4397
might have suffered read tearing under some compiler optimisations.
The fix eliminated the KCSAN data-race report for this bug.
It is yet to be evaluated what happens if tp->cur_tx changes between the test in line 4363
and line 4397. This test should certainly not be cached by the compiler in some register
for such a long time, while asynchronous writes to tp->cur_tx might have occurred in line
4254 in the meantime.
Fixes:
94d8a98e6235c ("r8169: reduce number of workaround doorbell rings")
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: nic_swsd@realtek.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/
Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Thu, 19 Oct 2023 17:34:55 +0000 (20:34 +0300)]
gpiolib: acpi: Add missing memset(0) to acpi_get_gpiod_from_data()
When refactoring the acpi_get_gpiod_from_data() the change missed
cleaning up the variable on stack. Add missing memset().
Reported-by: Ferry Toth <ftoth@exalondelft.nl>
Fixes:
16ba046e86e9 ("gpiolib: acpi: teach acpi_find_gpio() to handle data-only nodes")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Greg Kroah-Hartman [Fri, 20 Oct 2023 05:52:44 +0000 (07:52 +0200)]
Merge tag 'usb-serial-6.6-rc7' of https://git./linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial device ids for 6.6-rc7
Here are some new modem device ids, including an entry needed for Sierra
EM9191 which stopped working with recent firmware.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.6-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Fibocom to DELL custom modem FM101R-GL
USB: serial: option: add entry for Sierra EM9191 with new firmware
USB: serial: option: add Telit LE910C4-WWX 0x1035 composition
Dave Airlie [Fri, 20 Oct 2023 04:23:25 +0000 (14:23 +1000)]
Merge tag 'mediatek-drm-fixes-
20231017' of https://git./linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes -
20231017
1. Correctly free sg_table in gem prime vmap
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231016233659.3639-1-chunkuang.hu@kernel.org
Dave Airlie [Fri, 20 Oct 2023 04:21:16 +0000 (14:21 +1000)]
Merge tag 'drm-intel-fixes-2023-10-19' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix display issue that was blocking S0ix (Khaled)
- Retry gtt fault when out of fence registers (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZTFXbo6M5bWp/hTU@intel.com