platform/kernel/linux-starfive.git
Janusz Krzysztofik [Thu, 20 Jul 2023 09:35:44 +0000 (11:35 +0200)]
drm/i915: Fix premature release of request's reusable memory

Infinite waits for completion of GPU activity have been observed in CI,
mostly inside __i915_active_wait(), triggered by igt@gem_barrier_race or
igt@perf@stress-open-close.  Root cause analysis, based on ftrace dumps
generated with a lot of extra trace_printk() calls added to the code,
revealed loops of request dependencies being accidentally built,
preventing the requests from being processed, each waiting for completion
of another one's activity.

After we substitute a new request for a last active one tracked on a
timeline, we set up a dependency of our new request to wait on completion
of current activity of that previous one.  While doing that, we must keep
the old request in memory until we have used its attributes for setting
up that await dependency.  Otherwise we may set up the await dependency
on an unrelated request that already reuses the memory previously
allocated to the old, already released one.  Combined with perf adding
consecutive kernel context remote requests to different user context
timelines, unresolvable loops of await dependencies can be built,
leading to infinite waits.

We obtain a pointer to the previous request to wait upon when we
substitute it with a pointer to our new request in an active tracker,
e.g. in intel_timeline.last_request.  In some processing paths we protect
that old request from being freed before we use it by getting a reference
to it under RCU protection, but in others, e.g.  __i915_request_commit()
-> __i915_request_add_to_timeline() -> __i915_request_ensure_ordering(),
we don't.  But anyway, since the requests' memory is SLAB_TYPESAFE_BY_RCU,
that RCU protection is not sufficient against reuse of memory.

We could protect i915_request's memory from being prematurely reused by
calling its release function via call_rcu() and using rcu_read_lock()
consistently, as proposed in v1.  However, that approach leads to a
significant (up to 10 times) increase of SLAB utilization by i915_request
SLAB cache.  Another potential approach is to take a reference to the
previous active fence.

When updating an active fence tracker, we first lock the new fence,
substitute a pointer of the current active fence with the new one, then we
lock the substituted fence.  With this approach, there is a time window
after the substitution and before the lock when the request can be
concurrently released by an interrupt handler and its memory reused, so
we may end up locking and returning a new, unrelated request.

Always get a reference to the current active fence first, before
replacing it with a new one.  Having it protected from premature release
and reuse, lock it and then replace with the new one but only if not
yet signalled via a potential concurrent interrupt nor replaced with
another one by a potential concurrent thread, otherwise retry, starting
from getting a reference to the new current one.  Adjust users so that they
no longer get a reference to the previous active fence themselves and always
put the reference obtained by __i915_active_fence_set() when no longer needed.
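
The ordering described above (pin the current fence first, then swap only if
the slot is unchanged, otherwise retry) can be sketched in userspace C11.
This is a minimal sketch and not the driver code: struct fence, tracker_set()
and the helpers are hypothetical names, locking of the fences is left out,
and plain C11 atomics stand in for the kernel primitives.  In the kernel,
touching the refcount of a possibly released object is only safe because of
RCU plus the type-safe slab; this sketch does not model that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted request/fence. */
struct fence {
	atomic_int refcount;
};

/* Take a reference unless the count has already dropped to zero. */
bool fence_get_unless_zero(struct fence *f)
{
	int old = atomic_load(&f->refcount);

	while (old > 0) {
		/* On failure, `old` is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

void fence_put(struct fence *f)
{
	int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0); /* a real implementation would free at zero */
	(void)old;
}

/*
 * Install `next` in the tracker slot and return the previous fence with a
 * reference held (or NULL).  The caller must fence_put() the result.
 */
struct fence *tracker_set(_Atomic(struct fence *) *slot, struct fence *next)
{
	for (;;) {
		struct fence *prev = atomic_load(slot);
		struct fence *expected = prev;

		/* Step 1: pin the current fence before touching the slot. */
		if (prev && !fence_get_unless_zero(prev))
			continue; /* being released: reload the slot and retry */

		/* Step 2: swap only if the slot still holds what we pinned. */
		if (atomic_compare_exchange_strong(slot, &expected, next))
			return prev; /* caller now owns the reference on prev */

		/* Lost a race with another update: drop our pin, start over. */
		if (prev)
			fence_put(prev);
	}
}
```

Because the reference is taken before the swap, the returned pointer cannot
refer to memory that was recycled for an unrelated request in between.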

v3: Fix lockdep splat reports and other issues caused by incorrect use of
    try_cmpxchg() (use (cmpxchg() != prev) instead)
v2: Protect request's memory by getting a reference to it in favor of
    delegating its release to call_rcu() (Chris)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8211
Fixes: df9f85d8582e ("drm/i915: Serialise i915_active_fence_set() with itself")
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230720093543.832147-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 946e047a3d88d46d15b5c5af0414098e12b243f7)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Andi Shyti [Tue, 25 Jul 2023 00:19:50 +0000 (02:19 +0200)]
drm/i915/gt: Support aux invalidation on all engines

Perform some refactoring with the purpose of keeping in one
single place all the operations around the aux table
invalidation.

With this refactoring add more engines where the invalidation
should be performed.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-8-andi.shyti@linux.intel.com
(cherry picked from commit 76ff7789d6e63d1a10b3b58f5c70b2e640c7a880)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Jonathan Cavitt [Tue, 25 Jul 2023 00:19:49 +0000 (02:19 +0200)]
drm/i915/gt: Poll aux invalidation register bit on invalidation

For platforms that use Aux CCS, wait for aux invalidation to
complete by checking the aux invalidation register bit is
cleared.
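
The idea of waiting for the invalidation bit to clear can be illustrated with
a CPU-side polling loop over a simulated register.  This is only an analogue
under stated assumptions: how the driver performs the wait is not shown in
this log, and struct fake_reg, AUX_INV_BIT and the bit position are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AUX_INV_BIT (1u << 0) /* hypothetical bit position */

/* Hypothetical register model: the bit self-clears after a few reads. */
struct fake_reg {
	uint32_t value;
	unsigned int reads_until_clear;
};

uint32_t fake_reg_read(struct fake_reg *r)
{
	if (r->reads_until_clear > 0 && --r->reads_until_clear == 0)
		r->value &= ~AUX_INV_BIT;
	return r->value;
}

/* Request an invalidation, then poll until the bit clears or we give up. */
bool aux_invalidate_and_wait(struct fake_reg *r, unsigned int max_polls)
{
	r->value |= AUX_INV_BIT; /* writing 1 requests the invalidation */

	for (unsigned int i = 0; i < max_polls; i++) {
		if (!(fake_reg_read(r) & AUX_INV_BIT))
			return true; /* hardware reports completion */
	}
	return false; /* timed out with the bit still set */
}
```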

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-7-andi.shyti@linux.intel.com
(cherry picked from commit d459c86f00aa98028d155a012c65dc42f7c37e76)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Andi Shyti [Tue, 25 Jul 2023 00:19:48 +0000 (02:19 +0200)]
drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control and in the CS

Enable the CCS_FLUSH bit (bit 13) in the pipe control for render and
compute engines on platforms starting from Meteor Lake (BSPEC
43904 and 47112).

For the copy engine add MI_FLUSH_DW_CCS (bit 16) in the command
streamer.
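
A minimal sketch of the bit selection described above.  The bit positions
(13 and 16) come from the text; the engine enum, the platform predicate and
the function name are hypothetical, and the two bits live in different
commands in practice, so the return value is only "the extra bit for this
engine's flush command".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIPE_CONTROL_CCS_FLUSH (1u << 13) /* bit 13, per the text above */
#define MI_FLUSH_DW_CCS        (1u << 16) /* bit 16, per the text above */

/* Hypothetical engine classes for the sketch. */
enum engine_class { ENGINE_RENDER, ENGINE_COMPUTE, ENGINE_COPY, ENGINE_VIDEO };

/*
 * Return the extra CCS flush bit to OR into the engine's flush command,
 * or 0 when the platform or engine does not need one.
 */
uint32_t ccs_flush_bit(enum engine_class cls, bool platform_needs_ccs_flush)
{
	if (!platform_needs_ccs_flush)
		return 0;

	switch (cls) {
	case ENGINE_RENDER:
	case ENGINE_COMPUTE:
		return PIPE_CONTROL_CCS_FLUSH; /* goes into the pipe control */
	case ENGINE_COPY:
		return MI_FLUSH_DW_CCS;        /* goes into MI_FLUSH_DW */
	default:
		return 0;
	}
}
```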

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Requires: 8da173db894a ("drm/i915/gt: Rename flags with bit_group_X according to the datasheet")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-6-andi.shyti@linux.intel.com
(cherry picked from commit b70df82b428774875c7c56d3808102165891547c)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Andi Shyti [Tue, 25 Jul 2023 00:19:47 +0000 (02:19 +0200)]
drm/i915/gt: Rename flags with bit_group_X according to the datasheet

In preparation for the next patch, align the naming of the pipe
control flag values with the datasheet (BSPEC 47112).
The variable "flags" in gen12_emit_flush_rcs() is applied as a
set of flags called Bit Group 1.

Define also the Bit Group 0 as bit_group_0 where currently only
PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-5-andi.shyti@linux.intel.com
(cherry picked from commit f2dcd21d5a22e13f2fbfe7ab65149038b93cf2ff)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Jonathan Cavitt [Tue, 25 Jul 2023 00:19:46 +0000 (02:19 +0200)]
drm/i915/gt: Ensure memory quiesced before invalidation

All memory traffic must be quiesced before requesting
an aux invalidation on platforms that use Aux CCS.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Requires: a2a4aa0eef3b ("drm/i915: Add the gen12_needs_ccs_aux_inv helper")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-4-andi.shyti@linux.intel.com
(cherry picked from commit ad8ebf12217e451cd19804b1c3e97ad56491c74a)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Andi Shyti [Tue, 25 Jul 2023 00:19:45 +0000 (02:19 +0200)]
drm/i915: Add the gen12_needs_ccs_aux_inv helper

We always assumed that a device might either have AUX or FLAT
CCS, but this is an approximation that is not always true, e.g.
PVC represents an exception.

Set the basis for future finer selection by implementing a
boolean gen12_needs_ccs_aux_inv() function that tells whether aux
invalidation is needed or not.

Currently PVC is the only exception to the above mentioned rule.
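
A sketch of what such a boolean helper could look like.  The struct and field
names are hypothetical, and the direction of the PVC exception (no aux
invalidation needed even without flat CCS) is an assumption that the text
above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device description for the sketch. */
struct device_info {
	bool has_flat_ccs;
	bool is_pvc;
};

/* Tell whether aux table invalidation is needed on this device. */
bool needs_ccs_aux_inv(const struct device_info *info)
{
	/* Assumption: PVC is the exception and needs no aux invalidation. */
	if (info->is_pvc)
		return false;

	/* Otherwise follow the old rule: either AUX CCS or FLAT CCS. */
	return !info->has_flat_ccs;
}
```

Keeping the decision in one helper means that a future platform exception
only has to be added in a single place.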

Requires: 059ae7ae2a1c ("drm/i915/gt: Cleanup aux invalidation registers")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-3-andi.shyti@linux.intel.com
(cherry picked from commit c827655b87ad201ebe36f2e28d16b5491c8f7801)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Andi Shyti [Tue, 25 Jul 2023 00:19:44 +0000 (02:19 +0200)]
drm/i915/gt: Cleanup aux invalidation registers

Fix the 'NV' definition postfix that is supposed to be INV.

Take the chance to also order the registers properly by
address and rename GEN12_GFX_CCS_AUX_INV to GEN12_CCS_AUX_INV,
like all the other similar registers.

Also remove the VD1, VD3 and VE1 registers, which don't exist, and add
BCS0 and CCS0.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230725001950.1014671-2-andi.shyti@linux.intel.com
(cherry picked from commit 2f0b927d3ca3440445975ebde27f3df1c3ed6f76)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Gao Xiang [Tue, 1 Aug 2023 01:47:37 +0000 (09:47 +0800)]
erofs: drop unnecessary WARN_ON() in erofs_kill_sb()

Previously, .kill_sb() would be called only after fill_super had failed.
This is about to change [1].

Besides, checking for s_magic in erofs_kill_sb() is unnecessary from
any point of view.  Let's get rid of it now.

[1] https://lore.kernel.org/r/20230731-flugbereit-wohnlage-78acdf95ab7e@brauner

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20230801014737.28614-1-hsiangkao@linux.alibaba.com
Gao Xiang [Wed, 19 Jul 2023 06:54:59 +0000 (14:54 +0800)]
erofs: fix wrong primary bvec selection on deduplicated extents

When handling deduplicated compressed data, there can be multiple
decompressed extents pointing to the same compressed data in one shot.

In such cases, the bvecs which belong to the longest extent will be
selected as the primary bvecs for real decompressors to decode and the
other duplicated bvecs will be directly copied from the primary bvecs.

Previously, only relative offsets of the longest extent were checked to
decompress the primary bvecs.  On rare occasions, it can be incorrect
if there are several extents with the same start relative offset.
As a result, some short bvecs could be selected for decompression and
then cause data corruption.

For example, as Shijie Sun reported off-list, considering the following
extents of a file:
 117:   903345..  915250 |   11905 :     385024..    389120 |    4096
...
 119:   919729..  930323 |   10594 :     385024..    389120 |    4096
...
 124:   968881..  980786 |   11905 :     385024..    389120 |    4096

The start relative offset is the same: 2225, but extent 119 (919729..
930323) is shorter than the others.

Let's restrict the bvec length in addition to the start offset if bvecs
are not full.
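
The numbers in the example can be checked with a small sketch.  It assumes
4096-byte blocks and that the "start relative offset" is the start offset
modulo the block size; the struct and function names are hypothetical and the
two predicates only illustrate the difference between matching on offset
alone and matching on offset plus length, not the actual erofs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLKSZ 4096UL /* assumption: 4 KiB blocks */

struct extent {
	unsigned long start; /* decompressed start offset in the file */
	unsigned long end;   /* decompressed end offset in the file */
};

/* Offset of the extent start within its block. */
unsigned long extent_pageofs(const struct extent *e)
{
	return e->start % BLKSZ;
}

unsigned long extent_len(const struct extent *e)
{
	return e->end - e->start;
}

/* Old rule: an extent is treated as primary if only its offset matches. */
bool matches_primary_old(const struct extent *e, const struct extent *primary)
{
	return extent_pageofs(e) == extent_pageofs(primary);
}

/* New rule: the length has to match the longest extent as well. */
bool matches_primary_new(const struct extent *e, const struct extent *primary)
{
	return extent_pageofs(e) == extent_pageofs(primary) &&
	       extent_len(e) == extent_len(primary);
}
```

With extents 117, 119 and 124 from the listing, all three share the relative
offset 2225, so the old rule accepts the shorter extent 119 while the new
rule rejects it.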

Reported-by: Shijie Sun <sunshijie@xiaomi.com>
Fixes: 5c2a64252c5d ("erofs: introduce partial-referenced pclusters")
Tested-by: Shijie Sun <sunshijie@xiaomi.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230719065459.60083-1-hsiangkao@linux.alibaba.com
Tomas Glozar [Fri, 28 Jul 2023 06:44:11 +0000 (08:44 +0200)]
bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
patchset notes for details).

This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
BUG (sleeping function called from invalid context) on RT kernels.

preempt_disable was introduced together with lock_sk and rcu_read_lock
in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
in parallel"), probably to match disabled migration of BPF programs, and
is no longer necessary.

Remove preempt_disable to fix BUG in sock_map_update_common on RT.

Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/all/20200224140131.461979697@linutronix.de/
Fixes: 99ba2b5aba24 ("bpf: sockhash, disallow bpf_tcp_close and update in parallel")
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230728064411.305576-1-tglozar@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Ian Rogers [Thu, 6 Jul 2023 18:37:05 +0000 (11:37 -0700)]
perf test parse-events: Test complex name has required event format

test__checkevent_complex_name uses an "event" format. If that format is
not present, as with a placeholder PMU, the test fails. Skip the test in
this case to avoid failures in restricted environments.

Add perf_pmu__has_format utility as a general PMU utility.

Fixes: 628eaa4e877af823 ("perf pmus: Add placeholder core PMU")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230706183705.601412-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 6 Jul 2023 18:37:04 +0000 (11:37 -0700)]
perf pmus: Create placeholder regardless of scanning core_only

If scanning all PMUs the placeholder is still necessary if no core PMU
is found. This situation occurs in perf test's parse-events test,
when uncore events appear before core.

Fixes: 628eaa4e877af823 ("perf pmus: Add placeholder core PMU")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230706183705.601412-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yan Zhao [Mon, 31 Jul 2023 11:20:33 +0000 (19:20 +0800)]
drm/i915/gvt: Fix bug in getting msg length in AUX CH registers handler

Msg length should be obtained from value written to AUX_CH_CTL register
rather than from enum type of the register.
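
A minimal sketch of reading the message length out of the written value.
The field position (bits 24:20) is an assumption taken from the usual
AUX_CH_CTL layout, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the message size field occupies bits 24:20 of AUX_CH_CTL. */
#define AUX_CTL_MSG_SIZE_SHIFT 20
#define AUX_CTL_MSG_SIZE_MASK  (0x1fu << AUX_CTL_MSG_SIZE_SHIFT)

/* Extract the message length from the value written to the register. */
uint32_t aux_msg_length(uint32_t written_value)
{
	return (written_value & AUX_CTL_MSG_SIZE_MASK) >> AUX_CTL_MSG_SIZE_SHIFT;
}
```

The point is that the input is the register value supplied by the guest,
not a constant describing which register was accessed.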

Commit 0cad796a2269 ("drm/i915: Use REG_BIT() & co. for AUX CH registers")
incorrectly calculates the msg_length from the reg type and yields the
warning below in intel_gvt_i2c_handle_aux_ch_write():
"i915 0000:00:02.0: drm_WARN_ON(msg_length != 4)".

Fixes: 0cad796a2269 ("drm/i915: Use REG_BIT() & co. for AUX CH registers")
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20230731112033.7275-1-yan.y.zhao@intel.com
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Jakub Kicinski [Tue, 1 Aug 2023 03:10:39 +0000 (20:10 -0700)]
Merge branch 'net-sched-bind-logic-fixes-for-cls_fw-cls_u32-and-cls_route'

valis says:

====================
net/sched Bind logic fixes for cls_fw, cls_u32 and cls_route

Three classifiers (cls_fw, cls_u32 and cls_route) always copy
tcf_result struct into the new instance of the filter on update.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

This patch set fixes this issue in all affected classifiers by no longer
copying the tcf_result struct from the old filter.
====================
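
The counting problem can be modelled in a few lines of userspace C.  This is
a sketch under simplifying assumptions, not the classifier code: struct
qclass, struct filter and the update helpers are hypothetical, and the class
is never actually freed, so the dangling reference shows up as a count of
zero with a filter still pointing at the class.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class with a count of filters bound to it. */
struct qclass {
	int filter_cnt;
};

/* Hypothetical filter holding the class it is bound to. */
struct filter {
	struct qclass *bound;
};

void bind_filter(struct filter *f, struct qclass *c)
{
	f->bound = c;
	c->filter_cnt++;
}

void unbind_filter(struct filter *f)
{
	if (f->bound) {
		f->bound->filter_cnt--;
		f->bound = NULL;
	}
}

/* Buggy update: the binding is copied, but the count only goes down. */
void update_buggy(struct filter *oldf, struct filter *newf)
{
	newf->bound = oldf->bound; /* copied without taking a count */
	unbind_filter(oldf);       /* success path drops the old count */
}

/* Fixed update: the new filter starts unbound. */
void update_fixed(struct filter *oldf, struct filter *newf)
{
	newf->bound = NULL;
	unbind_filter(oldf);
}
```

After update_buggy() the class has filter_cnt == 0 and may be deleted while
the new filter still points at it; after update_fixed() nothing references
the class unless the new filter is bound explicitly.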

Link: https://lore.kernel.org/r/20230729123202.72406-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
valis [Sat, 29 Jul 2023 12:32:02 +0000 (08:32 -0400)]
net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free

When route4_change() is called on an existing filter, the whole
tcf_result struct is always copied into the new instance of the filter.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

Fix this by no longer copying the tcf_result struct from the old filter.

Fixes: 1109c00547fc ("net: sched: RCU cls_route")
Reported-by: valis <sec@valis.email>
Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg>
Signed-off-by: valis <sec@valis.email>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: M A Ramdhan <ramdhan@starlabs.sg>
Link: https://lore.kernel.org/r/20230729123202.72406-4-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
valis [Sat, 29 Jul 2023 12:32:01 +0000 (08:32 -0400)]
net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free

When fw_change() is called on an existing filter, the whole
tcf_result struct is always copied into the new instance of the filter.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

Fix this by no longer copying the tcf_result struct from the old filter.

Fixes: e35a8ee5993b ("net: sched: fw use RCU")
Reported-by: valis <sec@valis.email>
Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg>
Signed-off-by: valis <sec@valis.email>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: M A Ramdhan <ramdhan@starlabs.sg>
Link: https://lore.kernel.org/r/20230729123202.72406-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
valis [Sat, 29 Jul 2023 12:32:00 +0000 (08:32 -0400)]
net/sched: cls_u32: No longer copy tcf_result on update to avoid use-after-free

When u32_change() is called on an existing filter, the whole
tcf_result struct is always copied into the new instance of the filter.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

Fix this by no longer copying the tcf_result struct from the old filter.

Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers")
Reported-by: valis <sec@valis.email>
Reported-by: M A Ramdhan <ramdhan@starlabs.sg>
Signed-off-by: valis <sec@valis.email>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: M A Ramdhan <ramdhan@starlabs.sg>
Link: https://lore.kernel.org/r/20230729123202.72406-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Martin KaFai Lau [Mon, 31 Jul 2023 22:37:13 +0000 (15:37 -0700)]
Merge branch 'Two fixes for cpu-map'

Hou Tao says:

====================

The patchset fixes two reported warnings in cpu-map when running
xdp_redirect_cpu and some RT threads concurrently. Patch #1 fixes
the warning in __cpu_map_ring_cleanup() when kthread is stopped
prematurely. Patch #2 fixes the warning in __xdp_return() when
there are pending skbs in ptr_ring.

Please see individual patches for more details. And comments are always
welcome.

====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Hou Tao [Sat, 29 Jul 2023 09:51:07 +0000 (17:51 +0800)]
bpf, cpumap: Handle skb as well when clean up ptr_ring

The following warning was reported when running xdp_redirect_cpu with
both skb-mode and stress-mode enabled:

  ------------[ cut here ]------------
  Incorrect XDP memory type (-2128176192) usage
  WARNING: CPU: 7 PID: 1442 at net/core/xdp.c:405
  Modules linked in:
  CPU: 7 PID: 1442 Comm: kworker/7:0 Tainted: G  6.5.0-rc2+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  Workqueue: events __cpu_map_entry_free
  RIP: 0010:__xdp_return+0x1e4/0x4a0
  ......
  Call Trace:
   <TASK>
   ? show_regs+0x65/0x70
   ? __warn+0xa5/0x240
   ? __xdp_return+0x1e4/0x4a0
   ......
   xdp_return_frame+0x4d/0x150
   __cpu_map_entry_free+0xf9/0x230
   process_one_work+0x6b0/0xb80
   worker_thread+0x96/0x720
   kthread+0x1a5/0x1f0
   ret_from_fork+0x3a/0x70
   ret_from_fork_asm+0x1b/0x30
   </TASK>

The reason for the warning is twofold. First, the kthread running
cpu_map_kthread_run() is stopped prematurely. Second,
__cpu_map_ring_cleanup() doesn't handle skb mode and treats skbs in
ptr_ring as XDP frames.

The prematurely stopped kthread will be fixed by the preceding patch, so
ptr_ring will be empty when __cpu_map_ring_cleanup() is called. But
as the comment in __cpu_map_ring_cleanup() says, handle and free
skbs in ptr_ring as well to "catch any broken behaviour gracefully".
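
A cleanup loop that tells the two entry kinds apart can be sketched as
follows.  The sketch assumes that skb entries are marked by a tag in the low
bit of the pointer stored in the ring; the array-based ring, the counters and
all function names are hypothetical stand-ins for the real ptr_ring and free
routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: skb entries carry a tag in the low bit of the pointer. */
void *tag_skb(void *p)
{
	return (void *)((uintptr_t)p | 1u);
}

bool entry_is_skb(const void *p)
{
	return ((uintptr_t)p & 1u) != 0;
}

void *untag(void *p)
{
	return (void *)((uintptr_t)p & ~(uintptr_t)1);
}

/* Counters standing in for the real skb and XDP frame release calls. */
struct cleanup_stats {
	unsigned int skbs_freed;
	unsigned int frames_returned;
};

/* Drain whatever is left in the ring, handling both entry kinds. */
void ring_cleanup(void **ring, size_t n, struct cleanup_stats *st)
{
	for (size_t i = 0; i < n; i++) {
		void *ptr = ring[i];

		if (!ptr)
			continue;
		if (entry_is_skb(ptr)) {
			ptr = untag(ptr);      /* recover the real pointer */
			st->skbs_freed++;      /* would free the skb here */
		} else {
			st->frames_returned++; /* would return the frame here */
		}
		ring[i] = NULL;
	}
}
```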

Fixes: 11941f8a8536 ("bpf: cpumap: Implement generic cpumap")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230729095107.1722450-3-houtao@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Hou Tao [Sat, 29 Jul 2023 09:51:06 +0000 (17:51 +0800)]
bpf, cpumap: Make sure kthread is running before map update returns

The following warning was reported when running stress-mode enabled
xdp_redirect_cpu with some RT threads:

  ------------[ cut here ]------------
  WARNING: CPU: 4 PID: 65 at kernel/bpf/cpumap.c:135
  CPU: 4 PID: 65 Comm: kworker/4:1 Not tainted 6.5.0-rc2+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  Workqueue: events cpu_map_kthread_stop
  RIP: 0010:put_cpu_map_entry+0xda/0x220
  ......
  Call Trace:
   <TASK>
   ? show_regs+0x65/0x70
   ? __warn+0xa5/0x240
   ......
   ? put_cpu_map_entry+0xda/0x220
   cpu_map_kthread_stop+0x41/0x60
   process_one_work+0x6b0/0xb80
   worker_thread+0x96/0x720
   kthread+0x1a5/0x1f0
   ret_from_fork+0x3a/0x70
   ret_from_fork_asm+0x1b/0x30
   </TASK>

The root cause is the same as commit 436901649731 ("bpf: cpumap: Fix memory
leak in cpu_map_update_elem"). The kthread is stopped prematurely by
kthread_stop() in cpu_map_kthread_stop(), and kthread() doesn't call
cpu_map_kthread_run() at all but XDP program has already queued some
frames or skbs into ptr_ring. So when __cpu_map_ring_cleanup() checks
the ptr_ring, it will find it was not emptied and report a warning.

An alternative fix is to use __cpu_map_ring_cleanup() to drop these
pending frames or skbs when kthread_stop() returns -EINTR, but it may
confuse the user, because these frames or skbs have been handled
correctly by XDP program. So instead of dropping these frames or skbs,
just make sure the per-cpu kthread is running before
__cpu_map_entry_alloc() returns.

After applying the fix, the error handling for kthread_stop() will be
unnecessary because it will always return 0, so just remove it.

Fixes: 6710e1126934 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230729095107.1722450-2-houtao@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Michal Schmidt [Sat, 29 Jul 2023 15:15:16 +0000 (17:15 +0200)]
octeon_ep: initialize mbox mutexes

The two mbox-related mutexes are destroyed in octep_ctrl_mbox_uninit(),
but the corresponding mutex_init calls were missing.
A "DEBUG_LOCKS_WARN_ON(lock->magic != lock)" warning was emitted with
CONFIG_DEBUG_MUTEXES on.

Initialize the two mutexes in octep_ctrl_mbox_init().

Fixes: 577f0d1b1c5f ("octeon_ep: add separate mailbox command and response queues")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230729151516.24153-1-mschmidt@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 28 Jul 2023 20:50:20 +0000 (13:50 -0700)]
bnxt: don't handle XDP in netpoll

Similarly to other recently fixed drivers make sure we don't
try to access XDP or page pool APIs when NAPI budget is 0.
NAPI budget of 0 may mean that we are in netpoll.

This may result in running software IRQs in hard IRQ context,
leading to deadlocks or crashes.

To make sure bnapi->tx_pkts doesn't get wiped without handling
the events, move clearing the field into the handler itself.
Remember to clear tx_pkts after reset (bnxt_enable_napi())
as it's technically possible that netpoll will accumulate
some tx_pkts and then a reset will happen, leaving tx_pkts
out of sync with reality.
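
The budget handling can be illustrated with a toy poll routine.  This is a
sketch with hypothetical names and a made-up ring state, not the bnxt code:
it only shows a completion handler that returns early when the budget is 0
and clears the pending count itself, so the count survives a skipped poll.

```c
#include <assert.h>

/* Hypothetical per-ring state for the sketch. */
struct ring_state {
	unsigned int tx_pkts;    /* completions noticed but not yet handled */
	unsigned int tx_done;    /* completions handled so far */
	unsigned int rx_pending; /* packets waiting to be received */
};

/* Handler for XDP TX completions; clears tx_pkts only when it runs. */
void xdp_tx_complete(struct ring_state *r, int budget)
{
	if (budget == 0)
		return; /* possibly netpoll: leave tx_pkts for a later poll */

	r->tx_done += r->tx_pkts;
	r->tx_pkts = 0; /* cleared by the handler, not by the caller */
}

/* Poll routine: does no RX (and so no XDP) work when budget is 0. */
int ring_poll(struct ring_state *r, int budget)
{
	int work = 0;

	xdp_tx_complete(r, budget);

	while (work < budget && r->rx_pending > 0) {
		r->rx_pending--;
		work++;
	}
	return work;
}
```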

Fixes: 322b87ca55f2 ("bnxt_en: add page_pool support")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20230728205020.2784844-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rafal Rogalski [Fri, 28 Jul 2023 17:12:43 +0000 (10:12 -0700)]
ice: Fix RDMA VSI removal during queue rebuild

During qdisc create/delete, it is necessary to rebuild the queue
of VSIs. An error occurred because the VSIs created by RDMA were
still active.

Add a check for whether RDMA is active. If it is, disallow qdisc changes
and write a message to the system log.

Fixes: 348048e724a0 ("ice: Implement iidc operations")
Signed-off-by: Rafal Rogalski <rafalx.rogalski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230728171243.2446101-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Fri, 28 Jul 2023 16:55:28 +0000 (17:55 +0100)]
sfc: fix field-spanning memcpy in selftest

Add a struct_group for the whole packet body so we can copy it in one
go without triggering FORTIFY_SOURCE complaints.
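
The effect of grouping fields can be imitated in userspace with an anonymous
union of two identical structs, one anonymous and one named.  The
STRUCT_GROUP macro below is a simplified stand-in for the kernel helper, and
the payload layout and field names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified imitation of a struct group: the members stay reachable
 * directly, and the same bytes are also reachable as one named object.
 */
#define STRUCT_GROUP(NAME, ...) \
	union { struct { __VA_ARGS__ }; struct { __VA_ARGS__ } NAME; }

/* Hypothetical payload: padding followed by the packet body. */
struct loopback_payload {
	char pad[2];
	STRUCT_GROUP(packet,
		uint16_t iteration;
		char msg[8];
	);
};

/* Copy the whole body in one go; the destination is a single object. */
void copy_packet(struct loopback_payload *dst,
		 const struct loopback_payload *src)
{
	memcpy(&dst->packet, &src->packet, sizeof(src->packet));
}
```

Since the copy targets dst->packet rather than the first field of the body,
a bounds-checking memcpy sees a destination as large as the copy.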

Fixes: cf60ed469629 ("sfc: use padding to fix alignment in loopback test")
Fixes: 30c24dd87f3f ("sfc: siena: use padding to fix alignment in loopback test")
Fixes: 1186c6b31ee1 ("sfc: falcon: use padding to fix alignment in loopback test")
Reviewed-by: Andy Moreton <andy.moreton@amd.com>
Tested-by: Andy Moreton <andy.moreton@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230728165528.59070-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Martin Kohn [Thu, 27 Jul 2023 20:00:43 +0000 (20:00 +0000)]
net: usb: qmi_wwan: add Quectel EM05GV2

Add support for Quectel EM05GV2 (G=global) with vendor ID
0x2c7c and product ID 0x030e.

Enabling DTR on this modem was necessary to ensure stable operation.
Patch for usb: serial: option: is also in progress.

T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=030e Rev= 3.18
S:  Manufacturer=Quectel
S:  Product=Quectel EM05-G
C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Signed-off-by: Martin Kohn <m.kohn@welotec.com>
Link: https://lore.kernel.org/r/AM0PR04MB57648219DE893EE04FA6CC759701A@AM0PR04MB5764.eurprd04.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 months ago net: usb: lan78xx: reorder cleanup operations to avoid UAF bugs
Duoming Zhou [Wed, 26 Jul 2023 08:14:07 +0000 (16:14 +0800)]
net: usb: lan78xx: reorder cleanup operations to avoid UAF bugs

The timer dev->stat_monitor can schedule the delayed work dev->wq and
the delayed work dev->wq can also arm the dev->stat_monitor timer.

When the device is detaching, the net_device is deallocated, but the
net_device private data can still be dereferenced in the delayed work
or the timer handler. As a result, UAF bugs can occur.

One racy situation is shown below:

      (Thread 1)                 |      (Thread 2)
lan78xx_stat_monitor()           |
 ...                             |  lan78xx_disconnect()
 lan78xx_defer_kevent()          |    ...
  ...                            |    cancel_delayed_work_sync(&dev->wq);
  schedule_delayed_work()        |    ...
  (wait some time)               |    free_netdev(net); //free net_device
  lan78xx_delayedwork()          |
  //use net_device private data  |
  dev-> //use                    |

Although we use cancel_delayed_work_sync() to cancel the delayed work
in lan78xx_disconnect(), it can still be scheduled by the timer handler
lan78xx_stat_monitor().

Another racy situation is shown below:

      (Thread 1)                |      (Thread 2)
lan78xx_delayedwork             |
 mod_timer()                    |  lan78xx_disconnect()
                                |   cancel_delayed_work_sync()
 (wait some time)               |   if (timer_pending(&dev->stat_monitor))
                                |       del_timer_sync(&dev->stat_monitor);
 lan78xx_stat_monitor()         |   ...
  lan78xx_defer_kevent()        |   free_netdev(net); //free
   //use net_device private data|
   dev-> //use                  |

Although we use del_timer_sync() to delete the timer, the function
timer_pending() returns 0 when the timer is activated. As a result,
the del_timer_sync() will not be executed and the timer could be
re-armed.

In order to mitigate this bug, we use timer_shutdown_sync() to shut down
the timer and then use cancel_delayed_work_sync() to cancel the delayed
work. As a result, the net_device can be deallocated safely.

What's more, dev->flags is set to EVENT_DEV_DISCONNECT in
lan78xx_disconnect(), but it could still be set to EVENT_STAT_UPDATE
in lan78xx_stat_monitor(). So this patch puts the set_bit() after
timer_shutdown_sync().
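
For illustration only (not part of the patch): a single-threaded toy model of why the teardown order matters. All names (struct demo_dev, demo_*) are hypothetical, and the flags merely stand in for the real timer and delayed-work state. It assumes the behaviour described above: once the timer is shut down, later attempts to re-arm it are ignored, so nothing remains that could queue the work again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; the flags model the timer and delayed work. */
struct demo_dev {
	bool timer_shutdown;	/* once set, the timer can never be re-armed */
	bool timer_armed;
	bool work_queued;
};

/* The work handler re-arms the timer unless it has been shut down. */
static void demo_arm_timer(struct demo_dev *d)
{
	if (!d->timer_shutdown)
		d->timer_armed = true;
}

/* The timer handler queues the delayed work. */
static void demo_timer_fires(struct demo_dev *d)
{
	if (d->timer_armed) {
		d->timer_armed = false;
		d->work_queued = true;
	}
}

/* The delayed work runs and tries to re-arm the timer. */
static void demo_work_runs(struct demo_dev *d)
{
	if (d->work_queued) {
		d->work_queued = false;
		demo_arm_timer(d);
	}
}

/* Teardown in the order described above: timer first, then the work. */
static void demo_disconnect(struct demo_dev *d)
{
	d->timer_shutdown = true;	/* models timer_shutdown_sync() */
	d->timer_armed = false;
	d->work_queued = false;		/* models cancel_delayed_work_sync() */
}

/* True when neither the timer nor the work can still touch the device. */
static bool demo_is_quiescent(const struct demo_dev *d)
{
	return !d->timer_armed && !d->work_queued;
}
```

In this model the timer and the work can only re-trigger each other, so shutting down the timer first breaks the cycle before the work is cancelled.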

Fixes: 77dfff5bb7e2 ("lan78xx: Fix race condition in disconnect handling")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago dt-bindings: net: mediatek,net: fixup MAC binding
Rafał Miłecki [Sat, 29 Jul 2023 11:10:45 +0000 (13:10 +0200)]
dt-bindings: net: mediatek,net: fixup MAC binding

1. Use unevaluatedProperties
It's needed to allow ethernet-controller.yaml properties to work correctly.

2. Drop unneeded phy-handle/phy-mode

3. Don't require phy-handle
Some SoCs may use fixed link.

For in-kernel MT7621 DTS files this fixes the following errors:
arch/mips/boot/dts/ralink/mt7621-tplink-hc220-g5-v1.dtb: ethernet@1e100000: mac@0: 'fixed-link' does not match any of the regexes: 'pinctrl-[0-9]+'
        From schema: Documentation/devicetree/bindings/net/mediatek,net.yaml
arch/mips/boot/dts/ralink/mt7621-tplink-hc220-g5-v1.dtb: ethernet@1e100000: mac@0: 'phy-handle' is a required property
        From schema: Documentation/devicetree/bindings/net/mediatek,net.yaml
arch/mips/boot/dts/ralink/mt7621-tplink-hc220-g5-v1.dtb: ethernet@1e100000: mac@1: 'fixed-link' does not match any of the regexes: 'pinctrl-[0-9]+'
        From schema: Documentation/devicetree/bindings/net/mediatek,net.yaml
arch/mips/boot/dts/ralink/mt7621-tplink-hc220-g5-v1.dtb: ethernet@1e100000: mac@1: 'phy-handle' is a required property
        From schema: Documentation/devicetree/bindings/net/mediatek,net.yaml

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net/sched: taprio: Limit TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME to INT_MAX.
Kuniyuki Iwashima [Sat, 29 Jul 2023 00:07:05 +0000 (17:07 -0700)]
net/sched: taprio: Limit TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME to INT_MAX.

syzkaller found a division-by-zero error [0] in div_s64_rem() called from
get_cycle_time_elapsed(), where sched->cycle_time is the divisor.

We have tests in parse_taprio_schedule() so that cycle_time will never
be 0, and actually cycle_time is not 0 in get_cycle_time_elapsed().

The problem is that the types of divisor are different; cycle_time is
s64, but the argument of div_s64_rem() is s32.

syzkaller fed this input and 0x100000000 is cast to s32 to be 0.

  @TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME={0xc, 0x8, 0x100000000}

We use s64 for cycle_time to cast it to ktime_t, so let's keep it and
set max for cycle_time.

While at it, we prevent overflow in setup_txtime() and add another
test in parse_taprio_schedule() to check if cycle_time overflows.

Also, we add a new tdc test case for this issue.
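
For illustration only (not part of the patch): a minimal userspace sketch of the truncation described above. The helper names are hypothetical; the second function only mirrors the idea of capping cycle_time at INT_MAX and assumes that non-positive values are rejected as well.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of handing a 64-bit cycle_time to a helper whose divisor
 * parameter is only 32 bits wide: only the low 32 bits survive.
 */
static int32_t demo_low32_as_divisor(int64_t cycle_time)
{
	return (int32_t)(uint32_t)cycle_time;
}

/* Hypothetical validation: accept only positive values up to INT_MAX. */
static bool demo_cycle_time_is_valid(int64_t cycle_time)
{
	return cycle_time > 0 && cycle_time <= INT_MAX;
}
```

With the input from the report, demo_low32_as_divisor(0x100000000LL) yields 0 even though the 64-bit value is non-zero, which is why a non-zero check on the s64 alone is not enough.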

[0]:
divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 PID: 103 Comm: kworker/1:3 Not tainted 6.5.0-rc1-00330-g60cc1f7d0605 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: ipv6_addrconf addrconf_dad_work
RIP: 0010:div_s64_rem include/linux/math64.h:42 [inline]
RIP: 0010:get_cycle_time_elapsed net/sched/sch_taprio.c:223 [inline]
RIP: 0010:find_entry_to_transmit+0x252/0x7e0 net/sched/sch_taprio.c:344
Code: 3c 02 00 0f 85 5e 05 00 00 48 8b 4c 24 08 4d 8b bd 40 01 00 00 48 8b 7c 24 48 48 89 c8 4c 29 f8 48 63 f7 48 99 48 89 74 24 70 <48> f7 fe 48 29 d1 48 8d 04 0f 49 89 cc 48 89 44 24 20 49 8d 85 10
RSP: 0018:ffffc90000acf260 EFLAGS: 00010206
RAX: 177450e0347560cf RBX: 0000000000000000 RCX: 177450e0347560cf
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000100000000
RBP: 0000000000000056 R08: 0000000000000000 R09: ffffed10020a0934
R10: ffff8880105049a7 R11: ffff88806cf3a520 R12: ffff888010504800
R13: ffff88800c00d800 R14: ffff8880105049a0 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0edf84f0e8 CR3: 000000000d73c002 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
 <TASK>
 get_packet_txtime net/sched/sch_taprio.c:508 [inline]
 taprio_enqueue_one+0x900/0xff0 net/sched/sch_taprio.c:577
 taprio_enqueue+0x378/0xae0 net/sched/sch_taprio.c:658
 dev_qdisc_enqueue+0x46/0x170 net/core/dev.c:3732
 __dev_xmit_skb net/core/dev.c:3821 [inline]
 __dev_queue_xmit+0x1b2f/0x3000 net/core/dev.c:4169
 dev_queue_xmit include/linux/netdevice.h:3088 [inline]
 neigh_resolve_output net/core/neighbour.c:1552 [inline]
 neigh_resolve_output+0x4a7/0x780 net/core/neighbour.c:1532
 neigh_output include/net/neighbour.h:544 [inline]
 ip6_finish_output2+0x924/0x17d0 net/ipv6/ip6_output.c:135
 __ip6_finish_output+0x620/0xaa0 net/ipv6/ip6_output.c:196
 ip6_finish_output net/ipv6/ip6_output.c:207 [inline]
 NF_HOOK_COND include/linux/netfilter.h:292 [inline]
 ip6_output+0x206/0x410 net/ipv6/ip6_output.c:228
 dst_output include/net/dst.h:458 [inline]
 NF_HOOK.constprop.0+0xea/0x260 include/linux/netfilter.h:303
 ndisc_send_skb+0x872/0xe80 net/ipv6/ndisc.c:508
 ndisc_send_ns+0xb5/0x130 net/ipv6/ndisc.c:666
 addrconf_dad_work+0xc14/0x13f0 net/ipv6/addrconf.c:4175
 process_one_work+0x92c/0x13a0 kernel/workqueue.c:2597
 worker_thread+0x60f/0x1240 kernel/workqueue.c:2748
 kthread+0x2fe/0x3f0 kernel/kthread.c:389
 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308
 </TASK>
Modules linked in:

Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Co-developed-by: Eric Dumazet <edumazet@google.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago nfsd: Fix reading via splice
David Howells [Thu, 27 Jul 2023 16:21:17 +0000 (17:21 +0100)]
nfsd: Fix reading via splice

nfsd_splice_actor() has a clause in its loop that chops up a compound page
into individual pages such that if the same page is seen twice in a row, it
is discarded the second time.  This is a problem with the advent of
shmem_splice_read() as that inserts zero_pages into the pipe in lieu of
pages that aren't present in the pagecache.

Fix this by assuming that the last page is being extended only if the
currently stored length + starting offset is not currently on a page
boundary.
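
For illustration only (not part of the patch): a minimal sketch of the boundary test described above. The function name and DEMO_PAGE_SIZE are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * The last stored page can only be "extended" by the next chunk if the
 * data stored so far ends in the middle of a page.
 */
static bool demo_last_page_is_partial(size_t start_offset, size_t stored_len)
{
	return ((start_offset + stored_len) % DEMO_PAGE_SIZE) != 0;
}
```

When the stored data ends exactly on a page boundary, a repeated page (such as a second zero page) is treated as a new page rather than discarded.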

This can be tested by NFS-exporting a tmpfs filesystem on the test machine
and truncating it to more than a page in size (eg. truncate -s 8192) and
then reading it by NFS.  The first page will be all zeros, but thereafter
garbage will be read.

Note: I wonder if we can ever get a situation now where we get a splice
that gives us contiguous parts of a page in separate actor calls.  As NFSD
can only be splicing from a file (I think), there are only three sources of
the page: copy_splice_read(), shmem_splice_read() and file_splice_read().
The first allocates pages for the data it reads, so the problem cannot
occur; the second should never see a partial page; and the third waits for
each page to become available before we're allowed to read from it.

Fixes: bd194b187115 ("shmem: Implement splice-read")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
cc: Hugh Dickins <hughd@google.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
12 months ago Linux 6.5-rc4
Linus Torvalds [Sun, 30 Jul 2023 20:23:47 +0000 (13:23 -0700)]
Linux 6.5-rc4

12 months ago Merge tag 'spi-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Sun, 30 Jul 2023 19:54:31 +0000 (12:54 -0700)]
Merge tag 'spi-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A bunch of fixes for the Qualcomm QSPI driver, fixing multiple issues
  with the newly added DMA mode - it had a number of issues exposed when
  tested in a wider range of use cases, both race condition style issues
  and issues with different inputs to those that had been used in test"

* tag 'spi-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-qcom-qspi: Add mem_ops to avoid PIO for badly sized reads
  spi: spi-qcom-qspi: Fallback to PIO for xfers that aren't multiples of 4 bytes
  spi: spi-qcom-qspi: Add DMA_CHAIN_DONE to ALL_IRQS
  spi: spi-qcom-qspi: Call dma_wmb() after setting up descriptors
  spi: spi-qcom-qspi: Use GFP_ATOMIC flag while allocating for descriptor
  spi: spi-qcom-qspi: Ignore disabled interrupts' status in isr

12 months ago Merge tag 'regulator-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 30 Jul 2023 19:52:05 +0000 (12:52 -0700)]
Merge tag 'regulator-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A couple of small fixes for the mt6358 driver, fixing error
  reporting and a bootstrapping issue"

* tag 'regulator-fix-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: mt6358: Fix incorrect VCN33 sync error message
  regulator: mt6358: Sync VCN33_* enable status after checking ID

12 months ago Merge tag 'usb-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 30 Jul 2023 18:57:51 +0000 (11:57 -0700)]
Merge tag 'usb-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a set of USB driver fixes for 6.5-rc4. Included in here are:

   - new USB serial device ids

   - dwc3 driver fixes for reported issues

   - typec driver fixes for reported problems

   - gadget driver fixes

   - reverts of some problematic USB changes that went into -rc1

  All of these have been in linux-next with no reported problems"

* tag 'usb-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits)
  usb: misc: ehset: fix wrong if condition
  usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
  usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
  usb: gadget: call usb_gadget_check_config() to verify UDC capability
  usb: typec: Use sysfs_emit_at when concatenating the string
  usb: typec: Iterate pds array when showing the pd list
  usb: typec: Set port->pd before adding device for typec_port
  usb: typec: qcom: fix return value check in qcom_pmic_typec_probe()
  Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
  Revert "usb: xhci: tegra: Fix error check"
  USB: gadget: Fix the memory leak in raw_gadget driver
  usb: gadget: core: remove unbalanced mutex_unlock in usb_gadget_activate
  Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
  Revert "xhci: add quirk for host controllers that don't update endpoint DCS"
  USB: quirks: add quirk for Focusrite Scarlett
  usb: xhci-mtk: set the dma max_seg_size
  MAINTAINERS: drop invalid usb/cdns3 Reviewer e-mail
  usb: dwc3: don't reset device side if dwc3 was configured as host-only
  usb: typec: ucsi: move typec_set_mode(TYPEC_STATE_SAFE) to ucsi_unregister_partner()
  usb: ohci-at91: Fix the unhandle interrupt when resume
  ...

12 months ago Merge tag 'tty-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 30 Jul 2023 18:51:36 +0000 (11:51 -0700)]
Merge tag 'tty-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some small TTY and serial driver fixes for 6.5-rc4 for some
  reported problems. Included in here is:

   - TIOCSTI fix for braille readers

   - documentation fix for minor numbers

   - MAINTAINERS update for new serial files in -rc1

   - minor serial driver fixes for reported problems

  All of these have been in linux-next with no reported problems"

* tag 'tty-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250_dw: Preserve original value of DLF register
  tty: serial: sh-sci: Fix sleeping in atomic context
  serial: sifive: Fix sifive_serial_console_setup() section
  Documentation: devices.txt: reconcile serial/ucc_uart minor numers
  MAINTAINERS: Update TTY layer for lists and recently added files
  tty: n_gsm: fix UAF in gsm_cleanup_mux
  TIOCSTI: always enable for CAP_SYS_ADMIN

12 months ago Merge tag 'staging-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 30 Jul 2023 18:47:56 +0000 (11:47 -0700)]
Merge tag 'staging-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are three small staging driver fixes for 6.5-rc4 that resolve
  some reported problems. These fixes are:

   - fix for an old bug in the r8712 driver

   - fbtft driver fix for a spi device

   - potential overflow fix in the ks7010 driver

  All of these have been in linux-next with no reported problems"

* tag 'staging-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
  staging: fbtft: ili9341: use macro FBTFT_REGISTER_SPI_DRIVER
  staging: r8712: Fix memory leak in _r8712_init_xmit_priv()

12 months ago Merge tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 30 Jul 2023 18:44:00 +0000 (11:44 -0700)]
Merge tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char driver and Documentation fixes from Greg KH:
 "Here is a char driver fix and some documentation updates for 6.5-rc4
  that contain the following changes:

   - sram/genalloc bugfix for reported problem

   - security-bugs.rst update based on recent discussions

   - embargoed-hardware-issues minor cleanups and then partial revert
     for the project/company lists

  All of these have been in linux-next for a while with no reported
  problems, and the documentation updates have all been reviewed by the
  relevant developers"

* tag 'char-misc-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  misc/genalloc: Name subpools by of_node_full_name()
  Documentation: embargoed-hardware-issues.rst: add AMD to the list
  Documentation: embargoed-hardware-issues.rst: clean out empty and unused entries
  Documentation: security-bugs.rst: clarify CVE handling
  Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group

12 months ago Merge tag 'probes-fixes-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 30 Jul 2023 18:27:22 +0000 (11:27 -0700)]
Merge tag 'probes-fixes-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probe fixes from Masami Hiramatsu:

 - probe-events: add NULL check for some BTF API calls which can return
   error code and NULL.

 - ftrace selftests: check fprobe and kprobe event correctly. This fixes
   a miss condition of the test command.

 - kprobes: do not allow probing functions that start with "__cfi_" or
   "__pfx_" since those are auto generated for kernel CFI and not
   executed.
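
For illustration only (not part of the pull request): a minimal sketch of the kind of symbol-name prefix test the kprobes item describes. The helper names are hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if string s begins with the given prefix. */
static bool demo_has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Reject auto-generated preamble symbols as probe targets. */
static bool demo_is_cfi_preamble(const char *name)
{
	return demo_has_prefix(name, "__cfi_") ||
	       demo_has_prefix(name, "__pfx_");
}
```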

* tag 'probes-fixes-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobes: Prohibit probing on CFI preamble symbol
  selftests/ftrace: Fix to check fprobe event eneblement
  tracing/probes: Fix to add NULL check for BTF APIs

12 months ago Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 30 Jul 2023 18:19:08 +0000 (11:19 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86:

   - Do not register IRQ bypass consumer if posted interrupts not
     supported

   - Fix missed device interrupt due to non-atomic update of IRR

   - Use GFP_KERNEL_ACCOUNT for pid_table in ipiv

   - Make VMREAD error path play nice with noinstr

   - x86: Acquire SRCU read lock when handling fastpath MSR writes

   - Support linking rseq tests statically against glibc 2.35+

   - Fix reference count for stats file descriptors

   - Detect userspace setting invalid CR0

  Non-KVM:

   - Remove coccinelle script that has repeatedly caused confusion
     ("debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE()
     usage", acked by Greg)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: selftests: Expand x86's sregs test to cover illegal CR0 values
  KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
  KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
  Revert "debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE() usage"
  KVM: selftests: Verify stats fd is usable after VM fd has been closed
  KVM: selftests: Verify stats fd can be dup()'d and read
  KVM: selftests: Verify userspace can create "redundant" binary stats files
  KVM: selftests: Explicitly free vcpus array in binary stats test
  KVM: selftests: Clean up stats fd in common stats_test() helper
  KVM: selftests: Use pread() to read binary stats header
  KVM: Grab a reference to KVM for VM and vCPU stats file descriptors
  selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
  Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"
  KVM: x86: Acquire SRCU read lock when handling fastpath MSR writes
  KVM: VMX: Use vmread_error() to report VM-Fail in "goto" path
  KVM: VMX: Make VMREAD error path play nice with noinstr
  KVM: x86/irq: Conditionally register IRQ bypass consumer again
  KVM: X86: Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
  KVM: x86: check the kvm_cpu_get_interrupt result before using it
  KVM: x86: VMX: set irr_pending in kvm_apic_update_irr
  ...

12 months ago Merge tag 'locking_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 30 Jul 2023 18:12:32 +0000 (11:12 -0700)]
Merge tag 'locking_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:

 - Fix a rtmutex race condition resulting from sharing of the sort key
   between the lock waiters and the PI chain tree (->pi_waiters) of a
   task by giving each tree their own sort key

* tag 'locking_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Fix task->pi_waiters integrity

12 months ago Merge tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 30 Jul 2023 18:05:35 +0000 (11:05 -0700)]
Merge tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - AMD's automatic IBRS doesn't enable cross-thread branch target
   injection protection (STIBP) for user processes. Enable STIBP on such
   systems.

 - Put the reference instead of deleting AMD MCE error thresholding
   sysfs kobjects when destroying them, so that the kernfs pointer is
   not deleted prematurely

 - Restore annotation in ret_from_fork_asm() in order to fix kthread
   stack unwinding from being marked as unreliable and thus breaking
   livepatching

* tag 'x86_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
  x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
  x86: Fix kthread unwind

12 months ago Merge tag 'irq_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 30 Jul 2023 17:59:19 +0000 (10:59 -0700)]
Merge tag 'irq_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Work around an erratum on GIC700, where a race between a CPU handling
   a wake-up interrupt, a change of affinity, and another CPU going to
   sleep can result in a lack of wake-up event on the next interrupt

 - Fix the locking required on a VPE for GICv4

 - Enable Rockchip 3588001 erratum workaround for RK3588S

 - Fix the irq-bcm6345-l1 assumption that the boot CPU is always the
   first CPU in the system

* tag 'irq_urgent_for_v6.5_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3: Workaround for GIC-700 erratum 2941627
  irqchip/gic-v3: Enable Rockchip 3588001 erratum workaround for RK3588S
  irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
  irq-bcm6345-l1: Do not assume a fixed block to cpu mapping

12 months ago Merge tag '6.5-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 30 Jul 2023 03:49:13 +0000 (20:49 -0700)]
Merge tag '6.5-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
 "Four small SMB3 client fixes:

   - two reconnect fixes (to address the case where non-default
     iocharset gets incorrectly overridden at reconnect with the
     default charset)

   - fix for NTLMSSP_AUTH request setting a flag incorrectly

   - Add missing check for invalid tlink (tree connection) in ioctl"

* tag '6.5-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: add missing return value check for cifs_sb_tlink
  smb3: do not set NTLMSSP_VERSION flag for negotiate not auth request
  cifs: fix charset issue in reconnection
  fs/nls: make load_nls() take a const parameter

12 months ago Merge tag 'trace-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 30 Jul 2023 03:40:43 +0000 (20:40 -0700)]
Merge tag 'trace-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix to /sys/kernel/tracing/per_cpu/cpu*/stats read and entries.

   If a resize shrinks the buffer it clears the read count to notify
   readers that they need to reset. But the read count is also used for
   accounting and this causes the numbers to be off. Instead, create a
   separate variable to use to notify readers to reset.

 - Fix the ref counts of the "soft disable" mode. The wrong value was
   used for testing if soft disable mode should be enabled or disabled,
   but instead, just change the logic to do the enable and disable in
   place when the SOFT_MODE is set or cleared.

 - Several kernel-doc fixes

 - Removal of unused external declarations

* tag 'trace-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix warning in trace_buffered_event_disable()
  ftrace: Remove unused extern declarations
  tracing: Fix kernel-doc warnings in trace_seq.c
  tracing: Fix kernel-doc warnings in trace_events_trigger.c
  tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.c
  ring-buffer: Fix kernel-doc warnings in ring_buffer.c
  ring-buffer: Fix wrong stat of cpu_buffer->read

12 months ago arch/*/configs/*defconfig: Replace AUTOFS4_FS by AUTOFS_FS
Sven Joachim [Thu, 27 Jul 2023 20:00:41 +0000 (22:00 +0200)]
arch/*/configs/*defconfig: Replace AUTOFS4_FS by AUTOFS_FS

Commit a2225d931f75 ("autofs: remove left-over autofs4 stubs")
promised the removal of the fs/autofs/Kconfig fragment for AUTOFS4_FS
within a couple of releases, but five years later this still has not
happened yet, and AUTOFS4_FS is still enabled in 63 defconfigs.

Get rid of it mechanically:

   git grep -l CONFIG_AUTOFS4_FS -- '*defconfig' |
       xargs sed -i 's/AUTOFS4_FS/AUTOFS_FS/'

Also just remove the AUTOFS4_FS config option stub.  Anybody who hasn't
regenerated their config file in the last five years will need to just
get the new name right when they do.

Signed-off-by: Sven Joachim <svenjoac@gmx.de>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 months ago Merge branch 'net-data-races'
David S. Miller [Sat, 29 Jul 2023 17:13:41 +0000 (18:13 +0100)]
Merge branch 'net-data-races'

Eric Dumazet says:

====================
net: annotate data-races

This series was inspired by a syzbot/KCSAN report.

This will later also permit some optimizations,
like not having to lock the socket while reading/writing
some of its fields.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: annotate data-races around sk->sk_priority
Eric Dumazet [Fri, 28 Jul 2023 15:03:18 +0000 (15:03 +0000)]
net: annotate data-races around sk->sk_priority

sk_getsockopt() runs locklessly. This means sk->sk_priority
can be read while other threads are changing its value.

Other reads also happen without socket lock being held.

Add missing annotations where needed.
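
For illustration only (not part of the patch): a minimal userspace sketch of what such annotations look like. DEMO_READ_ONCE and DEMO_WRITE_ONCE are simplified stand-ins for the kernel macros (they rely on the GNU C __typeof__ extension), and struct demo_sock is hypothetical. The volatile access asks the compiler to perform exactly one load or store; it is neither a lock nor a memory barrier.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical socket-like structure with one lockless field. */
struct demo_sock {
	uint32_t priority;
};

/* Lockless reader, in the spirit of a getsockopt() path. */
static uint32_t demo_get_priority(struct demo_sock *sk)
{
	return DEMO_READ_ONCE(sk->priority);
}

/* Writer side, annotated so it pairs with the lockless reader. */
static void demo_set_priority(struct demo_sock *sk, uint32_t val)
{
	DEMO_WRITE_ONCE(sk->priority, val);
}
```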

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: add missing data-race annotation for sk_ll_usec
Eric Dumazet [Fri, 28 Jul 2023 15:03:17 +0000 (15:03 +0000)]
net: add missing data-race annotation for sk_ll_usec

In a prior commit I forgot that sk_getsockopt() reads
sk->sk_ll_usec without holding a lock.

Fixes: 0dbffbb5335a ("net: annotate data race around sk_ll_usec")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: add missing data-race annotations around sk->sk_peek_off
Eric Dumazet [Fri, 28 Jul 2023 15:03:16 +0000 (15:03 +0000)]
net: add missing data-race annotations around sk->sk_peek_off

sk_getsockopt() runs locklessly, thus we need to annotate the read
of sk->sk_peek_off.

While we are at it, add corresponding annotations to sk_set_peek_off()
and unix_set_peek_off().

Fixes: b9bb53f3836f ("sock: convert sk_peek_offset functions to WRITE_ONCE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: annotate data-races around sk->sk_mark
Eric Dumazet [Fri, 28 Jul 2023 15:03:15 +0000 (15:03 +0000)]
net: annotate data-races around sk->sk_mark

sk->sk_mark is often read while another thread could change the value.

Fixes: 4a19ec5800fc ("[NET]: Introducing socket mark socket option.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: add missing READ_ONCE(sk->sk_rcvbuf) annotation
Eric Dumazet [Fri, 28 Jul 2023 15:03:14 +0000 (15:03 +0000)]
net: add missing READ_ONCE(sk->sk_rcvbuf) annotation

In a prior commit, I forgot to change sk_getsockopt()
when reading sk->sk_rcvbuf locklessly.

Fixes: ebb3b78db7bf ("tcp: annotate sk->sk_rcvbuf lockless reads")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: add missing READ_ONCE(sk->sk_sndbuf) annotation
Eric Dumazet [Fri, 28 Jul 2023 15:03:13 +0000 (15:03 +0000)]
net: add missing READ_ONCE(sk->sk_sndbuf) annotation

In a prior commit, I forgot to change sk_getsockopt()
when reading sk->sk_sndbuf locklessly.

Fixes: e292f05e0df7 ("tcp: annotate sk->sk_sndbuf lockless reads")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: annotate data-races around sk->sk_{rcv|snd}timeo
Eric Dumazet [Fri, 28 Jul 2023 15:03:12 +0000 (15:03 +0000)]
net: annotate data-races around sk->sk_{rcv|snd}timeo

sk_getsockopt() runs without locks, so we must add annotations
to sk->sk_rcvtimeo and sk->sk_sndtimeo.

In the future we might allow fetching these fields before
we lock the socket in TCP fast path.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: add missing READ_ONCE(sk->sk_rcvlowat) annotation
Eric Dumazet [Fri, 28 Jul 2023 15:03:11 +0000 (15:03 +0000)]
net: add missing READ_ONCE(sk->sk_rcvlowat) annotation

In a prior commit, I forgot to change sk_getsockopt()
when reading sk->sk_rcvlowat locklessly.

Fixes: eac66402d1c3 ("net: annotate sk->sk_rcvlowat lockless reads")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: annotate data-races around sk->sk_max_pacing_rate
Eric Dumazet [Fri, 28 Jul 2023 15:03:10 +0000 (15:03 +0000)]
net: annotate data-races around sk->sk_max_pacing_rate

sk_getsockopt() runs locklessly. This means sk->sk_max_pacing_rate
can be read while other threads are changing its value.

Fixes: 62748f32d501 ("net: introduce SO_MAX_PACING_RATE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months ago net: annotate data-race around sk->sk_txrehash
Eric Dumazet [Fri, 28 Jul 2023 15:03:09 +0000 (15:03 +0000)]
net: annotate data-race around sk->sk_txrehash

sk_getsockopt() runs locklessly. This means sk->sk_txrehash
can be read while other threads are changing its value.

Other locations were handled in commit cb6cd2cec799
("tcp: Change SYN ACK retransmit behaviour to account for rehash")

Fixes: 26859240e4ee ("txhash: Add socket option to control TX hash rethink behavior")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months agonet: annotate data-races around sk->sk_reserved_mem
Eric Dumazet [Fri, 28 Jul 2023 15:03:08 +0000 (15:03 +0000)]
net: annotate data-races around sk->sk_reserved_mem

sk_getsockopt() runs locklessly. This means sk->sk_reserved_mem
can be read while other threads are changing its value.

Add missing annotations where they are needed.

Fixes: 2bb2f5fb21b0 ("net: add new socket option SO_RESERVE_MEM")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months agonet: gro: fix misuse of CB in udp socket lookup
Richard Gobert [Thu, 27 Jul 2023 15:33:56 +0000 (17:33 +0200)]
net: gro: fix misuse of CB in udp socket lookup

This patch fixes a misuse of IP{6}CB(skb) in GRO when calling
`udp6_lib_lookup2` while handling udp tunnels. `udp6_lib_lookup2` fetches the
device from CB. The fix changes it to fetch the device from `skb->dev`.
The l3mdev case requires special attention since it has a master and a slave
device.

Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket")
Reported-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months agoqed: Fix scheduling in a tasklet while getting stats
Konstantin Khorenko [Thu, 27 Jul 2023 15:26:09 +0000 (18:26 +0300)]
qed: Fix scheduling in a tasklet while getting stats

A tasklet called usleep_range() in the PTT acquire logic, which triggers
the "scheduling while atomic" BUG().

  BUG: scheduling while atomic: swapper/24/0/0x00000100

   [<ffffffffb41c6199>] schedule+0x29/0x70
   [<ffffffffb41c5512>] schedule_hrtimeout_range_clock+0xb2/0x150
   [<ffffffffb41c55c3>] schedule_hrtimeout_range+0x13/0x20
   [<ffffffffb41c3bcf>] usleep_range+0x4f/0x70
   [<ffffffffc08d3e58>] qed_ptt_acquire+0x38/0x100 [qed]
   [<ffffffffc08eac48>] _qed_get_vport_stats+0x458/0x580 [qed]
   [<ffffffffc08ead8c>] qed_get_vport_stats+0x1c/0xd0 [qed]
   [<ffffffffc08dffd3>] qed_get_protocol_stats+0x93/0x100 [qed]
                        qed_mcp_send_protocol_stats
            case MFW_DRV_MSG_GET_LAN_STATS:
            case MFW_DRV_MSG_GET_FCOE_STATS:
            case MFW_DRV_MSG_GET_ISCSI_STATS:
            case MFW_DRV_MSG_GET_RDMA_STATS:
   [<ffffffffc08e36d8>] qed_mcp_handle_events+0x2d8/0x890 [qed]
                        qed_int_assertion
                        qed_int_attentions
   [<ffffffffc08d9490>] qed_int_sp_dpc+0xa50/0xdc0 [qed]
   [<ffffffffb3aa7623>] tasklet_action+0x83/0x140
   [<ffffffffb41d9125>] __do_softirq+0x125/0x2bb
   [<ffffffffb41d560c>] call_softirq+0x1c/0x30
   [<ffffffffb3a30645>] do_softirq+0x65/0xa0
   [<ffffffffb3aa78d5>] irq_exit+0x105/0x110
   [<ffffffffb41d8996>] do_IRQ+0x56/0xf0

Fix this by making the caller state whether it may be running in atomic
context when getting stats from the QED driver. Based on that context, the
QED driver decides whether or not to schedule out when acquiring the PTT
BAR window.
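
A minimal sketch of that idea, assuming a caller-supplied is_atomic flag selects between a busy-wait and a sleeping wait while retrying. struct toy_ptt_pool, toy_ptt_acquire() and the two wait helpers are hypothetical names for illustration; the counters stand in for real delay primitives so the behaviour can be observed.

```c
#include <stdbool.h>

/* Hypothetical model of a PTT window pool that is busy for a few polls. */
struct toy_ptt_pool {
	int busy_polls;		/* polls that will still find no free window */
	unsigned int sleeps;	/* sleeping waits taken (may schedule)       */
	unsigned int spins;	/* busy-waits taken (never schedules)        */
};

/* Stand-in for a sleeping delay; only legal in process context. */
static void toy_sleep(struct toy_ptt_pool *p) { p->sleeps++; }

/* Stand-in for a busy-wait delay; safe in atomic context. */
static void toy_busy_wait(struct toy_ptt_pool *p) { p->spins++; }

/* Returns 0 once a window is free, -1 if max_retries polls all failed. */
int toy_ptt_acquire(struct toy_ptt_pool *p, bool is_atomic, int max_retries)
{
	for (int i = 0; i < max_retries; i++) {
		if (p->busy_polls == 0)
			return 0;
		p->busy_polls--;
		/* The caller's context decides how we are allowed to wait. */
		if (is_atomic)
			toy_busy_wait(p);
		else
			toy_sleep(p);
	}
	return -1;
}
```

A tasklet-like caller would pass is_atomic = true and never reach the sleeping branch.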

We hit the BUG_ON() while getting vport stats, but according to the
code the same issue could happen for fcoe and iscsi statistics as well, so
fix them too.

Fixes: 6c75424612a7 ("qed: Add support for NCSI statistics.")
Fixes: 1e128c81290a ("qed: Add support for hardware offloaded FCoE.")
Fixes: 2f2b2614e893 ("qed: Provide iSCSI statistics to management")
Cc: Sudarsana Kalluru <skalluru@marvell.com>
Cc: David Miller <davem@davemloft.net>
Cc: Manish Chopra <manishc@marvell.com>
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months agonet: dsa: microchip: KSZ9477 register regmap alignment to 32 bit boundaries
Lukasz Majewski [Thu, 27 Jul 2023 08:13:42 +0000 (10:13 +0200)]
net: dsa: microchip: KSZ9477 register regmap alignment to 32 bit boundaries

Commit 5c844d57aa7894154e49cf2fc648bfe2f1aefc1c added code to apply the
"Module 6: Certain PHY registers must be written as pairs instead of
singly" erratum for KSZ9477: on this chip, certain PHY registers
(0xN120 to 0xN13F, N=1,2,3,4,5) must be accessed as 32 bit words instead
of with 16 or 8 bit accesses.
Otherwise, adjacent registers (no matter if reserved or not) are
overwritten with 0x0.

Without this patch some registers (e.g. 0x113c or 0x1134) required for 32
bit access are out of valid regmap ranges.

As a result, the following errors are observed and KSZ9477 is not properly
configured:

ksz-switch spi1.0: can't rmw 32bit reg 0x113c: -EIO
ksz-switch spi1.0: can't rmw 32bit reg 0x1134: -EIO
ksz-switch spi1.0 lan1 (uninitialized): failed to connect to PHY: -EIO
ksz-switch spi1.0 lan1 (uninitialized): error -5 setting up PHY for tree 0, switch 0, port 0

The solution is to modify regmap_reg_range to allow accesses on 4-byte
boundaries.
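
To illustrate why the range boundaries matter, here is a small sketch that rounds a register address down to its 32 bit word and checks the result against a table of valid ranges. struct toy_reg_range and both helpers are hypothetical, and the example range values are made up for illustration rather than taken from the driver's tables.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inclusive register range, similar in spirit to a regmap range. */
struct toy_reg_range {
	unsigned int min;
	unsigned int max;
};

/* True if reg falls inside any of the n ranges. */
bool toy_reg_in_ranges(unsigned int reg, const struct toy_reg_range *ranges,
		       size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (reg >= ranges[i].min && reg <= ranges[i].max)
			return true;
	return false;
}

/* Start address of the 32 bit word that contains reg. */
unsigned int toy_align32(unsigned int reg)
{
	return reg & ~0x3u;
}
```

With a range that starts at a 16 bit register such as 0x113e, the aligned address 0x113c is rejected; starting the range at 0x113c lets the 32 bit access through.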

Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months agonet: stmmac: tegra: Properly allocate clock bulk data
Thierry Reding [Wed, 26 Jul 2023 16:32:00 +0000 (18:32 +0200)]
net: stmmac: tegra: Properly allocate clock bulk data

The clock data is an array of struct clk_bulk_data, so make sure to
allocate enough memory.
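
For illustration only, the sketch below sizes an allocation by element count times element size, which is the general pattern for an array of bulk clock entries. It assumes the bug class is an undersized allocation; struct toy_clk_bulk_data and toy_alloc_clks() are hypothetical userspace stand-ins, not the driver's code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one bulk clock entry. */
struct toy_clk_bulk_data {
	const char *id;
	void *clk;
};

/*
 * Allocate one zeroed entry per clock name. calloc(count, size) sizes the
 * buffer for the whole array, not for a single pointer or a single element.
 * Returns NULL on allocation failure; the caller frees the result.
 */
struct toy_clk_bulk_data *toy_alloc_clks(const char *const *ids, size_t count)
{
	struct toy_clk_bulk_data *clks = calloc(count, sizeof(*clks));

	if (!clks)
		return NULL;
	for (size_t i = 0; i < count; i++)
		clks[i].id = ids[i];
	return clks;
}
```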

Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 months agoMerge tag 'loongarch-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 29 Jul 2023 15:59:25 +0000 (08:59 -0700)]
Merge tag 'loongarch-fixes-6.5-1' of git://git./linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Some bug fixes for build system, builtin cmdline handling, bpf and
  {copy, clear}_user, together with a trivial cleanup"

* tag 'loongarch-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Cleanup __builtin_constant_p() checking for cpu_has_*
  LoongArch: BPF: Fix check condition to call lu32id in move_imm()
  LoongArch: BPF: Enable bpf_probe_read{, str}() on LoongArch
  LoongArch: Fix return value underflow in exception path
  LoongArch: Fix CMDLINE_EXTEND and CMDLINE_BOOTLOADER handling
  LoongArch: Fix module relocation error with binutils 2.41
  LoongArch: Only fiddle with CHECKFLAGS if `need-compiler'

12 months agoKVM: selftests: Expand x86's sregs test to cover illegal CR0 values
Sean Christopherson [Tue, 13 Jun 2023 20:30:37 +0000 (13:30 -0700)]
KVM: selftests: Expand x86's sregs test to cover illegal CR0 values

Add coverage to x86's set_sregs_test to verify KVM rejects vendor-agnostic
illegal CR0 values, i.e. CR0 values whose legality doesn't depend on the
current VMX mode.  KVM historically has neglected to reject bad CR0s from
userspace, i.e. would happily accept a completely bogus CR0 via
KVM_SET_SREGS{2}.

Punt VMX specific subtests to future work, as they would require quite a
bit more effort, and KVM gets coverage for CR0 checks in general through
other means, e.g. KVM-Unit-Tests.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
Sean Christopherson [Tue, 13 Jun 2023 20:30:36 +0000 (13:30 -0700)]
KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest

Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only
if KVM itself is not configured to utilize unrestricted guests, i.e. don't
stuff CR0/CR4 for a restricted L2 that is running as the guest of an
unrestricted L1.  Any attempt to VM-Enter a restricted guest with invalid
CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should
never observe a restricted L2 with incompatible CR0/CR4, since nested
VM-Enter from L1 should have failed.

And if KVM does observe an active, restricted L2 with incompatible state,
e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail
does more harm than good, as KVM will often neglect to undo the side
effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus
the damage can easily spill over to L1.  On the other hand, letting
VM-Enter fail due to bad guest state is more likely to contain the damage
to L2 as KVM relies on hardware to perform most guest state consistency
checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into
L1 irrespective of (un)restricted guest behavior.

Cc: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
Sean Christopherson [Tue, 13 Jun 2023 20:30:35 +0000 (13:30 -0700)]
KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid

Reject KVM_SET_SREGS{2} with -EINVAL if the incoming CR0 is invalid,
e.g. due to setting bits 63:32, illegal combinations, or to a value that
isn't allowed in VMX (non-)root mode.  The VMX checks in particular are
"fun" as failure to disallow Real Mode for an L2 that is configured with
unrestricted guest disabled, when KVM itself has unrestricted guest
enabled, will result in KVM forcing VM86 mode to virtual Real Mode for
L2, but then fail to unwind the related metadata when synthesizing a
nested VM-Exit back to L1 (which has unrestricted guest enabled).
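
The mode-independent part of such a check can be sketched as below. This is an illustrative userspace model, not KVM's implementation: toy_cr0_is_valid() and the TOY_CR0_* names are hypothetical, and the VMX (non-)root mode checks described above are deliberately not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bit positions used by this sketch. */
#define TOY_CR0_PE (1ULL << 0)	/* protected mode enable */
#define TOY_CR0_NW (1ULL << 29)	/* not write-through     */
#define TOY_CR0_CD (1ULL << 30)	/* cache disable         */
#define TOY_CR0_PG (1ULL << 31)	/* paging                */

/* Returns true if cr0 passes the vendor-agnostic consistency checks. */
bool toy_cr0_is_valid(uint64_t cr0)
{
	/* Bits 63:32 are reserved and must be zero. */
	if (cr0 >> 32)
		return false;
	/* NW set with CD clear is an illegal combination. */
	if ((cr0 & TOY_CR0_NW) && !(cr0 & TOY_CR0_CD))
		return false;
	/* Paging requires protected mode. */
	if ((cr0 & TOY_CR0_PG) && !(cr0 & TOY_CR0_PE))
		return false;
	return true;
}
```

A caller modeling KVM_SET_SREGS{2} would return -EINVAL whenever this check fails.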

Opportunistically fix a benign typo in the prototype for is_valid_cr4().

Cc: stable@vger.kernel.org
Reported-by: syzbot+5feef0b9ee9c8e9e5689@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000f316b705fdf6e2b4@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoRevert "debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE() usage"
Sean Christopherson [Wed, 26 Jul 2023 20:29:20 +0000 (13:29 -0700)]
Revert "debugfs, coccinelle: check for obsolete DEFINE_SIMPLE_ATTRIBUTE() usage"

Remove coccinelle's recommendation to use DEFINE_DEBUGFS_ATTRIBUTE()
instead of DEFINE_SIMPLE_ATTRIBUTE().  Regardless of whether or not the
"significant overhead" incurred by debugfs_create_file() is actually
meaningful, warnings from the script have led to a rash of low-quality
patches that have sowed confusion and consumed maintainer time for little
to no benefit.  There have been no less than four attempts to "fix" KVM,
and a quick search on lore shows that KVM is not alone.

This reverts commit 5103068eaca290f890a30aae70085fac44cecaf6.

Link: https://lore.kernel.org/all/87tu2nbnz3.fsf@mpe.ellerman.id.au
Link: https://lore.kernel.org/all/c0b98151-16b6-6d8f-1765-0f7d46682d60@redhat.com
Link: https://lkml.kernel.org/r/20230706072954.4881-1-duminjie%40vivo.com
Link: https://lore.kernel.org/all/Y2FsbufV00jbyF0B@google.com
Link: https://lore.kernel.org/all/Y2ENJJ1YiSg5oHiy@orome
Link: https://lore.kernel.org/all/7560b350e7b23786ce712118a9a504356ff1cca4.camel@kernel.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230726202920.507756-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: selftests: Verify stats fd is usable after VM fd has been closed
Sean Christopherson [Tue, 11 Jul 2023 23:01:31 +0000 (16:01 -0700)]
KVM: selftests: Verify stats fd is usable after VM fd has been closed

Verify that VM and vCPU binary stats files are usable even after userspace
has put its last direct reference to the VM.  This is a regression test
for a UAF bug where KVM didn't gift the stats files a reference to the VM.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: selftests: Verify stats fd can be dup()'d and read
Sean Christopherson [Tue, 11 Jul 2023 23:01:30 +0000 (16:01 -0700)]
KVM: selftests: Verify stats fd can be dup()'d and read

Expand the binary stats test to verify that a stats fd can be dup()'d
and read, to (very) roughly simulate userspace passing around the file.
Adding the dup() test is primarily an intermediate step towards verifying
that userspace can read VM/vCPU stats before _and_ after userspace closes
its copy of the VM fd; the dup() test itself is only mildly interesting.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: selftests: Verify userspace can create "redundant" binary stats files
Sean Christopherson [Tue, 11 Jul 2023 23:01:29 +0000 (16:01 -0700)]
KVM: selftests: Verify userspace can create "redundant" binary stats files

Verify that KVM doesn't artificially limit KVM_GET_STATS_FD to a single
file per VM/vCPU.  There's no known use case for getting multiple stats
fds, but it should work, and more importantly creating multiple files will
make it easier to test that KVM correctly manages VM refcounts for stats
files.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: selftests: Explicitly free vcpus array in binary stats test
Sean Christopherson [Tue, 11 Jul 2023 23:01:28 +0000 (16:01 -0700)]
KVM: selftests: Explicitly free vcpus array in binary stats test

Explicitly free the all-encompassing vcpus array in the binary stats test
so that the test is consistent with respect to freeing all dynamically
allocated resources (versus letting them be freed on exit).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: selftests: Clean up stats fd in common stats_test() helper
Sean Christopherson [Tue, 11 Jul 2023 23:01:27 +0000 (16:01 -0700)]
KVM: selftests: Clean up stats fd in common stats_test() helper

Move the stats fd cleanup code into stats_test() and drop the
superfluous vm_stats_test() and vcpu_stats_test() helpers in order to
decouple creation of the stats file from consuming/testing the file
(deduping code is a bonus).  This will make it easier to test various
edge cases related to stats, e.g. that userspace can dup() a stats fd,
that userspace can have multiple stats files for a single VM/vCPU, etc.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: selftests: Use pread() to read binary stats header
Sean Christopherson [Tue, 11 Jul 2023 23:01:26 +0000 (16:01 -0700)]
KVM: selftests: Use pread() to read binary stats header

Use pread() with an explicit offset when reading the header and the header
name for a binary stats fd so that the common helper and the binary stats
test don't subtly rely on the file effectively being untouched, e.g. to
allow multiple reads of the header, name, etc.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230711230131.648752-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: Grab a reference to KVM for VM and vCPU stats file descriptors
Sean Christopherson [Tue, 11 Jul 2023 23:01:25 +0000 (16:01 -0700)]
KVM: Grab a reference to KVM for VM and vCPU stats file descriptors

Grab a reference to KVM prior to installing VM and vCPU stats file
descriptors to ensure the underlying VM and vCPU objects are not freed
until the last reference to any and all stats fds are dropped.
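
The lifetime rule can be modeled with a plain reference count, as in the sketch below. struct toy_vm and the toy_* helpers are hypothetical and only mirror the idea of each stats file holding its own reference; they are not KVM's functions.

```c
#include <stdbool.h>

/* Hypothetical VM object whose lifetime is governed by a refcount. */
struct toy_vm {
	int refcount;
	bool destroyed;
};

/* The VM fd itself holds the initial reference. */
void toy_vm_init(struct toy_vm *vm)
{
	vm->refcount = 1;
	vm->destroyed = false;
}

void toy_vm_get(struct toy_vm *vm)
{
	vm->refcount++;
}

/* Dropping the last reference destroys the VM. */
void toy_vm_put(struct toy_vm *vm)
{
	if (--vm->refcount == 0)
		vm->destroyed = true;
}

/* Opening a stats file takes a reference on behalf of that file. */
struct toy_vm *toy_stats_fd_open(struct toy_vm *vm)
{
	toy_vm_get(vm);
	return vm;
}

/* Releasing the stats file drops the reference it was given. */
void toy_stats_fd_release(struct toy_vm *vm)
{
	toy_vm_put(vm);
}
```

Without the toy_vm_get() in toy_stats_fd_open(), closing the VM fd first would destroy the VM while the stats file still points at it.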

Note, the stats paths manually invoke fd_install() and so don't need to
grab a reference before creating the file.

Fixes: ce55c049459c ("KVM: stats: Support binary stats retrieval for a VCPU")
Fixes: fcfe1baeddbf ("KVM: stats: Support binary stats retrieval for a VM")
Reported-by: Zheng Zhang <zheng.zhang@email.ucr.edu>
Closes: https://lore.kernel.org/all/CAC_GQSr3xzZaeZt85k_RCBd5kfiOve8qXo7a81Cq53LuVQ5r=Q@mail.gmail.com
Cc: stable@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Message-Id: <20230711230131.648752-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoselftests/rseq: Play nice with binaries statically linked against glibc 2.35+
Sean Christopherson [Fri, 21 Jul 2023 22:33:52 +0000 (15:33 -0700)]
selftests/rseq: Play nice with binaries statically linked against glibc 2.35+

To allow running rseq and KVM's rseq selftests as statically linked
binaries, initialize the various "trampoline" pointers to point directly
at the expected glibc symbols, and skip the dlsym() lookups if the rseq
size is non-zero, i.e. the binary is statically linked *and* the libc
registered its own rseq.

Define weak versions of the symbols so as not to break linking against
libc versions that don't support rseq in any capacity.

The KVM selftests in particular are often statically linked so that they
can be run on targets with very limited runtime environments, i.e. test
machines.

Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35")
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230721223352.2333911-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoRevert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"
Sean Christopherson [Fri, 21 Jul 2023 22:43:37 +0000 (15:43 -0700)]
Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"

Now that handle_fastpath_set_msr_irqoff() acquires kvm->srcu, i.e. allows
dereferencing memslots during WRMSR emulation, drop the requirement that
"next RIP" is valid.  In hindsight, acquiring kvm->srcu would have been a
better fix than avoiding the fastpath, but at the time it was thought that
accessing SRCU-protected data in the fastpath was a one-off edge case.

This reverts commit 5c30e8101e8d5d020b1d7119117889756a6ed713.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230721224337.2335137-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: x86: Acquire SRCU read lock when handling fastpath MSR writes
Sean Christopherson [Fri, 21 Jul 2023 22:43:36 +0000 (15:43 -0700)]
KVM: x86: Acquire SRCU read lock when handling fastpath MSR writes

Temporarily acquire kvm->srcu for read when potentially emulating WRMSR in
the VM-Exit fastpath handler, as several of the common helpers used during
emulation expect the caller to provide SRCU protection.  E.g. if the guest
is counting instructions retired, KVM will query the PMU event filter when
stepping over the WRMSR.

  dump_stack+0x85/0xdf
  lockdep_rcu_suspicious+0x109/0x120
  pmc_event_is_allowed+0x165/0x170
  kvm_pmu_trigger_event+0xa5/0x190
  handle_fastpath_set_msr_irqoff+0xca/0x1e0
  svm_vcpu_run+0x5c3/0x7b0 [kvm_amd]
  vcpu_enter_guest+0x2108/0x2580

Alternatively, check_pmu_event_filter() could acquire kvm->srcu, but this
isn't the first bug of this nature, e.g. see commit 5c30e8101e8d ("KVM:
SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid").  Providing
protection for the entirety of WRMSR emulation will allow reverting the
aforementioned commit, and will avoid having to play whack-a-mole when new
uses of SRCU-protected structures are inevitably added in common emulation
helpers.

Fixes: dfdeda67ea2d ("KVM: x86/pmu: Prevent the PMU from counting disallowed events")
Reported-by: Greg Thelen <gthelen@google.com>
Reported-by: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230721224337.2335137-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: VMX: Use vmread_error() to report VM-Fail in "goto" path
Sean Christopherson [Fri, 21 Jul 2023 23:56:37 +0000 (16:56 -0700)]
KVM: VMX: Use vmread_error() to report VM-Fail in "goto" path

Use vmread_error() to report VM-Fail on VMREAD for the "asm goto" case,
now that the trampoline case has yet another wrapper around vmread_error() to
play nice with instrumentation.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230721235637.2345403-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: VMX: Make VMREAD error path play nice with noinstr
Sean Christopherson [Fri, 21 Jul 2023 23:56:36 +0000 (16:56 -0700)]
KVM: VMX: Make VMREAD error path play nice with noinstr

Mark vmread_error_trampoline() as noinstr, and add a second trampoline
for the CONFIG_CC_HAS_ASM_GOTO_OUTPUT=n case to enable instrumentation
when handling VM-Fail on VMREAD.  VMREAD is used in various noinstr
flows, e.g. immediately after VM-Exit, and objtool rightly complains that
the call to the error trampoline leaves a no-instrumentation section
without annotating that it's safe to do so.

  vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0xc9:
  call to vmread_error_trampoline() leaves .noinstr.text section

Note, strictly speaking, enabling instrumentation in the VM-Fail path
isn't exactly safe, but if VMREAD fails the kernel/system is likely hosed
anyways, and logging that there is a fatal error is more important than
*maybe* encountering slightly unsafe instrumentation.

Reported-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230721235637.2345403-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: x86/irq: Conditionally register IRQ bypass consumer again
Like Xu [Mon, 24 Jul 2023 11:12:36 +0000 (19:12 +0800)]
KVM: x86/irq: Conditionally register IRQ bypass consumer again

As was attempted in commit 14717e203186 ("kvm: Conditionally register IRQ
bypass consumer"): "if we don't support a mechanism for bypassing IRQs,
don't register as a consumer.  Initially this applied to AMD processors,
but when AVIC support was implemented for assigned devices,
kvm_arch_has_irq_bypass() was always returning true.

We can still skip registering the consumer where enable_apicv
or posted-interrupts capability is unsupported or globally disabled.
This eliminates meaningless dev_info()s when the connect fails
between producer and consumer", such as on Linux hosts where enable_apicv
or posted-interrupts capability is unsupported or globally disabled.

Cc: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Yong He <alexyonghe@tencent.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217379
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20230724111236.76570-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: X86: Use GFP_KERNEL_ACCOUNT for pid_table in ipiv
Peng Hao [Fri, 28 Jul 2023 06:49:48 +0000 (14:49 +0800)]
KVM: X86: Use GFP_KERNEL_ACCOUNT for pid_table in ipiv

The pid_table of ipiv is persistent memory allocated per-vcpu,
which should be charged to the memory cgroup.

Signed-off-by: Peng Hao <flyingpeng@tencent.com>
Message-Id: <CAPm50aLxCQ3TQP2Lhc0PX3y00iTRg+mniLBqNDOC=t9CLxMwwA@mail.gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: x86: check the kvm_cpu_get_interrupt result before using it
Maxim Levitsky [Wed, 26 Jul 2023 13:59:45 +0000 (16:59 +0300)]
KVM: x86: check the kvm_cpu_get_interrupt result before using it

The code was blindly assuming that kvm_cpu_get_interrupt never returns -1
when there is a pending interrupt.

While this should be true, a bug in KVM can still cause -1 to be returned.

If -1 is returned, the code before this patch converted it to 0xFF
and injected interrupt 0xFF into the guest, which resulted in an issue
that was hard to debug.

Add WARN_ON_ONCE to catch this case and skip the injection
if this happens again.
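
The truncation is easy to reproduce in isolation: narrowing -1 to an 8 bit vector yields 0xFF. The sketch below checks the value before narrowing; toy_pick_vector() is a hypothetical helper for illustration, not the KVM code path.

```c
#include <stdint.h>

/*
 * Validate an interrupt number before narrowing it to an 8 bit vector.
 * Returns 0 and stores the vector on success, or -1 (leaving *vector
 * untouched) when irq is negative, i.e. "no interrupt".
 */
int toy_pick_vector(int irq, uint8_t *vector)
{
	if (irq < 0)
		return -1;	/* skip injection instead of injecting 0xFF */
	*vector = (uint8_t)irq;
	return 0;
}
```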

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20230726135945.260841-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: x86: VMX: set irr_pending in kvm_apic_update_irr
Maxim Levitsky [Wed, 26 Jul 2023 13:59:44 +0000 (16:59 +0300)]
KVM: x86: VMX: set irr_pending in kvm_apic_update_irr

When the APICv is inhibited, the irr_pending optimization is used.

Therefore, when kvm_apic_update_irr sets bits in the IRR,
it must set irr_pending to true as well.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20230726135945.260841-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agoKVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically
Maxim Levitsky [Wed, 26 Jul 2023 13:59:43 +0000 (16:59 +0300)]
KVM: x86: VMX: __kvm_apic_update_irr must update the IRR atomically

If APICv is inhibited, then IPIs from peer vCPUs are done by
atomically setting bits in IRR.

This means that when __kvm_apic_update_irr copies PIR to IRR,
it has to modify IRR atomically as well.
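
As a sketch of the requirement, the following uses C11 atomics as a stand-in for the kernel's atomic primitives: pending bits are taken from PIR with an atomic exchange and merged into IRR with an atomic OR, so bits set concurrently by another thread are not lost. toy_update_irr() and TOY_IRR_WORDS are hypothetical names, and the word layout is simplified.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TOY_IRR_WORDS 8	/* 256 vectors as eight 32 bit words */

/* Move all pending bits from pir into irr without losing concurrent sets. */
void toy_update_irr(_Atomic uint32_t *pir, _Atomic uint32_t *irr)
{
	for (int i = 0; i < TOY_IRR_WORDS; i++) {
		/* Atomically fetch and clear the posted bits. */
		uint32_t pending = atomic_exchange(&pir[i], 0);

		/* Atomic read-modify-write; a plain load/OR/store could drop
		 * a bit that a peer set between the load and the store. */
		if (pending)
			atomic_fetch_or(&irr[i], pending);
	}
}
```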

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20230726135945.260841-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
12 months agokprobes: Prohibit probing on CFI preamble symbol
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 01:50:47 +0000 (10:50 +0900)]
kprobes: Prohibit probing on CFI preamble symbol

Do not allow probing on symbols that start with "__cfi_" or "__pfx_", because
those are used for CFI and are not executed. Probing them will break CFI.
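
A prefix test of this kind can be sketched in a few lines. toy_has_prefix() and toy_is_cfi_preamble_symbol() are hypothetical helpers written with standard C string functions, not the kernel's helpers.

```c
#include <stdbool.h>
#include <string.h>

/* True if s begins with prefix. */
static bool toy_has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* True if the symbol name marks a CFI preamble that must not be probed. */
bool toy_is_cfi_preamble_symbol(const char *sym)
{
	return toy_has_prefix(sym, "__cfi_") || toy_has_prefix(sym, "__pfx_");
}
```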

Link: https://lore.kernel.org/all/168904024679.116016.18089228029322008512.stgit@devnote2/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months agoKVM: s390: fix sthyi error handling
Heiko Carstens [Thu, 27 Jul 2023 18:29:39 +0000 (20:29 +0200)]
KVM: s390: fix sthyi error handling

Commit 9fb6c9b3fea1 ("s390/sthyi: add cache to store hypervisor info")
added cache handling for store hypervisor info. This also changed the
possible return code for sthyi_fill().

Instead of only returning a condition code like the sthyi instruction would
do, it can now also return a negative error value (-ENOMEM). handle_sthyi()
was not changed accordingly. In case of an error, the negative error value
would be incorrectly injected into the guest PSW.

Add proper error handling to prevent this, and update the comment which
describes the possible return values of sthyi_fill().
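
One plausible shape of such handling, shown as a userspace sketch: a negative value from the fill step is treated as a host-side error and returned, and only non-negative values are stored as the guest condition code. struct toy_guest and toy_handle_sthyi() are hypothetical and do not reproduce the actual handler.

```c
/* Hypothetical guest state holding only the condition code. */
struct toy_guest {
	int cc;
};

/*
 * fill_rc is the result of the (hypothetical) fill step: a condition
 * code >= 0, or a negative errno-style value on host failure.
 */
int toy_handle_sthyi(struct toy_guest *g, int fill_rc)
{
	if (fill_rc < 0)
		return fill_rc;	/* propagate the error, leave the guest cc alone */
	g->cc = fill_rc;
	return 0;
}
```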

Fixes: 9fb6c9b3fea1 ("s390/sthyi: add cache to store hypervisor info")
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20230727182939.2050744-1-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
12 months agomISDN: hfcpci: Fix potential deadlock on &hc->lock
Chengfeng Ye [Thu, 27 Jul 2023 08:56:19 +0000 (08:56 +0000)]
mISDN: hfcpci: Fix potential deadlock on &hc->lock

As &hc->lock is acquired by both timer _hfcpci_softirq() and hardirq
hfcpci_int(), the timer should disable irq before lock acquisition;
otherwise deadlock could happen if the timer is preempted by the hard irq.

Possible deadlock scenario:
hfcpci_softirq() (timer)
    -> _hfcpci_softirq()
    -> spin_lock(&hc->lock);
        <irq interruption>
        -> hfcpci_int()
        -> spin_lock(&hc->lock); (deadlock here)

This flaw was found by an experimental static analysis tool I am developing
for irq-related deadlock.

The tentative patch fixes the potential deadlock by using spin_lock_irq()
in the timer.

Fixes: b36b654a7e82 ("mISDN: Create /sys/class/mISDN")
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Link: https://lore.kernel.org/r/20230727085619.7419-1-dg573847474@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 months agoMerge tag 'ata-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sat, 29 Jul 2023 01:31:18 +0000 (18:31 -0700)]
Merge tag 'ata-6.5-rc4' of git://git./linux/kernel/git/dlemoal/libata

Pull ata fixes from Damien Le Moal:

 - Fix error message output in the pata_arasan_cf driver (Minjie)

 - Fix invalid error return in the pata_octeon_cf driver initialization
   (Yingliang)

 - Fix a compilation warning due to a missing static function
   declaration in the pata_ns87415 driver (Arnd)

 - Fix the condition evaluating when to fetch sense data for successful
   completions, which should be done only when command duration limits
   are being used (Niklas)

* tag 'ata-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libata-core: fix when to fetch sense data for successful commands
  ata: pata_ns87415: mark ns87560_tf_read static
  ata: pata_octeon_cf: fix error return code in octeon_cf_probe()
  ata: pata_arasan_cf: Use dev_err_probe() instead dev_err() in data_xfer()

12 months agonet: sched: cls_u32: Fix match key mis-addressing
Jamal Hadi Salim [Wed, 26 Jul 2023 13:51:51 +0000 (09:51 -0400)]
net: sched: cls_u32: Fix match key mis-addressing

A match entry is uniquely identified with an "address" or "path" in the
form of: hashtable ID(12b):bucketid(8b):nodeid(12b).

When creating table match entries, the hash table id, bucket id and
node (match entry id) must each be either specified by the user or
filled in with reasonable in-kernel defaults. The in-kernel default for a table id is
0x800 (omnipresent root table); for bucketid it is 0x0. Prior to this fix there
was none for a nodeid, i.e. the code assumed that the user passed the correct
nodeid, and if the user passed a nodeid of 0 (as Mingi Cho did) then that is what
was used. But a nodeid of 0 is reserved for identifying the table. This is not
a problem until we dump. The dump code notices that the nodeid is zero and
assumes it is referencing a table, and therefore references table struct
tc_u_hnode instead of what was created, i.e. match entry struct tc_u_knode.

Mingi does an equivalent of:
tc filter add dev dummy0 parent 10: prio 1 handle 0x1000 \
protocol ip u32 match ip src 10.0.0.1/32 classid 10:1 action ok

Essentially specifying a table id 0, bucketid 1 and nodeid of zero.
Tableid 0 is remapped to the default of 0x800.
Bucketid 1 is ignored and defaults to 0x00.
Nodeid was assumed to be what Mingi passed - 0x000
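
To make the 12b:8b:12b layout concrete, the helpers below split a 32 bit handle into its three fields. The toy_u32_* names are hypothetical and written for this example only; they follow the layout described above.

```c
#include <stdint.h>

/* Top 12 bits: hash table id. */
uint32_t toy_u32_htid(uint32_t handle)
{
	return handle >> 20;
}

/* Middle 8 bits: bucket id within the table. */
uint32_t toy_u32_bucket(uint32_t handle)
{
	return (handle >> 12) & 0xff;
}

/* Low 12 bits: node (match entry) id; 0 is reserved for the table itself. */
uint32_t toy_u32_node(uint32_t handle)
{
	return handle & 0xfff;
}
```

For the handle 0x1000 used in the command above, these return table 0, bucket 1 and node 0, matching the breakdown in the text.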

dumping before fix shows:
~$ tc filter ls dev dummy0 parent 10:
filter protocol ip pref 1 u32 chain 0
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor -30591

Note that the last line reports a table instead of a match entry
(you can tell this because it says "ht divisor...").
As a result of reporting the wrong data type (misinterpreting struct
tc_u_knode as being struct tc_u_hnode) the divisor is reported with value
of -30591. Mingi identified this as part of the heap address
(physmap_base is 0xffff8880 (-30591 - 1)).

The fix is to ensure that when table entry matches are added and no
nodeid is specified (i.e nodeid == 0) then we get the next available
nodeid from the table's pool.

After the fix, this is what the dump shows:
$ tc filter ls dev dummy0 parent 10:
filter protocol ip pref 1 u32 chain 0
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1
filter protocol ip pref 1 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:1 not_in_hw
  match 0a000001/ffffffff at 12
action order 1: gact action pass
 random type none pass val 0
 index 1 ref 1 bind 1

Reported-by: Mingi Cho <mgcho.minic@gmail.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230726135151.416917-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 months ago tracing: Fix warning in trace_buffered_event_disable()
Zheng Yejian [Wed, 26 Jul 2023 09:58:04 +0000 (17:58 +0800)]
tracing: Fix warning in trace_buffered_event_disable()

Warning happened in trace_buffered_event_disable() at
  WARN_ON_ONCE(!trace_buffered_event_ref)

  Call Trace:
   ? __warn+0xa5/0x1b0
   ? trace_buffered_event_disable+0x189/0x1b0
   __ftrace_event_enable_disable+0x19e/0x3e0
   free_probe_data+0x3b/0xa0
   unregister_ftrace_function_probe_func+0x6b8/0x800
   event_enable_func+0x2f0/0x3d0
   ftrace_process_regex.isra.0+0x12d/0x1b0
   ftrace_filter_write+0xe6/0x140
   vfs_write+0x1c9/0x6f0
   [...]

The cause of the warning is that, in __ftrace_event_enable_disable(),
trace_buffered_event_enable() was called once while
trace_buffered_event_disable() was called twice.
A reproduction script is shown below; see the comments for the analysis:
 ```
 #!/bin/bash

 cd /sys/kernel/tracing/

 # 1. Register a 'disable_event' command, then:
 #    1) SOFT_DISABLED_BIT was set;
 #    2) trace_buffered_event_enable() was called first time;
 echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
     set_ftrace_filter

 # 2. Enable the event registered, then:
 #    1) SOFT_DISABLED_BIT was cleared;
 #    2) trace_buffered_event_disable() was called first time;
 echo 1 > events/initcall/initcall_finish/enable

 # 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
 #    set again!!!
 cat /proc/cmdline

 # 4. Unregister the 'disable_event' command, then:
 #    1) SOFT_DISABLED_BIT was cleared again;
 #    2) trace_buffered_event_disable() was called second time!!!
 echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
     set_ftrace_filter
 ```

To fix it, IIUC, we can call trace_buffered_event_enable() the first
time soft-mode is enabled, and call trace_buffered_event_disable() the
last time soft-mode is disabled.
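
The pairing described above can be modelled in plain C as follows. This is
only a sketch of the first-enable/last-disable idea under simplified
assumptions, not the actual patch: the struct and function names are
hypothetical, and buffered_event_ref merely stands in for
trace_buffered_event_ref.

```c
#include <stdbool.h>

/* Hypothetical model of one event's soft-mode bookkeeping. */
struct soft_mode_state {
	int soft_mode_users;	/* callers currently holding soft mode */
	int buffered_event_ref;	/* stands in for trace_buffered_event_ref */
};

/* Take soft mode; only the 0 -> 1 transition takes a buffer reference. */
static void soft_mode_get(struct soft_mode_state *s)
{
	if (s->soft_mode_users++ == 0)
		s->buffered_event_ref++;
}

/*
 * Drop soft mode; only the 1 -> 0 transition drops the buffer reference.
 * Returns false on an unbalanced put, the situation the WARN_ON_ONCE()
 * in the report was catching.
 */
static bool soft_mode_put(struct soft_mode_state *s)
{
	if (s->soft_mode_users == 0)
		return false;
	if (--s->soft_mode_users == 0)
		s->buffered_event_ref--;
	return true;
}
```

Because the reference follows soft-mode transitions rather than every change
of SOFT_DISABLED_BIT, each increment is matched by exactly one decrement no
matter how often the bit is toggled in between.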

Link: https://lore.kernel.org/linux-trace-kernel/20230726095804.920457-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Fixes: 0fc1b09ff1ff ("tracing: Use temp buffer when filtering events")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months ago Merge tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sat, 29 Jul 2023 00:19:52 +0000 (17:19 -0700)]
Merge tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git./linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "11 hotfixes. Five are cc:stable and the remainder address post-6.4
  issues or aren't considered serious enough to justify backporting"

* tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/memory-failure: fix hardware poison check in unpoison_memory()
  proc/vmcore: fix signedness bug in read_from_oldmem()
  mailmap: update remaining active codeaurora.org email addresses
  mm: lock VMA in dup_anon_vma() before setting ->anon_vma
  mm: fix memory ordering for mm_lock_seq and vm_lock_seq
  scripts/spelling.txt: remove 'thead' as a typo
  mm/pagewalk: fix EFI_PGT_DUMP of espfix area
  shmem: minor fixes to splice-read implementation
  tmpfs: fix Documentation of noswap and huge mount options
  Revert "um: Use swap() to make code cleaner"
  mm/damon/core-test: initialise context before test in damon_test_set_attrs()

12 months ago Merge tag 'thermal-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Sat, 29 Jul 2023 00:14:05 +0000 (17:14 -0700)]
Merge tag 'thermal-6.5-rc4' of git://git./linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "Constify thermal_zone_device_register() parameters, which was omitted
  by mistake, and fix a double free on thermal zone unregistration in
  the generic DT thermal driver (Ahmad Fatoum)"

* tag 'thermal-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: of: fix double-free on unregistration
  thermal: core: constify params in thermal_zone_device_register

12 months ago Merge tag 'pm-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Sat, 29 Jul 2023 00:08:59 +0000 (17:08 -0700)]
Merge tag 'pm-6.5-rc4' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Fix the arming of wakeup IRQs in the generic wakeup IRQ code
  (wakeirq), drop unused functions from it and fix up a driver using it
  and trying to work around the IRQ arming issue in a questionable way
  (Johan Hovold)"

* tag 'pm-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  serial: qcom-geni: drop bogus runtime pm state update
  PM: sleep: wakeirq: drop unused enable helpers
  PM: sleep: wakeirq: fix wake irq arming

12 months ago Merge tag 'hwmon-for-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Sat, 29 Jul 2023 00:02:11 +0000 (17:02 -0700)]
Merge tag 'hwmon-for-v6.5-rc4' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - k10temp: Display negative temperatures for industrial processors

 - pmbus core: Fix deadlock, NULL pointer dereference, and chip enable
   detection

 - nct7802: Do not display PECI1 temperature if disabled

 - nct6775: Fix IN scaling factors and feature detection for
   NCT6798/6799

 - oxp-sensors: Fix race condition during device attribute creation

 - aquacomputer_d5next: Fix incorrect PWM value readout

* tag 'hwmon-for-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (k10temp) Enable AMD3255 Proc to show negative temperature
  hwmon: (pmbus_core) Fix Deadlock in pmbus_regulator_get_status
  hwmon: (pmbus_core) Fix NULL pointer dereference
  hwmon: (pmbus_core) Fix pmbus_is_enabled()
  hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
  hwmon: (nct6775) Fix IN scaling factors for 6798/6799
  hwmon: (oxp-sensors) Move tt_toggle attribute to dev_groups
  hwmon: (aquacomputer_d5next) Fix incorrect PWM value readout
  hwmon: (nct6775) Fix register for nct6799

12 months ago ftrace: Remove unused extern declarations
YueHaibing [Tue, 25 Jul 2023 13:48:08 +0000 (21:48 +0800)]
ftrace: Remove unused extern declarations

Commit 6a9c981b1e96 ("ftrace: Remove unused function ftrace_arch_read_dyn_info()")
left the ftrace_arch_read_dyn_info() extern declaration behind,
and commit 1d74f2a0f64b ("ftrace: remove ftrace_ip_converted()")
left the ftrace_ip_converted() declaration behind.

Link: https://lore.kernel.org/linux-trace-kernel/20230725134808.9716-1-yuehaibing@huawei.com
Cc: <mhiramat@kernel.org>
Cc: <mark.rutland@arm.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months ago tracing: Fix kernel-doc warnings in trace_seq.c
Gaosheng Cui [Mon, 24 Jul 2023 14:08:27 +0000 (22:08 +0800)]
tracing: Fix kernel-doc warnings in trace_seq.c

Fix kernel-doc warning:

kernel/trace/trace_seq.c:142: warning: Function parameter or member
'args' not described in 'trace_seq_vprintf'

Link: https://lkml.kernel.org/r/20230724140827.1023266-5-cuigaosheng1@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months ago tracing: Fix kernel-doc warnings in trace_events_trigger.c
Gaosheng Cui [Mon, 24 Jul 2023 14:08:26 +0000 (22:08 +0800)]
tracing: Fix kernel-doc warnings in trace_events_trigger.c

Fix kernel-doc warnings:

kernel/trace/trace_events_trigger.c:59: warning: Function parameter
or member 'buffer' not described in 'event_triggers_call'
kernel/trace/trace_events_trigger.c:59: warning: Function parameter
or member 'event' not described in 'event_triggers_call'

Link: https://lkml.kernel.org/r/20230724140827.1023266-4-cuigaosheng1@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months ago tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.c
Gaosheng Cui [Mon, 24 Jul 2023 14:08:25 +0000 (22:08 +0800)]
tracing/synthetic: Fix kernel-doc warnings in trace_events_synth.c

Fix kernel-doc warning:

kernel/trace/trace_events_synth.c:1257: warning: Function parameter
or member 'mod' not described in 'synth_event_gen_cmd_array_start'

Link: https://lkml.kernel.org/r/20230724140827.1023266-3-cuigaosheng1@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months ago ring-buffer: Fix kernel-doc warnings in ring_buffer.c
Gaosheng Cui [Mon, 24 Jul 2023 14:08:24 +0000 (22:08 +0800)]
ring-buffer: Fix kernel-doc warnings in ring_buffer.c

Fix kernel-doc warnings:

kernel/trace/ring_buffer.c:954: warning: Function parameter or
member 'cpu' not described in 'ring_buffer_wake_waiters'
kernel/trace/ring_buffer.c:3383: warning: Excess function parameter
'event' description in 'ring_buffer_unlock_commit'
kernel/trace/ring_buffer.c:5359: warning: Excess function parameter
'cpu' description in 'ring_buffer_reset_online_cpus'

Link: https://lkml.kernel.org/r/20230724140827.1023266-2-cuigaosheng1@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
12 months ago Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 28 Jul 2023 23:55:56 +0000 (16:55 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Several smaller driver fixes and a core RDMA CM regression fix:

   - Fix improperly accepting flags from userspace in mlx4

   - Add missing DMA barriers for irdma

   - Fix two kcsan warnings in irdma

   - Report the correct CQ op code to userspace in irdma

   - Report the correct MW bind error code for irdma

   - Load the destination address in RDMA CM to resolve a recent
     regression

   - Fix a QP regression in mthca

   - Remove a race processing completions in bnxt_re resulting in a
     crash

   - Fix driver unloading races with interrupts and tasklets in bnxt_re

   - Fix missing error unwind in rxe"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Report correct WC error
  RDMA/irdma: Fix op_type reporting in CQEs
  RDMA/rxe: Fix an error handling path in rxe_bind_mw()
  RDMA/bnxt_re: Fix hang during driver unload
  RDMA/bnxt_re: Prevent handling any completions after qp destroy
  RDMA/mthca: Fix crash when polling CQ for shared QPs
  RDMA/core: Update CMA destination address on rdma_resolve_addr
  RDMA/irdma: Fix data race on CQP request done
  RDMA/irdma: Fix data race on CQP completion stats
  RDMA/irdma: Add missing read barriers
  RDMA/mlx4: Make check for invalid flags stricter

12 months ago Merge tag 'tpmdd-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko...
Linus Torvalds [Fri, 28 Jul 2023 23:44:32 +0000 (16:44 -0700)]
Merge tag 'tpmdd-v6.5-rc4' of git://git./linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fixes from Jarkko Sakkinen:
 "I picked up three small scale updates that I think would improve the
  quality of the release"

* tag 'tpmdd-v6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm_tis: Explicitly check for error code
  tpm: Switch i2c drivers back to use .probe()
  security: keys: perform capable check only on privileged operations