Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:49 +0000 (11:38 +0200)]
Documentation/rv: Add deterministic automata monitor synthesis documentation
Add the da_monitor_synthesis.rst introduces some concepts behind the
Deterministic Automata (DA) monitor synthesis and interface.
Link: https://lkml.kernel.org/r/7873bdb7b2e5d2bc0b2eb6ca0b324af9a0ba27a0.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:48 +0000 (11:38 +0200)]
tools/rv: Add dot2k
transform .dot file into kernel rv monitor
usage: dot2k [-h] -d DOT_FILE -t MONITOR_TYPE [-n MODEL_NAME] [-D DESCRIPTION]
optional arguments:
-h, --help show this help message and exit
-d DOT_FILE, --dot DOT_FILE
-t MONITOR_TYPE, --monitor_type MONITOR_TYPE
-n MODEL_NAME, --model_name MODEL_NAME
-D DESCRIPTION, --description DESCRIPTION
Link: https://lkml.kernel.org/r/083b3ae61e5a62c1e2e5d08009baa91f82181618.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:47 +0000 (11:38 +0200)]
Documentation/rv: Add deterministic automaton documentation
Add documentation about deterministic automaton and its possible
representations (formal, graphic, .dot and C).
Link: https://lkml.kernel.org/r/387edaed87630bd5eb37c4275045dfd229700aa6.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:46 +0000 (11:38 +0200)]
tools/rv: Add dot2c
dot2c is a tool that transforms an automata in the graphiviz .dot file
into an C representation of the automata.
usage: dot2c [-h] dot_file
dot2c: converts a .dot file into a C structure
positional arguments:
dot_file The dot file to be converted
optional arguments:
-h, --help show this help message and exit
Link: https://lkml.kernel.org/r/b26204ba9509c80bcda31b76cdea31ddb188cd24.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:45 +0000 (11:38 +0200)]
Documentation/rv: Add a basic documentation
Add the runtime-verification.rst document, explaining the basics of RV
and how to use the interface.
Link: https://lkml.kernel.org/r/4be7d1a88ab1e2eb0767521e1ab52a149a154bc4.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:44 +0000 (11:38 +0200)]
rv/include: Add instrumentation helper functions
Instrumentation helper functions to facilitate the instrumentation of
auto-generated RV monitors create by dot2k.
Link: https://lkml.kernel.org/r/3b36c9435f9d9299beb84e5c7c46920e205bedec.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:43 +0000 (11:38 +0200)]
rv/include: Add deterministic automata monitor definition via C macros
In Linux terms, the runtime verification monitors are encapsulated
inside the "RV monitor" abstraction. The "RV monitor" includes a set
of instances of the monitor (per-cpu monitor, per-task monitor, and
so on), the helper functions that glue the monitor to the system
reference model, and the trace output as a reaction for event parsing
and exceptions, as depicted below:
Linux +----- RV Monitor ----------------------------------+ Formal
Realm | | Realm
+-------------------+ +----------------+ +-----------------+
| Linux kernel | | Monitor | | Reference |
| Tracing | -> | Instance(s) | <- | Model |
| (instrumentation) | | (verification) | | (specification) |
+-------------------+ +----------------+ +-----------------+
| | |
| V |
| +----------+ |
| | Reaction | |
| +--+--+--+-+ |
| | | | |
| | | +-> trace output ? |
+------------------------|--|----------------------+
| +----> panic ?
+-------> <user-specified>
Add the rv/da_monitor.h, enabling automatic code generation for the
*Monitor Instance(s)* using C macros, and code to support it.
The benefits of the usage of macro for monitor synthesis are 3-fold as it:
- Reduces the code duplication;
- Facilitates the bug fix/improvement;
- Avoids the case of developers changing the core of the monitor code
to manipulate the model in a (let's say) non-standard way.
This initial implementation presents three different types of monitor
instances:
- DECLARE_DA_MON_GLOBAL(name, type)
- DECLARE_DA_MON_PER_CPU(name, type)
- DECLARE_DA_MON_PER_TASK(name, type)
The first declares the functions for a global deterministic automata monitor,
the second for monitors with per-cpu instances, and the third with per-task
instances.
Link: https://lkml.kernel.org/r/51b0bf425a281e226dfeba7401d2115d6091f84e.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:42 +0000 (11:38 +0200)]
rv/include: Add helper functions for deterministic automata
Formally, a deterministic automaton, denoted by G, is defined as a
quintuple:
G = { X, E, f, x_0, X_m }
where:
- X is the set of states;
- E is the finite set of events;
- x_0 is the initial state;
- X_m (subset of X) is the set of marked states.
- f : X x E -> X $ is the transition function. It defines the
state transition in the occurrence of a event from E in
the state X. In the special case of deterministic automata,
the occurrence of the event in E in a state in X has a
deterministic next state from X.
An automaton can also be represented using a graphical format of
vertices (nodes) and edges. The open-source tool Graphviz can produce
this graphic format using the (textual) DOT language as the source code.
The dot2c tool presented in this paper:
De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo
Silva. Efficient formal verification for the Linux kernel. In:
International Conference on Software Engineering and Formal Methods.
Springer, Cham, 2019. p. 315-332.
Translates a deterministic automaton in the DOT format into a C
source code representation that to be used for monitoring.
This header file implements helper functions to facilitate the usage
of the C output from dot2c/k for monitoring.
Link: https://lkml.kernel.org/r/563234f2bfa84b540f60cf9e39c2d9f0eea95a55.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:41 +0000 (11:38 +0200)]
rv: Add runtime reactors interface
A runtime monitor can cause a reaction to the detection of an
exception on the model's execution. By default, the monitors have
tracing reactions, printing the monitor output via tracepoints.
But other reactions can be added (on-demand) via this interface.
The user interface resembles the kernel tracing interface and
presents these files:
"available_reactors"
- Reading shows the available reactors, one per line.
For example:
# cat available_reactors
nop
panic
printk
"reacting_on"
- It is an on/off general switch for reactors, disabling
all reactions.
"monitors/MONITOR/reactors"
- List available reactors, with the select reaction for the given
MONITOR inside []. The default one is the nop (no operation)
reactor.
- Writing the name of a reactor enables it to the given
MONITOR.
For example:
# cat monitors/wip/reactors
[nop]
panic
printk
# echo panic > monitors/wip/reactors
# cat monitors/wip/reactors
nop
[panic]
printk
Link: https://lkml.kernel.org/r/1794eb994637457bdeaa6bad0b8263d2f7eece0c.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Bristot de Oliveira [Fri, 29 Jul 2022 09:38:40 +0000 (11:38 +0200)]
rv: Add Runtime Verification (RV) interface
RV is a lightweight (yet rigorous) method that complements classical
exhaustive verification techniques (such as model checking and
theorem proving) with a more practical approach to complex systems.
RV works by analyzing the trace of the system's actual execution,
comparing it against a formal specification of the system behavior.
RV can give precise information on the runtime behavior of the
monitored system while enabling the reaction for unexpected
events, avoiding, for example, the propagation of a failure on
safety-critical systems.
The development of this interface roots in the development of the
paper:
De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo
Silva. Efficient formal verification for the Linux kernel. In:
International Conference on Software Engineering and Formal Methods.
Springer, Cham, 2019. p. 315-332.
And:
De Oliveira, Daniel Bristot. Automata-based formal analysis
and verification of the real-time Linux kernel. PhD Thesis, 2020.
The RV interface resembles the tracing/ interface on purpose. The current
path for the RV interface is /sys/kernel/tracing/rv/.
It presents these files:
"available_monitors"
- List the available monitors, one per line.
For example:
# cat available_monitors
wip
wwnr
"enabled_monitors"
- Lists the enabled monitors, one per line;
- Writing to it enables a given monitor;
- Writing a monitor name with a '!' prefix disables it;
- Truncating the file disables all enabled monitors.
For example:
# cat enabled_monitors
# echo wip > enabled_monitors
# echo wwnr >> enabled_monitors
# cat enabled_monitors
wip
wwnr
# echo '!wip' >> enabled_monitors
# cat enabled_monitors
wwnr
# echo > enabled_monitors
# cat enabled_monitors
#
Note that more than one monitor can be enabled concurrently.
"monitoring_on"
- It is an on/off general switcher for monitoring. Note
that it does not disable enabled monitors or detach events,
but stop the per-entity monitors of monitoring the events
received from the system. It resembles the "tracing_on" switcher.
"monitors/"
Each monitor will have its one directory inside "monitors/". There
the monitor specific files will be presented.
The "monitors/" directory resembles the "events" directory on
tracefs.
For example:
# cd monitors/wip/
# ls
desc enable
# cat desc
wakeup in preemptive per-cpu testing monitor.
# cat enable
0
For further information, see the comments in the header of
kernel/trace/rv/rv.c from this patch.
Link: https://lkml.kernel.org/r/a4bfe038f50cb047bfb343ad0e12b0e646ab308b.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 26 Jul 2022 14:18:51 +0000 (10:18 -0400)]
ftrace/x86: Add back ftrace_expected assignment
When a ftrace_bug happens (where ftrace fails to modify a location) it is
helpful to have what was at that location as well as what was expected to
be there.
But with the conversion to text_poke() the variable that assigns the
expected for debugging was dropped. Unfortunately, I noticed this when I
needed it. Add it back.
Link: https://lkml.kernel.org/r/20220726101851.069d2e70@gandalf.local.home
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Fixes:
768ae4406a5c ("x86/ftrace: Use text_poke()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 19 Jul 2022 22:20:04 +0000 (18:20 -0400)]
tracing: Use a copy of the va_list for __assign_vstr()
If an instance of tracing enables the same trace event as another
instance, or the top level instance, or even perf, then the va_list passed
into some tracepoints can be used more than once.
As va_list can only be traversed once, this can cause issues:
# cat /sys/kernel/tracing/instances/qla2xxx/trace
cat-56106 [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered (null).
cat-56106 [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered ×+<96>²Ü<98>^H.
cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0xde589000.
# cat /sys/kernel/tracing/trace
cat-56106 [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered qla2x00_get_firmware_state.
cat-56106 [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered qla2x00_mailbox_command.
cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0x69.
The instance version is corrupted because the top level instance iterated
the va_list first.
Use va_copy() in the __assign_vstr() macro to make sure that each trace
event for each use case gets a fresh va_list.
Link: https://lore.kernel.org/all/259d53a5-958e-6508-4e45-74dba2821242@marvell.com/
Link: https://lkml.kernel.org/r/20220719182004.21daa83e@gandalf.local.home
Fixes:
0563231f93c6d ("tracing/events: Add __vstring() and __assign_vstr() helper macros")
Reported-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Sun, 24 Jul 2022 23:16:50 +0000 (19:16 -0400)]
batman-adv: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220724191650.236b1355@rorschach.local.home
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <a@unstable.cc>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: netdev@vger.kernel.org
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 19 Jul 2022 15:27:19 +0000 (11:27 -0400)]
USB: mtu3: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220719112719.17e796c6@gandalf.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-usb@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mediatek@lists.infradead.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Masami Hiramatsu (Google) [Mon, 18 Jul 2022 07:05:10 +0000 (16:05 +0900)]
selftests/kprobe: Update test for no event name syntax error
The commit
208003254c32 ("selftests/kprobe: Do not test for GRP/
without event failures") removed a syntax which is no more cause
a syntax error (NO_EVENT_NAME error with GRP/).
However, there are another case (NO_EVENT_NAME error without GRP/)
which causes a same error. This adds a test for that case.
Link: https://lkml.kernel.org/r/165812790993.1377963.9762767354560397298.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Fri, 15 Jul 2022 21:55:55 +0000 (17:55 -0400)]
tracing: Add example and documentation for new __vstring() macro
Update the sample trace events to include an example that uses the new
__vstring() helpers for TRACE_EVENTS.
Link: https://lkml.kernel.org/r/20220715175555.16375a3b@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 12 Jul 2022 20:17:07 +0000 (16:17 -0400)]
selftests/kprobe: Do not test for GRP/ without event failures
A new feature is added where kprobes (and other probes) do not need to
explicitly state the event name when creating a probe. The event name will
come from what is being attached.
That is:
# echo 'p:foo/ vfs_read' > kprobe_events
Will no longer error, but instead create an event:
# cat kprobe_events
p:foo/p_vfs_read_0 vfs_read
This should not be tested as an error case anymore. Remove it from the
selftest as now this feature "breaks" the selftest as it no longer fails
as expected.
Link: https://lore.kernel.org/all/1656296348-16111-1-git-send-email-quic_linyyuan@quicinc.com/
Link: https://lkml.kernel.org/r/20220712161707.6dc08a14@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linyu Yuan [Mon, 27 Jun 2022 02:19:08 +0000 (10:19 +0800)]
selftests/ftrace: Add test case for GRP/ only input
Add kprobe and eprobe event test for new GRP/ only format.
Link: https://lore.kernel.org/all/1656296348-16111-5-git-send-email-quic_linyyuan@quicinc.com/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linyu Yuan [Mon, 27 Jun 2022 02:19:07 +0000 (10:19 +0800)]
tracing: Auto generate event name when creating a group of events
Currently when creating a specific group of trace events,
take kprobe event as example, the user must use the following format:
p:GRP/EVENT [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS],
which means user must enter EVENT name, one example is:
echo 'p:usb_gadget/config_usb_cfg_link config_usb_cfg_link $arg1' >> kprobe_events
It is not simple if there are too many entries because the event name is
the same as symbol name.
This change allows user to specify no EVENT name, format changed as:
p:GRP/ [MOD:]KSYM[+OFFS]|KADDR [FETCHARGS]
It will generate event name automatically and one example is:
echo 'p:usb_gadget/ config_usb_cfg_link $arg1' >> kprobe_events.
Link: https://lore.kernel.org/all/1656296348-16111-4-git-send-email-quic_linyyuan@quicinc.com/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linyu Yuan [Mon, 27 Jun 2022 02:19:06 +0000 (10:19 +0800)]
tracing: eprobe: Remove duplicate is_good_name() operation
traceprobe_parse_event_name() already validate SYSTEM and EVENT name,
there is no need to call is_good_name() after it.
Link: https://lore.kernel.org/all/1656296348-16111-3-git-send-email-quic_linyyuan@quicinc.com/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linyu Yuan [Mon, 27 Jun 2022 02:19:05 +0000 (10:19 +0800)]
tracing: eprobe: Add missing log index
Add trace_probe_log_set_index(1) to allow report correct error
if user input wrong SYSTEM.EVENT format.
Link: https://lore.kernel.org/all/1656296348-16111-2-git-send-email-quic_linyyuan@quicinc.com/
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:45:06 +0000 (18:45 -0400)]
mac80211: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224751.271015450@goodmis.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:45:04 +0000 (18:45 -0400)]
scsi: qla2xxx: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224750.896553364@goodmis.org
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:45:03 +0000 (18:45 -0400)]
scsi: iscsi: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224750.715763972@goodmis.org
Cc: Fred Herard <fred.herard@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:45:02 +0000 (18:45 -0400)]
usb: musb: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224750.532345354@goodmis.org
Cc: Bin Liu <b-liu@ti.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:45:00 +0000 (18:45 -0400)]
xhci: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224750.172301548@goodmis.org
Cc: Mathias Nyman <mathias.nyman@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:44:59 +0000 (18:44 -0400)]
usb: chipidea: tracing: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224749.991587733@goodmis.org
Cc: Peter Chen <peter.chen@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:44:58 +0000 (18:44 -0400)]
tracing/iwlwifi: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224749.806599472@goodmis.org
Cc: Gregory Greenman <gregory.greenman@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:44:57 +0000 (18:44 -0400)]
tracing/brcm: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224749.622796175@goodmis.org
Cc: Arend van Spriel <aspriel@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Franky Lin <franky.lin@broadcom.com>
Cc: Hante Meuleman <hante.meuleman@broadcom.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-wireless@vger.kernel.org
Cc: brcm80211-dev-list.pdl@broadcom.com
Cc: SHA-cyfmac-dev-list@infineon.com
Cc: netdev@vger.kernel.org
Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:44:56 +0000 (18:44 -0400)]
tracing/ath: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224749.430339634@goodmis.org
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: ath10k@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: ath11k@lists.infradead.org
Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:44:55 +0000 (18:44 -0400)]
tracing/IB/hfi1: Use the new __vstring() helper
Instead of open coding a __dynamic_array() with a fixed length (which
defeats the purpose of the dynamic array in the first place). Use the new
__vstring() helper that will use a va_list and only write enough of the
string into the ring buffer that is needed.
Link: https://lkml.kernel.org/r/20220705224749.239494531@goodmis.org
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:44:54 +0000 (18:44 -0400)]
tracing/events: Add __vstring() and __assign_vstr() helper macros
There's several places that open code the following logic:
TP_STRUCT__entry(__dynamic_array(char, msg, MSG_MAX)),
TP_fast_assign(vsnprintf(__get_str(msg), MSG_MAX, vaf->fmt, *vaf->va);)
To load a string created by variable array va_list.
The main issue with this approach is that "MSG_MAX" usage in the
__dynamic_array() portion. That actually just reserves the MSG_MAX in the
event, and even wastes space because there's dynamic meta data also saved
in the event to denote the offset and size of the dynamic array. It would
have been better to just use a static __array() field.
Instead, create __vstring() and __assign_vstr() that work like __string
and __assign_str() but instead of taking a destination string to copy,
take a format string and a va_list pointer and fill in the values.
It uses the helper:
#define __trace_event_vstr_len(fmt, va) \
({ \
va_list __ap; \
int __ret; \
\
va_copy(__ap, *(va)); \
__ret = vsnprintf(NULL, 0, fmt, __ap) + 1; \
va_end(__ap); \
\
min(__ret, TRACE_EVENT_STR_MAX); \
})
To figure out the length to store the string. It may be slightly slower as
it needs to run the vsnprintf() twice, but it now saves space on the ring
buffer.
Link: https://lkml.kernel.org/r/20220705224749.053570613@goodmis.org
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Arend van Spriel <aspriel@gmail.com>
Cc: Franky Lin <franky.lin@broadcom.com>
Cc: Hante Meuleman <hante.meuleman@broadcom.com>
Cc: Gregory Greenman <gregory.greenman@intel.com>
Cc: Peter Chen <peter.chen@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mathias Nyman <mathias.nyman@intel.com>
Cc: Chunfeng Yun <chunfeng.yun@mediatek.com>
Cc: Bin Liu <b-liu@ti.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <a@unstable.cc>
Cc: Sven Eckelmann <sven@narfation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 5 Jul 2022 22:37:41 +0000 (18:37 -0400)]
neighbor: tracing: Have neigh_create event use __string()
The dev field of the neigh_create event uses __dynamic_array() with a
fixed size, which defeats the purpose of __dynamic_array(). Looking at the
logic, as it already uses __assign_str(), just use the same logic in
__string to create the size needed. It appears that because "dev" can be
NULL, it needs the check. But __string() can have the same checks as
__assign_str() so use them there too.
Link: https://lkml.kernel.org/r/20220705183741.35387e3f@rorschach.local.home
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Mon, 4 Jul 2022 13:14:36 +0000 (09:14 -0400)]
tracing/ipv4/ipv6: Use static array for name field in fib*_lookup_table event
The fib_lookup_table and fib6_lookup_table events declare name as a
dynamic_array, but also give it a fixed size, which defeats the purpose of
the dynamic array, especially since the dynamic array also includes meta
data in the event to specify its size.
Since the size of the name is at most 16 bytes (defined by IFNAMSIZ),
it is not worth spending the effort to determine the size of the string.
Just use a fixed size array and copy into it. This will save 4 bytes that
are used for the meta data that saves the size and position of a dynamic
array, and even slightly speed up the event processing.
Link: https://lkml.kernel.org/r/20220704091436.3705edbf@rorschach.local.home
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Tue, 12 Jul 2022 22:58:20 +0000 (18:58 -0400)]
tracing: devlink: Use static array for string in devlink_trap_report event
The trace event devlink_trap_report uses the __dynamic_array() macro to
determine the size of the input_dev_name field. This is because it needs
to test the dev field for NULL, and will use "NULL" if it is. But it also
has the size of the dynamic array as a fixed IFNAMSIZ bytes. This defeats
the purpose of the dynamic array, as this will reserve that amount of
bytes on the ring buffer, and to make matters worse, it will even save
that size in the event as the event expects it to be dynamic (for which it
is not).
Since IFNAMSIZ is just 16 bytes, just make it a static array and this will
remove the meta data from the event that records the size.
Link: https://lkml.kernel.org/r/20220712185820.002d9fb5@gandalf.local.home
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Jiri Pirko <jiri@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Zheng Yejian [Thu, 30 Jun 2022 01:31:52 +0000 (09:31 +0800)]
tracing/histograms: Simplify create_hist_fields()
When I look into implements of create_hist_fields(), I think there can be
following two simplifications:
1. If something wrong happened in parse_var_defs(), free_var_defs() would
have been called in it, so no need goto free again after calling it;
2. After calling create_key_fields(), regardless of the value of 'ret', it
then always runs into 'out: ', so the judge of 'ret' is redundant.
Link: https://lkml.kernel.org/r/20220630013152.164871-1-zhengyejian1@huawei.com
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Xiang wangx [Mon, 6 Jun 2022 02:30:07 +0000 (10:30 +0800)]
tracing/user_events: Fix syntax errors in comments
Delete the redundant word 'have'.
Link: https://lkml.kernel.org/r/20220606023007.23377-1-wangxiang@cdjrlc.com
Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tiezhu Yang [Wed, 8 Jun 2022 01:23:22 +0000 (09:23 +0800)]
samples: Use KSYM_NAME_LEN for kprobes
It is better and enough to use KSYM_NAME_LEN for kprobes
in samples, no need to define and use the other values.
Link: https://lkml.kernel.org/r/1654651402-21552-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
sunliming [Mon, 6 Jun 2022 07:56:59 +0000 (15:56 +0800)]
fprobe/samples: Make sample_probe static
This symbol is not used outside of fprobe_example.c, so marks it static.
Fixes the following warning:
sparse warnings: (new ones prefixed by >>)
>> samples/fprobe/fprobe_example.c:23:15: sparse: sparse: symbol 'sample_probe'
was not declared. Should it be static?
Link: https://lkml.kernel.org/r/20220606075659.674556-1-sunliming@kylinos.cn
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Li kunyu [Wed, 29 Jun 2022 03:00:13 +0000 (11:00 +0800)]
blk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra semicolon
Remove extra semicolon.
Link: https://lkml.kernel.org/r/20220629030013.10362-1-kunyu@nfschina.com
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt (Google) [Wed, 6 Jul 2022 20:12:31 +0000 (16:12 -0400)]
ftrace: Be more specific about arch impact when function tracer is enabled
It was brought up that on ARMv7, that because the FUNCTION_TRACER does not
use nops to keep function tracing disabled because of the use of a link
register, it does have some performance impact.
The start of functions when -pg is used to compile the kernel is:
push {lr}
bl
8010e7c0 <__gnu_mcount_nc>
When function tracing is tuned off, it becomes:
push {lr}
add sp, sp, #4
Which just puts the stack back to its normal location. But these two
instructions at the start of every function does incur some overhead.
Be more honest in the Kconfig FUNCTION_TRACER description and specify that
the overhead being in the noise was x86 specific, but other architectures
may vary.
Link: https://lore.kernel.org/all/20220705105416.GE5208@pengutronix.de/
Link: https://lkml.kernel.org/r/20220706161231.085a83da@gandalf.local.home
Reported-by: Sascha Hauer <sha@pengutronix.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Douglas Anderson [Sat, 9 Jul 2022 00:09:52 +0000 (17:09 -0700)]
tracing: Fix sleeping while atomic in kdb ftdump
If you drop into kdb and type "ftdump" you'll get a sleeping while
atomic warning from memory allocation in trace_find_next_entry().
This appears to have been caused by commit
ff895103a84a ("tracing:
Save off entry when peeking at next entry"), which added the
allocation in that path. The problematic commit was already fixed by
commit
8e99cf91b99b ("tracing: Do not allocate buffer in
trace_find_next_entry() in atomic") but that fix missed the kdb case.
The fix here is easy: just move the assignment of the static buffer to
the place where it should have been to begin with:
trace_init_global_iter(). That function is called in two places, once
is right before the assignment of the static buffer added by the
previous fix and once is in kdb.
Note that it appears that there's a second static buffer that we need
to assign that was added in commit
efbbdaa22bb7 ("tracing: Show real
address for trace event arguments"), so we'll move that too.
Link: https://lkml.kernel.org/r/20220708170919.1.I75844e5038d9425add2ad853a608cb44bb39df40@changeid
Fixes:
ff895103a84a ("tracing: Save off entry when peeking at next entry")
Fixes:
efbbdaa22bb7 ("tracing: Show real address for trace event arguments")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Zheng Yejian [Mon, 11 Jul 2022 01:47:31 +0000 (09:47 +0800)]
tracing/histograms: Fix memory leak problem
This reverts commit
46bbe5c671e06f070428b9be142cc4ee5cedebac.
As commit
46bbe5c671e0 ("tracing: fix double free") said, the
"double free" problem reported by clang static analyzer is:
> In parse_var_defs() if there is a problem allocating
> var_defs.expr, the earlier var_defs.name is freed.
> This free is duplicated by free_var_defs() which frees
> the rest of the list.
However, if there is a problem allocating N-th var_defs.expr:
+ in parse_var_defs(), the freed 'earlier var_defs.name' is
actually the N-th var_defs.name;
+ then in free_var_defs(), the names from 0th to (N-1)-th are freed;
IF ALLOCATING PROBLEM HAPPENED HERE!!! -+
\
|
0th 1th (N-1)-th N-th V
+-------------+-------------+-----+-------------+-----------
var_defs: | name | expr | name | expr | ... | name | expr | name | ///
+-------------+-------------+-----+-------------+-----------
These two frees don't act on same name, so there was no "double free"
problem before. Conversely, after that commit, we get a "memory leak"
problem because the above "N-th var_defs.name" is not freed.
If enable CONFIG_DEBUG_KMEMLEAK and inject a fault at where the N-th
var_defs.expr allocated, then execute on shell like:
$ echo 'hist:key=call_site:val=$v1,$v2:v1=bytes_req,v2=bytes_alloc' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
Then kmemleak reports:
unreferenced object 0xffff8fb100ef3518 (size 8):
comm "bash", pid 196, jiffies
4295681690 (age 28.538s)
hex dump (first 8 bytes):
76 31 00 00 b1 8f ff ff v1......
backtrace:
[<
0000000038fe4895>] kstrdup+0x2d/0x60
[<
00000000c99c049a>] event_hist_trigger_parse+0x206f/0x20e0
[<
00000000ae70d2cc>] trigger_process_regex+0xc0/0x110
[<
0000000066737a4c>] event_trigger_write+0x75/0xd0
[<
000000007341e40c>] vfs_write+0xbb/0x2a0
[<
0000000087fde4c2>] ksys_write+0x59/0xd0
[<
00000000581e9cdf>] do_syscall_64+0x3a/0x80
[<
00000000cf3b065c>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lkml.kernel.org/r/20220711014731.69520-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes:
46bbe5c671e0 ("tracing: fix double free")
Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sun, 3 Jul 2022 22:39:28 +0000 (15:39 -0700)]
Linux 5.19-rc5
Linus Torvalds [Sun, 3 Jul 2022 21:40:28 +0000 (14:40 -0700)]
lockref: remove unused 'lockref_get_or_lock()' function
Looking at the conditional lock acquire functions in the kernel due to
the new sparse support (see commit
4a557a5d1a61 "sparse: introduce
conditional lock acquire function attribute"), it became obvious that
the lockref code has a couple of them, but they don't match the usual
naming convention for the other ones, and their return value logic is
also reversed.
In the other very similar places, the naming pattern is '*_and_lock()'
(eg 'atomic_put_and_lock()' and 'refcount_dec_and_lock()'), and the
function returns true when the lock is taken.
The lockref code is superficially very similar to the refcount code,
only with the special "atomic wrt the embedded lock" semantics. But
instead of the '*_and_lock()' naming it uses '*_or_lock()'.
And instead of returning true in case it took the lock, it returns true
if it *didn't* take the lock.
Now, arguably the reflock code is quite logical: it really is a "either
decrement _or_ lock" kind of situation - and the return value is about
whether the operation succeeded without any special care needed.
So despite the similarities, the differences do make some sense, and
maybe it's not worth trying to unify the different conditional locking
primitives in this area.
But while looking at this all, it did become obvious that the
'lockref_get_or_lock()' function hasn't actually had any users for
almost a decade.
The only user it ever had was the shortlived 'd_rcu_to_refcount()'
function, and it got removed and replaced with 'lockref_get_not_dead()'
back in 2013 in commits
0d98439ea3c6 ("vfs: use lockred 'dead' flag to
mark unrecoverably dead dentries") and
e5c832d55588 ("vfs: fix dentry
RCU to refcounting possibly sleeping dput()")
In fact, that single use was removed less than a week after the whole
function was introduced in commit
b3abd80250c1 ("lockref: add
'lockref_get_or_lock() helper") so this function has been around for a
decade, but only had a user for six days.
Let's just put this mis-designed and unused function out of its misery.
We can think about the naming and semantic oddities of the remaining
'lockref_put_or_lock()' later, but at least that function has users.
And while the naming is different and the return value doesn't match,
that function matches the whole '{atomic,refcount}_dec_and_test()'
pattern much better (ie the magic happens when the count goes down to
zero, not when it is incremented from zero).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 30 Jun 2022 16:34:10 +0000 (09:34 -0700)]
sparse: introduce conditional lock acquire function attribute
The kernel tends to try to avoid conditional locking semantics because
it makes it harder to think about and statically check locking rules,
but we do have a few fundamental locking primitives that take locks
conditionally - most obviously the 'trylock' functions.
That has always been a problem for 'sparse' checking for locking
imbalance, and we've had a special '__cond_lock()' macro that we've used
to let sparse know how the locking works:
# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
so that you can then use this to tell sparse that (for example) the
spinlock trylock macro ends up acquiring the lock when it succeeds, but
not when it fails:
#define raw_spin_trylock(lock) __cond_lock(lock, _raw_spin_trylock(lock))
and then sparse can follow along the locking rules when you have code like
if (!spin_trylock(&dentry->d_lock))
return LRU_SKIP;
.. sparse sees that the lock is held here..
spin_unlock(&dentry->d_lock);
and sparse ends up happy about the lock contexts.
However, this '__cond_lock()' use does result in very ugly header files,
and requires you to basically wrap the real function with that macro
that uses '__cond_lock'. Which has made PeterZ NAK things that try to
fix sparse warnings over the years [1].
To solve this, there is now a very experimental patch to sparse that
basically does the exact same thing as '__cond_lock()' did, but using a
function attribute instead. That seems to make PeterZ happy [2].
Note that this does not replace existing use of '__cond_lock()', but
only exposes the new proposed attribute and uses it for the previously
unannotated 'refcount_dec_and_lock()' family of functions.
For existing sparse installations, this will make no difference (a
negative output context was ignored), but if you have the experimental
sparse patch it will make sparse now understand code that uses those
functions, the same way '__cond_lock()' makes sparse understand the very
similar 'atomic_dec_and_lock()' uses that have the old '__cond_lock()'
annotations.
Note that in some cases this will silence existing context imbalance
warnings. But in other cases it may end up exposing new sparse warnings
for code that sparse just didn't see the locking for at all before.
This is a trial, in other words. I'd expect that if it ends up being
successful, and new sparse releases end up having this new attribute,
we'll migrate the old-style '__cond_lock()' users to use the new-style
'__cond_acquires' function attribute.
The actual experimental sparse patch was posted in [3].
Link: https://lore.kernel.org/all/20130930134434.GC12926@twins.programming.kicks-ass.net/
Link: https://lore.kernel.org/all/Yr60tWxN4P568x3W@worktop.programming.kicks-ass.net/
Link: https://lore.kernel.org/all/CAHk-=wjZfO9hGqJ2_hGQG3U_XzSh9_XaXze=HgPdvJbgrvASfA@mail.gmail.com/
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Aring <aahringo@redhat.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 3 Jul 2022 16:42:17 +0000 (09:42 -0700)]
Merge tag 'xfs-5.19-fixes-4' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"This fixes some stalling problems and corrects the last of the
problems (I hope) observed during testing of the new atomic xattr
update feature.
- Fix statfs blocking on background inode gc workers
- Fix some broken inode lock assertion code
- Fix xattr leaf buffer leaks when cancelling a deferred xattr update
operation
- Clean up xattr recovery to make it easier to understand.
- Fix xattr leaf block verifiers tripping over empty blocks.
- Remove complicated and error prone xattr leaf block bholding mess.
- Fix a bug where an rt extent crossing EOF was treated as "posteof"
blocks and cleaned unnecessarily.
- Fix a UAF when log shutdown races with unmount"
* tag 'xfs-5.19-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: prevent a UAF when log IO errors race with unmount
xfs: dont treat rt extents beyond EOF as eofblocks to be cleared
xfs: don't hold xattr leaf buffers across transaction rolls
xfs: empty xattr leaf header blocks are not corruption
xfs: clean up the end of xfs_attri_item_recover
xfs: always free xattri_leaf_bp when cancelling a deferred op
xfs: use invalidate_lock to check the state of mmap_lock
xfs: factor out the common lock flags assert
xfs: introduce xfs_inodegc_push()
xfs: bound maximum wait time for inodegc work
Linus Torvalds [Sat, 2 Jul 2022 18:20:56 +0000 (11:20 -0700)]
Merge tag 'nfsd-5.19-2' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Notable regression fixes:
- Fix NFSD crash during NFSv4.2 READ_PLUS operation
- Fix incorrect status code returned by COMMIT operation"
* tag 'nfsd-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Fix READ_PLUS crasher
NFSD: restore EINVAL error translation in nfsd_commit()
Linus Torvalds [Sat, 2 Jul 2022 17:23:36 +0000 (10:23 -0700)]
Merge tag 'for-5.19/parisc-4' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
"Two important fixes for bugs in code which was added in 5.18:
- Fix userspace signal failures on 32-bit kernel due to a bug in vDSO
- Fix 32-bit load-word unalignment exception handler which returned
wrong values"
* tag 'for-5.19/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix vDSO signal breakage on 32-bit kernel
parisc/unaligned: Fix emulate_ldw() breakage
Helge Deller [Fri, 1 Jul 2022 07:00:41 +0000 (09:00 +0200)]
parisc: Fix vDSO signal breakage on 32-bit kernel
Addition of vDSO support for parisc in kernel v5.18 suddenly broke glibc
signal testcases on a 32-bit kernel.
The trampoline code (sigtramp.S) which is mapped into userspace includes
an offset to the context data on the stack, which is used by gdb and
glibc to get access to registers.
In a 32-bit kernel we used by mistake the offset into the compat context
(which is valid on a 64-bit kernel only) instead of the offset into the
"native" 32-bit context.
Reported-by: John David Anglin <dave.anglin@bell.net>
Tested-by: John David Anglin <dave.anglin@bell.net>
Fixes:
df24e1783e6e ("parisc: Add vDSO support")
CC: stable@vger.kernel.org # 5.18
Signed-off-by: Helge Deller <deller@gmx.de>
Linus Torvalds [Sat, 2 Jul 2022 16:28:36 +0000 (09:28 -0700)]
Merge tag 'perf-tools-fixes-for-v5.19-2022-07-02' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- BPF program info linear (BPIL) data is accessed assuming 64-bit
alignment resulting in undefined behavior as the data is just byte
aligned. Fix it, Found using -fsanitize=undefined.
- Fix 'perf offcpu' build on old kernels wrt task_struct's
state/__state field.
- Fix perf_event_attr.sample_type setting on the 'offcpu-time' event
synthesized by the 'perf offcpu' tool.
- Don't bail out when synthesizing PERF_RECORD_ events for pre-existing
threads when one goes away while parsing its procfs entries.
- Don't sort the task scan result from /proc, its not needed and
introduces bugs when the main thread isn't the first one to be
processed.
- Fix uninitialized 'offset' variable on aarch64 in the unwind code.
- Sync KVM headers with the kernel sources.
* tag 'perf-tools-fixes-for-v5.19-2022-07-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf synthetic-events: Ignore dead threads during event synthesis
perf synthetic-events: Don't sort the task scan result from /proc
perf unwind: Fix unitialized 'offset' variable on aarch64
tools headers UAPI: Sync linux/kvm.h with the kernel sources
perf bpf: 8 byte align bpil data
tools kvm headers arm64: Update KVM headers from the kernel sources
perf offcpu: Accept allowed sample types only
perf offcpu: Fix build failure on old kernels
Linus Torvalds [Sat, 2 Jul 2022 16:11:44 +0000 (09:11 -0700)]
Merge tag 'powerpc-5.19-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix BPF uapi confusion about the correct type of bpf_user_pt_regs_t.
- Fix virt_addr_valid() when memory is hotplugged above the boot-time
high_memory value.
- Fix a bug in 64-bit Book3E map_kernel_page() which would incorrectly
allocate a PMD page at PUD level.
- Fix a couple of minor issues found since we enabled KASAN for 64-bit
Book3S.
Thanks to Aneesh Kumar K.V, Cédric Le Goater, Christophe Leroy, Kefeng
Wang, Liam Howlett, Nathan Lynch, and Naveen N. Rao.
* tag 'powerpc-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/memhotplug: Add add_pages override for PPC
powerpc/bpf: Fix use of user_pt_regs in uapi
powerpc/prom_init: Fix kernel config grep
powerpc/book3e: Fix PUD allocation size in map_kernel_page()
powerpc/xive/spapr: correct bitmap allocation size
Namhyung Kim [Fri, 1 Jul 2022 20:54:58 +0000 (13:54 -0700)]
perf synthetic-events: Ignore dead threads during event synthesis
When it synthesize various task events, it scans the list of task
first and then accesses later. There's a window threads can die
between the two and proc entries may not be available.
Instead of bailing out, we can ignore that thread and move on.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220701205458.985106-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 1 Jul 2022 20:54:57 +0000 (13:54 -0700)]
perf synthetic-events: Don't sort the task scan result from /proc
It should not sort the result as procfs already returns a proper
ordering of tasks. Actually sorting the order caused problems that it
doesn't guararantee to process the main thread first.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220701205458.985106-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ivan Babrou [Fri, 1 Jul 2022 18:20:46 +0000 (11:20 -0700)]
perf unwind: Fix unitialized 'offset' variable on aarch64
Commit
dc2cf4ca866f5715 ("perf unwind: Fix segbase for ld.lld linked
objects") uncovered the following issue on aarch64:
util/unwind-libunwind-local.c: In function 'find_proc_info':
util/unwind-libunwind-local.c:386:28: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
386 | if (ofs > 0) {
| ^
util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
199 | u64 address, offset;
| ^~~~~~
util/unwind-libunwind-local.c:371:20: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
371 | if (ofs <= 0) {
| ^
util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
199 | u64 address, offset;
| ^~~~~~
util/unwind-libunwind-local.c:363:20: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
363 | if (ofs <= 0) {
| ^
util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
199 | u64 address, offset;
| ^~~~~~
In file included from util/libunwind/arm64.c:37:
Fixes:
dc2cf4ca866f5715 ("perf unwind: Fix segbase for ld.lld linked objects")
Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-team@cloudflare.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220701182046.12589-1-ivan@cloudflare.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Fri, 1 Jul 2022 23:58:19 +0000 (16:58 -0700)]
Merge tag 'libnvdimm-fixes-5.19-rc5' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Vishal Verma:
- Fix a bug in the libnvdimm 'BTT' (Block Translation Table) driver
where accounting for poison blocks to be cleared was off by one,
causing a failure to clear the the last badblock in an nvdimm region.
* tag 'libnvdimm-fixes-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm: Fix badblocks clear off-by-one error
Linus Torvalds [Fri, 1 Jul 2022 20:00:47 +0000 (13:00 -0700)]
Merge tag 'thermal-5.19-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Add a new CPU ID to the list of supported processors in the
intel_tcc_cooling driver (Sumeet Pawnikar)"
* tag 'thermal-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLake
Linus Torvalds [Fri, 1 Jul 2022 19:55:28 +0000 (12:55 -0700)]
Merge tag 'pm-5.19-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix some issues in cpufreq drivers and some issues in devfreq:
- Fix error code path issues related PROBE_DEFER handling in devfreq
(Christian Marangi)
- Revert an editing accident in SPDX-License line in the devfreq
passive governor (Lukas Bulwahn)
- Fix refcount leak in of_get_devfreq_events() in the exynos-ppmu
devfreq driver (Miaoqian Lin)
- Use HZ_PER_KHZ macro in the passive devfreq governor (Yicong Yang)
- Fix missing of_node_put for qoriq and pmac32 driver (Liang He)
- Fix issues around throttle interrupt for qcom driver (Stephen Boyd)
- Add MT8186 to cpufreq-dt-platdev blocklist (AngeloGioacchino Del
Regno)
- Make amd-pstate enable CPPC on resume from S3 (Jinzhou Su)"
* tag 'pm-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / devfreq: passive: revert an editing accident in SPDX-License line
PM / devfreq: Fix kernel warning with cpufreq passive register fail
PM / devfreq: Rework freq_table to be local to devfreq struct
PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events
PM / devfreq: passive: Use HZ_PER_KHZ macro in units.h
PM / devfreq: Fix cpufreq passive unregister erroring on PROBE_DEFER
PM / devfreq: Mute warning on governor PROBE_DEFER
PM / devfreq: Fix kernel panic with cpu based scaling to passive gov
cpufreq: Add MT8186 to cpufreq-dt-platdev blocklist
cpufreq: pmac32-cpufreq: Fix refcount leak bug
cpufreq: qcom-hw: Don't do lmh things without a throttle interrupt
drivers: cpufreq: Add missing of_node_put() in qoriq-cpufreq.c
cpufreq: amd-pstate: Add resume and suspend callbacks
Rafael J. Wysocki [Fri, 1 Jul 2022 19:43:08 +0000 (21:43 +0200)]
Merge branch 'pm-cpufreq'
Merge cpufreq fixes for 5.19-rc5, including ARM cpufreq fixes and the
following one:
- Make amd-pstate enable CPPC on resume from S3 (Jinzhou Su).
* pm-cpufreq:
cpufreq: Add MT8186 to cpufreq-dt-platdev blocklist
cpufreq: pmac32-cpufreq: Fix refcount leak bug
cpufreq: qcom-hw: Don't do lmh things without a throttle interrupt
drivers: cpufreq: Add missing of_node_put() in qoriq-cpufreq.c
cpufreq: amd-pstate: Add resume and suspend callbacks
Linus Torvalds [Fri, 1 Jul 2022 19:05:27 +0000 (12:05 -0700)]
Merge tag 'hwmon-for-v5.19-rc5' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix error handling in ibmaem driver initialization
- Fix bad data reported by occ driver after setting power cap
- Fix typos in pmbus/ucd9200 driver comments
* tag 'hwmon-for-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ibmaem) don't call platform_device_del() if platform_device_add() fails
hwmon: (pmbus/ucd9200) fix typos in comments
hwmon: (occ) Prevent power cap command overwriting poll response
Yang Yingliang [Fri, 1 Jul 2022 07:41:53 +0000 (15:41 +0800)]
hwmon: (ibmaem) don't call platform_device_del() if platform_device_add() fails
If platform_device_add() fails, it no need to call platform_device_del(), split
platform_device_unregister() into platform_device_del/put(), so platform_device_put()
can be called separately.
Fixes:
8808a793f052 ("ibmaem: new driver for power/energy/temp meters in IBM System X hardware")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220701074153.4021556-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Fri, 1 Jul 2022 18:23:21 +0000 (11:23 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Restore TLB invalidation for the 'break-before-make' rule on
contiguous ptes (missed in a recent clean-up)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlb: Restore TLB invalidation for BBM on contiguous ptes
Linus Torvalds [Fri, 1 Jul 2022 18:19:14 +0000 (11:19 -0700)]
Merge tag 's390-5.19-5' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Alexander Gordeev:
- Fix purgatory build process so bin2c tool does not get built
unnecessarily and the Makefile is more consistent with other
architectures.
- Return earlier simple design of arch_get_random_seed_long|int() and
arch_get_random_long|int() callbacks as result of changes in generic
RNG code.
- Fix minor comment typos and spelling mistakes.
* tag 's390-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/qdio: Fix spelling mistake
s390/sclp: Fix typo in comments
s390/archrandom: simplify back to earlier design and initialize earlier
s390/purgatory: remove duplicated build rule of kexec-purgatory.o
s390/purgatory: hard-code obj-y in Makefile
s390: remove unneeded 'select BUILD_BIN2C'
Linus Torvalds [Fri, 1 Jul 2022 18:11:32 +0000 (11:11 -0700)]
Merge tag 'nfs-for-5.19-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
- Allocate a fattr for _nfs4_discover_trunking()
- Fix module reference count leak in nfs4_run_state_manager()
* tag 'nfs-for-5.19-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFSv4: Add an fattr allocation to _nfs4_discover_trunking()
NFS: restore module put when manager exits.
Linus Torvalds [Fri, 1 Jul 2022 18:06:21 +0000 (11:06 -0700)]
Merge tag 'ceph-for-5.19-rc5' of https://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"A ceph filesystem fix, marked for stable.
There appears to be a deeper issue on the MDS side, but for now we are
going with this one-liner to avoid busy looping and potential soft
lockups"
* tag 'ceph-for-5.19-rc5' of https://github.com/ceph/ceph-client:
ceph: wait on async create before checking caps for syncfs
Linus Torvalds [Fri, 1 Jul 2022 17:58:39 +0000 (10:58 -0700)]
Merge tag 'for-5.19/dm-fixes-5' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"Three fixes for invalid memory accesses discovered by using KASAN
while running the lvm2 testsuite's dm-raid tests. Includes changes to
MD's raid5.c given the dependency dm-raid has on the MD code"
* tag 'for-5.19/dm-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm raid: fix KASAN warning in raid5_add_disks
dm raid: fix KASAN warning in raid5_remove_disk
dm raid: fix accesses beyond end of raid member array
Linus Torvalds [Fri, 1 Jul 2022 17:52:01 +0000 (10:52 -0700)]
Merge tag 'io_uring-5.19-2022-07-01' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Two minor tweaks:
- While we still can, adjust the send/recv based flags to be in
->ioprio rather than in ->addr2. This is consistent with eg accept,
and also doesn't waste a full 64-bit field for flags (Pavel)
- 5.18-stable fix for re-importing provided buffers. Not much real
world relevance here as it'll only impact non-pollable files gone
async, which is more of a practical test case rather than something
that is used in the wild (Dylan)"
* tag 'io_uring-5.19-2022-07-01' of git://git.kernel.dk/linux-block:
io_uring: fix provided buffer import
io_uring: keep sendrecv flags in ioprio
Linus Torvalds [Fri, 1 Jul 2022 17:42:10 +0000 (10:42 -0700)]
Merge tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Fix for batch getting of tags in sbitmap (wuchi)
- NVMe pull request via Christoph:
- More quirks (Lamarque Vieira Souza, Pablo Greco)
- Fix a fabrics disconnect regression (Ruozhu Li)
- Fix a nvmet-tcp data_digest calculation regression (Sagi
Grimberg)
- Fix nvme-tcp send failure handling (Sagi Grimberg)
- Fix a regression with nvmet-loop and passthrough controllers
(Alan Adamson)
* tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block:
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1
nvmet: add a clear_ids attribute for passthru targets
nvme: fix regression when disconnect a recovering ctrl
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G)
nvme-tcp: always fail a request when sending it failed
nvmet-tcp: fix regression in data_digest calculation
lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()
Linus Torvalds [Fri, 1 Jul 2022 17:38:17 +0000 (10:38 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"One simple driver fix for a dma overrun"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: hisi_sas: Limit max hw sectors for v3 HW
Linus Torvalds [Fri, 1 Jul 2022 17:31:44 +0000 (10:31 -0700)]
Merge tag 'ata-5.19-rc5' of git://git./linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:
- Fix a compilation warning with some versions of gcc/sparse when
compiling the pata_cs5535 driver, from John.
* tag 'ata-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: pata_cs5535: Fix W=1 warnings
Will Deacon [Wed, 29 Jun 2022 09:53:49 +0000 (10:53 +0100)]
arm64: hugetlb: Restore TLB invalidation for BBM on contiguous ptes
Commit
fb396bb459c1 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
removed TLB invalidation from get_clear_flush() [now get_clear_contig()]
on the basis that the core TLB invalidation code is aware of hugetlb
mappings backed by contiguous page-table entries and will cover the
correct virtual address range.
However, this change also resulted in the TLB invalidation being removed
from the "break" step in the break-before-make (BBM) sequence used
internally by huge_ptep_set_{access_flags,wrprotect}(), therefore
making the BBM sequence unsafe irrespective of later invalidation.
Although the architecture is desperately unclear about how exactly
contiguous ptes should be updated in a live page-table, restore TLB
invalidation to our BBM sequence under the assumption that BBM is the
right thing to be doing in the first place.
Fixes:
fb396bb459c1 ("arm64/hugetlb: Drop TLB flush from get_clear_flush()")
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220629095349.25748-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Fri, 1 Jul 2022 17:01:32 +0000 (10:01 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Two small fixes
- Initialize a spinlock in the stm32 reset code
- Add dt bindings to the clk maintainer filepattern"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
MAINTAINERS: add include/dt-bindings/clock to COMMON CLK FRAMEWORK
clk: stm32: rcc_reset: Fix missing spin_lock_init()
Darrick J. Wong [Fri, 1 Jul 2022 16:08:33 +0000 (09:08 -0700)]
xfs: prevent a UAF when log IO errors race with unmount
KASAN reported the following use after free bug when running
generic/475:
XFS (dm-0): Mounting V5 Filesystem
XFS (dm-0): Starting recovery (logdev: internal)
XFS (dm-0): Ending recovery (logdev: internal)
Buffer I/O error on dev dm-0, logical block
20639616, async page read
Buffer I/O error on dev dm-0, logical block
20639617, async page read
XFS (dm-0): log I/O error -5
XFS (dm-0): Filesystem has been shut down due to log error (0x2).
XFS (dm-0): Unmounting Filesystem
XFS (dm-0): Please unmount the filesystem and rectify the problem(s).
==================================================================
BUG: KASAN: use-after-free in do_raw_spin_lock+0x246/0x270
Read of size 4 at addr
ffff888109dd84c4 by task 3:1H/136
CPU: 3 PID: 136 Comm: 3:1H Not tainted 5.19.0-rc4-xfsx #rc4
8e53ab5ad0fddeb31cee5e7063ff9c361915a9c4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs]
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0x2b8/0x661
? do_raw_spin_lock+0x246/0x270
kasan_report+0xab/0x120
? do_raw_spin_lock+0x246/0x270
do_raw_spin_lock+0x246/0x270
? rwlock_bug.part.0+0x90/0x90
xlog_force_shutdown+0xf6/0x370 [xfs
4ad76ae0d6add7e8183a553e624c31e9ed567318]
xlog_ioend_work+0x100/0x190 [xfs
4ad76ae0d6add7e8183a553e624c31e9ed567318]
process_one_work+0x672/0x1040
worker_thread+0x59b/0xec0
? __kthread_parkme+0xc6/0x1f0
? process_one_work+0x1040/0x1040
? process_one_work+0x1040/0x1040
kthread+0x29e/0x340
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
</TASK>
Allocated by task 154099:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
kmem_alloc+0x8d/0x2e0 [xfs]
xlog_cil_init+0x1f/0x540 [xfs]
xlog_alloc_log+0xd1e/0x1260 [xfs]
xfs_log_mount+0xba/0x640 [xfs]
xfs_mountfs+0xf2b/0x1d00 [xfs]
xfs_fs_fill_super+0x10af/0x1910 [xfs]
get_tree_bdev+0x383/0x670
vfs_get_tree+0x7d/0x240
path_mount+0xdb7/0x1890
__x64_sys_mount+0x1fa/0x270
do_syscall_64+0x2b/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Freed by task 154151:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_set_free_info+0x20/0x30
____kasan_slab_free+0x110/0x190
slab_free_freelist_hook+0xab/0x180
kfree+0xbc/0x310
xlog_dealloc_log+0x1b/0x2b0 [xfs]
xfs_unmountfs+0x119/0x200 [xfs]
xfs_fs_put_super+0x6e/0x2e0 [xfs]
generic_shutdown_super+0x12b/0x3a0
kill_block_super+0x95/0xd0
deactivate_locked_super+0x80/0x130
cleanup_mnt+0x329/0x4d0
task_work_run+0xc5/0x160
exit_to_user_mode_prepare+0xd4/0xe0
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x46/0xb0
This appears to be a race between the unmount process, which frees the
CIL and waits for in-flight iclog IO; and the iclog IO completion. When
generic/475 runs, it starts fsstress in the background, waits a few
seconds, and substitutes a dm-error device to simulate a disk falling
out of a machine. If the fsstress encounters EIO on a pure data write,
it will exit but the filesystem will still be online.
The next thing the test does is unmount the filesystem, which tries to
clean the log, free the CIL, and wait for iclog IO completion. If an
iclog was being written when the dm-error switch occurred, it can race
with log unmounting as follows:
Thread 1 Thread 2
xfs_log_unmount
xfs_log_clean
xfs_log_quiesce
xlog_ioend_work
<observe error>
xlog_force_shutdown
test_and_set_bit(XLOG_IOERROR)
xfs_log_force
<log is shut down, nop>
xfs_log_umount_write
<log is shut down, nop>
xlog_dealloc_log
xlog_cil_destroy
<wait for iclogs>
spin_lock(&log->l_cilp->xc_push_lock)
<KABOOM>
Therefore, free the CIL after waiting for the iclogs to complete. I
/think/ this race has existed for quite a few years now, though I don't
remember the ~2014 era logging code well enough to know if it was a real
threat then or if the actual race was exposed only more recently.
Fixes:
ac983517ec59 ("xfs: don't sleep in xlog_cil_force_lsn on shutdown")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Linus Torvalds [Fri, 1 Jul 2022 00:19:19 +0000 (17:19 -0700)]
Merge tag 'drm-fixes-2022-07-01' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Bit quieter this week, the main thing is it pulls in the fixes for the
sysfb resource issue you were seeing. these had been queued for next
so should have had some decent testing.
Otherwise amdgpu, i915 and msm each have a few fixes, and vc4 has one.
fbdev:
- sysfb fixes/conflicting fb fixes
amdgpu:
- GPU recovery fix
- Fix integer type usage in fourcc header for AMD modifiers
- KFD TLB flush fix for gfx9 APUs
- Display fix
i915:
- Fix ioctl argument error return
- Fix d3cold disable to allow PCI upstream bridge D3 transition
- Fix setting cache_dirty for dma-buf objects on discrete
msm:
- Fix to increment vsync_cnt before calling drm_crtc_handle_vblank so
that userspace sees the value *after* it is incremented if waiting
for vblank events
- Fix to reset drm_dev to NULL in dp_display_unbind to avoid a crash
in probe/bind error paths
- Fix to resolve the smatch error of de-referencing before NULL check
in dpu_encoder_phys_wb.c
- Fix error return to userspace if fence-id allocation fails in
submit ioctl
vc4:
- NULL ptr dereference fix"
* tag 'drm-fixes-2022-07-01' of git://anongit.freedesktop.org/drm/drm:
Revert "drm/amdgpu/display: set vblank_disable_immediate for DC"
drm/amdgpu: To flush tlb for MMHUB of RAVEN series
drm/fourcc: fix integer type usage in uapi header
drm/amdgpu: fix adev variable used in amdgpu_device_gpu_recover()
fbdev: Disable sysfb device registration when removing conflicting FBs
firmware: sysfb: Add sysfb_disable() helper function
firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
drm/msm/gem: Fix error return on fence id alloc fail
drm/i915: tweak the ordering in cpu_write_needs_clflush
drm/i915/dgfx: Disable d3cold at gfx root port
drm/i915/gem: add missing else
drm/vc4: perfmon: Fix variable dereferenced before check
drm/msm/dpu: Fix variable dereferenced before check
drm/msm/dp: reset drm_dev to NULL at dp_display_unbind()
drm/msm/dpu: Increment vsync_cnt before waking up userspace
Dave Airlie [Thu, 30 Jun 2022 23:27:28 +0000 (09:27 +1000)]
Merge tag 'drm-misc-fixes-2022-06-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A NULL pointer dereference fix for vc4, and 3 patches to improve the
sysfb device behaviour when removing conflicting framebuffers
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220630072404.2fa4z3nk5h5q34ci@houat
Linus Torvalds [Thu, 30 Jun 2022 22:26:55 +0000 (15:26 -0700)]
Merge tag 'net-5.19-rc5' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - new code bugs:
- clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()
- mptcp:
- invoke MP_FAIL response only when needed
- fix shutdown vs fallback race
- consistent map handling on failure
- octeon_ep: use bitwise AND
Previous releases - regressions:
- tipc: move bc link creation back to tipc_node_create, fix NPD
Previous releases - always broken:
- tcp: add a missing nf_reset_ct() in 3WHS handling to prevent socket
buffered skbs from keeping refcount on the conntrack module
- ipv6: take care of disable_policy when restoring routes
- tun: make sure to always disable and unlink NAPI instances
- phy: don't trigger state machine while in suspend
- netfilter: nf_tables: avoid skb access on nf_stolen
- asix: fix "can't send until first packet is send" issue
- usb: asix: do not force pause frames support
- nxp-nci: don't issue a zero length i2c_master_read()
Misc:
- ncsi: allow use of proper "mellanox" DT vendor prefix
- act_api: add a message for user space if any actions were already
flushed before the error was hit"
* tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
net: dsa: felix: fix race between reading PSFP stats and port stats
selftest: tun: add test for NAPI dismantle
net: tun: avoid disabling NAPI twice
net: sparx5: mdb add/del handle non-sparx5 devices
net: sfp: fix memory leak in sfp_probe()
mlxsw: spectrum_router: Fix rollback in tunnel next hop init
net: rose: fix UAF bugs caused by timer handler
net: usb: ax88179_178a: Fix packet receiving
net: bonding: fix use-after-free after 802.3ad slave unbind
ipv6: fix lockdep splat in in6_dump_addrs()
net: phy: ax88772a: fix lost pause advertisement configuration
net: phy: Don't trigger state machine while in suspend
usbnet: fix memory allocation in helpers
selftests net: fix kselftest net fatal error
NFC: nxp-nci: don't print header length mismatch on i2c error
NFC: nxp-nci: Don't issue a zero length i2c_master_read()
net: tipc: fix possible refcount leak in tipc_sk_create()
nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
net: ipv6: unexport __init-annotated seg6_hmac_net_init()
ipv6/sit: fix ipip6_tunnel_get_prl return value
...
Amir Goldstein [Thu, 30 Jun 2022 19:58:49 +0000 (22:58 +0300)]
vfs: fix copy_file_range() regression in cross-fs copies
A regression has been reported by Nicolas Boichat, found while using the
copy_file_range syscall to copy a tracefs file.
Before commit
5dae222a5ff0 ("vfs: allow copy_file_range to copy across
devices") the kernel would return -EXDEV to userspace when trying to
copy a file across different filesystems. After this commit, the
syscall doesn't fail anymore and instead returns zero (zero bytes
copied), as this file's content is generated on-the-fly and thus reports
a size of zero.
Another regression has been reported by He Zhe - the assertion of
WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
copying from a sysfs file whose read operation may return -EOPNOTSUPP.
Since we do not have test coverage for copy_file_range() between any two
types of filesystems, the best way to avoid these sort of issues in the
future is for the kernel to be more picky about filesystems that are
allowed to do copy_file_range().
This patch restores some cross-filesystem copy restrictions that existed
prior to commit
5dae222a5ff0 ("vfs: allow copy_file_range to copy across
devices"), namely, cross-sb copy is not allowed for filesystems that do
not implement ->copy_file_range().
Filesystems that do implement ->copy_file_range() have full control of
the result - if this method returns an error, the error is returned to
the user. Before this change this was only true for fs that did not
implement the ->remap_file_range() operation (i.e. nfsv3).
Filesystems that do not implement ->copy_file_range() still fall-back to
the generic_copy_file_range() implementation when the copy is within the
same sb. This helps the kernel can maintain a more consistent story
about which filesystems support copy_file_range().
nfsd and ksmbd servers are modified to fall-back to the
generic_copy_file_range() implementation in case vfs_copy_file_range()
fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
server-side-copy.
fall-back to generic_copy_file_range() is not implemented for the smb
operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
change of behavior.
Fixes:
5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices")
Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/
Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/
Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Fixes:
64bf5ff58dff ("vfs: no fallback for ->copy_file_range")
Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/
Reported-by: He Zhe <zhe.he@windriver.com>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chuck Lever [Thu, 30 Jun 2022 20:48:18 +0000 (16:48 -0400)]
SUNRPC: Fix READ_PLUS crasher
Looks like there are still cases when "space_left - frag1bytes" can
legitimately exceed PAGE_SIZE. Ensure that xdr->end always remains
within the current encode buffer.
Reported-by: Bruce Fields <bfields@fieldses.org>
Reported-by: Zorro Lang <zlang@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216151
Fixes:
6c254bf3b637 ("SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Scott Mayhew [Mon, 27 Jun 2022 21:31:29 +0000 (17:31 -0400)]
NFSv4: Add an fattr allocation to _nfs4_discover_trunking()
This was missed in
c3ed222745d9 ("NFSv4: Fix free of uninitialized
nfs4_label on referral lookup.") and causes a panic when mounting
with '-o trunkdiscovery':
PID: 1604 TASK:
ffff93dac3520000 CPU: 3 COMMAND: "mount.nfs"
#0 [
ffffb79140f738f8] machine_kexec at
ffffffffaec64bee
#1 [
ffffb79140f73950] __crash_kexec at
ffffffffaeda67fd
#2 [
ffffb79140f73a18] crash_kexec at
ffffffffaeda76ed
#3 [
ffffb79140f73a30] oops_end at
ffffffffaec2658d
#4 [
ffffb79140f73a50] general_protection at
ffffffffaf60111e
[exception RIP: nfs_fattr_init+0x5]
RIP:
ffffffffc0c18265 RSP:
ffffb79140f73b08 RFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff93dac304a800 RCX:
0000000000000000
RDX:
ffffb79140f73bb0 RSI:
ffff93dadc8cbb40 RDI:
d03ee11cfaf6bd50
RBP:
ffffb79140f73be8 R8:
ffffffffc0691560 R9:
0000000000000006
R10:
ffff93db3ffd3df8 R11:
0000000000000000 R12:
ffff93dac4040000
R13:
ffff93dac2848e00 R14:
ffffb79140f73b60 R15:
ffffb79140f73b30
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
#5 [
ffffb79140f73b08] _nfs41_proc_get_locations at
ffffffffc0c73d53 [nfsv4]
#6 [
ffffb79140f73bf0] nfs4_proc_get_locations at
ffffffffc0c83e90 [nfsv4]
#7 [
ffffb79140f73c60] nfs4_discover_trunking at
ffffffffc0c83fb7 [nfsv4]
#8 [
ffffb79140f73cd8] nfs_probe_fsinfo at
ffffffffc0c0f95f [nfs]
#9 [
ffffb79140f73da0] nfs_probe_server at
ffffffffc0c1026a [nfs]
RIP:
00007f6254fce26e RSP:
00007ffc69496ac8 RFLAGS:
00000246
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f6254fce26e
RDX:
00005600220a82a0 RSI:
00005600220a64d0 RDI:
00005600220a6520
RBP:
00007ffc69496c50 R8:
00005600220a8710 R9:
003035322e323231
R10:
0000000000000000 R11:
0000000000000246 R12:
00007ffc69496c50
R13:
00005600220a8440 R14:
0000000000000010 R15:
0000560020650ef9
ORIG_RAX:
00000000000000a5 CS: 0033 SS: 002b
Fixes:
c3ed222745d9 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Thu, 23 Jun 2022 04:47:34 +0000 (14:47 +1000)]
NFS: restore module put when manager exits.
Commit
f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module") removed
calls to module_put_and_kthread_exit() from threads that acted as SUNRPC
servers and had a related svc_serv_ops structure. This was correct.
It ALSO removed the module_put_and_kthread_exit() call from
nfs4_run_state_manager() which is NOT a SUNRPC service.
Consequently every time the NFSv4 state manager runs the module count
increments and won't be decremented. So the nfsv4 module cannot be
unloaded.
So restore the module_put_and_kthread_exit() call.
Fixes:
f49169c97fce ("NFSD: Remove svc_serv_ops::svo_module")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Jens Axboe [Thu, 30 Jun 2022 20:00:11 +0000 (14:00 -0600)]
Merge tag 'nvme-5.19-2022-06-30' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.19
- more quirks (Lamarque Vieira Souza, Pablo Greco)
- fix a fabrics disconnect regression (Ruozhu Li)
- fix a nvmet-tcp data_digest calculation regression (Sagi Grimberg)
- fix nvme-tcp send failure handling (Sagi Grimberg)
- fix a regression with nvmet-loop and passthrough controllers
(Alan Adamson)"
* tag 'nvme-5.19-2022-06-30' of git://git.infradead.org/nvme:
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1
nvmet: add a clear_ids attribute for passthru targets
nvme: fix regression when disconnect a recovering ctrl
nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G)
nvme-tcp: always fail a request when sending it failed
nvmet-tcp: fix regression in data_digest calculation
Vladimir Oltean [Wed, 29 Jun 2022 18:30:07 +0000 (21:30 +0300)]
net: dsa: felix: fix race between reading PSFP stats and port stats
Both PSFP stats and the port stats read by ocelot_check_stats_work() are
indirectly read through the same mechanism - write to STAT_CFG:STAT_VIEW,
read from SYS:STAT:CNT[n].
It's just that for port stats, we write STAT_VIEW with the index of the
port, and for PSFP stats, we write STAT_VIEW with the filter index.
So if we allow them to run concurrently, ocelot_check_stats_work() may
change the view from vsc9959_psfp_counters_get(), and vice versa.
Fixes:
7d4b564d6add ("net: dsa: felix: support psfp filter on vsc9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220629183007.3808130-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 29 Jun 2022 18:19:11 +0000 (11:19 -0700)]
selftest: tun: add test for NAPI dismantle
Being lazy does not pay, add the test for various
ordering of tun queue close / detach / destroy.
Link: https://lore.kernel.org/r/20220629181911.372047-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 29 Jun 2022 18:19:10 +0000 (11:19 -0700)]
net: tun: avoid disabling NAPI twice
Eric reports that syzbot made short work out of my speculative
fix. Indeed when queue gets detached its tfile->tun remains,
so we would try to stop NAPI twice with a detach(), close()
sequence.
Alternative fix would be to move tun_napi_disable() to
tun_detach_all() and let the NAPI run after the queue
has been detached.
Fixes:
a8fc8cb5692a ("net: tun: stop NAPI when detaching queues")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220629181911.372047-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Casper Andersson [Thu, 30 Jun 2022 12:22:26 +0000 (14:22 +0200)]
net: sparx5: mdb add/del handle non-sparx5 devices
When adding/deleting mdb entries on other net_devices, eg., tap
interfaces, it should not crash.
Fixes:
3bacfccdcb2d ("net: sparx5: Add mdb handlers")
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Link: https://lore.kernel.org/r/20220630122226.316812-1-casper.casan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sumeet Pawnikar [Fri, 6 May 2022 13:50:09 +0000 (19:20 +0530)]
thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLake
Add RaptorLake to the list of processor models supported by the Intel
TCC cooling driver.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Zhang Jiaming [Thu, 23 Jun 2022 06:05:43 +0000 (14:05 +0800)]
s390/qdio: Fix spelling mistake
Change 'defineable' to 'definable'.
Change 'paramater' to 'parameter'.
Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Link: https://lore.kernel.org/r/20220623060543.12870-1-jiaming@nfschina.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Jiang Jian [Wed, 22 Jun 2022 14:27:13 +0000 (22:27 +0800)]
s390/sclp: Fix typo in comments
Remove the repeated word 'and' from comments
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220622142713.14187-1-jiangjian@cdjrlc.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Jason A. Donenfeld [Fri, 10 Jun 2022 22:20:23 +0000 (00:20 +0200)]
s390/archrandom: simplify back to earlier design and initialize earlier
s390x appears to present two RNG interfaces:
- a "TRNG" that gathers entropy using some hardware function; and
- a "DRBG" that takes in a seed and expands it.
Previously, the TRNG was wired up to arch_get_random_{long,int}(), but
it was observed that this was being called really frequently, resulting
in high overhead. So it was changed to be wired up to arch_get_random_
seed_{long,int}(), which was a reasonable decision. Later on, the DRBG
was then wired up to arch_get_random_{long,int}(), with a complicated
buffer filling thread, to control overhead and rate.
Fortunately, none of the performance issues matter much now. The RNG
always attempts to use arch_get_random_seed_{long,int}() first, which
means a complicated implementation of arch_get_random_{long,int}() isn't
really valuable or useful to have around. And it's only used when
reseeding, which means it won't hit the high throughput complications
that were faced before.
So this commit returns to an earlier design of just calling the TRNG in
arch_get_random_seed_{long,int}(), and returning false in arch_get_
random_{long,int}().
Part of what makes the simplification possible is that the RNG now seeds
itself using the TRNG at bootup. But this only works if the TRNG is
detected early in boot, before random_init() is called. So this commit
also causes that check to happen in setup_arch().
Cc: stable@vger.kernel.org
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Ingo Franzki <ifranzki@linux.ibm.com>
Cc: Juergen Christ <jchrist@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Dylan Yudaken [Thu, 30 Jun 2022 13:20:06 +0000 (06:20 -0700)]
io_uring: fix provided buffer import
io_import_iovec uses the s pointer, but this was changed immediately
after the iovec was re-imported and so it was imported into the wrong
place.
Change the ordering.
Fixes:
2be2eb02e2f5 ("io_uring: ensure reads re-import for selected buffers")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220630132006.2825668-1-dylany@fb.com
[axboe: ensure we don't half-import as well]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 30 Jun 2022 17:03:22 +0000 (10:03 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Three minor bug fixes:
- qedr not setting the QP timeout properly toward userspace
- Memory leak on error path in ib_cm
- Divide by 0 in RDMA interrupt moderation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
linux/dim: Fix divide by 0 in RDMA DIM
RDMA/cm: Fix memory leak in ib_cm_insert_listen
RDMA/qedr: Fix reporting QP timeout attribute
Linus Torvalds [Thu, 30 Jun 2022 16:57:18 +0000 (09:57 -0700)]
Merge tag 'fsnotify_for_v5.19-rc5' of git://git./linux/kernel/git/jack/linux-fs
Pull fanotify fix from Jan Kara:
"A fix for recently added fanotify API to have stricter checks and
refuse some invalid flag combinations to make our life easier in the
future"
* tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: refine the validation checks on non-dir inode mask
Linus Torvalds [Thu, 30 Jun 2022 16:45:42 +0000 (09:45 -0700)]
Merge tag 'v5.19-p3' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a regression that breaks the ccp driver"
* tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ccp - Fix device IRQ counting by using platform_irq_count()
Rafael J. Wysocki [Thu, 30 Jun 2022 13:30:30 +0000 (15:30 +0200)]
Merge tag 'devfreq-fixes-for-5.19-rc5' of git://git./linux/kernel/git/chanwoo/linux
Pull devfreq fixes for 5.19-rc5 from Chanwoo Choi:
"1. Fix devfreq passive governor issue when cpufreq policies are not
ready during kernel boot because some CPUs turn on after kernel
booting or others.
- Re-initialize the vairables of struct devfreq_passive_data when
PROBE_DEFER happens when cpufreq_get() returns NULL.
- Use dev_err_probe to mute warning when PROBE_DEFER.
- Fix cpufreq passive unregister erroring on PROBE_DEFER
by using the allocated parent_cpu_data list to free resouce
instead of for_each_possible_cpu().
- Remove duplicate cpufreq passive unregister and warning when
PROBE_DEFER.
- Use HZ_PER_KZH macro in units.h.
- Fix wrong indentation in SPDX-License line.
2. Fix reference count leak in exynos-ppmu.c by using of_node_put().
3. Rework freq_table to be local to devfreq struct
- struct devfreq_dev_profile includes freq_table array to store
the supported frequencies. If devfreq driver doesn't initialize
the freq_table, devfreq core allocates the memory and initializes
the freq_table.
On a devfreq PROBE_DEFER, the freq_table in the driver profile
struct is never reset and may be left in an undefined state. To fix
this and correctly handle PROBE_DEFER, use a local freq_table and
max_state in the devfreq struct."
* tag 'devfreq-fixes-for-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: passive: revert an editing accident in SPDX-License line
PM / devfreq: Fix kernel warning with cpufreq passive register fail
PM / devfreq: Rework freq_table to be local to devfreq struct
PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events
PM / devfreq: passive: Use HZ_PER_KHZ macro in units.h
PM / devfreq: Fix cpufreq passive unregister erroring on PROBE_DEFER
PM / devfreq: Mute warning on governor PROBE_DEFER
PM / devfreq: Fix kernel panic with cpu based scaling to passive gov
Pavel Begunkov [Thu, 30 Jun 2022 12:25:57 +0000 (13:25 +0100)]
io_uring: keep sendrecv flags in ioprio
We waste a u64 SQE field for flags even though we don't need as many
bits and it can be used for something more useful later. Store io_uring
specific send/recv flags in sqe->ioprio instead of ->addr2.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Fixes:
0455d4ccec54 ("io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg")
[axboe: change comment in io_uring.h as well]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Masahiro Yamada [Mon, 13 Jun 2022 17:09:02 +0000 (02:09 +0900)]
s390/purgatory: remove duplicated build rule of kexec-purgatory.o
This is equivalent to the pattern rule in scripts/Makefile.build.
Having the dependency on $(obj)/purgatory.ro is enough.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220613170902.1775211-3-masahiroy@kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Masahiro Yamada [Mon, 13 Jun 2022 17:09:01 +0000 (02:09 +0900)]
s390/purgatory: hard-code obj-y in Makefile
The purgatory/ directory is entirely guarded in arch/s390/Kbuild.
CONFIG_ARCH_HAS_KEXEC_PURGATORY is bool type.
$(CONFIG_ARCH_HAS_KEXEC_PURGATORY) is always 'y' when Kbuild visits
this Makefile for building.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220613170902.1775211-2-masahiroy@kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Masahiro Yamada [Mon, 13 Jun 2022 17:09:00 +0000 (02:09 +0900)]
s390: remove unneeded 'select BUILD_BIN2C'
Since commit
4c0f032d4963 ("s390/purgatory: Omit use of bin2c"),
s390 builds the purgatory without using bin2c.
Remove 'select BUILD_BIN2C' to avoid the unneeded build of bin2c.
Fixes:
4c0f032d4963 ("s390/purgatory: Omit use of bin2c")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220613170902.1775211-1-masahiroy@kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Jianglei Nie [Wed, 29 Jun 2022 07:55:50 +0000 (15:55 +0800)]
net: sfp: fix memory leak in sfp_probe()
sfp_probe() allocates a memory chunk from sfp with sfp_alloc(). When
devm_add_action() fails, sfp is not freed, which leads to a memory leak.
We should use devm_add_action_or_reset() instead of devm_add_action().
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220629075550.2152003-1-niejianglei2021@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Machata [Wed, 29 Jun 2022 07:02:05 +0000 (10:02 +0300)]
mlxsw: spectrum_router: Fix rollback in tunnel next hop init
In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.
A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
block does exist, however does not include the linked list removal.
Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.
Fixes:
dbe4598c1e92 ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes:
a5390278a5eb ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>