ring-buffer: Do not try to put back write_stamp
author Steven Rostedt (Google) <rostedt@goodmis.org>
Fri, 15 Dec 2023 03:29:21 +0000 (22:29 -0500)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 20 Dec 2023 16:02:06 +0000 (17:02 +0100)
commit dd939425707898da992e59ab0fcfae4652546910 upstream.

If an update to an event is interrupted by another event between the time
the initial event allocated its buffer and where it wrote to the
write_stamp, the code tries to reset the write_stamp back to what it had
just overwritten. It knows that it was overwritten by checking the
before_stamp: if that no longer matches what it wrote to the before_stamp
before it allocated its space, it knows it was overwritten.

To put back the write_stamp, it uses the before_stamp it read. The problem
here is that by writing the before_stamp to the write_stamp it makes the
two equal again, which means that the write_stamp can be considered valid
as the last timestamp written to the ring buffer. But this is not
necessarily true. The event that interrupted this event could itself have
been interrupted, and can end up leaving with an invalid write_stamp. If
that happens and control returns to this context, which then uses the
before_stamp to update the write_stamp again, it can incorrectly make the
write_stamp valid, causing later events to have incorrect time stamps.

As it is OK to leave this function with an invalid write_stamp (one that
doesn't match the before_stamp), there's no reason to try to make it valid
again in this case. If this race happens, then just leave with the invalid
write_stamp and the next event to come along will just add an absolute
timestamp and validate everything again.
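
The validity rule described above can be sketched as a small standalone
model. This is only an illustration under simplifying assumptions: the
struct stamps type and the write_stamp_valid() and record_event() helpers
are hypothetical names, not kernel code, and the model ignores the atomic
rb_time_t handling and the reservation steps of the real
__rb_reserve_next().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the kernel's per-CPU buffer. */
struct stamps {
	uint64_t before_stamp;
	uint64_t write_stamp;
};

/* The write_stamp is trusted only while it matches the before_stamp. */
static bool write_stamp_valid(const struct stamps *s)
{
	return s->before_stamp == s->write_stamp;
}

/*
 * Model an uninterrupted event at time ts. Returns the value it would
 * record: a delta from the write_stamp when that stamp is valid,
 * otherwise the full (absolute) timestamp. Either way both stamps end
 * up equal to ts, so the next event sees a valid write_stamp again.
 */
static uint64_t record_event(struct stamps *s, uint64_t ts, bool *absolute)
{
	uint64_t recorded;

	*absolute = !write_stamp_valid(s);
	if (*absolute) {
		recorded = ts;
	} else {
		assert(ts >= s->write_stamp); /* time only moves forward */
		recorded = ts - s->write_stamp;
	}
	s->before_stamp = ts;
	s->write_stamp = ts;
	return recorded;
}
```

In this model, leaving the two stamps mismatched after a race costs only
one absolute timestamp on the next event, after which deltas resume.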

Bonus points: This gets rid of another cmpxchg64!

Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernel/trace/ring_buffer.c

index 93a7b5b..803f0ce 100644
@@ -3614,14 +3614,14 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
        }
 
        if (likely(tail == w)) {
-               u64 save_before;
-               bool s_ok;
-
                /* Nothing interrupted us between A and C */
  /*D*/         rb_time_set(&cpu_buffer->write_stamp, info->ts);
-               barrier();
- /*E*/         s_ok = rb_time_read(&cpu_buffer->before_stamp, &save_before);
-               RB_WARN_ON(cpu_buffer, !s_ok);
+               /*
+                * If something came in between C and D, the write stamp
+                * may now not be in sync. But that's fine as the before_stamp
+                * will be different and then next event will just be forced
+                * to use an absolute timestamp.
+                */
                if (likely(!(info->add_timestamp &
                             (RB_ADD_STAMP_FORCE | RB_ADD_STAMP_ABSOLUTE))))
                        /* This did not interrupt any time update */
@@ -3629,24 +3629,7 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
                else
                        /* Just use full timestamp for interrupting event */
                        info->delta = info->ts;
-               barrier();
                check_buffer(cpu_buffer, info, tail);
-               if (unlikely(info->ts != save_before)) {
-                       /* SLOW PATH - Interrupted between C and E */
-
-                       a_ok = rb_time_read(&cpu_buffer->write_stamp, &info->after);
-                       RB_WARN_ON(cpu_buffer, !a_ok);
-
-                       /* Write stamp must only go forward */
-                       if (save_before > info->after) {
-                               /*
-                                * We do not care about the result, only that
-                                * it gets updated atomically.
-                                */
-                               (void)rb_time_cmpxchg(&cpu_buffer->write_stamp,
-                                                     info->after, save_before);
-                       }
-               }
        } else {
                u64 ts;
                /* SLOW PATH - Interrupted between A and C */