sched/cputime: Do not scale when utime == 0
authorStanislaw Gruszka <sgruszka@redhat.com>
Wed, 4 Sep 2013 13:16:03 +0000 (15:16 +0200)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Tue, 1 Oct 2013 16:17:45 +0000 (09:17 -0700)
commit 5a8e01f8fa51f5cbce8f37acc050eb2319d12956 upstream.

scale_stime() silently assumes that stime < rtime, otherwise
when stime == rtime and both values are big enough (operations
on them do not fit in 32 bits), the resulting scaling stime can
be bigger than rtime. In consequence utime = rtime - stime
results in negative value.

User space visible symptoms of the bug are overflowed TIME
values on ps/top, for example:

 $ ps aux | grep rcu
 root         8  0.0  0.0      0     0 ?        S    12:42   0:00 [rcuc/0]
 root         9  0.0  0.0      0     0 ?        S    12:42   0:00 [rcub/0]
 root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]
 root        11  0.1  0.0      0     0 ?        S    12:42   0:02 [rcuop/0]
 root        12 62422329  0.0  0     0 ?        S    12:42 21114581:35 [rcuop/1]
 root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]

or overflowed utime values read directly from /proc/$PID/stat

Reference:

  https://lkml.org/lkml/2013/8/20/259

Reported-and-tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: stable@vger.kernel.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20130904131602.GC2564@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernel/sched/cputime.c

index b5ccba2..1101d92 100644 (file)
@@ -558,7 +558,7 @@ static void cputime_adjust(struct task_cputime *curr,
                           struct cputime *prev,
                           cputime_t *ut, cputime_t *st)
 {
-       cputime_t rtime, stime, utime, total;
+       cputime_t rtime, stime, utime;
 
        if (vtime_accounting_enabled()) {
                *ut = curr->utime;
@@ -566,9 +566,6 @@ static void cputime_adjust(struct task_cputime *curr,
                return;
        }
 
-       stime = curr->stime;
-       total = stime + curr->utime;
-
        /*
         * Tick based cputime accounting depend on random scheduling
         * timeslices of a task to be interrupted or not by the timer.
@@ -589,13 +586,19 @@ static void cputime_adjust(struct task_cputime *curr,
        if (prev->stime + prev->utime >= rtime)
                goto out;
 
-       if (total) {
+       stime = curr->stime;
+       utime = curr->utime;
+
+       if (utime == 0) {
+               stime = rtime;
+       } else if (stime == 0) {
+               utime = rtime;
+       } else {
+               cputime_t total = stime + utime;
+
                stime = scale_stime((__force u64)stime,
                                    (__force u64)rtime, (__force u64)total);
                utime = rtime - stime;
-       } else {
-               stime = rtime;
-               utime = 0;
        }
 
        /*