Stanislaw Gruszka [Tue, 30 Apr 2013 09:35:07 +0000 (11:35 +0200)]
Revert "math64: New div64_u64_rem helper"
This reverts commit
f792685006274a850e6cc0ea9ade275ccdfc90bc.
The cputime scaling code was changed/fixed and does not need the
div64_u64_rem() primitive anymore. It has no other users, so let's
remove them.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1367314507-9728-4-git-send-email-sgruszka@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Stanislaw Gruszka [Tue, 30 Apr 2013 09:35:06 +0000 (11:35 +0200)]
sched: Avoid prev->stime underflow
Dave Hansen reported strange utime/stime values on his system:
https://lkml.org/lkml/2013/4/4/435
This happens because prev->stime value is bigger than rtime
value. Root of the problem are non-monotonic rtime values (i.e.
current rtime is smaller than previous rtime) and that should be
debugged and fixed.
But since problem did not manifest itself before commit
62188451f0d63add7ad0cd2a1ae269d600c1663d "cputime: Avoid
multiplication overflow on utime scaling", it should be threated
as regression, which we can easily fixed on cputime_adjust()
function.
For now, let's apply this fix, but further work is needed to fix
root of the problem.
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Cc: <stable@vger.kernel.org> # 3.9+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1367314507-9728-3-git-send-email-sgruszka@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Stanislaw Gruszka [Tue, 30 Apr 2013 09:35:05 +0000 (11:35 +0200)]
sched: Do not account bogus utime
Due to rounding in scale_stime(), for big numbers, scaled stime
values will grow in chunks. Since rtime grow in jiffies and we
calculate utime like below:
prev->stime = max(prev->stime, stime);
prev->utime = max(prev->utime, rtime - prev->stime);
we could erroneously account stime values as utime. To prevent
that only update prev->{u,s}time values when they are smaller
than current rtime.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1367314507-9728-2-git-send-email-sgruszka@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Stanislaw Gruszka [Tue, 30 Apr 2013 15:14:42 +0000 (17:14 +0200)]
sched: Avoid cputime scaling overflow
Here is patch, which adds Linus's cputime scaling algorithm to the
kernel.
This is a follow up (well, fix) to commit
d9a3c9823a2e6a543eb7807fb3d15d8233817ec5 ("sched: Lower chances
of cputime scaling overflow") which commit tried to avoid
multiplication overflow, but did not guarantee that the overflow
would not happen.
Linus crated a different algorithm, which completely avoids the
multiplication overflow by dropping precision when numbers are
big.
It was tested by me and it gives good relative error of
scaled numbers. Testing method is described here:
http://marc.info/?l=linux-kernel&m=
136733059505406&w=2
Originally-From: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: Dave Hansen <dave@sr71.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130430151441.GC10465@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vincent Guittot [Tue, 23 Apr 2013 14:59:02 +0000 (16:59 +0200)]
sched: Fix init NOHZ_IDLE flag
On my SMP platform which is made of 5 cores in 2 clusters, I
have the nr_busy_cpu field of sched_group_power struct that is
not null when the platform is fully idle - which makes the
scheduler unhappy.
The root cause is:
During the boot sequence, some CPUs reach the idle loop and set
their NOHZ_IDLE flag while waiting for others CPUs to boot. But
the nr_busy_cpus field is initialized later with the assumption
that all CPUs are in the busy state whereas some CPUs have
already set their NOHZ_IDLE flag.
More generally, the NOHZ_IDLE flag must be initialized when new
sched_domains are created in order to ensure that NOHZ_IDLE and
nr_busy_cpus are aligned.
This condition can be ensured by adding a synchronize_rcu()
between the destruction of old sched_domains and the creation of
new ones so the NOHZ_IDLE flag will not be updated with old
sched_domain once it has been initialized. But this solution
introduces a additionnal latency in the rebuild sequence that is
called during cpu hotplug.
As suggested by Frederic Weisbecker, another solution is to have
the same rcu lifecycle for both NOHZ_IDLE and sched_domain
struct. A new nohz_idle field is added to sched_domain so both
status and sched_domain will share the same RCU lifecycle and
will be always synchronized. In addition, there is no more need
to protect nohz_idle against concurrent access as it is only
modified by 2 exclusive functions called by local cpu.
This solution has been prefered to the creation of a new struct
with an extra pointer indirection for sched_domain.
The synchronization is done at the cost of :
- An additional indirection and a rcu_dereference for accessing nohz_idle.
- We use only the nohz_idle field of the top sched_domain.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linaro-kernel@lists.linaro.org
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
Cc: pjt@google.com
Cc: rostedt@goodmis.org
Cc: efault@gmx.de
Link: http://lkml.kernel.org/r/1366729142-14662-1-git-send-email-vincent.guittot@linaro.org
[ Fixed !NO_HZ build bug. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Joonsoo Kim [Tue, 23 Apr 2013 08:27:42 +0000 (17:27 +0900)]
sched: Prevent to re-select dst-cpu in load_balance()
Commit
88b8dac0 makes load_balance() consider other cpus in its
group. But, in that, there is no code for preventing to
re-select dst-cpu. So, same dst-cpu can be selected over and
over.
This patch add functionality to load_balance() in order to
exclude cpu which is selected once. We prevent to re-select
dst_cpu via env's cpus, so now, env's cpus is a candidate not
only for src_cpus, but also dst_cpus.
With this patch, we can remove lb_iterations and
max_lb_iterations, because we decide whether we can go ahead or
not via env's cpus.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Jason Low <jason.low2@hp.com>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366705662-3587-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Joonsoo Kim [Tue, 23 Apr 2013 08:27:41 +0000 (17:27 +0900)]
sched: Rename load_balance_tmpmask to load_balance_mask
This name doesn't represent specific meaning.
So rename it to imply it's purpose.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Jason Low <jason.low2@hp.com>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366705662-3587-6-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Joonsoo Kim [Tue, 23 Apr 2013 08:27:40 +0000 (17:27 +0900)]
sched: Move up affinity check to mitigate useless redoing overhead
Currently, LBF_ALL_PINNED is cleared after affinity check is
passed. So, if task migration is skipped by small load value or
small imbalance value in move_tasks(), we don't clear
LBF_ALL_PINNED. At last, we trigger 'redo' in load_balance().
Imbalance value is often so small that any tasks cannot be moved
to other cpus and, of course, this situation may be continued
after we change the target cpu. So this patch move up affinity
check code and clear LBF_ALL_PINNED before evaluating load value
in order to mitigate useless redoing overhead.
In addition, re-order some comments correctly.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Jason Low <jason.low2@hp.com>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366705662-3587-5-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Joonsoo Kim [Tue, 23 Apr 2013 08:27:39 +0000 (17:27 +0900)]
sched: Don't consider other cpus in our group in case of NEWLY_IDLE
Commit
88b8dac0 makes load_balance() consider other cpus in its
group, regardless of idle type. When we do NEWLY_IDLE balancing,
we should not consider it, because a motivation of NEWLY_IDLE
balancing is to turn this cpu to non idle state if needed. This
is not the case of other cpus. So, change code not to consider
other cpus for NEWLY_IDLE balancing.
With this patch, assign 'if (pulled_task) this_rq->idle_stamp =
0' in idle_balance() is corrected, because NEWLY_IDLE balancing
doesn't consider other cpus. Assigning to 'this_rq->idle_stamp'
is now valid.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: Jason Low <jason.low2@hp.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366705662-3587-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Joonsoo Kim [Tue, 23 Apr 2013 08:27:38 +0000 (17:27 +0900)]
sched: Explicitly cpu_idle_type checking in rebalance_domains()
After commit
88b8dac0, dst-cpu can be changed in load_balance(),
then we can't know cpu_idle_type of dst-cpu when load_balance()
return positive. So, add explicit cpu_idle_type checking.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: Jason Low <jason.low2@hp.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366705662-3587-3-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Joonsoo Kim [Tue, 23 Apr 2013 08:27:37 +0000 (17:27 +0900)]
sched: Change position of resched_cpu() in load_balance()
cur_ld_moved is reset if env.flags hit LBF_NEED_BREAK.
So, there is possibility that we miss doing resched_cpu().
Correct it as changing position of resched_cpu()
before checking LBF_NEED_BREAK.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: Jason Low <jason.low2@hp.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366705662-3587-2-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vincent Guittot [Thu, 18 Apr 2013 16:34:26 +0000 (18:34 +0200)]
sched: Fix wrong rq's runnable_avg update with rt tasks
The current update of the rq's load can be erroneous when RT
tasks are involved.
The update of the load of a rq that becomes idle, is done only
if the avg_idle is less than sysctl_sched_migration_cost. If RT
tasks and short idle duration alternate, the runnable_avg will
not be updated correctly and the time will be accounted as idle
time when a CFS task wakes up.
A new idle_enter function is called when the next task is the
idle function so the elapsed time will be accounted as run time
in the load of the rq, whatever the average idle time is. The
function update_rq_runnable_avg is removed from idle_balance.
When a RT task is scheduled on an idle CPU, the update of the
rq's load is not done when the rq exit idle state because CFS's
functions are not called. Then, the idle_balance, which is
called just before entering the idle function, updates the rq's
load and makes the assumption that the elapsed time since the
last update, was only running time.
As a consequence, the rq's load of a CPU that only runs a
periodic RT task, is close to LOAD_AVG_MAX whatever the running
duration of the RT task is.
A new idle_exit function is called when the prev task is the
idle function so the elapsed time will be accounted as idle time
in the rq's load.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: linaro-kernel@lists.linaro.org
Cc: peterz@infradead.org
Cc: pjt@google.com
Cc: fweisbec@gmail.com
Cc: efault@gmx.de
Link: http://lkml.kernel.org/r/1366302867-5055-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andrei Epure [Thu, 11 Apr 2013 17:30:29 +0000 (20:30 +0300)]
sched: Document task_struct::personality field
Signed-off-by: Andrei Epure <epure.andrei@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1365701429-4721-1-git-send-email-epure.andrei@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 10 Apr 2013 13:10:50 +0000 (15:10 +0200)]
sched/cpuacct/UML: Fix header file dependency bug on the UML build
The cpuacct split caused this build failure on UML:
kernel/sched/cpuacct.c:94:2: error: implicit declaration of function 'ERR_PTR'
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:44:42 +0000 (14:44 +0800)]
cgroup: Kill subsys.active flag
The only user was cpuacct.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5155385A.4040207@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:44:28 +0000 (14:44 +0800)]
sched/cpuacct: No need to check subsys active state
Now we're guaranteed when cpuacct_charge() and
cpuacct_account_field() are called, cpuacct has already been
properly initialized, so we no longer need those checks.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5155384C.7000508@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:44:15 +0000 (14:44 +0800)]
sched/cpuacct: Initialize cpuacct subsystem earlier
Initialize cpuacct before the scheduler is functioning, so when
cpuacct_charge() and cpuacct_account_field() are called,
task_ca() won't return NULL.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5155383F.8000005@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:44:04 +0000 (14:44 +0800)]
sched/cpuacct: Initialize root cpuacct earlier
Now we don't need cpuacct_init(), and instead we just initialize
root_cpuacct when it's defined.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51553834.9090701@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:43:46 +0000 (14:43 +0800)]
sched/cpuacct: Allocate per_cpu cpuusage for root cpuacct statically
This is a preparation, so later we can initialize cpuacct
earlier.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51553822.5000403@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:38:13 +0000 (14:38 +0800)]
sched/cpuacct: Clean up cpuacct.h
Now most of the code in cpuacct.h can be moved to cpuacct.c
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/515536D5.2080401@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:37:43 +0000 (14:37 +0800)]
sched/cpuacct: Remove redundant NULL checks in cpuacct_acount_field()
This is a micro optimazation for a hot path.
- We don't need to check if @ca returned from task_ca() is NULL.
- We don't need to check if @ca returned from parent_ca() is NULL.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/515536B7.6060602@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:37:29 +0000 (14:37 +0800)]
sched/cpuacct: Remove redundant NULL checks in cpuacct_charge()
This is a micro optimization for the hot path.
- We don't need to check if @ca is NULL in parent_ca().
- We don't need to check if @ca is NULL in the beginning of the for loop.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/515536A9.5000700@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:37:06 +0000 (14:37 +0800)]
sched/cpuacct: Add cpuacct_acount_field()
So we can remove open-coded cpuacct code in cputime.c.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51553692.9060008@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:36:55 +0000 (14:36 +0800)]
sched/cpuacct: Add cpuacct_init()
So we don't open-coded initialization of cpuacct in core.c.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51553687.1060906@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:36:43 +0000 (14:36 +0800)]
sched: Split cpuacct code out of sched.h
Add cpuacct.h and let sched.h include it.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5155367B.2060506@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Fri, 29 Mar 2013 06:36:31 +0000 (14:36 +0800)]
sched: Split cpuacct code out of core.c
Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5155366F.5060404@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Libin [Mon, 1 Apr 2013 11:14:01 +0000 (19:14 +0800)]
sched: Fix comment in rebalance_domains()
A comment in function rebalance_domains() mentions
arch_init_sched_domains(), but that function does not exist
anymore. The proper function is init_sched_domains().
Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1364814841-49156-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Zhang Hang [Wed, 10 Apr 2013 06:04:55 +0000 (14:04 +0800)]
sched: Simplify can_migrate_task()
At this point tsk_cache_hot is always true, so no need to check it.
Signed-off-by: Zhang Hang <bob.zhanghang@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51650107.9040606@huawei.com
[ Also remove unnecessary schedstat #ifdefs. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Viresh Kumar [Fri, 5 Apr 2013 10:56:46 +0000 (16:26 +0530)]
sched: Fix typo inside comment
Fix typo:
sched_domains_nume_distance ->
sched_domains_numa_distance
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linaro-kernel@lists.linaro.org
Cc: patches@linaro.org
Cc: robin.randhawa@arm.com
Cc: Steve.Bannister@arm.com
Cc: Liviu.Dudau@arm.com
Cc: charles.garcia-tobin@arm.com
Cc: arvind.chauhan@arm.com
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/cd8084746ac932106d6fa6be388b8f2d6aa9617c.1365159023.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2013 09:48:39 +0000 (10:48 +0100)]
sched/tracing: Allow tracing the preemption decision on wakeup
Thomas noted that we do the wakeup preemption check after the
wakeup trace point, this means the tracepoint cannot test/report
this decision; which is rather important for latency sensitive
workloads. Therefore move the tracepoint after doing the
preemption check.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1363254519.26965.9.camel@laptop
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 18 Mar 2013 09:09:31 +0000 (10:09 +0100)]
Merge branch 'sched/core' of git://git./linux/kernel/git/frederic/linux-dynticks into sched/core
Pull CPU runtime stats/accounting fixes from Frederic Weisbecker:
" Some users are complaining that their threadgroup's runtime accounting
freezes after a week or so of intense cpu-bound workload. This set tries
to fix the issue by reducing the risk of multiplication overflow in the
cputime scaling code. "
Stanislaw Gruszka further explained the historic context and impact of the
bug:
" Commit
0cf55e1ec08bb5a22e068309e2d8ba1180ab4239 start to use scalling
for whole thread group, so increase chances of hitting multiplication
overflow, depending on how many CPUs are on the system.
We have multiplication utime * rtime for one thread since commit
b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa.
Overflow will happen after:
rtime * utime > 0xffffffffffffffff jiffies
if thread utilize 100% of CPU time, that gives:
rtime > sqrt(0xffffffffffffffff) jiffies
ritme > sqrt(0xffffffffffffffff) / (24 * 60 * 60 * HZ) days
For HZ 100 it will be 497 days for HZ 1000 it will be 49 days.
Bug affect only users, who run CPU intensive application for that
long period. Also they have to be interested on utime,stime values,
as bug has no other visible effect as making those values incorrect. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andrei Epure [Tue, 12 Mar 2013 19:12:24 +0000 (21:12 +0200)]
sched: Fix variable name misnomer, add comments
The min_vruntime variable actually stores the maximum value.
The added comment was taken from place_entity function.
Signed-off-by: Andrei Epure <epure.andrei@gmail.com>
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/1363115544-1964-1-git-send-email-epure.andrei@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Frederic Weisbecker [Wed, 20 Feb 2013 17:54:55 +0000 (18:54 +0100)]
sched: Lower chances of cputime scaling overflow
Some users have reported that after running a process with
hundreds of threads on intensive CPU-bound loads, the cputime
of the group started to freeze after a few days.
This is due to how we scale the tick-based cputime against
the scheduler precise execution time value.
We add the values of all threads in the group and we multiply
that against the sum of the scheduler exec runtime of the whole
group.
This easily overflows after a few days/weeks of execution.
A proposed solution to solve this was to compute that multiplication
on stime instead of utime:
62188451f0d63add7ad0cd2a1ae269d600c1663d
("cputime: Avoid multiplication overflow on utime scaling")
The rationale behind that was that it's easy for a thread to
spend most of its time in userspace under intensive CPU-bound workload
but it's much harder to do CPU-bound intensive long run in the kernel.
This postulate got defeated when a user recently reported he was still
seeing cputime freezes after the above patch. The workload that
triggers this issue relates to intensive networking workloads where
most of the cputime is consumed in the kernel.
To reduce much more the opportunities for multiplication overflow,
lets reduce the multiplication factors to the remainders of the division
between sched exec runtime and cputime. Assuming the difference between
these shouldn't ever be that large, it could work on many situations.
This gets the same results as in the upstream scaling code except for
a small difference: the upstream code always rounds the results to
the nearest integer not greater to what would be the precise result.
The new code rounds to the nearest integer either greater or not
greater. In practice this difference probably shouldn't matter but
it's worth mentioning.
If this solution appears not to be enough in the end, we'll
need to partly revert back to the behaviour prior to commit
0cf55e1ec08bb5a22e068309e2d8ba1180ab4239
("sched, cputime: Introduce thread_group_times()")
Back then, the scaling was done on exit() time before adding the cputime
of an exiting thread to the signal struct. And then we'll need to
scale one-by-one the live threads cputime in thread_group_cputime(). The
drawback may be a slightly slower code on exit time.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Frederic Weisbecker [Tue, 5 Mar 2013 17:05:46 +0000 (18:05 +0100)]
math64: New div64_u64_rem helper
Provide an extended version of div64_u64() that
also returns the remainder of the division.
We are going to need this to refine the cputime
scaling code.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Andrei Epure [Mon, 11 Mar 2013 10:03:20 +0000 (12:03 +0200)]
sched: Spelling fix
Signed-off-by: Andrei Epure <epure.andrei@gmail.com>
Cc: trivial@kernel.org
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/1362996200-2674-1-git-send-email-epure.andrei@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Thu, 7 Mar 2013 02:00:26 +0000 (10:00 +0800)]
sched: Fix update_group_power() prototype placement to fix build warning when !CONFIG_SMP
All warnings:
In file included from kernel/sched/core.c:85:0:
kernel/sched/sched.h:1036:39: warning: 'struct sched_domain' declared inside parameter list
kernel/sched/sched.h:1036:39: warning: its scope is only this definition or declaration, which is probably not what you want
It's because struct sched_domain is defined inside #if CONFIG_SMP,
while update_group_power() is declared unconditionally.
Fix this warning by declaring update_group_power() only if
CONFIG_SMP=n.
Build tested with CONFIG_SMP enabled and then disabled.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5137F4BA.2060101@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 8 Mar 2013 15:41:22 +0000 (16:41 +0100)]
Merge branch 'sched/cputime' of git://git./linux/kernel/git/frederic/linux-dynticks into sched/core
Pull cputime changes from Frederic Weisbecker:
* Generalize exception handling
* Fix race in context tracking state restore on return from exception
and irq exit kernel preemption
* Fix cputime scaling in full dynticks accounting dynamic off-case
* Fix default Kconfig value
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Frederic Weisbecker [Tue, 26 Feb 2013 14:37:59 +0000 (15:37 +0100)]
context_tracking: Enable probes by default for selftesting
Until we provide the nohz_mask boot parameter, keeping
the context tracking probes disabled by default is pointless
since what we want is to runtime test this code anyway.
It's furthermore confusing for the users which don't expect
the probes to be off when they select RCU user mode or full
dynticks cputime accounting.
Let's enable these probes selftests by default for now.
Suggested: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Frederic Weisbecker [Mon, 25 Feb 2013 16:25:39 +0000 (17:25 +0100)]
cputime: Dynamically scale cputime for full dynticks accounting
The full dynticks cputime accounting is able to account either
using the tick or the context tracking subsystem. This way
the housekeeping CPU can keep the low overhead tick based
solution.
This latter mode has a low jiffies resolution granularity and
need to be scaled against CFS precise runtime accounting to
improve its result. We are doing this for CONFIG_TICK_CPU_ACCOUNTING,
now we also need to expand it to full dynticks accounting dynamic
off-case as well.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Frederic Weisbecker [Sun, 24 Feb 2013 11:59:30 +0000 (12:59 +0100)]
context_tracking: Restore preempted context state after preempt_schedule_irq()
From the context tracking POV, preempt_schedule_irq() behaves pretty much
like an exception: It can be called anytime and schedule another task.
But currently it doesn't restore the context tracking state of the preempted
code on preempt_schedule_irq() return.
As a result, if preempt_schedule_irq() is called in the tiny frame between
user_enter() and the actual return to userspace, we resume userspace with
the wrong context tracking state.
Fix this by using exception_enter/exit() which are a perfect fit for this
kind of issue.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Frederic Weisbecker [Sun, 24 Feb 2013 00:19:14 +0000 (01:19 +0100)]
context_tracking: Restore correct previous context state on exception exit
On exception exit, we restore the previous context tracking state based on
the regs of the interrupted frame. Iff that frame is in user mode as
stated by user_mode() helper, we restore the context tracking user mode.
However there is a tiny chunck of low level arch code after we pass through
user_enter() and until the CPU eventually resumes userspace.
If an exception happens in this tiny area, exception_enter() correctly
exits the context tracking user mode but exception_exit() won't restore
it because of the value returned by user_mode(regs).
As a result we may return to userspace with the wrong context tracking
state.
To fix this, change exception_enter() to return the context tracking state
prior to its call and pass this saved state to exception_exit(). This restores
the real context tracking state of the interrupted frame.
(May be this patch was suggested to me, I don't recall exactly. If so,
sorry for the missing credit).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Frederic Weisbecker [Sat, 23 Feb 2013 23:23:25 +0000 (00:23 +0100)]
context_tracking: Move exception handling to generic code
Exceptions handling on context tracking should share common
treatment: on entry we exit user mode if the exception triggered
in that context. Then on exception exit we return to that previous
context.
Generalize this to avoid duplication across archs.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Li Zefan [Tue, 5 Mar 2013 08:07:52 +0000 (16:07 +0800)]
sched: Remove double declaration of root_task_group
It's already declared in include/linux/sched.h
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A7D8.7000107@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:07:33 +0000 (16:07 +0800)]
sched: Move group scheduling functions out of include/linux/sched.h
- Make sched_group_{set_,}runtime(), sched_group_{set_,}period()
and sched_rt_can_attach() static.
- Move sched_{create,destroy,online,offline}_group() to
kernel/sched/sched.h.
- Remove declaration of sched_group_shares().
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A7C5.3000708@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:07:11 +0000 (16:07 +0800)]
sched: Make default_scale_freq_power() static
As default_scale_{freq,smt}_power() and update_rt_power() are
used in kernel/sched/fair.c only, annotate them as static
functions.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A7AF.8010900@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:06:55 +0000 (16:06 +0800)]
sched: Move struct sched_class to kernel/sched/sched.h
It's used internally only.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A79F.8090502@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:06:38 +0000 (16:06 +0800)]
sched: Move wake flags to kernel/sched/sched.h
They are used internally only.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A78E.7040609@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:06:23 +0000 (16:06 +0800)]
sched: Move struct sched_group to kernel/sched/sched.h
Move struct sched_group_power and sched_group and related inline
functions to kernel/sched/sched.h, as they are used internally
only.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A77F.2010705@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:06:09 +0000 (16:06 +0800)]
sched: Move SCHED_LOAD_SHIFT macros to kernel/sched/sched.h
They are used internally only.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A771.4070104@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:05:51 +0000 (16:05 +0800)]
sched: Remove test_sd_parent()
It's unused.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A75F.4070202@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Li Zefan [Tue, 5 Mar 2013 08:05:28 +0000 (16:05 +0800)]
sched: Remove some dummy functions
No one will call those functions if CONFIG_SCHED_DEBUG=n.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A748.3050206@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sun, 3 Mar 2013 23:11:05 +0000 (15:11 -0800)]
Linux 3.9-rc1
Linus Torvalds [Sun, 3 Mar 2013 22:24:59 +0000 (14:24 -0800)]
Merge tag 'disintegrate-fbdev-
20121220' of git://git.infradead.org/users/dhowells/linux-headers
Pull fbdev UAPI disintegration from David Howells:
"You'll be glad to here that the end is nigh for the UAPI patches.
Only the fbdev/framebuffer piece remains now that the SCSI stuff has
gone in.
Here are the UAPI disintegration bits for the fbdev drivers. It
appears that Florian hasn't had time to deal with my patch, but back
in December he did say he didn't mind if I pushed it forward."
Yay. No more uapi movement. And hopefully no more big header file
cleanups coming up either, it just tends to be very painful.
* tag 'disintegrate-fbdev-
20121220' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/video
Linus Torvalds [Sun, 3 Mar 2013 22:22:53 +0000 (14:22 -0800)]
Merge tag 'stable/for-linus-3.9-rc1-tag' of git://git./linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
- Update the Xen ACPI memory and CPU hotplug locking mechanism.
- Fix PAT issues wherein various applications would not start
- Fix handling of multiple MSI as AHCI now does it.
- Fix ARM compile failures.
* tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus: fix compile failure on ARM with Xen enabled
xen/pci: We don't do multiple MSI's.
xen/pat: Disable PAT using pat_enabled value.
xen/acpi: xen cpu hotplug minor updates
xen/acpi: xen memory hotplug minor updates
Linus Torvalds [Sun, 3 Mar 2013 21:23:02 +0000 (13:23 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull more VFS bits from Al Viro:
"Unfortunately, it looks like xattr series will have to wait until the
next cycle ;-/
This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
more file_inode() work"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
constify path_get/path_put and fs_struct.c stuff
fix nommu breakage in shmem.c
cache the value of file_inode() in struct file
9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
9p: make sure ->lookup() adds fid to the right dentry
9p: untangle ->lookup() a bit
9p: double iput() in ->lookup() if d_materialise_unique() fails
9p: v9fs_fid_add() can't fail now
v9fs: get rid of v9fs_dentry
9p: turn fid->dlist into hlist
9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
more file_inode() open-coded instances
selinux: opened file can't have NULL or negative ->f_path.dentry
(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
Linus Torvalds [Sun, 3 Mar 2013 21:13:20 +0000 (13:13 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixup from Chris Mason:
"Geert and James both sent this one in, sorry guys"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs/raid56: Add missing #include <linux/vmalloc.h>
Linus Torvalds [Sun, 3 Mar 2013 20:58:43 +0000 (12:58 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull second set of s390 patches from Martin Schwidefsky:
"The main part of this merge are Heikos uaccess patches. Together with
commit
09884964335e ("mm: do not grow the stack vma just because of an
overrun on preceding vma") the user string access is hopefully fixed
for good.
In addition some bug fixes and two cleanup patches."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/module: fix compile warning
qdio: remove unused parameters
s390/uaccess: fix kernel ds access for page table walk
s390/uaccess: fix strncpy_from_user string length check
input: disable i8042 PC Keyboard controller for s390
s390/dis: Fix invalid array size
s390/uaccess: remove pointless access_ok() checks
s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case
s390/uaccess: shorten strncpy_from_user/strnlen_user
s390/dasd: fix unresponsive device after all channel paths were lost
s390/mm: ignore change bit for vmemmap
s390/page table dumper: add support for change-recording override bit
Linus Torvalds [Sun, 3 Mar 2013 20:57:38 +0000 (12:57 -0800)]
Merge branch 'fixes-for-3.9-latest' of git://git./linux/kernel/git/deller/parisc-linux
Pull second round of PARISC updates from Helge Deller:
"The most important fix in this branch is the switch of io_setup,
io_getevents and io_submit syscalls to use the available compat
syscalls when running 32bit userspace on 64bit kernel. Other than
that it's mostly removal of compile warnings."
* 'fixes-for-3.9-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix redefinition of SET_PERSONALITY
parisc: do not install modules when installing kernel
parisc: fix compile warnings triggered by atomic_sub(sizeof(),v)
parisc: check return value of down_interruptible() in hp_sdc_rtc.c
parisc: avoid unitialized variable warning in pa_memcpy()
parisc: remove unused variable 'compat_val'
parisc: switch to compat_functions of io_setup, io_getevents and io_submit
parisc: select ARCH_WANT_FRAME_POINTERS
Linus Torvalds [Sun, 3 Mar 2013 20:06:09 +0000 (12:06 -0800)]
Merge tag 'metag-v3.9-rc1-v4' of git://git./linux/kernel/git/jhogan/metag
Pull new ImgTec Meta architecture from James Hogan:
"This adds core architecture support for Imagination's Meta processor
cores, followed by some later miscellaneous arch/metag cleanups and
fixes which I kept separate to ease review:
- Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
- A few fixes all over, particularly for symbol prefixes
- A few privilege protection fixes
- Several cleanups (setup.c includes, split out a lot of
metag_ksyms.c)
- Fix some missing exports
- Convert hugetlb to use vm_unmapped_area()
- Copy device tree to non-init memory
- Provide dma_get_sgtable()"
* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
metag: Provide dma_get_sgtable()
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
metag: copy devicetree to non-init memory
metag: cleanup metag_ksyms.c includes
metag: move mm/init.c exports out of metag_ksyms.c
metag: move usercopy.c exports out of metag_ksyms.c
metag: move setup.c exports out of metag_ksyms.c
metag: move kick.c exports out of metag_ksyms.c
metag: move traps.c exports out of metag_ksyms.c
metag: move irq enable out of irqflags.h on SMP
genksyms: fix metag symbol prefix on crc symbols
metag: hugetlb: convert to vm_unmapped_area()
metag: export clear_page and copy_page
metag: export metag_code_cache_flush_all
metag: protect more non-MMU memory regions
metag: make TXPRIVEXT bits explicit
metag: kernel/setup.c: sort includes
perf: Enable building perf tools for Meta
metag: add boot time LNKGET/LNKSET check
metag: add __init to metag_cache_probe()
...
Linus Torvalds [Sun, 3 Mar 2013 19:54:39 +0000 (11:54 -0800)]
Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull late ARM updates from Russell King:
"Here is the late set of ARM updates for this merge window; in here is:
- The ARM parts of the broadcast timer support, core parts merged
through tglx's tree. This was left over from the previous merge to
allow the dependency on tglx's tree to be resolved.
- A fix to the VFP code which shows up on Raspberry Pi's, as well as
fixing the fallout from a previous commit in this area.
- A number of smaller fixes scattered throughout the ARM tree"
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm:
ARM: Fix broken commit
0cc41e4a21d43 corrupting kernel messages
ARM: fix scheduling while atomic warning in alignment handling code
ARM: VFP: fix emulation of second VFP instruction
ARM: 7656/1: uImage: Error out on build of multiplatform without LOADADDR
ARM: 7640/1: memory: tegra_ahb_enable_smmu() depends on TEGRA_IOMMU_SMMU
ARM: 7654/1: Preserve L_PTE_VALID in pte_modify()
ARM: 7653/2: do not scale loops_per_jiffy when using a constant delay clock
ARM: 7651/1: remove unused smp_timer_broadcast #define
Linus Torvalds [Sun, 3 Mar 2013 18:25:47 +0000 (10:25 -0800)]
Merge tag 'char-misc-3.9-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc patch from Greg Kroah-Hartman:
"Here is one remaining patch for 3.9-rc1. It is for the hyper-v
drivers, and had to wait until some other patches went in through the
x86 tree."
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'char-misc-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Use the new infrastructure for delivering VMBUS interrupts
Linus Torvalds [Sun, 3 Mar 2013 18:24:57 +0000 (10:24 -0800)]
Merge tag 'usb-3.9-rc1' of git://git./linux/kernel/git/gregkh/usb
Pull USB patch revert from Greg Kroah-Hartman:
"Here is one remaining USB patch for 3.9-rc1, it reverts a 3.8 patch
that has caused a lot of regressions for some VIA EHCI controllers."
* tag 'usb-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: EHCI: revert "remove ASS/PSS polling timeout"
Linus Torvalds [Sun, 3 Mar 2013 18:23:29 +0000 (10:23 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
"This contains:
- fixes and improvements
- devicetree bindings
- conversion to watchdog generic framework of the following drivers:
- booke_wdt
- bcm47xx_wdt.c
- at91sam9_wdt
- Removal of old STMP3xxx driver
- Addition of following new drivers:
- new driver for STMP3xxx and i.MX23/28
- Retu watchdog driver"
* git://www.linux-watchdog.org/linux-watchdog: (30 commits)
watchdog: sp805_wdt depends on ARM
watchdog: davinci_wdt: update to devm_* API
watchdog: davinci_wdt: use devm managed clk get
watchdog: at91rm9200: add DT support
watchdog: add timeout-sec property binding
watchdog: at91sam9_wdt: Convert to use the watchdog framework
watchdog: omap_wdt: Add option nowayout
watchdog: core: dt: add support for the timeout-sec dt property
watchdog: bcm47xx_wdt.c: add hard timer
watchdog: bcm47xx_wdt.c: rename wdt_time to timeout
watchdog: bcm47xx_wdt.c: rename ops methods
watchdog: bcm47xx_wdt.c: use platform device
watchdog: bcm47xx_wdt.c: convert to watchdog core api
watchdog: Convert BookE watchdog driver to watchdog infrastructure
watchdog: s3c2410_wdt: Use devm_* functions
watchdog: remove old STMP3xxx driver
watchdog: add new driver for STMP3xxx and i.MX23/28
rtc: stmp3xxx: add wdt-accessor function
watchdog: introduce retu_wdt driver
watchdog: intel_scu_watchdog: fix Kconfig dependency
...
Linus Torvalds [Sun, 3 Mar 2013 18:20:22 +0000 (10:20 -0800)]
Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
Pull second set of slave-dmaengine updates from Vinod Koul:
"Arnd's patch moves the dw_dmac to use generic DMA binding. I agreed
to merge this late as it will avoid the conflicts between trees.
The second patch from Matt adding a dma_request_slave_channel_compat
API was supposed to be picked up, but somehow never got picked up.
Some patches dependent on this are already in -next :("
* 'next' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: dw_dmac: move to generic DMA binding
dmaengine: add dma_request_slave_channel_compat()
Linus Torvalds [Sun, 3 Mar 2013 18:16:19 +0000 (10:16 -0800)]
Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform driver updates from Matthew Garrett:
"Mostly relatively small updates, along with some hardware enablement
for Sony hardware and a pile of updates to Google's Chromebook driver"
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86: (49 commits)
ideapad-laptop: Depend on BACKLIGHT_CLASS_DEVICE instead of selecting it
ideapad: depends on backlight subsystem and update comment
Platform: x86: chromeos_laptop - add i915 gmbuses to adapter names
Platform: x86: chromeos_laptop - Add isl light sensor for Pixel
Platform: x86: chromeos_laptop - Add a more general add_i2c_device
Platform: x86: chromeos_laptop - Add Pixel Touchscreen
Platform: x86: chromeos_laptop - Add support for probing devices
Platform: x86: chromeos_laptop - Add Pixel Trackpad
hp-wmi: fix handling of platform device
sony-laptop: leak in error handling sony_nc_lid_resume_setup()
hp-wmi: Add support for SMBus hotkeys
asus-wmi: Fix unused function build warning
acer-wmi: avoid the warning of 'devices' may be used uninitialized
drivers/platform/x86/thinkpad_acpi.c: Handle HKEY event 0x6040
Platform: x86: chromeos_laptop - Add HP Pavilion 14
Platform: x86: chromeos_laptop - Add Taos tsl2583 device
Platform: x86: chromeos_laptop - Add Taos tsl2563 device
Platform: x86: chromeos_laptop - Add Acer C7 trackpad
Platform: x86: chromeos_laptop - Rename setup_lumpy_tp to setup_cyapa_smbus_tp
asus-laptop: always report brightness key events
...
Geert Uytterhoeven [Sun, 3 Mar 2013 11:44:41 +0000 (04:44 -0700)]
btrfs/raid56: Add missing #include <linux/vmalloc.h>
tilegx_defconfig:
fs/btrfs/raid56.c: In function 'btrfs_alloc_stripe_hash_table':
fs/btrfs/raid56.c:206:3: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration]
fs/btrfs/raid56.c:206:9: warning: assignment makes pointer from integer without a cast [enabled by default]
fs/btrfs/raid56.c:226:4: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Linus Torvalds [Sun, 3 Mar 2013 03:33:21 +0000 (19:33 -0800)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bug fixes from Ted Ts'o:
"Various bug fixes for ext4. The most important is a fix for the new
extent cache's slab shrinker which can cause significant, user-visible
pauses when the system is under memory pressure."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: enable quotas before orphan cleanup
ext4: don't allow quota mount options when quota feature enabled
ext4: fix a warning from sparse check for ext4_dir_llseek
ext4: convert number of blocks to clusters properly
ext4: fix possible memory leak in ext4_remount()
jbd2: fix ERR_PTR dereference in jbd2__journal_start
ext4: use percpu counter for extent cache count
ext4: optimize ext4_es_shrink()
Linus Torvalds [Sun, 3 Mar 2013 03:32:06 +0000 (19:32 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/signal
Pull sigprocmask compat fix from Al Viro:
"generic compat_sys_rt_sigprocmask() had a very dumb braino; I'd spent
quite a while staring at the offending commit before finally managing
to spot the idiocy ;-/"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
fix compat_sys_rt_sigprocmask()
Al Viro [Sun, 3 Mar 2013 01:39:15 +0000 (20:39 -0500)]
fix compat_sys_rt_sigprocmask()
Converting bitmask to 32bit granularity is fine, but we'd better
_do_ something with the result. Such as "copy it to userland"...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sun, 3 Mar 2013 00:46:07 +0000 (16:46 -0800)]
Merge tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"We've just concluded another Connectathon interoperability testing
week, and so here are the fixes for the bugs that were discovered:
- Don't allow NFS silly-renamed files to be deleted
- Don't start the retransmission timer when out of socket space
- Fix a couple of pnfs-related Oopses.
- Fix one more NFSv4 state recovery deadlock
- Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"
* tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: One line comment fix
NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
SUNRPC: add call to get configured timeout
PNFS: set the default DS timeout to 60 seconds
NFSv4: Fix another open/open_recovery deadlock
nfs: don't allow nfs_find_actor to match inodes of the wrong type
NFSv4.1: Hold reference to layout hdr in layoutget
pnfs: fix resend_to_mds for directio
SUNRPC: Don't start the retransmission timer when out of socket space
NFS: Don't allow NFS silly-renamed files to be deleted, no signal
Linus Torvalds [Sun, 3 Mar 2013 00:41:54 +0000 (16:41 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
"The biggest feature in the pull is the new (and still experimental)
raid56 code that David Woodhouse started long ago. I'm still working
on the parity logging setup that will avoid inconsistent parity after
a crash, so this is only for testing right now. But, I'd really like
to get it out to a broader audience to hammer out any performance
issues or other problems.
scrub does not yet correct errors on raid5/6 either.
Josef has another pass at fsync performance. The big change here is
to combine waiting for metadata with waiting for data, which is a big
latency win. It is also step one toward using atomics from the
hardware during a commit.
Mark Fasheh has a new way to use btrfs send/receive to send only the
metadata changes. SUSE is using this to make snapper more efficient
at finding changes between snapshosts.
Snapshot-aware defrag is also included.
Otherwise we have a large number of fixes and cleanups. Eric Sandeen
wins the award for removing the most lines, and I'm hoping we steal
this idea from XFS over and over again."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (118 commits)
btrfs: fixup/remove module.h usage as required
Btrfs: delete inline extents when we find them during logging
btrfs: try harder to allocate raid56 stripe cache
Btrfs: cleanup to make the function btrfs_delalloc_reserve_metadata more logic
Btrfs: don't call btrfs_qgroup_free if just btrfs_qgroup_reserve fails
Btrfs: remove reduplicate check about root in the function btrfs_clean_quota_tree
Btrfs: return ENOMEM rather than use BUG_ON when btrfs_alloc_path fails
Btrfs: fix missing deleted items in btrfs_clean_quota_tree
btrfs: use only inline_pages from extent buffer
Btrfs: fix wrong reserved space when deleting a snapshot/subvolume
Btrfs: fix wrong reserved space in qgroup during snap/subv creation
Btrfs: remove unnecessary dget_parent/dput when creating the pending snapshot
btrfs: remove a printk from scan_one_device
Btrfs: fix NULL pointer after aborting a transaction
Btrfs: fix memory leak of log roots
Btrfs: copy everything if we've created an inline extent
btrfs: cleanup for open-coded alignment
Btrfs: do not change inode flags in rename
Btrfs: use reserved space for creating a snapshot
clear chunk_alloc flag on retryable failure
...
Linus Torvalds [Sun, 3 Mar 2013 00:33:54 +0000 (16:33 -0800)]
Merge tag 'for-linus-
20130301' of git://git.infradead.org/linux-mtd
Pull MTD update from David Woodhouse:
"Fairly unexciting MTD merge for 3.9:
- misc clean-ups in the MTD command-line partitioning parser
(cmdlinepart)
- add flash locking support for STmicro chips serial flash chips, as
well as for CFI command set 2 chips.
- new driver for the ELM error correction HW module found in various
TI chips, enable the OMAP NAND driver to use the ELM HW error
correction
- added number of new serial flash IDs
- various fixes and improvements in the gpmi NAND driver
- bcm47xx NAND driver improvements
- make the mtdpart module actually removable"
* tag 'for-linus-
20130301' of git://git.infradead.org/linux-mtd: (45 commits)
mtd: map: BUG() in non handled cases
mtd: bcm47xxnflash: use pr_fmt for module prefix in messages
mtd: davinci_nand: Use managed resources
mtd: mtd_torturetest can cause stack overflows
mtd: physmap_of: Convert device allocation to managed devm_kzalloc()
mtd: at91: atmel_nand: for PMECC, add code to check the ONFI parameter ECC requirement.
mtd: atmel_nand: make pmecc-cap, pmecc-sector-size in dts is optional.
mtd: atmel_nand: avoid to report an error when lookup table offset is 0.
mtd: bcm47xxsflash: adjust names of bus-specific functions
mtd: bcm47xxpart: improve probing of nvram partition
mtd: bcm47xxpart: add support for other erase sizes
mtd: bcm47xxnflash: register this as normal driver
mtd: bcm47xxnflash: fix message
mtd: bcm47xxsflash: register this as normal driver
mtd: bcm47xxsflash: write number of written bytes
mtd: gpmi: add sanity check for the ECC
mtd: gpmi: set the Golois Field bit for mx6q's BCH
mtd: devices: elm: Removes <xx> literals in elm DT node
mtd: gpmi: fix a dereferencing freed memory error
mtd: fix the wrong timeo for panic_nand_wait()
...
Russell King [Sun, 3 Mar 2013 00:32:50 +0000 (00:32 +0000)]
Merge branches 'devel-stable', 'fixes' and 'mmci' into for-linus
Trond Myklebust [Sat, 2 Mar 2013 23:54:11 +0000 (15:54 -0800)]
SUNRPC: One line comment fix
Reported-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jan Kara [Sat, 2 Mar 2013 23:22:38 +0000 (18:22 -0500)]
ext4: enable quotas before orphan cleanup
When using quota feature we need to enable quotas before orphan cleanup
so that changes happening during it are properly reflected in quota
accounting.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Jan Kara [Sat, 2 Mar 2013 22:57:08 +0000 (17:57 -0500)]
ext4: don't allow quota mount options when quota feature enabled
So far we silently ignored when quota mount options were set while quota
feature was enabled. But this can create confusion in userspace when
mount options are set but silently ignored and also creates opportunities
for bugs when we don't properly test all quota types. Actually
ext4_mark_dquot_dirty() forgets to test for quota feature so it was
dependent on journaled quota options being set. OTOH ext4_orphan_cleanup()
tries to enable journaled quota when quota options are specified which is
wrong when quota feature is enabled.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Zheng Liu [Sat, 2 Mar 2013 22:24:05 +0000 (17:24 -0500)]
ext4: fix a warning from sparse check for ext4_dir_llseek
ext4_dir_llseek is only used as a callback function, and no one calls
it directly. So make it as a static function in order to remove a
warning message from sparse check.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Lukas Czerner [Sat, 2 Mar 2013 22:18:58 +0000 (17:18 -0500)]
ext4: convert number of blocks to clusters properly
We're using macro EXT4_B2C() to convert number of blocks to number of
clusters for bigalloc file systems. However, we should be using
EXT4_NUM_B2C().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
Wei Yongjun [Sat, 2 Mar 2013 22:13:55 +0000 (17:13 -0500)]
ext4: fix possible memory leak in ext4_remount()
'orig_data' is malloced in ext4_remount() and should be freed
before leaving from the error handling cases, otherwise it will
cause memory leak.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org
Dmitry Monakhov [Sat, 2 Mar 2013 22:08:46 +0000 (17:08 -0500)]
jbd2: fix ERR_PTR dereference in jbd2__journal_start
If start_this_handle() failed handle will be initialized
to ERR_PTR() and can not be dereferenced.
paging request at
fffffffffffffff6
IP: [<
ffffffff813c073f>] jbd2__journal_start+0x18f/0x290
PGD 200e067 PUD 200f067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
CPU 0 journal commit I/O error
Pid: 2694, comm: fio Not tainted 3.8.0-rc3+ #79 /DQ67SW
RIP: 0010:[<
ffffffff813c073f>] [<
ffffffff813c073f>] jbd2__journal_start+0x18f/0x290
RSP: 0018:
ffff880233b8ba58 EFLAGS:
00010292
RAX:
00000000ffffffe2 RBX:
ffffffffffffffe2 RCX:
0000000000000006
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
ffffffff82128f48
RBP:
ffff880233b8ba98 R08:
0000000000000000 R09:
ffff88021440a6e0
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
James Hogan [Tue, 19 Feb 2013 13:25:46 +0000 (13:25 +0000)]
metag: Provide dma_get_sgtable()
metag/allmodconfig:
drivers/media/v4l2-core/videobuf2-dma-contig.c: In function 'vb2_dc_get_base_sgt':
drivers/media/v4l2-core/videobuf2-dma-contig.c:387: error: implicit declaration of function 'dma_get_sgtable'
For architectures using dma_map_ops, dma_get_sgtable() is provided in
<asm-generic/dma-mapping-common.h>.
Metag does not use dma_map_ops yet, hence it should implement it as an
inline stub using dma_common_get_sgtable().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
James Hogan [Wed, 20 Feb 2013 14:19:54 +0000 (14:19 +0000)]
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
Metag doesn't have a metag_dt_memblock_reserve() function so remove the
declaration from asm/prom.h.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 20 Feb 2013 13:59:13 +0000 (13:59 +0000)]
metag: copy devicetree to non-init memory
Make a copy of the device tree blob in non-init memory. It is required
when using built-in device tree files that the platform code copies the
blob to non-init memory prior to calling unflatten_device_tree(),
otherwise the strings that the device tree refer to will get poisoned
and potentially reused, breaking later reading of the device tree
post-init (such as compatible matching in modules, debugfs, and the
procfs interface).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Vineet Gupta <vgupta@synopsys.com>
James Hogan [Wed, 13 Feb 2013 13:30:15 +0000 (13:30 +0000)]
metag: cleanup metag_ksyms.c includes
Minimise metag_ksyms.c includes to directly include the <asm/*.h> files
that declare a particular symbol, and not include any unnecessary ones.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 13:19:15 +0000 (13:19 +0000)]
metag: move mm/init.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in mm/init.c into mm/init.c.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 12:57:10 +0000 (12:57 +0000)]
metag: move usercopy.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in usercopy.c into usercopy.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 12:55:51 +0000 (12:55 +0000)]
metag: move setup.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in setup.c into setup.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 13:01:55 +0000 (13:01 +0000)]
metag: move kick.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in kick.c into kick.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Wed, 13 Feb 2013 12:19:03 +0000 (12:19 +0000)]
metag: move traps.c exports out of metag_ksyms.c
It's less error prone to have function symbols exported immediately
after the function rather than in metag_ksyms.c. Move each EXPORT_SYMBOL
in metag_ksyms.c for symbols defined in traps.c into traps.c
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Tue, 12 Feb 2013 16:04:53 +0000 (16:04 +0000)]
metag: move irq enable out of irqflags.h on SMP
The SMP version of arch_local_irq_enable() uses preempt_disable(), but
<asm/irqflags.h> doesn't include <linux/preempt.h> causing the following
errors on SMP when pstore/ftrace is enabled (caught by buildbot smp
allyesconfig):
In file included from include/linux/irqflags.h:15,
from fs/pstore/ftrace.c:16:
arch/metag/include/asm/irqflags.h: In function 'arch_local_irq_enable':
arch/metag/include/asm/irqflags.h:84: error: implicit declaration of function 'preempt_disable'
arch/metag/include/asm/irqflags.h:86: error: implicit declaration of function 'preempt_enable_no_resched'
However <linux/preempt.h> cannot be easily included from
<asm/irqflags.h> as it can cause circular include dependencies in the
!SMP case, and potentially in the SMP case in the future. Therefore move
the SMP implementation of arch_local_irq_enable() into traps.c and use
an inline version of get_trigger_mask() which is also defined in traps.c
for SMP.
This adds an extra layer of function call / stack push when
preempt_disable needs to call other functions, however in the
non-preemptive SMP case it should be about as fast, as it was already
calling the get_trigger_mask() function which is now used inline.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Mon, 11 Feb 2013 15:40:24 +0000 (15:40 +0000)]
genksyms: fix metag symbol prefix on crc symbols
Meta uses symbol prefixes, so add "metag" to the list of architectures
to set the mod_prefix to "_" for. This fixes __crc_* symbols to add the
extra underscore to match _CRC_SYMBOL macro in <linux/export.h> and so
that modpost finds them.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Mon, 11 Feb 2013 17:28:10 +0000 (17:28 +0000)]
metag: hugetlb: convert to vm_unmapped_area()
Convert hugetlb_get_unmapped_area_new_pmd() to use vm_unmapped_area()
rather than searching the virtual address space itself. This fixes the
following errors in linux-next due to the specified members being
removed after other architectures have already been converted:
arch/metag/mm/hugetlbpage.c: In function 'hugetlb_get_unmapped_area_new_pmd':
arch/metag/mm/hugetlbpage.c:199: error: 'struct mm_struct' has no member named 'cached_hole_size'
arch/metag/mm/hugetlbpage.c:200: error: 'struct mm_struct' has no member named 'free_area_cache'
arch/metag/mm/hugetlbpage.c:215: error: 'struct mm_struct' has no member named 'cached_hole_size'
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Michel Lespinasse <walken@google.com>
James Hogan [Mon, 11 Feb 2013 11:47:02 +0000 (11:47 +0000)]
metag: export clear_page and copy_page
Various file systems use clear_page() and copy_page(), so when they're
built as modules we get build errors like the following:
ERROR: "clear_page" [fs/ntfs/ntfs.ko] undefined!
ERROR: "copy_page" [fs/nilfs2/nilfs2.ko] undefined!
Therefore export these functions to modules from metag_ksyms.c to fix
the errors. This was hit by a randconfig build.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Mon, 11 Feb 2013 11:43:20 +0000 (11:43 +0000)]
metag: export metag_code_cache_flush_all
Various file systems indirectly use metag_code_cache_flush_all(), so
when they're built as modules we get build errors like the following:
ERROR: "metag_code_cache_flush_all" [fs/xfs/xfs.ko] undefined!
Therefore export this function to modules to fix the errors. This was
hit by a randconfig build.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 13:42:03 +0000 (13:42 +0000)]
metag: protect more non-MMU memory regions
Rename setup_txprivext() to setup_priv() and add initialisation of some
more per-thread privilege protection registers:
- TxPRIVSYSR: 0x04400000-0x047fffff
0x05000000-0x07ffffff
0x84000000-0x87ffffff
- TxPIOREG: 0x02000000-0x02ffffff
0x04800000-0x048fffff
- TxSYREG: 0x04000000-0x04000fff (except write fetch system event)
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 13:27:35 +0000 (13:27 +0000)]
metag: make TXPRIVEXT bits explicit
Define PRIV_BITS using explicit constants from <asm/metag_regs.h> rather
than with a hard coded value. This also adds a couple of missing
definitions for the TXPRIVEXT priv bits for protecting writes to TXTIMER
and the trace registers.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 13:38:48 +0000 (13:38 +0000)]
metag: kernel/setup.c: sort includes
Sort includes in kernel/setup.c.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 12:22:37 +0000 (12:22 +0000)]
perf: Enable building perf tools for Meta
Define rmb(), cpu_relax(), and CPUINFO_PROC for Meta so that the perf
tools can be built for Meta.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
James Hogan [Thu, 31 Jan 2013 11:06:03 +0000 (11:06 +0000)]
metag: add boot time LNKGET/LNKSET check
Add boot time check for whether LNKGET/LNKSET go through or around the
cache. Depending on the configuration an info message (no harm), warning
(technically wrong but no harm), or big WARN (expect failure in either
kernel or userland) may be emitted if the behaviour is not as expected:
Configuration Hardware Response
------------------------------------------ -------- --------
AROUND_CACHE through pr_info
!AROUND_CACHE && ATOMICITY_LNKGET around WARN (kernel)
" && !ATOMICITY_LNKGET && SMP around WARN (user)
" " && !SMP around pr_warn
Signed-off-by: James Hogan <james.hogan@imgtec.com>
James Hogan [Thu, 31 Jan 2013 11:04:49 +0000 (11:04 +0000)]
metag: add __init to metag_cache_probe()
metag_cache_probe() is only called from setup_arch(), so add the __init
attribute to it.
Signed-off-by: James Hogan <james.hogan@imgtec.com>