Chanwoo Choi [Thu, 2 Apr 2015 10:33:11 +0000 (19:33 +0900)]
LOCAL / arm64: configs: Update defconfig to enable CONFIG_ARM_EXYNOS_BUS_DEVFREQ
This patch enables CONFIG_ARM_EXYNOS_BUS_DEVFREQ to support memory bus
frequency for Exynos5433 SoC.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Mon, 2 Feb 2015 08:04:24 +0000 (17:04 +0900)]
LOCAL / arm64: dts: Add the support for memory busfreq on Exynos5433-based tm2 board
This patch adds the memory and PPMU dt node to support the generic exynos
memory bus frequency driver by using DEVFREQ / DEVFREQ-Event framework.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Tue, 31 Mar 2015 05:05:05 +0000 (14:05 +0900)]
arm64: dts: exynos: Add memory bus node for Exynos5433 SoC
This patch adds the memory bus node for Exynos5433 SoC. Exynos5433 SoC has
four memory buses to translate data between DRAM and MIF/INT/ISP/DISPLAY IPs.
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Tue, 31 Mar 2015 05:04:27 +0000 (14:04 +0900)]
arm64: dts: exynos: Add PPMU dt node for Exynos5433 SoC
This patch adds PPMU (Platform Performance Monitoring Unit) dt node to get
current usage of sub-IPs in Exynos5433 SoC.
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Wed, 31 Dec 2014 00:24:01 +0000 (09:24 +0900)]
ARM: dts: Add the support for exynos busfreq on Exynos4412-based TRATS2 board
This patch adds the Exynos4412 memory-bus node which includes the regulator
and devfreq-event phandle. The devfreq-event phandle is used by the
governor of the devfreq device and provides the current usage state of the
MIF (Memory Interface) / INT (Internal) memory bus group.
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Wed, 17 Dec 2014 05:06:09 +0000 (14:06 +0900)]
ARM: dts: Add the support for exynos busfreq on Exynos3250-based Rinato/Monk board
This patch adds the Exynos3250 memory-bus node which includes the regulator
and devfreq-event phandle. The devfreq-event phandle is used by the
governor of the devfreq device and provides the current usage state of the
MIF (Memory Interface) / INT (Internal) memory bus group.
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Youngjun Cho <yj44.cho@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Chanwoo Choi [Wed, 31 Dec 2014 02:08:40 +0000 (11:08 +0900)]
ARM: dts: Add memory bus node for Exynos4210
This patch adds the memory bus node for Exynos4210 SoC. Exynos4210 SoC has
one memory bus to translate data between DRAM and eMMC/sub-IPs because
Exynos4210 needs only one regulator for the memory bus.
Following list specifies the detailed relation between memory bus clock and
sub-IPs:
- DMC/ACP clock : DMC (Dynamic Memory Controller)
- ACLK200 clock : LCD0
- ACLK100 clock : PERIL/PERIR/MFC(PCLK)
- ACLK160 clock : CAM/TV/LCD0/LCD1
- ACLK133 clock : FSYS/GPS
- GDL/GDR clock : leftbus/rightbus
- SCLK_MFC clock : MFC
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Wed, 31 Dec 2014 02:06:34 +0000 (11:06 +0900)]
ARM: dts: Add memory bus node for Exynos4x12
This patch adds the memory bus node for Exynos4x12 SoC. Exynos4x12 SoC has
two memory buses to translate data between DRAM and eMMC/sub-IPs.
Following list specifies the detailed relation between memory bus clock and DMC
IP in MIF (Memory Interface) block:
- DMC/ACP clock : DMC (Dynamic Memory Controller)
Following list specifies the detailed relation between memory bus clock and
sub-IPs in INT (Internal) block:
- ACLK100 clock : PERIL/PERIR/MFC(PCLK)
- ACLK160 clock : CAM/TV/LCD
- ACLK133 clock : FSYS
- GDL/GDR clock : leftbus/rightbus
- SCLK_MFC clock : MFC
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Wed, 17 Dec 2014 05:05:02 +0000 (14:05 +0900)]
ARM: dts: Add memory bus node for Exynos3250
This patch adds the memory bus node for Exynos3250 SoC. Exynos3250 has the
following memory buses to translate data between DRAM and eMMC/sub-IPs.
Following list specifies the detailed relation between memory bus clock and DMC
IP in MIF (Memory Interface) block:
- DMC clock : DMC (Dynamic Memory Controller)
Following list specifies the detailed relation between memory bus clock and
sub-IPs in INT (Internal) block:
- ACLK100 clock : PERIL
- ACLK160 clock : LCD0
- ACLK200 clock : FSYS
- ACLK266 clock : ISP
- GDL/GDR clock : leftbus/rightbus
- SCLK_MFC clock : MFC
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Chanwoo Choi [Wed, 14 Jan 2015 05:22:31 +0000 (14:22 +0900)]
PM / devfreq: event: exynos-ppmu: Add the support of PPMU 2.0 for Exynos5433
This patch adds the support for PPMU (Platform Performance Monitoring Unit)
version 2.0 for Exynos5433 SoC. Exynos5433 SoC requires PPMUv2, which is
quite different from PPMUv1.1. The exynos-ppmu.c driver supports both PPMUv1.1
and PPMUv2.
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Mon, 12 Jan 2015 10:22:15 +0000 (19:22 +0900)]
PM / devfreq: exynos: Remove unused exynos4 memory busfreq driver
This patch removes the unused exynos4 memory busfreq driver, which is replaced
by the generic exynos memory bus frequency driver.
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Tue, 23 Dec 2014 11:36:13 +0000 (20:36 +0900)]
PM / devfreq: exynos: Add documentation for generic exynos memory bus frequency driver
This patch adds the documentation for generic exynos memory bus frequency
driver.
Cc: MyungJoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Mon, 23 Feb 2015 05:51:00 +0000 (14:51 +0900)]
PM / devfreq: exynos: Add generic exynos memory bus frequency driver
This patch adds the generic exynos bus frequency driver for memory bus
with DEVFREQ framework. The Samsung Exynos SoCs have the common architecture
for memory bus between DRAM memory and MMC/sub IP in SoC. This driver can
support the memory bus frequency driver for Exynos SoCs.
Each memory bus block has a clock for the memory bus speed and a frequency
table; the frequency changes at runtime according to the utilization of the
memory bus. Each memory bus group has one or more memory bus blocks, an
OPP table (including frequency and voltage), a regulator, and devfreq-event
devices.
The number of memory buses differs slightly between Exynos SoCs because each
SoC has different sub-IPs and a different memory bus speed. Despite this
difference, we can support almost all Exynos SoCs by adding the
unique memory bus data to the devicetree file.
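As a rough illustration of the idea described above (not the driver's code), the following Python sketch models a bus group that owns an OPP table and picks a frequency from the measured utilization. The table values, the 90% threshold, and all names (Opp, OPP_TABLE, pick_opp) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opp:
    freq_khz: int
    volt_uv: int


# Hypothetical OPP table; real values would come from the devicetree.
OPP_TABLE = [
    Opp(100_000, 800_000),
    Opp(200_000, 900_000),
    Opp(400_000, 1_000_000),
]


def pick_opp(busy, total, cur, table=OPP_TABLE, up_threshold=0.9):
    """Return the lowest OPP that keeps utilization under up_threshold.

    busy/total are event counts such as a devfreq-event device could
    report; cur is the OPP that was in use while they were sampled.
    """
    if total == 0:
        return min(table, key=lambda o: o.freq_khz)
    # Bus clock needed to carry the observed load with some headroom.
    needed_khz = cur.freq_khz * busy / total / up_threshold
    for opp in sorted(table, key=lambda o: o.freq_khz):
        if opp.freq_khz >= needed_khz:
            return opp
    # Load exceeds every entry: fall back to the fastest OPP.
    return max(table, key=lambda o: o.freq_khz)
```

The voltage stored with each OPP is what a regulator would be set to before raising (or after lowering) the clock.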
Cc: Myungjoo Ham <myungjoo.ham@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Thu, 2 Apr 2015 09:30:04 +0000 (18:30 +0900)]
clk: samsung: exynos5433: Add CLK_SET_RATE_PARENT to support DVFS for Cortex-A57 core
This patch adds CLK_SET_RATE_PARENT flag to support DVFS feature of Cortex-A57
Core (big core) because 'sclk_atlas' leaf clock is used to change the CPU
frequency of Cortex-A57 core in arm_big_little.c driver.
- 'atlas' word means the big core (Cortex-A57 core) in Exynos5433 TRM.
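The effect of CLK_SET_RATE_PARENT can be pictured with a toy clock tree. This Python sketch is only a model under simplifying assumptions (a single fixed divider per node, a root clock that accepts any rate); the Clk class is hypothetical and is not the common clock framework API.

```python
class Clk:
    """Toy clock node: rate = parent rate // divider."""

    def __init__(self, name, parent=None, rate=0, div=1, set_rate_parent=False):
        self.name = name
        self.parent = parent
        self._rate = rate              # only meaningful for a root clock
        self.div = div
        self.set_rate_parent = set_rate_parent

    @property
    def rate(self):
        if self.parent is None:
            return self._rate
        return self.parent.rate // self.div

    def set_rate(self, hz):
        if self.parent is None:
            self._rate = hz            # root (PLL-like) clock takes the rate
        elif self.set_rate_parent:
            # Flag set: forward the request up the tree.
            self.parent.set_rate(hz * self.div)
        # Flag clear: this fixed-divider node cannot change its own rate.
        return self.rate
```

Without the flag, a rate request on the leaf clock leaves the parent untouched, so a cpufreq driver that only knows the leaf clock could not change the CPU frequency.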
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Thu, 2 Apr 2015 08:54:11 +0000 (17:54 +0900)]
LOCAL / arm64: dts: exynos: Use sclk_{atlas|apollo} clock to change cpu frequency
This patch uses the sclk_{atlas|apollo} leaf clock to change the cpu frequency
of the big.LITTLE cores in the arm_big_little.c driver.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Wed, 1 Apr 2015 06:26:12 +0000 (15:26 +0900)]
LOCAL / arm64: dts: exynos: Set the maximum rate of G3D clock for Exynos5433 SoC
This patch sets the maximum rate of G3D clock on Exynos5433 SoC.
- Maximum rate of G3D_PLL is 550MHz
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Wed, 1 Apr 2015 06:27:58 +0000 (15:27 +0900)]
clk: samsung: exynos5433: Add CLK_SET_RATE_PARENT flag for aclk_g3d to support GPU DVFS
This patch adds the CLK_SET_RATE_PARENT flag for 'aclk_g3d' clock to support
GPU DVFS feature. The MALI driver uses the 'aclk_g3d' clock for DVFS feature.
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Thu, 2 Apr 2015 07:30:52 +0000 (16:30 +0900)]
clk: samsung: exynos5433: Add CLK_SET_RATE_PARENT to support DVFS for Cortex-A53 core
This patch adds CLK_SET_RATE_PARENT flag to support DVFS feature of Cortex-A53
Core (LITTLE core) because 'sclk_apollo' leaf clock is used to change the CPU
frequency of Cortex-A53 core in arm_big_little.c driver.
- 'apollo' word means the LITTLE core (Cortex-A53 core) in Exynos5433 TRM.
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chanwoo Choi [Thu, 2 Apr 2015 06:33:02 +0000 (15:33 +0900)]
clk: Show clock rate instead of return value
This patch shows the current clock rate instead of the return value
when clk_set_rate() fails, because the log message is meant to report the clock rate.
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Inha Song [Wed, 1 Apr 2015 06:01:20 +0000 (15:01 +0900)]
LOCAL / arm64: dts: exynos5433-tm2: Change WRSTB_IN GPIO pin function to OUTPUT from INPUT
This patch changes the WRSTB_IN GPIO pin function from INPUT to OUTPUT.
WRSTB (Warm reset information from AP) is used to detect the AP's
warm reset. If a falling edge is detected, the Buck 1/2/3/4/5/6
voltages are reset to their default values.
These buck voltage changes can cause boot problems. We can prevent
these problems by changing the WRSTB GPIO pin function to OUTPUT.
Signed-off-by: Inha Song <ideal.song@samsung.com>
Chanwoo Choi [Wed, 1 Apr 2015 04:29:59 +0000 (13:29 +0900)]
LOCAL / arm64: configs: Update defconfig to enable GATOR_MALI_MIDGARD
This patch enables the CONFIG_GATOR_MALI_MIDGARD to debug GPU operation.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Jaewon Kim [Mon, 23 Mar 2015 12:34:49 +0000 (21:34 +0900)]
LOCAL / arm64: configs: Enable xhci to support drd on tm2 board
This patch enables xhci(USB 3.0 Host driver) to support
drd(Dual Role Device) on tm2 board.
Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com>
Robert Baldyga [Tue, 31 Mar 2015 07:38:05 +0000 (16:38 +0900)]
LOCAL / arm64: dts: exynos5433-tm2: change dwc3 mode to OTG
It enables OTG mode in dwc3 controller.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Tue, 31 Mar 2015 07:33:15 +0000 (16:33 +0900)]
LOCAL / arm64: dts: exynos5433-tm2: make usbdrd3 extcon client
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Tue, 31 Mar 2015 07:32:09 +0000 (16:32 +0900)]
LOCAL / arm64: dts: exynos5433: add snps,dis_u3_susphy_quirk to dwc3 controllers
It's needed for proper role switching in OTG mode.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Tue, 31 Mar 2015 07:31:01 +0000 (16:31 +0900)]
LOCAL / arm64: dts: exynos5433: set usb3_lpm_capable in dwc3
This hardware has LPM and we want to use it.
This will be necessary for OTG role switching.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 9 Mar 2015 12:28:18 +0000 (13:28 +0100)]
LOCAL / dwc3: exynos: add software role switching code
The Exynos platform doesn't have hardware OTG support, so we need to
supply a mechanism for notifying about cable changes.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 23 Feb 2015 14:58:30 +0000 (15:58 +0100)]
LOCAL / dwc3: core: fix SUSPHY problem
This is needed for OTG mode. Without this change endpoint enabling in
gadget mode fails after role switching.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 23 Feb 2015 14:44:04 +0000 (15:44 +0100)]
LOCAL / dwc3: core: add OTG support
Initialize OTG core if hardware runs in OTG mode.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 23 Feb 2015 15:20:48 +0000 (16:20 +0100)]
LOCAL / dwc3: gadget: reinitialize core after each role change
According to the Databook in case of reconnection and role switching
the core should be completely reinitialized, excepting first connection
as peripheral when core was initialized during probing.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 23 Feb 2015 11:32:19 +0000 (12:32 +0100)]
LOCAL / dwc3: host: don't add xhci device only if in OTG mode
The OTG handling code adds the xhci device automatically when a USB host cable
is detected.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 23 Feb 2015 12:57:14 +0000 (13:57 +0100)]
LOCAL / dwc3: gadget: register gadget in OTG core
Gadget driver needs to be registered in OTG to perform dynamic
role switching.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 23 Feb 2015 11:01:54 +0000 (12:01 +0100)]
LOCAL / dwc3: add otg handling code
This code is based on DWC3 driver from https://github.com/hardkernel/linux.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Robert Baldyga [Mon, 9 Mar 2015 08:41:24 +0000 (09:41 +0100)]
LOCAL / dwc3: core: cleanup suspend/resume code
Remove unused cases from switch-case statement and place
dwc3_event_buffers_cleanup() function outside switch-case
as it's called in each case anyway.
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Jaewon Kim [Thu, 26 Feb 2015 07:38:13 +0000 (16:38 +0900)]
LOCAL / usb: gadget: change gadget connect order
This patch changes the order so that usb_gadget_connect() is called before the config is added.
Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com>
Jaewon Kim [Tue, 31 Mar 2015 05:25:08 +0000 (14:25 +0900)]
LOCAL / arm64: dts: fix usb handle for Exynos5433 tm2 board.
This patch fixes usb handle name for Exynos5433 tm2 board.
Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com>
Jaewon Kim [Tue, 31 Mar 2015 05:22:03 +0000 (14:22 +0900)]
LOCAL / arm64: dts: fix usb3.0 host dt handle
This patch fixes USB3.0 host dt handle to usbhost30.
Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com>
Jaewon Kim [Tue, 31 Mar 2015 05:18:21 +0000 (14:18 +0900)]
LOCAL / arm64: dts: fix usbdrd handle name
This patch fixes usbdrd handle name
Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com>
Chanwoo Choi [Tue, 31 Mar 2015 04:34:09 +0000 (13:34 +0900)]
LOCAL / arm64: dts: exynos: Remove high-frequency of big core to remove kernel lockup
This patch removes the high frequency of the big core from the frequency table
to avoid a kernel lockup issue. This is a workaround for the lockup.
After making the kernel stable, I'll debug this issue.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Joonyoung Shim [Tue, 31 Mar 2015 01:50:18 +0000 (10:50 +0900)]
local / arm64: configs: update defconfig for syscon-reboot
Exynos5433 SoC supports rebooting via the syscon-reboot driver.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Tue, 31 Mar 2015 01:47:51 +0000 (10:47 +0900)]
local / arm64: dts: add reboot node for exynos5433
This reboot node uses syscon-reboot driver.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Fri, 23 Jan 2015 09:08:10 +0000 (18:08 +0900)]
local / arm64: configs: update defconfig for mali
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Mon, 19 Jan 2015 08:54:35 +0000 (17:54 +0900)]
gpu: arm: midgard: add initial exynos5433 platform files
We should check more clock and regulator for DVFS.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Tue, 10 Mar 2015 01:07:08 +0000 (10:07 +0900)]
gpu: arm: midgard: remove set_dma_ops
Don't use set_dma_ops since commit 9d3bfbb4df58 ("arm64: Combine
coherent and non-coherent swiotlb dma_ops")
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Wed, 21 Jan 2015 08:13:51 +0000 (17:13 +0900)]
gpu: arm: midgard: Drop CONFIG_PM_RUNTIME
After commit 464ed18ebdb6 ("PM: Eliminate CONFIG_PM_RUNTIME") PM_RUNTIME
is eliminated.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Mon, 19 Jan 2015 08:54:35 +0000 (17:54 +0900)]
gpu: arm: midgard: add initial exynos5422 platform files
We should check more clock and regulator for DVFS.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Tue, 20 Jan 2015 09:07:38 +0000 (18:07 +0900)]
gpu: arm: midgard: support kernel error defines for platform.
Don't use mali error defines, it's ugly.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Joonyoung Shim [Tue, 20 Jan 2015 02:18:00 +0000 (11:18 +0900)]
gpu: arm: add mali midgard r5p0-06rel0 driver
This comes from below link. Remove sconscript and modify file permission
to 644.
http://malideveloper.arm.com/develop-for-mali/drivers/open-source-mali-t6xx-gpu-kernel-device-drivers/
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Kevin Hilman [Wed, 12 Aug 2015 10:19:09 +0000 (19:19 +0900)]
sched: hmp: fix spinlock recursion in active migration
[original commit message]
Commit cd5c2cc93d3d (hmp: Remove potential for task_struct access
race) introduced a put_task_struct() to prevent races, but in
doing so introduced potential spinlock recursion. (This change was further
consolidated in commit 0baa5811bacf -- sched: hmp: unify active migration code.)
Unfortunately, the put_task_struct() is done while the runqueue
spinlock is held, but put_task_struct() can also cause a reschedule
causing the runqueue lock to be acquired recursively.
To fix, move the put_task_struct() outside the runqueue spinlock.
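The ordering problem can be modelled in a few lines of Python. This is only a sketch: a non-reentrant threading.Lock stands in for the runqueue spinlock, and the function names migrate_buggy and migrate_fixed are made up for the illustration. Where a real spinlock would deadlock, the model just records "recursion".

```python
import threading

rq_lock = threading.Lock()    # stands in for the (non-recursive) runqueue lock


def put_task_struct(log):
    """Dropping the reference may reschedule, which needs rq_lock."""
    if rq_lock.acquire(blocking=False):
        rq_lock.release()
        log.append("ok")
    else:
        log.append("recursion")   # a real spinlock would deadlock here


def migrate_buggy(log):
    with rq_lock:
        put_task_struct(log)      # called while the lock is held


def migrate_fixed(log):
    with rq_lock:
        pass                      # migration work done under the lock
    put_task_struct(log)          # reference dropped after unlocking
```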
[additional commit message by Chanwoo Choi]
We did not apply hmp patch [1], which cleans up the code by sharing the
same code, because a scheduling problem occurred when I applied it.
[1] commit 0baa5811bacf (sched: hmp: unify active migration code)
So, this patch moves the put_task_struct() just outside the runqueue spinlock.
Reported-by: Victor Lixin <victor.lixin@hisilicon.com>
Cc: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Cc: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Jon Medhurst <tixy@linaro.org>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[cw00.choi: Fix the merge conflict]
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chris Redpath [Thu, 12 Mar 2015 11:32:13 +0000 (20:32 +0900)]
hmp: Restrict ILB events if no CPU has > 1 task
Frequently in HMP, the big CPUs are only active with one task per
CPU and there may be idle CPUs in the big cluster. This patch avoids
triggering an idle balance in situations where none of the active
CPUs in the current HMP domain have > 1 tasks running.
When packing is enabled, only enforce this behaviour when we are
not in the smallest domain - there we idle balance whenever a CPU
is over the up_threshold regardless of tasks in case one needs to
be moved.
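One possible reading of this rule, written as a small Python predicate. The function name, its parameters, and the exact combination of conditions in the smallest domain are assumptions drawn from the description above, not from the patch itself.

```python
def should_idle_balance(nr_running, packing_enabled=False,
                        in_smallest_domain=False, cpu_over_up_threshold=False):
    """nr_running: task count of each active CPU in the current HMP domain."""
    has_spare_task = any(n > 1 for n in nr_running)
    if packing_enabled and in_smallest_domain:
        # Restriction not enforced here: also balance when a CPU is over
        # the up-threshold, regardless of how many tasks it runs.
        return has_spare_task or cpu_over_up_threshold
    # Elsewhere: balance only if some CPU has more than one task.
    return has_spare_task
```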
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Chris Redpath [Thu, 12 Mar 2015 11:19:34 +0000 (20:19 +0900)]
HMP: Do not fork-boost tasks coming from PIDs <= 2
System services are generally started by init, whilst kernel threads
are started by kthreadd. We do not want to give those tasks a head
start, as this costs power for very little benefit. We do however
wish to do that for tasks which the user launches.
Further, some tasks allocate per-cpu timers directly after launch
which can lead to those tasks being always scheduled on a big CPU
when there is no computational need to do so. Not promoting services
to big CPUs on launch will prevent that unless a service allocates
their per-cpu resources after a period of intense computation, which
is not a common pattern.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Alex Shi [Thu, 12 Mar 2015 11:13:22 +0000 (20:13 +0900)]
HMP: use per cpu cpuidle driver to fix deadlock in hmp_idle_pull
Use a per-cpu cpuidle driver to fix a deadlock in hmp_idle_pull.
Otherwise a deadlock happens during bl_idle_init.
[ 113.878664] other info that might help us debug this:
[ 113.878667] Possible unsafe locking scenario:
[ 113.878667]
[ 113.878670] CPU0
[ 113.878673] ----
[ 113.878681] lock(cpuidle_driver_lock);
[ 113.878684] <Interrupt>
[ 113.878691] lock(cpuidle_driver_lock);
[ 113.878693]
[ 113.878693] *** DEADLOCK ***
[ 113.878693]
[ 113.878697] 1 lock held by ksoftirqd/4/28:
[ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508
This patch is just a quick/cheap workaround for cpuidle_driver_lock
deadlock. It works for TC2 and any other platform where the idle
driver cannot be changed at runtime.
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Chris Redpath [Thu, 12 Mar 2015 11:10:45 +0000 (20:10 +0900)]
sched: hmp: fix out-of-range CPU possible
If someone hotplugs all the little CPUs while another CPU is handling
a wakeup, we can potentially return new_cpu == NR_CPUS from
hmp_select_slower_cpu (which is called internally by
hmp_best_little_cpu as well). We will use this to deref the
per_cpu rq array in hmp_next_down_delay which can go boom.
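The failure mode is an index one past the end of a per-CPU array. The Python sketch below models it with a plain list; NR_CPUS = 8, the function names, and the placement of the guard are assumptions for the example, since the message does not say where the fix puts its check.

```python
NR_CPUS = 8


def select_slower_cpu(little_cpus, online):
    """Return the first online little CPU, or NR_CPUS if none is online."""
    for cpu in little_cpus:
        if cpu in online:
            return cpu
    return NR_CPUS                # sentinel: not a valid CPU index


def next_down_delay(per_cpu_last_down, cpu):
    """Look up per-CPU data only for a valid CPU index."""
    if cpu >= NR_CPUS:
        return None               # guard: nothing to dereference
    return per_cpu_last_down[cpu]
```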
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Chris Redpath [Thu, 12 Mar 2015 11:04:08 +0000 (20:04 +0900)]
hmp: don't attempt to pull tasks if affinity doesn't allow it
When looking for a task to be idle-pulled, don't consider tasks
where the affinity does not allow that task to be placed on the
target CPU. Also ensure that tasks with restricted affinity
do not block selecting other unrestricted busy tasks.
Use the knowledge of target CPU more effectively in idle pull
by passing to hmp_get_heaviest_task when we know it, otherwise
only checking for general affinity matches with any of the CPUs
in the bigger HMP domain.
We still need to explicitly check affinity is allowed in idle pull
since if we find no match in hmp_get_heaviest_task we will return
the current one, which may not be affine to the new CPU despite
having high enough load. In this case, there is nothing to move.
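The selection logic described above can be sketched as follows. Tasks are modelled as (name, load, allowed_cpus) tuples; that representation and both function names are assumptions for this illustration, not the kernel's data structures.

```python
def get_heaviest_task(tasks, target_cpu, current):
    """Return the heaviest task allowed on target_cpu, else `current`."""
    best = None
    for task in tasks:
        _name, load, allowed = task
        if target_cpu not in allowed:
            continue              # a restricted task must not block others
        if best is None or load > best[1]:
            best = task
    return best if best is not None else current


def idle_pull_candidate(tasks, target_cpu, current, up_threshold):
    task = get_heaviest_task(tasks, target_cpu, current)
    _name, load, allowed = task
    # The fallback may be `current`, so affinity is re-checked explicitly.
    if target_cpu in allowed and load >= up_threshold:
        return task
    return None                   # nothing to move
```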
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Thu, 12 Mar 2015 10:57:56 +0000 (19:57 +0900)]
hmp: Use idle pull to perform forced up-migrations
When a normal forced up-migration takes place we stop the task to
be migrated while the target CPU becomes available. This delay can
range from 80us to 1500us on TC2 if the target CPU is in a deep idle
state.
Instead, interrupt the target CPU and ask it to pull a task.
This lets the current eligible task continue executing on the
original CPU while the target CPU wakes. Use a pinned timer to
prevent the pulling CPU going back into power-down with pending
up-migrations.
If we trigger for a nohz kick, it doesn't matter about triggering
for an idle pull since the idle_pull flag will be set when we
execute the softirq and we'll still do the idle pull.
If the target CPU is busy, we will not pull any tasks.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Chris Redpath [Thu, 12 Mar 2015 10:43:19 +0000 (19:43 +0900)]
sched: hmp: Change small task packing defaults for all platforms
All platforms other than TC2 default to enabling packing. Since TC2
shows no performance or energy degradation with this feature enabled
make it default enabled the same as everyone else.
Likewise, vendors have been including TC2 support in multi-machine
kernel builds so they expect the default thresholds to remain the
same when the TC2 #ifdef is removed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Chanwoo Choi [Thu, 12 Mar 2015 05:45:47 +0000 (14:45 +0900)]
LOCAL / sched: Fix build break by using alternative function
This patch fixes the build break because Linux 4.0 didn't include the
cpumask_scnprintf() function. So, this patch uses the alternative function
(cpumap_print_to_pagebuf()).
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
[k.kozlowski: rebased on 4.1]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Wed, 4 Feb 2015 05:55:19 +0000 (14:55 +0900)]
hmp: sched: Clean up hmp_up_threshold checks into a utility fn
In anticipation of modifying the up_threshold handling, make all
instances use the same utility fn to check if a task is eligible
for up-migration. This also removes the previous difference in
threshold comparison where up-migration used '!<threshold' and
idle pull used '>threshold' to decide up-migration eligibility.
Make them both use '!<threshold' instead for consistency, although
this is unlikely to change any results.
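The two comparison forms differ only when the load equals the threshold, which is why results are unlikely to change. A minimal Python sketch, with hypothetical function names:

```python
def is_up_migration_eligible(load, up_threshold):
    """Shared check ('!<threshold'): eligible unless load is below it."""
    return not (load < up_threshold)


def old_idle_pull_check(load, up_threshold):
    """Previous idle-pull form ('>threshold'), kept only for comparison."""
    return load > up_threshold
```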
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Dietmar Eggemann [Wed, 4 Feb 2015 05:50:24 +0000 (14:50 +0900)]
HMP: Fix rt task allowed cpu mask restriction code on 1x1 system
There is an error scenario where on a 1x1 HMP system (weight of the
hmp_slow_cpu_mask is 1) the short-cut of restricting the allowed cpu mask
of an rt task leads to triggering a kernel bug in the rt sched class
set_cpus_allowed function set_cpus_allowed_rt().
In case the task is on the run-queue and the weight of the required cpu mask
is 1 and this is different to the p->nr_cpus_allowed value, this back-end
function interprets this in such a way that a task changed from being
migratable to not migratable anymore and decrements the rt_nr_migratory
counter. There is a BUG_ON(!rq->rt.rt_nr_migratory) check in this code
path which triggers in this situation.
To circumvent this issue, set the number of allowed cpus for a task p to
the weight of the hmp_slow_cpu_mask before calling do_set_cpus_allowed()
in __setscheduler(). It will be set to this value in do_set_cpus_allowed()
after the call to the sched class related backend function anyway. By
doing this, set_cpus_allowed_rt() returns without trying to update the
rt_nr_migratory counter.
This patch has been tested with a test device driver requiring a threaded
irq handler on a TC2 system with a reduced cpu mask (1 Cortex A15, 1
Cortex A7).
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Dietmar Eggemann [Wed, 4 Feb 2015 05:48:35 +0000 (14:48 +0900)]
HMP: Restrict irq_default_affinity to hmp_slow_cpu_mask
This patch limits the default affinity mask for all irqs to the cluster of
the little cpus.
This patch has the positive side effect that an irq thread which has its
IRQTF_RUNTHREAD set inside irq_thread() -> irq_wait_for_interrupt() will
not overwrite its struct task_struct->cpus_allowed with a full cpu mask of
desc->irq_data.affinity in irq_thread_check_affinity() essentially reverting
patch "HMP: experimental: Force all rt tasks to start on little domain."
for this irq thread.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Chris Redpath [Wed, 4 Feb 2015 05:44:53 +0000 (14:44 +0900)]
sched: hmp: Fix potential task_struct memory leak
We use get_task_struct to increment the ref count on a task_struct
so that even if the task dies with a pending migration we are still
able to read the memory without causing a fault.
In the case of non-running tasks, we forgot to decrement the ref
count when we are done with the task.
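The leak can be shown with a toy reference counter. In this Python sketch the TaskStruct class, the migrate function, and its `fixed` switch are hypothetical; they only model the get/put pairing described above.

```python
class TaskStruct:
    def __init__(self):
        self.usage = 1            # baseline reference held elsewhere


def get_task_struct(task):
    task.usage += 1


def put_task_struct(task):
    task.usage -= 1


def migrate(task, running, fixed=True):
    get_task_struct(task)         # keep the memory readable while pending
    if running:
        put_task_struct(task)     # running-task path already drops it
    elif fixed:
        put_task_struct(task)     # non-running path: the put that was missing
```

With `fixed=False` the count stays one too high after every migration of a non-running task, so the task_struct can never be freed.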
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Chris Redpath [Wed, 4 Feb 2015 05:42:51 +0000 (14:42 +0900)]
sched: hmp: Change TC2 packing config to disabled default if present
Since TC2 power curves don't really have a utilisation hotspot where
packing makes sense, if it is present for a TC2 system at least make
it default to disabled.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Wed, 4 Feb 2015 05:40:41 +0000 (14:40 +0900)]
sched: hmp: Make idle balance behaviour normal when packing disabled
The presence of packing permanently changed the idle balance
behaviour. Do not restrict idle balance on the smallest CPUs when
packing is present but disabled.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Wed, 4 Feb 2015 05:25:57 +0000 (14:25 +0900)]
sched: update runqueue clock before migrations away
If we migrate a sleeping task away from a CPU which has the
tick stopped, then both the clock_task and decay_counter will
be out of date for that CPU and we will not decay load correctly
regardless of how often we update the blocked load.
This is only an issue for tasks which are not on a runqueue
(because otherwise that CPU would be awake) and simultaneously
the CPU the task previously ran on has had the tick stopped.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Mon, 23 Feb 2015 06:24:16 +0000 (15:24 +0900)]
sched: reset blocked load decay_count during synchronization
If an entity happens to sleep for less than one tick duration
the tracked load associated with that entity can be decayed by an
unexpectedly large amount if it is later migrated to a different
CPU. This can interfere with correct scheduling when entity load
is used for decision making.
The reason for this is that when an entity is dequeued and enqueued
quickly, such that se.avg.decay_count and cfs_rq.decay_counter
do not differ when that entity is enqueued again,
__synchronize_entity_decay skips the calculation step and also skips
clearing the decay_count. At a later time that entity may be
migrated and its load will be decayed incorrectly.
All users of this function expect decay_count to be zero'ed after
use.
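A simplified Python model of the synchronization step shows the intended behaviour: decay_count is cleared even when no decay periods have elapsed. The Entity class and the floating-point decay are assumptions for the sketch; the kernel uses fixed-point arithmetic.

```python
class Entity:
    def __init__(self, load, decay_count):
        self.load = load
        self.decay_count = decay_count


def decay_load(load, periods, y=0.5 ** (1 / 32)):
    """Geometric decay with y**32 == 0.5, as in per-entity load tracking."""
    return load * y ** periods


def synchronize_entity_decay(se, cfs_rq_decay_counter):
    decays = cfs_rq_decay_counter - se.decay_count
    se.decay_count = 0            # always clear, even when decays == 0
    if decays == 0:
        return 0
    se.load = decay_load(se.load, decays)
    return decays
```

If the early return came before the clearing step, a stale decay_count would survive and be applied against a different CPU's counter after migration.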
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Thomas Gleixner [Wed, 4 Feb 2015 05:12:25 +0000 (14:12 +0900)]
genirq: Add default affinity mask command line option
If we isolate CPUs, then we don't want random device interrupts on
them. Even w/o the user space irq balancer enabled we can end up with
irqs on non boot cpus.
Allow to restrict the default irq affinity mask.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Wed, 4 Feb 2015 05:06:14 +0000 (14:06 +0900)]
sched: hmp: Fix build breakage when not using CONFIG_SCHED_HMP
hmp_variable_scale_convert was used without guards in
__update_entity_runnable_avg. Guard it.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Wed, 20 Nov 2013 14:14:44 +0000 (14:14 +0000)]
Documentation: HMP: Small Task Packing explanation
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Wed, 4 Feb 2015 02:58:09 +0000 (11:58 +0900)]
sched: hmp: add read-only hmp domain sysfs file
In order to allow userspace to restrict known low-load tasks to
little CPUs, we must export this knowledge from the kernel or
expect userspace to make their own attempts at figuring it out.
Since we now have a userspace requirement for an HMP implementation
to always have at least some sysfs files, change the integration
so that it only depends upon CONFIG_SCHED_HMP rather than
CONFIG_HMP_VARIABLE_SCALE. Fix Kconfig text to match.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Mathieu Poirier [Wed, 4 Feb 2015 02:39:05 +0000 (11:39 +0900)]
HMP: Avoid using the cpu stopper to stop runnable tasks
When migrating a runnable task, we use the CPU stopper on
the source CPU to ensure that the task to be moved is not
currently running. Before this patch, all forced migrations
(up, offload, idle pull) use the stopper for every migration.
Using the CPU stopper is mandatory only when a task is currently
running on a CPU. Otherwise tasks can be moved by locking the
source and destination run queues.
This patch checks whether the task to be moved is currently
running. If not, the task is moved directly without using the
stopper thread.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Mark Brown [Wed, 4 Feb 2015 02:23:34 +0000 (11:23 +0900)]
smp: Don't use typedef to work around compiler issue with tracepoints
Having the typedef in place for the tracepoints causes compiler crashes
in some situations. Just using void * directly avoids triggering the
issue and should have no effect on the trace.
Signed-off-by: Mark Brown <broonie@linaro.org>
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 12:45:54 +0000 (21:45 +0900)]
HMP: Implement task packing for small tasks in HMP systems
If we wake up a task on a little CPU, fill CPUs rather than
spread. Adds 2 new files to /sys/kernel/hmp to control packing
behaviour.
packing_enable: task packing enabled (1) or disabled (0)
packing_limit: Runqueues will be filled up to this load ratio.
This functionality is disabled by default on TC2 as it lacks per-cpu
power gating so packing small tasks there doesn't make sense.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
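A rough model of the fill-rather-than-spread policy from the commit above. This is a sketch under assumptions: the function name, the first-fit CPU order and the 0..1024 load scale are not taken from the patch; only the packing_enable and packing_limit controls come from the commit message.

```python
from typing import List, Optional


def pick_packing_cpu(little_loads: List[int],
                     packing_limit: int,
                     packing_enable: bool = True) -> Optional[int]:
    """Return the first little CPU still below packing_limit, else None.

    little_loads[i] is the assumed load ratio (0..1024) of little CPU i.
    None means "no packing target", so the caller falls back to spreading.
    """
    if not packing_enable:
        return None
    for cpu, load in enumerate(little_loads):
        if load < packing_limit:
            return cpu
    return None
```

For example, with loads [700, 200, 0] and a limit of 650, the task is placed on CPU 1 rather than on the idle CPU 2, leaving CPU 2 free to stay power-gated.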
Chris Redpath [Tue, 3 Feb 2015 12:19:41 +0000 (21:19 +0900)]
hmp: Remove potential for task_struct access race
Accessing the task_struct can be racy in certain conditions, so
we need to only acquire the data when needed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 08:34:03 +0000 (17:34 +0900)]
sched: HMP: fix potential logical errors
The previous API for hmp_up_migration reset the destination
CPU every time, regardless of if a migration was desired. The code
using it assumed that the value would not be changed unless
a migration was required. In one rare circumstance, this could
have led to a task migrating to a little CPU at the wrong time.
Fixing that led to a slight logical tweak to make the surrounding
APIs operate a bit more obviously.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Robin Randhawa <robin.randhawa@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 08:26:54 +0000 (17:26 +0900)]
smp: smp_cross_call function pointer tracing
generic tracing for smp_cross_call function calls
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:56:54 +0000 (16:56 +0900)]
sched: HMP: Additional trace points for debugging HMP behaviour
1. Replace magic numbers in code for migration trace.
Trace points still emit a number as force=<n> field:
force=0 : wakeup migration
force=1 : forced migration
force=2 : offload migration
force=3 : idle pull migration
2. Add trace to expose offload decision-making.
Also adds tracing rq->nr_running so that you can
look back to see what state the RQ was in at the
time.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
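The force=<n> values listed in the commit above map naturally onto a named enumeration, which is handy when post-processing trace output. The enum and member names below are illustrative assumptions; only the numeric values and their meanings come from the commit message.

```python
from enum import IntEnum


class HmpMigration(IntEnum):
    # Names are hypothetical; values follow the force=<n> trace field.
    WAKEUP = 0
    FORCE = 1
    OFFLOAD = 2
    IDLE_PULL = 3


def format_force_field(reason: HmpMigration) -> str:
    """Render the reason the way the trace point reports it."""
    return f"force={int(reason)}"


def parse_force_field(field: str) -> HmpMigration:
    """Turn a 'force=<n>' token from a trace line back into a named reason."""
    return HmpMigration(int(field.split("=", 1)[1]))
```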
Chris Redpath [Tue, 3 Feb 2015 07:52:16 +0000 (16:52 +0900)]
sched: HMP: Change default HMP thresholds
When the up-threshold is at 512 on TC2, behaviour looks OK since
the graphic-related tasks are very heavy due to lack of a GPU.
Increasing the up-threshold does not reduce power consumption.
When a GPU is present, graphic tasks are much less CPU-heavy and
so additional power may be saved by having a higher threshold.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:50:59 +0000 (16:50 +0900)]
HMP: Update migration timer when we fork-migrate
Prevents fork-migration adversely interacting with normal
migration (i.e. runqueues containing forked tasks being
selected as migration targets when there is a better
choice available)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:46:56 +0000 (16:46 +0900)]
HMP: Access runqueue task clocks directly.
Avoids accesses through cfs_rq going bad when the cpu_rq doesn't
have a cfs member.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:40:16 +0000 (16:40 +0900)]
HMP: Implement idle pull for HMP
When an A15 goes idle, we should up-migrate anything which is
above the threshold and running on an A7.
Reuses the HMP force-migration spinlock, but adds its own new
cpu stopper client.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:33:30 +0000 (16:33 +0900)]
sched: HMP change nr_running offload metric
rq->nr_running was better than cfs.nr_running, since it includes
all tasks actually on the CPU. However, it includes RT tasks which
we would rather ignore at this point.
Switching to cfs.h_nr_running includes all the CFS tasks but no
RT tasks.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:30:36 +0000 (16:30 +0900)]
HMP: Explicitly implement all-load-is-max-load policy for HMP targets
Experimentally, one of the best policies for HMP migration CPU
selection is to completely ignore part-loaded CPUs and only look
for idle ones. If there are no idle ones, we will choose the one
which was least-recently-disturbed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 07:23:22 +0000 (16:23 +0900)]
HMP: Modify the runqueue stats to add a new child stat
The original intent here was to track unweighted runqueue load
with less resolution so we could use the least-recently-disturbed
runqueue to choose between 'closely related' load levels.
However, after experimenting with the resolution it turns out
that the following algorithm is highly beneficial for mobile
workloads.
In hmp_domain_min_load:
* If any CPU is zero, the overall load is zero
* If no CPUs are idle, the domain is 'fully loaded'
Additionally, the time since last migration count is used to
discriminate between idle CPUs.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
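The hmp_domain_min_load policy from the two commits above can be sketched as follows. This is a simplified model, not the kernel function: the list-based arguments, the return tuple and the FULLY_LOADED value of 1023 are assumptions. It treats any idle CPU as making the domain load zero, treats a domain with no idle CPU as fully loaded, and breaks ties by picking the least-recently-disturbed runqueue.

```python
from typing import List, Tuple

FULLY_LOADED = 1023  # assumed sentinel for "no idle CPU in this domain"


def hmp_domain_min_load(loads: List[int],
                        last_migration: List[int]) -> Tuple[int, int]:
    """Return (domain_load, target_cpu) for one HMP domain.

    loads[i] is the load of CPU i; last_migration[i] is the time of the
    most recent migration to CPU i (smaller means longer undisturbed).
    """
    idle = [cpu for cpu, load in enumerate(loads) if load == 0]
    # Part-loaded CPUs are ignored: either pick among idle CPUs, or
    # consider every CPU equally (fully) loaded.
    candidates = idle if idle else list(range(len(loads)))
    target = min(candidates, key=lambda cpu: last_migration[cpu])
    return (0 if idle else FULLY_LOADED, target)
```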
Chris Redpath [Tue, 3 Feb 2015 07:08:42 +0000 (16:08 +0900)]
sched: track per-rq 'last migration time'
Track when migrations were performed to runqueues.
Use this to decide between runqueues as migration targets when run
queues in an hmp domain have equal load.
Intention is to spread migration load amongst CPUs more fairly.
When all CPUs in an hmp domain are fully loaded, the existing code
always selects the last CPU as a migration target - this is unfair
and little better than doing no selection.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Morten Rasmussen [Tue, 3 Feb 2015 07:02:08 +0000 (16:02 +0900)]
sched: HMP fix traversing the rb-tree from the curr pointer
The hmp_get_{lightest,heaviest}_task() need to use
__pick_first_entity() to get a pointer to a sched_entity on the rq.
The current is not kept on the rq while running, so its rb-tree node
pointers are no longer valid.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 06:46:36 +0000 (15:46 +0900)]
HMP: select 'best' task for migration rather than 'current'
When we are looking for a task to migrate up, select the heaviest
one in the first 5 runnable on the runqueue.
Likewise, when looking for a task to offload, select the lightest
one in the first 5 runnable on the runqueue.
Ensure task selected is runnable in the target domain.
This change is necessary in order to implement idle pull in a
sensible manner, but here is used in up-migration and offload to
select the correct target task.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
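A small model of the 'best task' selection in the commit above. The Task class and function name are hypothetical; the window of 5 runnable tasks, the heaviest/lightest choice and the check that the task may run in the target domain come from the commit message.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Task:
    name: str
    load: int
    allowed_cpus: FrozenSet[int]  # CPUs this task is permitted to run on


def pick_task(runnable: List[Task],
              target_cpus: FrozenSet[int],
              heaviest: bool,
              window: int = 5) -> Optional[Task]:
    """Pick the heaviest (up-migration) or lightest (offload) task.

    Only the first `window` runnable tasks are examined, and only those
    allowed to run on at least one CPU of the target domain.
    """
    eligible = [t for t in runnable[:window] if t.allowed_cpus & target_cpus]
    if not eligible:
        return None
    chooser = max if heaviest else min
    return chooser(eligible, key=lambda t: t.load)
```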
Jon Medhurst [Tue, 3 Feb 2015 06:40:18 +0000 (15:40 +0900)]
HMP: Check the system has little cpus before forcing rt tasks onto them
It is sometimes desirable to run a kernel with HMP scheduling enabled
on a system which is not big.LITTLE, e.g. when building a multi-platform
kernel, or when testing a big.LITTLE system with one cluster disabled.
We should therefore allow for the situation where there is no little domain.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Dietmar Eggemann [Tue, 3 Feb 2015 06:38:08 +0000 (15:38 +0900)]
HMP: experimental: Force all rt tasks to start on little domain.
This patch restricts the allowed cpu mask for rt tasks initially started
with a full cpu mask to the little domain.
An rt task is specified as real time in __setscheduler() which is finally
called for all rt tasks (kernel and user land). In this function we
restrict the allowed cpu mask to the little domain.
This also prevents an rt task from later being pushed to the big domain
because the function find_lowest_rq() will only recognize the allowed cpu
mask of a task to find the new cpu the task runs on.
Current kludges of the patch:
* Since we do not have an API to get the cpu mask of the A7 cluster,
hmp_slow_cpu_mask is made global in arm/kernel/topology.c for now.
* The watchdog_enable() function calls sched_setscheduler() before
kthread_bind() for the cpu specific watchdog kernel threads. The order of
these two calls has to be changed to make this patch work.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chanwoo Choi [Tue, 3 Feb 2015 06:32:43 +0000 (15:32 +0900)]
LOCAL / sched: Fix build break
kernel/sched/fair.c: In function ‘find_new_ilb’:
kernel/sched/fair.c:7973:42: error: ‘call_cpu’ undeclared (first use in this function)
&((struct hmp_domain *)hmp_cpu_domain(call_cpu))->cpus);
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Chris Redpath [Tue, 3 Feb 2015 05:03:11 +0000 (14:03 +0900)]
sched: Restrict nohz balance kicks to stay in the HMP domain
There is little point in doing a nohz balance kick on a CPU from a
different HMP domain, since the unset SD_LOAD_BALANCE flag on the CPU
domain level prevents tasks from being balanced across clusters
except through the per-task load driven hmp_migrate/hmp_offload paths.
Further, the nohz balance kick is actively harmful to power usage if
all the tasks fit into the little domain since it causes the big
domain to wake up and do a lot of calculation to determine that
there is nothing to do.
A more generic solution is to walk the sched domain tree and determine
the intersection of potential idle balance cpus with visibility of
tasks on the current CPU, however HMP domains are more easily
accessible.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Mon, 23 Feb 2015 06:19:03 +0000 (15:19 +0900)]
HMP: Force new non-kernel tasks onto big CPUs until load stabilises
Initialise the load stats for new tasks so that they do not
see the instability in early task life which makes it so hard to
decide which CPU is appropriate.
Also, change the fork balance algorithm so that the least loaded of
the CPUs in the big cluster is chosen regardless of the bigness of
the parent task.
This is intended to help performance for applications which use
many short-lived tasks. Although best practice is usually to use
a thread pool, apps which do not do this should not be subject to
the randomness of the early stats.
We should ignore real-time threads for forking on big CPUs, but
it is not possible to figure out if a new thread is real-time or
not at the fork stage. Instead, we prevent kernel threads from
getting the initial boost - when they later become real-time they
will only be on big if their compute requirements demand it.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Fri, 30 Jan 2015 08:10:42 +0000 (17:10 +0900)]
HMP: Avoid multiple calls to hmp_domain_min_load in fast path
When evaluating a migration we make two calls to hmp_domain_min_load.
This is unnecessary if we pass on the target CPU information from the
hmp_up_migration path.
In hmp_down_migration, we don't consider the load of the target CPUs.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Fri, 30 Jan 2015 08:04:09 +0000 (17:04 +0900)]
HMP: Select least-loaded CPU when performing HMP Migrations
The reference patch set always selects the first CPU in an HMP
domain as a migration target. In busy situations, this means that
the migrated thread cannot make immediate use of an idle CPU but
must share a busy one until the load balancer runs across the big
domain.
This patch uses the hmp_domain_min_load function introduced in
global balancing to figure out which of the CPUs is the least busy
and selects that as a migration target - in both directions.
This essentially implements a task-spread strategy and is intended
to maximise performance of migrated threads but is likely
to use more power than the packing strategy previously employed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Fri, 30 Jan 2015 07:59:46 +0000 (16:59 +0900)]
HMP: Use unweighted load for hmp migration decisions
Normal task and runqueue loading is scaled according to priority
to end up with a weighted load, known as the contribution.
We want the CPU time to be allotted according to priority, but
we also want to make big/little decisions based upon raw load.
It is common, for example, for Android apps following the dev
guide to end up with all their long-running or async action
threads as low priority unless they override the AsyncThread
constructor. All these threads are such low priority that they
become invisible to the hmp_offload routine.
Using unweighted load here allows us to maximise CPU usage in busy
situations.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
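To see why weighted load hides low-priority threads, as the commit above argues, compare the two metrics for a thread that is runnable all the time. This is an illustrative sketch: the function names and the 0..1024 unweighted scale are assumptions, while the weights 1024 (nice 0) and 15 (nice 19) are the standard CFS values.

```python
NICE_0_WEIGHT = 1024   # CFS weight of a nice 0 task
NICE_19_WEIGHT = 15    # CFS weight of a nice 19 task


def weighted_load(runnable_ratio: float, weight: int) -> float:
    """Contribution scaled by priority, as used for fair CPU time sharing."""
    return runnable_ratio * weight


def unweighted_load(runnable_ratio: float, scale: int = 1024) -> float:
    """Raw load on a fixed scale (assumed 0..1024), independent of priority."""
    return runnable_ratio * scale
```

A nice 19 thread that never sleeps has a weighted load of 15 but an unweighted load of 1024, so only the unweighted figure identifies it as a candidate for a big CPU or for offload.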
Chris Redpath [Fri, 30 Jan 2015 07:44:17 +0000 (16:44 +0900)]
Revert "sched: Enable HMP priority filter by default"
This reverts commit
6ede44cadf39e35cd2f4fc0ebda8d85f6eca8947.
Having the priority filter enabled prevents proper operation
on Android systems where a wider range of priorities are used
by userspace to partition types of tasks. Those tasks should still
be able to benefit from the use of big CPUs when required.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Chris Redpath [Fri, 30 Jan 2015 07:29:35 +0000 (16:29 +0900)]
sched: cfs.nr_running does not contain the intended metric
rq->nr_running is the actual number of runnable tasks we wish to use
to determine if a task is alone on a CPU.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Morten Rasmussen [Fri, 30 Jan 2015 07:27:45 +0000 (16:27 +0900)]
sched: Basic global balancing support for HMP
This patch introduces an extra-check at task up-migration to
prevent overloading the cpus in the faster hmp_domain while the
slower hmp_domain is not fully utilized. The patch also introduces
a periodic balance check that can down-migrate tasks if the faster
domain is oversubscribed and the slower is under-utilized.
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Chris Redpath [Fri, 30 Jan 2015 07:22:26 +0000 (16:22 +0900)]
ARM: Fix build breakage when big.LITTLE.conf is not used.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Olivier Cozette [Fri, 30 Jan 2015 06:47:57 +0000 (15:47 +0900)]
ARM: Experimental Frequency-Invariant Load Scaling Patch
Evaluation Patch to investigate using load as a representation of the
amount of POTENTIAL cpu compute capacity used rather than a representation
of the CURRENT cpu compute capacity.
If CPUFreq is enabled, scales load in accordance with frequency.
Powersave/performance CPUFreq governors are detected and scaling is
disabled while these governors are in use. This is because when a
single-frequency governor is in use, potential CPU capacity is static.
So long as the governors and CPUFreq subsystem correctly report the
frequencies available, the scaling should self tune.
Adds an additional file to sysfs to allow this feature to be disabled
for experimentation.
/sys/kernel/hmp/frequency_invariant_load_scale
write 0 to disable, 1 to enable.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
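One way to picture frequency-invariant scaling as described in the commit above: load measured at a reduced frequency is scaled by the ratio of current to maximum frequency, so it reflects the share of potential capacity used. The function below is a sketch under that assumption; its name, integer arithmetic and parameters are not taken from the patch.

```python
def scale_load(load: int, cur_khz: int, max_khz: int,
               enabled: bool = True) -> int:
    """Scale a measured load by cur_khz / max_khz.

    `enabled` models the sysfs switch; when it is off (or max_khz is not
    usable) the load is returned unchanged.
    """
    if not enabled or max_khz <= 0:
        return load
    return load * cur_khz // max_khz
```

Under this model a task that keeps a CPU fully busy (1024) at half the maximum frequency counts as 512, and with a single-frequency governor the ratio is constant, which is why scaling is disabled in that case.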
Olivier Cozette [Fri, 30 Jan 2015 06:08:58 +0000 (15:08 +0900)]
ARM: Change load tracking scale using sysfs
These functions allow changing the load average period used
in the task load average computation through
/sys/kernel/hmp/load_avg_period_ms. This period is the time
in ms to go from 0 to 0.5 load average while running or the
time from 1 to 0.5 while sleeping.
The default one used is 32 and gives the same load_avg_ratio
computation as without this patch. These functions also allow
changing the up and down thresholds of HMP using
/sys/kernel/hmp/{up,down}_threshold. Both must be between 0 and
1024. The thresholds are divided by 1024 before being compared
to the load_avg_ratio.
If /sys/kernel/hmp/load_avg_period_ms is 128 and
/sys/kernel/hmp/up_threshold is 512, a task will be migrated
to a bigger cluster after running for 128ms, because after
load_avg_period_ms the load average is 0.5 and the real up_threshold
is 512 / 1024 = 0.5.
Signed-off-by: Olivier Cozette <olivier.cozette@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
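The arithmetic in the commit above can be checked with an idealised continuous model in which load_avg_period_ms is the half-life of the load average. The function names are hypothetical, and the real tracking works in discrete periods, so treat this as an approximation of the behaviour described, not the kernel computation.

```python
def load_after_running(t_ms: float, period_ms: float) -> float:
    """Load average (0..1) after running continuously for t_ms from 0."""
    return 1.0 - 0.5 ** (t_ms / period_ms)


def load_after_sleeping(t_ms: float, period_ms: float,
                        start: float = 1.0) -> float:
    """Load average after sleeping for t_ms, starting from `start`."""
    return start * 0.5 ** (t_ms / period_ms)


def crosses_up_threshold(t_ms: float, period_ms: float,
                         up_threshold: int) -> bool:
    """True once the load reaches up_threshold / 1024."""
    return load_after_running(t_ms, period_ms) >= up_threshold / 1024
```

With period_ms = 128 and up_threshold = 512, the load reaches 0.5 after exactly 128 ms of running, which matches 512 / 1024; after only 64 ms it is about 0.29 and the task stays where it is.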