Arnd Bergmann [Fri, 16 Dec 2016 09:06:15 +0000 (10:06 +0100)]
cpufreq: s3c64xx: remove incorrect __init annotation
s3c64xx_cpufreq_config_regulator is incorrectly annotated
as __init, since the caller is also not init:
WARNING: vmlinux.o(.text+0x92fe1c): Section mismatch in reference from the function s3c64xx_cpufreq_driver_init() to the function .init.text:s3c64xx_cpufreq_config_regulator()
With modern gcc versions, the function gets inline, so we don't
see the warning, this only happens with gcc-4.6 and older.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Boris Ostrovsky [Thu, 15 Dec 2016 15:00:58 +0000 (10:00 -0500)]
cpufreq: Remove CPU hotplug callbacks only if they were initialized
Since CPU hotplug callbacks are requested for CPUHP_AP_ONLINE_DYN state,
successful callback initialization will result in cpuhp_setup_state()
returning a positive value. Therefore acpi_cpufreq_online being zero
indicates that callbacks have not been installed.
This means that acpi_cpufreq_boost_exit() should only remove them if
acpi_cpufreq_online is positive. Trying to call
cpuhp_remove_state_nocalls(0) will cause a BUG().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Boris Ostrovsky [Thu, 15 Dec 2016 15:00:57 +0000 (10:00 -0500)]
CPU/hotplug: Clarify description of __cpuhp_setup_state() return value
When ivoked with CPUHP_AP_ONLINE_DYN state __cpuhp_setup_state()
is expected to return positive value which is the hotplug state that
the routine assigns.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 12 Dec 2016 19:44:25 +0000 (20:44 +0100)]
Merge schedutil governor updates for v4.10.
Srinivas Pandruvada [Tue, 6 Dec 2016 21:32:17 +0000 (13:32 -0800)]
Documentation: intel_pstate: Document HWP energy/performance hints
Updated documentation for the support of energy performance hint in
the HWP mode.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Tue, 6 Dec 2016 21:32:16 +0000 (13:32 -0800)]
cpufreq: intel_pstate: Support for energy performance hints with HWP
It is possible to provide hints to the HWP algorithms in the processor
to be more performance centric to more energy centric. These hints are
provided by using HWP energy performance preference (EPP) or energy
performance bias (EPB) settings.
The scope of these settings is per logical processor, which means that
each of the logical processors in the package can be programmed with a
different value.
This change provides cpufreq sysfs interface to provide hint. For each
policy, two additional attributes will be available to check and provide
hint. These attributes will only be present when the intel_pstate driver
is using HWP mode.
These attributes are:
- energy_performance_available_preferences
- energy_performance_preference
To get list of supported hints:
$ cat energy_performance_available_preferences
default performance balance_performance balance_power power
The current preference can be read or changed via cpufreq sysfs
attribute "energy_performance_preference". Reading from this attribute
will display current effective setting changed via any method. User can
write any of the valid preference string to this attribute. User can
always restore to power-on default by writing "default".
Implementation
Since these hints can be provided by direct MSR write or using some tools
like x86_energy_perf_policy, the driver internally doesn't maintain any
state. The user operation will result in direct read/write of MSR: 0x774
(HWP_REQUEST_MSR). Also driver use read modify write to update other
fields in this MSR.
Summary of changes:
- struct cpudata field epp_saved is renamed to epp_powersave, as this
stores the value to restore once policy is switched from performance
to powersave to restore original powersave EPP value.
- A new struct cpudata field epp_saved is used to store the raw MSR
EPP/EPB value when a CPU goes offline or on suspend and restore on
online/resume. This ensures that EPP value is restored to correct
value irrespective of the means used to set.
- EPP/EPB value ranges are fixed for each preference, which can be
set for the cpufreq sysfs, so user request is mapped to/from this
range.
- New attributes are only added when HWP is present.
- Since EPP value of 0 is valid the fields are initialized to
-EINVAL when not valid. The field epp_default is read only once
after powerup to avoid reading on subsequent CPU online operation
- New suspend callback to store epp on suspend operation
- Don't invalidate old epp_saved field on resume and online as now
we can restore last epp value on suspend and this field can still
have old EPP value sampled during switch to performance from
powersave.
- While here optimized setting of cpu_data->epp_powersave = epp in
intel_pstate_hwp_set() as this was done in both true and false
paths.
- epp/epb set function returns error to caller on failure to pass
on to user space for display.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Tue, 6 Dec 2016 21:32:15 +0000 (13:32 -0800)]
cpufreq: intel_pstate: Add locking around HWP requests
To avoid race conditions from multiple threads, increase the scope
of intel_pstate_limits_lock to include HWP requests also.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Chen Yu [Mon, 4 Jan 2016 04:14:45 +0000 (12:14 +0800)]
cpufreq: ondemand: Set MIN_FREQUENCY_UP_THRESHOLD to 1
Currently the minimal up_threshold is 11, and user may want to
use a smaller minimal up_threshold for performance tuning,
so MIN_FREQUENCY_UP_THRESHOLD could be set to 1 because:
1. Current systems wouldn't be affected as they have already
a value >= 11.
2. New systems with a default kernel would keep still the default
value that is >= 11.
Users now have the advantage that they can make their own decisions
and customize the 'trip point' to switch to the max frequency.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=65501
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Piotr Luc [Thu, 13 Oct 2016 15:31:00 +0000 (17:31 +0200)]
cpufreq: intel_pstate: Add Knights Mill CPUID
Add Knights Mill (KNM) to the list of CPUIDs supported by intel_pstate.
Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Wed, 30 Nov 2016 01:31:02 +0000 (02:31 +0100)]
MAINTAINERS: Add bug tracking system location entry for cpufreq
The kernel Bugzilla is used for tracking cpufreq bugs, so document
that.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Len Brown <len.brown@intel.com>
Baoyou Xie [Wed, 30 Nov 2016 07:35:29 +0000 (15:35 +0800)]
cpufreq: dt: Add support for zx296718
Add the compatible string for supporting the generic cpufreq driver on
the ZTE's zx296718 SoC.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sebastian Andrzej Siewior [Mon, 28 Nov 2016 09:52:03 +0000 (10:52 +0100)]
cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage
The online / pre_down callback is invoked on the target CPU since commit
1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu") which means
for the hotplug callback we can use rmdsrl() instead of rdmsr_on_cpus().
This leaves us with set_boost() as the only user which still needs to
read/write the MSR on different CPUs. There is no point in doing that
update on all cpus with the read modify write magic via per cpu data. We
simply can issue a function call on all online CPUs which also means that we
need half that many IPIs.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sebastian Andrzej Siewior [Mon, 28 Nov 2016 09:51:02 +0000 (10:51 +0100)]
cpufreq: acpi-cpufreq: Convert to hotplug state machine
Install the callbacks via the state machine.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Arnd Bergmann [Fri, 25 Nov 2016 16:50:20 +0000 (17:50 +0100)]
cpufreq: intel_pstate: fix intel_pstate_exit_perf_limits() prototype
The addition of the generic governor support marked the
intel_pstate_exit_perf_limits as inline(), which fixed a warning,
but it introduced another warning:
drivers/cpufreq/intel_pstate.c: In function ‘intel_pstate_exit_perf_limits’:
drivers/cpufreq/intel_pstate.c:483:1: error: no return statement in function returning non-void [-Werror=return-type]
This changes it back to a 'void' return type, and changes the
corresponding intel_pstate_init_acpi_perf_limits() function to
be inline as well for consistency.
Fixes:
001c76f05b01 (cpufreq: intel_pstate: Generic governors support)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Fri, 25 Nov 2016 00:07:10 +0000 (16:07 -0800)]
cpufreq: intel_pstate: Set EPP/EPB to 0 in performance mode
When user has selected performance policy, then set the EPP (Energy
Performance Preference) or EPB (Energy Performance Bias) to maximum
performance mode.
Also when user switch back to powersave, then restore EPP/EPB to last
EPP/EPB value before entering performance mode. If user has not changed
EPP/EPB manually then it will be power on default value.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Thu, 24 Nov 2016 08:21:11 +0000 (13:51 +0530)]
cpufreq: schedutil: Rectify comment in sugov_irq_work() function
This patch rectifies a comment present in sugov_irq_work() function to
follow proper grammar.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Tue, 22 Nov 2016 00:33:20 +0000 (16:33 -0800)]
cpufreq: intel_pstate: increase precision of performance limits
Even with round up of limits->min_perf and limits->max_perf, in some
cases resultant performance is 100 MHz less than the desired.
For example when the maximum frequency is 3.50 GHz, setting
scaling_min_frequency to 2.3 GHz always results in 2.2 GHz minimum.
Currently the fixed floating point operation uses 8 bit precision for
calculating limits->min_perf and limits->max_perf. For some operations
in this driver the 14 bit precision is used. Using the 14 bit precision
also for calculating limits->min_perf and limits->max_perf, addresses
this issue.
Introduced fp_ext_toint() equivalent to fp_toint() and int_ext_tofp()
equivalent to int_tofp() with 14 bit precision.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Tue, 22 Nov 2016 00:33:19 +0000 (16:33 -0800)]
cpufreq: intel_pstate: round up min_perf limits
In some use cases, user wants to enforce a minimum performance limit on
CPUs. But because of simple division the resultant performance is 100 MHz
less than the desired in some cases.
For example when the maximum frequency is 3.50 GHz, setting
scaling_min_frequency to 1.6 GHz always results in 1.5 GHz minimum. With
simple round up, the frequency can be set to 1.6 GHz to minimum in this
case. This round up is already done to max_policy_pct and max_perf, so do
the same for min_policy_pct and min_perf.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Fri, 18 Nov 2016 12:59:21 +0000 (13:59 +0100)]
cpufreq: Make cpufreq_update_policy() void
The return value of cpufreq_update_policy() is never used, so make
it void.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Rafael J. Wysocki [Fri, 18 Nov 2016 12:57:54 +0000 (13:57 +0100)]
ACPI / processor: Make acpi_processor_ppc_has_changed() void
The return value of acpi_processor_ppc_has_changed() is never used,
so make it void.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Rafael J. Wysocki [Fri, 18 Nov 2016 12:40:45 +0000 (13:40 +0100)]
cpufreq: Avoid using inactive policies
There are two places in the cpufreq core in which low-level driver
callbacks may be invoked for an inactive cpufreq policy, which isn't
guaranteed to work in general. Both are due to possible races with
CPU offline.
First, in cpufreq_get(), the policy may become inactive after
the check against policy->cpus in cpufreq_cpu_get() and before
policy->rwsem is acquired, in which case using it going forward may
not be correct.
Second, an analogous situation is possible in cpufreq_update_policy().
Avoid using inactive policies by adding policy_is_inactive() checks
to the code in the above places.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Rafael J. Wysocki [Thu, 17 Nov 2016 22:34:17 +0000 (23:34 +0100)]
cpufreq: intel_pstate: Generic governors support
There may be reasons to use generic cpufreq governors (eg. schedutil)
on Intel platforms instead of the intel_pstate driver's internal
governor. However, that currently can only be done by disabling
intel_pstate altogether and using the acpi-cpufreq driver instead
of it, which is subject to limitations.
First of all, acpi-cpufreq only works on systems where the _PSS
object is present in the ACPI tables for all logical CPUs. Second,
on those systems acpi-cpufreq will only use frequencies listed by
_PSS which may be suboptimal. In particular, by convention, the
whole turbo range is represented in _PSS as a single P-state and
the frequency assigned to it is greater by 1 MHz than the greatest
non-turbo frequency listed by _PSS. That may confuse governors to
use turbo frequencies less frequently which may lead to suboptimal
performance.
For this reason, make it possible to use the intel_pstate driver
with generic cpufreq governors as a "normal" cpufreq driver. That
mode is enforced by adding intel_pstate=passive to the kernel
command line and cannot be disabled at run time. In that mode,
intel_pstate provides a cpufreq driver interface including
the ->target() and ->fast_switch() callbacks and is listed in
scaling_driver as "intel_cpufreq".
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Doug Smythies <dsmythies@telus.net>
Rafael J. Wysocki [Thu, 17 Nov 2016 21:47:47 +0000 (22:47 +0100)]
cpufreq: intel_pstate: Request P-states control from SMM if needed
Currently, intel_pstate is unable to control P-states on my
IvyBridge-based Acer Aspire S5, because they are controlled by SMM
on that machine by default and it is necessary to request OS control
of P-states from it via the SMI Command register exposed in the ACPI
FADT. intel_pstate doesn't do that now, but acpi-cpufreq and other
cpufreq drivers for x86 platforms do.
Address this problem by making intel_pstate use the ACPI-defined
mechanism as well. However, intel_pstate is not modular and it
doesn't need the module refcount tricks played by
acpi_processor_notify_smm(), so export the core of this function
to it as acpi_processor_pstate_control() and make it call that.
[The changes in processor_perflib.c related to this should not
make any functional difference for the acpi_processor_notify_smm()
users].
To be safe, only call acpi_processor_notify_smm() from intel_pstate
if ACPI _PPC support is enabled in it.
Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Geert Uytterhoeven [Wed, 16 Nov 2016 10:05:51 +0000 (11:05 +0100)]
cpufreq: dt: Add support for r8a7743 and r8a7745
Add the compatible strings for supporting the generic cpufreq driver on
the Renesas RZ/G1M (r8a7743) and RZ/G1E (r8a7745) SoCs.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Denis Kirjanov [Tue, 8 Nov 2016 10:39:28 +0000 (05:39 -0500)]
cpufreq: powernv: Disable preemption while checking CPU throttling state
With preemption turned on we can read incorrect throttling state
while being switched to CPU on a different chip.
BUG: using smp_processor_id() in preemptible [
00000000] code: cat/7343
caller is .powernv_cpufreq_throttle_check+0x2c/0x710
CPU: 13 PID: 7343 Comm: cat Not tainted 4.8.0-rc5-dirty #1
Call Trace:
[
c0000007d25b75b0] [
c000000000971378] .dump_stack+0xe4/0x150 (unreliable)
[
c0000007d25b7640] [
c0000000005162e4] .check_preemption_disabled+0x134/0x150
[
c0000007d25b76e0] [
c0000000007b63ac] .powernv_cpufreq_throttle_check+0x2c/0x710
[
c0000007d25b7790] [
c0000000007b6d18] .powernv_cpufreq_target_index+0x288/0x360
[
c0000007d25b7870] [
c0000000007acee4] .__cpufreq_driver_target+0x394/0x8c0
[
c0000007d25b7920] [
c0000000007b22ac] .cpufreq_set+0x7c/0xd0
[
c0000007d25b79b0] [
c0000000007adf50] .store_scaling_setspeed+0x80/0xc0
[
c0000007d25b7a40] [
c0000000007ae270] .store+0xa0/0x100
[
c0000007d25b7ae0] [
c0000000003566e8] .sysfs_kf_write+0x88/0xb0
[
c0000007d25b7b70] [
c0000000003553b8] .kernfs_fop_write+0x178/0x260
[
c0000007d25b7c10] [
c0000000002ac3cc] .__vfs_write+0x3c/0x1c0
[
c0000007d25b7cf0] [
c0000000002ad584] .vfs_write+0xc4/0x230
[
c0000007d25b7d90] [
c0000000002aeef8] .SyS_write+0x58/0x100
[
c0000007d25b7e30] [
c00000000000bfec] system_call+0x38/0xfc
Fixes:
09a972d16209 (cpufreq: powernv: Report cpu frequency throttling)
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 15 Nov 2016 08:23:23 +0000 (13:53 +0530)]
cpufreq: schedutil: irq-work and mutex are only used in slow path
Execute the irq-work specific initialization/exit code only when the
fast path isn't available.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 15 Nov 2016 08:23:22 +0000 (13:53 +0530)]
cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
If slow path frequency changes are conducted in a SCHED_OTHER context
then they may be delayed for some amount of time, including
indefinitely, when real time or deadline activity is taking place.
Move the slow path to a real time kernel thread. In the future the
thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
to 50 for now.
Hackbench results on ARM Exynos, dual core A15 platform for 10
iterations:
$ hackbench -s 100 -l 100 -g 10 -f 20
Before After
---------------------------------
1.808 1.603
1.847 1.251
2.229 1.590
1.952 1.600
1.947 1.257
1.925 1.627
2.694 1.620
1.258 1.621
1.919 1.632
1.250 1.240
Average:
1.8829 1.5041
Based on initial work by Steve Muckle.
Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 15 Nov 2016 08:23:21 +0000 (13:53 +0530)]
cpufreq: schedutil: enable fast switch earlier
The fast_switch_enabled flag will be used by both sugov_policy_alloc()
and sugov_policy_free() with a later patch.
Prepare for that by moving the calls to enable and disable it to the
beginning of sugov_init() and end of sugov_exit().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 15 Nov 2016 08:23:20 +0000 (13:53 +0530)]
cpufreq: schedutil: Avoid indented labels
Switch to the more common practice of writing labels.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stratos Karafotis [Wed, 16 Nov 2016 19:27:22 +0000 (21:27 +0200)]
cpufreq: conservative: Fix comment explaining frequency updates
The original comment about the frequency increase to maximum is wrong.
Both increase and decrease happen at steps.
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stratos Karafotis [Wed, 16 Nov 2016 17:26:29 +0000 (19:26 +0200)]
cpufreq: conservative: Decrease frequency faster for deferred updates
Conservative governor changes the CPU frequency in steps.
That means that if a CPU runs at max frequency, it will need several
sampling periods to return to min frequency when the workload
is finished.
If the update function that calculates the load and target frequency
is deferred, the governor might need even more time to decrease the
frequency.
This may have impact to power consumption and after all conservative
should decrease the frequency if there is no workload at every sampling
rate.
To resolve the above issue calculate the number of sampling periods
that the update is deferred. Considering that for each sampling period
conservative should drop the frequency by a freq_step because the
CPU was idle apply the proper subtraction to requested frequency.
Below, the kernel trace with and without this patch. First an
intensive workload is applied on a specific CPU. Then the workload
is removed and the CPU goes to idle.
WITHOUT
<idle>-0 [007] dN.. 620.329153: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 620.350857: cpu_frequency: state=1700000 cpu_id=7
kworker/7:2-556 [007] .... 620.370856: cpu_frequency: state=1900000 cpu_id=7
kworker/7:2-556 [007] .... 620.390854: cpu_frequency: state=2100000 cpu_id=7
kworker/7:2-556 [007] .... 620.411853: cpu_frequency: state=2200000 cpu_id=7
kworker/7:2-556 [007] .... 620.432854: cpu_frequency: state=2400000 cpu_id=7
kworker/7:2-556 [007] .... 620.453854: cpu_frequency: state=2600000 cpu_id=7
kworker/7:2-556 [007] .... 620.494856: cpu_frequency: state=2900000 cpu_id=7
kworker/7:2-556 [007] .... 620.515856: cpu_frequency: state=3100000 cpu_id=7
kworker/7:2-556 [007] .... 620.536858: cpu_frequency: state=3300000 cpu_id=7
kworker/7:2-556 [007] .... 620.557857: cpu_frequency: state=3401000 cpu_id=7
<idle>-0 [007] d... 669.591363: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 669.591939: cpu_idle: state=
4294967295 cpu_id=7
<idle>-0 [007] d... 669.591980: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] dN.. 669.591989: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 670.201224: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 670.221975: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 670.222016: cpu_frequency: state=3300000 cpu_id=7
<idle>-0 [007] d... 670.222026: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 670.234964: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 670.801251: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 671.236046: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 671.236073: cpu_frequency: state=3100000 cpu_id=7
<idle>-0 [007] d... 671.236112: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 671.393437: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 671.401277: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 671.404083: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 671.404111: cpu_frequency: state=2900000 cpu_id=7
<idle>-0 [007] d... 671.404125: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 671.404974: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 671.501180: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 671.995414: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 671.995459: cpu_frequency: state=2800000 cpu_id=7
<idle>-0 [007] d... 671.995469: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 671.996287: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 672.001305: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.078374: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 672.078410: cpu_frequency: state=2600000 cpu_id=7
<idle>-0 [007] d... 672.078419: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.158020: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 672.158040: cpu_frequency: state=2400000 cpu_id=7
<idle>-0 [007] d... 672.158044: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.160038: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 672.234557: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.237121: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 672.237174: cpu_frequency: state=2100000 cpu_id=7
<idle>-0 [007] d... 672.237186: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.237778: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 672.267902: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.269860: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 672.269906: cpu_frequency: state=1900000 cpu_id=7
<idle>-0 [007] d... 672.269914: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.271902: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 672.751342: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 672.823056: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-556 [007] .... 672.823095: cpu_frequency: state=1600000 cpu_id=7
WITH
<idle>-0 [007] dN.. 4380.928009: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-399 [007] .... 4380.949767: cpu_frequency: state=2000000 cpu_id=7
kworker/7:2-399 [007] .... 4380.969765: cpu_frequency: state=2200000 cpu_id=7
kworker/7:2-399 [007] .... 4381.009766: cpu_frequency: state=2500000 cpu_id=7
kworker/7:2-399 [007] .... 4381.029767: cpu_frequency: state=2600000 cpu_id=7
kworker/7:2-399 [007] .... 4381.049769: cpu_frequency: state=2800000 cpu_id=7
kworker/7:2-399 [007] .... 4381.069769: cpu_frequency: state=3000000 cpu_id=7
kworker/7:2-399 [007] .... 4381.089771: cpu_frequency: state=3100000 cpu_id=7
kworker/7:2-399 [007] .... 4381.109772: cpu_frequency: state=3400000 cpu_id=7
kworker/7:2-399 [007] .... 4381.129773: cpu_frequency: state=3401000 cpu_id=7
<idle>-0 [007] d... 4428.226159: cpu_idle: state=1 cpu_id=7
<idle>-0 [007] d... 4428.226176: cpu_idle: state=
4294967295 cpu_id=7
<idle>-0 [007] d... 4428.226181: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 4428.227177: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 4428.551640: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 4428.649239: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-399 [007] .... 4428.649268: cpu_frequency: state=2800000 cpu_id=7
<idle>-0 [007] d... 4428.649278: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 4428.689856: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 4428.799542: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 4428.801683: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-399 [007] .... 4428.801748: cpu_frequency: state=1700000 cpu_id=7
<idle>-0 [007] d... 4428.801761: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 4428.806545: cpu_idle: state=
4294967295 cpu_id=7
...
<idle>-0 [007] d... 4429.051880: cpu_idle: state=4 cpu_id=7
<idle>-0 [007] d... 4429.086240: cpu_idle: state=
4294967295 cpu_id=7
kworker/7:2-399 [007] .... 4429.086293: cpu_frequency: state=1600000 cpu_id=7
Without the patch the CPU dropped to min frequency after 3.2s
With the patch applied the CPU dropped to min frequency after 0.86s
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Mon, 14 Nov 2016 07:00:43 +0000 (12:30 +0530)]
cpufreq: conservative: Rename get_freq_target() to get_freq_step()
What's returned from this function is the delta by which the frequency
must be increased or decreased and not the final frequency that should
be selected.
Name it properly to match its purpose. Also update the variables used to
store that value.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Akshay Adiga [Mon, 14 Nov 2016 11:59:27 +0000 (17:29 +0530)]
cpufreq: powernv: Fix uninitialized lpstate_idx in gpstates_timer_handler()
lpstate_idx remains uninitialized in the case when elapsed_time
is greater than MAX_RAMP_DOWN_TIME. At the end of rampdown the
global pstate should be equal to the local pstate.
Fixes:
20b15b766354 (cpufreq: powernv: Use PMCR to verify global and localpstate)
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Mon, 14 Nov 2016 18:31:11 +0000 (10:31 -0800)]
cpufreq: intel_pstate: Use CPU load based algorithm for PM_MOBILE
Use get_target_pstate_use_cpu_load() to calculate target P-State for
devices, with the preferred power management profile in ACPI FADT
set to PM_MOBILE.
This may help in resolving some thermal issues caused by low sustained
cpu bound workloads. The current algorithm tend to over provision in this
case as it doesn't look at the CPU busyness.
Also included the fix from Arnd Bergmann <arnd@arndb.de> to solve compile
issue, when CONFIG_ACPI is not defined.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Sun, 13 Nov 2016 18:32:32 +0000 (10:32 -0800)]
Linux 4.9-rc5
Linus Torvalds [Sun, 13 Nov 2016 18:28:53 +0000 (10:28 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM fixes. There are a couple pending x86 patches but they'll have to
wait for next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs
KVM: arm/arm64: vgic: Prevent access to invalid SPIs
arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
Linus Torvalds [Sun, 13 Nov 2016 18:26:05 +0000 (10:26 -0800)]
Merge branch 'media-fixes' (patches from Mauro)
Merge media fixes from Mauro Carvalho Chehab:
"This contains two patches fixing problems with my patch series meant
to make USB drivers to work again after the DMA on stack changes.
The last patch on this series is actually not related to DMA on stack.
It solves a longstanding bug affecting module unload, causing
module_put() to be called twice. It was reported by the user who
reported and tested the issues with the gp8psk driver with the DMA
fixup patches. As we're late at -rc cycle, maybe you prefer to not
apply it right now. If this is the case, I'll add to the pile of
patches for 4.10.
Exceptionally this time, I'm sending the patches via e-mail, because
I'm on another trip, and won't be able to use the usual procedure
until Monday. Also, it is only three patches, and you followed already
the discussions about the first one"
* emailed patches from Mauro Carvalho Chehab <mchehab@osg.samsung.com>:
gp8psk: Fix DVB frontend attach
gp8psk: fix gp8psk_usb_in_op() logic
dvb-usb: move data_mutex to struct dvb_usb_device
Linus Torvalds [Sun, 13 Nov 2016 18:24:08 +0000 (10:24 -0800)]
Merge tag 'char-misc-4.9-rc5' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are three small driver fixes for some reported issues for
4.9-rc5.
One for the hyper-v subsystem, fixing up a naming issue that showed up
in 4.9-rc1, one mei driver fix, and one fix for parallel ports,
resolving a reported regression.
All have been in linux-next with no reported issues"
* tag 'char-misc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
ppdev: fix double-free of pp->pdev->name
vmbus: make sysfs names consistent with PCI
mei: bus: fix received data size check in NFC fixup
Linus Torvalds [Sun, 13 Nov 2016 18:22:07 +0000 (10:22 -0800)]
Merge tag 'driver-core-4.9-rc5' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two driver core fixes for 4.9-rc5.
The first resolves an issue with some drivers not liking to be unbound
and bound again (if CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled), which
solves some reported problems with graphics and storage drivers. The
other resolves a smatch error with the 4.9-rc1 driver core changes
around this feature.
Both have been in linux-next with no reported issues"
* tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: fix smatch warning on dev->bus check
driver core: skip removal test for non-removable drivers
Linus Torvalds [Sun, 13 Nov 2016 18:13:33 +0000 (10:13 -0800)]
Merge tag 'staging-4.9-rc5' of git://git./linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Grek KH:
"Here are a few small staging and iio driver fixes for reported issues.
The last one was cherry-picked from my -next branch to resolve a build
warning that Arnd fixed, in his quest to be able to turn
-Wmaybe-uninitialized back on again. That patch, and all of the
others, have been in linux-next for a while with no reported issues"
* tag 'staging-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: maxim_thermocouple: detect invalid storage size in read()
staging: nvec: remove managed resource from PS2 driver
Revert "staging: nvec: ps2: change serio type to passthrough"
drivers: staging: nvec: remove bogus reset command for PS/2 interface
staging: greybus: arche-platform: fix device reference leak
staging: comedi: ni_tio: fix buggy ni_tio_clock_period_ps() return value
staging: sm750fb: Fix bugs introduced by early commits
iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
iio: st_sensors: fix scale configuration for h3lis331dl
staging: iio: ad5933: avoid uninitialized variable in error case
Linus Torvalds [Sun, 13 Nov 2016 18:10:46 +0000 (10:10 -0800)]
Merge tag 'usb-4.9-rc5' of git://git./linux/kernel/git/gregkh/usb
Pull USB / PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 4.9-rc5
Nothing major, just small fixes for reported issues, all of these have
been in linux-next for a while with no reported issues"
* tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: cdc-acm: fix TIOCMIWAIT
cdc-acm: fix uninitialized variable
drivers/usb: Skip auto handoff for TI and RENESAS usb controllers
usb: musb: remove duplicated actions
usb: musb: da8xx: Don't print phy error on -EPROBE_DEFER
phy: sun4i: check PMU presence when poking unknown bit of pmu
phy-rockchip-pcie: remove deassert of phy_rst from exit callback
phy: da8xx-usb: rename the ohci device to ohci-da8xx
phy: Add reset callback for not generic phy
uwb: fix device reference leaks
usb: gadget: u_ether: remove interrupt throttling
usb: dwc3: st: add missing <linux/pinctrl/consumer.h> include
usb: dwc3: Fix error handling for core init
Linus Torvalds [Sun, 13 Nov 2016 18:09:04 +0000 (10:09 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block fixes from Jens Axboe:
"Since I mistakenly left out the lightnvm regression fix yesterday and
the aoeblk seems adequately tested at this point, might as well send
out another pull to make -rc5"
* 'for-linus' of git://git.kernel.dk/linux-block:
aoe: fix crash in page count manipulation
lightnvm: invalid offset calculation for lba_shift
Linus Torvalds [Sun, 13 Nov 2016 18:07:08 +0000 (10:07 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The megaraid_sas patch in here fixes a major regression in the last
fix set that made all megaraid_sas cards unusable. It turns out no-one
had actually tested such an "obvious" fix, sigh. The fix for the fix
has been tested ...
The next most serious is the vmw_pvscsi abort problem which basically
means that aborts don't work on the vmware paravirt devices and error
handling always escalates to reset.
The rest are an assortment of missed reference counting in certain
paths and corner case bugs that show up on some architectures"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove
scsi: qla2xxx: do not queue commands when unloading
scsi: libcxgbi: fix incorrect DDP resource cleanup
scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
scsi: scsi_dh_alua: Fix a reference counting bug
scsi: vmw_pvscsi: return SUCCESS for successful command aborts
scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
scsi: scsi_dh_alua: fix missing kref_put() in alua_rtpg_work()
Linus Torvalds [Sun, 13 Nov 2016 18:04:55 +0000 (10:04 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"The typical collection of minor bug fixes in clk drivers. We don't
have anything in the core framework here, just driver fixes.
There's a boot fix for Samsung devices and a safety measure for qoriq
to prevent CPUs from running too fast. There's also a fix for i.MX6Q
to properly handle audio clock rates. We also have some "that's
obviously wrong" fixes like bad NULL pointer checks in the MPP driver
and a poor usage of __pa in the xgene clk driver that are fixed here"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: mmp: pxa910: fix return value check in pxa910_clk_init()
clk: mmp: pxa168: fix return value check in pxa168_clk_init()
clk: mmp: mmp2: fix return value check in mmp2_clk_init()
clk: qoriq: Don't allow CPU clocks higher than starting value
clk: imx: fix integer overflow in AV PLL round rate
clk: xgene: Don't call __pa on ioremaped address
clk/samsung: Use CLK_OF_DECLARE_DRIVER initialization method for CLKOUT
clk: rockchip: don't return NULL when failing to register ddrclk branch
Mauro Carvalho Chehab [Sat, 12 Nov 2016 14:46:28 +0000 (12:46 -0200)]
gp8psk: Fix DVB frontend attach
The DVB binding schema at the DVB core assumes that the frontend is a
separate driver. Faling to do that causes OOPS when the module is
removed, as it tries to do a symbol_put_addr on an internal symbol,
causing craches like:
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1
Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
Call Trace:
dump_stack+0x44/0x64
__warn+0xfa/0x120
module_put+0x57/0x70
module_put+0x57/0x70
warn_slowpath_null+0x23/0x30
module_put+0x57/0x70
gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
symbol_put_addr+0x27/0x50
dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
From Derek's tests:
"Attach bug is fixed, tuning works, module unloads without
crashing. Everything seems ok!"
Reported-by: Derek <user.vdr@gmail.com>
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mauro Carvalho Chehab [Sat, 12 Nov 2016 14:46:27 +0000 (12:46 -0200)]
gp8psk: fix gp8psk_usb_in_op() logic
Commit
bc29131ecb10 ("[media] gp8psk: don't do DMA on stack") fixed the
usage of DMA on stack, but the memcpy was wrong for gp8psk_usb_in_op().
Fix it.
From Derek's email:
"Fix confirmed using 2 different Skywalker models with
HD mpeg4, SD mpeg2."
Suggested-by: Johannes Stezenbach <js@linuxtv.org>
Fixes:
bc29131ecb10 ("[media] gp8psk: don't do DMA on stack")
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mauro Carvalho Chehab [Sat, 12 Nov 2016 14:46:26 +0000 (12:46 -0200)]
dvb-usb: move data_mutex to struct dvb_usb_device
The data_mutex is initialized too late, as it is needed for
each device driver's power control, causing an OOPS:
dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 PGD 0
Oops: 0002 [#1] SMP
Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
task:
ffff88020e943840 task.stack:
ffff8801f36ec000
RIP: 0010:[<
ffffffff846617af>] [<
ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100
RSP: 0018:
ffff8801f36efb10 EFLAGS:
00010282
RAX:
0000000000000000 RBX:
ffff88021509bdc8 RCX:
00000000c0000100
RDX:
0000000000000001 RSI:
0000000000000000 RDI:
ffff88021509bdcc
RBP:
ffff8801f36efb58 R08:
ffff88021f216320 R09:
0000000000100000
R10:
ffff88021f216320 R11:
00000023fee6c5a1 R12:
ffff88020e943840
R13:
ffff88021509bdcc R14:
00000000ffffffff R15:
ffff88021509bdd0
FS:
00007f21adb86740(0000) GS:
ffff88021f200000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000000 CR3:
0000000215bce000 CR4:
00000000001406f0
Call Trace:
mutex_lock+0x16/0x25
cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
usb_probe_interface+0xf3/0x2a0
driver_probe_device+0x208/0x2b0
__driver_attach+0x87/0x90
driver_probe_device+0x2b0/0x2b0
bus_for_each_dev+0x52/0x80
bus_add_driver+0x1a3/0x220
driver_register+0x56/0xd0
usb_register_driver+0x77/0x130
do_one_initcall+0x46/0x180
free_vmap_area_noflush+0x38/0x70
kmem_cache_alloc+0x84/0xc0
do_init_module+0x50/0x1be
load_module+0x1d8b/0x2100
find_symbol_in_section+0xa0/0xa0
SyS_finit_module+0x89/0x90
entry_SYSCALL_64_fastpath+0x13/0x94
Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43 RIP [<
ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 RSP <
ffff8801f36efb10>
CR2:
0000000000000000
So, move it to the struct dvb_usb_device and initialize it
before calling the driver's callbacks.
Reported-by: Jörg Otte <jrg.otte@gmail.com>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Tue, 25 Oct 2016 15:55:04 +0000 (17:55 +0200)]
iio: maxim_thermocouple: detect invalid storage size in read()
As found by gcc -Wmaybe-uninitialized, having a storage_bytes value other
than 2 or 4 will result in undefined behavior:
drivers/iio/temperature/maxim_thermocouple.c: In function 'maxim_thermocouple_read':
drivers/iio/temperature/maxim_thermocouple.c:141:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This probably cannot happen, but returning -EINVAL here is appropriate
and makes gcc happy and the code more robust.
Fixes:
231147ee77f3 ("iio: maxim_thermocouple: Align 16 bit big endian value of raw reads")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
(cherry picked from commit
32cb7d27e65df9daa7cee8f1fdf7b259f214bee2)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jens Axboe [Sat, 12 Nov 2016 01:28:50 +0000 (18:28 -0700)]
aoe: fix crash in page count manipulation
aoeblk contains some mysterious code, that wants to elevate the bio
vec page counts while it's under IO. That is not needed, it's
fragile, and it's causing kernel oopses for some.
Reported-by: Tested-by: Don Koch <kochd@us.ibm.com>
Tested-by: Tested-by: Don Koch <kochd@us.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Matias Bjørling [Thu, 10 Nov 2016 11:26:57 +0000 (12:26 +0100)]
lightnvm: invalid offset calculation for lba_shift
The ns->lba_shift assumes its value to be the logarithmic of the
LA size. A previous patch duplicated the lba_shift calculation into
lightnvm. It prematurely also subtracted a 512byte shift, which commonly
is applied per-command. The 512byte shift being subtracted twice led to
data loss when restoring the logical to physical mapping table from
device and when issuing I/O commands using rrpc.
Fix offset by removing the 512byte shift subtraction when calculating
lba_shift.
Fixes:
b0b4e09c1ae7 "lightnvm: control life of nvm_dev in driver"
Reported-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Linus Torvalds [Sat, 12 Nov 2016 01:02:01 +0000 (17:02 -0800)]
Merge tag 'acpi-4.9-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix a recent regression in the 8250_dw serial driver introduced by
adding a quirk for the APM X-Gene SoC to it which uncovered an issue
related to the handling of built-in device properties in the core ACPI
device enumeration code (Heikki Krogerus)"
* tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / platform: Add support for build-in properties
Linus Torvalds [Sat, 12 Nov 2016 00:54:23 +0000 (16:54 -0800)]
Merge tag 'pm-4.9-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two bugs in error code paths in the PM core (system-wide
suspend of devices), a device reference leak in the boot-time suspend
test code and a cpupower utility regression from the 4.7 cycle.
Specifics:
- Prevent the PM core from attempting to suspend parent devices if
any of their children, whose suspend callbacks were invoked
asynchronously, have failed to suspend during the "late" and
"noirq" phases of system-wide suspend of devices (Brian Norris).
- Prevent the boot-time system suspend test code from leaking a
reference to the RTC device used by it (Johan Hovold).
- Fix cpupower to use the return value of one of its library
functions correctly and restore the correct behavior of it when
used for setting cpufreq tunables broken during the 4.7 development
cycle (Laura Abbott)"
* tag 'pm-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
PM / sleep: fix device reference leak in test_suspend
cpupower: Correct return type of cpu_power_is_cpu_online() in cpufreq-set
Linus Torvalds [Sat, 12 Nov 2016 00:51:50 +0000 (16:51 -0800)]
Merge tag 'arc-4.9-rc5' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- mmap handler for dma ops as generic handler no longer works for us
[Alexey]
- Fixes for EZChip platform [Noam]
- Fix RTC clocksource driver build issue
- ARC IRQ handling fixes [Yuriy]
- Revert a recent makefile change which doesn't go well with oldish
tools out in the wild
* tag 'arc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARCv2: MCIP: Use IDU_M_DISTRI_DEST mode if there is only 1 destination core
ARC: IRQ: Do not use hwirq as virq and vice versa
ARC: [plat-eznps] set default baud for early console
ARC: [plat-eznps] remove IPI clear from SMP operations
Revert "ARC: build: retire old toggles"
ARC: timer: rtc: implement read loop in "C" vs. inline asm
ARC: change return value of userspace cmpxchg assist syscall
arc: Implement arch-specific dma_map_ops.mmap
ARC: [SMP] avoid overriding present cpumask
ARC: Enable PERF_EVENTS in nSIM driven platforms
Linus Torvalds [Sat, 12 Nov 2016 00:48:49 +0000 (16:48 -0800)]
Merge tag 'platform-drivers-x86-v4.9-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver fixes from Darren Hart:
"Minor doc fix, a DMI match for ideapad and a fix to toshiba-wmi to
avoid loading on non-toshiba systems.
Documentation/ABI:
- ibm_rtl: The "What:" fields are incomplete
toshiba-wmi:
- Fix loading the driver on non Toshiba laptops
ideapad-laptop:
- Add another DMI entry for Yoga 900"
* tag 'platform-drivers-x86-v4.9-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
Documentation/ABI: ibm_rtl: The "What:" fields are incomplete
toshiba-wmi: Fix loading the driver on non Toshiba laptops
ideapad-laptop: Add another DMI entry for Yoga 900
Linus Torvalds [Sat, 12 Nov 2016 00:42:03 +0000 (16:42 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Two small (really, one liners both of them!) fixes that should go into
this series:
- Request allocation error handling fix for nbd, from Christophe,
fixing a regression in this series.
- An oops fix for drbd. Not a regression in this series, but stable
material. From Richard"
* 'for-linus' of git://git.kernel.dk/linux-block:
drbd: Fix kernel_sendmsg() usage - potential NULL deref
nbd: Fix error handling
Linus Torvalds [Sat, 12 Nov 2016 00:38:26 +0000 (16:38 -0800)]
Merge tag 'pci-v4.9-fixes-3' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Update MAINTAINERS for Intel VMD driver filename
- Update Rockchip rk3399 host bridge driver DTS and resets
- Fix ROM shadow problem that made some video device initialization
fail
* tag 'pci-v4.9-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: VMD: Update filename to reflect move
arm64: dts: rockchip: add three new resets for rk3399 PCIe controller
PCI: rockchip: Add three new resets as required properties
PCI: Don't attempt to claim shadow copies of ROM
Linus Torvalds [Sat, 12 Nov 2016 00:25:28 +0000 (16:25 -0800)]
Merge tag 'drm-fixes-for-v4.9-rc5' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"AMD, radeon, i915, imx, msm and udl fixes:
- amdgpu/radeon have a number of power management regressions and
fixes along with some better error checking
- imx has a single regression fix
- udl has a single kmalloc instead of stack for usb control msg fix
- msm has some fixes for modesetting bugs and regressions
- i915 has a one fix for a Sandybridge regression along with some
others for DP audio.
They all seem pretty okay at this stage, we've got one MST fix I know
going through process for i915, but I expect it'll be next week"
* tag 'drm-fixes-for-v4.9-rc5' of git://people.freedesktop.org/~airlied/linux: (30 commits)
drm/udl: make control msg static const. (v2)
drm/amd/powerplay: implement get_clock_by_type for iceland.
drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2)
drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland
drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk
drm/imx: disable planes before DC
drm/amd/powerplay: return false instead of -EINVAL
drm/amdgpu/powerplay/smu7: fix unintialized data usage
drm/amdgpu: fix crash in acp_hw_fini
drm/i915: Limit Valleyview and earlier to only using mappable scanout
drm/i915: Round tile chunks up for constructing partial VMAs
drm/i915/dp: Extend BDW DP audio workaround to GEN9 platforms
drm/i915/dp: BDW cdclk fix for DP audio
drm/i915/vlv: Prevent enabling hpd polling in late suspend
drm/i915: Respect alternate_ddc_pin for all DDI ports
drm/msm: Fix error handling crashes seen when VRAM allocation fails
drm/msm/mdp5: 8x16 actually has 8 mixer stages
drm/msm/mdp5: no scaling support on RGBn pipes for 8x16
drm/msm/mdp5: handle non-fullscreen base plane case
drm/msm: Set CLK_IGNORE_UNUSED flag for PLL clocks
...
Linus Torvalds [Sat, 12 Nov 2016 00:23:14 +0000 (16:23 -0800)]
Merge tag 'mmc-v4.9-rc4' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix mmc card initialization for hosts not supporting HW busy
detection
- Fix mmc_test for sending commands during non-blocking write
MMC host:
- mxs: Avoid using an uninitialized
- sdhci: Restore enhanced strobe setting during runtime resume
- sdhci: Fix a couple of reset related issues
- dw_mmc: Fix a reset controller issue"
* tag 'mmc-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mxs: Initialize the spinlock prior to using it
mmc: mmc: Use 500ms as the default generic CMD6 timeout
mmc: mmc_test: Fix "Commands during non-blocking write" tests
mmc: sdhci: Fix missing enhanced strobe setting during runtime resume
mmc: sdhci: Reset cmd and data circuits after tuning failure
mmc: sdhci: Fix unexpected data interrupt handling
mmc: sdhci: Fix CMD line reset interfering with ongoing data transfer
mmc: dw_mmc: add the "reset" as name of reset controller
Documentation: synopsys-dw-mshc: add binding for reset-names
Linus Torvalds [Sat, 12 Nov 2016 00:21:20 +0000 (16:21 -0800)]
Merge tag 'pinctrl-v4.9-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"All is about drivers, no core business going on.
- Fix a host of runtime problems with the Intel Cherryview driver:
suspend/resume needs to be marshalled properly, and strange effects
from BIOS interaction during suspend/resume need to be dealt with.
- A single bit was being set wrong in the Aspeed driver.
- Fix an iProc probe ordering fallout resulting from v4.9
refactorings for bus population.
- Do not specify a default trigger in the ST Micro cascaded GPIO IRQ
controller: the kernel will moan.
- Make IRQs optional altogether on the STM32 driver, it turns out not
all systems have them or want them.
- Fix a re-probe bug in the i.MX driver, it will eventually crash if
probed repeatedly, not good"
* tag 'pinctrl-v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl-aspeed-g5: Never set SCU90[6]
pinctrl: cherryview: Prevent possible interrupt storm on resume
pinctrl: cherryview: Serialize register access in suspend/resume
pinctrl: imx: reset group index on probe
pinctrl: stm32: move gpio irqs binding to optional
pinctrl: stm32: remove dependency with interrupt controller
pinctrl: st: don't specify default interrupt trigger
pinctrl: iproc: Fix iProc and NSP GPIO support
Rafael J. Wysocki [Fri, 11 Nov 2016 22:24:58 +0000 (23:24 +0100)]
Merge branches 'pm-tools-fixes' and 'pm-sleep-fixes'
* pm-tools-fixes:
cpupower: Correct return type of cpu_power_is_cpu_online() in cpufreq-set
* pm-sleep-fixes:
PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
PM / sleep: fix device reference leak in test_suspend
Rafael J. Wysocki [Fri, 11 Nov 2016 22:23:02 +0000 (23:23 +0100)]
Merge branch 'device-properties'
* device-properties:
ACPI / platform: Add support for build-in properties
Linus Torvalds [Fri, 11 Nov 2016 18:03:01 +0000 (10:03 -0800)]
Merge branch 'maybe-uninitialized' (patches from Arnd)
Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann:
"It took a while for some patches to make it into mainline through
maintainer trees, but the 28-patch series is now reduced to 10, with
one tiny patch added at the end.
Aside from patches that are no longer required, I did these changes
compared to version 1:
- Dropped "iio: maxim_thermocouple: detect invalid storage size in
read()", which is currently in linux-next as commit
32cb7d27e65d.
This is the only remaining warning I see for a couple of corner
cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc
bot both report it on arm64)
- Dropped "brcmfmac: avoid maybe-uninitialized warning in
brcmf_cfg80211_start_ap", which is currently in net/master merge
pending.
- Dropped two x86 patches, "x86: math-emu: possible uninitialized
variable use" and "x86: mark target address as output in 'insb'
asm" as they do not seem to trigger for a default build, and I got
no feedback on them. Both of these are ancient issues and seem
harmless, I will send them again to the x86 maintainers once the
rest is merged.
- Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
feedback from Ilya Dryomov, who already has a different fix queued
up for v4.10. The kbuild bot reports this as a warning for xtensa.
- Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with
a simpler patch, this one always triggers but my first solution
would not be safe for linux-4.9 any more at this point. I'll follow
up with the larger patch as a cleanup for 4.10.
- Replaced "dib0700: fix nec repeat handling" with a better one,
contributed by Sean Young"
* -Wmaybe-uninitialized fixes:
Kbuild: enable -Wmaybe-uninitialized warnings by default
pcmcia: fix return value of soc_pcmcia_regulator_set
infiniband: shut up a maybe-uninitialized warning
crypto: aesni: shut up -Wmaybe-uninitialized warning
rc: print correct variable for z8f0811
dib0700: fix nec repeat handling
s390: pci: don't print uninitialized data for debugging
nios2: fix timer initcall return value
x86: apm: avoid uninitialized data
NFSv4.1: work around -Wmaybe-uninitialized warning
Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
Linus Torvalds [Fri, 11 Nov 2016 17:44:23 +0000 (09:44 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib/stackdepot: export save/fetch stack for drivers
mm: kmemleak: scan .data.ro_after_init
memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB
coredump: fix unfreezable coredumping task
mm/filemap: don't allow partially uptodate page for pipes
mm/hugetlb: fix huge page reservation leak in private mapping error paths
ocfs2: fix not enough credit panic
Revert "console: don't prefer first registered if DT specifies stdout-path"
mm: hwpoison: fix thp split handling in memory_failure()
swapfile: fix memory corruption via malformed swapfile
mm/cma.c: check the max limit for cma allocation
scripts/bloat-o-meter: fix SIGPIPE
shmem: fix pageflags after swapping DMA32 object
mm, frontswap: make sure allocated frontswap map is assigned
mm: remove extra newline from allocation stall warning
Linus Torvalds [Fri, 11 Nov 2016 17:19:01 +0000 (09:19 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull VFS fixes from Al Viro:
"Christoph's and Jan's aio fixes, fixup for generic_file_splice_read
(removal of pointless detritus that actually breaks it when used for
gfs2 ->splice_read()) and fixup for generic_file_read_iter()
interaction with ITER_PIPE destinations."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
splice: remove detritus from generic_file_splice_read()
mm/filemap: don't allow partially uptodate page for pipes
aio: fix freeze protection of aio writes
fs: remove aio_run_iocb
fs: remove the never implemented aio_fsync file operation
aio: hold an extra file reference over AIO read/write operations
Linus Torvalds [Fri, 11 Nov 2016 17:17:10 +0000 (09:17 -0800)]
Merge tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client
Pull Ceph fixes from Ilya Dryomov:
"Ceph's ->read_iter() implementation is incompatible with the new
generic_file_splice_read() code that went into -rc1. Switch to the
less efficient default_file_splice_read() for now; the proper fix is
being held for 4.10.
We also have a fix for a 4.8 regression and a trival libceph fixup"
* tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client:
libceph: initialize last_linger_id with a large integer
libceph: fix legacy layout decode with pool 0
ceph: use default file splice read callback
Linus Torvalds [Fri, 11 Nov 2016 17:15:30 +0000 (09:15 -0800)]
Merge tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client bugfixes from Anna Schumaker:
"Most of these fix regressions in 4.9, and none are going to stable
this time around.
Bugfixes:
- Trim extra slashes in v4 nfs_paths to fix tools that use this
- Fix a -Wmaybe-uninitialized warnings
- Fix suspicious RCU usages
- Fix Oops when mounting multiple servers at once
- Suppress a false-positive pNFS error
- Fix a DMAR failure in NFS over RDMA"
* tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect
fs/nfs: Fix used uninitialized warn in nfs4_slot_seqid_in_use()
NFS: Don't print a pNFS error if we aren't using pNFS
NFS: Ignore connections that have cl_rpcclient uninitialized
SUNRPC: Fix suspicious RCU usage
NFSv4.1: work around -Wmaybe-uninitialized warning
NFS: Trim extra slash in v4 nfs_path
Linus Torvalds [Fri, 11 Nov 2016 17:13:48 +0000 (09:13 -0800)]
Merge tag 'xfs-fixes-for-linus-4.9-rc5' of git://git./linux/kernel/git/dgc/linux-xfs
Pull xfs fix from Dave Chinner:
"This is a fix for an unmount hang (regression) when the filesystem is
shutdown. It was supposed to go to you for -rc3, but I accidentally
tagged the commit prior to it in that pullreq.
Summary:
- fix for aborting deferred transactions on filesystem shutdown"
* tag 'xfs-fixes-for-linus-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: defer should abort intent items if the trans roll fails
Arnd Bergmann [Thu, 10 Nov 2016 16:44:54 +0000 (17:44 +0100)]
Kbuild: enable -Wmaybe-uninitialized warnings by default
Previously the warnings were added back at the W=1 level and above, this
now turns them on again by default, assuming that we have addressed all
warnings and again have a clean build for v4.10.
I found a number of new warnings in linux-next already and submitted
bugfixes for those. Hopefully they are caught by the 0day builder in
the future as soon as this patch is merged.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:53 +0000 (17:44 +0100)]
pcmcia: fix return value of soc_pcmcia_regulator_set
The newly introduced soc_pcmcia_regulator_set() function sometimes
returns without setting its return code, as shown by this warning:
drivers/pcmcia/soc_common.c: In function 'soc_pcmcia_regulator_set':
drivers/pcmcia/soc_common.c:112:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This changes it to propagate the regulator_disable() result instead.
Fixes:
ac61b6001a63 ("pcmcia: soc_common: add support for Vcc and Vpp regulators")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:52 +0000 (17:44 +0100)]
infiniband: shut up a maybe-uninitialized warning
Some configurations produce this harmless warning when built with gcc
-Wmaybe-uninitialized:
infiniband/core/cma.c: In function 'cma_get_net_dev':
infiniband/core/cma.c:1242:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized]
I previously reported this for the powerpc64 defconfig, but have now
reproduced the same thing for x86 as well, using gcc-5 or higher.
The code looks correct to me, and this change just rearranges it by
making sure we alway initialize the entire address structure to make the
warning disappear. My first approach added an initialization at the
time of the declaration, which Doug commented may be too costly, so I
hope this version doesn't add overhead.
Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.7-rc6/buildall.powerpc.ppc64_defconfig.log.passed
Link: https://patchwork.kernel.org/patch/9212825/
Acked-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:51 +0000 (17:44 +0100)]
crypto: aesni: shut up -Wmaybe-uninitialized warning
The rfc4106 encrypy/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:
arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.
This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn on
this warning again by default without producing useless output. A
follow-up patch for v4.10 rearranges the code to make the warning go
away.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:50 +0000 (17:44 +0100)]
rc: print correct variable for z8f0811
A recent rework accidentally left a debugging printk untouched while
changing the meaning of the variables, leading to an uninitialized
variable being printed:
drivers/media/i2c/ir-kbd-i2c.c: In function 'get_key_haup_common':
drivers/media/i2c/ir-kbd-i2c.c:62:2: error: 'toggle' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This prints the correct one instead, as we did before the patch.
Fixes:
00bb820755ed ("[media] rc: Hauppauge z8f0811 can decode RC6")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sean Young [Thu, 10 Nov 2016 16:44:49 +0000 (17:44 +0100)]
dib0700: fix nec repeat handling
When receiving a nec repeat, ensure the correct scancode is repeated
rather than a random value from the stack. This removes the need for
the bogus uninitialized_var() and also fixes the warnings:
drivers/media/usb/dvb-usb/dib0700_core.c: In function ‘dib0700_rc_urb_completion’:
drivers/media/usb/dvb-usb/dib0700_core.c:679: warning: ‘protocol’ may be used uninitialized in this function
[sean addon: So after writing the patch and submitting it, I've bought the
hardware on ebay. Without this patch you get random scancodes
on nec repeats, which the patch indeed fixes.]
Signed-off-by: Sean Young <sean@mess.org>
Tested-by: Sean Young <sean@mess.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:48 +0000 (17:44 +0100)]
s390: pci: don't print uninitialized data for debugging
gcc correctly warns about an incorrect use of the 'pa' variable in case
we pass an empty scatterlist to __s390_dma_map_sg:
arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg':
arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized]
This adds a bogus initialization to the function to sanitize the debug
output. I would have preferred a solution without the initialization,
but I only got the report from the kbuild bot after turning on the
warning again, and didn't manage to reproduce it myself.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:47 +0000 (17:44 +0100)]
nios2: fix timer initcall return value
When called more than twice, the nios2_time_init() function return an
uninitialized value, as detected by gcc -Wmaybe-uninitialized
arch/nios2/kernel/time.c: warning: 'ret' may be used uninitialized in this function
This makes it return '0' here, matching the comment above the function.
Acked-by: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:46 +0000 (17:44 +0100)]
x86: apm: avoid uninitialized data
apm_bios_call() can fail, and return a status in its argument structure.
If that status however is zero during a call from
apm_get_power_status(), we end up using data that may have never been
set, as reported by "gcc -Wmaybe-uninitialized":
arch/x86/kernel/apm_32.c: In function ‘apm’:
arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here
arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here
This changes the function to return "APM_NO_ERROR" here, which makes the
code more robust to broken BIOS versions, and avoids the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:45 +0000 (17:44 +0100)]
NFSv4.1: work around -Wmaybe-uninitialized warning
A bugfix introduced a harmless gcc warning in nfs4_slot_seqid_in_use if
we enable -Wmaybe-uninitialized again:
fs/nfs/nfs4session.c:203:54: error: 'cur_seq' may be used uninitialized in this function [-Werror=maybe-uninitialized]
gcc is not smart enough to conclude that the IS_ERR/PTR_ERR pair results
in a nonzero return value here. Using PTR_ERR_OR_ZERO() instead makes
this clear to the compiler.
Fixes:
e09c978aae5b ("NFSv4.1: Fix Oopsable condition in server callback races")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Thu, 10 Nov 2016 16:44:44 +0000 (17:44 +0100)]
Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
Traditionally, we have always had warnings about uninitialized variables
enabled, as this is part of -Wall, and generally a good idea [1], but it
also always produced false positives, mainly because this is a variation
of the halting problem and provably impossible to get right in all cases
[2].
Various people have identified cases that are particularly bad for false
positives, and in commit
e74fc973b6e5 ("Turn off -Wmaybe-uninitialized
when building with -Os"), I turned off the warning for any build that
was done with CC_OPTIMIZE_FOR_SIZE. This drastically reduced the number
of false positive warnings in the default build but unfortunately had
the side effect of turning the warning off completely in 'allmodconfig'
builds, which in turn led to a lot of warnings (both actual bugs, and
remaining false positives) to go in unnoticed.
With commit
877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE
definition") enabled the warning again for allmodconfig builds in v4.7
and in v4.8-rc1, I had finally managed to address all warnings I get in
an ARM allmodconfig build and most other maybe-uninitialized warnings
for ARM randconfig builds.
However, commit
6e8d666e9253 ("Disable "maybe-uninitialized" warning
globally") was merged at the same time and disabled it completely for
all configurations, because of false-positive warnings on x86 that I had
not addressed until then. This caused a lot of actual bugs to get
merged into mainline, and I sent several dozen patches for these during
the v4.9 development cycle. Most of these are actual bugs, some are for
correct code that is safe because it is only called under external
constraints that make it impossible to run into the case that gcc sees,
and in a few cases gcc is just stupid and finds something that can
obviously never happen.
I have now done a few thousand randconfig builds on x86 and collected
all patches that I needed to address every single warning I got (I can
provide the combined patch for the other warnings if anyone is
interested), so I hope we can get the warning back and let people catch
the actual bugs earlier.
This reverts the change to disable the warning completely and for now
brings it back at the "make W=1" level, so we can get it merged into
mainline without introducing false positives. A follow-up patch enables
it on all levels unless some configuration option turns it off because
of false-positives.
Link: https://rusty.ozlabs.org/?p=232
Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Wilson [Thu, 10 Nov 2016 18:46:47 +0000 (10:46 -0800)]
lib/stackdepot: export save/fetch stack for drivers
Some drivers would like to record stacktraces in order to aide leak
tracing. As stackdepot already provides a facility for only storing the
unique traces, thereby reducing the memory required, export that
functionality for use by drivers.
The code was originally created for KASAN and moved under lib in commit
cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot
for SLAB") so that it could be shared with mm/. In turn, we want to
share it now with drivers.
Link: http://lkml.kernel.org/r/20161108133209.22704-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jakub Kicinski [Thu, 10 Nov 2016 18:46:44 +0000 (10:46 -0800)]
mm: kmemleak: scan .data.ro_after_init
Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning. To achieve this we need to add
symbols for start and end of the section to the linker scripts.
The problem was been uncovered by commit
56989f6d8568 ("genetlink: mark
families as __ro_after_init").
Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Thelen [Thu, 10 Nov 2016 18:46:41 +0000 (10:46 -0800)]
memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB
While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB. When it happened, critical allocations needed
for loading drivers or creating new caches will fail.
The original kmem_cache is created early making OFF_SLAB not possible.
When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
is enabled it will try to enable it first under certain conditions.
Given kmem_cache(sys) reuses the original flag, you can have both flags
at the same time resulting in allocation failures and odd behaviors.
This fix discards allocator specific flags from memcg before calling
create_cache.
The bug exists since 4.6-rc1 and affects testing debug pagealloc
configurations.
Fixes:
b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Link: http://lkml.kernel.org/r/1478553075-120242-1-git-send-email-thgarnie@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Tested-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Thu, 10 Nov 2016 18:46:38 +0000 (10:46 -0800)]
coredump: fix unfreezable coredumping task
It could be not possible to freeze coredumping task when it waits for
'core_state->startup' completion, because threads are frozen in
get_signal() before they got a chance to complete 'core_state->startup'.
Inability to freeze a task during suspend will cause suspend to fail.
Also CRIU uses cgroup freezer during dump operation. So with an
unfreezable task the CRIU dump will fail because it waits for a
transition from 'FREEZING' to 'FROZEN' state which will never happen.
Use freezer_do_not_count() to tell freezer to ignore coredumping task
while it waits for core_state->startup completion.
Link: http://lkml.kernel.org/r/1475225434-3753-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eryu Guan [Thu, 10 Nov 2016 18:46:35 +0000 (10:46 -0800)]
mm/filemap: don't allow partially uptodate page for pipes
Starting from 4.9-rc1 kernel, I started noticing some test failures of
sendfile(2) and splice(2) (sendfile0N and splice01 from LTP) when
testing on sub-page block size filesystems (tested both XFS and ext4),
these syscalls start to return EIO in the tests. e.g.
sendfile02 1 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 26, got: -1
sendfile02 2 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 24, got: -1
sendfile02 3 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 22, got: -1
sendfile02 4 TFAIL : sendfile02.c:133: sendfile(2) failed to return expected value, expected: 20, got: -1
This is because that in sub-page block size cases, we don't need the
whole page to be uptodate, only the part we care about is uptodate is OK
(if fs has ->is_partially_uptodate defined).
But page_cache_pipe_buf_confirm() doesn't have the ability to check the
partially-uptodate case, it needs the whole page to be uptodate. So it
returns EIO in this case.
This is a regression introduced by commit
82c156f85384 ("switch
generic_file_splice_read() to use of ->read_iter()"). Prior to the
change, generic_file_splice_read() doesn't allow partially-uptodate page
either, so it worked fine.
Fix it by skipping the partially-uptodate check if we're working on a
pipe in do_generic_file_read(), so we read the whole page from disk as
long as the page is not uptodate.
I think the other way to fix it is to add the ability to check & allow
partially-uptodate page to page_cache_pipe_buf_confirm(), but that is
much harder to do and seems gain little.
Link: http://lkml.kernel.org/r/1477986187-12717-1-git-send-email-guaneryu@gmail.com
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Kravetz [Thu, 10 Nov 2016 18:46:32 +0000 (10:46 -0800)]
mm/hugetlb: fix huge page reservation leak in private mapping error paths
Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly
allocated huge page.
If a reservation was associated with the huge page, alloc_huge_page()
consumed the reservation while allocating. When the newly allocated
page is freed in free_huge_page(), it will increment the global
reservation count. However, the reservation entry in the reserve map
will remain.
This is not an issue for shared mappings as the entry in the reserve map
indicates a reservation exists. But, an entry in a private mapping
reserve map indicates the reservation was consumed and no longer exists.
This results in an inconsistency between the reserve map and the global
reservation count. This 'leaks' a reserved huge page.
Create a new routine restore_reserve_on_error() to restore the reserve
entry in these specific error paths. This routine makes use of a new
function vma_add_reservation() which will add a reserve entry for a
specific address/page.
In general, these error paths were rarely (if ever) taken on most
architectures. However, powerpc contained arch specific code that that
resulted in an extra fault and execution of these error paths on all
private mappings.
Fixes:
67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private mappings)
Link: http://lkml.kernel.org/r/1476933077-23091-2-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A . Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junxiao Bi [Thu, 10 Nov 2016 18:46:29 +0000 (10:46 -0800)]
ocfs2: fix not enough credit panic
The following panic was caught when run ocfs2 disconfig single test
(block size 512 and cluster size 8192). ocfs2_journal_dirty() return
-ENOSPC, that means credits were used up.
The total credit should include 3 times of "num_dx_leaves" from
ocfs2_dx_dir_rebalance(), because 2 times will be consumed in
ocfs2_dx_dir_transfer_leaf() and 1 time will be consumed in
ocfs2_dx_dir_new_cluster() -> __ocfs2_dx_dir_new_cluster() ->
ocfs2_dx_dir_format_cluster(). But only two times is included in
ocfs2_dx_dir_rebalance_credits(), fix it.
This can cause read-only fs(v4.1+) or panic for mainline linux depending
on mount option.
------------[ cut here ]------------
kernel BUG at fs/ocfs2/journal.c:775!
invalid opcode: 0000 [#1] SMP
Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2
Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
task:
ffff8800b6de6200 ti:
ffff8800a7d48000 task.ti:
ffff8800a7d48000
RIP: ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
RSP: 0018:
ffff8800a7d4b6d8 EFLAGS:
00010286
RAX:
00000000ffffffe4 RBX:
00000000814d0a9c RCX:
00000000000004f9
RDX:
ffffffffa008e990 RSI:
ffffffffa008f1ee RDI:
ffff8800622b6460
RBP:
ffff8800a7d4b6f8 R08:
ffffffffa008f288 R09:
ffff8800622b6460
R10:
0000000000000000 R11:
0000000000000282 R12:
0000000002c8421e
R13:
ffff88006d0cad00 R14:
ffff880092beef60 R15:
0000000000000070
FS:
00007f9b83e92700(0000) GS:
ffff8800be880000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fb2c0d1a000 CR3:
0000000008f80000 CR4:
00000000000406e0
Call Trace:
ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2]
ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2]
ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2]
ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2]
ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2]
ocfs2_mknod+0x5a2/0x1400 [ocfs2]
ocfs2_create+0x73/0x180 [ocfs2]
vfs_create+0xd8/0x100
lookup_open+0x185/0x1c0
do_last+0x36d/0x780
path_openat+0x92/0x470
do_filp_open+0x4a/0xa0
do_sys_open+0x11a/0x230
SyS_open+0x1e/0x20
system_call_fastpath+0x12/0x71
Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 <0f> 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54
RIP ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
---[ end trace
91ac5312a6ee1288 ]---
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
Link: http://lkml.kernel.org/r/1478248135-31963-1-git-send-email-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hans de Goede [Thu, 10 Nov 2016 18:46:26 +0000 (10:46 -0800)]
Revert "console: don't prefer first registered if DT specifies stdout-path"
This reverts commit
05fd007e4629 ("console: don't prefer first
registered if DT specifies stdout-path").
The reverted commit changes existing behavior on which many ARM boards
rely. Many ARM small-board-computers, like e.g. the Raspberry Pi have
both a video output and a serial console. Depending on whether the user
is using the device as a more regular computer; or as a headless device
we need to have the console on either one or the other.
Many users rely on the kernel behavior of the console being present on
both outputs, before the reverted commit the console setup with no
console= kernel arguments on an ARM board which sets stdout-path in dt
would look like this:
[root@localhost ~]# cat /proc/consoles
ttyS0 -W- (EC p a) 4:64
tty0 -WU (E p ) 4:1
Where as after the reverted commit, it looks like this:
[root@localhost ~]# cat /proc/consoles
ttyS0 -W- (EC p a) 4:64
This commit reverts commit
05fd007e4629 ("console: don't prefer first
registered if DT specifies stdout-path") restoring the original
behavior.
Fixes:
05fd007e4629 ("console: don't prefer first registered if DT specifies stdout-path")
Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Thu, 10 Nov 2016 18:46:23 +0000 (10:46 -0800)]
mm: hwpoison: fix thp split handling in memory_failure()
When memory_failure() runs on a thp tail page after pmd is split, we
trigger the following VM_BUG_ON_PAGE():
page:
ffffd7cd819b0040 count:0 mapcount:0 mapping: (null) index:0x1
flags: 0x1fffc000400000(hwpoison)
page dumped because: VM_BUG_ON_PAGE(!page_count(p))
------------[ cut here ]------------
kernel BUG at /src/linux-dev/mm/memory-failure.c:1132!
memory_failure() passed refcount and page lock from tail page to head
page, which is not needed because we can pass any subpage to
split_huge_page().
Fixes:
61f5d698cc97 ("mm: re-enable THP")
Link: http://lkml.kernel.org/r/1477961577-7183-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jann Horn [Thu, 10 Nov 2016 18:46:19 +0000 (10:46 -0800)]
swapfile: fix memory corruption via malformed swapfile
When root activates a swap partition whose header has the wrong
endianness, nr_badpages elements of badpages are swabbed before
nr_badpages has been checked, leading to a buffer overrun of up to 8GB.
This normally is not a security issue because it can only be exploited
by root (more specifically, a process with CAP_SYS_ADMIN or the ability
to modify a swap file/partition), and such a process can already e.g.
modify swapped-out memory of any other userspace process on the system.
Link: http://lkml.kernel.org/r/1477949533-2509-1-git-send-email-jann@thejh.net
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shiraz Hashim [Thu, 10 Nov 2016 18:46:16 +0000 (10:46 -0800)]
mm/cma.c: check the max limit for cma allocation
CMA allocation request size is represented by size_t that gets truncated
when same is passed as int to bitmap_find_next_zero_area_off.
We observe that during fuzz testing when cma allocation request is too
high, bitmap_find_next_zero_area_off still returns success due to the
truncation. This leads to kernel crash, as subsequent code assumes that
requested memory is available.
Fail cma allocation in case the request breaches the corresponding cma
region size.
Link: http://lkml.kernel.org/r/1478189211-3467-1-git-send-email-shashim@codeaurora.org
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Thu, 10 Nov 2016 18:46:13 +0000 (10:46 -0800)]
scripts/bloat-o-meter: fix SIGPIPE
Fix piping output to a program which quickly exits (read: head -n1)
$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux | head -n1
add/remove: 0/0 grow/shrink: 9/60 up/down: 124/-305 (-181)
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
Link: http://lkml.kernel.org/r/20161028204618.GA29923@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Thu, 10 Nov 2016 18:46:11 +0000 (10:46 -0800)]
shmem: fix pageflags after swapping DMA32 object
If shmem_alloc_page() does not set PageLocked and PageSwapBacked, then
shmem_replace_page() needs to do so for itself. Without this, it puts
newpage on the wrong lru, re-unlocks the unlocked newpage, and system
descends into "Bad page" reports and freeze; or if CONFIG_DEBUG_VM=y, it
hits an earlier VM_BUG_ON_PAGE(!PageLocked), depending on config.
But shmem_replace_page() is not a common path: it's only called when
swapin (or swapoff) finds the page was already read into an unsuitable
zone: usually all zones are suitable, but gem objects for a few drm
devices (gma500, omapdrm, crestline, broadwater) require zone DMA32 if
there's more than 4GB of ram.
Fixes:
800d8c63b2e9 ("shmem: add huge pages support")
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1611062003510.11253@eggly.anvils
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.8.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Thu, 10 Nov 2016 18:46:07 +0000 (10:46 -0800)]
mm, frontswap: make sure allocated frontswap map is assigned
Christian Borntraeger reports:
With commit
8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to
static key") kmemleak complains about a memory leak in swapon
unreferenced object 0x3e09ba56000 (size
32112640):
comm "swapon", pid 7852, jiffies
4294968787 (age 1490.770s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
__vmalloc_node_range+0x194/0x2d8
vzalloc+0x58/0x68
SyS_swapon+0xd60/0x12f8
system_call+0xd6/0x270
Turns out kmemleak is right. We now allocate the frontswap map
depending on the kernel config (and no longer on the enablement)
swapfile.c:
[...]
if (IS_ENABLED(CONFIG_FRONTSWAP))
frontswap_map = vzalloc(BITS_TO_LONGS(maxpages) * sizeof(long));
but later on this is passed along
--> enable_swap_info(p, prio, swap_map, cluster_info, frontswap_map);
and ignored if frontswap is disabled
--> frontswap_init(p->type, frontswap_map);
static inline void frontswap_init(unsigned type, unsigned long *map)
{
if (frontswap_enabled())
__frontswap_init(type, map);
}
Thing is, that frontswap map is never freed.
The leakage is relatively not that bad, because swapon is an infrequent
and privileged operation. However, if the first frontswap backend is
registered after a swap type has been already enabled, it will WARN_ON
in frontswap_register_ops() and frontswap will not be available for the
swap type.
Fix this by making sure the map is assigned by frontswap_init() as long
as CONFIG_FRONTSWAP is enabled.
Fixes:
8ea1d2a1985a ("mm, frontswap: convert frontswap_enabled to static key")
Link: http://lkml.kernel.org/r/20161026134220.2566-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa [Thu, 10 Nov 2016 18:46:04 +0000 (10:46 -0800)]
mm: remove extra newline from allocation stall warning
Commit
63f53dea0c98 ("mm: warn about allocations which stall for too
long") by error embedded "\n" in the format string, resulting in strange
output.
[ 722.876655] kworker/0:1: page alloction stalls for 160001ms, order:0
[ 722.876656] , mode:0x2400000(GFP_NOIO)
[ 722.876657] CPU: 0 PID: 6966 Comm: kworker/0:1 Not tainted 4.8.0+ #69
Link: http://lkml.kernel.org/r/1476026219-7974-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paolo Bonzini [Fri, 11 Nov 2016 10:13:36 +0000 (11:13 +0100)]
Merge tag 'kvm-arm-for-v4.9-rc4' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM updates for v4.9-rc4
- Kick the vcpu when a pending interrupt becomes pending again
- Prevent access to invalid interrupt registers
- Invalid TLBs when two vcpus from the same VM share a CPU
Robert Jarzmik [Mon, 31 Oct 2016 19:54:53 +0000 (20:54 +0100)]
cpufreq: pxa: use generic platdev driver for device-tree
For device-tree based pxa25x and pxa27x platforms, cpufreq-dt driver is
doing the job as well as pxa2xx-cpufreq, so add these platforms to the
compatibility list.
This won't work for legacy non device-tree platforms where
pxa2xx-cpufreq is still required.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Markus Mayer [Mon, 7 Nov 2016 18:02:23 +0000 (10:02 -0800)]
cpufreq: stats: New sysfs attribute for clearing statistics
Allow CPUfreq statistics to be cleared by writing anything to
/sys/.../cpufreq/stats/reset.
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 8 Nov 2016 05:36:33 +0000 (11:06 +0530)]
cpufreq: governor: Don't use 'timer' keyword
The earlier implementation of governors used background timers and so
functions, mutex, etc had 'timer' keyword in their names.
But that's not true anymore. Replace 'timer' with 'update', as those
functions, variables are based around updates to frequency.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Akshay Adiga [Tue, 8 Nov 2016 13:33:28 +0000 (19:03 +0530)]
cpufreq: powernv: Use PMCR to verify global and local pstate
As fast_switch() may get called with interrupt disable mode, we cannot
hold a mutex to update the global_pstate_info. So currently, fast_switch()
does not update the global_pstate_info and it will end up with stale data
whenever pstate is updated through fast_switch().
As the gpstate_timer can fire after fast_switch() has updated the pstates,
the timer handler cannot rely on the cached values of local and global
pstate and needs to read it from the PMCR.
Only gpstate_timer_handler() is affected by the stale cached pstate data
beacause either fast_switch() or target_index() routines will be called
for a given govenor, but gpstate_timer can fire after the governor has
changed to schedutil.
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Akshay Adiga [Tue, 8 Nov 2016 13:33:27 +0000 (19:03 +0530)]
cpufreq: powernv: Adding fast_switch for schedutil
Adding fast_switch which does light weight operation to set the desired
pstate. Both global and local pstates are set to the same desired pstate.
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Wei Yongjun [Thu, 10 Nov 2016 15:19:05 +0000 (15:19 +0000)]
cpufreq: brcmstb-avs-cpufreq: make symbol brcm_avs_cpufreq_attr static
Fixes the following sparse warning:
drivers/cpufreq/brcmstb-avs-cpufreq.c:982:18: warning:
symbol 'brcm_avs_cpufreq_attr' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>