Andy Yan [Mon, 31 Jul 2017 10:10:22 +0000 (18:10 +0800)]
pinctrl: rockchip: add input schmitt support for rv1108
Some pins like i2c SCL/SDA need the schmitt input function
to avoid crosstalk problems.
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Peter Robinson [Tue, 4 Jul 2017 06:49:47 +0000 (07:49 +0100)]
pinctrl: intel: wrap Intel pin control drivers in an architecture check
The Intel pin control drivers are architecture specific so add an if arch
to check for X86 or compile test to ensure continued test coverage.
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Sergei Shtylyov [Sun, 30 Jul 2017 19:38:48 +0000 (22:38 +0300)]
pinctrl: sirf: atlas7: fix of_irq_get() error check
of_irq_get() may return any negative error number as well as 0 on failure,
while the driver only checks for -EPROBE_DEFER, blithely continuing with
the call to gpiochip_set_chained_irqchip() -- that function expects the
parent IRQ as *unsigned int*, so would probably do nothing when a large
IRQ number resulting from a conversion of a negative error number is passed
to it, however passing 0 would probably work but the driver won't receive
valid GPIO bank interrupts.
Check for 'ret <= 0' instead and return -ENXIO from the driver's probe iff
of_irq_get() returned 0.
Fixes:
f9367793293d ("pinctrl: sirf: add sirf atlas7 pinctrl and gpio support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Vivek Gautam [Fri, 28 Jul 2017 13:18:12 +0000 (18:48 +0530)]
pinctrl: Add pmi8994 gpio bindings
Update the binding doc for qcom pmi8994-gpio devices.
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
David Wu [Fri, 21 Jul 2017 06:27:15 +0000 (14:27 +0800)]
pinctrl: rockchip: Add rk3128 pinctrl support
There are 3 IP blocks pin routes need to be switched, that are
emmc-cmd, spi, i2s. And there are some pins need to be recalced,
which are gpio2c4~gpio2c7 and gpio2d0.
Signed-off-by: David Wu <david.wu@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
David Wu [Fri, 21 Jul 2017 06:27:14 +0000 (14:27 +0800)]
pinctrl: rockchip: Use common interface for recalced iomux
The other Socs also need the feature of recalced iomux, so
make it as a common interface like iomux route feature.
Signed-off-by: David Wu <david.wu@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Thierry Reding [Thu, 20 Jul 2017 16:59:12 +0000 (18:59 +0200)]
pinctrl: bcm2835: Remove unneeded irq_group field
The irq_group field stores a 1:1 mapping. Use the loop variable to
derive the values instead of storing them in an extra array.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Thierry Reding [Thu, 20 Jul 2017 17:01:07 +0000 (19:01 +0200)]
pinctrl: sirf: atlas7: Initialize GPIO offset
The GPIO offset is never initialized, which means that it will end up
being zero as per the devm_kzalloc() of the parent structure.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Rob Herring [Tue, 18 Jul 2017 21:43:23 +0000 (16:43 -0500)]
pinctrl: Convert to using %pOF instead of full_name
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Lee Jones <lee@kernel.org>
Cc: Stefan Wahren <stefan.wahren@i2se.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ray Jui <rjui@broadcom.com>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Barry Song <baohua@kernel.org>
Cc: linux-gpio@vger.kernel.org
Cc: linux-rpi-kernel@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: kernel@stlinux.com
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-renesas-soc@vger.kernel.org
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Philipp Zabel [Wed, 19 Jul 2017 15:26:12 +0000 (17:26 +0200)]
pinctrl: tegra: explicitly request exclusive reset control
Commit
a53e35db70d1 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Philipp Zabel [Wed, 19 Jul 2017 15:26:11 +0000 (17:26 +0200)]
pinctrl: sunxi: explicitly request exclusive reset control
Commit
a53e35db70d1 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Philipp Zabel [Wed, 19 Jul 2017 15:26:10 +0000 (17:26 +0200)]
pinctrl: stm32: explicitly request exclusive reset control
Commit
a53e35db70d1 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: linux-gpio@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 18 Jul 2017 05:24:53 +0000 (14:54 +0930)]
pinctrl: aspeed: g5: Add USB device and host support
Implement the AST2500 USB functions as described by the devicetree
bindings. The AST2500 exposes five USB controllers through two USB
ports. Similar to the AST2400, the pins exposing USB are outliers with
respect to the rest of the pinmux as they not capable of GPIO.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 18 Jul 2017 05:24:52 +0000 (14:54 +0930)]
pinctrl: aspeed: g4: Add USB device and host support
Implement the AST2400 USB functions as described by the devicetree
bindings. Three ports are fully documented in the datasheet and exposed
through the bindings and pinctrl, though there are remnants of
documentation for a fourth port muxed with GPIO pins GPIOQ6 and GPIOQ7.
The implementation is updated to reflect this but the function and
group are not exposed.
Disregarding the mostly undocumented fourth port, the USB functions are
an outlier with respect to the rest of the muxed functionality on the
AST2400 as GPIO is not supported on these pins.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 18 Jul 2017 05:24:51 +0000 (14:54 +0930)]
dt-bindings: pinctrl: aspeed: Add g5 USB functions
The Aspeed AST2500 SoC contains a number of USB controllers:
* USB 1.1 Host Controller
* USB 2.0 Host Controller (x2)
* USB 2.0 Virtual Hub
* USB 2.0 Device Controller
* USB 1.1 HID Controller
The controllers are exposed via two USB ports with functionality muxed
as required. The following table illustrates the relationships between
the ports and the controllers via the mux function names:
Port | USB Version | USB Mode | Mux Function
------|--------------|--------------|-------------
A | 2.0 | Virtual Hub | USB2AD
A | 2.0 | Host | USB2AH
B | 1.1 | HID | USB11BHID
B | 2.0 | Device | USB2BD
B | 2.0 | Host | USB2BH
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Jeffery [Tue, 18 Jul 2017 05:24:50 +0000 (14:54 +0930)]
dt-bindings: pinctrl: aspeed: Add g4 USB functions
The AST2400 contains several USB controllers:
* USB 1.1 Host Controller
* USB 2.0 Host Controller
* USB 2.0 Virtual Hub
* USB 1.1 HID Controller
Pins for three ports are routed to the three controllers such that:
* Port 1 is a dedicated USB 1.1 host port
* Port 2 is shared between the USB 1.1 host and HID controllers
* Port 3 is shared between the USB 2.0 host and Hub controllers
As the pins for port 1 are fixed function there is no associated mux
function or group described in the bindings. Ports 2 and 3 are muxed as
above, and the table below describes the mapping between pinmux function
names and ports:
Port | USB Version | USB Mode | Mux Function
------|--------------|-----------|-------------
1 | 1.1 | Host | -
2 | 1.1 | Host | USB11H2
2 | 1.1 | HID | USB11D1
3 | 2.0 | Host | USB2H1
3 | 2.0 | Device | USB2D1
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Shawn Guo [Sun, 16 Jul 2017 13:33:28 +0000 (21:33 +0800)]
pinctrl: zte: fix 'functions' allocation in zx_pinctrl_build_state()
It fixes the following Smatch static check warning:
drivers/pinctrl/zte/pinctrl-zx.c:338 zx_pinctrl_build_state()
warn: passing devm_ allocated variable to kfree.
As we will be calling krealloc() on pointer 'functions', which means
kfree() will be called in there, devm_kzalloc() shouldn't be used with
the allocation in the first place. Fix the warning by calling kcalloc()
and managing the free procedure in error path on our own.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes:
cbff0c4d27f4 ("pinctrl: add ZTE ZX pinctrl driver support")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 21:18:36 +0000 (16:18 -0500)]
pinctrl: qcom: ssbi: mpp: constify gpio_chip structure
This structure is only used to copy into another structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
15136 5112 0 20248 4f18 drivers/pinctrl/qcom/pinctrl-ssbi-mpp.o
after:
bss dec hex filename
14849 5024 0 19873 4da1 drivers/pinctrl/qcom/pinctrl-ssbi-mpp.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 18:03:39 +0000 (13:03 -0500)]
pinctrl: bcm2835: constify gpio_chip structure
This structure is only used to copy into other structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
18958 9000 128 28086 6db6 drivers/pinctrl/bcm/pinctrl-bcm2835.o
after:
text data bss dec hex filename
18764 8912 128 27804 6c9c drivers/pinctrl/bcm/pinctrl-bcm2835.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Nava kishore Manne [Thu, 13 Jul 2017 09:12:43 +0000 (11:12 +0200)]
pinctrl: zynq: Fix warnings in the driver
This patch fixes the below warning
--> Prefer 'unsigned int' to bare use of 'unsigned'.
--> line over 80 characters.
--> Prefer 'unsigned int **' to bare use of 'unsigned **'.
Signed-off-by: Nava kishore Manne <navam@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Nava kishore Manne [Thu, 13 Jul 2017 09:12:42 +0000 (11:12 +0200)]
pinctrl: zynq: Fix kernel doc warnings
This patch fixes the kernel doc warnings in the driver.
Signed-off-by: Nava kishore Manne <navam@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 18:15:19 +0000 (13:15 -0500)]
pinctrl: st: constify gpio_chip structure
This structure is only used to copy into other structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
21671 3632 128 25431 6357 drivers/pinctrl/pinctrl-st.o
after:
text data bss dec hex filename
21366 3576 128 25070 61ee drivers/pinctrl/pinctrl-st.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Hans de Goede [Wed, 12 Jul 2017 12:31:01 +0000 (14:31 +0200)]
pinctrl: baytrail: Do not call WARN_ON for a firmware bug
WARN_ON causes a backtrace to get logged which is only useful for
kernel bugs. For signalling a firmware bug dev_warn(dev, FW_BUG "...")
should be used.
This fixes users running userspace software to monitor kernel oopses
getting a false positive bug-report every boot because of the wrong
use of WARN_ON.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dong Aisheng [Tue, 25 Jul 2017 13:41:56 +0000 (21:41 +0800)]
pinctrl: pinctrl-imx7ulp: add gpio_set_direction support
Add gpio_set_direction support. This makes the driver support
GPIO input/output dynamically change from userspace.
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Fugang Duan <fugang.duan@nxp.com>
Cc: Bai Ping <ping.bai@nxp.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dong Aisheng [Tue, 25 Jul 2017 13:41:55 +0000 (21:41 +0800)]
pinctrl: imx: make imx_pmx_ops.gpio_set_direction platform specific callbacks
Various IMX platforms may have different imx_pmx_ops.gpio_set_direction
implementations, so let's make it platform specific callbacks instead of
the fixed common one.
Currently only VF610 platform implements it. No function level changes.
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dong Aisheng [Tue, 25 Jul 2017 13:41:54 +0000 (21:41 +0800)]
pinctrl: imx: remove gpio_request_enable and gpio_disable_free
gpio_request_enable/disable_free actually are not quite necessary as
standard IMX pinctrl binding already sets GPIO mux from device tree,
e.g. VF610_PAD_PTB20__GPIO_42 or MX7D_PAD_SD2_CD_B__GPIO5_IO9
No need to do it again in gpio_request_enable.
And according to Stefan:
"For all GPIO I checked in upstream device trees we assign a pinctrl
to the same node, so in all cases gpio_request_enable/disable is really
unnecessary."
So it should be safe to simply remove it.
Note that this changes semantics for Vybrid, e.g.
"The two functions have been introduced for Vybrid (through
SHARE_MUX_CONF_REG) and mux pins as GPIOs automatically when a GPIO
gets requested. The automatic mux is optional by the pinmux/gpio
subsystem semantics, and other NXP devices do not use it, instead an
explicit pinctrl node is added in the device tree to mux GPIOs where
required. Hence this change aligns Vybrid to other NXP (i.MX) devices.
Note that all upstream device tree assign proper pinctrl properties
where GPIOs are used so no change is necessary for device trees."
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Fugang Duan <fugang.duan@nxp.com>
Cc: Bai Ping <ping.bai@nxp.com>
Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dong Aisheng [Tue, 25 Jul 2017 13:41:53 +0000 (21:41 +0800)]
pinctrl: imx: add imx7ulp driver
i.MX 7ULP has three IOMUXC instances: IOMUXC0 for M4 ports,
IOMUXC1 for A7 ports and IOMUXC DDR for DDR interface.
This patch adds the IOMUXC1 support for A7.
It only supports generic pin config.
Cc: Bai Ping <ping.bai@nxp.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dong Aisheng [Tue, 25 Jul 2017 13:41:52 +0000 (21:41 +0800)]
pinctrl: imx: switch to use the generic pinmux property
The generic pinmux property seems to be more suitable for IMX.
So we change to use 'pinmux' instead of 'pins'.
Cc: Bai Ping <ping.bai@nxp.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Dong Aisheng [Tue, 25 Jul 2017 13:41:51 +0000 (21:41 +0800)]
dt-bindings: pinctrl: add imx7ulp pinctrl binding doc
i.MX 7ULP has three IOMUXC instances: IOMUXC0 for M4 ports,
IOMUXC1 for A7 ports and IOMUXC DDR for DDR interface.
This patch adds the IOMUXC1 support for A7.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Masahiro Yamada [Mon, 31 Jul 2017 06:21:11 +0000 (15:21 +0900)]
pinctrl: uniphier: add UniPhier PXs3 pinctrl driver
Add pin configuration and pinmux support for UniPhier PXs3 SoC.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Masahiro Yamada [Mon, 31 Jul 2017 06:21:10 +0000 (15:21 +0900)]
pinctrl: uniphier: add suspend / resume support
Save registers lost in the sleep when suspending, and restore them
when resuming.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Masahiro Yamada [Mon, 31 Jul 2017 06:21:09 +0000 (15:21 +0900)]
pinctrl: uniphier: omit redundant input enable bit information
For LD11/20 SoCs (capable of per-pin input enable), the iectrl bit
number matches its pin number. So, this is redundant information.
Instead, we just need a flag to know if the iectrl gating exists or not.
With this refactoring, 5 bits in pin data will be saved.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Masahiro Yamada [Mon, 31 Jul 2017 06:21:08 +0000 (15:21 +0900)]
pinctrl: uniphier: clean up GPIO port muxing
There are a bunch of GPIO muxing data, but most of them are actually
unneeded because GPIO-to-pin mapping can be specified by "gpio-ranges"
DT properties.
Tables that contain a set of GPIO pins are still needed for the named
mapping by "gpio-ranges-group-names". This is a much cleaner way for
UniPhier SoC family where GPIO numbers are not straight mapped to pin
numbers.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Masahiro Yamada [Mon, 31 Jul 2017 06:21:07 +0000 (15:21 +0900)]
pinctrl: uniphier: fix pin_config_get() for input-enable
For LD11/LD20 SoCs (capable of per-pin input enable), iectrl bits are
located across multiple registers. So, the register offset must be
taken into account. Otherwise, wrong input-enable status is displayed.
While we here, rename the macro because it is a base address.
Fixes:
aa543888ca8c ("pinctrl: uniphier: support per-pin input enable for new SoCs")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Masahiro Yamada [Mon, 31 Jul 2017 06:21:06 +0000 (15:21 +0900)]
pinctrl: uniphier: remove unneeded EXPORT_SYMBOL_GPL()
All UniPhier pinctrl drivers are built-in. Exporting the symbol
is meaningless.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 18:34:14 +0000 (13:34 -0500)]
pinctrl: qcom: msm: constify gpio_chip structure
This structure is only used to copy into other structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
13129 2808 192 16129 3f01 drivers/pinctrl/qcom/pinctrl-msm.o
after:
text data bss dec hex filename
12839 2720 192 15751 3d87 drivers/pinctrl/qcom/pinctrl-msm.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 18:39:50 +0000 (13:39 -0500)]
pinctrl: qcom: ssbi-gpio: constify gpio_chip structure
This structure is only used to copy into other structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
17061 6992 0 24053 5df5 drivers/pinctrl/qcom/pinctrl-ssbi-gpio.o
after:
text data bss dec hex filename
16777 6904 0 23681 5c81 drivers/pinctrl/qcom/pinctrl-ssbi-gpio.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 21:35:19 +0000 (16:35 -0500)]
pinctrl: coh901: constify gpio_chip structure
This structure is only used to copy into another structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
12775 3696 64 16535 4097 drivers/pinctrl/pinctrl-coh901.o
after:
bss dec hex filename
12440 3640 64 16144 3f10 drivers/pinctrl/pinctrl-coh901.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 21:29:29 +0000 (16:29 -0500)]
pinctrl: nomadik: abx500: constify gpio_chip structure
This structure is only used to copy into another structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
17545 5376 0 22921 5989 drivers/pinctrl/nomadik/pinctrl-abx500.o
after:
bss dec hex filename
17273 5320 0 22593 5841 drivers/pinctrl/nomadik/pinctrl-abx500.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 18:28:42 +0000 (13:28 -0500)]
pinctrl: vt8500: wmt: constify gpio_chip structure
This structure is only used to copy into other structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
7754 2328 0 10082 2762 drivers/pinctrl/vt8500/pinctrl-wmt.o
after:
text data bss dec hex filename
7472 2272 0 9744 2610 drivers/pinctrl/vt8500/pinctrl-wmt.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Gustavo A. R. Silva [Tue, 11 Jul 2017 18:22:37 +0000 (13:22 -0500)]
pinctrl: rza1: constify gpio_chip structure
This structure is only used to copy into other structure, so declare
it as const.
This issue was detected using Coccinelle and the following semantic patch:
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct gpio_chip i@p = { ... };
@ok@
identifier r.i;
expression e;
position p;
@@
e = i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct gpio_chip e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct gpio_chip i = { ... };
In the following log you can see a significant difference in the code size
and data segment, hence in the dec segment. This log is the output
of the size command, before and after the code change:
before:
text data bss dec hex filename
11866 3520 128 15514 3c9a drivers/pinctrl/pinctrl-rza1.o
after:
text data bss dec hex filename
11539 3464 128 15131 3b1b drivers/pinctrl/pinctrl-rza1.o
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Icenowy Zheng [Sun, 30 Jul 2017 05:36:18 +0000 (13:36 +0800)]
pinctrl: sunxi: rename R_PIO i2c pin function name
The I2C pin functions in R_PIO used to be named "s_twi".
As we usually use the name "i2c" instead of "twi" in the mainline
kernel, change these names to "s_i2c" for consistency.
The "s_twi" functions are not yet referenced by any device trees in
mainline kernel so I think it's safe to change the name.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Icenowy Zheng [Sat, 22 Jul 2017 02:50:54 +0000 (10:50 +0800)]
pinctrl: sunxi: add support of R40 to A10 pinctrl driver
R40 is said to be an upgrade of A20, and its pin configuration is also
similar to A20 (and thus similar to A10).
Add support for R40 to the A10 pinctrl driver.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Ram Chandra Jangir [Fri, 14 Jul 2017 14:14:11 +0000 (16:14 +0200)]
pinctrl: msm: add support to configure ipq40xx GPIO_PULL bits
GPIO_PULL bits configurations in TLMM_GPIO_CFG register
differs for IPQ40xx from rest of the other qcom SoCs.
As it does not support the keeper state and therefore can't
support bias-bus-hold property.
This patch adds a pull_no_keeper setting which configures the
msm_gpio_pull bits for ipq40xx. This is required to fix the
proper configurations of gpio-pull bits for nand pins mux.
IPQ40xx SoC:
2'b10: Internal pull up enable.
2'b11: Unsupport
For other SoC's:
2'b10: Keeper
2'b11: Pull-Up
Note: Due to pull_no_keeper length, all kerneldoc entries
in the msm_pinctrl_soc_data struct had to be realigned.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Ram Chandra Jangir <rjangir@codeaurora.org>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Ram Chandra Jangir [Fri, 14 Jul 2017 14:14:10 +0000 (16:14 +0200)]
pinctrl: qcom: ipq4019: add most remaining pin definitions
This patch adds multiple pinctrl functions and mappings
for SDIO, NAND, I2S, WIFI, PCIE, LEDs, etc... that have
been missing from the current minimal version.
This patch has been updated from the original version
that was posted by Ram Chandra Jangir on the LEDE-DEV ML:
<https://patchwork.ozlabs.org/patch/752962/>. A short
summary of the changes are documented in the device-tree
patch of this series:
"dt-bindings: pinctrl: add most other IPQ4019 pin functions and groups"
Cc: John Crispin <john@phrozen.org>
Signed-off-by: Ram Chandra Jangir <rjangir@codeaurora.org>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Christian Lamparter [Fri, 14 Jul 2017 14:14:09 +0000 (16:14 +0200)]
dt-bindings: pinctrl: add most other IPQ4019 pin functions and groups
This patch adds the remaining pin functions and mux groups.
All unknown and debug functions are omitted. Existing functions
for qpic, sdio, rgmii, rmii, wifi/d are squashed together as
much as possible. And only in case of a clash, the individually
named functions have been kept. The exceptions are:
led0-11
i2s_rx, i2s_tx, i2s_td, i2s_spdif_in, i2s_spdif_out,
smart0-3
Cc: Varadarajan Narayanan <varada@codeaurora.org>
Cc: Ram Chandra Jangir <rjangir@codeaurora.org>
Cc: John Crispin <john@phrozen.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Torvalds [Sun, 13 Aug 2017 23:01:32 +0000 (16:01 -0700)]
Linux 4.13-rc5
Linus Torvalds [Sun, 13 Aug 2017 22:34:28 +0000 (15:34 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes:
- compressed boot: Ignore a generated .c file
- VDSO: Fix a register clobber list
- DECstation: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
- Octeon: Fix recent cleanups that cleaned away a bit too much thus
breaking the arch side of the EDAC and USB drivers.
- uasm: Fix duplicate const in "const struct foo const bar[]" which
GCC 7.1 no longer accepts.
- Fix race on setting and getting cpu_online_mask
- Fix preemption issue. To do so cleanly introduce macro to get the
size of L3 cache line.
- Revert include cleanup that sometimes results in build error
- MicroMIPS uses bit 0 of the PC to indicate microMIPS mode. Make
sure this bit is set for kernel entry as well.
- Prevent configuring the kernel for both microMIPS and MT. There are
no such CPUs currently and thus the combination is unsupported and
results in build errors.
This has been sitting in linux-next for a few days and has survived
automated testing by Imagination's test farm. No known regressions
pending except a number of issues that crept up due to lots of people
switching to GCC 7.1"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Set ISA bit in entry-y for microMIPS kernels
MIPS: Prevent building MT support for microMIPS kernels
MIPS: PCI: Fix smp_processor_id() in preemptible
MIPS: Introduce cpu_tcache_line_size
MIPS: DEC: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
MIPS: VDSO: Fix clobber lists in fallback code paths
Revert "MIPS: Don't unnecessarily include kmalloc.h into <asm/cache.h>."
MIPS: OCTEON: Fix USB platform code breakage.
MIPS: Octeon: Fix broken EDAC driver.
MIPS: gitignore: ignore generated .c files
MIPS: Fix race on setting and getting cpu_online_mask
MIPS: mm: remove duplicate "const" qualifier on insn_table
Linus Torvalds [Sun, 13 Aug 2017 19:44:18 +0000 (12:44 -0700)]
Merge tag 'driver-core-4.13-rc5' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are three firmware core fixes for 4.13-rc5.
All three of these fix reported issues and have been floating around
for a few weeks. They have been in linux-next with no reported
problems"
* tag 'driver-core-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
firmware: avoid invalid fallback aborts by using killable wait
firmware: fix batched requests - send wake up on failure on direct lookups
firmware: fix batched requests - wake all waiters
Linus Torvalds [Sun, 13 Aug 2017 19:41:58 +0000 (12:41 -0700)]
Merge tag 'char-misc-4.13-rc5' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are two patches for 4.13-rc5.
One is a fix for a reported thunderbolt issue, and the other a fix for
an MEI driver issue. Both have been in linux-next with no reported
issues"
* tag 'char-misc-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
thunderbolt: Do not enumerate more ports from DROM than the controller has
mei: exclude device from suspend direct complete optimization
Linus Torvalds [Sun, 13 Aug 2017 19:33:35 +0000 (12:33 -0700)]
Merge tag 'tty-4.13-rc5' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are two tty serial driver fixes for 4.13-rc5. One is a revert of
a -rc1 patch that turned out to not be a good idea, and the other is a
fix for the pl011 serial driver.
Both have been in linux-next with no reported issues"
* tag 'tty-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: Delete dead code for CIR serial ports"
tty: pl011: fix initialization order of QDF2400 E44
Linus Torvalds [Sun, 13 Aug 2017 19:30:17 +0000 (12:30 -0700)]
Merge tag 'staging-4.13-rc5' of git://git./linux/kernel/git/gregkh/staging
Pull staging/iio fixes from Greg KH:
"Here are some Staging and IIO driver fixes for 4.13-rc5.
Nothing major, just a number of small fixes for reported issues. All
of these have been in linux-next for a while now with no reported
issues. Full details are in the shortlog"
* tag 'staging-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: comedi_fops: do not call blocking ops when !TASK_RUNNING
iio: aspeed-adc: wait for initial sequence.
iio: accel: bmc150: Always restore device to normal mode after suspend-resume
staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read
iio: adc: axp288: Fix the GPADC pin reading often wrongly returning 0
iio: adc: vf610_adc: Fix VALT selection value for REFSEL bits
iio: accel: st_accel: add SPI-3wire support
iio: adc: Revert "axp288: Drop bogus AXP288_ADC_TS_PIN_CTRL register modifications"
iio: adc: sun4i-gpadc-iio: fix unbalanced irq enable/disable
iio: pressure: st_pressure_core: disable multiread by default for LPS22HB
iio: light: tsl2563: use correct event code
Linus Torvalds [Sun, 13 Aug 2017 19:27:42 +0000 (12:27 -0700)]
Merge tag 'usb-4.13-rc5' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of small USB driver fixes and new device ids for
4.13-rc5. There is the usual gadget driver fixes, some new quirks for
"messy" hardware, and some new device ids.
All have been in linux-next with no reported issues"
* tag 'usb-4.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: pl2303: add new ATEN device id
usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter
USB: Check for dropped connection before switching to full speed
usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume
usb: renesas_usbhs: gadget: fix unused-but-set-variable warning
usb: renesas_usbhs: Fix UGCTRL2 value for R-Car Gen3
usb: phy: phy-msm-usb: Fix usage of devm_regulator_bulk_get()
usb: gadget: udc: renesas_usb3: Fix usb_gadget_giveback_request() calling
usb: dwc3: gadget: Correct ISOC DATA PIDs for short packets
USB: serial: option: add D-Link DWM-222 device ID
usb: musb: fix tx fifo flush handling again
usb: core: unlink urbs from the tail of the endpoint's urb_list
usb-storage: fix deadlock involving host lock and scsi_done
uas: Add US_FL_IGNORE_RESIDUE for Initio Corporation INIC-3069
USB: hcd: Mark secondary HCD as dead if the primary one died
USB: serial: cp210x: add support for Qivicon USB ZigBee dongle
Linus Torvalds [Sat, 12 Aug 2017 23:19:43 +0000 (16:19 -0700)]
Merge tag 'for-linus-
20170812' of git://git.infradead.org/linux-mtd
Pull another MTD fix from Brian Norris:
"An mtdblock regression occurred in -rc1 (all writes were broken!), in
the process of some block subsystem refactoring. Noticed and fixed
last week, but I'm a little slow on the uptake"
* tag 'for-linus-
20170812' of git://git.infradead.org/linux-mtd:
mtd: blkdevs: Fix mtd block write failure
Abhishek Sahu [Wed, 2 Aug 2017 12:33:05 +0000 (18:03 +0530)]
mtd: blkdevs: Fix mtd block write failure
All the MTD block write requests are failing with
following error messages
mkfs.ext4 /dev/mtdblock0
print_req_error: I/O error, dev mtdblock0, sector 0
Buffer I/O error on dev mtdblock0, logical block 0,
lost async page write
The control is going to default case after block write request
because of missing return.
Fixes: commit
2a842acab109 ("block: introduce new block status code type")
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Linus Torvalds [Sat, 12 Aug 2017 19:08:59 +0000 (12:08 -0700)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"The highlights include:
- Fix iscsi-target payload memory leak during
ISCSI_FLAG_TEXT_CONTINUE (Varun Prakash)
- Fix tcm_qla2xxx incorrect use of tcm_qla2xxx_free_cmd during ABORT
(Pascal de Bruijn + Himanshu Madhani + nab)
- Fix iscsi-target long-standing issue with parallel delete of a
single network portal across multiple target instances (Gary Guo +
nab)
- Fix target dynamic se_node GPF during uncached shutdown regression
(Justin Maggard + nab)"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Fix node_acl demo-mode + uncached dynamic shutdown regression
iscsi-target: Fix iscsi_np reset hung task during parallel delete
qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT (v2)
cxgbit: fix sg_nents calculation
iscsi-target: fix invalid flags in text response
iscsi-target: fix memory leak in iscsit_setup_text_cmd()
cxgbit: add missing __kfree_skb()
tcmu: free old string on reconfig
tcmu: Fix possible to/from address overflow when doing the memcpy
Linus Torvalds [Sat, 12 Aug 2017 16:01:36 +0000 (09:01 -0700)]
Merge tag 'for-linus-4.13b-rc5-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Some fixes for Xen:
- a fix for a regression introduced in 4.13 for a Xen HVM-guest
configured with KASLR
- a fix for a possible deadlock in the xenbus driver when booting the
system
- a fix for lost interrupts in Xen guests"
* tag 'for-linus-4.13b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: Fix interrupt lost during irq_disable and irq_enable
xen: avoid deadlock in xenbus
xen: fix hvm guest with kaslr enabled
xen: split up xen_hvm_init_shared_info()
x86: provide an init_mem_mapping hypervisor hook
Linus Torvalds [Fri, 11 Aug 2017 20:54:09 +0000 (13:54 -0700)]
Merge tag 'nfs-for-4.13-5' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
"A few more NFS client bugfixes from me for rc5.
Dros has a stable fix for flexfiles to prevent leaking the
nfs4_ff_ds_version arrays when freeing a layout, Trond fixed a
potential recovery loop situation with the TEST_STATEID operation, and
Christoph fixed up the pNFS blocklayout Kconfig options to prevent
unsafe use with kernels that don't have large block device support.
Summary:
Stable fix:
- fix leaking nfs4_ff_ds_version array
Other fixes:
- improve TEST_STATEID OLD_STATEID handling to prevent recovery loop
- require 64-bit sector_t for pNFS blocklayout to prevent 32-bit
compile errors"
* tag 'nfs-for-4.13-5' of git://git.linux-nfs.org/projects/anna/linux-nfs:
pnfs/blocklayout: require 64-bit sector_t
NFSv4: Ignore NFS4ERR_OLD_STATEID in nfs41_check_open_stateid()
nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
Linus Torvalds [Fri, 11 Aug 2017 19:26:49 +0000 (12:26 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A set of fixes that should go into this series. This contains:
- Fix from Bart for blk-mq requeue queue running, preventing a
continued loop of run/restart.
- Fix for a bio/blk-integrity issue, in two parts. One from
Christoph, fixing where verification happens, and one from Milan,
for a NULL profile.
- NVMe pull request, most of the changes being for nvme-fc, but also
a few trivial core/pci fixes"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvme: fix directive command numd calculation
nvme: fix nvme reset command timeout handling
nvme-pci: fix CMB sysfs file removal in reset path
lpfc: support nvmet_fc defer_rcv callback
nvmet_fc: add defer_req callback for deferment of cmd buffer return
nvme: strip trailing 0-bytes in wwid_show
block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time
bio-integrity: only verify integrity on the lowest stacked driver
bio-integrity: Fix regression if profile verify_fn is NULL
Linus Torvalds [Fri, 11 Aug 2017 18:56:54 +0000 (11:56 -0700)]
Merge tag 'mmc-v4.13-rc4' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- fix lockdep splat when removing mmc_block module
- fix the logic for setting eMMC HS400ES signal voltage
MMC host:
- omap_hsmmc: add CMD23 capability to fix -EIO errors"
* tag 'mmc-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: block: fix lockdep splat when removing mmc_block module
mmc: mmc: correct the logic for setting HS400ES signal voltage
mmc: host: omap_hsmmc: Add CMD23 capability to omap_hsmmc driver
Linus Torvalds [Fri, 11 Aug 2017 18:44:18 +0000 (11:44 -0700)]
Merge tag 'fbdev-v4.13-rc5' of git://github.com/bzolnier/linux
Pull fbdev fixes from Bartlomiej Zolnierkiewicz:
- allow user to disable write combined mapping in efifb driver (Dave
Airlie)
- fix use after free bugs on driver removal in imxfb driver (Dan
Carpenter)
- fix unused variable warning in omapfb driver (Arnd Bergmann)
* tag 'fbdev-v4.13-rc5' of git://github.com/bzolnier/linux:
efifb: allow user to disable write combined mapping.
fbdev: omapfb: remove unused variable
video: fbdev: imxfb: use after free in imxfb_remove()
Linus Torvalds [Fri, 11 Aug 2017 18:20:48 +0000 (11:20 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"Fix a few bugs in fuse"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: set mapping error in writepage_locked when it fails
fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio
fuse: initialize the flock flag in fuse_file on allocation
Linus Torvalds [Fri, 11 Aug 2017 18:15:51 +0000 (11:15 -0700)]
Merge tag 'iommu-fixes-v4.13-rc4' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU fix from Joerg Roedel:
"Fix a NULL-pointer dereference in arm_smmu_add_device"
* tag 'iommu-fixes-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device
Christoph Hellwig [Sat, 5 Aug 2017 08:59:14 +0000 (10:59 +0200)]
pnfs/blocklayout: require 64-bit sector_t
The blocklayout code does not compile cleanly for a 32-bit sector_t,
and also has no reliable checks for devices sizes, which makes it
unsafe to use with a kernel that doesn't support large block devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes:
5c83746a0cf2 ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Linus Torvalds [Fri, 11 Aug 2017 15:56:01 +0000 (08:56 -0700)]
Merge tag 'powerpc-4.13-6' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"All fixes for code that went in this cycle.
- a revert of an optimisation to the syscall exit path, which could
lead to an oops on either older machines or machines with > 1TB of
memory
- disable some deep idle states if the firmware configuration for
them fails
- re-enable HARD/SOFT lockup detectors in defconfigs after a Kconfig
change
- six fairly small patches fixing bugs in our new watchdog code
Thanks to: Gautham R Shenoy, Nicholas Piggin"
* tag 'powerpc-4.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/watchdog: add locking around init/exit functions
powerpc/watchdog: Fix marking of stuck CPUs
powerpc/watchdog: Fix final-check recovered case
powerpc/watchdog: Moderate touch_nmi_watchdog overhead
powerpc/watchdog: Improve watchdog lock primitive
powerpc: NMI IPI improve lock primitive
powerpc/configs: Re-enable HARD/SOFT lockup detectors
powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails
Revert "powerpc/64: Avoid restore_math call if possible in syscall exit"
Artem Savkov [Tue, 8 Aug 2017 10:26:02 +0000 (12:26 +0200)]
iommu/arm-smmu: fix null-pointer dereference in arm_smmu_add_device
Commit c54451a "iommu/arm-smmu: Fix the error path in arm_smmu_add_device"
removed fwspec assignment in legacy_binding path as redundant which is
wrong. It needs to be updated after fwspec initialisation in
arm_smmu_register_legacy_master() as it is dereferenced later. Without
this there is a NULL-pointer dereference panic during boot on some hosts.
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Liu Shuo [Sat, 29 Jul 2017 16:59:57 +0000 (00:59 +0800)]
xen/events: Fix interrupt lost during irq_disable and irq_enable
Here is a device has xen-pirq-MSI interrupt. Dom0 might lost interrupt
during driver irq_disable/irq_enable. Here is the scenario,
1. irq_disable -> disable_dynirq -> mask_evtchn(irq channel)
2. dev interrupt raised by HW and Xen mark its evtchn as pending
3. irq_enable -> startup_pirq -> eoi_pirq ->
clear_evtchn(channel of irq) -> clear pending status
4. consume_one_event process the irq event without pending bit assert
which result in interrupt lost once
5. No HW interrupt raising anymore.
Now use enable_dynirq for enable_pirq of xen_pirq_chip to remove
eoi_pirq when irq_enable.
Signed-off-by: Liu Shuo <shuo.a.liu@intel.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Fri, 28 Jul 2017 14:53:55 +0000 (16:53 +0200)]
xen: avoid deadlock in xenbus
When starting the xenwatch thread a theoretical deadlock situation is
possible:
xs_init() contains:
task = kthread_run(xenwatch_thread, NULL, "xenwatch");
if (IS_ERR(task))
return PTR_ERR(task);
xenwatch_pid = task->pid;
And xenwatch_thread() does:
mutex_lock(&xenwatch_mutex);
...
event->handle->callback();
...
mutex_unlock(&xenwatch_mutex);
The callback could call unregister_xenbus_watch() which does:
...
if (current->pid != xenwatch_pid)
mutex_lock(&xenwatch_mutex);
...
In case a watch is firing before xenwatch_pid could be set and the
callback of that watch unregisters a watch, then a self-deadlock would
occur.
Avoid this by setting xenwatch_pid in xenwatch_thread().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Jens Axboe [Fri, 11 Aug 2017 14:07:19 +0000 (08:07 -0600)]
Merge branch 'nvme-4.13' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph:
"A few more small fixes - the fc/lpfc update is the biggest by far."
Juergen Gross [Fri, 28 Jul 2017 10:23:14 +0000 (12:23 +0200)]
xen: fix hvm guest with kaslr enabled
A Xen HVM guest running with KASLR enabled will die rather soon today
because the shared info page mapping is using va() too early. This was
introduced by commit
a5d5f328b0e2baa5ee7c119fd66324eb79eeeb66 ("xen:
allocate page for shared info page from low memory").
In order to fix this use early_memremap() to get a temporary virtual
address for shared info until va() can be used safely.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Fri, 28 Jul 2017 10:23:13 +0000 (12:23 +0200)]
xen: split up xen_hvm_init_shared_info()
Instead of calling xen_hvm_init_shared_info() on boot and resume split
it up into a boot time function searching for the pfn to use and a
mapping function doing the hypervisor mapping call.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Fri, 28 Jul 2017 10:23:12 +0000 (12:23 +0200)]
x86: provide an init_mem_mapping hypervisor hook
Provide a hook in hypervisor_x86 called after setting up initial
memory mapping.
This is needed e.g. by Xen HVM guests to map the hypervisor shared
info page.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Jeff Layton [Thu, 25 May 2017 10:57:50 +0000 (06:57 -0400)]
fuse: set mapping error in writepage_locked when it fails
This ensures that we see errors on fsync when writeback fails.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Linus Torvalds [Fri, 11 Aug 2017 05:33:47 +0000 (22:33 -0700)]
Merge tag 'drm-fixes-for-v4.13-rc5' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Nothing too earth shattering here, it just seems like lots of little
things all over the place.
msm has probably the larger amount of changes, but they all seem fine,
otherwise, some rockchip, i915, etnaviv and exynos fixes, along with
one nouveau regression fix for some older GPUs"
* tag 'drm-fixes-for-v4.13-rc5' of git://people.freedesktop.org/~airlied/linux: (35 commits)
drm/nouveau/disp/nv04: avoid creation of output paths
drm: make DRM_STM default n
drm/exynos: forbid creating framebuffers from too small GEM buffers
drm/etnaviv: Fix off-by-one error in reloc checking
drm/i915: fix backlight invert for non-zero minimum brightness
drm/i915/shrinker: Wrap need_resched() inside preempt-disable
drm/i915/perf: fix flex eu registers programming
drm/i915: Fix out-of-bounds array access in bdw_load_gamma_lut
drm/i915/gvt: Change the max length of mmio_reg_rw from 4 to 8
drm/i915/gvt: Initialize MMIO Block with HW state
drm/rockchip: vop: report error when check resource error
drm/rockchip: vop: round_up pitches to word align
drm/rockchip: vop: fix NV12 video display error
drm/rockchip: vop: fix iommu page fault when resume
drm/i915/gvt: clean workload queue if error happened
drm/i915/gvt: change resetting to resetting_eng
drm/msm: gpu: don't abuse dma_alloc for non-DMA allocations
drm/msm: gpu: call qcom_mdt interfaces only for ARCH_QCOM
drm/msm/adreno: Prevent unclocked access when retrieving timestamps
drm/msm: Remove __user from __u64 data types
...
Linus Torvalds [Thu, 10 Aug 2017 23:20:52 +0000 (16:20 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"21 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
userfaultfd: replace ENOSPC with ESRCH in case mm has gone during copy/zeropage
zram: rework copy of compressor name in comp_algorithm_store()
rmap: do not call mmu_notifier_invalidate_page() under ptl
mm: fix list corruptions on shmem shrinklist
mm/balloon_compaction.c: don't zero ballooned pages
MAINTAINERS: copy virtio on balloon_compaction.c
mm: fix KSM data corruption
mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
mm: make tlb_flush_pending global
mm: refactor TLB gathering API
Revert "mm: numa: defer TLB flush for THP migration as long as possible"
mm: migrate: fix barriers around tlb_flush_pending
mm: migrate: prevent racy access to tlb_flush_pending
fault-inject: fix wrong should_fail() decision in task context
test_kmod: fix small memory leak on filesystem tests
test_kmod: fix the lock in register_test_dev_kmod()
test_kmod: fix bug which allows negative values on two config options
test_kmod: fix spelling mistake: "EMTPY" -> "EMPTY"
userfaultfd: hugetlbfs: remove superfluous page unlock in VM_SHARED case
mm: ratelimit PFNs busy info message
...
Mike Rapoport [Thu, 10 Aug 2017 22:24:32 +0000 (15:24 -0700)]
userfaultfd: replace ENOSPC with ESRCH in case mm has gone during copy/zeropage
When the process exit races with outstanding mcopy_atomic, it would be
better to return ESRCH error. When such race occurs the process and
it's mm are going away and returning "no such process" to the uffd
monitor seems better fit than ENOSPC.
Link: http://lkml.kernel.org/r/1502111545-32305-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthias Kaehlcke [Thu, 10 Aug 2017 22:24:29 +0000 (15:24 -0700)]
zram: rework copy of compressor name in comp_algorithm_store()
comp_algorithm_store() passes the size of the source buffer to strlcpy()
instead of the destination buffer size. Make it explicit that the two
buffers have the same size and use strcpy() instead of strlcpy(). The
latter can be done safely since the function ensures that the string in
the source buffer is terminated.
Link: http://lkml.kernel.org/r/20170803163350.45245-1-mka@chromium.org
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Thu, 10 Aug 2017 22:24:27 +0000 (15:24 -0700)]
rmap: do not call mmu_notifier_invalidate_page() under ptl
MMU notifiers can sleep, but in page_mkclean_one() we call
mmu_notifier_invalidate_page() under page table lock.
Let's instead use mmu_notifier_invalidate_range() outside
page_vma_mapped_walk() loop.
[jglisse@redhat.com: try_to_unmap_one() do not call mmu_notifier under ptl]
Link: http://lkml.kernel.org/r/20170809204333.27485-1-jglisse@redhat.com
Link: http://lkml.kernel.org/r/20170804134928.l4klfcnqatni7vsc@black.fi.intel.com
Fixes:
c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reported-by: axie <axie@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Writer, Tim" <Tim.Writer@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cong Wang [Thu, 10 Aug 2017 22:24:24 +0000 (15:24 -0700)]
mm: fix list corruptions on shmem shrinklist
We saw many list corruption warnings on shmem shrinklist:
WARNING: CPU: 18 PID: 177 at lib/list_debug.c:59 __list_del_entry+0x9e/0xc0
list_del corruption. prev->next should be
ffff9ae5694b82d8, but was
ffff9ae5699ba960
Modules linked in: intel_rapl sb_edac edac_core x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid0 dcdbas shpchp wmi hed i2c_i801 ioatdma lpc_ich i2c_smbus acpi_cpufreq tcp_diag inet_diag sch_fq_codel ipmi_si ipmi_devintf ipmi_msghandler igb ptp crc32c_intel pps_core i2c_algo_bit i2c_core dca ipv6 crc_ccitt
CPU: 18 PID: 177 Comm: kswapd1 Not tainted 4.9.34-t3.el7.twitter.x86_64 #1
Hardware name: Dell Inc. PowerEdge C6220/0W6W6G, BIOS 2.2.3 11/07/2013
Call Trace:
dump_stack+0x4d/0x66
__warn+0xcb/0xf0
warn_slowpath_fmt+0x4f/0x60
__list_del_entry+0x9e/0xc0
shmem_unused_huge_shrink+0xfa/0x2e0
shmem_unused_huge_scan+0x20/0x30
super_cache_scan+0x193/0x1a0
shrink_slab.part.41+0x1e3/0x3f0
shrink_slab+0x29/0x30
shrink_node+0xf9/0x2f0
kswapd+0x2d8/0x6c0
kthread+0xd7/0xf0
ret_from_fork+0x22/0x30
WARNING: CPU: 23 PID: 639 at lib/list_debug.c:33 __list_add+0x89/0xb0
list_add corruption. prev->next should be next (
ffff9ae5699ba960), but was
ffff9ae5694b82d8. (prev=
ffff9ae5694b82d8).
Modules linked in: intel_rapl sb_edac edac_core x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid0 dcdbas shpchp wmi hed i2c_i801 ioatdma lpc_ich i2c_smbus acpi_cpufreq tcp_diag inet_diag sch_fq_codel ipmi_si ipmi_devintf ipmi_msghandler igb ptp crc32c_intel pps_core i2c_algo_bit i2c_core dca ipv6 crc_ccitt
CPU: 23 PID: 639 Comm: systemd-udevd Tainted: G W 4.9.34-t3.el7.twitter.x86_64 #1
Hardware name: Dell Inc. PowerEdge C6220/0W6W6G, BIOS 2.2.3 11/07/2013
Call Trace:
dump_stack+0x4d/0x66
__warn+0xcb/0xf0
warn_slowpath_fmt+0x4f/0x60
__list_add+0x89/0xb0
shmem_setattr+0x204/0x230
notify_change+0x2ef/0x440
do_truncate+0x5d/0x90
path_openat+0x331/0x1190
do_filp_open+0x7e/0xe0
do_sys_open+0x123/0x200
SyS_open+0x1e/0x20
do_syscall_64+0x61/0x170
entry_SYSCALL64_slow_path+0x25/0x25
The problem is that shmem_unused_huge_shrink() moves entries from the
global sbinfo->shrinklist to its local lists and then releases the
spinlock. However, a parallel shmem_setattr() could access one of these
entries directly and add it back to the global shrinklist if it is
removed, with the spinlock held.
The logic itself looks solid since an entry could be either in a local
list or the global list, otherwise it is removed from one of them by
list_del_init(). So probably the race condition is that, one CPU is in
the middle of INIT_LIST_HEAD() but the other CPU calls list_empty()
which returns true too early then the following list_add_tail() sees a
corrupted entry.
list_empty_careful() is designed to fix this situation.
[akpm@linux-foundation.org: add comments]
Link: http://lkml.kernel.org/r/20170803054630.18775-1-xiyou.wangcong@gmail.com
Fixes:
779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wei Wang [Thu, 10 Aug 2017 22:24:21 +0000 (15:24 -0700)]
mm/balloon_compaction.c: don't zero ballooned pages
Revert commit
bb01b64cfab7 ("mm/balloon_compaction.c: enqueue zero page
to balloon device")'
Zeroing ballon pages is rather time consuming, especially when a lot of
pages are in flight. E.g. 7GB worth of ballooned memory takes 2.8s with
__GFP_ZERO while it takes ~491ms without it.
The original commit argued that zeroing will help ksmd to merge these
pages on the host but this argument is assuming that the host actually
marks balloon pages for ksm which is not universally true. So we pay
performance penalty for something that even might not be used in the end
which is wrong. The host can zero out pages on its own when there is a
need.
[mhocko@kernel.org: new changelog text]
Link: http://lkml.kernel.org/r/1501761557-9758-1-git-send-email-wei.w.wang@intel.com
Fixes:
bb01b64cfab7 ("mm/balloon_compaction.c: enqueue zero page to balloon device")
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: zhenwei.pi <zhenwei.pi@youruncloud.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael S. Tsirkin [Thu, 10 Aug 2017 22:24:18 +0000 (15:24 -0700)]
MAINTAINERS: copy virtio on balloon_compaction.c
Changes to mm/balloon_compaction.c can easily break virtio, and virtio
is the only user of that interface. Add a line to MAINTAINERS so
whoever changes that file remembers to copy us.
Link: http://lkml.kernel.org/r/1501764010-24456-1-git-send-email-mst@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 10 Aug 2017 22:24:15 +0000 (15:24 -0700)]
mm: fix KSM data corruption
Nadav reported KSM can corrupt the user data by the TLB batching
race[1]. That means data user written can be lost.
Quote from Nadav Amit:
"For this race we need 4 CPUs:
CPU0: Caches a writable and dirty PTE entry, and uses the stale value
for write later.
CPU1: Runs madvise_free on the range that includes the PTE. It would
clear the dirty-bit. It batches TLB flushes.
CPU2: Writes 4 to /proc/PID/clear_refs , clearing the PTEs soft-dirty.
We care about the fact that it clears the PTE write-bit, and of
course, batches TLB flushes.
CPU3: Runs KSM. Our purpose is to pass the following test in
write_protect_page():
if (pte_write(*pvmw.pte) || pte_dirty(*pvmw.pte) ||
(pte_protnone(*pvmw.pte) && pte_savedwrite(*pvmw.pte)))
Since it will avoid TLB flush. And we want to do it while the PTE is
stale. Later, and before replacing the page, we would be able to
change the page.
Note that all the operations the CPU1-3 perform canhappen in parallel
since they only acquire mmap_sem for read.
We start with two identical pages. Everything below regards the same
page/PTE.
CPU0 CPU1 CPU2 CPU3
---- ---- ---- ----
Write the same
value on page
[cache PTE as
dirty in TLB]
MADV_FREE
pte_mkclean()
4 > clear_refs
pte_wrprotect()
write_protect_page()
[ success, no flush ]
pages_indentical()
[ ok ]
Write to page
different value
[Ok, using stale
PTE]
replace_page()
Later, CPU1, CPU2 and CPU3 would flush the TLB, but that is too late.
CPU0 already wrote on the page, but KSM ignored this write, and it got
lost"
In above scenario, MADV_FREE is fixed by changing TLB batching API
including [set|clear]_tlb_flush_pending. Remained thing is soft-dirty
part.
This patch changes soft-dirty uses TLB batching API instead of
flush_tlb_mm and KSM checks pending TLB flush by using
mm_tlb_flush_pending so that it will flush TLB to avoid data lost if
there are other parallel threads pending TLB flush.
[1] http://lkml.kernel.org/r/
BD3A0EBE-ECF4-41D4-87FA-
C755EA9AB6BD@gmail.com
Link: http://lkml.kernel.org/r/20170802000818.4760-8-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: Nadav Amit <namit@vmware.com>
Tested-by: Nadav Amit <namit@vmware.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 10 Aug 2017 22:24:12 +0000 (15:24 -0700)]
mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
Nadav reported parallel MADV_DONTNEED on same range has a stale TLB
problem and Mel fixed it[1] and found same problem on MADV_FREE[2].
Quote from Mel Gorman:
"The race in question is CPU 0 running madv_free and updating some PTEs
while CPU 1 is also running madv_free and looking at the same PTEs.
CPU 1 may have writable TLB entries for a page but fail the pte_dirty
check (because CPU 0 has updated it already) and potentially fail to
flush.
Hence, when madv_free on CPU 1 returns, there are still potentially
writable TLB entries and the underlying PTE is still present so that a
subsequent write does not necessarily propagate the dirty bit to the
underlying PTE any more. Reclaim at some unknown time at the future
may then see that the PTE is still clean and discard the page even
though a write has happened in the meantime. I think this is possible
but I could have missed some protection in madv_free that prevents it
happening."
This patch aims for solving both problems all at once and is ready for
other problem with KSM, MADV_FREE and soft-dirty story[3].
TLB batch API(tlb_[gather|finish]_mmu] uses [inc|dec]_tlb_flush_pending
and mmu_tlb_flush_pending so that when tlb_finish_mmu is called, we can
catch there are parallel threads going on. In that case, forcefully,
flush TLB to prevent for user to access memory via stale TLB entry
although it fail to gather page table entry.
I confirmed this patch works with [4] test program Nadav gave so this
patch supersedes "mm: Always flush VMA ranges affected by zap_page_range
v2" in current mmotm.
NOTE:
This patch modifies arch-specific TLB gathering interface(x86, ia64,
s390, sh, um). It seems most of architecture are straightforward but
s390 need to be careful because tlb_flush_mmu works only if
mm->context.flush_mm is set to non-zero which happens only a pte entry
really is cleared by ptep_get_and_clear and friends. However, this
problem never changes the pte entries but need to flush to prevent
memory access from stale tlb.
[1] http://lkml.kernel.org/r/
20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
[2] http://lkml.kernel.org/r/
20170725100722.2dxnmgypmwnrfawp@suse.de
[3] http://lkml.kernel.org/r/
BD3A0EBE-ECF4-41D4-87FA-
C755EA9AB6BD@gmail.com
[4] https://patchwork.kernel.org/patch/9861621/
[minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: Nadav Amit <namit@vmware.com>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 10 Aug 2017 22:24:09 +0000 (15:24 -0700)]
mm: make tlb_flush_pending global
Currently, tlb_flush_pending is used only for CONFIG_[NUMA_BALANCING|
COMPACTION] but upcoming patches to solve subtle TLB flush batching
problem will use it regardless of compaction/NUMA so this patch doesn't
remove the dependency.
[akpm@linux-foundation.org: remove more ifdefs from world's ugliest printk statement]
Link: http://lkml.kernel.org/r/20170802000818.4760-6-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 10 Aug 2017 22:24:05 +0000 (15:24 -0700)]
mm: refactor TLB gathering API
This patch is a preparatory patch for solving race problems caused by
TLB batch. For that, we will increase/decrease TLB flush pending count
of mm_struct whenever tlb_[gather|finish]_mmu is called.
Before making it simple, this patch separates architecture specific part
and rename it to arch_tlb_[gather|finish]_mmu and generic part just
calls it.
It shouldn't change any behavior.
Link: http://lkml.kernel.org/r/20170802000818.4760-5-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nadav Amit [Thu, 10 Aug 2017 22:24:02 +0000 (15:24 -0700)]
Revert "mm: numa: defer TLB flush for THP migration as long as possible"
While deferring TLB flushes is a good practice, the reverted patch
caused pending TLB flushes to be checked while the page-table lock is
not taken. As a result, in architectures with weak memory model (PPC),
Linux may miss a memory-barrier, miss the fact TLB flushes are pending,
and cause (in theory) a memory corruption.
Since the alternative of using smp_mb__after_unlock_lock() was
considered a bit open-coded, and the performance impact is expected to
be small, the previous patch is reverted.
This reverts
b0943d61b8fa ("mm: numa: defer TLB flush for THP migration
as long as possible").
Link: http://lkml.kernel.org/r/20170802000818.4760-4-namit@vmware.com
Signed-off-by: Nadav Amit <namit@vmware.com>
Suggested-by: Mel Gorman <mgorman@suse.de>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nadav Amit [Thu, 10 Aug 2017 22:23:59 +0000 (15:23 -0700)]
mm: migrate: fix barriers around tlb_flush_pending
Reading tlb_flush_pending while the page-table lock is taken does not
require a barrier, since the lock/unlock already acts as a barrier.
Removing the barrier in mm_tlb_flush_pending() to address this issue.
However, migrate_misplaced_transhuge_page() calls mm_tlb_flush_pending()
while the page-table lock is already released, which may present a
problem on architectures with weak memory model (PPC). To deal with
this case, a new parameter is added to mm_tlb_flush_pending() to
indicate if it is read without the page-table lock taken, and calling
smp_mb__after_unlock_lock() in this case.
Link: http://lkml.kernel.org/r/20170802000818.4760-3-namit@vmware.com
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nadav Amit [Thu, 10 Aug 2017 22:23:56 +0000 (15:23 -0700)]
mm: migrate: prevent racy access to tlb_flush_pending
Patch series "fixes of TLB batching races", v6.
It turns out that Linux TLB batching mechanism suffers from various
races. Races that are caused due to batching during reclamation were
recently handled by Mel and this patch-set deals with others. The more
fundamental issue is that concurrent updates of the page-tables allow
for TLB flushes to be batched on one core, while another core changes
the page-tables. This other core may assume a PTE change does not
require a flush based on the updated PTE value, while it is unaware that
TLB flushes are still pending.
This behavior affects KSM (which may result in memory corruption) and
MADV_FREE and MADV_DONTNEED (which may result in incorrect behavior). A
proof-of-concept can easily produce the wrong behavior of MADV_DONTNEED.
Memory corruption in KSM is harder to produce in practice, but was
observed by hacking the kernel and adding a delay before flushing and
replacing the KSM page.
Finally, there is also one memory barrier missing, which may affect
architectures with weak memory model.
This patch (of 7):
Setting and clearing mm->tlb_flush_pending can be performed by multiple
threads, since mmap_sem may only be acquired for read in
task_numa_work(). If this happens, tlb_flush_pending might be cleared
while one of the threads still changes PTEs and batches TLB flushes.
This can lead to the same race between migration and
change_protection_range() that led to the introduction of
tlb_flush_pending. The result of this race was data corruption, which
means that this patch also addresses a theoretically possible data
corruption.
An actual data corruption was not observed, yet the race was was
confirmed by adding assertion to check tlb_flush_pending is not set by
two threads, adding artificial latency in change_protection_range() and
using sysctl to reduce kernel.numa_balancing_scan_delay_ms.
Link: http://lkml.kernel.org/r/20170802000818.4760-2-namit@vmware.com
Fixes:
20841405940e ("mm: fix TLB flush race between migration, and
change_protection_range")
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Thu, 10 Aug 2017 22:23:53 +0000 (15:23 -0700)]
fault-inject: fix wrong should_fail() decision in task context
Commit
1203c8e6fb0a ("fault-inject: simplify access check for fail-nth")
unintentionally broke a conditional statement in should_fail(). Any
faults are not injected in the task context by the change when the
systematic fault injection is not used.
This change restores to the previous correct behaviour.
Link: http://lkml.kernel.org/r/1501633700-3488-1-git-send-email-akinobu.mita@gmail.com
Fixes:
1203c8e6fb0a ("fault-inject: simplify access check for fail-nth")
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reported-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Tested-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 10 Aug 2017 22:23:50 +0000 (15:23 -0700)]
test_kmod: fix small memory leak on filesystem tests
The break was in the wrong place so file system tests don't work as
intended, leaking memory at each test switch.
[mcgrof@kernel.org: massaged commit subject, noted memory leak issue without the fix]
Link: http://lkml.kernel.org/r/20170802211450.27928-6-mcgrof@kernel.org
Fixes:
39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reported-by: David Binderman <dcb314@hotmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Thu, 10 Aug 2017 22:23:47 +0000 (15:23 -0700)]
test_kmod: fix the lock in register_test_dev_kmod()
We accidentally just drop the lock twice instead of taking it and then
releasing it. This isn't a big issue unless you are adding more than
one device to test on, and the kmod.sh doesn't do that yet, however this
obviously is the correct thing to do.
[mcgrof@kernel.org: massaged subject, explain what happens]
Link: http://lkml.kernel.org/r/20170802211450.27928-5-mcgrof@kernel.org
Fixes:
39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Luis R. Rodriguez [Thu, 10 Aug 2017 22:23:44 +0000 (15:23 -0700)]
test_kmod: fix bug which allows negative values on two config options
Parsing with kstrtol() enables values to be negative, and we failed to
check for negative values when parsing with test_dev_config_update_uint_sync()
or test_dev_config_update_uint_range().
test_dev_config_update_uint_range() has a minimum check though so an
issue is not present there. test_dev_config_update_uint_sync() is only
used for the number of threads to use (config_num_threads_store()), and
indeed this would fail with an attempt for a large allocation.
Although the issue is only present in practice with the first fix both
by using kstrtoul() instead of kstrtol().
Link: http://lkml.kernel.org/r/20170802211450.27928-4-mcgrof@kernel.org
Fixes:
39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Colin Ian King [Thu, 10 Aug 2017 22:23:40 +0000 (15:23 -0700)]
test_kmod: fix spelling mistake: "EMTPY" -> "EMPTY"
Trivial fix to spelling mistake in snprintf text
[mcgrof@kernel.org: massaged commit message]
Link: http://lkml.kernel.org/r/20170802211450.27928-3-mcgrof@kernel.org
Fixes:
39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michal Marek <mmarek@suse.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: David Binderman <dcb314@hotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Thu, 10 Aug 2017 22:23:38 +0000 (15:23 -0700)]
userfaultfd: hugetlbfs: remove superfluous page unlock in VM_SHARED case
huge_add_to_page_cache->add_to_page_cache implicitly unlocks the page
before returning in case of errors.
The error returned was -EEXIST by running UFFDIO_COPY on a non-hole
offset of a VM_SHARED hugetlbfs mapping. It was an userland bug that
triggered it and the kernel must cope with it returning -EEXIST from
ioctl(UFFDIO_COPY) as expected.
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
kernel BUG at mm/filemap.c:964!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 22582 Comm: qemu-system-x86 Not tainted 4.11.11-300.fc26.x86_64 #1
RIP: unlock_page+0x4a/0x50
Call Trace:
hugetlb_mcopy_atomic_pte+0xc0/0x320
mcopy_atomic+0x96f/0xbe0
userfaultfd_ioctl+0x218/0xe90
do_vfs_ioctl+0xa5/0x600
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x1a/0xa9
Link: http://lkml.kernel.org/r/20170802165145.22628-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jonathan Toppins [Thu, 10 Aug 2017 22:23:35 +0000 (15:23 -0700)]
mm: ratelimit PFNs busy info message
The RDMA subsystem can generate several thousand of these messages per
second eventually leading to a kernel crash. Ratelimit these messages
to prevent this crash.
Doug said:
"I've been carrying a version of this for several kernel versions. I
don't remember when they started, but we have one (and only one) class
of machines: Dell PE R730xd, that generate these errors. When it
happens, without a rate limit, we get rcu timeouts and kernel oopses.
With the rate limit, we just get a lot of annoying kernel messages but
the machine continues on, recovers, and eventually the memory
operations all succeed"
And:
"> Well... why are all these EBUSY's occurring? It sounds inefficient
> (at least) but if it is expected, normal and unavoidable then
> perhaps we should just remove that message altogether?
I don't have an answer to that question. To be honest, I haven't
looked real hard. We never had this at all, then it started out of the
blue, but only on our Dell 730xd machines (and it hits all of them),
but no other classes or brands of machines. And we have our 730xd
machines loaded up with different brands and models of cards (for
instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an
ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines
meant it wasn't tied to any particular brand/model of RDMA hardware.
To me, it always smelled of a hardware oddity specific to maybe the
CPUs or mainboard chipsets in these machines, so given that I'm not an
mm expert anyway, I never chased it down.
A few other relevant details: it showed up somewhere around 4.8/4.9 or
thereabouts. It never happened before, but the prinkt has been there
since the 3.18 days, so possibly the test to trigger this message was
changed, or something else in the allocator changed such that the
situation started happening on these machines?
And, like I said, it is specific to our 730xd machines (but they are
all identical, so that could mean it's something like their specific
ram configuration is causing the allocator to hit this on these
machine but not on other machines in the cluster, I don't want to say
it's necessarily the model of chipset or CPU, there are other bits of
identicalness between these machines)"
Link: http://lkml.kernel.org/r/499c0f6cc10d6eb829a67f2a4d75b4228a9b356e.1501695897.git.jtoppins@redhat.com
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 10 Aug 2017 22:23:31 +0000 (15:23 -0700)]
mm: fix global NR_SLAB_.*CLAIMABLE counter reads
As Tetsuo points out:
"Commit
385386cff4c6 ("mm: vmstat: move slab statistics from zone to
node counters") broke "Slab:" field of /proc/meminfo . It shows nearly
0kB"
In addition to /proc/meminfo, this problem also affects the slab
counters OOM/allocation failure info dumps, can cause early -ENOMEM from
overcommit protection, and miscalculate image size requirements during
suspend-to-disk.
This is because the patch in question switched the slab counters from
the zone level to the node level, but forgot to update the global
accessor functions to read the aggregate node data instead of the
aggregate zone data.
Use global_node_page_state() to access the global slab counters.
Fixes:
385386cff4c6 ("mm: vmstat: move slab statistics from zone to node counters")
Link: http://lkml.kernel.org/r/20170801134256.5400-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Stefan Agner <stefan@agner.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 10 Aug 2017 21:52:45 +0000 (14:52 -0700)]
Merge tag 'pci-v4.13-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Work around Renesas uPD72020x 32-bit DMA issue"
* tag 'pci-v4.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue
PCI: Add pci_reset_function_locked()
Mika Westerberg [Tue, 25 Jul 2017 14:41:58 +0000 (17:41 +0300)]
thunderbolt: Do not enumerate more ports from DROM than the controller has
Some Alpine Ridge LP DROMs (there might be others) erroneusly list more
ports than the controller actually has. Most probably because DROM of
the full Dual/Single port Thunderbolt controller was reused for LP
version. The current DROM parser does not check the upper bound thus it
leads to crash when sw->ports[] is accessed over bounds:
BUG: unable to handle kernel NULL pointer dereference at
00000000000002ec
IP: tb_drom_read+0x383/0x890 [thunderbolt]
PGD 0
P4D 0
Oops: 0000 [#1] SMP
CPU: 3 PID: 12248 Comm: systemd-udevd Not tainted 4.13.0-rc1-next-
20170719 #1
Hardware name: LENOVO 20HF000YGE/20HF000YGE, BIOS N1WET32W (1.11 ) 05/23/2017
task:
ffff8a293e4bcd80 task.stack:
ffffa698027a8000
RIP: 0010:tb_drom_read+0x383/0x890 [thunderbolt]
RSP: 0018:
ffffa698027ab990 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff8a2940af7800 RCX:
0000000000000000
RDX:
ffff8a2940ebb400 RSI:
0000000000000000 RDI:
ffffa698027ab9a0
RBP:
ffffa698027ab9d0 R08:
0000000000000001 R09:
0000000000000002
R10:
ffff8a2940ebb5b0 R11:
0000000000000000 R12:
ffff8a293bfa968c
R13:
000000000000002c R14:
0000000000000056 R15:
0000000000000056
FS:
00007f0a945a38c0(0000) GS:
ffff8a2961580000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000000002ec CR3:
000000043e785000 CR4:
00000000003606e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
tb_switch_add+0x9d/0x730 [thunderbolt]
? tb_switch_alloc+0x3cd/0x4d0 [thunderbolt]
icm_start+0x5a/0xa0 [thunderbolt]
tb_domain_add+0xc3/0xf0 [thunderbolt]
nhi_probe+0x19e/0x310 [thunderbolt]
local_pci_probe+0x42/0xa0
pci_device_probe+0x18d/0x1a0
driver_probe_device+0x2ff/0x450
__driver_attach+0xa4/0xe0
? driver_probe_device+0x450/0x450
bus_for_each_dev+0x6e/0xb0
driver_attach+0x1e/0x20
bus_add_driver+0x1d0/0x270
? 0xffffffffc0bbb000
driver_register+0x60/0xe0
? 0xffffffffc0bbb000
__pci_register_driver+0x4c/0x50
nhi_init+0x28/0x1000 [thunderbolt]
do_one_initcall+0x50/0x190
? __vunmap+0x81/0xb0
? _cond_resched+0x1a/0x50
? kmem_cache_alloc_trace+0x15f/0x1c0
? do_init_module+0x27/0x1e9
do_init_module+0x5f/0x1e9
load_module+0x24e7/0x2a60
? vfs_read+0x115/0x130
SYSC_finit_module+0xfc/0x120
? SYSC_finit_module+0xfc/0x120
SyS_finit_module+0xe/0x10
do_syscall_64+0x67/0x170
entry_SYSCALL64_slow_path+0x25/0x25
Fix this by making sure we only enumerate DROM port entries the hardware
actually has.
Reported-by: Christian Kellner <ckellner@redhat.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Christian Kellner <ckellner@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Usyskin [Thu, 3 Aug 2017 14:30:19 +0000 (17:30 +0300)]
mei: exclude device from suspend direct complete optimization
MEI device performs link reset during system suspend sequence.
The link reset cannot be performed while device is in
runtime suspend state. The resume sequence is bypassed with
suspend direct complete optimization,so the optimization should be
disabled for mei devices.
Fixes:
[ 192.940537] Restarting tasks ...
[ 192.940610] PGI is not set
[ 192.940619] ------------[ cut here ]------------ [ 192.940623]
WARNING: CPU: 0
me.c:653 mei_me_pg_exit_sync+0x351/0x360 [ 192.940624] Modules
linked
in:
[ 192.940627] CPU: 0 PID: 1661 Comm: kworker/0:3 Not tainted
4.13.0-rc2+
#2 [ 192.940628] Hardware name: Dell Inc. XPS 13 9343/0TM99H, BIOS
A11
12/08/2016 [ 192.940630] Workqueue: pm pm_runtime_work <snip> [
192.940642] Call Trace:
[ 192.940646] ? pci_pme_active+0x1de/0x1f0 [ 192.940649] ?
pci_restore_standard_config+0x50/0x50
[ 192.940651] ? kfree+0x172/0x190
[ 192.940653] ? kfree+0x172/0x190
[ 192.940655] ? pci_restore_standard_config+0x50/0x50
[ 192.940663] mei_me_pm_runtime_resume+0x3f/0xc0
[ 192.940665] pci_pm_runtime_resume+0x7a/0xa0 [ 192.940667]
__rpm_callback+0xb9/0x1e0 [ 192.940668] ?
preempt_count_add+0x6d/0xc0 [ 192.940670] rpm_callback+0x24/0x90 [
192.940672] ? pci_restore_standard_config+0x50/0x50
[ 192.940674] rpm_resume+0x4e8/0x800 [ 192.940676]
pm_runtime_work+0x55/0xb0 [ 192.940678]
process_one_work+0x184/0x3e0 [ 192.940680]
worker_thread+0x4d/0x3a0 [ 192.940681] ?
preempt_count_sub+0x9b/0x100 [ 192.940683]
kthread+0x122/0x140 [ 192.940684] ? process_one_work+0x3e0/0x3e0 [
192.940685] ? __kthread_create_on_node+0x1a0/0x1a0
[ 192.940688] ret_from_fork+0x27/0x40 [ 192.940690] Code: 96 3a
9e ff 48 8b 7d 98 e8 cd 21 58 00 83 bb bc 01 00 00
04 0f 85 40 fe ff ff e9 41 fe ff ff 48 c7 c7 5f 04 99 96 e8 93 6b 9f
ff <0f> ff e9 5d fd ff ff e8 33 fe 99 ff 0f 1f 00 0f 1f 44 00 00 55
[ 192.940719] ---[ end trace
a86955597774ead8 ]--- [ 192.942540] done.
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Luis R. Rodriguez [Thu, 20 Jul 2017 20:13:11 +0000 (13:13 -0700)]
firmware: avoid invalid fallback aborts by using killable wait
Commit
0cb64249ca500 ("firmware_loader: abort request if wait_for_completion
is interrupted") added via 4.0 added support to abort the fallback mechanism
when a signal was detected and wait_for_completion_interruptible() returned
-ERESTARTSYS -- for instance when a user hits CTRL-C. The abort was overly
*too* effective.
When a child process terminates (successful or not) the signal SIGCHLD can
be sent to the parent process which ran the child in the background and
later triggered a sync request for firmware through a sysfs interface which
relies on the fallback mechanism. This signal in turn can be recieved by the
interruptible wait we constructed on firmware_class and detects it as an
abort *before* userspace could get a chance to write the firmware. Upon
failure -EAGAIN is returned, so userspace is also kept in the dark about
exactly what happened.
We can reproduce the issue with the fw_fallback.sh selftest:
Before this patch:
$ sudo tools/testing/selftests/firmware/fw_fallback.sh
...
tools/testing/selftests/firmware/fw_fallback.sh: error - sync firmware request cancelled due to SIGCHLD
After this patch:
$ sudo tools/testing/selftests/firmware/fw_fallback.sh
...
tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected
Fix this by making the wait killable -- only killable by SIGKILL (kill -9).
We loose the ability to allow userspace to cancel a write with CTRL-C
(SIGINT), however its been decided the compromise to require SIGKILL is
worth the gains.
Chances of this issue occuring are low due to the number of drivers upstream
exclusively relying on the fallback mechanism for firmware (2 drivers),
however this is observed in the field with custom drivers with sysfs
triggers to load firmware. Only distributions relying on the fallback
mechanism are impacted as well. An example reported issue was on Android,
as follows:
1) Android init (pid=1) fork()s (say pid=42) [this child process is totally
unrelated to firmware loading, it could be sleep 2; for all we care ]
2) Android init (pid=1) does a write() on a (driver custom) sysfs file which
ends up calling request_firmware() kernel side
3) The firmware loading fallback mechanism is used, the request is sent to
userspace and pid 1 waits in the kernel on wait_*
4) before firmware loading completes pid 42 dies (for any reason, even
normal termination)
5) Kernel delivers SIGCHLD to pid=1 to tell it a child has died, which
causes -ERESTARTSYS to be returned from wait_*
6) The kernel's wait aborts and return -EAGAIN for the
request_firmware() caller.
Cc: stable <stable@vger.kernel.org> # 4.0
Fixes:
0cb64249ca500 ("firmware_loader: abort request if wait_for_completion is interrupted")
Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Martin Fuzzey <mfuzzey@parkeon.com>
Reported-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>