platform/kernel/linux-rpi.git
23 months agodm writecache: count number of blocks discarded, not number of discard bios
Mikulas Patocka [Mon, 11 Jul 2022 20:31:52 +0000 (16:31 -0400)]
dm writecache: count number of blocks discarded, not number of discard bios

[ Upstream commit 2ee73ef60db4d79b9f9b8cd501e8188b5179449f ]

Change dm-writecache, so that it counts the number of blocks discarded
instead of the number of discard bios. Make it consistent with the
read and write statistics counters that were changed to count the
number of blocks instead of bios.

Fixes: e3a35d03407c ("dm writecache: add event counters")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agodm writecache: count number of blocks written, not number of write bios
Mikulas Patocka [Mon, 11 Jul 2022 20:31:26 +0000 (16:31 -0400)]
dm writecache: count number of blocks written, not number of write bios

[ Upstream commit b2676e1482af89714af6988ce5d31a84692e2530 ]

Change dm-writecache, so that it counts the number of blocks written
instead of the number of write bios. Bios can be split and requeued
using the dm_accept_partial_bio function, so counting bios caused
inaccurate results.

Fixes: e3a35d03407c ("dm writecache: add event counters")
Reported-by: Yu Kuai <yukuai1@huaweicloud.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agodm writecache: count number of blocks read, not number of read bios
Mikulas Patocka [Mon, 11 Jul 2022 20:30:52 +0000 (16:30 -0400)]
dm writecache: count number of blocks read, not number of read bios

[ Upstream commit 2c6e755b49d273243431f5f1184654e71221fc78 ]

Change dm-writecache, so that it counts the number of blocks read
instead of the number of read bios. Bios can be split and requeued
using the dm_accept_partial_bio function, so counting bios caused
inaccurate results.

Fixes: e3a35d03407c ("dm writecache: add event counters")
Reported-by: Yu Kuai <yukuai1@huaweicloud.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agodm writecache: return void from functions
Mikulas Patocka [Mon, 11 Jul 2022 20:30:27 +0000 (16:30 -0400)]
dm writecache: return void from functions

[ Upstream commit 9bc0c92e4b82adb017026dbb2aa816b1ac2bef31 ]

The functions writecache_map_remap_origin and writecache_bio_copy_ssd
only return a single value, thus they can be made to return void.

This helps simplify the following IO accounting changes.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoPM: domains: Ensure genpd_debugfs_dir exists before remove
Hsin-Yi Wang [Tue, 5 Jul 2022 17:16:49 +0000 (01:16 +0800)]
PM: domains: Ensure genpd_debugfs_dir exists before remove

[ Upstream commit 37101d3c719386040ded735a5ec06974f1d94d1f ]

Both genpd_debug_add() and genpd_debug_remove() may be called
indirectly by other drivers while genpd_debugfs_dir is not yet
set. For example, drivers can call pm_genpd_init() in probe or
pm_genpd_init() in probe fail/cleanup path:

pm_genpd_init()
 --> genpd_debug_add()

pm_genpd_remove()
 --> genpd_remove()
   --> genpd_debug_remove()

At this time, genpd_debug_init() may not yet be called.

genpd_debug_add() checks that if genpd_debugfs_dir is NULL, it
will return directly. Make sure this is also checked
in pm_genpd_remove(), otherwise components under debugfs root
which has the same name as other components under pm_genpd may
be accidentally removed, since NULL represents debugfs root.

Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoblktrace: Trace remapped requests correctly
Bart Van Assche [Thu, 14 Jul 2022 18:06:36 +0000 (11:06 -0700)]
blktrace: Trace remapped requests correctly

[ Upstream commit 22c80aac882f712897b88b7ea8f5a74ea19019df ]

Trace the remapped operation and its flags instead of only the data
direction of remapped operations. This issue was detected by analyzing
the warnings reported by sparse related to the new blk_opf_t type.

Reviewed-by: Jun'ichi Nomura <junichi.nomura@nec.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Fixes: 1b9a9ab78b0a ("blktrace: use op accessors")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-11-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agohwmon: (drivetemp) Add module alias
Linus Walleij [Tue, 12 Jul 2022 21:46:24 +0000 (23:46 +0200)]
hwmon: (drivetemp) Add module alias

[ Upstream commit 5918036cfa8ded7aa8094db70295011ce2275447 ]

Adding a MODULE_ALIAS() to drivetemp will make the driver easier
for modprobe to autoprobe.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220712214624.1845158-1-linus.walleij@linaro.org
Fixes: 5b46903d8bf3 ("hwmon: Driver for disk and solid state drives with temperature sensors")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agospi: tegra20-slink: fix UAF in tegra_slink_remove()
Yang Yingliang [Wed, 13 Jul 2022 09:40:23 +0000 (17:40 +0800)]
spi: tegra20-slink: fix UAF in tegra_slink_remove()

[ Upstream commit 7e9984d183bb1e99e766c5c2b950ff21f7f7b6c0 ]

After calling spi_unregister_master(), the refcount of master will
be decrease to 0, and it will be freed in spi_controller_release(),
the device data also will be freed, so it will lead a UAF when using
'tspi'. To fix this, get the master before unregister and put it when
finish using it.

Fixes: 26c863418221 ("spi: tegra20-slink: Don't use resource-managed spi_register helper")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220713094024.1508869-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agospi: Fix simplification of devm_spi_register_controller
Yang Yingliang [Tue, 12 Jul 2022 13:55:04 +0000 (21:55 +0800)]
spi: Fix simplification of devm_spi_register_controller

[ Upstream commit 43cc5a0afe4184a7fafe1eba32b5a11bb69c9ce0 ]

This reverts commit 59ebbe40fb51 ("spi: simplify
devm_spi_register_controller").

If devm_add_action() fails in devm_add_action_or_reset(),
devm_spi_unregister() will be called, it decreases the
refcount of 'ctlr->dev' to 0, then it will cause uaf in
the drivers that calling spi_put_controller() in error path.

Fixes: 59ebbe40fb51 ("spi: simplify devm_spi_register_controller")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220712135504.1055688-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoblk-mq: don't create hctx debugfs dir until q->debugfs_dir is created
Ming Lei [Mon, 11 Jul 2022 09:08:08 +0000 (17:08 +0800)]
blk-mq: don't create hctx debugfs dir until q->debugfs_dir is created

[ Upstream commit f3ec5d11554778c24ac8915e847223ed71d104fc ]

blk_mq_debugfs_register_hctx() can be called by blk_mq_update_nr_hw_queues
when gendisk isn't added yet, such as nvme tcp.

Fixes the warning of 'debugfs: Directory 'hctx0' with parent '/' already present!'
which can be observed reliably when running blktests nvme/005.

Fixes: 6cfc0081b046 ("blk-mq: no need to check return value of debugfs_create functions")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220711090808.259682-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoerofs: avoid consecutive detection for Highmem memory
Gao Xiang [Fri, 8 Jul 2022 10:10:01 +0000 (18:10 +0800)]
erofs: avoid consecutive detection for Highmem memory

[ Upstream commit 448b5a1548d87c246c3d0c3df8480d3c6eb6c11a ]

Currently, vmap()s are avoided if physical addresses are
consecutive for decompressed buffers.

I observed that is very common for 4KiB pclusters since the
numbers of decompressed pages are almost 2 or 3.

However, such detection doesn't work for Highmem pages on
32-bit machines, let's fix it now.

Reported-by: Liu Jinbao <liujinbao1@xiaomi.com>
Fixes: 7fc45dbc938a ("staging: erofs: introduce generic decompression backend")
Link: https://lore.kernel.org/r/20220708101001.21242-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: tegra: Fix SDMMC1 CD on P2888
Tamás Szűcs [Sun, 12 Jun 2022 14:59:45 +0000 (14:59 +0000)]
arm64: tegra: Fix SDMMC1 CD on P2888

[ Upstream commit b415bb7c976f1d595ed752001c0938f702645dab ]

Hook SDMMC1 CD up with CVM GPIO02 (SOC_GPIO11) used for card detection on J4
(uSD socket) on the carrier.

Fixes: ef633bfc21e9 ("arm64: tegra: Enable card detect for SD card on P2888")
Signed-off-by: Tamás Szűcs <tszucs@protonmail.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: tegra: Mark BPMP channels as no-memory-wc
Mikko Perttunen [Wed, 22 Jun 2022 13:23:00 +0000 (16:23 +0300)]
arm64: tegra: Mark BPMP channels as no-memory-wc

[ Upstream commit 61192a9d8a6367ae1b8234876941b037910a2459 ]

The Tegra SYSRAM contains regions access to which is restricted to
certain hardware blocks on the system, and speculative accesses to
those will cause issues.

Patch 'misc: sram: Only map reserved areas in Tegra SYSRAM' attempted
to resolve this by only mapping the regions specified in the device
tree on the assumption that there are no such restricted areas within
the 64K-aligned area of memory that contains the memory we wish to map.

Turns out this assumption is wrong, as there are such areas above the
4K pages described in the device trees. As such, we need to use the
bigger hammer that is no-memory-wc, which causes the memory to be
mapped as Device memory to which speculative accesses are disallowed.

As such, the previous patch in the series,
  'firmware: tegra: bpmp: do only aligned access to IPC memory area',
is required with this patch to make the BPMP driver only issue aligned
memory accesses as those are also required with Device memory.

Fixes: fec29bf04994 ("misc: sram: Only map reserved areas in Tegra SYSRAM")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: tegra: Update Tegra234 BPMP channel addresses
Mikko Perttunen [Fri, 12 Nov 2021 12:35:39 +0000 (13:35 +0100)]
arm64: tegra: Update Tegra234 BPMP channel addresses

[ Upstream commit 98094be152d34f8014ca67fbdc210e5261c4b09d ]

On final Tegra234 systems, shared memory for communication with BPMP is
located at offset 0x70000 in SYSRAM.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: tegra: Fixup SYSRAM references
Thierry Reding [Fri, 12 Nov 2021 12:35:37 +0000 (13:35 +0100)]
arm64: tegra: Fixup SYSRAM references

[ Upstream commit 7fa307524a4d721d4a04523018509882c5414e72 ]

The json-schema bindings for SRAM expect the nodes to be called "sram"
rather than "sysram" or "shmem". Furthermore, place the brackets around
the SYSRAM references such that a two-element array is created rather
than a two-element array nested in a single-element array. This is not
relevant for device tree itself, but allows the nodes to be properly
validated against json-schema bindings.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: mt7622: fix BPI-R64 WPS button
Nick Hainke [Thu, 30 Jun 2022 11:16:57 +0000 (13:16 +0200)]
arm64: dts: mt7622: fix BPI-R64 WPS button

[ Upstream commit c98e6e683632386a3bd284acda4342e68aec4c41 ]

The bananapi R64 (BPI-R64) experiences wrong WPS button signals.
In OpenWrt pushing the WPS button while powering on the device will set
it to recovery mode. Currently, this also happens without any user
interaction. In particular, the wrong signals appear while booting the
device or restarting it, e.g. after doing a system upgrade. If the
device is in recovery mode the user needs to manually power cycle or
restart it.

The official BPI-R64 sources set the WPS button to GPIO_ACTIVE_LOW in
the device tree. This setting seems to suppress the unwanted WPS button
press signals. So this commit changes the button from GPIO_ACTIVE_HIGH to
GPIO_ACTIVE_LOW.

The official BPI-R64 sources can be found on
https://github.com/BPI-SINOVOIP/BPI-R64-openwrt

Fixes: 0b6286dd96c0 ("arm64: dts: mt7622: add bananapi BPI-R64 board")

Suggested-by: INAGAKI Hiroshi <musashino.open@gmail.com>
Signed-off-by: Nick Hainke <vincent@systemli.org>
Link: https://lore.kernel.org/r/20220630111746.4098-1-vincent@systemli.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sm8250: add missing PCIe PHY clock-cells
Johan Hovold [Tue, 5 Jul 2022 11:40:20 +0000 (13:40 +0200)]
arm64: dts: qcom: sm8250: add missing PCIe PHY clock-cells

[ Upstream commit d9fd162ce764c227fcfd4242f6c1639895a9481f ]

Add the missing '#clock-cells' properties to the PCIe QMP PHY nodes.

Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Fixes: e53bdfc00977 ("arm64: dts: qcom: sm8250: Add PCIe support")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220705114032.22787-3-johan+linaro@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sm6125: Append -state suffix to pinctrl nodes
Marijn Suijten [Sun, 8 May 2022 10:03:34 +0000 (12:03 +0200)]
arm64: dts: qcom: sm6125: Append -state suffix to pinctrl nodes

[ Upstream commit cbfb5668aece448877fa7826cde81c9d06f4a4ac ]

According to qcom,sm6125-pinctrl.yaml all nodes inside the tlmm must be
suffixed by -state:

    qcom/sm6125-sony-xperia-seine-pdx201.dtb: pinctrl@500000: 'sdc2-off', 'sdc2-on' do not match any of the regexes: '-state$', 'pinctrl-[0-9]+'

The label names have been updated to match, going from sdc2_state_X to
sdc2_X_state.

Fixes: cff4bbaf2a2d ("arm64: dts: qcom: Add support for SM6125")
Fixes: 82e1783890b7 ("arm64: dts: qcom: sm6125: Add support for Sony Xperia 10II")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220508100336.127176-2-marijn.suijten@somainline.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sm6125: Move sdc2 pinctrl from seine-pdx201 to sm6125
Marijn Suijten [Sun, 8 May 2022 10:03:33 +0000 (12:03 +0200)]
arm64: dts: qcom: sm6125: Move sdc2 pinctrl from seine-pdx201 to sm6125

[ Upstream commit 6990640a93ba4e76dd62ca3ea1082a7354db09d7 ]

Both the sdc2-on and sdc2-off pinctrl nodes are used by the
sdhci@4784000 node in sm6125.dtsi.  Surprisingly sdc2-off is defined in
sm6125, yet its sdc2-on counterpart is only defined in board-specific DT
for the Sony Seine PDX201 board/device resulting in an "undefined label
&sdc2_state_on" error if sm6125.dtsi were included elsewhere.
This sm6125 base dtsi should not rely on externally defined labels; the
properties referencing it should then also be written externally.
Since the sdc2-on pin configuration is board-independent just like
sdc2-off, move it from seine-pdx201.dts into sm6125.dtsi.

The SDCard-detect pin (gpio98) is however board-specific, and remains as
an overwrite in seine-pdx201.dts for both the on and off state.

As a drive-by cleanup, reorder bias- and drive-strength properties.

Fixes: cff4bbaf2a2d ("arm64: dts: qcom: Add support for SM6125")
Fixes: 82e1783890b7 ("arm64: dts: qcom: sm6125: Add support for Sony Xperia 10II")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220508100336.127176-1-marijn.suijten@somainline.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: VIOT: Fix ACS setup
Eric Auger [Thu, 30 Jun 2022 09:40:59 +0000 (11:40 +0200)]
ACPI: VIOT: Fix ACS setup

[ Upstream commit 3dcb861dbc6ab101838a1548b1efddd00ca3c3ec ]

Currently acpi_viot_init() gets called after the pci
device has been scanned and pci_enable_acs() has been called.
So pci_request_acs() fails to be taken into account leading
to wrong single iommu group topologies when dealing with
multi-function root ports for instance.

We cannot simply move the acpi_viot_init() earlier, similarly
as the IORT init because the VIOT parsing relies on the pci
scan. However we can detect VIOT is present earlier and in
such a case, request ACS. Introduce a new acpi_viot_early_init()
routine that allows to call pci_request_acs() before the scan.

While at it, guard the call to pci_request_acs() with #ifdef
CONFIG_PCI.

Fixes: 3cf485540e7b ("ACPI: Add driver for the VIOT table")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Jin Liu <jinl@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agodrivers/iio: Remove all strcpy() uses
Len Baker [Sun, 15 Aug 2021 17:42:04 +0000 (19:42 +0200)]
drivers/iio: Remove all strcpy() uses

[ Upstream commit d722f1e06fbc53eb369b39646945c1fa92068e74 ]

strcpy() performs no bounds checking on the destination buffer. This
could result in linear overflows beyond the end of the buffer, leading
to all kinds of misbehaviors. So, remove all the uses and add
devm_kstrdup() or devm_kasprintf() instead.

Also, modify the "for" loop conditions to clarify the access to the
st->orientation.rotation buffer.

This patch is an effort to clean up the proliferation of str*()
functions in the kernel and a previous step in the path to remove
the strcpy function from the kernel entirely [1].

[1] https://github.com/KSPP/linux/issues/88

Signed-off-by: Len Baker <len.baker@gmx.com>
Link: https://lore.kernel.org/r/20210815174204.126593-1-len.baker@gmx.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: APEI: explicit init of HEST and GHES in apci_init()
Shuai Xue [Sun, 27 Feb 2022 12:25:45 +0000 (20:25 +0800)]
ACPI: APEI: explicit init of HEST and GHES in apci_init()

[ Upstream commit dc4e8c07e9e2f69387579c49caca26ba239f7270 ]

From commit e147133a42cb ("ACPI / APEI: Make hest.c manage the estatus
memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage
the estatus memory pool. On the other hand, ghes_init() relies on
sdei_init() to detect the SDEI version and (un)register events. The
dependencies are as follows:

    ghes_init() => acpi_hest_init() => acpi_bus_init() => acpi_init()
    ghes_init() => sdei_init()

HEST is not PCI-specific and initcall ordering is implicit and not
well-defined within a level.

Based on above, remove acpi_hest_init() from acpi_pci_root_init() and
convert ghes_init() and sdei_init() from initcalls to explicit calls in the
following order:

    acpi_hest_init()
    ghes_init()
        sdei_init()

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: msm8916: Fix typo in pronto remoteproc node
Sireesh Kodali [Thu, 26 May 2022 14:17:40 +0000 (19:47 +0530)]
arm64: dts: qcom: msm8916: Fix typo in pronto remoteproc node

[ Upstream commit 5458d6f2827cd30218570f266b8d238417461f2f ]

The smem-state properties for the pronto node were incorrectly labelled,
reading `qcom,state*` rather than `qcom,smem-state*`. Fix that, allowing
the stop state to be used.

Fixes: 88106096cbf8 ("ARM: dts: msm8916: Add and enable wcnss node")
Signed-off-by: Sireesh Kodali <sireeshkodali1@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220526141740.15834-3-sireeshkodali1@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agostack: Declare {randomize_,}kstack_offset to fix Sparse warnings
GONG, Ruiqi [Wed, 29 Jun 2022 06:04:23 +0000 (14:04 +0800)]
stack: Declare {randomize_,}kstack_offset to fix Sparse warnings

[ Upstream commit 375561bd6195a31bf4c109732bd538cb97a941f4 ]

Fix the following Sparse warnings that got noticed when the PPC-dev
patchwork was checking another patch (see the link below):

init/main.c:862:1: warning: symbol 'randomize_kstack_offset' was not declared. Should it be static?
init/main.c:864:1: warning: symbol 'kstack_offset' was not declared. Should it be static?

Which in fact are triggered on all architectures that have
HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET support (for instances x86, arm64
etc).

Link: https://lore.kernel.org/lkml/e7b0d68b-914d-7283-827c-101988923929@huawei.com/T/#m49b2d4490121445ce4bf7653500aba59eefcb67f
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220629060423.2515693-1-gongruiqi1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agobus: hisi_lpc: fix missing platform_device_put() in hisi_lpc_acpi_probe()
Yang Yingliang [Fri, 1 Jul 2022 09:43:52 +0000 (17:43 +0800)]
bus: hisi_lpc: fix missing platform_device_put() in hisi_lpc_acpi_probe()

[ Upstream commit 54872fea6a5ac967ec2272aea525d1438ac6735a ]

In error case in hisi_lpc_acpi_probe() after calling platform_device_add(),
hisi_lpc_acpi_remove() can't release the failed 'pdev', so it will be leak,
call platform_device_put() to fix this problem.
I'v constructed this error case and tested this patch on D05 board.

Fixes: 99c0228d6ff1 ("HISI LPC: Re-Add ACPI child enumeration support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: qcom: pm8841: add required thermal-sensor-cells
Krzysztof Kozlowski [Wed, 8 Jun 2022 11:27:02 +0000 (13:27 +0200)]
ARM: dts: qcom: pm8841: add required thermal-sensor-cells

[ Upstream commit e2759fa0676c9a32bbddb9aff955b54bb35066ad ]

The PM8841 temperature sensor has to define thermal-sensor-cells.

Fixes: dab8134ca072 ("ARM: dts: qcom: Add PM8841 functions device nodes")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220608112702.80873-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosoc: qcom: aoss: Fix refcount leak in qmp_cooling_devices_register
Miaoqian Lin [Mon, 6 Jun 2022 06:42:52 +0000 (10:42 +0400)]
soc: qcom: aoss: Fix refcount leak in qmp_cooling_devices_register

[ Upstream commit e6e0951414a314e7db3e9e24fd924b3e15515288 ]

Every iteration of for_each_available_child_of_node() decrements
the reference count of the previous node.
When breaking early from a for_each_available_child_of_node() loop,
we need to explicitly call of_node_put() on the child node.
Add missing of_node_put() to avoid refcount leak.

Fixes: 05589b30b21a ("soc: qcom: Extend AOSS QMP driver to support resources that are used to wake up the SoC.")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220606064252.42595-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosoc: qcom: ocmem: Fix refcount leak in of_get_ocmem
Miaoqian Lin [Thu, 2 Jun 2022 04:24:30 +0000 (08:24 +0400)]
soc: qcom: ocmem: Fix refcount leak in of_get_ocmem

[ Upstream commit 92a563fcf14b3093226fb36f12e9b5cf630c5a5d ]

of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.
of_node_put() will check NULL pointer.

Fixes: 88c1e9404f1d ("soc: qcom: add OCMEM driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220602042430.1114-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: qcom-msm8974: fix irq type on blsp2_uart1
Luca Weiss [Sun, 22 May 2022 08:36:18 +0000 (10:36 +0200)]
ARM: dts: qcom-msm8974: fix irq type on blsp2_uart1

[ Upstream commit ab1489017aa7a9f02e24bee73cf9ec8079cd3909 ]

IRQ_TYPE_NONE is invalid, so use the correct interrupt type.

Signed-off-by: Luca Weiss <luca@z3ntu.xyz>
Fixes: b05f82b152c9 ("ARM: dts: qcom: msm8974: Add blsp2_uart7 for bluetooth on sirius")
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220522083618.17894-1-luca@z3ntu.xyz
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP
Dan Williams [Fri, 24 Jun 2022 23:05:26 +0000 (16:05 -0700)]
ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP

[ Upstream commit b13a3e5fd40b7d1b394c5ecbb5eb301a4c38e7b2 ]

When a platform marks a memory range as "special purpose" it is not
onlined as System RAM by default. However, it is still suitable for
error injection. Add IORES_DESC_SOFT_RESERVED to einj_error_inject() as
a permissible memory type in the sanity checking of the arguments to
_EINJ.

Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration")
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reported-by: Omar Avelar <omar.avelar@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoregulator: qcom_smd: Fix pm8916_pldo range
Stephan Gerhold [Thu, 23 Jun 2022 09:46:12 +0000 (11:46 +0200)]
regulator: qcom_smd: Fix pm8916_pldo range

[ Upstream commit e8977917e116d1571dacb8e9864474551c1c12bd ]

The PM8916 device specification [1] documents a programmable range of
1.75V to 3.337V with 12.5mV steps for the PMOS LDOs in PM8916. This
range is also used when controlling the regulator directly using the
qcom_spmi-regulator driver ("ult_pldo" there).

However, for some reason the qcom_smd-regulator driver allows a much
larger range for the same hardware component. This could be simply a
typo, since the start of the range is essentially just missing a '1'.

In practice this does not cause any major problems, since the driver
just sends the actual voltage to the RPM firmware instead of making use
of the incorrect voltage selector. Still, having the wrong range there
is confusing and prevents the regulator core from validating requests
correctly.

[1]: https://developer.qualcomm.com/download/sd410/pm8916pm8916-1-power-management-ic-device-specification.pdf

Fixes: 57d6567680ed ("regulator: qcom-smd: Add PM8916 support")
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Link: https://lore.kernel.org/r/20220623094614.1410180-2-stephan.gerhold@kernkonzept.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agocpufreq: zynq: Fix refcount leak in zynq_get_revision
Miaoqian Lin [Sun, 5 Jun 2022 08:28:07 +0000 (12:28 +0400)]
cpufreq: zynq: Fix refcount leak in zynq_get_revision

[ Upstream commit d1ff2559cef0f6f8d97fba6337b28adb10689e16 ]

of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.

Fixes: 00f7dc636366 ("ARM: zynq: Add support for SOC_BUS")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220605082807.21526-1-linmq006@gmail.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sdm636-sony-xperia-ganges-mermaid: correct sdc2 pinconf
Dmitry Baryshkov [Sat, 21 May 2022 20:27:05 +0000 (23:27 +0300)]
arm64: dts: qcom: sdm636-sony-xperia-ganges-mermaid: correct sdc2 pinconf

[ Upstream commit 3a04cec9cba393abfe70fc62e523f381c9baec2e ]

Fix the device tree node in the &sdc2_state_on override. The sdm630 uses
'clk' rather than 'pinconf-clk'.

Fixes: 4c1d849ec047 ("arm64: dts: qcom: sdm630-xperia: Retire sdm630-sony-xperia-ganges.dtsi")
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220521202708.1509308-9-dmitry.baryshkov@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sdm630: fix gpu's interconnect path
Dmitry Baryshkov [Sat, 21 May 2022 20:27:04 +0000 (23:27 +0300)]
arm64: dts: qcom: sdm630: fix gpu's interconnect path

[ Upstream commit 3cd1c4f41d64a40ea6bc4575ae28e37542123d77 ]

ICC path for the GPU incorrectly states <&gnoc 1 &bimc 5>, which is
a path from SLAVE_GNOC_BIMC to SLAVE_EBI. According to the downstream
kernel sources, the GPU uses MASTER_OXILI here, which is equivalent to
<&bimc 1 ...>.

While we are at it, use defined names instead of the numbers for this
interconnect path.

Fixes: 5cf69dcbec8b ("arm64: dts: qcom: sdm630: Add Adreno 508 GPU configuration")
Reported-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220521202708.1509308-8-dmitry.baryshkov@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sdm630: fix the qusb2phy ref clock
Dmitry Baryshkov [Sat, 21 May 2022 20:27:01 +0000 (23:27 +0300)]
arm64: dts: qcom: sdm630: fix the qusb2phy ref clock

[ Upstream commit 924bbd8dd60e094344711c3526a5b308d71dc008 ]

According to the downstram DT file, the qusb2phy ref clock should be
GCC_RX0_USB2_CLKREF_CLK, not GCC_RX1_USB2_CLKREF_CLK.

Fixes: c65a4ed2ea8b ("arm64: dts: qcom: sdm630: Add USB configuration")
Cc: Konrad Dybcio <konrad.dybcio@somainline.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220521202708.1509308-5-dmitry.baryshkov@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sdm630: disable GPU by default
Dmitry Baryshkov [Sat, 21 May 2022 20:27:00 +0000 (23:27 +0300)]
arm64: dts: qcom: sdm630: disable GPU by default

[ Upstream commit 1c047919763b4548381d1ab3320af1df66ab83df ]

The SoC's device tree file disables gpucc and adreno's SMMU by default.
So let's disable the GPU too. Moreover it looks like SMMU might be not
usable without additional patches (which means that GPU is unusable
too). No board uses GPU at this moment.

Fixes: 5cf69dcbec8b ("arm64: dts: qcom: sdm630: Add Adreno 508 GPU configuration")
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220521202708.1509308-4-dmitry.baryshkov@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: OMAP2+: Fix refcount leak in omap3xxx_prm_late_init
Miaoqian Lin [Thu, 26 May 2022 07:37:24 +0000 (11:37 +0400)]
ARM: OMAP2+: Fix refcount leak in omap3xxx_prm_late_init

[ Upstream commit 942228fbf5d4901112178b93d41225be7c0dd9de ]

of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: 1e037794f7f0 ("ARM: OMAP3+: PRM: register interrupt information from DT")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Message-Id: <20220526073724.21169-1-linmq006@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: OMAP2+: Fix refcount leak in omapdss_init_of
Miaoqian Lin [Wed, 1 Jun 2022 04:48:58 +0000 (08:48 +0400)]
ARM: OMAP2+: Fix refcount leak in omapdss_init_of

[ Upstream commit 9705db1eff38d6b9114121f9e253746199b759c9 ]

omapdss_find_dss_of_node() calls of_find_compatible_node() to get device
node. of_find_compatible_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() in later error path and normal path.

Fixes: e0c827aca0730 ("drm/omap: Populate DSS children in omapdss driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Message-Id: <20220601044858.3352-1-linmq006@gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: qcom: mdm9615: add missing PMIC GPIO reg
Krzysztof Kozlowski [Sat, 7 May 2022 19:49:12 +0000 (21:49 +0200)]
ARM: dts: qcom: mdm9615: add missing PMIC GPIO reg

[ Upstream commit dc590cdc31f636ea15658f1206c3e380a53fb78e ]

'reg' property is required in SSBI children:
  qcom-mdm9615-wp8548-mangoh-green.dtb: gpio@150: 'reg' is a required property

Fixes: 2c5e596524e7 ("ARM: dts: Add MDM9615 dtsi")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220507194913.261121-11-krzysztof.kozlowski@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoblock: fix infinite loop for invalid zone append
Keith Busch [Fri, 10 Jun 2022 19:58:20 +0000 (12:58 -0700)]
block: fix infinite loop for invalid zone append

[ Upstream commit b82d9fa257cb3725c49d94d2aeafc4677c34448a ]

Returning 0 early from __bio_iov_append_get_pages() for the
max_append_sectors warning just creates an infinite loop since 0 means
success, and the bio will never fill from the unadvancing iov_iter. We
could turn the return into an error value, but it will already be turned
into an error value later on, so just remove the warning. Clearly no one
ever hit it anyway.

Fixes: 0512a75b98f84 ("block: Introduce REQ_OP_ZONE_APPEND")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220610195830.3574005-2-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosoc: fsl: guts: machine variable might be unset
Michael Walle [Mon, 4 Apr 2022 09:56:03 +0000 (11:56 +0200)]
soc: fsl: guts: machine variable might be unset

[ Upstream commit ab3f045774f704c4e7b6a878102f4e9d4ae7bc74 ]

If both the model and the compatible properties are missing, then
machine will not be set. Initialize it with NULL.

Fixes: 34c1c21e94ac ("soc: fsl: fix section mismatch build warnings")
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: sc7180: Remove ipa_fw_mem node on trogdor
Stephen Boyd [Tue, 17 May 2022 19:33:07 +0000 (12:33 -0700)]
arm64: dts: qcom: sc7180: Remove ipa_fw_mem node on trogdor

[ Upstream commit e60414644cf3a703e10ed4429c15263095945ffe ]

We don't use this carveout on trogdor boards, and having it defined in
the sc7180 SoC file causes an overlap message to be printed at boot.

 OF: reserved mem: OVERLAP DETECTED!
 memory@86000000 (0x0000000086000000--0x000000008ec00000) overlaps with memory@8b700000 (0x000000008b700000--0x000000008b710000)

Delete the node in the trogdor dtsi file to fix the overlap problem and
remove the error message.

Cc: Alex Elder <elder@linaro.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Fixes: 310b266655a3 ("arm64: dts: qcom: sc7180: define ipa_fw_mem node")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220517193307.3034602-1-swboyd@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agolocking/lockdep: Fix lockdep_init_map_*() confusion
Peter Zijlstra [Fri, 17 Jun 2022 13:26:06 +0000 (15:26 +0200)]
locking/lockdep: Fix lockdep_init_map_*() confusion

[ Upstream commit eae6d58d67d9739be5f7ae2dbead1d0ef6528243 ]

Commit dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t") added yet
another lockdep_init_map_*() variant, but forgot to update all the
existing users of the most complicated version.

This could lead to a loss of lock_type and hence an incorrect report.
Given the relative rarity of both local_lock and these annotations,
this is unlikely to happen in practise, still, best fix things.

Fixes: dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YqyEDtoan20K0CVD@worktop.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
Alexandru Elisei [Fri, 17 Jun 2022 11:13:32 +0000 (12:13 +0100)]
arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1

[ Upstream commit 506506cad3947b942425b119ffa2b06715d5d804 ]

Commit b20d1ba3cf4b ("arm64: cpufeature: allow for version discrepancy in
PMU implementations") made it possible to run Linux on a machine with PMUs
with different versions without tainting the kernel. The patch relaxed the
restriction only for the ID_AA64DFR0_EL1.PMUVer field, and missed doing the
same for ID_DFR0_EL1.PerfMon , which also reports the PMU version, but for
the AArch32 state.

For example, with Linux running on two clusters with different PMU
versions, the kernel is tainted when bringing up secondaries with the
following message:

[    0.097027] smp: Bringing up secondary CPUs ...
[..]
[    0.142805] Detected PIPT I-cache on CPU4
[    0.142805] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000004011088, CPU4: 0x00000005011088
[    0.143555] CPU features: Unsupported CPU feature variation detected.
[    0.143702] GICv3: CPU4: found redistributor 10000 region 0:0x000000002f180000
[    0.143702] GICv3: CPU4: using allocated LPI pending table @0x00000008800d0000
[    0.144888] CPU4: Booted secondary processor 0x0000010000 [0x410fd0f0]

The boot CPU implements FEAT_PMUv3p1 (ID_DFR0_EL1.PerfMon, bits 27:24, is
0b0100), but CPU4, part of the other cluster, implements FEAT_PMUv3p4
(ID_DFR0_EL1.PerfMon = 0b0101).

Treat the PerfMon field as FTR_NONSTRICT and FTR_EXACT to pass the sanity
check and to match how PMUVer is treated for the 64bit ID register.

Fixes: b20d1ba3cf4b ("arm64: cpufeature: allow for version discrepancy in PMU implementations")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20220617111332.203061-1-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: select TRACE_IRQFLAGS_NMI_SUPPORT
Mark Rutland [Wed, 11 May 2022 13:17:33 +0000 (14:17 +0100)]
arm64: select TRACE_IRQFLAGS_NMI_SUPPORT

[ Upstream commit 3381da254fab37ba08c4b7c4f19b4ee28b1a27ec ]

Due to an oversight, on arm64 lockdep IRQ state tracking doesn't work as
intended in NMI context. This demonstrably results in bogus warnings
from lockdep, and in theory could mask a variety of issues.

On arm64, we've consistently tracked IRQ flag state for NMIs (and
saved/restored the state of the interrupted context) since commit:

  f0cd5ac1e4c53cb6 ("arm64: entry: fix NMI {user, kernel}->kernel transitions")

That commit fixed most lockdep issues with NMI by virtue of the
save/restore of the lockdep state of the interrupted context. However,
for lockdep IRQ state tracking to consistently take effect in NMI
context it has been necessary to select TRACE_IRQFLAGS_NMI_SUPPORT since
commit:

  ed00495333ccc80f ("locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs")

As arm64 does not select TRACE_IRQFLAGS_NMI_SUPPORT, this means that the
lockdep state can be stale in NMI context, and some uses of that state
can consume stale data.

When an NMI is taken arm64 entry code will call arm64_enter_nmi(). This
will enter NMI context via __nmi_enter() before calling
lockdep_hardirqs_off() to inform lockdep that IRQs have been masked.
Where TRACE_IRQFLAGS_NMI_SUPPORT is not selected, lockdep_hardirqs_off()
will not update lockdep state if called in NMI context. Thus if IRQs
were enabled in the original context, lockdep will continue to believe
that IRQs are enabled despite the call to lockdep_hardirqs_off().

However, the lockdep_assert_*() checks do take effect in NMI context,
and will consume the stale lockdep state. If an NMI is taken from a
context which had IRQs enabled, and during the handling of the NMI
something calls lockdep_assert_irqs_disabled(), this will result in a
spurious warning based upon the stale lockdep state.

This can be seen when using perf with GICv3 pseudo-NMIs. Within the perf
NMI handler we may attempt a uaccess to record the userspace callchain,
and is this faults the el1_abort() call in the nested context will call
exit_to_kernel_mode() when returning, which has a
lockdep_assert_irqs_disabled() assertion:

| # ./perf record -a -g sh
| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 164 at arch/arm64/kernel/entry-common.c:73 exit_to_kernel_mode+0x118/0x1ac
| Modules linked in:
| CPU: 0 PID: 164 Comm: perf Not tainted 5.18.0-rc5 #1
| Hardware name: linux,dummy-virt (DT)
| pstate: 004003c5 (nzcv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : exit_to_kernel_mode+0x118/0x1ac
| lr : el1_abort+0x80/0xbc
| sp : ffff8000080039f0
| pmr_save: 000000f0
| x29: ffff8000080039f0 x28: ffff6831054e4980 x27: ffff683103adb400
| x26: 0000000000000000 x25: 0000000000000001 x24: 0000000000000001
| x23: 00000000804000c5 x22: 00000000000000c0 x21: 0000000000000001
| x20: ffffbd51e635ec44 x19: ffff800008003a60 x18: 0000000000000000
| x17: ffffaadf98d23000 x16: ffff800008004000 x15: 0000ffffd14f25c0
| x14: 0000000000000000 x13: 00000000000018eb x12: 0000000000000040
| x11: 000000000000001e x10: 000000002b820020 x9 : 0000000100110000
| x8 : 000000000045cac0 x7 : 0000ffffd14f25c0 x6 : ffffbd51e639b000
| x5 : 00000000000003e5 x4 : ffffbd51e58543b0 x3 : 0000000000000001
| x2 : ffffaadf98d23000 x1 : ffff6831054e4980 x0 : 0000000100110000
| Call trace:
|  exit_to_kernel_mode+0x118/0x1ac
|  el1_abort+0x80/0xbc
|  el1h_64_sync_handler+0xa4/0xd0
|  el1h_64_sync+0x74/0x78
|  __arch_copy_from_user+0xa4/0x230
|  get_perf_callchain+0x134/0x1e4
|  perf_callchain+0x7c/0xa0
|  perf_prepare_sample+0x414/0x660
|  perf_event_output_forward+0x80/0x180
|  __perf_event_overflow+0x70/0x13c
|  perf_event_overflow+0x1c/0x30
|  armv8pmu_handle_irq+0xe8/0x160
|  armpmu_dispatch_irq+0x2c/0x70
|  handle_percpu_devid_fasteoi_nmi+0x7c/0xbc
|  generic_handle_domain_nmi+0x3c/0x60
|  gic_handle_irq+0x1dc/0x310
|  call_on_irq_stack+0x2c/0x54
|  do_interrupt_handler+0x80/0x94
|  el1_interrupt+0xb0/0xe4
|  el1h_64_irq_handler+0x18/0x24
|  el1h_64_irq+0x74/0x78
|  lockdep_hardirqs_off+0x50/0x120
|  trace_hardirqs_off+0x38/0x214
|  _raw_spin_lock_irq+0x98/0xa0
|  pipe_read+0x1f8/0x404
|  new_sync_read+0x140/0x150
|  vfs_read+0x190/0x1dc
|  ksys_read+0xdc/0xfc
|  __arm64_sys_read+0x20/0x30
|  invoke_syscall+0x48/0x114
|  el0_svc_common.constprop.0+0x158/0x17c
|  do_el0_svc+0x28/0x90
|  el0_svc+0x60/0x150
|  el0t_64_sync_handler+0xa4/0x130
|  el0t_64_sync+0x19c/0x1a0
| irq event stamp: 483
| hardirqs last  enabled at (483): [<ffffbd51e636aa24>] _raw_spin_unlock_irqrestore+0xa4/0xb0
| hardirqs last disabled at (482): [<ffffbd51e636acd0>] _raw_spin_lock_irqsave+0xb0/0xb4
| softirqs last  enabled at (468): [<ffffbd51e5216f58>] put_cpu_fpsimd_context+0x28/0x70
| softirqs last disabled at (466): [<ffffbd51e5216ed4>] get_cpu_fpsimd_context+0x0/0x5c
| ---[ end trace 0000000000000000 ]---

Note that as lockdep_assert_irqs_disabled() uses WARN_ON_ONCE(), and
this uses a BRK, the warning is logged with the real PSTATE at the time
of the warning, which clearly has DAIF.I set, meaning IRQs (and
pseudo-NMIs) were definitely masked and the warning is spurious.

Fix this by selecting TRACE_IRQFLAGS_NMI_SUPPORT such that the existing
entry tracking takes effect, as we had originally intended when the
arm64 entry code was fixed for transitions to/from NMI.

Arguably the lockdep_assert_*() functions should have the same NMI
checks as the rest of the code to prevent spurious warnings when
TRACE_IRQFLAGS_NMI_SUPPORT is not selected, but the real fix for any
architecture is to explicitly handle the transitions to/from NMI in the
entry code.

Fixes: f0cd5ac1e4c5 ("arm64: entry: fix NMI {user, kernel}->kernel transitions")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220511131733.4074499-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: mt8192: Fix idle-states entry-method
Nícolas F. R. A. Prado [Fri, 17 Jun 2022 23:31:50 +0000 (19:31 -0400)]
arm64: dts: mt8192: Fix idle-states entry-method

[ Upstream commit 2e599740f7e423ee89fb027896cb2635dd43784f ]

The entry-method property of the idle-states node should be "psci" as
described in the idle-states binding, since this is already the value of
enable-method in the CPU nodes. Fix it to get rid of a dtbs_check
warning.

Fixes: 9260918d3a4f ("arm64: dts: mt8192: Add cpu-idle-states")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220617233150.2466344-3-nfraprado@collabora.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: mt8192: Fix idle-states nodes naming scheme
Nícolas F. R. A. Prado [Fri, 17 Jun 2022 23:31:49 +0000 (19:31 -0400)]
arm64: dts: mt8192: Fix idle-states nodes naming scheme

[ Upstream commit 399e23ad51caaf62400a531c9268ad3c453c3d76 ]

Tweak the name of the idle-states subnodes so that they follow the
binding pattern, getting rid of dtbs_check warnings.

Only the usage of "-" in the name was necessary, but "off" was also
exchanged for "sleep" since that seems to be a more common wording in
other dts files.

Fixes: 9260918d3a4f ("arm64: dts: mt8192: Add cpu-idle-states")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220617233150.2466344-2-nfraprado@collabora.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: ast2600-evb-a1: fix board compatible
Krzysztof Kozlowski [Sun, 29 May 2022 10:49:27 +0000 (12:49 +0200)]
ARM: dts: ast2600-evb-a1: fix board compatible

[ Upstream commit 33c39140cc298e0d4e36083cb9a665a837773a60 ]

The AST2600 EVB A1 board should have dedicated compatible.

Fixes: a72955180372 ("ARM: dts: aspeed: ast2600evb: Add dts file for A1 and A0")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220529104928.79636-6-krzysztof.kozlowski@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: ast2600-evb: fix board compatible
Krzysztof Kozlowski [Sun, 29 May 2022 10:49:26 +0000 (12:49 +0200)]
ARM: dts: ast2600-evb: fix board compatible

[ Upstream commit aa5e06208500a0db41473caebdee5a2e81d5a277 ]

The AST2600 EVB board should have dedicated compatible.

Fixes: 2ca5646b5c2f ("ARM: dts: aspeed: Add AST2600 and EVB")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220529104928.79636-5-krzysztof.kozlowski@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: ast2500-evb: fix board compatible
Krzysztof Kozlowski [Sun, 29 May 2022 10:49:25 +0000 (12:49 +0200)]
ARM: dts: ast2500-evb: fix board compatible

[ Upstream commit 30b276fca5c0644f3cb17bceb1bd6a626c670184 ]

The AST2500 EVB board should have dedicated compatible.

Fixes: 02440622656d ("arm/dst: Add Aspeed ast2500 device tree")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220529104928.79636-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agox86/pmem: Fix platform-device leak in error path
Johan Hovold [Mon, 20 Jun 2022 14:07:23 +0000 (16:07 +0200)]
x86/pmem: Fix platform-device leak in error path

[ Upstream commit 229e73d46994f15314f58b2d39bf952111d89193 ]

Make sure to free the platform device in the unlikely event that
registration fails.

Fixes: 7a67832c7e44 ("libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220620140723.9810-1-johan@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: renesas: Fix thermal-sensors on single-zone sensors
Geert Uytterhoeven [Wed, 15 Jun 2022 14:04:26 +0000 (16:04 +0200)]
arm64: dts: renesas: Fix thermal-sensors on single-zone sensors

[ Upstream commit 62e8a53431145e06e503b71625a34eaa87b72b2c ]

"make dtbs_check":

    arch/arm64/boot/dts/renesas/r8a774c0-cat874.dtb: thermal-zones: cpu-thermal:thermal-sensors: [[74], [0]] is too long
    arch/arm64/boot/dts/renesas/r8a774c0-ek874.dtb: thermal-zones: cpu-thermal:thermal-sensors: [[79], [0]] is too long
    arch/arm64/boot/dts/renesas/r8a774c0-ek874-idk-2121wr.dtb: thermal-zones: cpu-thermal:thermal-sensors: [[82], [0]] is too long
    arch/arm64/boot/dts/renesas/r8a774c0-ek874-mipi-2.1.dtb: thermal-zones: cpu-thermal:thermal-sensors: [[87], [0]] is too long
    arch/arm64/boot/dts/renesas/r8a77990-ebisu.dtb: thermal-zones: cpu-thermal:thermal-sensors: [[105], [0]] is too long
    From schema: Documentation/devicetree/bindings/thermal/thermal-zones.yaml

Indeed, the thermal sensors on R-Car E3 and RZ/G2E support only a single
zone, hence #thermal-sensor-cells = <0>.

Fix this by dropping the bogus zero cell from the thermal sensor
specifiers.

Fixes: 8fa7d18f9ee2dc20 ("arm64: dts: renesas: r8a77990: Create thermal zone to support IPA")
Fixes: 8438bfda9d768157 ("arm64: dts: renesas: r8a774c0: Create thermal zone to support IPA")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/28b812fdd1fc3698311fac984ab8b91d3d655c1c.1655301684.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosoc: amlogic: Fix refcount leak in meson-secure-pwrc.c
Liang He [Thu, 16 Jun 2022 14:49:15 +0000 (22:49 +0800)]
soc: amlogic: Fix refcount leak in meson-secure-pwrc.c

[ Upstream commit d18529a4c12f66d83daac78045ea54063bd43257 ]

In meson_secure_pwrc_probe(), there is a refcount leak in one fail
path.

Signed-off-by: Liang He <windhl@126.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Fixes: b3dde5013e13 ("soc: amlogic: Add support for Secure power domains controller")
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220616144915.3988071-1-windhl@126.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agodt-bindings: iio: accel: Add DT binding doc for ADXL355
Puranjay Mohan [Wed, 11 Aug 2021 07:30:26 +0000 (13:00 +0530)]
dt-bindings: iio: accel: Add DT binding doc for ADXL355

[ Upstream commit bf43a71a0a7f396434f6460b46e33eb00752f78d ]

Add devicetree binding document for ADXL355, a 3-Axis MEMS Accelerometer.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210811073027.124619-2-puranjay12@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoscsi: hisi_sas: Use managed PCI functions
Xiang Chen [Tue, 24 Aug 2021 10:00:56 +0000 (18:00 +0800)]
scsi: hisi_sas: Use managed PCI functions

[ Upstream commit 4f6094f1663e2ed26a940f1842cdaa15c1dd649a ]

Use managed PCI functions such as pcim_enable_device() and
pcim_iomap_regions() to simplify exception handling code.

Link: https://lore.kernel.org/r/1629799260-120116-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosoc: renesas: r8a779a0-sysc: Fix A2DP1 and A2CV[2357] PDR values
Geert Uytterhoeven [Wed, 8 Jun 2022 13:51:35 +0000 (15:51 +0200)]
soc: renesas: r8a779a0-sysc: Fix A2DP1 and A2CV[2357] PDR values

[ Upstream commit bccceabb92ce8eb78bbf2de08308e2cc2761a2e5 ]

The PDR values for the A2DP1 and A2CV[2357] power areas on R-Car V3U are
incorrect (copied-and-pasted from A2DP0 and A2CV[0146]).
Fix them.

Reported-by: Renesas Vietnam via Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Fixes: 1b4298f000064cc2 ("soc: renesas: r8a779a0-sysc: Add r8a779a0 support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/87bc2e70ba4082970cf8c65871beae4be3503189.1654696188.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx7d-colibri-emmc: add cpu1 supply
Marcel Ziswiler [Mon, 16 May 2022 13:47:23 +0000 (15:47 +0200)]
ARM: dts: imx7d-colibri-emmc: add cpu1 supply

[ Upstream commit ba28db60d34271e8a3cf4d7158d71607e8b1e57f ]

Each cpu-core is supposed to list its supply separately, add supply for
cpu1.

Fixes: 2d7401f8632f ("ARM: dts: imx7d: Add cpu1 supply")
Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: processor/idle: Annotate more functions to live in cpuidle section
Guilherme G. Piccoli [Tue, 7 Jun 2022 22:24:58 +0000 (19:24 -0300)]
ACPI: processor/idle: Annotate more functions to live in cpuidle section

[ Upstream commit 409dfdcaffb266acfc1f33529a26b1443c9332d4 ]

Commit 6727ad9e206c ("nmi_backtrace: generate one-line reports for idle cpus")
introduced a new text section called cpuidle; with that, we have a mechanism
to add idling functions in such section and skip them from nmi_backtrace
output, since they're useless and potentially flooding for such report.

Happens that inlining might cause some real idle functions to end-up
outside of such section; this is currently the case of ACPI processor_idle
driver; the functions acpi_idle_enter_* do inline acpi_idle_do_entry(),
hence they stay out of the cpuidle section.
Fix that by marking such functions to also live in the cpuidle section.

Fixes: 6727ad9e206c ("nmi_backtrace: generate one-line reports for idle cpus")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: bcm: Fix refcount leak in bcm_kona_smc_init
Miaoqian Lin [Thu, 26 May 2022 08:13:25 +0000 (12:13 +0400)]
ARM: bcm: Fix refcount leak in bcm_kona_smc_init

[ Upstream commit cb23389a2458c2e4bfd6c86a513cbbe1c4d35e76 ]

of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: b8eb35fd594a ("ARM: bcm281xx: Add L2 cache enable code")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agospi: spi-altera-dfl: Fix an error handling path
Christophe JAILLET [Sun, 29 May 2022 06:31:53 +0000 (08:31 +0200)]
spi: spi-altera-dfl: Fix an error handling path

[ Upstream commit 8e3ca32f46994e74b7f43c57731150b2aedb2630 ]

The spi_alloc_master() call is not undone in all error handling paths.
Moreover, there is no .remove function to release the allocated memory.

In order to fix both this issues, switch to devm_spi_alloc_master().

This allows further simplification of the probe.

Fixes: ba2fc167e944 ("spi: altera: Add DFL bus driver for Altera API Controller")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/0607bb59f4073f86abe5c585d35245aef0b045c6.1653805901.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: renesas: beacon: Fix regulator node names
Geert Uytterhoeven [Wed, 11 May 2022 10:14:06 +0000 (12:14 +0200)]
arm64: dts: renesas: beacon: Fix regulator node names

[ Upstream commit 7512af9f78dedea7e04225f665dad6750df7d095 ]

Currently there are two nodes named "regulator_camera".  This causes the
former to be overwritten by the latter.

Fix this by renaming them to unique names, using the preferred hyphen
instead of an underscore.

While at it, update the name of the audio regulator (which was added in
the same commit) to use a hyphen.

Fixes: a1d8a344f1ca0709 ("arm64: dts: renesas: Introduce r8a774a1-beacon-rzg2m-kit")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/a9ac82bdf108162487289d091c53a9b3de393f13.1652263918.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agomeson-mx-socinfo: Fix refcount leak in meson_mx_socinfo_init
Miaoqian Lin [Tue, 24 May 2022 06:57:29 +0000 (10:57 +0400)]
meson-mx-socinfo: Fix refcount leak in meson_mx_socinfo_init

[ Upstream commit a2106f38077e78afcb4bf98fdda3e162118cfb3d ]

of_find_matching_node() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: 5e68c0fc8df8 ("soc: amlogic: Add Meson6/Meson8/Meson8b/Meson8m2 SoC Information driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20220524065729.33689-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: findbit: fix overflowing offset
Russell King (Oracle) [Tue, 26 Jul 2022 22:51:48 +0000 (23:51 +0100)]
ARM: findbit: fix overflowing offset

[ Upstream commit ec85bd369fd2bfaed6f45dd678706429d4f75b48 ]

When offset is larger than the size of the bit array, we should not
attempt to access the array as we can perform an access beyond the
end of the array. Fix this by changing the pre-condition.

Using "cmp r2, r1; bhs ..." covers us for the size == 0 case, since
this will always take the branch when r1 is zero, irrespective of
the value of r2. This means we can fix this bug without adding any
additional code!

Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agospi: spi-rspi: Fix PIO fallback on RZ platforms
Biju Das [Thu, 21 Jul 2022 14:34:49 +0000 (15:34 +0100)]
spi: spi-rspi: Fix PIO fallback on RZ platforms

[ Upstream commit b620aa3a7be346f04ae7789b165937615c6ee8d3 ]

RSPI IP on RZ/{A, G2L} SoC's has the same signal for both interrupt
and DMA transfer request. Setting DMARS register for DMA transfer
makes the signal to work as a DMA transfer request signal and
subsequent interrupt requests to the interrupt controller
are masked.

PIO fallback does not work as interrupt signal is disabled.

This patch fixes this issue by re-enabling the interrupts by
calling dmaengine_synchronize().

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220721143449.879257-1-biju.das.jz@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agopowerpc/64s: Disable stack variable initialisation for prom_init
Michael Ellerman [Mon, 18 Jul 2022 13:44:18 +0000 (23:44 +1000)]
powerpc/64s: Disable stack variable initialisation for prom_init

[ Upstream commit be640317a1d0b9cf42fedb2debc2887a7cfa38de ]

With GCC 12 allmodconfig prom_init fails to build:

  Error: External symbol 'memset' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220718134418.354114-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agodrm/amdgpu: Remove one duplicated ef removal
xinhui pan [Fri, 8 Jul 2022 01:22:44 +0000 (09:22 +0800)]
drm/amdgpu: Remove one duplicated ef removal

[ Upstream commit e1aadbab445b06e072013a1365fd0cf2aa25e843 ]

That has been done in BO release notify.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2074
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agokasan: test: Silence GCC 12 warnings
Kees Cook [Wed, 8 Jun 2022 21:40:24 +0000 (14:40 -0700)]
kasan: test: Silence GCC 12 warnings

[ Upstream commit aaf50b1969d7933a51ea421b11432a7fb90974e3 ]

GCC 12 continues to get smarter about array accesses. The KASAN tests
are expecting to explicitly test out-of-bounds conditions at run-time,
so hide the variable from GCC, to avoid warnings like:

../lib/test_kasan.c: In function 'ksize_uaf':
../lib/test_kasan.c:790:61: warning: array subscript 120 is outside array bounds of 'void[120]' [-Warray-bounds]
  790 |         KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[size]);
      |                                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~
../lib/test_kasan.c:97:9: note: in definition of macro 'KUNIT_EXPECT_KASAN_FAIL'
   97 |         expression; \
      |         ^~~~~~~~~~

Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: kasan-dev@googlegroups.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220608214024.1068451-1-keescook@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoselinux: Add boundary check in put_entry()
Xiu Jianfeng [Tue, 14 Jun 2022 02:14:49 +0000 (10:14 +0800)]
selinux: Add boundary check in put_entry()

[ Upstream commit 15ec76fb29be31df2bccb30fc09875274cba2776 ]

Just like next_entry(), boundary check is necessary to prevent memory
out-of-bound access.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoselinux: fix memleak in security_read_state_kernel()
Xiu Jianfeng [Mon, 13 Jun 2022 13:59:53 +0000 (21:59 +0800)]
selinux: fix memleak in security_read_state_kernel()

[ Upstream commit 73de1befcc53a7c68b0c5e76b9b5ac41c517760f ]

In this function, it directly returns the result of __security_read_policy
without freeing the allocated memory in *data, cause memory leak issue,
so free the memory if __security_read_policy failed.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoPM: hibernate: defer device probing when resuming from hibernation
Tetsuo Handa [Fri, 15 Jul 2022 05:49:58 +0000 (14:49 +0900)]
PM: hibernate: defer device probing when resuming from hibernation

[ Upstream commit 8386c414e27caba8501119948e9551e52b527f59 ]

syzbot is reporting hung task at misc_open() [1], for there is a race
window of AB-BA deadlock which involves probe_count variable. Currently
wait_for_device_probe() from snapshot_open() from misc_open() can sleep
forever with misc_mtx held if probe_count cannot become 0.

When a device is probed by hub_event() work function, probe_count is
incremented before the probe function starts, and probe_count is
decremented after the probe function completed.

There are three cases that can prevent probe_count from dropping to 0.

  (a) A device being probed stopped responding (i.e. broken/malicious
      hardware).

  (b) A process emulating a USB device using /dev/raw-gadget interface
      stopped responding for some reason.

  (c) New device probe requests keeps coming in before existing device
      probe requests complete.

The phenomenon syzbot is reporting is (b). A process which is holding
system_transition_mutex and misc_mtx is waiting for probe_count to become
0 inside wait_for_device_probe(), but the probe function which is called
 from hub_event() work function is waiting for the processes which are
blocked at mutex_lock(&misc_mtx) to respond via /dev/raw-gadget interface.

This patch mitigates (b) by deferring wait_for_device_probe() from
snapshot_open() to snapshot_write() and snapshot_ioctl(). Please note that
the possibility of (b) remains as long as any thread which is emulating a
USB device via /dev/raw-gadget interface can be blocked by uninterruptible
blocking operations (e.g. mutex_lock()).

Please also note that (a) and (c) are not addressed. Regarding (c), we
should change the code to wait for only one device which contains the
image for resuming from hibernation. I don't know how to address (a), for
use of timeout for wait_for_device_probe() might result in loss of user
data in the image. Maybe we should require the userland to wait for the
image device before opening /dev/snapshot interface.

Link: https://syzkaller.appspot.com/bug?extid=358c9ab4c93da7b7238c
Reported-by: syzbot <syzbot+358c9ab4c93da7b7238c@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+358c9ab4c93da7b7238c@syzkaller.appspotmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agohwmon: (sht15) Fix wrong assumptions in device remove callback
Uwe Kleine-König [Mon, 25 Jul 2022 19:43:44 +0000 (21:43 +0200)]
hwmon: (sht15) Fix wrong assumptions in device remove callback

[ Upstream commit 7d4edccc9bbfe1dcdff641343f7b0c6763fbe774 ]

Taking a lock at the beginning of .remove() doesn't prevent new readers.
With the existing approach it can happen, that a read occurs just when
the lock was taken blocking the reader until the lock is released at the
end of the remove callback which then accessed *data that is already
freed then.

To actually fix this problem the hwmon core needs some adaption. Until
this is implemented take the optimistic approach of assuming that all
readers are gone after hwmon_device_unregister() and
sysfs_remove_group() as most other drivers do. (And once the core
implements that, taking the lock would deadlock.)

So drop the lock, move the reset to after device unregistration to keep
the device in a workable state until it's deregistered. Also add a error
message in case the reset fails and return 0 anyhow. (Returning an error
code, doesn't stop the platform device unregistration and only results
in a little helpful error message before the devm cleanup handlers are
called.)

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20220725194344.150098-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agohwmon: (dell-smm) Add Dell XPS 13 7390 to fan control whitelist
Armin Wolf [Sun, 12 Jun 2022 04:18:06 +0000 (06:18 +0200)]
hwmon: (dell-smm) Add Dell XPS 13 7390 to fan control whitelist

[ Upstream commit 385e5f57053ff293282fea84c1c27186d53f66e1 ]

A user reported that the program dell-bios-fan-control
worked on his Dell XPS 13 7390 to switch off automatic
fan control.
Since it uses the same mechanism as the dell_smm_hwmon
module, add this model to the fan control whitelist.

Compile-tested only.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Acked-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/20220612041806.11367-1-W_Armin@gmx.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agofirmware: tegra: Fix error check return value of debugfs_create_file()
Lv Ruyi [Tue, 19 Apr 2022 01:36:48 +0000 (01:36 +0000)]
firmware: tegra: Fix error check return value of debugfs_create_file()

[ Upstream commit afcdb8e55c91c6ff0700ab272fd0f74e899ab884 ]

If an error occurs, debugfs_create_file() will return ERR_PTR(-ERROR),
so use IS_ERR() to check it.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: shmobile: rcar-gen2: Increase refcount for new reference
Liang He [Fri, 1 Jul 2022 12:18:04 +0000 (20:18 +0800)]
ARM: shmobile: rcar-gen2: Increase refcount for new reference

[ Upstream commit 75a185fb92e58ccd3670258d8d3b826bd2fa6d29 ]

In rcar_gen2_regulator_quirk(), for_each_matching_node_and_match() will
automatically increase and decrease the refcount.  However, we should
call of_node_get() for the new reference created in 'quirk->np'.
Besides, we also should call of_node_put() before the 'quirk' being
freed.

Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220701121804.234223-1-windhl@126.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: allwinner: a64: orangepi-win: Fix LED node name
Samuel Holland [Sat, 2 Jul 2022 13:28:15 +0000 (08:28 -0500)]
arm64: dts: allwinner: a64: orangepi-win: Fix LED node name

[ Upstream commit b8eb2df19fbf97aa1e950cf491232c2e3bef8357 ]

"status" does not match any pattern in the gpio-leds binding. Rename the
node to the preferred pattern. This fixes a `make dtbs_check` error.

Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220702132816.46456-1-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoarm64: dts: qcom: ipq8074: fix NAND node name
Robert Marko [Tue, 21 Jun 2022 12:06:42 +0000 (14:06 +0200)]
arm64: dts: qcom: ipq8074: fix NAND node name

[ Upstream commit b39961659ffc3c3a9e3d0d43b0476547b5f35d49 ]

Per schema it should be nand-controller@79b0000 instead of nand@79b0000.
Fix it to match nand-controller.yaml requirements.

Signed-off-by: Robert Marko <robimarko@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220621120642.518575-1-robimarko@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: qcom: sdx55: Fix the IRQ trigger type for UART
Manivannan Sadhasivam [Mon, 30 May 2022 08:08:40 +0000 (13:38 +0530)]
ARM: dts: qcom: sdx55: Fix the IRQ trigger type for UART

[ Upstream commit ae500b351ab0006d933d804a2b7507fe1e98cecc ]

The trigger type should be LEVEL_HIGH. So fix it!

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220530080842.37024-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: LPSS: Fix missing check in register_device_clock()
huhai [Thu, 23 Jun 2022 13:21:27 +0000 (21:21 +0800)]
ACPI: LPSS: Fix missing check in register_device_clock()

[ Upstream commit b4f1f61ed5928b1128e60e38d0dffa16966f06dc ]

register_device_clock() misses a check for platform_device_register_simple().
Add a check to fix it.

Signed-off-by: huhai <huhai@kylinos.cn>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: PM: save NVS memory for Lenovo G40-45
Manyi Li [Wed, 22 Jun 2022 07:42:48 +0000 (15:42 +0800)]
ACPI: PM: save NVS memory for Lenovo G40-45

[ Upstream commit 4b7ef7b05afcde44142225c184bf43a0cd9e2178 ]

[821d6f0359b0614792ab8e2fb93b503e25a65079] is to make machines
produced from 2012 to now not saving NVS region to accelerate S3.

But, Lenovo G40-45, a platform released in 2015, still needs NVS memory
saving during S3. A quirk is introduced for this platform.

Signed-off-by: Manyi Li <limanyi@uniontech.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: EC: Drop the EC_FLAGS_IGNORE_DSDT_GPE quirk
Hans de Goede [Mon, 20 Jun 2022 09:25:44 +0000 (11:25 +0200)]
ACPI: EC: Drop the EC_FLAGS_IGNORE_DSDT_GPE quirk

[ Upstream commit f7090e0ef360d674f08a22fab90e4e209fb1f658 ]

It seems that these quirks are no longer necessary since
commit 69b957c26b32 ("ACPI: EC: Fix possible issues related to EC
initialization order"), which has fixed this in a generic manner.

There are 3 commits adding DMI entries with this quirk (adding multiple
DMI entries per commit). 2/3 commits are from before the generic fix.

Which leaves commit 6306f0431914 ("ACPI: EC: Make more Asus laptops
use ECDT _GPE"), which was committed way after the generic fix.
But this was just due to slow upstreaming of it. This commit stems
from Endless from 15 Aug 2017 (committed upstream 20 May 2021):
https://github.com/endlessm/linux/pull/288

The current code should work fine without this:

 1. The EC_FLAGS_IGNORE_DSDT_GPE flag is only checked in ec_parse_device(),
    like this:

if (boot_ec && boot_ec_is_ecdt && EC_FLAGS_IGNORE_DSDT_GPE) {
ec->gpe = boot_ec->gpe;
} else {
/* parse GPE */
}

 2. ec_parse_device() is only called from acpi_ec_add() and
    acpi_ec_dsdt_probe()

 3. acpi_ec_dsdt_probe() starts with:

if (boot_ec)
return;

    so it only calls ec_parse_device() when boot_ec == NULL, meaning that
    the quirk never triggers for this call. So only the call in
    acpi_ec_add() matters.

 4. acpi_ec_add() does the following after the ec_parse_device() call:

if (boot_ec && ec->command_addr == boot_ec->command_addr &&
    ec->data_addr == boot_ec->data_addr &&
    !EC_FLAGS_TRUST_DSDT_GPE) {
/*
 * Trust PNP0C09 namespace location rather than
 * ECDT ID. But trust ECDT GPE rather than _GPE
 * because of ASUS quirks, so do not change
 * boot_ec->gpe to ec->gpe.
 */
boot_ec->handle = ec->handle;
acpi_handle_debug(ec->handle, "duplicated.\n");
acpi_ec_free(ec);
ec = boot_ec;
}

The quirk only matters if boot_ec != NULL and EC_FLAGS_TRUST_DSDT_GPE
is never set at the same time as EC_FLAGS_IGNORE_DSDT_GPE.

That means that if the addresses match we always enter this if block and
then only the ec->handle part of the data stored in ec by ec_parse_device()
is used and the rest is thrown away, after which ec is made to point
to boot_ec, at which point ec->gpe == boot_ec->gpe, so the same result
as with the quirk set, independent of the value of the quirk.

Also note the comment in this block which indicates that the gpe result
from ec_parse_device() is deliberately not taken to deal with buggy
Asus laptops and all DMI quirks setting EC_FLAGS_IGNORE_DSDT_GPE are for
Asus laptops.

Based on the above I believe that unless on some quirked laptops
the ECDT and DSDT EC addresses do not match we can drop the quirk.

I've checked dmesg output to ensure the ECDT and DSDT EC addresses match
for quirked models using https://linux-hardware.org hw-probe reports.

I've been able to confirm that the addresses match for the following
models this way: GL702VMK, X505BA, X505BP, X550VXK, X580VD.
Whereas for the following models I could find any dmesg output:
FX502VD, FX502VE, X542BA, X542BP.

Note the models without dmesg all were submitted in patches with a batch
of models and other models from the same batch checkout ok.

This, combined with that all the code adding the quirks was written before
the generic fix makes me believe that it is safe to remove this quirk now.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Daniel Drake <drake@endlessos.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoACPI: EC: Remove duplicate ThinkPad X1 Carbon 6th entry from DMI quirks
Hans de Goede [Mon, 20 Jun 2022 09:25:43 +0000 (11:25 +0200)]
ACPI: EC: Remove duplicate ThinkPad X1 Carbon 6th entry from DMI quirks

[ Upstream commit 0dd6db359e5f206cbf1dd1fd40dd211588cd2725 ]

Somehow the "ThinkPad X1 Carbon 6th" entry ended up twice in the
struct dmi_system_id acpi_ec_no_wakeup[] array. Remove one of
the entries.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: OMAP2+: pdata-quirks: Fix refcount leak bug
Liang He [Sat, 18 Jun 2022 02:06:03 +0000 (10:06 +0800)]
ARM: OMAP2+: pdata-quirks: Fix refcount leak bug

[ Upstream commit 5cdbab96bab314c6f2f5e4e8b8a019181328bf5f ]

In pdata_quirks_init_clocks(), the loop contains
of_find_node_by_name() but without corresponding of_node_put().

Signed-off-by: Liang He <windhl@126.com>
Message-Id: <20220618020603.4055792-1-windhl@126.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: OMAP2+: display: Fix refcount leak bug
Liang He [Fri, 17 Jun 2022 14:58:03 +0000 (22:58 +0800)]
ARM: OMAP2+: display: Fix refcount leak bug

[ Upstream commit 50b87a32a79bca6e275918a711fb8cc55e16d739 ]

In omapdss_init_fbdev(), of_find_node_by_name() will return a node
pointer with refcount incremented. We should use of_node_put() when
it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Message-Id: <20220617145803.4050918-1-windhl@126.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agospi: synquacer: Add missing clk_disable_unprepare()
Guo Mengqi [Fri, 24 Jun 2022 00:56:14 +0000 (08:56 +0800)]
spi: synquacer: Add missing clk_disable_unprepare()

[ Upstream commit 917e43de2a56d9b82576f1cc94748261f1988458 ]

Add missing clk_disable_unprepare() in synquacer_spi_resume().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Link: https://lore.kernel.org/r/20220624005614.49434-1-guomengqi3@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: ux500: Fix Gavini accelerometer mounting matrix
Linus Walleij [Sat, 11 Jun 2022 20:51:38 +0000 (22:51 +0200)]
ARM: dts: ux500: Fix Gavini accelerometer mounting matrix

[ Upstream commit e24c75f02a81d6ddac0072cbd7a03e799c19d558 ]

This was fixed wrong so fix it. Now verified by using
iio-sensor-proxy monitor-sensor test program.

Link: https://lore.kernel.org/r/20220611205138.491513-1-linus.walleij@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: ux500: Fix Codina accelerometer mounting matrix
Linus Walleij [Sat, 11 Jun 2022 20:42:49 +0000 (22:42 +0200)]
ARM: dts: ux500: Fix Codina accelerometer mounting matrix

[ Upstream commit 0b2152e428ab91533a02888ff24e52e788dc4637 ]

This was fixed wrong so fix it again. Now verified by using
iio-sensor-proxy monitor-sensor test program.

Link: https://lore.kernel.org/r/20220611204249.472250-1-linus.walleij@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: BCM5301X: Add DT for Meraki MR26
Christian Lamparter [Fri, 17 Jun 2022 22:00:29 +0000 (00:00 +0200)]
ARM: dts: BCM5301X: Add DT for Meraki MR26

[ Upstream commit 935327a73553001f8d81375c76985d05f604507f ]

Meraki MR26 is an EOL wireless access point featuring a
PoE ethernet port and two dual-band 3x3 MIMO 802.11n
radios and 1x1 dual-band WIFI dedicated to scanning.

Thank you Amir for the unit and PSU.

Hardware info:
SOC   : Broadcom BCM53015A1KFEBG (dual-core Cortex-A9 CPU at 800 MHz)
RAM   : SK Hynix Inc. H5TQ1G63EFR, 1 GBit DDR3 SDRAM = 128 MiB
NAND  : Spansion S34ML01G100TF100, 1 GBit SLC NAND Flash = 128 MiB
ETH   : 1 GBit Ethernet Port - PoE (TPS23754 PoE Interface)
WIFI0 : Broadcom BCM43431KMLG, BCM43431 802.11 abgn (3x3:3)
WIFI1 : Broadcom BCM43431KMLG, BCM43431 802.11 abgn (3x3:3)
WIFI2 : Broadcom BCM43428 "Air Marshal" 802.11 abgn (1x1:1)
BUTTON: One reset key behind a small hole next to the Ethernet Port
LEDS  : One amber (fault), one white (indicator) LED, separate RGB-LED
MISC  : Atmel AT24C64 8KiB EEPROM i2c
      : Ti INA219 26V, 12-bit, i2c output current/voltage/power monitor

SERIAL:
      WARNING: The serial port needs a TTL/RS-232 3V3 level converter!
      The Serial setting is 115200-8-N-1. The board has a populated
      right angle 1x4 0.1" pinheader.
      The pinout is: VCC (next to J3, has the pin 1 indicator), RX, TX, GND.

Odd stuff:

- uboot does not support lzma compression, but gzip'd uImage/DTB work.
- uboot claims to support FIT, but fails to pass the DTB to the kernel.
  Appending the dtb after the kernel image works.
- RGB-controller is supported through an external userspace program.
- The ubi partition contains a "board-config" volume. It stores the
  MAC Address (0x66 in binary) and Serial No. (0x7c alpha-numerical).
- SoC's temperature sensor always reports that it is on fire.
  This causes the system to immediately shutdown! Looking at reported
  "418 degree Celsius" suggests that this sensor is not working.

WIFI:
b43 is able to initialize all three WIFIs @ 802.11bg.
| b43-phy0: Broadcom 43431 WLAN found (core revision 29)
| bcma-pci-bridge 0000:01:00.0: bus1: Switched to core: 0x812
| b43-phy0: Found PHY: Analog 9, Type 7 (HT), Revision 1
| b43-phy0: Found Radio: Manuf 0x17F, ID 0x2059, Revision 0, Version 1
| b43-phy0 warning: 5 GHz band is unsupported on this PHY
| b43-phy1: Broadcom 43431 WLAN found (core revision 29)
| bcma-pci-bridge 0001:01:00.0: bus2: Switched to core: 0x812
| b43-phy1: Found PHY: Analog 9, Type 7 (HT), Revision 1
| b43-phy1: Found Radio: Manuf 0x17F, ID 0x2059, Revision 0, Version 1
| b43-phy1 warning: 5 GHz band is unsupported on this PHY
| b43-phy2: Broadcom 43228 WLAN found (core revision 30)
| bcma-pci-bridge 0002:01:00.0: bus3: Switched to core: 0x812
| b43-phy2: Found PHY: Analog 9, Type 4 (N), Revision 16
| b43-phy2: Found Radio: Manuf 0x17F, ID 0x2057, Revision 9, Version 1
| Broadcom 43xx driver loaded [ Features: NL ]

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx6ul: fix qspi node compatible
Alexander Stein [Mon, 13 Jun 2022 12:33:57 +0000 (14:33 +0200)]
ARM: dts: imx6ul: fix qspi node compatible

[ Upstream commit 0c6cf86e1ab433b2d421880fdd9c6e954f404948 ]

imx6ul is not compatible to imx6sx, both have different erratas.
Fixes the dt_binding_check warning:
spi@21e0000: compatible: 'oneOf' conditional failed, one must be fixed:
['fsl,imx6ul-qspi', 'fsl,imx6sx-qspi'] is too long
Additional items are not allowed ('fsl,imx6sx-qspi' was unexpected)
'fsl,imx6ul-qspi' is not one of ['fsl,ls1043a-qspi']
'fsl,imx6ul-qspi' is not one of ['fsl,imx8mq-qspi']
'fsl,ls1021a-qspi' was expected
'fsl,imx7d-qspi' was expected

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx6ul: fix lcdif node compatible
Alexander Stein [Mon, 13 Jun 2022 12:33:56 +0000 (14:33 +0200)]
ARM: dts: imx6ul: fix lcdif node compatible

[ Upstream commit 1a884d17ca324531634cce82e9f64c0302bdf7de ]

In yaml binding "fsl,imx6ul-lcdif" is listed as compatible to imx6sx-lcdif,
but not imx28-lcdif. Change the list accordingly. Fixes the
dt_binding_check warning:
lcdif@21c8000: compatible: 'oneOf' conditional failed, one must be fixed:
['fsl,imx6ul-lcdif', 'fsl,imx28-lcdif'] is too long
Additional items are not allowed ('fsl,imx28-lcdif' was unexpected)
'fsl,imx6ul-lcdif' is not one of ['fsl,imx23-lcdif', 'fsl,imx28-lcdif',
'fsl,imx6sx-lcdif']
'fsl,imx6sx-lcdif' was expected

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx6ul: fix csi node compatible
Alexander Stein [Mon, 13 Jun 2022 12:33:55 +0000 (14:33 +0200)]
ARM: dts: imx6ul: fix csi node compatible

[ Upstream commit e0aca931a2c7c29c88ebf37f9c3cd045e083483d ]

"fsl,imx6ul-csi" was never listed as compatible to "fsl,imx7-csi", neither
in yaml bindings, nor previous txt binding. Remove the imx7 part. Fixes
the dt schema check warning:
csi@21c4000: compatible: 'oneOf' conditional failed, one must be fixed:
['fsl,imx6ul-csi', 'fsl,imx7-csi'] is too long
Additional items are not allowed ('fsl,imx7-csi' was unexpected)
'fsl,imx8mm-csi' was expected

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx6ul: fix keypad compatible
Alexander Stein [Mon, 13 Jun 2022 12:33:53 +0000 (14:33 +0200)]
ARM: dts: imx6ul: fix keypad compatible

[ Upstream commit 7d15e0c9a515494af2e3199741cdac7002928a0e ]

According to binding, the compatible shall only contain imx6ul and imx21
compatibles. Fixes the dt_binding_check warning:
keypad@20b8000: compatible: 'oneOf' conditional failed, one must be fixed:
['fsl,imx6ul-kpp', 'fsl,imx6q-kpp', 'fsl,imx21-kpp'] is too long
Additional items are not allowed ('fsl,imx6q-kpp', 'fsl,imx21-kpp' were
unexpected)
Additional items are not allowed ('fsl,imx21-kpp' was unexpected)
'fsl,imx21-kpp' was expected

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx6ul: change operating-points to uint32-matrix
Alexander Stein [Mon, 13 Jun 2022 12:33:52 +0000 (14:33 +0200)]
ARM: dts: imx6ul: change operating-points to uint32-matrix

[ Upstream commit edb67843983bbdf61b4c8c3c50618003d38bb4ae ]

operating-points is a uint32-matrix as per opp-v1.yaml. Change it
accordingly. While at it, change fsl,soc-operating-points as well,
although there is no bindings file (yet). But they should have the same
format. Fixes the dt_binding_check warning:
cpu@0: operating-points:0: [696000, 1275000, 528000, 1175000, 396000,
1025000, 198000, 950000] is too long
cpu@0: operating-points:0: Additional items are not allowed (528000,
1175000, 396000, 1025000, 198000, 950000 were unexpected)

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoARM: dts: imx6ul: add missing properties for sram
Alexander Stein [Mon, 13 Jun 2022 12:33:51 +0000 (14:33 +0200)]
ARM: dts: imx6ul: add missing properties for sram

[ Upstream commit 5655699cf5cff9f4c4ee703792156bdd05d1addf ]

All 3 properties are required by sram.yaml. Fixes the dtbs_check
warning:
sram@900000: '#address-cells' is a required property
sram@900000: '#size-cells' is a required property
sram@900000: 'ranges' is a required property

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agowait: Fix __wait_event_hrtimeout for RT/DL tasks
Juri Lelli [Mon, 27 Jun 2022 09:50:51 +0000 (11:50 +0200)]
wait: Fix __wait_event_hrtimeout for RT/DL tasks

[ Upstream commit cceeeb6a6d02e7b9a74ddd27a3225013b34174aa ]

Changes to hrtimer mode (potentially made by __hrtimer_init_sleeper on
PREEMPT_RT) are not visible to hrtimer_start_range_ns, thus not
accounted for by hrtimer_start_expires call paths. In particular,
__wait_event_hrtimeout suffers from this problem as we have, for
example:

fs/aio.c::read_events
  wait_event_interruptible_hrtimeout
    __wait_event_hrtimeout
      hrtimer_init_sleeper_on_stack <- this might "mode |= HRTIMER_MODE_HARD"
                                       on RT if task runs at RT/DL priority
        hrtimer_start_range_ns
          WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard)
          fires since the latter doesn't see the change of mode done by
          init_sleeper

Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires,
which is aware of the special RT/DL case, instead of hrtimer_start_range_ns.

Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20220627095051.42470-1-juri.lelli@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoirqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
William Dean [Sat, 23 Jul 2022 10:01:28 +0000 (18:01 +0800)]
irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()

[ Upstream commit 71349cc85e5930dce78ed87084dee098eba24b59 ]

The function ioremap() in gic_of_init() can fail, so
its return value should be checked.

Reported-by: Hacash Robot <hacashRobot@santino.com>
Signed-off-by: William Dean <williamsukatube@163.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220723100128.2964304-1-williamsukatube@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosched/core: Always flush pending blk_plug
John Keeping [Fri, 8 Jul 2022 16:27:02 +0000 (17:27 +0100)]
sched/core: Always flush pending blk_plug

[ Upstream commit 401e4963bf45c800e3e9ea0d3a0289d738005fd4 ]

With CONFIG_PREEMPT_RT, it is possible to hit a deadlock between two
normal priority tasks (SCHED_OTHER, nice level zero):

INFO: task kworker/u8:0:8 blocked for more than 491 seconds.
      Not tainted 5.15.49-rt46 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u8:0    state:D stack:    0 pid:    8 ppid:     2 flags:0x00000000
Workqueue: writeback wb_workfn (flush-7:0)
[<c08a3a10>] (__schedule) from [<c08a3d84>] (schedule+0xdc/0x134)
[<c08a3d84>] (schedule) from [<c08a65a0>] (rt_mutex_slowlock_block.constprop.0+0xb8/0x174)
[<c08a65a0>] (rt_mutex_slowlock_block.constprop.0) from [<c08a6708>]
+(rt_mutex_slowlock.constprop.0+0xac/0x174)
[<c08a6708>] (rt_mutex_slowlock.constprop.0) from [<c0374d60>] (fat_write_inode+0x34/0x54)
[<c0374d60>] (fat_write_inode) from [<c0297304>] (__writeback_single_inode+0x354/0x3ec)
[<c0297304>] (__writeback_single_inode) from [<c0297998>] (writeback_sb_inodes+0x250/0x45c)
[<c0297998>] (writeback_sb_inodes) from [<c0297c20>] (__writeback_inodes_wb+0x7c/0xb8)
[<c0297c20>] (__writeback_inodes_wb) from [<c0297f24>] (wb_writeback+0x2c8/0x2e4)
[<c0297f24>] (wb_writeback) from [<c0298c40>] (wb_workfn+0x1a4/0x3e4)
[<c0298c40>] (wb_workfn) from [<c0138ab8>] (process_one_work+0x1fc/0x32c)
[<c0138ab8>] (process_one_work) from [<c0139120>] (worker_thread+0x22c/0x2d8)
[<c0139120>] (worker_thread) from [<c013e6e0>] (kthread+0x16c/0x178)
[<c013e6e0>] (kthread) from [<c01000fc>] (ret_from_fork+0x14/0x38)
Exception stack(0xc10e3fb0 to 0xc10e3ff8)
3fa0:                                     00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000

INFO: task tar:2083 blocked for more than 491 seconds.
      Not tainted 5.15.49-rt46 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:tar             state:D stack:    0 pid: 2083 ppid:  2082 flags:0x00000000
[<c08a3a10>] (__schedule) from [<c08a3d84>] (schedule+0xdc/0x134)
[<c08a3d84>] (schedule) from [<c08a41b0>] (io_schedule+0x14/0x24)
[<c08a41b0>] (io_schedule) from [<c08a455c>] (bit_wait_io+0xc/0x30)
[<c08a455c>] (bit_wait_io) from [<c08a441c>] (__wait_on_bit_lock+0x54/0xa8)
[<c08a441c>] (__wait_on_bit_lock) from [<c08a44f4>] (out_of_line_wait_on_bit_lock+0x84/0xb0)
[<c08a44f4>] (out_of_line_wait_on_bit_lock) from [<c0371fb0>] (fat_mirror_bhs+0xa0/0x144)
[<c0371fb0>] (fat_mirror_bhs) from [<c0372a68>] (fat_alloc_clusters+0x138/0x2a4)
[<c0372a68>] (fat_alloc_clusters) from [<c0370b14>] (fat_alloc_new_dir+0x34/0x250)
[<c0370b14>] (fat_alloc_new_dir) from [<c03787c0>] (vfat_mkdir+0x58/0x148)
[<c03787c0>] (vfat_mkdir) from [<c0277b60>] (vfs_mkdir+0x68/0x98)
[<c0277b60>] (vfs_mkdir) from [<c027b484>] (do_mkdirat+0xb0/0xec)
[<c027b484>] (do_mkdirat) from [<c0100060>] (ret_fast_syscall+0x0/0x1c)
Exception stack(0xc2e1bfa8 to 0xc2e1bff0)
bfa0:                   01ee42f0 01ee4208 01ee42f0 000041ed 00000000 00004000
bfc0: 01ee42f0 01ee4208 00000000 00000027 01ee4302 00000004 000dcb00 01ee4190
bfe0: 000dc368 bed11924 0006d4b0 b6ebddfc

Here the kworker is waiting on msdos_sb_info::s_lock which is held by
tar which is in turn waiting for a buffer which is locked waiting to be
flushed, but this operation is plugged in the kworker.

The lock is a normal struct mutex, so tsk_is_pi_blocked() will always
return false on !RT and thus the behaviour changes for RT.

It seems that the intent here is to skip blk_flush_plug() in the case
where a non-preemptible lock (such as a spinlock) has been converted to
a rtmutex on RT, which is the case covered by the SM_RTLOCK_WAIT
schedule flag.  But sched_submit_work() is only called from schedule()
which is never called in this scenario, so the check can simply be
deleted.

Looking at the history of the -rt patchset, in fact this change was
present from v5.9.1-rt20 until being dropped in v5.13-rt1 as it was part
of a larger patch [1] most of which was replaced by commit b4bfa3fcfe3b
("sched/core: Rework the __schedule() preempt argument").

As described in [1]:

   The schedule process must distinguish between blocking on a regular
   sleeping lock (rwsem and mutex) and a RT-only sleeping lock (spinlock
   and rwlock):
   - rwsem and mutex must flush block requests (blk_schedule_flush_plug())
     even if blocked on a lock. This can not deadlock because this also
     happens for non-RT.
     There should be a warning if the scheduling point is within a RCU read
     section.

   - spinlock and rwlock must not flush block requests. This will deadlock
     if the callback attempts to acquire a lock which is already acquired.
     Similarly to being preempted, there should be no warning if the
     scheduling point is within a RCU read section.

and with the tsk_is_pi_blocked() in the scheduler path, we hit the first
issue.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/0022-locking-rtmutex-Use-custom-scheduling-function-for-s.patch?h=linux-5.10.y-rt-patches

Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20220708162702.1758865-1-john@metanate.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agogenirq: GENERIC_IRQ_IPI depends on SMP
Samuel Holland [Fri, 1 Jul 2022 20:00:50 +0000 (15:00 -0500)]
genirq: GENERIC_IRQ_IPI depends on SMP

[ Upstream commit 0f5209fee90b4544c58b4278d944425292789967 ]

The generic IPI code depends on the IRQ affinity mask being allocated
and initialized. This will not be the case if SMP is disabled. Fix up
the remaining driver that selected GENERIC_IRQ_IPI in a non-SMP config.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220701200056.46555-3-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agoirqchip/mips-gic: Only register IPI domain when SMP is enabled
Samuel Holland [Fri, 1 Jul 2022 20:00:49 +0000 (15:00 -0500)]
irqchip/mips-gic: Only register IPI domain when SMP is enabled

[ Upstream commit 8190cc572981f2f13b6ffc26c7cfa7899e5d3ccc ]

The MIPS GIC irqchip driver may be selected in a uniprocessor
configuration, but it unconditionally registers an IPI domain.

Limit the part of the driver dealing with IPIs to only be compiled when
GENERIC_IRQ_IPI is enabled, which corresponds to an SMP configuration.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220701200056.46555-2-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agogenirq: Don't return error on missing optional irq_request_resources()
Antonio Borneo [Thu, 12 May 2022 16:05:44 +0000 (18:05 +0200)]
genirq: Don't return error on missing optional irq_request_resources()

[ Upstream commit 95001b756467ecc9f5973eb5e74e97699d9bbdf1 ]

Function irq_chip::irq_request_resources() is reported as optional
in the declaration of struct irq_chip.
If the parent irq_chip does not implement it, we should ignore it
and return.

Don't return error if the functions is missing.

Signed-off-by: Antonio Borneo <antonio.borneo@foss.st.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220512160544.13561-1-antonio.borneo@foss.st.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
23 months agosched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg
Chen Yu [Sun, 12 Jun 2022 16:34:28 +0000 (00:34 +0800)]
sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg

[ Upstream commit 70fb5ccf2ebb09a0c8ebba775041567812d45f86 ]

[Problem Statement]
select_idle_cpu() might spend too much time searching for an idle CPU,
when the system is overloaded.

The following histogram is the time spent in select_idle_cpu(),
when running 224 instances of netperf on a system with 112 CPUs
per LLC domain:

@usecs:
[0]                  533 |                                                    |
[1]                 5495 |                                                    |
[2, 4)             12008 |                                                    |
[4, 8)            239252 |                                                    |
[8, 16)          4041924 |@@@@@@@@@@@@@@                                      |
[16, 32)        12357398 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         |
[32, 64)        14820255 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[64, 128)       13047682 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       |
[128, 256)       8235013 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@                        |
[256, 512)       4507667 |@@@@@@@@@@@@@@@                                     |
[512, 1K)        2600472 |@@@@@@@@@                                           |
[1K, 2K)          927912 |@@@                                                 |
[2K, 4K)          218720 |                                                    |
[4K, 8K)           98161 |                                                    |
[8K, 16K)          37722 |                                                    |
[16K, 32K)          6715 |                                                    |
[32K, 64K)           477 |                                                    |
[64K, 128K)            7 |                                                    |

netperf latency usecs:
=======
case             load         Lat_99th     std%
TCP_RR           thread-224       257.39 (  0.21)

The time spent in select_idle_cpu() is visible to netperf and might have a negative
impact.

[Symptom analysis]
The patch [1] from Mel Gorman has been applied to track the efficiency
of select_idle_sibling. Copy the indicators here:

SIS Search Efficiency(se_eff%):
        A ratio expressed as a percentage of runqueues scanned versus
        idle CPUs found. A 100% efficiency indicates that the target,
        prev or recent CPU of a task was idle at wakeup. The lower the
        efficiency, the more runqueues were scanned before an idle CPU
        was found.

SIS Domain Search Efficiency(dom_eff%):
        Similar, except only for the slower SIS
patch.

SIS Fast Success Rate(fast_rate%):
        Percentage of SIS that used target, prev or
recent CPUs.

SIS Success rate(success_rate%):
        Percentage of scans that found an idle CPU.

The test is based on Aubrey's schedtests tool, including netperf, hackbench,
schbench and tbench.

Test on vanilla kernel:
schedstat_parse.py -f netperf_vanilla.log
case         load     se_eff%     dom_eff%   fast_rate% success_rate%
TCP_RR    28 threads      99.978       18.535       99.995      100.000
TCP_RR    56 threads      99.397        5.671       99.964      100.000
TCP_RR    84 threads      21.721        6.818       73.632      100.000
TCP_RR   112 threads      12.500        5.533       59.000      100.000
TCP_RR   140 threads       8.524        4.535       49.020      100.000
TCP_RR   168 threads       6.438        3.945       40.309       99.999
TCP_RR   196 threads       5.397        3.718       32.320       99.982
TCP_RR   224 threads       4.874        3.661       25.775       99.767
UDP_RR    28 threads      99.988       17.704       99.997      100.000
UDP_RR    56 threads      99.528        5.977       99.970      100.000
UDP_RR    84 threads      24.219        6.992       76.479      100.000
UDP_RR   112 threads      13.907        5.706       62.538      100.000
UDP_RR   140 threads       9.408        4.699       52.519      100.000
UDP_RR   168 threads       7.095        4.077       44.352      100.000
UDP_RR   196 threads       5.757        3.775       35.764       99.991
UDP_RR   224 threads       5.124        3.704       28.748       99.860

schedstat_parse.py -f schbench_vanilla.log
(each group has 28 tasks)
case         load     se_eff%     dom_eff%   fast_rate% success_rate%
normal    1   mthread      99.152        6.400       99.941      100.000
normal    2   mthreads      97.844        4.003       99.908      100.000
normal    3   mthreads      96.395        2.118       99.917       99.998
normal    4   mthreads      55.288        1.451       98.615       99.804
normal    5   mthreads       7.004        1.870       45.597       61.036
normal    6   mthreads       3.354        1.346       20.777       34.230
normal    7   mthreads       2.183        1.028       11.257       21.055
normal    8   mthreads       1.653        0.825        7.849       15.549

schedstat_parse.py -f hackbench_vanilla.log
(each group has 28 tasks)
case load         se_eff%     dom_eff%   fast_rate% success_rate%
process-pipe      1 group          99.991        7.692       99.999      100.000
process-pipe     2 groups          99.934        4.615       99.997      100.000
process-pipe     3 groups          99.597        3.198       99.987      100.000
process-pipe     4 groups          98.378        2.464       99.958      100.000
process-pipe     5 groups          27.474        3.653       89.811       99.800
process-pipe     6 groups          20.201        4.098       82.763       99.570
process-pipe     7 groups          16.423        4.156       77.398       99.316
process-pipe     8 groups          13.165        3.920       72.232       98.828
process-sockets      1 group          99.977        5.882       99.999      100.000
process-sockets     2 groups          99.927        5.505       99.996      100.000
process-sockets     3 groups          99.397        3.250       99.980      100.000
process-sockets     4 groups          79.680        4.258       98.864       99.998
process-sockets     5 groups           7.673        2.503       63.659       92.115
process-sockets     6 groups           4.642        1.584       58.946       88.048
process-sockets     7 groups           3.493        1.379       49.816       81.164
process-sockets     8 groups           3.015        1.407       40.845       75.500
threads-pipe      1 group          99.997        0.000      100.000      100.000
threads-pipe     2 groups          99.894        2.932       99.997      100.000
threads-pipe     3 groups          99.611        4.117       99.983      100.000
threads-pipe     4 groups          97.703        2.624       99.937      100.000
threads-pipe     5 groups          22.919        3.623       87.150       99.764
threads-pipe     6 groups          18.016        4.038       80.491       99.557
threads-pipe     7 groups          14.663        3.991       75.239       99.247
threads-pipe     8 groups          12.242        3.808       70.651       98.644
threads-sockets      1 group          99.990        6.667       99.999      100.000
threads-sockets     2 groups          99.940        5.114       99.997      100.000
threads-sockets     3 groups          99.469        4.115       99.977      100.000
threads-sockets     4 groups          87.528        4.038       99.400      100.000
threads-sockets     5 groups           6.942        2.398       59.244       88.337
threads-sockets     6 groups           4.359        1.954       49.448       87.860
threads-sockets     7 groups           2.845        1.345       41.198       77.102
threads-sockets     8 groups           2.871        1.404       38.512       74.312

schedstat_parse.py -f tbench_vanilla.log
case load       se_eff%     dom_eff%   fast_rate% success_rate%
loopback   28 threads        99.976       18.369       99.995      100.000
loopback   56 threads        99.222        7.799       99.934      100.000
loopback   84 threads        19.723        6.819       70.215      100.000
loopback  112 threads        11.283        5.371       55.371       99.999
loopback  140 threads         0.000        0.000        0.000        0.000
loopback  168 threads         0.000        0.000        0.000        0.000
loopback  196 threads         0.000        0.000        0.000        0.000
loopback  224 threads         0.000        0.000        0.000        0.000

According to the test above, if the system becomes busy, the
SIS Search Efficiency(se_eff%) drops significantly. Although some
benchmarks would finally find an idle CPU(success_rate% = 100%), it is
doubtful whether it is worth it to search the whole LLC domain.

[Proposal]
It would be ideal to have a crystal ball to answer this question:
How many CPUs must a wakeup path walk down, before it can find an idle
CPU? Many potential metrics could be used to predict the number.
One candidate is the sum of util_avg in this LLC domain. The benefit
of choosing util_avg is that it is a metric of accumulated historic
activity, which seems to be smoother than instantaneous metrics
(such as rq->nr_running). Besides, choosing the sum of util_avg
would help predict the load of the LLC domain more precisely, because
SIS_PROP uses one CPU's idle time to estimate the total LLC domain idle
time.

In summary, the lower the util_avg is, the more select_idle_cpu()
should scan for idle CPU, and vice versa. When the sum of util_avg
in this LLC domain hits 85% or above, the scan stops. The reason to
choose 85% as the threshold is that this is the imbalance_pct(117)
when a LLC sched group is overloaded.

Introduce the quadratic function:

y = SCHED_CAPACITY_SCALE - p * x^2
and y'= y / SCHED_CAPACITY_SCALE

x is the ratio of sum_util compared to the CPU capacity:
x = sum_util / (llc_weight * SCHED_CAPACITY_SCALE)
y' is the ratio of CPUs to be scanned in the LLC domain,
and the number of CPUs to scan is calculated by:

nr_scan = llc_weight * y'

Choosing quadratic function is because:
[1] Compared to the linear function, it scans more aggressively when the
    sum_util is low.
[2] Compared to the exponential function, it is easier to calculate.
[3] It seems that there is no accurate mapping between the sum of util_avg
    and the number of CPUs to be scanned. Use heuristic scan for now.

For a platform with 112 CPUs per LLC, the number of CPUs to scan is:
sum_util%   0    5   15   25  35  45  55   65   75   85   86 ...
scan_nr   112  111  108  102  93  81  65   47   25    1    0 ...

For a platform with 16 CPUs per LLC, the number of CPUs to scan is:
sum_util%   0    5   15   25  35  45  55   65   75   85   86 ...
scan_nr    16   15   15   14  13  11   9    6    3    0    0 ...

Furthermore, to minimize the overhead of calculating the metrics in
select_idle_cpu(), borrow the statistics from periodic load balance.
As mentioned by Abel, on a platform with 112 CPUs per LLC, the
sum_util calculated by periodic load balance after 112 ms would
decay to about 0.5 * 0.5 * 0.5 * 0.7 = 8.75%, thus bringing a delay
in reflecting the latest utilization. But it is a trade-off.
Checking the util_avg in newidle load balance would be more frequent,
but it brings overhead - multiple CPUs write/read the per-LLC shared
variable and introduces cache contention. Tim also mentioned that,
it is allowed to be non-optimal in terms of scheduling for the
short-term variations, but if there is a long-term trend in the load
behavior, the scheduler can adjust for that.

When SIS_UTIL is enabled, the select_idle_cpu() uses the nr_scan
calculated by SIS_UTIL instead of the one from SIS_PROP. As Peter and
Mel suggested, SIS_UTIL should be enabled by default.

This patch is based on the util_avg, which is very sensitive to the
CPU frequency invariance. There is an issue that, when the max frequency
has been clamp, the util_avg would decay insanely fast when
the CPU is idle. Commit addca285120b ("cpufreq: intel_pstate: Handle no_turbo
in frequency invariance") could be used to mitigate this symptom, by adjusting
the arch_max_freq_ratio when turbo is disabled. But this issue is still
not thoroughly fixed, because the current code is unaware of the user-specified
max CPU frequency.

[Test result]

netperf and tbench were launched with 25% 50% 75% 100% 125% 150%
175% 200% of CPU number respectively. Hackbench and schbench were launched
by 1, 2 ,4, 8 groups. Each test lasts for 100 seconds and repeats 3 times.

The following is the benchmark result comparison between
baseline:vanilla v5.19-rc1 and compare:patched kernel. Positive compare%
indicates better performance.

Each netperf test is a:
netperf -4 -H 127.0.1 -t TCP/UDP_RR -c -C -l 100
netperf.throughput
=======
case             load     baseline(std%) compare%( std%)
TCP_RR           28 threads  1.00 (  0.34)  -0.16 (  0.40)
TCP_RR           56 threads  1.00 (  0.19)  -0.02 (  0.20)
TCP_RR           84 threads  1.00 (  0.39)  -0.47 (  0.40)
TCP_RR           112 threads  1.00 (  0.21)  -0.66 (  0.22)
TCP_RR           140 threads  1.00 (  0.19)  -0.69 (  0.19)
TCP_RR           168 threads  1.00 (  0.18)  -0.48 (  0.18)
TCP_RR           196 threads  1.00 (  0.16) +194.70 ( 16.43)
TCP_RR           224 threads  1.00 (  0.16) +197.30 (  7.85)
UDP_RR           28 threads  1.00 (  0.37)  +0.35 (  0.33)
UDP_RR           56 threads  1.00 ( 11.18)  -0.32 (  0.21)
UDP_RR           84 threads  1.00 (  1.46)  -0.98 (  0.32)
UDP_RR           112 threads  1.00 ( 28.85)  -2.48 ( 19.61)
UDP_RR           140 threads  1.00 (  0.70)  -0.71 ( 14.04)
UDP_RR           168 threads  1.00 ( 14.33)  -0.26 ( 11.16)
UDP_RR           196 threads  1.00 ( 12.92) +186.92 ( 20.93)
UDP_RR           224 threads  1.00 ( 11.74) +196.79 ( 18.62)

Take the 224 threads as an example, the SIS search metrics changes are
illustrated below:

    vanilla                    patched
   4544492          +237.5%   15338634        sched_debug.cpu.sis_domain_search.avg
     38539        +39686.8%   15333634        sched_debug.cpu.sis_failed.avg
  128300000          -87.9%   15551326        sched_debug.cpu.sis_scanned.avg
   5842896          +162.7%   15347978        sched_debug.cpu.sis_search.avg

There is -87.9% less CPU scans after patched, which indicates lower overhead.
Besides, with this patch applied, there is -13% less rq lock contention
in perf-profile.calltrace.cycles-pp._raw_spin_lock.raw_spin_rq_lock_nested
.try_to_wake_up.default_wake_function.woken_wake_function.
This might help explain the performance improvement - Because this patch allows
the waking task to remain on the previous CPU, rather than grabbing other CPUs'
lock.

Each hackbench test is a:
hackbench -g $job --process/threads --pipe/sockets -l 1000000 -s 100
hackbench.throughput
=========
case             load     baseline(std%) compare%( std%)
process-pipe     1 group   1.00 (  1.29)  +0.57 (  0.47)
process-pipe     2 groups   1.00 (  0.27)  +0.77 (  0.81)
process-pipe     4 groups   1.00 (  0.26)  +1.17 (  0.02)
process-pipe     8 groups   1.00 (  0.15)  -4.79 (  0.02)
process-sockets  1 group   1.00 (  0.63)  -0.92 (  0.13)
process-sockets  2 groups   1.00 (  0.03)  -0.83 (  0.14)
process-sockets  4 groups   1.00 (  0.40)  +5.20 (  0.26)
process-sockets  8 groups   1.00 (  0.04)  +3.52 (  0.03)
threads-pipe     1 group   1.00 (  1.28)  +0.07 (  0.14)
threads-pipe     2 groups   1.00 (  0.22)  -0.49 (  0.74)
threads-pipe     4 groups   1.00 (  0.05)  +1.88 (  0.13)
threads-pipe     8 groups   1.00 (  0.09)  -4.90 (  0.06)
threads-sockets  1 group   1.00 (  0.25)  -0.70 (  0.53)
threads-sockets  2 groups   1.00 (  0.10)  -0.63 (  0.26)
threads-sockets  4 groups   1.00 (  0.19) +11.92 (  0.24)
threads-sockets  8 groups   1.00 (  0.08)  +4.31 (  0.11)

Each tbench test is a:
tbench -t 100 $job 127.0.0.1
tbench.throughput
======
case             load     baseline(std%) compare%( std%)
loopback         28 threads  1.00 (  0.06)  -0.14 (  0.09)
loopback         56 threads  1.00 (  0.03)  -0.04 (  0.17)
loopback         84 threads  1.00 (  0.05)  +0.36 (  0.13)
loopback         112 threads  1.00 (  0.03)  +0.51 (  0.03)
loopback         140 threads  1.00 (  0.02)  -1.67 (  0.19)
loopback         168 threads  1.00 (  0.38)  +1.27 (  0.27)
loopback         196 threads  1.00 (  0.11)  +1.34 (  0.17)
loopback         224 threads  1.00 (  0.11)  +1.67 (  0.22)

Each schbench test is a:
schbench -m $job -t 28 -r 100 -s 30000 -c 30000
schbench.latency_90%_us
========
case             load     baseline(std%) compare%( std%)
normal           1 mthread  1.00 ( 31.22)  -7.36 ( 20.25)*
normal           2 mthreads  1.00 (  2.45)  -0.48 (  1.79)
normal           4 mthreads  1.00 (  1.69)  +0.45 (  0.64)
normal           8 mthreads  1.00 (  5.47)  +9.81 ( 14.28)

*Consider the Standard Deviation, this -7.36% regression might not be valid.

Also, a OLTP workload with a commercial RDBMS has been tested, and there
is no significant change.

There were concerns that unbalanced tasks among CPUs would cause problems.
For example, suppose the LLC domain is composed of 8 CPUs, and 7 tasks are
bound to CPU0~CPU6, while CPU7 is idle:

          CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6    CPU7
util_avg  1024    1024    1024    1024    1024    1024    1024    0

Since the util_avg ratio is 87.5%( = 7/8 ), which is higher than 85%,
select_idle_cpu() will not scan, thus CPU7 is undetected during scan.
But according to Mel, it is unlikely the CPU7 will be idle all the time
because CPU7 could pull some tasks via CPU_NEWLY_IDLE.

lkp(kernel test robot) has reported a regression on stress-ng.sock on a
very busy system. According to the sched_debug statistics, it might be caused
by SIS_UTIL terminates the scan and chooses a previous CPU earlier, and this
might introduce more context switch, especially involuntary preemption, which
impacts a busy stress-ng. This regression has shown that, not all benchmarks
in every scenario benefit from idle CPU scan limit, and it needs further
investigation.

Besides, there is slight regression in hackbench's 16 groups case when the
LLC domain has 16 CPUs. Prateek mentioned that we should scan aggressively
in an LLC domain with 16 CPUs. Because the cost to search for an idle one
among 16 CPUs is negligible. The current patch aims to propose a generic
solution and only considers the util_avg. Something like the below could
be applied on top of the current patch to fulfill the requirement:

if (llc_weight <= 16)
nr_scan = nr_scan * 32 / llc_weight;

For LLC domain with 16 CPUs, the nr_scan will be expanded to 2 times large.
The smaller the CPU number this LLC domain has, the larger nr_scan will be
expanded. This needs further investigation.

There is also ongoing work[2] from Abel to filter out the busy CPUs during
wakeup, to further speed up the idle CPU scan. And it could be a following-up
optimization on top of this change.

Suggested-by: Tim Chen <tim.c.chen@intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Mohini Narkhede <mohini.narkhede@intel.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://lore.kernel.org/r/20220612163428.849378-1-yu.c.chen@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>