review.tizen.org Git - platform/kernel/linux-rpi.git/log

dmaengine: idxd: remove redudant idxd_wq_disable_cleanup() call

idxd_wq_device_reset_cleanup() already calls idxd_wq_disable_cleanup().
There is no need to call idxd_wq_disable_cleanup() again in
idxd_device_wqs_clear_state(). Remove redudant call from
idxd_wq_device_reset_cleanup().

Fixes: 0dcfe41e9a4c ("dmanegine: idxd: cleanup all device related bits after disabling device")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165231365717.986350.2441351765955825964.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: free irq before wq type is reset

Call idxd_wq_free_irq() in the drv_disable_wq() function before
idxd_wq_reset() is called. Otherwise the wq type is reset and the irq does
not get freed.

Fixes: 63c14ae6c161 ("dmaengine: idxd: refactor wq driver enable/disable operations")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165231367316.986407.11001767338124941736.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: fix lockdep warning on device driver removal

Jacob reported that with lockdep debug turned on, idxd_device_driver
removal causes kernel splat from lock assert warning for
idxd_device_wqs_clear_state(). Make sure
idxd_device_wqs_clear_state() holds the wq lock for each wq when
cleaning the wq state. Move the call outside of the device spinlock.

Reported-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165231364426.986304.9294302800482492780.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: Separate user and kernel pasid enabling

The idxd driver always gated the pasid enabling under a single knob and
this assumption is incorrect. The pasid used for kernel operation can be
independently toggled and has no dependency on the user pasid (and vice
versa). Split the two so they are independent "enabled" flags.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165231431746.986466.5666862038354800551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: renesas,rcar-dmac: R-Car V3U is R-Car Gen4

Despite the name, R-Car V3U is the first member of the R-Car Gen4
family. Hence move its compatible value to the R-Car Gen4 section.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/e6e4cf701f3a43b061b9c3f7f0adc4d6addd4722.1651497024.git.geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: Fix the error handling path in idxd_cdev_register()

If a call to alloc_chrdev_region() fails, the already allocated resources
are leaking.

Add the needed error handling path to fix the leak.

Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/1b5033dcc87b5f2a953c413f0306e883e6114542.1650521591.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: tegra: Use platform_get_irq() to get IRQ resource

Use platform_irq_get() instead platform_get_resource() for IRQ resource
to fix the probe failure. platform_get_resource() fails to fetch the IRQ
resource as it might not be ready at that time.

platform_irq_get() is also the recommended way to get interrupt as it
directly gives the IRQ number and no conversion from resource is
required.

Fixes: ee17028009d4 ("dmaengine: tegra: Add tegra gpcdma driver")
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20220505091440.12981-1-akhilrajeev@nvidia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: mv_xor_v2 : Move spin_lock_bh() to spin_lock()

It is unnecessary to call spin_lock_bh() for that you are already
in a tasklet.

Signed-off-by: Yunbo Yu <yuyunbo519@gmail.com>
Link: https://lore.kernel.org/r/20220420122754.148359-1-yuyunbo519@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: refactor wq driver enable/disable operations

Move the core driver operations from wq driver to the drv_enable_wq() and
drv_disable_wq() functions. The move should reduce the wq driver's
knowledge of the core driver operations and prevent code confusion for
future wq drivers.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/165047301643.3841827.11222723219862233060.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: ti: k3-psil-am62: Update PSIL thread for saul.

Correct the RX PSIL thread for sa3ul.

Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com>
Fixes: 5ac6bfb587772 ("dmaengine: ti: k3-psil: Add AM62x PSIL and PDMA data")
Link: https://lore.kernel.org/r/20220421065323.16378-1-j-choudhary@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: ptdma: statify pt_tx_status

LKP bot reports a new warning:
Warning:
drivers/dma/ptdma/ptdma-dmaengine.c:262:1: warning: no previous prototype for 'pt_tx_status' [-Wmissing-prototypes]

pt_tx_status() should be static, so declare as such.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: d965068259d1 ("dmaengine: PTDMA: support polled mode")
Link: https://lore.kernel.org/r/20220421052407.745637-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: hidma: In hidma_prep_dma_memset treat value as a single byte

The value parameter is a single byte, so duplicate it to the 8 byte
range that is used as the pattern.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Cc: Sinan Kaya <okaya@kernel.org>
Link: https://lore.kernel.org/r/20220301182551.883474-5-benjamin.walker@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: at_xdmac: In at_xdmac_prep_dma_memset, treat value as a single byte

The value passed in to .prep_dma_memset is to be treated as a single
byte repeating pattern.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Cc: Ludovic Desroches <ludovic.desroches@microchip.com>
Cc: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20220301182551.883474-4-benjamin.walker@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: at_hdmac: In atc_prep_dma_memset, treat value as a single byte

The value passed in to .prep_dma_memset is to be treated as a single
byte repeating pattern.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Cc: Ludovic Desroches <ludovic.desroches@microchip.com>
Cc: Tudor Ambarus <tudor.ambarus@microchip.com>
Link: https://lore.kernel.org/r/20220301182551.883474-3-benjamin.walker@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: Document dmaengine_prep_dma_memset

Document this function to make clear the expected behavior of the
'value' parameter. It was intended to match the behavior of POSIX memset
as laid out here:

https://lore.kernel.org/dmaengine/YejrA5ZWZ3lTRO%2F1@matsya/

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Link: https://lore.kernel.org/r/20220301182551.883474-2-benjamin.walker@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: move wq irq enabling to after device enable

Move the calling of request_irq() and other related irq setup code until
after the WQ is successfully enabled. This reduces the amount of
setup/teardown if the wq is not configured correctly and cannot be enabled.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164642777730.179702.1880317757087484299.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: tegra: Remove unused including <linux/version.h>

Eliminate the follow versioncheck warning:

./drivers/dma/tegra186-gpc-dma.c: 21 linux/version.h not needed.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220413083842.69845-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: add verification of DMA_INTERRUPT capability for dmatest

Looks like I forgot to add DMA_INTERRUPT cap setting to the idxd driver and
dmatest is still working regardless of this mistake. Add an explicit check
of DMA_INTERRUPT capability for dmatest to make sure the DMA device being used
actually supports interrupt before the test is launched and also that the
driver is programmed correctly.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164978679251.2361020.5856734256126725993.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: mediatek: mtk-hsdma: use NULL instead of using plain integer as pointer

This fixes the following sparse warnings:
drivers/dma/mediatek/mtk-hsdma.c:604:26: warning: Using plain integer
as NULL pointer

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Link: https://lore.kernel.org/r/1649750340-30777-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: pl08x: drop the useless function

Unneeded variable: "retval". Return "NULL" , so we have to make code clear.
better way, drop the function.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Link: https://lore.kernel.org/r/1649726180-13133-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: set max_xfer and max_batch for RO device

Load the max_xfer_size and max_batch_size values from the values read from
registers to the shadow variables. This will allow the read-only device to
display the correct values for the sysfs attributes.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164971507673.2201761.11244446608988838897.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: set DMA_INTERRUPT cap bit

Even though idxd driver has always supported interrupt, it never actually
set the DMA_INTERRUPT cap bit. Rectify this mistake so the interrupt
capability is advertised.

Reported-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164971497859.2201379.17925303210723708961.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: jz4780: set DMA maximum segment size

Set the maximum segment size, since the hardware can do transfers larger
than the default 64 KiB returned by dma_get_max_seg_size().

The maximum segment size is limited by the 24-bit transfer count field
in DMA descriptors. The number of bytes is equal to the transfer count
times the transfer size unit, which is selected by the driver based on
the DMA buffer address and length of the transfer. The size unit can be
as small as 1 byte, so set the maximum segment size to 2^24-1 bytes to
ensure the transfer count will not overflow regardless of the size unit
selected by the driver.

Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220411153618.49876-1-aidanmacdonald.0x0@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: PTDMA: support polled mode

If the DMA_PREP_INTERRUPT flag is not provided, run in polled mode,
which significantly improves IOPS: more than twice on chunks < 4K.

Signed-off-by: Ilya Novikov <i.m.novikov@yadro.com>
Link: https://lore.kernel.org/r/20220413113733.59041-1-i.m.novikov@yadro.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: dmaengine: qcom: gpi: add compatible for sc7280

Document the compatible for GPI DMA controller on SC7280 SoC

Signed-off-by: Vinod Koul <vkoul@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220414064216.1182177-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: plx_dma: Move spin_lock_bh() to spin_lock()

It is unnecessary to call spin_lock_bh() if you are already in a tasklet.

Signed-off-by: Yunbo Yu <yuyunbo519@gmail.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20220418142021.1241558-1-yuyunbo519@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: dmaengine: xilinx_dma: Add MCMDA channel ID index description

MCDMA IP provides up to 16 multiple channels of data movement each on
MM2S and S2MM paths. Inline with implementation, in the binding add
description for the channel ID start index and mention that it's fixed
irrespective of the MCDMA IP configuration(number of read/write channels).

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/1649939061-6675-1-git-send-email-radhey.shyam.pandey@xilinx.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: qcom: gpi: Add SM8350 support

The Qualcomm SM8350 platform does, like the SM8450, provide a set of GPI
controllers with an ee-offset of 0x10000. Add this to the driver.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220412212959.2385085-1-bjorn.andersson@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: qcom: gpi: Add support for ee_offset

Controller on newer SoCs like SM8450 have registers at at offset. Add
ee_offset to driver_data and add this compatible for the driver.

Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220406132508.1029348-3-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: dmaengine: qcom: gpi: add compatible for sm8350/sm8350

Add the compatible for newer qcom socs with gpi dma i.e qcom sm8350 and
sm8450.

Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220406132508.1029348-2-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: qcom: gpi: set chain and link flag for duplex

Newer platforms seem to have strict requirement for TRE flags which
causes transaction to timeout. This was resolved to missing chain and
link flag for duplex spi transaction.

So add these two flags.

Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220406132508.1029348-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: Remove a useless mutex

According to lib/idr.c,
The IDA handles its own locking. It is safe to call any of the IDA
functions without synchronisation in your code.

so the 'chan_mutex' mutex can just be removed.
It is here only to protect some ida_alloc()/ida_free() calls.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/7180452c1d77b039e27b6f9418e0e7d9dd33c431.1644140845.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: update IAA definitions for user header

Add additional structure definitions for Intel In-memory Analytics
Accelerator (IAA/IAX). See specification (1) for more details.

1: https://cdrdv2.intel.com/v1/dl/getContent/721858

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164704100212.1373038.18362680016033557757.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: tegra: Add tegra gpcdma driver

Adding GPC DMA controller driver for Tegra. The driver supports dma
transfers between memory to memory, IO peripheral to memory and
memory to IO peripheral.

Co-developed-by: Pavan Kunapuli <pkunapuli@nvidia.com>
Signed-off-by: Pavan Kunapuli <pkunapuli@nvidia.com>
Co-developed-by: Rajesh Gumasta <rgumasta@nvidia.com>
Signed-off-by: Rajesh Gumasta <rgumasta@nvidia.com>
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20220225132044.14478-3-akhilrajeev@nvidia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: dmaengine: Add doc for tegra gpcdma

Add DT binding document for Nvidia Tegra GPCDMA controller.

Signed-off-by: Rajesh Gumasta <rgumasta@nvidia.com>
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20220225132044.14478-2-akhilrajeev@nvidia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: altr,msgdma: update my email address

This email should now be used to contact me.

Signed-off-by: Olivier Dautricourt <olivier.dautricourt@orolia.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/dc3decf1dae172c688017bd3ada2ad2b7d060c1e.1647539776.git.olivier.dautricourt@orolia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

MAINTAINERS: update my email address

This email should now be used to contact me.

Signed-off-by: Olivier Dautricourt <olivier.dautricourt@orolia.com>
Link: https://lore.kernel.org/r/85c4174fa162bd946ccf3e08dcfc9b83cfe69b5c.1647539776.git.olivier.dautricourt@orolia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: remove trailing white space on input str for wq name

Add string processing with strim() in order to remove trailing white spaces
that may be input by user for the wq->name.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164789525123.2799661.13795829125221129132.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: Clarify cyclic transfer residue documentation

The current documentation for the residue reported in a cyclic transfer
case mentions that the reported residue should be relative to the current
period only. However the definition of DMA_RESIDUE_GRANULARITY_SEGMENT
specifies that the residue should be updated after each period for
a cyclic transfer, which is in direct contradiction.

Moreover the pcm_dmaengine common code uses the residue relative to
the whole cyclic buffer size, not one period.

Correct the residue-related documentation to reflect this.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Link: https://lore.kernel.org/r/20220331134114.703782-1-paul.kocialkowski@bootlin.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: stm32-mdma: check the channel availability (secure or not)

STM32_MDMA_CCR bit[8] is used to enable Secure Mode (SM). If this bit is
set, it means that all the channel registers are write-protected. So the
channel is not available for Linux use.

Add stm32_mdma_filter_fn() callback filter and give it to
__dma_request_chan (instead of dma_get_any_slave_channel()), to exclude the
channel if it is marked Secure.

Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20220330103645.99969-1-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: bestcomm: Prepare cleanup of powerpc's asm/prom.h

powerpc's asm/prom.h brings some headers that it doesn't
need itself.

In order to clean it up, first add missing headers in
users of asm/prom.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/r/f98acba303489bdf003e7256460696225b00702e.1648833428.git.christophe.leroy@csgroup.eu
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: ep93xx: Remove redundant word in comment

Remove the second 'to' which is repeated.

Signed-off-by: jianchunfu <jianchunfu@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20220403123120.7794-1-jianchunfu@cmss.chinamobile.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: idxd: don't load pasid config until needed

The driver currently programs the system pasid to the WQ preemptively when
system pasid is enabled. Given that a dwq will reprogram the pasid and
possibly a different pasid, the programming is not necessary. The pasid_en
bit can be set for swq as it does not need pasid programming but
needs the pasid_en bit. Remove system pasid programming on device config
write. Add pasid programming for kernel wq type on wq driver enable. The
char dev driver already reprograms the dwq on ->open() call so there's no
change.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164935607115.1660372.6734518676950372366.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: mediatek-cqdma: Use platform_get_irq() to get the interrupt

platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220404155557.27316-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: mediatek: mtk-hsdma: Use platform_get_irq() to get the interrupt

platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq().

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220404155557.27316-3-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: nbpfaxi: Use platform_get_irq_optional() to get the interrupt

platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.

In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq_optional().

There are no non-DT users for this driver so interrupt range
(irq_res->start-irq_res->end) is no longer required and with DT we will
be sure it will be a single IRQ resource for each index.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220404155557.27316-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: sh: Kconfig: Make RZ_DMAC depend on ARCH_RZG2L

The DMAC block is identical on Renesas RZ/G2L, RZ/G2UL and RZ/V2L SoC's, so
instead of adding dependency for each SoC's add dependency on ARCH_RZG2L.
The ARCH_RZG2L config option is already selected by ARCH_R9A07G043,
ARCH_R9A07G044 and ARCH_R9A07G054.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20220406080417.14593-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dmaengine: sf-pdma: Get number of channel by device tree

It currently assumes that there are always four channels, it would
cause the error if there is actually less than four channels. Change
that by getting number of channel from device tree.

For backwards-compatibility, it uses the default value (i.e. 4) when
there is no 'dma-channels' information in dts.

Signed-off-by: Zong Li <zong.li@sifive.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Link: https://lore.kernel.org/r/f08a95b6582a51712c5b2c3cb859136d07bfa8b9.1648461096.git.zong.li@sifive.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

dt-bindings: dma-engine: sifive,fu540: Add dma-channels property and modify compatible

Add dma-channels property, then we can determine how many channels there
by device tree, rather than statically defining it in PDMA driver.
In addition, we also modify the compatible for PDMA versioning scheme.

Signed-off-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Suggested-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Link: https://lore.kernel.org/r/7cc9a7b5f7e6c28fc9eb172c441b5aed2159b8a0.1648461096.git.zong.li@sifive.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>

Linux 5.18-rc1

Merge tag 'trace-v5.18-2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:

- Rename the staging files to give them some meaning. Just
   stage1,stag2,etc, does not show what they are for

- Check for NULL from allocation in bootconfig

- Hold event mutex for dyn_event call in user events

- Mark user events to broken (to work on the API)

- Remove eBPF updates from user events

- Remove user events from uapi header to keep it from being installed.

- Move ftrace_graph_is_dead() into inline as it is called from hot
   paths and also convert it into a static branch.

* tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Move user_events.h temporarily out of include/uapi
  ftrace: Make ftrace_graph_is_dead() a static branch
  tracing: Set user_events to BROKEN
  tracing/user_events: Remove eBPF interfaces
  tracing/user_events: Hold event_mutex during dyn_event_add
  proc: bootconfig: Add null pointer check
  tracing: Rename the staging files for trace_events

Merge tag 'clk-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
"A single revert to fix a boot regression seen when clk_put() started
  dropping rate range requests. It's best to keep various systems
  booting so we'll kick this out and try again next time"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  Revert "clk: Drop the rate range on clk_put()"

Merge tag 'x86-urgent-2022-04-03' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes and updates:

   - Make the prctl() for enabling dynamic XSTATE components correct so
     it adds the newly requested feature to the permission bitmap
     instead of overwriting it. Add a selftest which validates that.

   - Unroll string MMIO for encrypted SEV guests as the hypervisor
     cannot emulate it.

   - Handle supervisor states correctly in the FPU/XSTATE code so it
     takes the feature set of the fpstate buffer into account. The
     feature sets can differ between host and guest buffers. Guest
     buffers do not contain supervisor states. So far this was not an
     issue, but with enabling PASID it needs to be handled in the buffer
     offset calculation and in the permission bitmaps.

   - Avoid a gazillion of repeated CPUID invocations in by caching the
     values early in the FPU/XSTATE code.

   - Enable CONFIG_WERROR in x86 defconfig.

   - Make the X86 defconfigs more useful by adapting them to Y2022
     reality"

* tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/xstate: Consolidate size calculations
  x86/fpu/xstate: Handle supervisor states in XSTATE permissions
  x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
  x86/fpu: Cache xfeature flags from CPUID
  x86/fpu/xsave: Initialize offset/size cache early
  x86/fpu: Remove unused supervisor only offsets
  x86/fpu: Remove redundant XCOMP_BV initialization
  x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO
  x86/config: Make the x86 defconfigs a bit more usable
  x86/defconfig: Enable WERROR
  selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test
  x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation

Merge tag 'core-urgent-2022-04-03' of git://git./linux/kernel/git/tip/tip

Pull RT signal fix from Thomas Gleixner:
"Revert the RT related signal changes. They need to be reworked and
generalized"

* tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"

Merge tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping

Pull more dma-mapping updates from Christoph Hellwig:

- fix a regression in dma remap handling vs AMD memory encryption (me)

- finally kill off the legacy PCI DMA API (Christophe JAILLET)

* tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: move pgprot_decrypted out of dma_pgprot
  PCI/doc: cleanup references to the legacy PCI DMA API
  PCI: Remove the deprecated "pci-dma-compat.h" API

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

- avoid unnecessary rebuilds for library objects

- fix return value of __setup handlers

- fix invalid input check for "crashkernel=" kernel option

- silence KASAN warnings in unwind_frame

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9191/1: arm/stacktrace, kasan: Silence KASAN warnings in unwind_frame()
  ARM: 9190/1: kdump: add invalid input check for 'crashkernel=0'
  ARM: 9187/1: JIVE: fix return value of __setup handler
  ARM: 9189/1: decompressor: fix unneeded rebuilds of library objects

Revert "clk: Drop the rate range on clk_put()"

This reverts commit 7dabfa2bc4803eed83d6f22bd6f045495f40636b. There are
multiple reports that this breaks boot on various systems. The common
theme is that orphan clks are having rates set on them when that isn't
expected. Let's revert it out for now so that -rc1 boots.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/r/366a0232-bb4a-c357-6aa8-636e398e05eb@samsung.com
Cc: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20220403022818.39572-1-sboyd@kernel.org

Merge tag 'perf-tools-for-v5.18-2022-04-02' of git://git./linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

- Avoid SEGV if core.cpus isn't set in 'perf stat'.

- Stop depending on .git files for building PERF-VERSION-FILE, used in
   'perf --version', fixing some perf tools build scenarios.

- Convert tracepoint.py example to python3.

- Update UAPI header copies from the kernel sources: socket,
   mman-common, msr-index, KVM, i915 and cpufeatures.

- Update copy of libbpf's hashmap.c.

- Directly return instead of using local ret variable in
   evlist__create_syswide_maps(), found by coccinelle.

* tag 'perf-tools-for-v5.18-2022-04-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf python: Convert tracepoint.py example to python3
  perf evlist: Directly return instead of using local ret variable
  perf cpumap: More cpu map reuse by merge.
  perf cpumap: Add is_subset function
  perf evlist: Rename cpus to user_requested_cpus
  perf tools: Stop depending on .git files for building PERF-VERSION-FILE
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools headers UAPI: Sync linux/kvm.h with the kernel sources
  tools kvm headers arm64: Update KVM headers from the kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers UAPI: Sync asm-generic/mman-common.h with the kernel
  perf beauty: Update copy of linux/socket.h with the kernel sources
  perf tools: Update copy of libbpf's hashmap.c
  perf stat: Avoid SEGV if core.cpus isn't set

Merge tag 'kbuild-fixes-v5.18' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- Fix empty $(PYTHON) expansion.

- Fix UML, which got broken by the attempt to suppress Clang warnings.

- Fix warning message in modpost.

* tag 'kbuild-fixes-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  modpost: restore the warning message for missing symbol versions
  Revert "um: clang: Strip out -mno-global-merge from USER_CFLAGS"
  kbuild: Remove '-mno-global-merge'
  kbuild: fix empty ${PYTHON} in scripts/link-vmlinux.sh
  kconfig: remove stale comment about removed kconfig_print_symbol()

Merge tag 'mips_5.18_1' of git://git./linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

- build fix for gpio

- fix crc32 build problems

- check for failed memory allocations

* tag 'mips_5.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: crypto: Fix CRC32 code
  MIPS: rb532: move GPIOD definition into C-files
  MIPS: lantiq: check the return value of kzalloc()
  mips: sgi-ip22: add a check for the return of kzalloc()

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:

- Only do MSR filtering for MSRs accessed by rdmsr/wrmsr

- Documentation improvements

- Prevent module exit until all VMs are freed

- PMU Virtualization fixes

- Fix for kvm_irq_delivery_to_apic_fast() NULL-pointer dereferences

- Other miscellaneous bugfixes

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
  KVM: x86: fix sending PV IPI
  KVM: x86/mmu: do compare-and-exchange of gPTE via the user address
  KVM: x86: Remove redundant vm_entry_controls_clearbit() call
  KVM: x86: cleanup enter_rmode()
  KVM: x86: SVM: fix tsc scaling when the host doesn't support it
  kvm: x86: SVM: remove unused defines
  KVM: x86: SVM: move tsc ratio definitions to svm.h
  KVM: x86: SVM: fix avic spec based definitions again
  KVM: MIPS: remove reference to trap&emulate virtualization
  KVM: x86: document limitations of MSR filtering
  KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr
  KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
  KVM: x86/pmu: Fix and isolate TSX-specific performance event logic
  KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set
  KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
  KVM: x86: Trace all APICv inhibit changes and capture overall status
  KVM: x86: Add wrappers for setting/clearing APICv inhibits
  KVM: x86: Make APICv inhibit reasons an enum and cleanup naming
  KVM: X86: Handle implicit supervisor access with SMAP
  KVM: X86: Rename variable smap to not_smap in permission_fault()
  ...

modpost: restore the warning message for missing symbol versions

This log message was accidentally chopped off.

I was wondering why this happened, but checking the ML log, Mark
precisely followed my suggestion [1].

I just used "..." because I was too lazy to type the sentence fully.
Sorry for the confusion.

[1]: https://lore.kernel.org/all/CAK7LNAR6bXXk9-ZzZYpTqzFqdYbQsZHmiWspu27rtsFxvfRuVA@mail.gmail.com/

Fixes: 4a6795933a89 ("kbuild: modpost: Explicitly warn about unprototyped symbols")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Merge tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block

Pull block driver fix from Jens Axboe:
"Got two reports on nbd spewing warnings on load now, which is a
  regression from a commit that went into your tree yesterday.

  Revert the problematic change for now"

* tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block:
  Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"

Merge tag 'pci-v5.18-changes-2' of git://git./linux/kernel/git/helgaas/pci

Pull pci fix from Bjorn Helgaas:

- Fix Hyper-V "defined but not used" build issue added during merge
window (YueHaibing)

* tag 'pci-v5.18-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: hv: Remove unused hv_set_msi_entry_from_desc()

Merge tag 'tag-chrome-platform-for-v5.18' of git://git./linux/kernel/git/chrome-platform/linux

Pull chrome platform updates from Benson Leung:
"cros_ec_typec:

   - Check for EC device - Fix a crash when using the cros_ec_typec
     driver on older hardware not capable of typec commands

   - Make try power role optional

   - Mux configuration reorganization series from Prashant

  cros_ec_debugfs:

   - Fix use after free. Thanks Tzung-bi

  sensorhub:

   - cros_ec_sensorhub fixup - Split trace include file

  misc:

   - Add new mailing list for chrome-platform development:

chrome-platform@lists.linux.dev

     Now with patchwork!"

* tag 'tag-chrome-platform-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: cros_ec_debugfs: detach log reader wq from devm
  platform: chrome: Split trace include file
  platform/chrome: cros_ec_typec: Update mux flags during partner removal
  platform/chrome: cros_ec_typec: Configure muxes at start of port update
  platform/chrome: cros_ec_typec: Get mux state inside configure_mux
  platform/chrome: cros_ec_typec: Move mux flag checks
  platform/chrome: cros_ec_typec: Check for EC device
  platform/chrome: cros_ec_typec: Make try power role optional
  MAINTAINERS: platform-chrome: Add new chrome-platform@lists.linux.dev list

Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"

This reverts commit 6d35d04a9e18990040e87d2bbf72689252669d54.

Both Gabriel and Borislav report that this commit casues a regression
with nbd:

sysfs: cannot create duplicate filename '/dev/block/43:0'

Revert it before 5.18-rc1 and we'll investigage this separately in
due time.

Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/
Reported-by: Gabriel L. Somlo <somlo@cmu.edu>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

watch_queue: Free the page array when watch_queue is dismantled

Commit 7ea1a0124b6d ("watch_queue: Free the alloc bitmap when the
watch_queue is torn down") took care of the bitmap, but not the page
array.

  BUG: memory leak
  unreferenced object 0xffff88810d9bc140 (size 32):
  comm "syz-executor335", pid 3603, jiffies 4294946994 (age 12.840s)
  hex dump (first 32 bytes):
    40 a7 40 04 00 ea ff ff 00 00 00 00 00 00 00 00  @.@.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
     kmalloc_array include/linux/slab.h:621 [inline]
     kcalloc include/linux/slab.h:652 [inline]
     watch_queue_set_size+0x12f/0x2e0 kernel/watch_queue.c:251
     pipe_ioctl+0x82/0x140 fs/pipe.c:632
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:874 [inline]
     __se_sys_ioctl fs/ioctl.c:860 [inline]
     __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:860
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]

Reported-by: syzbot+25ea042ae28f3888727a@syzkaller.appspotmail.com
Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20220322004654.618274-1-eric.dumazet@gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tracing: mark user_events as BROKEN

After being merged, user_events become more visible to a wider audience
that have concerns with the current API.

It is too late to fix this for this release, but instead of a full
revert, just mark it as BROKEN (which prevents it from being selected in
make config). Then we can work finding a better API. If that fails,
then it will need to be completely reverted.

To not have the code silently bitrot, still allow building it with
COMPILE_TEST.

And to prevent the uapi header from being installed, then later changed,
and then have an old distro user space see the old version, move the
header file out of the uapi directory.

Surround the include with CONFIG_COMPILE_TEST to the current location,
but when the BROKEN tag is taken off, it will use the uapi directory,
and fail to compile. This is a good way to remind us to move the header
back.

Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tracing: Move user_events.h temporarily out of include/uapi

While user_events API is under development and has been marked for broken
to not let the API become fixed, move the header file out of the uapi
directory. This is to prevent it from being installed, then later changed,
and then have an old distro user space update with a new kernel, where
applications see the user_events being available, but the old header is in
place, and then they get compiled incorrectly.

Also, surround the include with CONFIG_COMPILE_TEST to the current
location, but when the BROKEN tag is taken off, it will use the uapi
directory, and fail to compile. This is a good way to remind us to move
the header back.

Link: https://lore.kernel.org/all/20220330155835.5e1f6669@gandalf.local.home
Link: https://lkml.kernel.org/r/20220330201755.29319-1-mathieu.desnoyers@efficios.com
Link: https://lkml.kernel.org/r/20220401143903.188384f3@gandalf.local.home
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

ftrace: Make ftrace_graph_is_dead() a static branch

ftrace_graph_is_dead() is used on hot paths, it just reads a variable
in memory and is not worth suffering function call constraints.

For instance, at entry of prepare_ftrace_return(), inlining it avoids
saving prepare_ftrace_return() parameters to stack and restoring them
after calling ftrace_graph_is_dead().

While at it using a static branch is even more performant and is
rather well adapted considering that the returned value will almost
never change.

Inline ftrace_graph_is_dead() and replace 'kill_ftrace_graph' bool
by a static branch.

The performance improvement is noticeable.

Link: https://lkml.kernel.org/r/e0411a6a0ed3eafff0ad2bc9cd4b0e202b4617df.1648623570.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

tracing: Set user_events to BROKEN

After being merged, user_events become more visible to a wider audience
that have concerns with the current API. It is too late to fix this for
this release, but instead of a full revert, just mark it as BROKEN (which
prevents it from being selected in make config). Then we can work finding
a better API. If that fails, then it will need to be completely reverted.

Link: https://lore.kernel.org/all/2059213643.196683.1648499088753.JavaMail.zimbra@efficios.com/
Link: https://lkml.kernel.org/r/20220330155835.5e1f6669@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

tracing/user_events: Remove eBPF interfaces

Remove eBPF interfaces within user_events to ensure they are fully
reviewed.

Link: https://lore.kernel.org/all/20220329165718.GA10381@kbox/
Link: https://lkml.kernel.org/r/20220329173051.10087-1-beaub@linux.microsoft.com
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

tracing/user_events: Hold event_mutex during dyn_event_add

Make sure the event_mutex is properly held during dyn_event_add call.
This is required when adding dynamic events.

Link: https://lkml.kernel.org/r/20220328223225.1992-1-beaub@linux.microsoft.com
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

proc: bootconfig: Add null pointer check

kzalloc is a memory allocation function which can return NULL when some
internal memory errors happen. It is safer to add null pointer check.

Link: https://lkml.kernel.org/r/20220329104004.2376879-1-lv.ruyi@zte.com.cn
Cc: stable@vger.kernel.org
Fixes: c1a3c36017d4 ("proc: bootconfig: Add /proc/bootconfig to show boot config list")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

tracing: Rename the staging files for trace_events

When looking for implementation of different phases of the creation of the
TRACE_EVENT() macro, it is pretty useless when all helper macro
redefinitions are in files labeled "stageX_defines.h". Rename them to
state which phase the files are for. For instance, when looking for the
defines that are used to create the event fields, seeing
"stage4_event_fields.h" gives the developer a good idea that the defines
are in that file.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

KVM: x86: fix sending PV IPI

If apic_id is less than min, and (max - apic_id) is greater than
KVM_IPI_CLUSTER_SIZE, then the third check condition is satisfied but
the new apic_id does not fit the bitmask. In this case __send_ipi_mask
should send the IPI.

This is mostly theoretical, but it can happen if the apic_ids on three
iterations of the loop are for example 1, KVM_IPI_CLUSTER_SIZE, 0.

Fixes: aaffcfd1e82 ("KVM: X86: Implement PV IPIs in linux guest")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Message-Id: <1646814944-51801-1-git-send-email-lirongqing@baidu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/mmu: do compare-and-exchange of gPTE via the user address

FNAME(cmpxchg_gpte) is an inefficient mess.  It is at least decent if it
can go through get_user_pages_fast(), but if it cannot then it tries to
use memremap(); that is not just terribly slow, it is also wrong because
it assumes that the VM_PFNMAP VMA is contiguous.

The right way to do it would be to do the same thing as
hva_to_pfn_remapped() does since commit add6a0cd1c5b ("KVM: MMU: try to
fix up page faults before giving up", 2016-07-05), using follow_pte()
and fixup_user_fault() to determine the correct address to use for
memremap().  To do this, one could for example extract hva_to_pfn()
for use outside virt/kvm/kvm_main.c.  But really there is no reason to
do that either, because there is already a perfectly valid address to
do the cmpxchg() on, only it is a userspace address.  That means doing
user_access_begin()/user_access_end() and writing the code in assembly
to handle exceptions correctly.  Worse, the guest PTE can be 8-byte
even on i686 so there is the extra complication of using cmpxchg8b to
account for.  But at least it is an efficient mess.

(Thanks to Linus for suggesting improvement on the inline assembly).

Reported-by: Qiuhao Li <qiuhao@sysec.org>
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <kangel@zju.edu.cn>
Reported-by: syzbot+6cde2282daa792c49ab8@syzkaller.appspotmail.com
Debugged-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bd53cb35a3e9 ("X86/KVM: Handle PFNs outside of kernel reach when touching GPTEs")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Remove redundant vm_entry_controls_clearbit() call

When emulating exit from long mode, EFER_LMA is cleared with
vmx_set_efer(). This will already unset the VM_ENTRY_IA32E_MODE control
bit as requested by SDM, so there is no need to unset VM_ENTRY_IA32E_MODE
again in exit_lmode() explicitly. In case EFER isn't supported by
hardware, long mode isn't supported, so exit_lmode() cannot be reached.

Note that, thanks to the shadow controls mechanism, this change doesn't
eliminate vmread or vmwrite.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20220311102643.807507-3-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: cleanup enter_rmode()

vmx_set_efer() sets uret->data but, in fact if the value of uret->data
will be used vmx_setup_uret_msrs() will have rewritten it with the value
returned by update_transition_efer(). uret->data is consumed if and only
if uret->load_into_hardware is true, and vmx_setup_uret_msrs() takes care
of (a) updating uret->data before setting uret->load_into_hardware to true
(b) setting uret->load_into_hardware to false if uret->data isn't updated.

Opportunistically use "vmx" directly instead of redoing to_vmx().

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20220311102643.807507-2-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: fix tsc scaling when the host doesn't support it

It was decided that when TSC scaling is not supported,
the virtual MSR_AMD64_TSC_RATIO should still have the default '1.0'
value.

However in this case kvm_max_tsc_scaling_ratio is not set,
which breaks various assumptions.

Fix this by always calculating kvm_max_tsc_scaling_ratio regardless of
host support. For consistency, do the same for VMX.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220322172449.235575-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: x86: SVM: remove unused defines

Remove some unused #defines from svm.c

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220322172449.235575-7-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: move tsc ratio definitions to svm.h

Another piece of SVM spec which should be in the header file

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220322172449.235575-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: fix avic spec based definitions again

Due to wrong rebase, commit
4a204f7895878 ("KVM: SVM: Allow AVIC support on system w/ physical APIC ID > 255")

moved avic spec #defines back to avic.c.

Move them back, and while at it extend AVIC_DOORBELL_PHYSICAL_ID_MASK to 12
bits as well (it will be used in nested avic)

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220322172449.235575-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: MIPS: remove reference to trap&emulate virtualization

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220313140522.1307751-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: document limitations of MSR filtering

MSR filtering requires an exit to userspace that is hard to implement and
would be very slow in the case of nested VMX vmexit and vmentry MSR
accesses. Document the limitation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Only do MSR filtering when access MSR by rdmsr/wrmsr

If MSR access is rejected by MSR filtering,
kvm_set_msr()/kvm_get_msr() would return KVM_MSR_RET_FILTERED,
and the return value is only handled well for rdmsr/wrmsr.
However, some instruction emulation and state transition also
use kvm_set_msr()/kvm_get_msr() to do msr access but may trigger
some unexpected results if MSR access is rejected, E.g. RDPID
emulation would inject a #UD but RDPID wouldn't cause a exit
when RDPID is supported in hardware and ENABLE_RDTSCP is set.
And it would also cause failure when load MSR at nested entry/exit.
Since msr filtering is based on MSR bitmap, it is better to only
do MSR filtering for rdmsr/wrmsr.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Message-Id: <2b2774154f7532c96a6f04d71c82a8bec7d9e80b.1646655860.git.houwenlong.hwl@antgroup.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/emulator: Emulate RDPID only if it is enabled in guest

When RDTSCP is supported but RDPID is not supported in host,
RDPID emulation is available. However, __kvm_get_msr() would
only fail when RDTSCP/RDPID both are disabled in guest, so
the emulator wouldn't inject a #UD when RDPID is disabled but
RDTSCP is enabled in guest.

Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Message-Id: <1dfd46ae5b76d3ed87bde3154d51c64ea64c99c1.1646226788.git.houwenlong.hwl@antgroup.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/pmu: Fix and isolate TSX-specific performance event logic

HSW_IN_TX* bits are used in generic code which are not supported on
AMD. Worse, these bits overlap with AMD EventSelect[11:8] and hence
using HSW_IN_TX* bits unconditionally in generic code is resulting in
unintentional pmu behavior on AMD. For example, if EventSelect[11:8]
is 0x2, pmc_reprogram_counter() wrongly assumes that
HSW_IN_TX_CHECKPOINTED is set and thus forces sampling period to be 0.

Also per the SDM, both bits 32 and 33 "may only be set if the processor
supports HLE or RTM" and for "IN_TXCP (bit 33): this bit may only be set
for IA32_PERFEVTSEL2."

Opportunistically eliminate code redundancy, because if the HSW_IN_TX*
bit is set in pmc->eventsel, it is already set in attr.config.

Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reported-by: Jim Mattson <jmattson@google.com>
Fixes: 103af0a98788 ("perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5")
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220309084257.88931-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set

It makes more sense to print new SPTE value than the
old value.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220302102457.588450-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs

AMD EPYC CPUs never raise a #GP for a WRMSR to a PerfEvtSeln MSR. Some
reserved bits are cleared, and some are not. Specifically, on
Zen3/Milan, bits 19 and 42 are not cleared.

When emulating such a WRMSR, KVM should not synthesize a #GP,
regardless of which bits are set. However, undocumented bits should
not be passed through to the hardware MSR. So, rather than checking
for reserved bits and synthesizing a #GP, just clear the reserved
bits.

This may seem pedantic, but since KVM currently does not support the
"Host/Guest Only" bits (41:40), it is necessary to clear these bits
rather than synthesizing #GP, because some popular guests (e.g Linux)
will set the "Host Only" bit even on CPUs that don't support
EFER.SVME, and they don't expect a #GP.

For example,

root@Ubuntu1804:~# perf stat -e r26 -a sleep 1

Performance counter stats for 'system wide':

                 0      r26

       1.001070977 seconds time elapsed

Feb 23 03:59:58 Ubuntu1804 kernel: [  405.379957] unchecked MSR access error: WRMSR to 0xc0010200 (tried to write 0x0000020000130026) at rIP: 0xffffffff9b276a28 (native_write_msr+0x8/0x30)
Feb 23 03:59:58 Ubuntu1804 kernel: [  405.379958] Call Trace:
Feb 23 03:59:58 Ubuntu1804 kernel: [  405.379963]  amd_pmu_disable_event+0x27/0x90

Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM")
Reported-by: Lotus Fenn <lotusf@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Like Xu <likexu@tencent.com>
Reviewed-by: David Dunn <daviddunn@google.com>
Message-Id: <20220226234131.2167175-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Trace all APICv inhibit changes and capture overall status

Trace all APICv inhibit changes instead of just those that result in
APICv being (un)inhibited, and log the current state. Debugging why
APICv isn't working is frustrating as it's hard to see why APICv is still
inhibited, and logging only the first inhibition means unnecessary onion
peeling.

Opportunistically drop the export of the tracepoint, it is not and should
not be used by vendor code due to the need to serialize toggling via
apicv_update_lock.

Note, using the common flow means kvm_apicv_init() switched from atomic
to non-atomic bitwise operations. The VM is unreachable at init, so
non-atomic is perfectly ok.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220311043517.17027-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Add wrappers for setting/clearing APICv inhibits

Add set/clear wrappers for toggling APICv inhibits to make the call sites
more readable, and opportunistically rename the inner helpers to align
with the new wrappers and to make them more readable as well. Invert the
flag from "activate" to "set"; activate is painfully ambiguous as it's
not obvious if the inhibit is being activated, or if APICv is being
activated, in which case the inhibit is being deactivated.

For the functions that take @set, swap the order of the inhibit reason
and @set so that the call sites are visually similar to those that bounce
through the wrapper.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220311043517.17027-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Make APICv inhibit reasons an enum and cleanup naming

Use an enum for the APICv inhibit reasons, there is no meaning behind
their values and they most definitely are not "unsigned longs". Rename
the various params to "reason" for consistency and clarity (inhibit may
be confused as a command, i.e. inhibit APICv, instead of the reason that
is getting toggled/checked).

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220311043517.17027-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: X86: Handle implicit supervisor access with SMAP

There are two kinds of implicit supervisor access
implicit supervisor access when CPL = 3
implicit supervisor access when CPL < 3

Current permission_fault() handles only the first kind for SMAP.

But if the access is implicit when SMAP is on, data may not be read
nor write from any user-mode address regardless the current CPL.

So the second kind should be also supported.

The first kind can be detect via CPL and access mode: if it is
supervisor access and CPL = 3, it must be implicit supervisor access.

But it is not possible to detect the second kind without extra
information, so this patch adds an artificial PFERR_EXPLICIT_ACCESS
into @access. This extra information also works for the first kind, so
the logic is changed to use this information for both cases.

The value of PFERR_EXPLICIT_ACCESS is deliberately chosen to be bit 48
which is in the most significant 16 bits of u64 and less likely to be
forced to change due to future hardware uses it.

This patch removes the call to ->get_cpl() for access mode is determined
by @access.  Not only does it reduce a function call, but also remove
confusions when the permission is checked for nested TDP.  The nested
TDP shouldn't have SMAP checking nor even the L2's CPL have any bearing
on it.  The original code works just because it is always user walk for
NPT and SMAP fault is not set for EPT in update_permission_bitmask.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <20220311070346.45023-5-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: X86: Rename variable smap to not_smap in permission_fault()

Comments above the variable says the bit is set when SMAP is overridden
or the same meaning in update_permission_bitmask(): it is not subjected
to SMAP restriction.

Renaming it to reflect the negative implication and make the code better
readability.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <20220311070346.45023-4-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: X86: Fix comments in update_permission_bitmask

The commit 09f037aa48f3 ("KVM: MMU: speedup update_permission_bitmask")
refactored the code of update_permission_bitmask() and change the
comments. It added a condition into a list to match the new code,
so the number/order for conditions in the comments should be updated
too.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <20220311070346.45023-3-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: X86: Change the type of access u32 to u64

Change the type of access u32 to u64 for FNAME(walk_addr) and
->gva_to_gpa().

The kinds of accesses are usually combinations of UWX, and VMX/SVM's
nested paging adds a new factor of access: is it an access for a guest
page table or for a final guest physical address.

And SMAP relies a factor for supervisor access: explicit or implicit.

So @access in FNAME(walk_addr) and ->gva_to_gpa() is better to include
all these information to do the walk.

Although @access(u32) has enough bits to encode all the kinds, this
patch extends it to u64:
o Extra bits will be in the higher 32 bits, so that we can
  easily obtain the traditional access mode (UWX) by converting
  it to u32.
o Reuse the value for the access kind defined by SVM's nested
  paging (PFERR_GUEST_FINAL_MASK and PFERR_GUEST_PAGE_MASK) as
  @error_code in kvm_handle_page_fault().

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Message-Id: <20220311070346.45023-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Remove dirty handling from gfn_to_pfn_cache completely

It isn't OK to cache the dirty status of a page in internal structures
for an indefinite period of time.

Any time a vCPU exits the run loop to userspace might be its last; the
VMM might do its final check of the dirty log, flush the last remaining
dirty pages to the destination and complete a live migration. If we
have internal 'dirty' state which doesn't get flushed until the vCPU
is finally destroyed on the source after migration is complete, then
we have lost data because that will escape the final copy.

This problem already exists with the use of kvm_vcpu_unmap() to mark
pages dirty in e.g. VMX nesting.

Note that the actual Linux MM already considers the page to be dirty
since we have a writeable mapping of it. This is just about the KVM
dirty logging.

For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to
track which gfn_to_pfn_caches have been used and explicitly mark the
corresponding pages dirty before returning to userspace. But we would
have needed external tracking of that anyway, rather than walking the
full list of GPCs to find those belonging to this vCPU which are dirty.

So let's rely *solely* on that external tracking, and keep it simple
rather than laying a tempting trap for callers to fall into.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220303154127.202856-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Use enum to track if cached PFN will be used in guest and/or host

Replace the guest_uses_pa and kernel_map booleans in the PFN cache code
with a unified enum/bitmask. Using explicit names makes it easier to
review and audit call sites.

Opportunistically add a WARN to prevent passing garbage; instantating a
cache without declaring its usage is either buggy or pointless.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220303154127.202856-2-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: Fix kvm_cache_regs.h inclusions for is_guest_mode()

Include kvm_cache_regs.h to pick up the definition of is_guest_mode(),
which is referenced by nested_svm_virtualize_tpr() in svm.h. Remove
include from svm_onhpyerv.c which was done only because of lack of
include in svm.h.

Fixes: 883b0a91f41ab ("KVM: SVM: Move Nested SVM Implementation to nested.c")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Gonda <pgonda@google.com>
Message-Id: <20220304161032.2270688-1-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>