platform/kernel/linux-starfive.git
23 months agohabanalabs: fix rc when new CPUCP opcodes are not supported
Tomer Tayar [Fri, 18 Nov 2022 13:08:33 +0000 (15:08 +0200)]
habanalabs: fix rc when new CPUCP opcodes are not supported

When the new CPUCP opcodes are not supported and a CPUCP packet fails,
the return value is the F/W error resposone which is a positive value.
If this packet is sent from IOCTL and the positive value is used, the
ICOTL will not be considered as unsuccessful.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: added memset for the cq_size register
Marco Pagani [Wed, 23 Nov 2022 08:56:39 +0000 (09:56 +0100)]
habanalabs/gaudi2: added memset for the cq_size register

The clang-analyzer reported a warning: "Value stored to
'cq_size_addr' is never read".

The cq_size register of dcore0 is not being zeroed using
gaudi2_memset_device_lbw(), along with the other cq_* registers,
even though the corresponding cq_size_addr variable is set.

Signed-off-by: Marco Pagani <marpagan@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: added return value check for hl_fw_dynamic_send_clear_cmd()
Marco Pagani [Wed, 16 Nov 2022 13:41:25 +0000 (14:41 +0100)]
habanalabs: added return value check for hl_fw_dynamic_send_clear_cmd()

The clang-analyzer reported a warning: "Value stored to 'rc' is never
read".

The return value check for the first hl_fw_dynamic_send_clear_cmd() call
in hl_fw_dynamic_send_protocol_cmd() appears to be missing.

Signed-off-by: Marco Pagani <marpagan@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: increase the size of busy engines mask
Tomer Tayar [Sun, 23 Oct 2022 09:55:21 +0000 (12:55 +0300)]
habanalabs: increase the size of busy engines mask

Increase the size of the busy engines mask in 'struct hl_info_hw_idle',
for future ASICs with more than 128 engines.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: change memory scrub mechanism
farah kassabri [Tue, 8 Nov 2022 13:24:33 +0000 (15:24 +0200)]
habanalabs/gaudi2: change memory scrub mechanism

Currently the scrubbing mechanism used the EDMA engines by directly
setting the engine core registers to scrub a chunk of memory.
Due to a sporadic failure with this mechanism, it was decided to
initiate the engines via its QMAN using LIN-DMA packets.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: extend process wait timeout in device fine
Oded Gabbay [Thu, 10 Nov 2022 15:24:02 +0000 (17:24 +0200)]
habanalabs: extend process wait timeout in device fine

Processes that use our device are likely to use at the same time other
devices such as remote storage.

In case our device is removed and a user process is still using the
device, we need to kill the user process. However, if that process
has a thread waiting for i/o to complete on remote storage, for example,
the process won't terminate.

Let's give it enough time to terminate before giving up.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
23 months agohabanalabs: check schedule_hard_reset correctly
Oded Gabbay [Thu, 10 Nov 2022 15:05:24 +0000 (17:05 +0200)]
habanalabs: check schedule_hard_reset correctly

schedule_hard_reset can be true only if we didn't do hard-reset.
Therefore, no point of checking it in case hard_reset is true.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
23 months agohabanalabs: reset device if still in use when released
Tomer Tayar [Wed, 9 Nov 2022 16:08:38 +0000 (18:08 +0200)]
habanalabs: reset device if still in use when released

If the device file is released while a context is still held, it won't
be possible to reopen it until the context is eventually released.
If that doesn't happen, only a device reset will revert it back to an
operational state, i.e. need to wait for a CS timeout or an error, or to
wait for an external intervention of injecting a reset via sysfs.

At this stage, after the device was released by user, context is held
either because of CS which were left running on the device and are not
relevant anymore, or due to missing cleanup steps from user side.

All of this is in any case handled in the device reset flow, so initiate
the reset at this point instead of waiting for it.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: return to reset upon SM SEI BRESP error
Tomer Tayar [Thu, 20 Oct 2022 11:40:16 +0000 (14:40 +0300)]
habanalabs/gaudi2: return to reset upon SM SEI BRESP error

Due to a H/W issue in the LBW path to the PCIE_DBI MSI-X doorbell, there
were false sporadic error responses in SM when it was configured to
write to there, and hence no reset was done as part of handling the
relevant event.
Now that the virtual MSI-X doorbell is used, such errors in SM are not
expected and reset shouldn't be skipped.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: don't enable entries in the MSIX_GW table
Tomer Tayar [Thu, 20 Oct 2022 09:05:09 +0000 (12:05 +0300)]
habanalabs/gaudi2: don't enable entries in the MSIX_GW table

User should use the virtual MSI-X doorbell to generate interrupts from
the device, so there is no need to enable entries in the MSIX_GW table.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: remove redundant firmware version check
farah kassabri [Tue, 8 Nov 2022 11:23:17 +0000 (13:23 +0200)]
habanalabs/gaudi2: remove redundant firmware version check

Firmware 1.7 is the first official firmware, so no need to check
if we are running a version below it.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi: fix print for firmware-alive event
Tomer Tayar [Mon, 7 Nov 2022 14:34:32 +0000 (16:34 +0200)]
habanalabs/gaudi: fix print for firmware-alive event

Add missing le{32,64}_to_cpu conversions.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: fix print for out-of-sync and pkt-failure events
Tomer Tayar [Mon, 7 Nov 2022 14:20:03 +0000 (16:20 +0200)]
habanalabs: fix print for out-of-sync and pkt-failure events

Add missing le32_to_cpu() conversions, and use %d for the value
returned from atomic_read().

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: add page fault notify event
Dani Liberman [Mon, 31 Oct 2022 21:04:14 +0000 (23:04 +0200)]
habanalabs/gaudi2: add page fault notify event

Each time page fault happens, besides capturing its data, also notify
the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: classify power/thermal events as info
Ofir Bitton [Sun, 6 Nov 2022 10:07:03 +0000 (12:07 +0200)]
habanalabs/gaudi2: classify power/thermal events as info

As power and thermal envelope events are pure informative and not
indicating an error, we reduce the print level to info only.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: skip events info ioctl if not supported
Ohad Sharabi [Sun, 6 Nov 2022 07:26:01 +0000 (09:26 +0200)]
habanalabs: skip events info ioctl if not supported

Some ASICs haven't yet implemented this functionality and so the
ioctl call should fail and the user should be notified of the reason.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: fix firmware descriptor copy operation
farah kassabri [Thu, 22 Sep 2022 11:24:35 +0000 (14:24 +0300)]
habanalabs: fix firmware descriptor copy operation

This is needed to allow adding more data to the lkd_fw_comms_desc
structure.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: add razwi notify event
Dani Liberman [Sun, 30 Oct 2022 12:46:19 +0000 (14:46 +0200)]
habanalabs/gaudi2: add razwi notify event

Each time razwi (read-only zero, write ignored) event happens, besides
capturing its data, also notify the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: implement fp32 not supported event
Ofir Bitton [Sun, 30 Oct 2022 13:10:13 +0000 (15:10 +0200)]
habanalabs/gaudi2: implement fp32 not supported event

Due to binning, Gaudi2 does not always support fp32.
We add support for such an event in case fp32 is used by the user
in such a device.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi: add page fault notify event
Dani Liberman [Mon, 31 Oct 2022 09:44:45 +0000 (11:44 +0200)]
habanalabs/gaudi: add page fault notify event

Each time page fault happens, besides capturing its data, also notify
the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: use single threaded WQ for event handling
Dani Liberman [Thu, 27 Oct 2022 17:38:26 +0000 (20:38 +0300)]
habanalabs: use single threaded WQ for event handling

Creating event queue workqueue using alloc_workqueue made it run in
multi threaded mode, which caused parallel dumping of events as well as
parallel events notifying to user, causing logs with multiple
events to be out of order.

Fixed by creating event queue workqueue as single threaded work queue.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi: add razwi notify event
Dani Liberman [Sun, 30 Oct 2022 11:08:37 +0000 (13:08 +0200)]
habanalabs/gaudi: add razwi notify event

Each time razwi (read-only zero, write ignore) happens, besides
capturing its data, also notify the user about it.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: add PCI revision 2 support
Ofir Bitton [Wed, 26 Oct 2022 13:20:45 +0000 (16:20 +0300)]
habanalabs/gaudi2: add PCI revision 2 support

Add support for Gaudi2 Device with PCI revision 2.
Functionality is exactly the same as revision 1, the only difference
is device name exposed to user.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: remove redundant gaudi2_sec asic type
Ofir Bitton [Wed, 26 Oct 2022 15:20:49 +0000 (18:20 +0300)]
habanalabs: remove redundant gaudi2_sec asic type

As Gaudi2 has a single PCI id, the secured asic type is redundant.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: add warning print upon a PCI error
Ofir Bitton [Wed, 19 Oct 2022 11:05:18 +0000 (14:05 +0300)]
habanalabs: add warning print upon a PCI error

In order to know if driver catches PCI errors correctly, we need to
print a warning per each error.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: fix PCIe access to SRAM via debugfs
Tomer Tayar [Sun, 23 Oct 2022 22:14:18 +0000 (01:14 +0300)]
habanalabs: fix PCIe access to SRAM via debugfs

hl_access_sram_dram_region() uses a region base which is set within the
hl_set_dram_bar() function. However, for SRAM access this function is
not called, and we end up with a wrong value of region base and with a
bad calculated address.
Fix it by initializing the region base value independently of whether
hl_set_dram_bar() is called or not.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: zero ts registration buff when allocated
farah kassabri [Tue, 20 Sep 2022 08:48:40 +0000 (11:48 +0300)]
habanalabs: zero ts registration buff when allocated

To avoid memory corruption in kernel memory while using timestamp
registration nodes, zero the kernel buff memory when its allocated.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: no consecutive err when user context is enabled
Tal Cohen [Tue, 18 Oct 2022 14:35:06 +0000 (17:35 +0300)]
habanalabs: no consecutive err when user context is enabled

Consecutive error protects a device reset loop from being triggered
due to h/w issues and enters the device into an unavailable state.
When user may cause the error, an unavailable state
will prevent the user from running its workloads.

The commit prevents entering consecutive state when a user context
is enabled.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: use graceful hard reset for CS timeouts
Tomer Tayar [Fri, 30 Sep 2022 14:02:19 +0000 (17:02 +0300)]
habanalabs: use graceful hard reset for CS timeouts

Use graceful hard reset when detecting a CS timeout that requires a
device reset.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: use graceful hard reset for F/W events
Tomer Tayar [Fri, 30 Sep 2022 13:57:54 +0000 (16:57 +0300)]
habanalabs/gaudi2: use graceful hard reset for F/W events

Use graceful hard reset for F/W events on Gaudi2 device that require a
device reset.

While at it, do a small refactor of the checks and function calls,
to simplify it and to avoid code duplication.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi: use graceful hard reset for F/W events
Tomer Tayar [Fri, 30 Sep 2022 13:43:47 +0000 (16:43 +0300)]
habanalabs/gaudi: use graceful hard reset for F/W events

Use graceful hard reset for F/W events on Gaudi device that require a
device reset.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: add an option to control watchdog timeout via debugfs
Tomer Tayar [Fri, 30 Sep 2022 13:37:41 +0000 (16:37 +0300)]
habanalabs: add an option to control watchdog timeout via debugfs

Add an option to control the timeout value for the driver's watchdog
of the reset process. The timeout represents the amount of the user
has to close his process once he gets a device reset notification from
the driver.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: add support for graceful hard reset
Tomer Tayar [Fri, 30 Sep 2022 12:08:13 +0000 (15:08 +0300)]
habanalabs: add support for graceful hard reset

Calling hl_device_reset() for a hard reset will lead to a quite
immediate device reset and to killing user process.
For resets that follow errors, it disables the option to debug the
errors on both the device side and the user application side.

This patch adds a 'graceful hard reset' option and a new
hl_device_cond_reset() function.
Under some conditions, mainly if there is no user process or if he is
not registered to driver notifications, this function will execute hard
reset as usual.
Otherwise, the reset will be postponed and a notification will be sent
to user, to let him perform post-error actions and then to release the
device, after which reset will take place.

If device is not released by user in some defined time, a watchdog work
will execute the reset in any case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: avoid divide by zero in device utilization
Ohad Sharabi [Sun, 23 Oct 2022 11:46:08 +0000 (14:46 +0300)]
habanalabs: avoid divide by zero in device utilization

Currently there is no verification whether the divisor is legal.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: fix user mappings calculation in case of page fault
Dani Liberman [Wed, 19 Oct 2022 17:24:55 +0000 (20:24 +0300)]
habanalabs: fix user mappings calculation in case of page fault

As there are 2 types of user mappings, pmmu and hmmu, calculate
only the relevant mappings for the requested type.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: remove configurations to access the MSI-X doorbell
Tomer Tayar [Thu, 20 Oct 2022 08:29:03 +0000 (11:29 +0300)]
habanalabs/gaudi2: remove configurations to access the MSI-X doorbell

The virtual MSI-X doorbell is supported now in F/W, so all
configurations to access the PCIE_DBI MSI-X doorbell can be removed.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: allow setting HBM BAR to other regions
Ohad Sharabi [Wed, 14 Sep 2022 05:53:29 +0000 (08:53 +0300)]
habanalabs: allow setting HBM BAR to other regions

Up until now the use-case in the driver was that the HBM is accessed
using the HBM BAR, yet the BAR sometimes cannot cover the whole HBM and
so we needed to set the BAR to other HBM offset.
Now we are facing the need to access other PCI memory regions that can
be covered by the HBM BAR.
To answer that we are allowing the caller to determine if the HBM BAR
need to be set or not regardless of the PCI memory region.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: fix using freed pointer
Ohad Sharabi [Tue, 18 Oct 2022 05:51:33 +0000 (08:51 +0300)]
habanalabs: fix using freed pointer

The code uses the pointer for trace purpose (without actually
dereference it) but still get static analysis warning.
This patch eliminate the warning.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: unsecure CBU_EARLY_BRESP registers
Dilip Puri [Wed, 12 Oct 2022 08:06:48 +0000 (11:06 +0300)]
habanalabs/gaudi2: unsecure CBU_EARLY_BRESP registers

NIC ARCs need to have access to CBU_EARLY_BRESP, hence we unsecure
those registers.

Signed-off-by: Dilip Puri <dilipp@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: verify no zero event is sent
Tal Cohen [Mon, 3 Oct 2022 10:55:50 +0000 (13:55 +0300)]
habanalabs: verify no zero event is sent

The event notifier mechanism should not raise an empty
event (event equals zero).

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: capture page fault data
Dani Liberman [Thu, 29 Sep 2022 07:28:36 +0000 (10:28 +0300)]
habanalabs/gaudi2: capture page fault data

Capture page fault data when it happens.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: capture RAZWI information
Dani Liberman [Wed, 28 Sep 2022 19:14:55 +0000 (22:14 +0300)]
habanalabs/gaudi2: capture RAZWI information

Added function to calculate possible engines which caused
RAZWI (read-only zero, write ignored), from a given router id or
module index.

When getting RAZWI via PSOC IP, first the router id is calculated
and then the possible engines that caused the RAZWI are calculated.

There is a possibility that the RAZWI initiator is not an engine. In
that case, it will not be included in possible engines as it
doesn't have an engine id.

RAZWI information is captured when receiving event from engine or via
PSOC IP.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: handle HBM MMU when capturing page fault data
Dani Liberman [Thu, 29 Sep 2022 07:21:28 +0000 (10:21 +0300)]
habanalabs: handle HBM MMU when capturing page fault data

In case of HBM MMU page fault, capture its relevant mappings.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: move reset workqueue to be under hl_device
Tomer Tayar [Fri, 30 Sep 2022 11:36:27 +0000 (14:36 +0300)]
habanalabs: move reset workqueue to be under hl_device

'struct hl_device_reset_work' is used as a wrapper for the reset work
and its parameters, including the reset workqueue on which it runs.
In a future commit, another reset related work with similar parameters
is going to be added, but it won't use the reset workqueue.

As in any case there is a single reset workqueue, and to allow the resue
of this structure, move the reset workqueue to 'struct hl_device'.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: allow unregistering eventfd when device non-operational
Tomer Tayar [Fri, 30 Sep 2022 11:19:21 +0000 (14:19 +0300)]
habanalabs: allow unregistering eventfd when device non-operational

Unregistering eventfd is for releasing host resources and doesn't
involve an access to the device. As such, there is no reason to disallow
it when device isn't operational.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: skip idle status check if reset on device release
Tomer Tayar [Fri, 30 Sep 2022 11:09:32 +0000 (14:09 +0300)]
habanalabs: skip idle status check if reset on device release

If reset upon device release is enabled, there is no need to check the
device idle status in hpriv_release(), because device is going to be
reset in any case.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: add device unavailable notification
Tal Cohen [Wed, 28 Sep 2022 15:33:19 +0000 (18:33 +0300)]
habanalabs/gaudi2: add device unavailable notification

Device unavailable notifies the user that there isn't an option to
retrieve debug information from the device.
When a critical device error occurs and the f/w performs the device
reset, a device unavailable notification shall be sent to the user
process.

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: remove privileged MME clock configuration
Koby Elbaz [Wed, 28 Sep 2022 12:56:13 +0000 (15:56 +0300)]
habanalabs/gaudi2: remove privileged MME clock configuration

Privileged MME clock configuration is removed as it is done by the f/w.

Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: replace 'pf' to 'prefetch'
Dafna Hirschfeld [Wed, 28 Sep 2022 08:38:00 +0000 (11:38 +0300)]
habanalabs: replace 'pf' to 'prefetch'

pf was an abbreviation for prefetch but because pf already stands
for 'physical function', we decided to change it to 'prefetch'.

Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: add page fault info uapi
Dani Liberman [Sun, 18 Sep 2022 18:37:31 +0000 (21:37 +0300)]
habanalabs: add page fault info uapi

Only the first page fault will be saved.
Besides the address which caused the page fault, the driver captures
all of the mmu user mappings.
User can retrieve this data via the new uapi (new opcode in INFO ioctl).

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs/gaudi2: fix module ID for RAZWI handling
Tomer Tayar [Thu, 22 Sep 2022 12:25:46 +0000 (15:25 +0300)]
habanalabs/gaudi2: fix module ID for RAZWI handling

RAZWI is optionally handled as part of the generic QM SEI error
handling, but it always uses PDMA as the module ID.
Fix it to use the suitable module ID according to the specific event.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: use lower_32_bits()
Bharat Jauhari [Tue, 27 Sep 2022 11:38:38 +0000 (14:38 +0300)]
habanalabs: use lower_32_bits()

This fixes sparse warning on doing cast to 32-bits

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: refactor razwi event notification
Dani Liberman [Mon, 19 Sep 2022 15:51:59 +0000 (18:51 +0300)]
habanalabs: refactor razwi event notification

This event notification was compatible only with gaudi, where razwi
and page fault happens together.

To make it compatible with all ASICs, this refactor contains:

1. Razwi notification will only notify about razwi info.
   New notification will be added in future patch, to retrieve data
   about page fault error.

2. Changed razwi info structure to support all ASICs.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: Use simplified API for p2p dist calc
Oded Gabbay [Thu, 22 Sep 2022 09:30:32 +0000 (12:30 +0300)]
habanalabs: Use simplified API for p2p dist calc

Use the simplified API that calculates distance between two devices.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: allow control device open during reset
Ofir Bitton [Tue, 23 Aug 2022 12:14:14 +0000 (15:14 +0300)]
habanalabs: allow control device open during reset

Monitoring apps would like to query device state at any time so we
should allow it also during reset because it doesn't involve
accessing the h/w.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agohabanalabs: fix return value check in hl_fw_get_sec_attest_data()
Yang Yingliang [Fri, 23 Sep 2022 14:39:13 +0000 (22:39 +0800)]
habanalabs: fix return value check in hl_fw_get_sec_attest_data()

If hl_cpu_accessible_dma_pool_alloc() fails, we should check
'req_cpu_addr', fix it.

Fixes: 0c88760f8f5e ("habanalabs/gaudi2: add secured attestation info uapi")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
23 months agoMerge 6.1-rc6 into char-misc-next
Greg Kroah-Hartman [Mon, 21 Nov 2022 09:05:34 +0000 (10:05 +0100)]
Merge 6.1-rc6 into char-misc-next

We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
23 months agoLinux 6.1-rc6
Linus Torvalds [Mon, 21 Nov 2022 00:02:16 +0000 (16:02 -0800)]
Linux 6.1-rc6

23 months agoMerge tag 'trace-probes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 20 Nov 2022 23:31:20 +0000 (15:31 -0800)]
Merge tag 'trace-probes-v6.1' of git://git./linux/kernel/git/trace/linux-trace

Pull tracing/probes fixes from Steven Rostedt:

 - Fix possible NULL pointer dereference on trace_event_file in
   kprobe_event_gen_test_exit()

 - Fix NULL pointer dereference for trace_array in
   kprobe_event_gen_test_exit()

 - Fix memory leak of filter string for eprobes

 - Fix a possible memory leak in rethook_alloc()

 - Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case which
   can cause a possible use-after-free

 - Fix warning in eprobe filter creation

 - Fix eprobe filter creation as it picked the wrong event for the
   fields

* tag 'trace-probes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/eprobe: Fix eprobe filter to make a filter correctly
  tracing/eprobe: Fix warning in filter creation
  kprobes: Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case
  rethook: fix a potential memleak in rethook_alloc()
  tracing/eprobe: Fix memory leak of filter string
  tracing: kprobe: Fix potential null-ptr-deref on trace_array in kprobe_event_gen_test_exit()
  tracing: kprobe: Fix potential null-ptr-deref on trace_event_file in kprobe_event_gen_test_exit()

23 months agoMerge tag 'trace-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 20 Nov 2022 23:25:32 +0000 (15:25 -0800)]
Merge tag 'trace-v6.1-rc5' of git://git./linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix polling to block on watermark like the reads do, as user space
   applications get confused when the select says read is available, and
   then the read blocks

 - Fix accounting of ring buffer dropped pages as it is what is used to
   determine if the buffer is empty or not

 - Fix memory leak in tracing_read_pipe()

 - Fix struct trace_array warning about being declared in parameters

 - Fix accounting of ftrace pages used in output at start up.

 - Fix allocation of dyn_ftrace pages by subtracting one from order
   instead of diving it by 2

 - Static analyzer found a case were a pointer being used outside of a
   NULL check (rb_head_page_deactivate())

 - Fix possible NULL pointer dereference if kstrdup() fails in
   ftrace_add_mod()

 - Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event()

 - Fix bad pointer dereference in register_synth_event() on error path

 - Remove unused __bad_type_size() method

 - Fix possible NULL pointer dereference of entry in list 'tr->err_log'

 - Fix NULL pointer deference race if eprobe is called before the event
   setup

* tag 'trace-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix race where eprobes can be called before the event
  tracing: Fix potential null-pointer-access of entry in list 'tr->err_log'
  tracing: Remove unused __bad_type_size() method
  tracing: Fix wild-memory-access in register_synth_event()
  tracing: Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event()
  ftrace: Fix null pointer dereference in ftrace_add_mod()
  ring_buffer: Do not deactivate non-existant pages
  ftrace: Optimize the allocation for mcount entries
  ftrace: Fix the possible incorrect kernel message
  tracing: Fix warning on variable 'struct trace_array'
  tracing: Fix memory leak in tracing_read_pipe()
  ring-buffer: Include dropped pages in counting dirty patches
  tracing/ring-buffer: Have polling block on watermark

23 months agotracing: Fix race where eprobes can be called before the event
Steven Rostedt (Google) [Fri, 18 Nov 2022 02:42:49 +0000 (21:42 -0500)]
tracing: Fix race where eprobes can be called before the event

The flag that tells the event to call its triggers after reading the event
is set for eprobes after the eprobe is enabled. This leads to a race where
the eprobe may be triggered at the beginning of the event where the record
information is NULL. The eprobe then dereferences the NULL record causing
a NULL kernel pointer bug.

Test for a NULL record to keep this from happening.

Link: https://lore.kernel.org/linux-trace-kernel/20221116192552.1066630-1-rafaelmendsr@gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20221117214249.2addbe10@gandalf.local.home
Cc: Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>
Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 7491e2c442781 ("tracing: Add a probe that attaches to trace events")
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
23 months agoMerge tag 'x86_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:47:39 +0000 (10:47 -0800)]
Merge tag 'x86_urgent_for_v6.1_rc6' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Do not hold fpregs lock when inheriting FPU permissions because the
   fpregs lock disables preemption on RT but fpu_inherit_perms() does
   spin_lock_irq(), which, on RT, uses rtmutexes and they need to be
   preemptible.

 - Check the page offset and the length of the data supplied by
   userspace for overflow when specifying a set of pages to add to an
   SGX enclave

* tag 'x86_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Drop fpregs lock before inheriting FPU permissions
  x86/sgx: Add overflow check in sgx_validate_offset_length()

23 months agoMerge tag 'sched_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:43:52 +0000 (10:43 -0800)]
Merge tag 'sched_urgent_for_v6.1_rc6' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

 - Fix a small race on the task's exit path where there's a
   misunderstanding whether the task holds rq->lock or not

 - Prevent processes from getting killed when using deprecated or
   unknown rseq ABI flags in order to be able to fuzz the rseq() syscall
   with syzkaller

* tag 'sched_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix race in task_call_func()
  rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered

23 months agoMerge tag 'perf_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:41:14 +0000 (10:41 -0800)]
Merge tag 'perf_urgent_for_v6.1_rc6' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Fix an intel PT erratum where CPUs do not support single range output
   for more than 4K

 - Fix a NULL ptr dereference which can happen after an NMI interferes
   with the event enabling dance in amd_pmu_enable_all()

 - Free the events array too when freeing uncore contexts on CPU online,
   thereby fixing a memory leak

 - Improve the pending SIGTRAP check

* tag 'perf_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix sampling using single range output
  perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
  perf/x86/amd/uncore: Fix memory leak for events array
  perf: Improve missing SIGTRAP checking

23 months agoMerge tag 'locking_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 20 Nov 2022 18:39:45 +0000 (10:39 -0800)]
Merge tag 'locking_urgent_for_v6.1_rc6' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:

 - Fix a build error with clang 11

* tag 'locking_urgent_for_v6.1_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking: Fix qspinlock/x86 inline asm error

23 months agoMerge tag 'powerpc-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 20 Nov 2022 17:47:33 +0000 (09:47 -0800)]
Merge tag 'powerpc-6.1-5' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:

 - Fix writable sections being moved into the rodata region.

Thanks to Nicholas Piggin and Christophe Leroy.

* tag 'powerpc-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix writable sections being moved into the rodata region

23 months agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 19 Nov 2022 23:51:22 +0000 (15:51 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five small fixes, all in drivers.

  Most of these are error leg freeing issues, with the only really user
  visible one being the zfcp fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: iscsi: Fix possible memory leak when device_register() failed
  scsi: zfcp: Fix double free of FSF request when qdio send fails
  scsi: scsi_debug: Fix possible UAF in sdebug_add_host_helper()
  scsi: target: tcm_loop: Fix possible name leak in tcm_loop_setup_hba_bus()
  scsi: mpi3mr: Suppress command reply debug prints

23 months agoMerge tag 'iommu-fixes-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 19 Nov 2022 17:08:57 +0000 (09:08 -0800)]
Merge tag 'iommu-fixes-v6.1-rc5' of git://git./linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Preset accessed bits in Intel VT-d page-directory entries to avoid
   hardware error

 - Set supervisor bit only when Intel IOMMU has the SRS capability

* tag 'iommu-fixes-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Set SRE bit only when hardware has SRS cap
  iommu/vt-d: Preset Access bit for IOVA in FL non-leaf paging entries

23 months agoMerge tag 'kbuild-fixes-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 19 Nov 2022 17:03:20 +0000 (09:03 -0800)]
Merge tag 'kbuild-fixes-v6.1-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Update MAINTAINERS with Nathan and Nicolas as new Kbuild reviewers

 - Increment the debian revision for deb-pkg builds

* tag 'kbuild-fixes-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Restore .version auto-increment behaviour for Debian packages
  MAINTAINERS: Add linux-kbuild's patchwork
  MAINTAINERS: Remove Michal Marek from Kbuild maintainers
  MAINTAINERS: Add Nathan and Nicolas to Kbuild reviewers

23 months agoMerge tag '6.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 19 Nov 2022 16:58:58 +0000 (08:58 -0800)]
Merge tag '6.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:

 - two missing and one incorrect return value checks

 - fix leak on tlink mount failure

* tag '6.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: add check for returning value of SMB2_set_info_init
  cifs: Fix wrong return value checking when GETFLAGS
  cifs: add check for returning value of SMB2_close_init
  cifs: Fix connections leak when tlink setup failed

23 months agoiommu/vt-d: Set SRE bit only when hardware has SRS cap
Tina Zhang [Wed, 16 Nov 2022 05:15:44 +0000 (13:15 +0800)]
iommu/vt-d: Set SRE bit only when hardware has SRS cap

SRS cap is the hardware cap telling if the hardware IOMMU can support
requests seeking supervisor privilege or not. SRE bit in scalable-mode
PASID table entry is treated as Reserved(0) for implementation not
supporting SRS cap.

Checking SRS cap before setting SRE bit can avoid the non-recoverable
fault of "Non-zero reserved field set in PASID Table Entry" caused by
setting SRE bit while there is no SRS cap support. The fault messages
look like below:

 DMAR: DRHD: handling fault status reg 2
 DMAR: [DMA Read NO_PASID] Request device [00:0d.0] fault addr 0x1154e1000
       [fault reason 0x5a]
       SM: Non-zero reserved field set in PASID Table Entry

Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
Cc: stable@vger.kernel.org
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20221115070346.1112273-1-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20221116051544.26540-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
23 months agoiommu/vt-d: Preset Access bit for IOVA in FL non-leaf paging entries
Tina Zhang [Wed, 16 Nov 2022 05:15:43 +0000 (13:15 +0800)]
iommu/vt-d: Preset Access bit for IOVA in FL non-leaf paging entries

The A/D bits are preseted for IOVA over first level(FL) usage for both
kernel DMA (i.e, domain typs is IOMMU_DOMAIN_DMA) and user space DMA
usage (i.e., domain type is IOMMU_DOMAIN_UNMANAGED).

Presetting A bit in FL requires to preset the bit in every related paging
entries, including the non-leaf ones. Otherwise, hardware may treat this
as an error. For example, in a case of ECAP_REG.SMPWC==0, DMA faults might
occur with below DMAR fault messages (wrapped for line length) dumped.

 DMAR: DRHD: handling fault status reg 2
 DMAR: [DMA Read NO_PASID] Request device [aa:00.0] fault addr 0x10c3a6000
    [fault reason 0x90]
    SM: A/D bit update needed in first-level entry when set up in no snoop

Fixes: 289b3b005cb9 ("iommu/vt-d: Preset A/D bits for user space DMA usage")
Cc: stable@vger.kernel.org
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20221113010324.1094483-1-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20221116051544.26540-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
23 months agoMerge tag 'input-for-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor...
Linus Torvalds [Sat, 19 Nov 2022 01:56:29 +0000 (17:56 -0800)]
Merge tag 'input-for-v6.1-rc5' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - a fix for 8042 to stop leaking platform device on unload

 - a fix for Goodix touchscreens on devices like Nanote UMPC-01 where we
   need to reset controller to load config from firmware

 - a workaround for Acer Switch to avoid interrupt storm from home and
   power buttons

 - a workaround for more ASUS ZenBook models to detect keyboard
   controller

 - a fix for iforce driver to properly handle communication errors

 - touchpad on HP Laptop 15-da3001TU switched to RMI mode

* tag 'input-for-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: i8042 - fix leaking of platform device on module removal
  Input: i8042 - apply probe defer to more ASUS ZenBook models
  Input: soc_button_array - add Acer Switch V 10 to dmi_use_low_level_irq[]
  Input: soc_button_array - add use_low_level_irq module parameter
  Input: iforce - invert valid length check when fetching device IDs
  Input: goodix - try resetting the controller when no config is set
  dt-bindings: input: touchscreen: Add compatible for Goodix GT7986U chip
  Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode

23 months agoMerge tag 'zonefs-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sat, 19 Nov 2022 01:17:42 +0000 (17:17 -0800)]
Merge tag 'zonefs-6.1-rc6' of git://git./linux/kernel/git/dlemoal/zonefs

Pull zonefs fixes from Damien Le Moal:

 - Fix the IO error recovery path for failures happening in the last
   zone of device, and that zone is a "runt" zone (smaller than the
   other zone). The current code was failing to properly obtain a zone
   report in that case.

 - Remove the unused to_attr() function as it is unused, causing
   compilation warnings with clang.

* tag 'zonefs-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: Remove to_attr() helper function
  zonefs: fix zone report size in __zonefs_io_error()

23 months agoInput: i8042 - fix leaking of platform device on module removal
Chen Jun [Fri, 18 Nov 2022 23:40:03 +0000 (15:40 -0800)]
Input: i8042 - fix leaking of platform device on module removal

Avoid resetting the module-wide i8042_platform_device pointer in
i8042_probe() or i8042_remove(), so that the device can be properly
destroyed by i8042_exit() on module unload.

Fixes: 9222ba68c3f4 ("Input: i8042 - add deferred probe support")
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Link: https://lore.kernel.org/r/20221109034148.23821-1-chenjun102@huawei.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
23 months agoMerge tag 'io_uring-6.1-2022-11-18' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 18 Nov 2022 22:59:53 +0000 (14:59 -0800)]
Merge tag 'io_uring-6.1-2022-11-18' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
 "This is mostly fixing issues around the poll rework, but also two
  tweaks for the multishot handling for accept and receive.

  All stable material"

* tag 'io_uring-6.1-2022-11-18' of git://git.kernel.dk/linux:
  io_uring: disallow self-propelled ring polling
  io_uring: fix multishot recv request leaks
  io_uring: fix multishot accept request leaks
  io_uring: fix tw losing poll events
  io_uring: update res mask in io_poll_check_events

23 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 18 Nov 2022 22:31:03 +0000 (14:31 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix a build error with CONFIG_CFI_CLANG + CONFIG_FTRACE when
   CONFIG_FUNCTION_GRAPH_TRACER is not enabled.

 - Fix a BUG_ON triggered by the page table checker due to incorrect
   file_map_count for non-leaf pmd/pud (the arm64
   pmd_user_accessible_page() not checking whether it's a leaf entry).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
  arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER

23 months agoMerge tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 18 Nov 2022 21:59:45 +0000 (13:59 -0800)]
Merge tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Christoph:
      - Two more bogus nid quirks (Bean Huo, Tiago Dias Ferreira)
      - Memory leak fix in nvmet (Sagi Grimberg)

 - Regression fix for block cgroups pinning the wrong blkcg, causing
   leaks of cgroups and blkcgs (Chris)

 - UAF fix for drbd setup error handling (Dan)

 - Fix DMA alignment propagation in DM (Keith)

* tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux:
  dm-log-writes: set dma_alignment limit in io_hints
  dm-integrity: set dma_alignment limit in io_hints
  block: make blk_set_default_limits() private
  dm-crypt: provide dma_alignment limit in io_hints
  block: make dma_alignment a stacking queue_limit
  nvmet: fix a memory leak in nvmet_auth_set_key
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000
  drbd: use after free in drbd_create_device()
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron Nitro
  blk-cgroup: properly pin the parent in blkcg_css_online

23 months agoMerge tag 'drm-fixes-2022-11-19' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 18 Nov 2022 21:31:40 +0000 (13:31 -0800)]
Merge tag 'drm-fixes-2022-11-19' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "I guess the main question is are things settling down, and I'd say
  kinda, these are all pretty small fixes, nothing big stands out
  really, just seems to be quite a few of them.

  Mostly amdgpu and core fixes, with some i915, tegra, vc4, panel bits.

  core:
   - Fix potential memory leak in drm_dev_init()
   - Fix potential null-ptr-deref in drm_vblank_destroy_worker()
   - Revert hiding unregistered connectors from userspace, as it breaks
     on DP-MST
   - Add workaround for DP++ dual mode adaptors that don't support i2c
     subaddressing

  i915:
   - Fix uaf with lmem_userfault_list handling

  amdgpu:
   - gang submit fixes
   - Fix a possible memory leak in ganng submit error path
   - DP tunneling fixes
   - DCN 3.1 page flip fix
   - DCN 3.2.x fixes
   - DCN 3.1.4 fixes
   - Don't expose degamma on hardware that doesn't support it
   - BACO fixes for SMU 11.x
   - BACO fixes for SMU 13.x
   - Virtual display fix for devices with no display hardware

  amdkfd:
   - Memory limit regression fix

  tegra:
   - tegra20 GART fix

  vc4:
   - Fix error handling in vc4_atomic_commit_tail()

  lima:
   - Set lima's clkname corrrectly when regulator is missing

  panel:
   - Set bpc for logictechno panels"

* tag 'drm-fixes-2022-11-19' of git://anongit.freedesktop.org/drm/drm: (28 commits)
  gpu: host1x: Avoid trying to use GART on Tegra20
  drm/display: Don't assume dual mode adaptors support i2c sub-addressing
  drm/amd/pm: fix SMU13 runpm hang due to unintentional workaround
  drm/amd/pm: enable runpm support over BACO for SMU13.0.7
  drm/amd/pm: enable runpm support over BACO for SMU13.0.0
  drm/amdgpu: there is no vbios fb on devices with no display hw (v2)
  drm/amdkfd: Fix a memory limit issue
  drm/amdgpu: disable BACO support on more cards
  drm/amd/display: don't enable DRM CRTC degamma property for DCE
  drm/amd/display: Set max for prefetch lines on dcn32
  drm/amd/display: use uclk pstate latency for fw assisted mclk validation dcn32
  drm/amd/display: Fix prefetch calculations for dcn32
  drm/amd/display: Fix optc2_configure warning on dcn314
  drm/amd/display: Fix calculation for cursor CAB allocation
  Revert "drm: hide unregistered connectors from GETCONNECTOR IOCTL"
  drm/amd/display: Support parsing VRAM info v3.0 from VBIOS
  drm/amd/display: Fix invalid DPIA AUX reply causing system hang
  drm/amdgpu: Add psp_13_0_10_ta firmware to modinfo
  drm/amd/display: Add HUBP surface flip interrupt handler
  drm/amd/display: Fix access timeout to DPIA AUX at boot time
  ...

23 months agoMerge tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 18 Nov 2022 20:30:23 +0000 (12:30 -0800)]
Merge tag 's390-6.1-5' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Fix deadlock in discontiguous saved segments (DCSS) block device
   driver. When adding a disk and scanning partitions the scan would not
   break out early without a missed flag.

 - Avoid using global register variable for current_stack_pointer due to
   an old bug in gcc versions prior to gcc-8.4. Due to this bug a broken
   code is generated, which leads to stack corruptions.

* tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: avoid using global register for current_stack_pointer
  s390/dcssblk: fix deadlock when adding a DCSS

23 months agoMerge tag 'for-6.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/devic...
Linus Torvalds [Fri, 18 Nov 2022 20:23:35 +0000 (12:23 -0800)]
Merge tag 'for-6.1/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix misbehavior if list_versions DM ioctl races with module loading

 - Fix missing decrement of no_sleep_enabled if dm_bufio_client_create
   failed

 - Allow DM integrity devices to be activated in read-only mode

* tag 'for-6.1/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm integrity: clear the journal on suspend
  dm integrity: flush the journal on suspend
  dm bufio: Fix missing decrement of no_sleep_enabled if dm_bufio_client_create failed
  dm ioctl: fix misbehavior if list_versions races with module loading

23 months agoMerge tag 'drm/tegra/for-6.1-rc6' of https://gitlab.freedesktop.org/drm/tegra into...
Dave Airlie [Fri, 18 Nov 2022 20:15:20 +0000 (06:15 +1000)]
Merge tag 'drm/tegra/for-6.1-rc6' of https://gitlab.freedesktop.org/drm/tegra into drm-fixes

drm/tegra: Fixes for v6.1-rc6

This contains a single fix that avoids using the GART on Tegra20 because
it doesn't work well with the way the Tegra DRM driver tries to use it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118121614.3511110-1-thierry.reding@gmail.com
23 months agoMerge tag 'usb-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 18 Nov 2022 20:08:24 +0000 (12:08 -0800)]
Merge tag 'usb-6.1-rc6' of git://git./linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are a number of USB driver fixes and new device ids for 6.1-rc6.
  Included in here are:

   - new usb-serial device ids

   - dwc3 driver fixes for reported problems

   - cdns3 driver fixes

   - new USB device quirks

   - typec driver fixes

   - extcon USB typec driver fix

  All of these have been in linux-next with no reported issues"

* tag 'usb-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: option: add u-blox LARA-L6 modem
  USB: serial: option: add u-blox LARA-R6 00B modem
  USB: serial: option: remove old LARA-R6 PID
  USB: serial: option: add Fibocom FM160 0x0111 composition
  usb: add NO_LPM quirk for Realforce 87U Keyboard
  usb: cdns3: host: fix endless superspeed hub port reset
  usb: chipidea: fix deadlock in ci_otg_del_timer
  usb: dwc3: Do not get extcon device when usb-role-switch is used
  usb: typec: tipd: Prevent uninitialized event{1,2} in IRQ handler
  usb: typec: mux: Enter safe mode only when pins need to be reconfigured
  extcon: usbc-tusb320: Call the Type-C IRQ handler only if a port is registered
  Revert "usb: dwc3: disable USB core PHY management"
  usb: dwc3: gadget: Return -ESHUTDOWN on ep disable
  USB: bcma: Make GPIO explicitly optional
  USB: serial: option: add Sierra Wireless EM9191

23 months agoMerge tag 'staging-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 18 Nov 2022 20:02:38 +0000 (12:02 -0800)]
Merge tag 'staging-6.1-rc6' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix for 6.1-rc6.

  It resolves a bogus signed character test as pointed out, and fixed
  by, Jason in the rtl8192e driver

  It has been in linux-next for a few weeks now with no reported
  problems"

* tag 'staging-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8192e: remove bogus ssid character sign test

23 months agoarm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
Liu Shixin [Thu, 17 Nov 2022 07:56:01 +0000 (15:56 +0800)]
arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud

The page table check trigger BUG_ON() unexpectedly when collapse hugepage:

 ------------[ cut here ]------------
 kernel BUG at mm/page_table_check.c:82!
 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 6 PID: 68 Comm: khugepaged Not tainted 6.1.0-rc3+ #750
 Hardware name: linux,dummy-virt (DT)
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : page_table_check_clear.isra.0+0x258/0x3f0
 lr : page_table_check_clear.isra.0+0x240/0x3f0
[...]
 Call trace:
  page_table_check_clear.isra.0+0x258/0x3f0
  __page_table_check_pmd_clear+0xbc/0x108
  pmdp_collapse_flush+0xb0/0x160
  collapse_huge_page+0xa08/0x1080
  hpage_collapse_scan_pmd+0xf30/0x1590
  khugepaged_scan_mm_slot.constprop.0+0x52c/0xac8
  khugepaged+0x338/0x518
  kthread+0x278/0x2f8
  ret_from_fork+0x10/0x20
[...]

Since pmd_user_accessible_page() doesn't check if a pmd is leaf, it
decrease file_map_count for a non-leaf pmd comes from collapse_huge_page().
and so trigger BUG_ON() unexpectedly.

Fix this problem by using pmd_leaf() insteal of pmd_present() in
pmd_user_accessible_page(). Moreover, use pud_leaf() for
pud_user_accessible_page() too.

Fixes: 42b2547137f5 ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK")
Reported-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221117075602.2904324-2-liushixin2@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
23 months agoMerge tag 'tty-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Fri, 18 Nov 2022 18:59:52 +0000 (10:59 -0800)]
Merge tag 'tty-6.1-rc6' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are a number of small tty and serial driver fixes for 6.1-rc6.
  They all resolve reported problems:

   - kernel doc build problems with the -rc1 serial driver documentation
     update

   - n_gsm reported problems

   - imx serial driver missing callback

   - lots of tiny 8250 driver fixes for reported issues.

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'tty-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  docs/driver-api/miscellaneous: Remove kernel-doc of serial_core.c
  serial: 8250: Flush DMA Rx on RLSI
  serial: 8250_lpss: Use 16B DMA burst with Elkhart Lake
  serial: 8250_lpss: Configure DMA also w/o DMA filter
  serial: 8250: Fall back to non-DMA Rx if IIR_RDI occurs
  tty: n_gsm: fix sleep-in-atomic-context bug in gsm_control_send
  Revert "tty: n_gsm: replace kicktimer with delayed_work"
  Revert "tty: n_gsm: avoid call of sleeping functions from atomic context"
  serial: imx: Add missing .thaw_noirq hook
  tty: serial: fsl_lpuart: don't break the on-going transfer when global reset
  serial: 8250: omap: Flush PM QOS work on remove
  serial: 8250: omap: Fix unpaired pm_runtime_put_sync() in omap8250_remove()
  serial: 8250_omap: remove wait loop from Errata i202 workaround
  serial: 8250: omap: Fix missing PM runtime calls for omap8250_set_mctrl()
  serial: 8250: 8250_omap: Avoid RS485 RTS glitch on ->set_termios()

23 months agoMerge tag 'driver-core-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 18 Nov 2022 18:49:53 +0000 (10:49 -0800)]
Merge tag 'driver-core-6.1-rc6' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two small driver core fixes for 6.1-rc6:

   - utsname fix, this one should already be in your tree as it came
     from a different tree earlier.

   - kernfs bugfix for a much reported syzbot report that seems to keep
     getting triggered.

  Both of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kernfs: Fix spurious lockdep warning in kernfs_find_and_get_node_by_id()
  kernel/utsname_sysctl.c: Add missing enum uts_proc value

23 months agoMerge tag 'char-misc-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 18 Nov 2022 18:29:25 +0000 (10:29 -0800)]
Merge tag 'char-misc-6.1-rc6' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char/misc and other driver fixes for 6.1-rc6 to
  resolve some reported problems. Included in here are:

   - iio driver fixes

   - binder driver fix

   - nvmem driver fix

   - vme_vmci information leak fix

   - parport fix

   - slimbus configuration fix

   - coreboot firmware bugfix

   - speakup build fix and crash fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (22 commits)
  firmware: coreboot: Register bus in module init
  nvmem: u-boot-env: fix crc32_data_offset on redundant u-boot-env
  slimbus: qcom-ngd: Fix build error when CONFIG_SLIM_QCOM_NGD_CTRL=y && CONFIG_QCOM_RPROC_COMMON=m
  docs: update mediator contact information in CoC doc
  slimbus: stream: correct presence rate frequencies
  nvmem: lan9662-otp: Fix compatible string
  binder: validate alloc->mm in ->mmap() handler
  parport_pc: Avoid FIFO port location truncation
  siox: fix possible memory leak in siox_device_add()
  misc/vmw_vmci: fix an infoleak in vmci_host_do_receive_datagram()
  speakup: replace utils' u_char with unsigned char
  speakup: fix a segfault caused by switching consoles
  tools: iio: iio_generic_buffer: Fix read size
  iio: imu: bno055: uninitialized variable bug in bno055_trigger_handler()
  iio: adc: at91_adc: fix possible memory leak in at91_adc_allocate_trigger()
  iio: adc: mp2629: fix potential array out of bound access
  iio: adc: mp2629: fix wrong comparison of channel
  iio: pressure: ms5611: changed hardcoded SPI speed to value limited
  iio: pressure: ms5611: fixed value compensation bug
  iio: accel: bma400: Ensure VDDIO is enable defore reading the chip ID.
  ...

23 months agoMerge tag 'sound-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 18 Nov 2022 17:52:10 +0000 (09:52 -0800)]
Merge tag 'sound-6.1-rc6' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A fair amount of commits at this time due to ASoC PR merge, but all
  look small and easy, mostly device-specific fixes spanned in various
  drivers. Hopefully this should be the last big chunk for 6.1"

* tag 'sound-6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360
  ALSA: hda/realtek: fix speakers for Samsung Galaxy Book Pro
  ALSA: usb-audio: Drop snd_BUG_ON() from snd_usbmidi_output_open()
  ASoC: stm32: dfsdm: manage cb buffers cleanup
  ASoC: sof_es8336: reduce pop noise on speaker
  ASoC: SOF: topology: No need to assign core ID if token parsing failed
  ASoC: soc-utils: Remove __exit for snd_soc_util_exit()
  ASoC: rt5677: fix legacy dai naming
  ASoC: rt5514: fix legacy dai naming
  ASoC: SOF: ipc3-topology: use old pipeline teardown flow with SOF2.1 and older
  ASoC: hda: intel-dsp-config: add ES83x6 quirk for IceLake
  ASoC: Intel: soc-acpi: add ES83x6 support to IceLake
  ASoC: tas2780: Fix set_tdm_slot in case of single slot
  ASoC: tas2764: Fix set_tdm_slot in case of single slot
  ASoC: tas2770: Fix set_tdm_slot in case of single slot
  ASoC: fsl_asrc fsl_esai fsl_sai: allow CONFIG_PM=N
  ASoC: core: Fix use-after-free in snd_soc_exit()
  MAINTAINERS: update Tzung-Bi's email address
  ASoC: Intel: bytcht_es8316: Add quirk for the Nanote UMPC-01
  ASoC: amd: yc: Add Alienware m17 R5 AMD into DMI table
  ...

23 months agoMerge tag 'mmc-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 18 Nov 2022 17:43:30 +0000 (09:43 -0800)]
Merge tag 'mmc-v6.1-rc5' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Fixup VDD/VMMC voltage-range negotiation

  MMC host:
   - sdhci-pci: Fix memory leak by adding a missing pci_dev_put()
   - sdhci-pci-o2micro: Fix card detect by tuning the debounce timeout"

* tag 'mmc-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-pci: Fix possible memory leak caused by missing pci_dev_put()
  mmc: sdhci-pci-o2micro: fix card detect fail issue caused by CD# debounce timeout
  mmc: core: properly select voltage range without power cycle

23 months agoio_uring: disallow self-propelled ring polling
Pavel Begunkov [Fri, 18 Nov 2022 15:41:41 +0000 (15:41 +0000)]
io_uring: disallow self-propelled ring polling

When we post a CQE we wake all ring pollers as it normally should be.
However, if a CQE was generated by a multishot poll request targeting
its own ring, it'll wake that request up, which will make it to post
a new CQE, which will wake the request and so on until it exhausts all
CQ entries.

Don't allow multishot polling io_uring files but downgrade them to
oneshots, which was always stated as a correct behaviour that the
userspace should check for.

Cc: stable@vger.kernel.org
Fixes: aa43477b04025 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3124038c0e7474d427538c2d915335ec28c92d21.1668785722.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
23 months agodm integrity: clear the journal on suspend
Mikulas Patocka [Tue, 15 Nov 2022 17:51:50 +0000 (12:51 -0500)]
dm integrity: clear the journal on suspend

There was a problem that a user burned a dm-integrity image on CDROM
and could not activate it because it had a non-empty journal.

Fix this problem by flushing the journal (done by the previous commit)
and clearing the journal (done by this commit). Once the journal is
cleared, dm-integrity won't attempt to replay it on the next
activation.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
23 months agodm integrity: flush the journal on suspend
Mikulas Patocka [Tue, 15 Nov 2022 17:48:26 +0000 (12:48 -0500)]
dm integrity: flush the journal on suspend

This commit flushes the journal on suspend. It is prerequisite for the
next commit that enables activating dm integrity devices in read-only mode.

Note that we deliberately didn't flush the journal on suspend, so that the
journal replay code would be tested. However, the dm-integrity code is 5
years old now, so that journal replay is well-tested, and we can make this
change now.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
23 months agodm bufio: Fix missing decrement of no_sleep_enabled if dm_bufio_client_create failed
Zhihao Cheng [Fri, 11 Nov 2022 12:10:27 +0000 (20:10 +0800)]
dm bufio: Fix missing decrement of no_sleep_enabled if dm_bufio_client_create failed

The 'no_sleep_enabled' should be decreased in error handling path
in dm_bufio_client_create() when the DM_BUFIO_CLIENT_NO_SLEEP flag
is set, otherwise static_branch_unlikely() will always return true
even if no dm_bufio_client instances have DM_BUFIO_CLIENT_NO_SLEEP
flag set.

Cc: stable@vger.kernel.org
Fixes: 3c1c875d0586 ("dm bufio: conditionally enable branching for DM_BUFIO_CLIENT_NO_SLEEP")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
23 months agodm ioctl: fix misbehavior if list_versions races with module loading
Mikulas Patocka [Tue, 1 Nov 2022 20:53:35 +0000 (16:53 -0400)]
dm ioctl: fix misbehavior if list_versions races with module loading

__list_versions will first estimate the required space using the
"dm_target_iterate(list_version_get_needed, &needed)" call and then will
fill the space using the "dm_target_iterate(list_version_get_info,
&iter_info)" call. Each of these calls locks the targets using the
"down_read(&_lock)" and "up_read(&_lock)" calls, however between the first
and second "dm_target_iterate" there is no lock held and the target
modules can be loaded at this point, so the second "dm_target_iterate"
call may need more space than what was the first "dm_target_iterate"
returned.

The code tries to handle this overflow (see the beginning of
list_version_get_info), however this handling is incorrect.

The code sets "param->data_size = param->data_start + needed" and
"iter_info.end = (char *)vers+len" - "needed" is the size returned by the
first dm_target_iterate call; "len" is the size of the buffer allocated by
userspace.

"len" may be greater than "needed"; in this case, the code will write up
to "len" bytes into the buffer, however param->data_size is set to
"needed", so it may write data past the param->data_size value. The ioctl
interface copies only up to param->data_size into userspace, thus part of
the result will be truncated.

Fix this bug by setting "iter_info.end = (char *)vers + needed;" - this
guarantees that the second "dm_target_iterate" call will write only up to
the "needed" buffer and it will exit with "DM_BUFFER_FULL_FLAG" if it
overflows the "needed" space - in this case, userspace will allocate a
larger buffer and retry.

Note that there is also a bug in list_version_get_needed - we need to add
"strlen(tt->name) + 1" to the needed size, not "strlen(tt->name)".

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
23 months agoMerge tag 'nvme-6.1-2022-11-18' of git://git.infradead.org/nvme into block-6.1
Jens Axboe [Fri, 18 Nov 2022 14:47:54 +0000 (07:47 -0700)]
Merge tag 'nvme-6.1-2022-11-18' of git://git.infradead.org/nvme into block-6.1

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 6.1

 - two more bogus nid quirks (Bean Huo, Tiago Dias Ferreira)
 - memory leak fix in nvmet (Sagi Grimberg)"

* tag 'nvme-6.1-2022-11-18' of git://git.infradead.org/nvme:
  nvmet: fix a memory leak in nvmet_auth_set_key
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron Nitro

23 months agogpu: host1x: Avoid trying to use GART on Tegra20
Robin Murphy [Thu, 20 Oct 2022 14:23:40 +0000 (15:23 +0100)]
gpu: host1x: Avoid trying to use GART on Tegra20

Since commit c7e3ca515e78 ("iommu/tegra: gart: Do not register with
bus") quite some time ago, the GART driver has effectively disabled
itself to avoid issues with the GPU driver expecting it to work in ways
that it doesn't. As of commit 57365a04c921 ("iommu: Move bus setup to
IOMMU device registration") that bodge no longer works, but really the
GPU driver should be responsible for its own behaviour anyway. Make the
workaround explicit.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Suggested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
23 months agotracing: Fix potential null-pointer-access of entry in list 'tr->err_log'
Zheng Yejian [Mon, 14 Nov 2022 10:46:32 +0000 (18:46 +0800)]
tracing: Fix potential null-pointer-access of entry in list 'tr->err_log'

Entries in list 'tr->err_log' will be reused after entry number
exceed TRACING_LOG_ERRS_MAX.

The cmd string of the to be reused entry will be freed first then
allocated a new one. If the allocation failed, then the entry will
still be in list 'tr->err_log' but its 'cmd' field is set to be NULL,
later access of 'cmd' is risky.

Currently above problem can cause the loss of 'cmd' information of first
entry in 'tr->err_log'. When execute `cat /sys/kernel/tracing/error_log`,
reproduce logs like:
  [   37.495100] trace_kprobe: error: Maxactive is not for kprobe(null) ^
  [   38.412517] trace_kprobe: error: Maxactive is not for kprobe
    Command: p4:myprobe2 do_sys_openat2
            ^

Link: https://lore.kernel.org/linux-trace-kernel/20221114104632.3547266-1-zhengyejian1@huawei.com
Fixes: 1581a884b7ca ("tracing: Remove size restriction on tracing_log_err cmd strings")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
23 months agotracing: Remove unused __bad_type_size() method
Qiujun Huang [Thu, 17 Nov 2022 16:44:35 +0000 (00:44 +0800)]
tracing: Remove unused __bad_type_size() method

__bad_type_size() is unused after
commit 04ae87a52074("ftrace: Rework event_create_dir()").
So, remove it.

Link: https://lkml.kernel.org/r/D062EC2E-7DB7-4402-A67E-33C3577F551E@gmail.com
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
23 months agotracing/eprobe: Fix eprobe filter to make a filter correctly
Masami Hiramatsu (Google) [Fri, 18 Nov 2022 01:15:34 +0000 (10:15 +0900)]
tracing/eprobe: Fix eprobe filter to make a filter correctly

Since the eprobe filter was defined based on the eprobe's trace event
itself, it doesn't work correctly. Use the original trace event of
the eprobe when making the filter so that the filter works correctly.

Without this fix:

 # echo 'e syscalls/sys_enter_openat \
flags_rename=$flags:u32 if flags < 1000' >> dynamic_events
 # echo 1 > events/eprobes/sys_enter_openat/enable
[  114.551550] event trace: Could not enable event sys_enter_openat
-bash: echo: write error: Invalid argument

With this fix:
 # echo 'e syscalls/sys_enter_openat \
flags_rename=$flags:u32 if flags < 1000' >> dynamic_events
 # echo 1 > events/eprobes/sys_enter_openat/enable
 # tail trace
cat-241     [000] ...1.   266.498449: sys_enter_openat: (syscalls.sys_enter_openat) flags_rename=0
cat-242     [000] ...1.   266.977640: sys_enter_openat: (syscalls.sys_enter_openat) flags_rename=0

Link: https://lore.kernel.org/all/166823166395.1385292.8931770640212414483.stgit@devnote3/
Fixes: 752be5c5c910 ("tracing/eprobe: Add eprobe filter support")
Reported-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Tested-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>