Moti Haimovski [Thu, 29 Dec 2022 10:44:09 +0000 (12:44 +0200)]
habanalabs: extend fatal messages to contain PCI info
This commit attaches the PCI device address to driver fatal messages
in order to ease debugging in multi-device setups.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
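As an illustration of the change described in the commit above, the following is a minimal userspace sketch of prefixing a fatal message with a PCI address in the conventional domain:bus:device.function notation. The struct pci_addr type and the format_fatal() helper are hypothetical names invented for this sketch; they are not the driver's code, and the exact message layout used by the driver is not shown in this log.

```c
#include <stdio.h>

/* Hypothetical stand-in for a device's PCI location (not the kernel's struct pci_dev). */
struct pci_addr {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int func;
};

/* Prefix a fatal message with the address in domain:bus:slot.func form.
 * Returns what snprintf() returns: the length the full message needs. */
int format_fatal(char *buf, size_t len, const struct pci_addr *a, const char *msg)
{
    return snprintf(buf, len, "[%04x:%02x:%02x.%x] FATAL: %s",
                    a->domain, a->bus, a->slot, a->func, msg);
}
```

With several devices in one host, a prefix of this kind lets a reader attribute each fatal line to a specific card.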
Dani Liberman [Tue, 3 Jan 2023 22:05:03 +0000 (00:05 +0200)]
habanalabs/gaudi2: remove use of razwi info received from f/w
Because f/w does not update razwi info when sending events, remove the
use of it.
The driver is responsible for checking whether a razwi happened and
for collecting the razwi data.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Wed, 30 Nov 2022 12:41:49 +0000 (14:41 +0200)]
habanalabs: trace LBW reads/writes
Add traces to LBW reads/writes.
This may be handy when debugging configuration failure or events when
tracking configuration flow.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Wed, 30 Nov 2022 12:02:00 +0000 (14:02 +0200)]
habanalabs: define events to trace PCI LBW access
There are cases where it may be useful to dump the whole LBW configs.
Yet, doing so by printing to the kernel log would probably drown out
other important messages, since LBW accesses happen in sheer volume.
To address this, we add trace events for those accesses too.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Carmit Carmel [Wed, 4 Jan 2023 09:13:01 +0000 (11:13 +0200)]
habanalabs/gaudi2: fix log for sob value overflow/underflow
The value in SM_SEI_CAUSE includes the SOB index and not the SOB group
index.
Remove usage of log_mask in sm_sei_cause structure as it was never
used.
Signed-off-by: Carmit Carmel <ccarmel@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Mon, 2 Jan 2023 14:44:28 +0000 (16:44 +0200)]
habanalabs: add set engines masks ASIC function
This function shall be used whenever components enable/binning masks
should be updated.
Usage is in one of the below cases:
- update user (or default) component masks
- update when getting the masks from FW (either CPUCP or COMMS)
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Koby Elbaz [Fri, 23 Dec 2022 13:02:05 +0000 (15:02 +0200)]
habanalabs: protect access to dynamic mem 'user_mappings'
When HL_INFO_USER_MAPPINGS IOCTL is called, we copy_to_user from
a dynamically allocated memory - 'user_mappings'.
Since freeing/allocating it happens at runtime (upon a page fault),
it is not unlikely that it will be accessed even before being
initially allocated (i.e., accessing a NULL pointer).
The solution is to simply mark the spot when the err info has been
collected, and that way to know whether err info (either page fault
or RAZWI) is available to be read.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
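The "mark the spot" approach from the commit above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: struct fault_info, its fields, and read_fault_info() are hypothetical names, memcpy() stands in for copy_to_user(), and the locking a real driver would need around the flag is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified error-info record; not the driver's real layout. */
struct fault_info {
    bool available;       /* set only once the data below is fully captured */
    size_t len;
    unsigned char *data;  /* allocated at fault time, NULL before that */
};

/* Copy captured data to the caller, refusing if nothing was captured yet. */
int read_fault_info(const struct fault_info *fi, unsigned char *out, size_t out_len)
{
    if (!fi->available)
        return -ENOENT;   /* never dereference fi->data before it exists */
    if (out_len < fi->len)
        return -ENOMEM;
    memcpy(out, fi->data, fi->len);
    return 0;
}
```

The point of the flag is that the reader never has to guess from the pointer alone whether the data is valid.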
Tom Rix [Sat, 7 Jan 2023 18:48:27 +0000 (13:48 -0500)]
habanalabs: remove redundant memset
From reviewing the code, the line
memset(kdata, 0, usize);
is not needed because kdata is either zeroed by
kdata = kzalloc(asize, GFP_KERNEL);
when allocated at runtime or by
char stack_kdata[128] = {0};
when declared on the stack.
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
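The reasoning in the commit above can be checked with a small userspace sketch. Here calloc() stands in for kzalloc() (an assumption made so the example runs outside the kernel), and all_zero() and buffers_start_zeroed() are hypothetical helper names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return 1 if every byte in buf is zero, 0 otherwise. */
int all_zero(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Mirror the two paths from the commit message without any memset(). */
int buffers_start_zeroed(size_t heap_size)
{
    unsigned char stack_buf[128] = {0};              /* zeroed by the initializer */
    unsigned char *heap_buf = calloc(1, heap_size);  /* zeroed by the allocator */
    int ok;

    if (!heap_buf)
        return -1;
    ok = all_zero(stack_buf, sizeof(stack_buf)) && all_zero(heap_buf, heap_size);
    free(heap_buf);
    return ok;
}
```

Both buffers are already zero before first use, so an extra memset() on either path would only repeat work.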
Koby Elbaz [Sun, 25 Dec 2022 10:43:04 +0000 (12:43 +0200)]
habanalabs: refactor razwi/page-fault information structures
This refactor makes the code clearer and the new variables' names
better describe their roles.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Koby Elbaz [Wed, 21 Dec 2022 15:49:42 +0000 (17:49 +0200)]
habanalabs/gaudi2: avoid reconfiguring the same PB registers
It appears that, within the sync manager security configuration,
we reconfigure PB registers over and over without any need to do that.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ofir Bitton [Sun, 25 Dec 2022 14:27:24 +0000 (16:27 +0200)]
habanalabs/gaudi: allow device acquire while in debug mode
During device acquire, the driver is using a QMAN for clearing some
registers. In order to avoid internal races, the driver verifies
the device is idle before submitting the register clear job.
This check introduces an issue, as debug mode will cause the device
to be non-idle which will lead to device acquire failure.
In order to overcome this issue we can entirely remove the idle
check as the driver is using the QMAN only when there is no active
context.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Oded Gabbay [Thu, 22 Dec 2022 10:28:54 +0000 (12:28 +0200)]
habanalabs: move some prints to debug level
When entering an IOCTL, the driver prints a message in case device is
not operational. This message should be printed in debug level as
it can spam the kernel log and it is not an error.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Oded Gabbay [Wed, 21 Dec 2022 10:51:13 +0000 (12:51 +0200)]
habanalabs: update f/w files
Update common firmware files with the latest version.
There is no functional change.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Oded Gabbay [Wed, 21 Dec 2022 10:18:55 +0000 (12:18 +0200)]
habanalabs/gaudi2: update f/w files
Update gaudi2 firmware files with the latest version.
There is no functional change.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Oded Gabbay [Wed, 21 Dec 2022 09:55:54 +0000 (11:55 +0200)]
habanalabs/gaudi2: update asic register files
Update some register files with the latest h/w auto-generated files.
There is no functional change.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Tue, 6 Dec 2022 17:54:10 +0000 (19:54 +0200)]
habanalabs: verify that kernel CB is destroyed only once
Remove the distinction between user CB and kernel CB, and verify for
both that they are not destroyed more than once.
As a kernel CB might be taken from the pre-allocated CB pool, we need
to clear the handle-destroyed indication when returning a CB to the
pool.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
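A minimal sketch of the destroy-once idea from the commit above, using C11 atomics in userspace. The struct cb layout and the function names are hypothetical; the driver's actual fields and refcounting helpers are not shown in this log.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified command buffer; field names are illustrative. */
struct cb {
    atomic_bool handle_destroyed;
    int refcount;
};

/* Destroy the handle once; later attempts fail without touching the refcount. */
int cb_destroy_handle(struct cb *cb)
{
    /* atomic_exchange() returns the previous value, so only the first caller sees false. */
    if (atomic_exchange(&cb->handle_destroyed, true))
        return -EINVAL;
    cb->refcount--;
    return 0;
}

/* A pooled CB will be handed out again, so its indication must be reset. */
void cb_return_to_pool(struct cb *cb)
{
    atomic_store(&cb->handle_destroyed, false);
}
```

Without the reset in cb_return_to_pool(), the next user of a recycled CB would be unable to destroy its handle at all.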
Ohad Sharabi [Sun, 18 Dec 2022 07:42:34 +0000 (09:42 +0200)]
habanalabs: add uapi to flush inbound HBM transactions
When doing p2p with a NIC device, the NIC needs to make sure all the
writes to the HBM (through the PCI bar of the Gaudi device) were
flushed.
It can be done by either the NIC or the host reading through the PCI
bar.
To support the host side, we supply a simple uapi to perform this flush
through the driver, because the user can't create such a transaction
by itself (the PCI bar isn't exposed to normal users).
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Oded Gabbay [Mon, 26 Dec 2022 21:05:00 +0000 (23:05 +0200)]
habanalabs: move driver to accel subsystem
Now that we have a subsystem for compute accelerators, move the
habanalabs driver to it.
This patch only moves the files and fixes the Makefiles. Future
patches will change the existing code to register to the accel
subsystem and expose the accel device char files instead of the
habanalabs device char files.
Update the MAINTAINERS file to reflect this change.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Oded Gabbay [Tue, 20 Dec 2022 12:12:19 +0000 (14:12 +0200)]
habanalabs/uapi: move uapi file to drm
Move the habanalabs.h uapi file from include/uapi/misc to
include/uapi/drm, and rename it to habanalabs_accel.h.
This is required before moving the actual driver to the accel
subsystem.
Update MAINTAINERS file accordingly.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Thu, 15 Dec 2022 14:36:53 +0000 (16:36 +0200)]
habanalabs: fix dma-buf release handling if dma_buf_fd() fails
The dma-buf private object is freed if a call to dma_buf_fd() fails,
and because a file was already associated with the dma-buf in
dma_buf_export(), the release op will be called and will use this
object.
Mark the 'priv' field as NULL in this case, and avoid accessing it from
the release op.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ofir Bitton [Wed, 14 Dec 2022 14:52:05 +0000 (16:52 +0200)]
habanalabs/gaudi2: dump event description even if no cause
In order to have the no-cause error print be more informative,
we add the event description in addition to the event id.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
farah kassabri [Wed, 16 Nov 2022 13:40:30 +0000 (15:40 +0200)]
habanalabs: pass-through request from user to f/w
Add a uAPI, as part of the INFO IOCTL, to allow users to send
requests directly to f/w, according to a pre-defined set of opcodes
that the f/w exposes.
The f/w will put the result in a kernel-allocated buffer, which the
driver will then copy to the user-supplied buffer.
This will allow f/w tools to communicate directly with the f/w
without the need to add a new uAPI to the driver for each new type
of request.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tal Cohen [Thu, 1 Dec 2022 14:37:30 +0000 (16:37 +0200)]
habanalabs: support receiving ascii message from preboot f/w
An ASCII message that is sent from preboot to the driver
indicates the specific error that occurred in the f/w.
This commit adds support for that message and parses the ASCII string
in order to print it to the kernel log.
The commit also changes the way the descriptor struct is declared.
As its size has increased (it is now above 1024 bytes), it is
allocated using kmalloc instead of being declared on the stack.
Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Thu, 8 Dec 2022 13:19:10 +0000 (15:19 +0200)]
habanalabs: fix asic-specific functions documentation
- Add missing documentation of set DRAM props
- fix typo
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
farah kassabri [Mon, 28 Nov 2022 11:11:44 +0000 (13:11 +0200)]
habanalabs: fix wrong variable type used for vzalloc
vzalloc expects void* and not void __iomem*.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Wed, 30 Nov 2022 12:26:10 +0000 (14:26 +0200)]
habanalabs/gaudi2: wait for preboot ready if HW state is dirty
Instead of waiting for BTM indication we should wait for preboot ready.
Consider the below scenario:
1. FW update is being triggered
- setting the dirty bit
2. hard reset will be triggered due to the dirty bit
3. FW initiates the reset:
- dirty bit cleared
- BTM indication cleared
- preboot ready indication cleared
4. during hard reset:
- BTM indication will be set
- BIST test performed and another reset triggered
5. only after this reset the preboot will set the preboot ready
When polling on the BTM indication alone, we can lose sync with the FW
while trying to communicate with a FW that is in the middle of a reset.
To overcome this, we will always wait for the preboot ready indication.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
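The scenario in the commit above can be modeled with a sequence of status samples. This is only a sketch: the bit positions STS_BTM and STS_PREBOOT_READY and the helper first_sample_with() are hypothetical, and real polling reads a device register with a timeout rather than walking an array.

```c
/* Hypothetical bit positions, chosen only for this sketch. */
#define STS_BTM            (1u << 0)
#define STS_PREBOOT_READY  (1u << 1)

/* Return the index of the first sample in which all bits of mask are set, or -1. */
int first_sample_with(const unsigned int *samples, int n, unsigned int mask)
{
    for (int i = 0; i < n; i++)
        if ((samples[i] & mask) == mask)
            return i;
    return -1;
}
```

If the samples are { 0, BTM, 0, BTM, BTM | PREBOOT_READY }, a poll on BTM alone stops at the second sample, before the BIST-triggered reset, while a poll on preboot ready stops only at the last one.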
Tomer Tayar [Sun, 4 Dec 2022 21:23:47 +0000 (23:23 +0200)]
habanalabs: put fences in case of unexpected wait status
Need to put fences even if an unexpected status value is received while
waiting for a fence.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Sun, 4 Dec 2022 20:09:08 +0000 (22:09 +0200)]
habanalabs: fix handling of wait CS for interrupting signals
The -ERESTARTSYS return value is not handled correctly when a signal is
received while waiting for CS completion.
This can lead to bad output values being returned to the user when
waiting for a single CS completion and, more severely, it can cause a
non-stopping loop until a CS timeout when waiting for multi-CS completion.
Fix the handling and exit the waiting if this return value is received.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
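The fix described in the commit above amounts to leaving the wait loop as soon as the wait is interrupted. A minimal userspace model follows; wait_multi_sketch() and its parameters are hypothetical, the per-round results are replayed from an array instead of coming from a real wait primitive, and ERESTARTSYS is defined locally because it is kernel-internal and absent from userspace <errno.h>.

```c
/* Kernel-internal errno value, defined here only for the sketch. */
#define ERESTARTSYS 512

/* Replay per-round wait results: >0 completed, 0 not yet, <0 error.
 * Returns 0 or a negative errno; *rounds_used counts the rounds consumed. */
int wait_multi_sketch(const int *rcs, int n, int *completed, int *rounds_used)
{
    *completed = 0;
    *rounds_used = 0;
    for (int i = 0; i < n; i++) {
        int rc = rcs[i];

        (*rounds_used)++;
        if (rc == -ERESTARTSYS)
            return rc;          /* interrupted by a signal: stop waiting now */
        if (rc < 0)
            return rc;          /* any other error also ends the wait */
        if (rc > 0) {
            *completed = 1;
            return 0;
        }
    }
    return 0;                   /* ran out of rounds; *completed stays 0 */
}
```

Returning immediately also keeps the output value well defined: *completed is only set when a completion was actually observed.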
Ohad Sharabi [Thu, 10 Nov 2022 11:43:02 +0000 (13:43 +0200)]
habanalabs: fix dmabuf to export only required size
This patch fixes a bug that was found in the dmabuf flow.
Bug description as found on Gaudi2 device:
1. User allocates 4MB of device memory
- Note that although the allocation size was 4MB the HMMU allocated
a full page of 768MB to back the request.
- The user gets a memory handle that points to a single page (768MB)
- Mapping the handle, the user gets virtual address to the start of
the page.
2. User exports the buffer
3. User registers the exported buffer in the importer. This flow has
a callback to the exporter which in turn converts the phys_page_pack
to an SG list for the importer. This SG list is of single entry of
size 768MB. However, the size that was passed to the importer was
only 4MB.
The solution for this is to make sure the importer gets exposure only
to the exported size.
This will be done by fixing the SG created by the exporter to be of
the total size of the actual exported memory requested by the user.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
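The size mismatch in the commit above (a 768MB backing page for a 4MB export) can be illustrated with a small helper that truncates scatter-gather entry lengths to the exported size. This is a sketch under assumptions: build_sg_lengths() is a hypothetical name, and it works on plain length arrays rather than on the kernel's SG table structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill sg_len[] with per-page lengths, truncated so that their sum equals
 * exported_size. Returns the number of entries used, or -1 if the pages
 * cannot cover exported_size. */
int build_sg_lengths(const uint64_t *page_sizes, size_t npages,
                     uint64_t exported_size, uint64_t *sg_len)
{
    uint64_t left = exported_size;
    size_t i;

    for (i = 0; i < npages && left > 0; i++) {
        uint64_t chunk = page_sizes[i] < left ? page_sizes[i] : left;

        sg_len[i] = chunk;
        left -= chunk;
    }
    return left == 0 ? (int)i : -1;
}
```

For one 768MB page and a 4MB export this yields a single entry of 4MB, so the importer never sees memory beyond what the user asked to export.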
Ohad Sharabi [Mon, 14 Nov 2022 10:16:37 +0000 (12:16 +0200)]
habanalabs: modify export dmabuf API
A previous commit deprecated the option to export from handle, leaving
the code with no support for devices with virtual memory.
This commit modifies the export API in a way that unifies the uAPI to
use a user address for both cases (i.e. with and without MMU support)
and adds the actual support for devices with virtual memory.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Tue, 29 Nov 2022 07:13:34 +0000 (09:13 +0200)]
habanalabs: helper function to validate export params
Validate export parameters in a dedicated function instead of in the
main export flow.
This will be useful later, when support for exporting dmabuf for
devices with virtual memory is added.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Tue, 29 Nov 2022 12:02:07 +0000 (14:02 +0200)]
habanalabs: remove support to export dmabuf from handle
The API to the user which allows exporting DMA buffer from handle is
deprecated here. It was never used as it is relevant only for Gaudi2,
and the user stack has yet to add support for dmabuf in Gaudi2.
Looking forward, a modified API to export a DMA buffer for ASICs that
support virtual memory will be added.
Until the new API is ready, exporting a DMA buffer will not be
supported for ASICs with virtual memory.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
farah kassabri [Tue, 29 Nov 2022 13:37:55 +0000 (15:37 +0200)]
habanalabs: set log level for descriptor validation to debug
This warning doesn't have real consequences, and therefore can be
printed in debug level.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Wed, 30 Nov 2022 09:31:39 +0000 (11:31 +0200)]
habanalabs: trace COMMS protocol
Call COMMS tracepoints from within the dynamic CPU FW load.
This can help debug failures or delays in the dynamic FW load flow.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Wed, 30 Nov 2022 09:16:51 +0000 (11:16 +0200)]
habanalabs: define traces for COMMS protocol
As the COMMS protocol is being used more widely in our driver,
an available debug tool for the handshake will be handy.
This commit defines tracepoints to various key points of the COMMS
protocol.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ofir Bitton [Wed, 30 Nov 2022 12:35:32 +0000 (14:35 +0200)]
habanalabs/gaudi2: support abrupt device reset event
In certain scenarios, firmware might encounter a fatal event for
which a device reset is required. Hence, a proper notification
is needed for driver to be aware and initiate a reset sequence.
In secured environments the reset will be performed by firmware
without an explicit request from the driver.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Wed, 30 Nov 2022 10:07:06 +0000 (12:07 +0200)]
habanalabs: skip device idle check in hpriv_release if in reset
When user context is released and hpriv_release() is called, there is a
device idle status check, to understand if user has left the device not
idle and then a reset is required.
However, if the user process is killed because of device hard reset,
the device at this point would always be not idle, because the device
engines were already forcefully halted.
Modify hpriv_release() to skip the idle check if reset is in progress.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tamir Gilad-Raz [Sun, 6 Nov 2022 09:22:16 +0000 (11:22 +0200)]
habanalabs: adjacent timestamps should be more accurate
Timestamp events that expire on the same interrupt will get the same
timestamp value.
Signed-off-by: Tamir Gilad-Raz <tgiladraz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
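One way to read the commit above is that the clock is sampled once per interrupt and that single value is applied to every event that expired on it. The sketch below models this in userspace; struct ts_event, fake_clock(), and stamp_expired() are hypothetical, and the counter-based clock exists only to show that the clock is read once per call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timestamp event record. */
struct ts_event {
    int expired;
    uint64_t timestamp;
};

/* Stand-in clock for the sketch: advances by one on every read. */
static uint64_t fake_ticks;

uint64_t fake_clock(void)
{
    return ++fake_ticks;
}

/* Stamp all events that expired on this interrupt with one clock sample.
 * Returns the number of events stamped. */
size_t stamp_expired(struct ts_event *ev, size_t n)
{
    uint64_t now = fake_clock();   /* sampled once, outside the loop */
    size_t stamped = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].expired) {
            ev[i].timestamp = now;
            stamped++;
        }
    }
    return stamped;
}
```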
Ofir Bitton [Wed, 16 Nov 2022 15:27:26 +0000 (17:27 +0200)]
habanalabs/gaudi2: remove duplicated event prints
In order to reduce error log, we try to minimize the dumped rows
while keeping all relevant error info. In addition we completely
remove clock throttling debug logs.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ofir Bitton [Wed, 23 Nov 2022 09:03:17 +0000 (11:03 +0200)]
habanalabs/gaudi2: count interrupt causes
During event handling we extract interrupt cause and count it.
In case we could not find any cause, we should print a proper error.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Sun, 27 Nov 2022 10:46:23 +0000 (12:46 +0200)]
habanalabs: update DRAM props according to preboot data
If the f/w reports the binning masks at the preboot stage, the driver
must align its DRAM properties according to the new information.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Marco Pagani [Tue, 29 Nov 2022 11:52:17 +0000 (12:52 +0100)]
habanalabs: fix double assignment in MMU V1
Removing double assignment of the hop2_pte_addr
variable in dram_default_mapping_fini().
Dead store reported by clang-analyzer.
Signed-off-by: Marco Pagani <marpagan@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ohad Sharabi [Sun, 27 Nov 2022 10:38:49 +0000 (12:38 +0200)]
habanalabs: make set_dram_properties an ASIC function
As ASICs are evolving, we will need to update the DRAM properties at
various points because we may get different information from the f/w
at different points of the initialization.
This ASIC function is a foundation for this capability.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Thu, 24 Nov 2022 09:12:38 +0000 (11:12 +0200)]
habanalabs: use dev_dbg() when hl_mmap_mem_buf_get() fails
As hl_mmap_mem_buf_get() is called also from IOCTLs which can have a
bad handle from user, modify the print for "no match to handle" to use
dev_dbg().
Calls to this function which are not dependent on user, already have an
error print when the function fails.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Wed, 23 Nov 2022 13:09:43 +0000 (15:09 +0200)]
habanalabs: don't allow user to destroy CB handle more than once
The refcount of a CB buffer is initialized when the user allocates a CB,
and is decreased when the user destroys the CB handle.
If this refcount is also increased from the kernel and the user sends
more than one destroy request for the handle, the buffer will be
released/freed and later accessed when the refcount is put from the
kernel side.
To avoid this, prevent the user from destroying the handle more than once.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Ofir Bitton [Thu, 24 Nov 2022 09:01:44 +0000 (11:01 +0200)]
habanalabs: don't notify user about clk throttling due to power
As clock throttling due to high power consumption can happen very
frequently and there is no real reason to notify the user about it,
we skip this notification in all asics.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Tue, 8 Nov 2022 12:34:43 +0000 (14:34 +0200)]
habanalabs: abort waiting user threads upon error
User should close the FD when being notified about an error, after
which a device reset takes place.
However, if the user has pending threads that wait for completions,
the device release won't be called and eventually the watchdog timeout
will expire, leading to hard reset and killing the user process.
To avoid it, abort such waiting threads right after the error
notification, and block following waiting operations.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Tomer Tayar [Sun, 6 Nov 2022 18:29:18 +0000 (20:29 +0200)]
habanalabs: remove releasing of user threads from device release
The device file is not in use when hl_device_release() is called,
and there aren't any user threads that use IOCTLs to wait for
interrupts. Therefore there is no need to release them at this point.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
farah kassabri [Sun, 13 Nov 2022 15:44:17 +0000 (17:44 +0200)]
habanalabs: read binning info from preboot
Sometimes we need the binning info at a very early stage of the
driver initialization. Therefore, support was added in preboot to
provide the binning info as part of the f/w descriptor and the driver
can now use that.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
tal albo [Wed, 16 Nov 2022 20:54:24 +0000 (22:54 +0200)]
habanalabs/gaudi2: fix BMON 3rd address range
Fix the programming of an incorrect address range value.
Signed-off-by: tal albo <talbo@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:15 +0000 (21:04 +0100)]
drm/fbdev-generic: Rename struct fb_info 'fbi' to 'info'
The generic fbdev emulation names variables of type struct fb_info
both 'fbi' and 'info'. The latter seems to be more common in fbdev
code, so name fbi accordingly.
Also replace the duplicate variable in drm_fbdev_fb_destroy().
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-11-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:14 +0000 (21:04 +0100)]
drm/fbdev-generic: Inline clean-up helpers into drm_fbdev_fb_destroy()
The fbdev framebuffer cleanup in drm_fbdev_fb_destroy() calls
drm_fbdev_release() and drm_fbdev_cleanup(). Inline both into the
caller. No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-10-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:13 +0000 (21:04 +0100)]
drm/fbdev-generic: Minimize client unregistering
For uninitialized framebuffers, only release the DRM client and
free the fbdev memory. Do not attempt to clean up the framebuffer.
DRM fbdev clients have a two-step initialization: first create
the DRM client; then create the framebuffer device on the first
successful hotplug event. In cases where the client never creates
the framebuffer, only the client state needs to be released. We
can detect which case it is, full or client-only cleanup, by
looking at the presence of fb_helper's info field.
v3:
* fix typo in commit message (Javier)
* release client before unpreparing fbdev
v2:
* remove test for (fbi != NULL) in drm_fbdev_cleanup() (Sam)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-9-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:12 +0000 (21:04 +0100)]
drm/fbdev-generic: Minimize hotplug error handling
Call drm_fb_helper_fini() in the generic-fbdev hotplug helper
to revert the effects of drm_fb_helper_init(). No full cleanup
is required.
v3:
* fix error in commit message (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-8-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:11 +0000 (21:04 +0100)]
drm/fb-helper: Initialize fb-helper's preferred BPP in prepare function
Initialize the fb-helper's preferred_bpp field early from within
drm_fb_helper_prepare(); instead of the later client hot-plugging
callback. This simplifies the generic fbdev setup function.
No real changes, but all drivers' fbdev code has to be adapted.
v3:
* build with CONFIG_DRM_FBDEV_EMULATION unset (kernel test bot)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-7-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:10 +0000 (21:04 +0100)]
drm/fb-helper: Remove preferred_bpp parameter from fbdev internals
Store the console's preferred BPP value in struct drm_fb_helper
and remove the respective function parameters from the internal
fbdev code.
The BPP value is only required as a fallback and will now always
be available in the fb-helper instance.
No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-6-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:09 +0000 (21:04 +0100)]
drm/fbdev-generic: Initialize fb-helper structure in generic setup
Initialize the fb-helper structure immediately after its allocation
in drm_fbdev_generic_setup(). That will make it easier to fill it with
driver-specific values, such as the preferred BPP.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-5-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:08 +0000 (21:04 +0100)]
drm/fb-helper: Introduce drm_fb_helper_unprepare()
Move the fb-helper clean-up code into drm_fb_helper_unprepare(). No
functional changes.
v2:
* declare as static inline (kernel test robot)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-4-tzimmermann@suse.de
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:07 +0000 (21:04 +0100)]
drm/client: Add hotplug_failed flag
Signal failed hotplugging with a flag in struct drm_client_dev. If set,
the client helpers will not further try to set up the fbdev display.
This used to be signalled with a combination of cleared pointers in
struct drm_fb_helper, which prevents us from initializing these pointers
early after allocation.
The change also harmonizes behavior among DRM clients. Additional DRM
clients will now handle failed hotplugging like fbdev does.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-3-tzimmermann@suse.de
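The sticky-flag behavior described in the commit above can be modeled in a few lines of userspace C. This is a sketch, not the DRM client code: struct client_sketch, failing_hotplug(), and client_hotplug() are hypothetical names, and the attempts counter exists only to make the effect observable.

```c
#include <stdbool.h>

/* Hypothetical, simplified client; not struct drm_client_dev. */
struct client_sketch {
    bool hotplug_failed;                           /* sticky after the first failure */
    int (*hotplug)(struct client_sketch *client);
    int attempts;                                  /* times the callback actually ran */
};

/* Example callback that always fails, standing in for a broken display setup. */
int failing_hotplug(struct client_sketch *client)
{
    client->attempts++;
    return -1;
}

/* Run the hotplug callback unless an earlier attempt already failed. */
void client_hotplug(struct client_sketch *client)
{
    if (!client->hotplug || client->hotplug_failed)
        return;
    if (client->hotplug(client))
        client->hotplug_failed = true;
}
```

Keeping the failure state in a dedicated flag means no other pointer has to be left cleared to signal it, which is what allows those pointers to be initialized early.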
Thomas Zimmermann [Wed, 25 Jan 2023 20:04:06 +0000 (21:04 +0100)]
drm/client: Test for connectors before sending hotplug event
Test for connectors in the client code and remove a similar test
from the generic fbdev emulation. Do nothing if the test fails.
Not having connectors indicates a driver bug.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125200415.14123-2-tzimmermann@suse.de
Jerome Brunet [Tue, 24 Jan 2023 10:11:57 +0000 (11:11 +0100)]
net: mdio-mux-meson-g12a: force internal PHY off on mux switch
Force the internal PHY off then on when switching to the internal path.
This fixes problems where the PHY ID is not properly set.
Fixes: 7090425104db ("net: phy: add amlogic g12a mdio mux support")
Suggested-by: Qi Duan <qi.duan@amlogic.com>
Co-developed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20230124101157.232234-1-jbrunet@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ivan Vecera [Tue, 24 Jan 2023 14:51:26 +0000 (15:51 +0100)]
docs: networking: Fix bridge documentation URL
Current documentation URL [1] is no longer valid.
[1] https://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20230124145127.189221-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gerhard Engleder [Tue, 24 Jan 2023 19:14:40 +0000 (20:14 +0100)]
tsnep: Fix TX queue stop/wake for multiple queues
netif_stop_queue() and netif_wake_queue() act on TX queue 0. This is ok
as long as only a single TX queue is supported. But support for multiple
TX queues was introduced with 762031375d5c, and I missed adapting the
stop and wake of TX queues.
Use netif_stop_subqueue() and netif_tx_wake_queue() to act on specific
TX queue.
Fixes: 762031375d5c ("tsnep: Support multiple TX/RX queue pairs")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/20230124191440.56887-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
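The difference between acting on queue 0 and acting on a specific TX queue can be sketched with a small userspace model. This is not the tsnep driver code or the netdev API: the types and functions below (netdev_model, stop_first_queue(), stop_subqueue(), wake_subqueue()) are hypothetical stand-ins that only mimic the per-queue "stopped" state.

```c
#include <stdbool.h>

#define NUM_TX_QUEUES 4 /* illustrative queue count, not taken from the driver */

/* Hypothetical model of per-queue state; not the kernel's netdev_queue. */
struct txq_model {
	bool stopped;
};

struct netdev_model {
	struct txq_model txq[NUM_TX_QUEUES];
};

/* Models the pre-fix pattern: always touches queue 0, whichever ring is full. */
static void stop_first_queue(struct netdev_model *dev)
{
	dev->txq[0].stopped = true;
}

/* Models the fixed pattern: stop only the queue whose ring is full. */
static void stop_subqueue(struct netdev_model *dev, unsigned int idx)
{
	if (idx < NUM_TX_QUEUES)
		dev->txq[idx].stopped = true;
}

/* Models the fixed pattern: wake only the queue that regained free space. */
static void wake_subqueue(struct netdev_model *dev, unsigned int idx)
{
	if (idx < NUM_TX_QUEUES)
		dev->txq[idx].stopped = false;
}
```

In this model, a full ring on queue 2 handled through stop_first_queue() would leave queue 2 running and stall queue 0 instead, which is the kind of mismatch the commit describes.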
David Christensen [Tue, 24 Jan 2023 18:53:39 +0000 (13:53 -0500)]
net/tg3: resolve deadlock in tg3_reset_task() during EEH
During EEH error injection testing, a deadlock was encountered in the tg3
driver when tg3_io_error_detected() was attempting to cancel outstanding
reset tasks:
crash> foreach UN bt
...
PID: 159 TASK: c0000000067c6000 CPU: 8 COMMAND: "eehd"
...
#5 [c00000000681f990] __cancel_work_timer at c00000000019fd18
#6 [c00000000681fa30] tg3_io_error_detected at c00800000295f098 [tg3]
#7 [c00000000681faf0] eeh_report_error at c00000000004e25c
...
PID: 290 TASK: c000000036e5f800 CPU: 6 COMMAND: "kworker/6:1"
...
#4 [c00000003721fbc0] rtnl_lock at c000000000c940d8
#5 [c00000003721fbe0] tg3_reset_task at c008000002969358 [tg3]
#6 [c00000003721fc60] process_one_work at c00000000019e5c4
...
PID: 296 TASK: c000000037a65800 CPU: 21 COMMAND: "kworker/21:1"
...
#4 [c000000037247bc0] rtnl_lock at c000000000c940d8
#5 [c000000037247be0] tg3_reset_task at c008000002969358 [tg3]
#6 [c000000037247c60] process_one_work at c00000000019e5c4
...
PID: 655 TASK: c000000036f49000 CPU: 16 COMMAND: "kworker/16:2"
...:1
#4 [c0000000373ebbc0] rtnl_lock at c000000000c940d8
#5 [c0000000373ebbe0] tg3_reset_task at c008000002969358 [tg3]
#6 [c0000000373ebc60] process_one_work at c00000000019e5c4
...
Code inspection shows that both tg3_io_error_detected() and
tg3_reset_task() attempt to acquire the RTNL lock at the beginning of
their code blocks. If tg3_reset_task() should happen to execute between
the times when tg3_io_error_detected() acquires the RTNL lock and
tg3_reset_task_cancel() is called, a deadlock will occur.
Moving tg3_reset_task_cancel() call earlier within the code block, prior
to acquiring RTNL, prevents this from happening, but also exposes another
deadlock issue where tg3_reset_task() may execute AFTER
tg3_io_error_detected() has executed:
crash> foreach UN bt
PID: 159 TASK: c0000000067d2000 CPU: 9 COMMAND: "eehd"
...
#4 [c000000006867a60] rtnl_lock at c000000000c940d8
#5 [c000000006867a80] tg3_io_slot_reset at c0080000026c2ea8 [tg3]
#6 [c000000006867b00] eeh_report_reset at c00000000004de88
...
PID: 363 TASK: c000000037564000 CPU: 6 COMMAND: "kworker/6:1"
...
#3 [c000000036c1bb70] msleep at c000000000259e6c
#4 [c000000036c1bba0] napi_disable at c000000000c6b848
#5 [c000000036c1bbe0] tg3_reset_task at c0080000026d942c [tg3]
#6 [c000000036c1bc60] process_one_work at c00000000019e5c4
...
This issue can be avoided by aborting tg3_reset_task() if EEH error
recovery is already in progress.
Fixes: db84bf43ef23 ("tg3: tg3_reset_task() needs to use rtnl_lock to synchronize")
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20230124185339.225806-1-drc@linux.vnet.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
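The "abort the reset task if recovery is already in progress" idea can be illustrated with a minimal single-threaded model. This is only a sketch under stated assumptions: nic_model, error_detected() and reset_task() are hypothetical names, RTNL locking is deliberately not modeled, and nothing here is taken from the tg3 sources.

```c
#include <stdbool.h>

/* Hypothetical model of a NIC's reset bookkeeping; not the tg3 structures. */
struct nic_model {
	bool recovery_in_progress; /* set by the error-recovery path */
	bool reset_pending;        /* a reset work item has been queued */
	int resets_done;           /* number of resets actually performed */
};

/* Error-recovery entry point: marks recovery as in progress. */
static void error_detected(struct nic_model *nic)
{
	nic->recovery_in_progress = true;
}

/*
 * Reset work item. Returns true if a reset was performed, false if it
 * backed off because the recovery path currently owns the device.
 */
static bool reset_task(struct nic_model *nic)
{
	if (nic->recovery_in_progress) {
		/* Drop the pending request instead of touching the device. */
		nic->reset_pending = false;
		return false;
	}

	nic->resets_done++;
	nic->reset_pending = false;
	return true;
}
```

The point of the early return is that a late-running work item never reaches the steps (such as disabling NAPI in the trace above) that would block while recovery is under way.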
Namjae Jeon [Tue, 24 Jan 2023 15:09:02 +0000 (00:09 +0900)]
ksmbd: downgrade ndr version error message to debug
When a user switches from samba to ksmbd, the following message flood appears
when accessing files. Samba seems to change the DOS attribute version to v5.
This patch downgrades the ndr version error message to debug.
$ dmesg
...
[68971.766914] ksmbd: v5 version is not supported
[68971.779808] ksmbd: v5 version is not supported
[68971.871544] ksmbd: v5 version is not supported
[68971.910135] ksmbd: v5 version is not supported
...
Cc: stable@vger.kernel.org
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Namjae Jeon [Tue, 24 Jan 2023 15:13:20 +0000 (00:13 +0900)]
ksmbd: limit pdu length size according to connection status
Stream protocol length will never be larger than 16KB until session setup.
After session setup, the size of requests will not be larger than
16KB + SMB2 MAX WRITE size. This patch limits these invalidly oversized
requests and closes the connection immediately.
Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers")
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18259
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
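The size rule described above (16KB before session setup, 16KB plus the maximum write size afterwards) can be sketched as a small check. This is an illustration only: the enum, the function name and the value chosen for ASSUMED_MAX_WRITE are hypothetical and are not the ksmbd definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constants for illustration; not the ksmbd definitions. */
#define PRE_SESSION_MAX_PDU (16u * 1024u)        /* 16KB until session setup */
#define ASSUMED_MAX_WRITE   (4u * 1024u * 1024u) /* assumed SMB2 max write size */

enum conn_state {
	CONN_NEED_SETUP,    /* session setup not completed yet */
	CONN_SESSION_READY, /* session setup completed */
};

/* Returns true if a request of pdu_len bytes is acceptable in this state. */
static bool pdu_len_ok(enum conn_state state, size_t pdu_len)
{
	size_t limit = PRE_SESSION_MAX_PDU;

	if (state == CONN_SESSION_READY)
		limit += ASSUMED_MAX_WRITE;

	return pdu_len <= limit;
}
```

A caller following the commit's description would close the connection as soon as pdu_len_ok() returns false, rather than allocating a buffer for the oversized request.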
Dan Williams [Sat, 21 Jan 2023 00:26:12 +0000 (16:26 -0800)]
cxl/pmem: Fix nvdimm unregistration when cxl_pmem driver is absent
The cxl_pmem.ko module houses the driver for both cxl_nvdimm_bridge
objects and cxl_nvdimm objects. When the core creates a cxl_nvdimm it
arranges for it to be autoremoved when the bridge goes down. However, if
the bridge never initialized because the cxl_pmem.ko module never
loaded, it sets up the following crash scenario:
BUG: kernel NULL pointer dereference, address: 0000000000000478
[..]
RIP: 0010:cxl_nvdimm_probe+0x99/0x140 [cxl_pmem]
[..]
Call Trace:
<TASK>
cxl_bus_probe+0x17/0x50 [cxl_core]
really_probe+0xde/0x380
__driver_probe_device+0x78/0x170
driver_probe_device+0x1f/0x90
__driver_attach+0xd2/0x1c0
bus_for_each_dev+0x79/0xc0
bus_add_driver+0x1b1/0x200
driver_register+0x89/0xe0
cxl_pmem_init+0x50/0xff0 [cxl_pmem]
It turns out the recent rework to simplify nvdimm probing obviated the
need to unregister cxl_nvdimm objects at cxl_nvdimm_bridge ->remove()
time. Leave the cxl_nvdimm device registered until the hosting
cxl_memdev departs. The alternative is that the cxl_memdev needs to be
reattached whenever the cxl_nvdimm_bridge attach state cycles, which is
awkward and unnecessary.
The only requirement is to make sure that when the cxl_nvdimm_bridge
goes away any dependent cxl_nvdimm objects are shut down. Handle that in
unregister_nvdimm_bus().
With these registration entanglements removed there is no longer a need
to pre-load the cxl_pmem module in cxl_acpi.
Fixes: cb9cfff82f6a ("cxl/acpi: Simplify cxl_nvdimm_bridge probing")
Reported-by: Gregory Price <gregory.price@memverge.com>
Debugged-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167426077263.3955046.9695309346988027311.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Deepak R Varma [Wed, 25 Jan 2023 15:07:14 +0000 (20:37 +0530)]
drm/nouveau/devinit: Convert function disable() to be void
The current design of callback function disable() of struct
nvkm_devinit_func is defined to return a u64 value. In its implementation
in the driver modules, the function always returns a fixed value 0. Hence
the design and implementation of this function should be enhanced to return
void instead of a fixed value. This change also eliminates untouched
return variables.
The change is identified using the returnvar.cocci Coccinelle semantic
patch script.
Signed-off-by: Deepak R Varma <drv@mailo.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y9FFoooIXjlr+UP1@ubun2204.myguest.virtualbox.org
Kees Cook [Fri, 6 Jan 2023 06:02:33 +0000 (22:02 -0800)]
bcache: Silence memcpy() run-time false positive warnings
struct bkey has internal padding in a union, but it isn't always named
the same (e.g. key ## _pad, key_p, etc). This makes it extremely hard
for the compiler to reason about the available size of copies done
against such keys. Use unsafe_memcpy() for now, to silence the many
run-time false positive warnings:
memcpy: detected field-spanning write (size 264) of single field "&i->j" at drivers/md/bcache/journal.c:152 (size 240)
memcpy: detected field-spanning write (size 24) of single field "&b->key" at drivers/md/bcache/btree.c:939 (size 16)
memcpy: detected field-spanning write (size 24) of single field "&temp.key" at drivers/md/bcache/extents.c:428 (size 16)
Reported-by: Alexandre Pereira <alexpereira@disroot.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216785
Acked-by: Coly Li <colyli@suse.de>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230106060229.never.047-kees@kernel.org
Kees Cook [Wed, 18 Jan 2023 20:21:35 +0000 (12:21 -0800)]
gcc-plugins: Reorganize gimple includes for GCC 13
The gimple-iterator.h header must be included before gimple-fold.h
starting with GCC 13. Reorganize gimple headers to work for all GCC
versions.
Reported-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/all/20230113173033.4380-1-palmer@rivosinc.com/
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Sat, 7 Jan 2023 03:47:05 +0000 (19:47 -0800)]
kunit: memcpy: Split slow memcpy tests into MEMCPY_SLOW_KUNIT_TEST
Since the long memcpy tests may stall a system for tens of seconds
in virtualized architecture environments, split those tests off under
CONFIG_MEMCPY_SLOW_KUNIT_TEST so they can be separately disabled.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/20221226195206.GA2626419@roeck-us.net
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Thomas Zimmermann [Wed, 25 Jan 2023 20:12:51 +0000 (21:12 +0100)]
Merge drm/drm-next into drm-misc-next
Backmerging to sync with other DRM trees.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Aurabindo Pillai [Wed, 11 Jan 2023 19:56:22 +0000 (14:56 -0500)]
drm/amd/display: Fix timing not changing when freesync video is enabled
[Why&How]
Switching between certain modes that are freesync video modes and those
that are not freesync video modes results in the timing not changing, as
seen by the monitor, due to incorrect timing being driven.
The issue is fixed by ensuring that when a non freesync video mode is
set, we reset the freesync status on the crtc.
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Alan Liu <HaoPing.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wayne Lin [Wed, 28 Dec 2022 06:50:43 +0000 (14:50 +0800)]
drm/display/dp_mst: Correct the kref of port.
[why & how]
We still need to refer to the port while removing the payload at commit_tail,
so we should keep the kref until then before releasing it.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 4d07b0bc4034 ("drm/display/dp_mst: Move all payload info into the atomic state")
Cc: stable@vger.kernel.org # 6.1
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Tested-by: Didier Raboud <odyx@debian.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wayne Lin [Mon, 12 Dec 2022 07:41:18 +0000 (15:41 +0800)]
drm/amdgpu/display/mst: update mst_mgr relevant variable when long HPD
[Why & How]
Now the vc_start_slot is controlled at drm side. When we
service a long HPD, we still need to run
dm_helpers_dp_mst_write_payload_allocation_table() to update
drm mst_mgr's relevant variable. Otherwise, on the next plug-in,
payload will get assigned with a wrong start slot.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 4d07b0bc4034 ("drm/display/dp_mst: Move all payload info into the atomic state")
Cc: stable@vger.kernel.org # 6.1
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Tested-by: Didier Raboud <odyx@debian.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wayne Lin [Fri, 9 Dec 2022 11:05:33 +0000 (19:05 +0800)]
drm/amdgpu/display/mst: limit payload to be updated one by one
[Why]
amdgpu expects to update the payload table for one stream at a time
by calling dm_helpers_dp_mst_write_payload_allocation_table().
Currently, it has been modified to try to update the HW payload table
all at once by referring to mst_state.
[How]
This is just a quick workaround. We should find a way to remove the
temporary struct dc_dp_mst_stream_allocation_table later if setting
struct link_mst_stream_allocatio directly is possible.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 4d07b0bc4034 ("drm/display/dp_mst: Move all payload info into the atomic state")
Cc: stable@vger.kernel.org # 6.1
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Tested-by: Didier Raboud <odyx@debian.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lyude Paul [Wed, 23 Nov 2022 19:50:16 +0000 (14:50 -0500)]
drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignments
Looks like I made a pretty big mistake here without noticing: it seems when
I moved the assignments of mst_state->pbn_div I completely missed the fact
that the reason for us calling drm_dp_mst_update_slots() earlier was to
account for the fact that we need to call this function using info from the
root MST connector, instead of just trying to do this from each MST
encoder's atomic check function. Otherwise, we end up filling out all of
DC's link information with zeroes.
So, let's restore that and hopefully fix this DSC regression.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2171
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 4d07b0bc4034 ("drm/display/dp_mst: Move all payload info into the atomic state")
Cc: stable@vger.kernel.org # 6.1
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Didier Raboud <odyx@debian.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Li Ma [Fri, 20 Jan 2023 07:41:22 +0000 (15:41 +0800)]
drm/amdgpu: declare firmware for new MES 11.0.4
To support new mes ip block
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Li Ma [Fri, 20 Jan 2023 07:38:33 +0000 (15:38 +0800)]
drm/amdgpu: enable imu firmware for GC 11.0.4
GC 11.0.4 needs to load the IMU firmware to power up GFX before loading the GFX firmware.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Fri, 20 Jan 2023 03:21:53 +0000 (11:21 +0800)]
drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0
Add SMU13.0.0 AllowIHInterrupt message mapping.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
Jonathan Kim [Thu, 19 Jan 2023 23:42:03 +0000 (18:42 -0500)]
drm/amdgpu: remove unconditional trap enable on add gfx11 queues
Rebase of driver has incorrect unconditional trap enablement
for GFX11 when adding mes queues.
Reported-by: Graham Sider <graham.sider@amd.com>
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Graham Sider <graham.sider@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
Linus Torvalds [Wed, 25 Jan 2023 17:15:15 +0000 (09:15 -0800)]
Merge tag 'fs.fuse.acl.v6.2-rc6' of git://git./linux/kernel/git/vfs/idmapping
Pull fuse ACL fix from Christian Brauner:
"The new posix acl API doesn't depend on the xattr handler
infrastructure anymore and instead only relies on the posix acl inode
operations. As a result daemons without FUSE_POSIX_ACL are unable to
use posix acls like they used to.
Fix this by copying what we did for overlayfs during the posix acl api
conversion. Make fuse implement a dedicated ->get_inode_acl() method
as does overlayfs. Fuse can then also use this to express different
needs for vfs permission checking during lookup and acl based
retrieval via the regular system call path.
This allows fuse to continue to refuse retrieving posix acls for
daemons that don't set FUSE_POSIX_ACL for permission checking while
also allowing a fuse server to retrieve it via the usual system calls"
* tag 'fs.fuse.acl.v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fuse: fixes after adapting to new posix acl api
Doug Smythies [Sat, 21 Jan 2023 16:41:35 +0000 (08:41 -0800)]
selftests: amd-pstate: Don't delete source files via Makefile
Revert the portion of a recent Makefile change that incorrectly
deletes source files when doing "make clean".
Fixes: ba2d788aa873 ("selftests: amd-pstate: Trigger tbench benchmark and test cpus")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Christian König [Wed, 25 Jan 2023 15:49:02 +0000 (16:49 +0100)]
drm/ttm: revert "stop allocating dummy resources during BO creation"
This reverts commit 00984ad39599bb2a1e6ec5d4e9c75a749f7f45c9.
It seems to still break i915.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125155023.105584-1-christian.koenig@amd.com
Christian König [Wed, 25 Jan 2023 15:48:36 +0000 (16:48 +0100)]
drm/ttm: revert "stop allocating a dummy resource for pipelined gutting"
This reverts commit 4110872b8115aab2adb3a52149c144d8465440de.
This still seems to break i915.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125155023.105584-1-christian.koenig@amd.com
Christian König [Wed, 25 Jan 2023 16:14:37 +0000 (17:14 +0100)]
drm/ttm: revert "prevent moving of pinned BOs"
This reverts commit b49323aa35d502b0d9a7950327f30a1a52eae534.
This still seems to break i915.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125155023.105584-1-christian.koenig@amd.com
David Howells [Wed, 25 Jan 2023 14:02:13 +0000 (14:02 +0000)]
cifs: Fix oops due to uncleared server->smbd_conn in reconnect
In smbd_destroy(), clear the server->smbd_conn pointer after freeing the
smbd_connection struct that it points to so that reconnection doesn't get
confused.
Fixes: 8ef130f9ec27 ("CIFS: SMBD: Implement function to destroy a SMB Direct connection")
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Long Li <longli@microsoft.com>
Cc: Pavel Shilovsky <piastryyy@gmail.com>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
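The "clear the pointer after freeing what it points to" pattern can be sketched in plain C. The structures and functions below (server_model, conn_model, destroy_conn(), reconnect()) are hypothetical userspace stand-ins and do not reproduce the cifs or smbd code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the connection and its owning server object. */
struct conn_model {
	int unused;
};

struct server_model {
	struct conn_model *conn; /* NULL means "no connection" */
};

/* Tear down the connection and leave no dangling pointer behind. */
static void destroy_conn(struct server_model *server)
{
	if (!server->conn)
		return;

	free(server->conn);
	server->conn = NULL; /* without this, later code would see a stale pointer */
}

/* Reconnect: refuses to proceed if a (possibly stale) connection is still set. */
static int reconnect(struct server_model *server)
{
	if (server->conn)
		return -1;

	server->conn = calloc(1, sizeof(*server->conn));
	return server->conn ? 0 : -1;
}
```

With the pointer reset to NULL, a second destroy_conn() call is a no-op and reconnect() can tell "torn down" apart from "still connected", which is the confusion the commit message refers to.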
Masami Hiramatsu (Google) [Thu, 19 Jan 2023 23:36:24 +0000 (08:36 +0900)]
bootconfig: Update MAINTAINERS file to add tree and mailing list
Since the bootconfig related changes will be handled on linux-trace
tree, add the tree and mailing lists for EXTRA BOOT CONFIG.
Link: https://lkml.kernel.org/r/167417138436.2333752.6988808113120359923.stgit@devnote3
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Colin Ian King [Mon, 16 Jan 2023 16:16:12 +0000 (16:16 +0000)]
rv: remove redundant initialization of pointer ptr
The pointer ptr is being initialized with a value that is never read;
it is updated later by a call to strim(). Remove the extraneous
initialization.
Link: https://lkml.kernel.org/r/20230116161612.77192-1-colin.i.king@gmail.com
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Mark Rutland [Tue, 3 Jan 2023 12:49:10 +0000 (12:49 +0000)]
ftrace: Maintain samples/ftrace
There's no entry in MAINTAINERS for samples/ftrace. Add one so that the
FTRACE maintainers are kept in the loop.
Link: https://lkml.kernel.org/r/20230103124912.2948963-2-mark.rutland@arm.com
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Randy Dunlap [Sun, 8 Jan 2023 02:12:38 +0000 (18:12 -0800)]
tracing/filter: fix kernel-doc warnings
Use the 'struct' keyword for a struct's kernel-doc notation and
use the correct function parameter name to eliminate kernel-doc
warnings:
kernel/trace/trace_events_filter.c:136: warning: cannot understand function prototype: 'struct prog_entry '
kernel/trace/trace_events_filter.c:155: warning: Excess function parameter 'when_to_branch' description in 'update_preds'
Also correct some trivial punctuation problems.
Link: https://lkml.kernel.org/r/20230108021238.16398-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Ley Foon Tan [Thu, 5 Jan 2023 03:37:05 +0000 (11:37 +0800)]
riscv: Move call to init_cpu_topology() to later initialization stage
If "capacity-dmips-mhz" is present in a CPU DT node,
topology_parse_cpu_capacity() will fail to allocate memory. arm64, with
which this code path is shared, does not call
topology_parse_cpu_capacity() until later in boot where memory
allocation is available. While "capacity-dmips-mhz" is not yet a valid
property on RISC-V, invalid properties should be ignored rather than
cause issues. Move init_cpu_topology(), which calls
topology_parse_cpu_capacity(), to a later initialization stage, to match
arm64.
As a side effect of this change, RISC-V is "protected" from changes to
core topology code that would work on arm64 where memory allocation is
safe but on RISC-V isn't.
Fixes: 03f11f03dbfe ("RISC-V: Parse cpu topology during boot.")
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
Link: https://lore.kernel.org/r/20230105033705.3946130-1-leyfoon.tan@starfivetech.com
[Palmer: use Conor's commit text]
Link: https://lore.kernel.org/linux-riscv/20230104183033.755668-1-pierre.gondois@arm.com/T/#me592d4c8b9508642954839f0077288a353b0b9b2
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Rafael J. Wysocki [Wed, 25 Jan 2023 12:17:42 +0000 (13:17 +0100)]
thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()
In order to prevent int340x_thermal_get_trip_type() from possibly
racing with int340x_thermal_read_trips() invoked by int3403_notify()
add locking to it in analogy with int340x_thermal_get_trip_temp().
Fixes: 6757a7abe47b ("thermal: intel: int340x: Protect trip temperature from concurrent updates")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jani Nikula [Wed, 18 Jan 2023 15:18:00 +0000 (17:18 +0200)]
drm/i915/params: use generics for parameter debugfs file creation
Replace the __builtin_strcmp() if ladder with generics.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230118151800.3669913-4-jani.nikula@intel.com
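As a rough illustration of replacing a string-comparison ladder on type names with generics, the sketch below dispatches on the C type of a value using C11 _Generic. The formatter functions and the fmt_param() macro are hypothetical and are not the i915 parameter helpers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-type formatters; names are illustrative only. */
static int fmt_bool(char *buf, size_t n, const char *name, bool v)
{
	return snprintf(buf, n, "%s=%s", name, v ? "yes" : "no");
}

static int fmt_int(char *buf, size_t n, const char *name, int v)
{
	return snprintf(buf, n, "%s=%d", name, v);
}

static int fmt_uint(char *buf, size_t n, const char *name, unsigned int v)
{
	return snprintf(buf, n, "%s=%u", name, v);
}

static int fmt_str(char *buf, size_t n, const char *name, const char *v)
{
	return snprintf(buf, n, "%s=%s", name, v);
}

/*
 * The compiler picks the formatter from the static type of val, so no
 * run-time comparison of type-name strings is needed.
 */
#define fmt_param(buf, n, name, val)          \
	_Generic((val),                       \
		 bool: fmt_bool,              \
		 int: fmt_int,                \
		 unsigned int: fmt_uint,      \
		 char *: fmt_str,             \
		 const char *: fmt_str)(buf, n, name, val)
```

One design consequence: a type without an association fails to compile, whereas a strcmp()-style ladder only notices an unhandled type when the unmatched branch is reached.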
Jani Nikula [Wed, 18 Jan 2023 15:17:59 +0000 (17:17 +0200)]
drm/i915/params: use generics for parameter free
Replace the __builtin_strcmp() if ladder with generics.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230118151800.3669913-3-jani.nikula@intel.com
Jani Nikula [Wed, 18 Jan 2023 15:17:58 +0000 (17:17 +0200)]
drm/i915/params: use generics for parameter dup
Replace the __builtin_strcmp() if ladder with generics.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230118151800.3669913-2-jani.nikula@intel.com
Jani Nikula [Wed, 18 Jan 2023 15:17:57 +0000 (17:17 +0200)]
drm/i915/params: use generics for parameter printing
Replace the __builtin_strcmp() if ladder with generics.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230118151800.3669913-1-jani.nikula@intel.com
Colin Ian King [Fri, 20 Jan 2023 09:28:42 +0000 (09:28 +0000)]
accel/ivpu: Fix spelling mistake "tansition" -> "transition"
There are spelling mistakes in two ivpu_err error messages. Fix them.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230120092842.79238-1-colin.i.king@gmail.com
David S. Miller [Wed, 25 Jan 2023 13:07:38 +0000 (13:07 +0000)]
Merge branch 'mptcp-fixes'
Jeremy Kerr says:
====================
net: mctp: struct sock lifetime fixes
This series is a set of fixes for the sock lifetime handling in the
AF_MCTP code, fixing a uaf reported by Noam Rathaus
<noamr@ssd-disclosure.com>.
The Fixes: tags indicate the original patches affected, but some
tweaking to backport to those commits may be needed; I have a separate
branch with backports to 5.15 if that helps with stable trees.
Of course, any comments/queries most welcome.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Tue, 24 Jan 2023 02:01:06 +0000 (10:01 +0800)]
net: mctp: mark socks as dead on unhash, prevent re-add
Once a socket has been unhashed, we want to prevent it from being
re-used in a sk_key entry as part of a routing operation.
This change marks the sk as SOCK_DEAD on unhash, which prevents addition
into the net's key list.
We need to do this during the key add path, rather than key lookup, as
we release the net keys_lock between those operations.
Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>