Matt Roper [Thu, 18 Aug 2022 23:41:45 +0000 (16:41 -0700)]
drm/i915/mtl: Don't mask off CCS according to DSS fusing
Unlike the Xe_HP platforms, MTL only has a single CCS engine; the
quad-based engine masking logic does not apply to this platform (or
presumably any future platforms that only have 0 or 1 CCS).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818234202.451742-5-radhakrishna.sripada@intel.com
Matt Roper [Thu, 18 Aug 2022 23:41:44 +0000 (16:41 -0700)]
drm/i915/mtl: MMIO range is now 4MB
Previously only dgfx platforms had a 4MB MMIO range, but starting with
MTL we now use the larger range for all platforms.
Bspec: 63834, 63830
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818234202.451742-4-radhakrishna.sripada@intel.com
Radhakrishna Sripada [Wed, 17 Aug 2022 22:43:04 +0000 (15:43 -0700)]
drm/i915: Skip Bit12 fw domain reset for gen12+
Bit12 of the Forcewake request register should not be cleared post
gen12. Do not touch this bit while clearing during fw domain reset.
v2: Tweak the comment to drop older platforms(MattR)
Bspec: 52542
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817224304.255767-1-radhakrishna.sripada@intel.com
Vinay Belgaumkar [Sat, 20 Aug 2022 01:08:32 +0000 (18:08 -0700)]
drm/i915/guc/slpc: Allow SLPC to use efficient frequency
Host Turbo operates at efficient frequency when GT is not idle unless
the user or workload has forced it to a higher level. Replicate the same
behavior in SLPC by allowing the algorithm to use efficient frequency.
We had disabled it during boot due to concerns that it might break
kernel ABI for min frequency. However, this is not the case since
SLPC will still abide by the (min,max) range limits.
With this change, min freq will be at efficient frequency level at init
instead of fused min (RPn). If user chooses to reduce min freq below the
efficient freq, we will turn off usage of efficient frequency and honor
the user request. When a higher value is written, it will get toggled
back again.
The patch also corrects the register which needs to be read for obtaining
the correct efficient frequency for Gen9+.
We see much better perf numbers with benchmarks like glmark2 with
efficient frequency usage enabled as expected.
v2: Address review comments (Rodrigo)
v3: with efficient frequency being dynamic, it is possible that the req
frequency may go beyond max freq. This will cause SLPC selftests to fail.
Add a FIXME there to start the test with [RPn, RP0] instead and restore
it afterwards.
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/5468
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220820010832.15350-1-vinay.belgaumkar@intel.com
Jani Nikula [Fri, 19 Aug 2022 12:02:34 +0000 (15:02 +0300)]
drm/i915/guc: remove runtime info printing from time stamp logging
Commit
368d179adbac ("drm/i915/guc: Add GuC <-> kernel time stamp
translation information") added intel_device_info_print_runtime() in the
time info dump for no obvious reason or explanation in the commit
message. It only logs the rawclk freq. Remove it.
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b395ac4c909042f5daabf29959d8733993545aa2.1660910433.git.jani.nikula@intel.com
Matthew Auld [Fri, 19 Aug 2022 12:39:04 +0000 (13:39 +0100)]
Revert "drm/i915/guc: Add delay to disable scheduling after pin count goes to zero"
This reverts commit
6a079903847cce1dd06345127d2a32f26d2cd9c6.
Everything in CI using GuC is now timing out[1], and killing the machine
with this change (perhaps a deadlock?). CI was recently on fire due to
some changes coming in from -rc1, so likely the pre-merge CI results for
this series were invalid? For now just revert, unless GuC experts
already have a fix in mind.
[1] https://intel-gfx-ci.01.org/tree/drm-tip/index.html?
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220819123904.913750-1-matthew.auld@intel.com
Matthew Brost [Wed, 17 Aug 2022 02:05:11 +0000 (19:05 -0700)]
drm/i915/guc: Add delay to disable scheduling after pin count goes to zero
Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a costly operation as it requires synchronizing with
the GuC. So the idea is that a delay allows the user to resubmit
something before doing this operation. This delay is only done if
the context isn't closed and less than a given threshold
(default is 3/4) of the guc_ids are in use.
As temporary WA disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.
Alan Previn: Matt Brost first introduced this series back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.
Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means the totality of all 36 containers and their
workloads were not saturating the engines to their max (in order to
maintain just enough headrooom to meet the min fps and max latencies
of incoming container submissions).
Problem statement: It was observed that the CPU core processing the i915
soft IRQ work was experiencing severe load. Using tracelogs and an
instrumentation patch to count specific i915 IRQ events, it was confirmed
that the majority of the CPU cycles were caused by the
gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context goes idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.
Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.
Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-3-alan.previn.teres.alexis@intel.com
Matthew Brost [Wed, 17 Aug 2022 02:05:10 +0000 (19:05 -0700)]
drm/i915/selftests: Use correct selfest calls for live tests
This will help in an upcoming patch where the live selftest wrappers
are extended to do more.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <john.c.harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-2-alan.previn.teres.alexis@intel.com
Daniele Ceraolo Spurio [Thu, 11 Aug 2022 21:08:12 +0000 (14:08 -0700)]
drm/i915/guc: clear stalled request after a reset
If the GuC CTs are full and we need to stall the request submission
while waiting for space, we save the stalled request and where the stall
occurred; when the CTs have space again we pick up the request submission
from where we left off.
If a full GT reset occurs, the state of all contexts is cleared and all
non-guilty requests are unsubmitted, therefore we need to restart the
stalled request submission from scratch. To make sure that we do so,
clear the saved request after a reset.
Fixes note: the patch that introduced the bug is in 5.15, but no
officially supported platform had GuC submission enabled by default
in that kernel, so the backport to that particular version (and only
that one) can potentially be skipped.
Fixes:
925dc1cf58ed ("drm/i915/guc: Implement GuC submission tasklet")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220811210812.3239621-1-daniele.ceraolospurio@intel.com
Daniele Ceraolo Spurio [Fri, 8 Jul 2022 22:41:58 +0000 (15:41 -0700)]
drm/i915/guc: skip scrub_ctbs selftest if reset is disabled
The test needs GT reset to trigger the scrubbing logic, so we can only
run it when reset is enabled.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708224158.929327-1-daniele.ceraolospurio@intel.com
John Harrison [Thu, 28 Jul 2022 02:20:28 +0000 (19:20 -0700)]
drm/i915/guc: Reduce spam from error capture
Some debug code got left in when the GuC based register save for error
capture was added. Remove that.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-8-John.C.Harrison@Intel.com
John Harrison [Thu, 28 Jul 2022 02:20:27 +0000 (19:20 -0700)]
drm/i915/guc: Make GuC log sizes runtime configurable
The GuC log buffer sizes had to be configured statically at compile
time. This can be quite troublesome when needing to get larger logs
out of a released driver. So re-organise the code to allow a boot time
module parameter override.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-7-John.C.Harrison@Intel.com
Chris Wilson [Thu, 28 Jul 2022 02:20:26 +0000 (19:20 -0700)]
drm/i915/guc: Use streaming loads to speed up dumping the guc log
Use a temporary page and mempy_from_wc to reduce the time it takes to
dump the guc log to debugfs.
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-6-John.C.Harrison@Intel.com
John Harrison [Thu, 28 Jul 2022 02:20:25 +0000 (19:20 -0700)]
drm/i915/guc: Record CTB info in error logs
When debugging GuC communication issues, it is useful to have the CTB
info available. So add the state and buffer contents to the error
capture log.
Also, add a sub-structure for the GuC specific error capture info as
it is now becoming numerous.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-5-John.C.Harrison@Intel.com
John Harrison [Thu, 28 Jul 2022 02:20:24 +0000 (19:20 -0700)]
drm/i915/guc: Add GuC <-> kernel time stamp translation information
It is useful to be able to match GuC events to kernel events when
looking at the GuC log. That requires being able to convert GuC
timestamps to kernel time. So, when dumping error captures and/or GuC
logs, include a stamp in both time zones plus the clock frequency.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-4-John.C.Harrison@Intel.com
John Harrison [Thu, 28 Jul 2022 02:20:23 +0000 (19:20 -0700)]
drm/i915/guc: Fix capture size warning and bump the size
There was a size check to warn if the GuC error state capture buffer
allocation would be too small to fit a reasonable amount of capture
data for the current platform. Unfortunately, the test was done too
early in the boot sequence and was actually testing 'if(-ENODEV >
size)'.
Move the check to be later. The check is only used to print a warning
message, so it doesn't really matter how early or late it is done.
Note that it is not possible to dynamically size the buffer because
the allocation needs to be done before the engine information is
available (at least, it would be in the intended two-phase GuC init
process).
Now that the check works, it is reporting size too small for newer
platforms. The check includes a 3x oversample multiplier to allow for
multiple error captures to be bufferd by GuC before i915 has a chance
to read them out. This is less important than simply being big enough
to fit the first capture.
So a) bump the default size to be large enough for one capture minimum
and b) make the warning only if one capture won't fit, instead use a
notice for the 3x size.
Note that the size estimate is a worst case scenario. Actual captures
will likely be smaller.
Lastly, use drm_warn istead of DRM_WARN as the former provides more
infmration and the latter is deprecated.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-3-John.C.Harrison@Intel.com
Alan Previn [Thu, 28 Jul 2022 02:20:22 +0000 (19:20 -0700)]
drm/i915/guc: Add a helper for log buffer size
Add a helper to get GuC log buffer size.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-2-John.C.Harrison@Intel.com
Matt Roper [Tue, 16 Aug 2022 21:06:01 +0000 (14:06 -0700)]
drm/i915/dg2: Add additional tuning settings
Some additional MMIO tuning settings have appeared in the bspec's
performance tuning guide section.
One of the tuning settings here is also documented as formal workaround
Wa_22012654132 for some steppings of DG2. However the tuning setting
applies to all DG2 variants and steppings, making it a superset of the
workaround.
v2:
- Move DRAW_WATERMARK to engine workaround section. It only moves into
the engine context on future platforms. (Lucas)
- CHICKEN_RASTER_2 needs to be handled as a masked register. (Lucas)
Bspec: 68331
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220816210601.2041572-2-matthew.d.roper@intel.com
Matt Roper [Tue, 16 Aug 2022 21:06:00 +0000 (14:06 -0700)]
drm/i915/gt: Add dedicated function for non-ctx register tuning settings
The bspec performance tuning section gives recommended settings that the
driver should program for various MMIO registers. Although these
settings aren't "workarounds" we use the workaround infrastructure to do
this programming to make sure it is handled at the appropriate places
and doesn't conflict with any real workarounds.
Since more of these are starting to show up on recent platforms, it's a
good time to create a dedicated function to hold them so that there's
less ambiguity about how/where to implement new ones.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220816210601.2041572-1-matthew.d.roper@intel.com
Matthew Auld [Fri, 5 Aug 2022 13:22:40 +0000 (14:22 +0100)]
drm/i915/ttm: fix CCS handling
Crucible + recent Mesa seems to sometimes hit:
GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER)
And it looks like we can also trigger this with gem_lmem_swapping, if we
modify the test to use slightly larger object sizes.
Looking closer it looks like we have the following issues in
migrate_copy():
- We are using plain integer in various places, which we can easily
overflow with a large object.
- We pass the entire object size (when the src is lmem) into
emit_pte() and then try to copy it, which doesn't work, since we
only have a few fixed sized windows in which to map the pages and
perform the copy. With an object > 8M we therefore aren't properly
copying the pages. And then with an object > 64M we trigger the
GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER).
So it looks like our copy handling for any object > 8M (which is our
CHUNK_SZ) is currently broken on DG2.
Fixes:
da0595ae91da ("drm/i915/migrate: Evict and restore the flatccs capable lmem obj")
Testcase: igt@gem_lmem_swapping
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C<ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220805132240.442747-2-matthew.auld@intel.com
Matthew Auld [Fri, 5 Aug 2022 13:22:39 +0000 (14:22 +0100)]
drm/i915/ttm: remove calc_ctrl_surf_instr_size
We only ever need to emit one ccs block copy command.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C<ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220805132240.442747-1-matthew.auld@intel.com
Mauro Carvalho Chehab [Thu, 4 Aug 2022 07:37:22 +0000 (09:37 +0200)]
drm/i915: pass a pointer for tlb seqno at vma_invalidate_tlb()
WRITE_ONCE() should happen at the original var, not on a local
copy of it.
Cc: stable@vger.kernel.org
Fixes:
5d36acb7198b ("drm/i915/gt: Batch TLB invalidations")
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[added cc-stable while merging it]
Link: https://patchwork.freedesktop.org/patch/msgid/f9550e6bacea10131ff40dd8981b69eb9251cdcd.1659598090.git.mchehab@kernel.org
Chris Wilson [Tue, 26 Jul 2022 14:48:44 +0000 (16:48 +0200)]
drm/i915/gem: Remove shared locking on freeing objects
The obj->base.resv may be shared across many objects, some of which may
still be live and locked, preventing objects from being freed
indefintely. We could individualise the lock during the free, or rely on
a freed object having no contention and being able to immediately free
the pages it owns.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/6469
Fixes:
be7612fd6665 ("drm/i915: Require object lock when freeing pages during destruction")
Fixes:
6cb12fbda1c2 ("drm/i915: Use trylock instead of blocking lock for __i915_gem_free_objects.")
Cc: <stable@vger.kernel.org> # v5.17+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220726144844.18429-1-nirmoy.das@intel.com
Harish Chegondi [Mon, 1 Aug 2022 21:38:39 +0000 (14:38 -0700)]
drm/i915/dg2: Add Wa_1509727124
Bspec: 46052
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220801213839.8549-1-harish.chegondi@intel.com
John Harrison [Thu, 28 Jul 2022 23:07:22 +0000 (16:07 -0700)]
drm/i915/dg2: Update DG2 to GuC v70.4.1
New release of GuC with a bunch of fixes specific to DG2. Some of
these require follow up i915 changes to enable.
Note also that it is not necessary to maintain backwards compatibility
with 70.1.2 for DG2 because DG2 is still under force probe protection.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728230722.2749701-2-John.C.Harrison@Intel.com
Daniele Ceraolo Spurio [Thu, 28 Jul 2022 00:33:39 +0000 (17:33 -0700)]
drm/i915/guc: Don't send policy update for child contexts.
The GuC FW applies the parent context policy to all the children,
so individual updates to the children are not supported and we
should not send them.
Note that sending the message did not have any functional consequences,
because the GuC just drops it and logs an error; since we were trying
to set the child policy to match the parent anyway the message being
dropped was not a problem.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728003339.2361010-1-daniele.ceraolospurio@intel.com
John Harrison [Thu, 28 Jul 2022 02:42:25 +0000 (19:42 -0700)]
drm/i915/guc: Don't abort on CTB_UNUSED status
When the KMD sends a CLIENT_RESET request to GuC (as part of the
suspend sequence), GuC will mark the CTB buffer as 'UNUSED'. If the
KMD then checked the CTB queue, it would see a non-zero status value
and report the buffer as corrupted.
Technically, no G2H messages should be received once the CLIENT_RESET
has been sent. However, if a context was outstanding on an engine then
it would get reset and a reset notification would be sent. So, don't
actually treat UNUSED as a catastrophic error. Just flag it up as
unexpected and keep going.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728024225.2363663-7-John.C.Harrison@Intel.com
Matthew Brost [Thu, 28 Jul 2022 02:42:24 +0000 (19:42 -0700)]
drm/i915/guc: Support larger contexts on newer hardware
The GuC needs a copy of a golden context for implementing watchdog
resets (aka media resets). This context is larger on newer platforms.
So adjust the size being allocated/copied accordingly.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728024225.2363663-6-John.C.Harrison@Intel.com
John Harrison [Thu, 28 Jul 2022 02:42:23 +0000 (19:42 -0700)]
drm/i915/selftest: Cope with not having an RCS engine
It is no longer guaranteed that there will always be an RCS engine.
So, use the helper function for finding the first available engine that
can be used for general purpose selftets.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728024225.2363663-5-John.C.Harrison@Intel.com
Rahul Kumar Singh [Thu, 28 Jul 2022 18:26:16 +0000 (11:26 -0700)]
drm/i915/guc: Add selftest for a hung GuC
Add a test to check that the hangcheck will recover from a submission
hang in the GuC.
Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728182616.2417491-1-John.C.Harrison@Intel.com
Matthew Brost [Thu, 28 Jul 2022 02:42:21 +0000 (19:42 -0700)]
drm/i915/guc: Fix issues with live_preempt_cancel
Having semaphores results in different behavior when a dependent request
is cancelled. In the case of semaphores the request could be on the HW
and complete successfully while without the request is held in the
driver and the error from the dependent request is propagated. Fix
live_preempt_cancel to take this behavior into account.
Also update live_preempt_cancel to use new function intel_context_ban
rather than intel_context_set_banned.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728024225.2363663-3-John.C.Harrison@Intel.com
Michał Winiarski [Thu, 28 Jul 2022 02:42:20 +0000 (19:42 -0700)]
drm/i915/guc: Route semaphores to GuC for Gen12+
In GuC submission mode, there is an option to use auto-switch out
semaphores and have GuC auto-switch in a waiting context. This
requires routing the semaphore interrupt to GuC.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728024225.2363663-2-John.C.Harrison@Intel.com
Zhanjun Dong [Fri, 15 Jul 2022 21:13:13 +0000 (14:13 -0700)]
drm/i915/guc: Check for ct enabled while waiting for response
We are seeing error message of "No response for request". Some cases
happened while waiting for response and reset/suspend action was triggered.
In this case, no response is not an error, active requests will be
cancelled.
This patch will handle this condition and change the error message into
debug message.
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220715211313.143645-1-zhanjun.dong@intel.com
Mauro Carvalho Chehab [Wed, 27 Jul 2022 12:29:56 +0000 (14:29 +0200)]
drm/i915/gt: describe the new tlb parameter at i915_vma_resource
TLB cache invalidation can happen on two different situations:
1. synchronously, at __vma_put_pages();
2. asynchronously.
On the first case, TLB cache invalidation happens inside
__vma_put_pages(). So, no need to do it later on.
However, on the second case, the pages will keep in memory
until __i915_vma_evict() is called.
So, we need to store the TLB data at struct i915_vma_resource,
in order to do a TLB cache invalidation before allowing
userspace to re-use the same memory.
So, i915_vma_resource_unbind() has gained a new parameter
in order to store the TLB data at the second case.
Document it.
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/aa55eef7e63b8f3d0f69b525db2dd2eb87e9db6b.1658924372.git.mchehab@kernel.org
Chris Wilson [Wed, 27 Jul 2022 12:29:55 +0000 (14:29 +0200)]
drm/i915/gt: Batch TLB invalidations
Invalidate TLB in batches, in order to reduce performance regressions.
Currently, every caller performs a full barrier around a TLB
invalidation, ignoring all other invalidations that may have already
removed their PTEs from the cache. As this is a synchronous operation
and can be quite slow, we cause multiple threads to contend on the TLB
invalidate mutex blocking userspace.
We only need to invalidate the TLB once after replacing our PTE to
ensure that there is no possible continued access to the physical
address before releasing our pages. By tracking a seqno for each full
TLB invalidate we can quickly determine if one has been performed since
rewriting the PTE, and only if necessary trigger one for ourselves.
That helps to reduce the performance regression introduced by TLB
invalidate logic.
[mchehab: rebased to not require moving the code to a separate file]
Cc: stable@vger.kernel.org
Fixes:
7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org
Chris Wilson [Wed, 27 Jul 2022 12:29:54 +0000 (14:29 +0200)]
drm/i915/gt: Skip TLB invalidations once wedged
Skip all further TLB invalidations once the device is wedged and
had been reset, as, on such cases, it can no longer process instructions
on the GPU and the user no longer has access to the TLB's in each engine.
So, an attempt to do a TLB cache invalidation will produce a timeout.
That helps to reduce the performance regression introduced by TLB
invalidate logic.
Cc: stable@vger.kernel.org
Fixes:
7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/5aa86564b9ec5fe7fe605c1dd7de76855401ed73.1658924372.git.mchehab@kernel.org
Chris Wilson [Wed, 27 Jul 2022 12:29:53 +0000 (14:29 +0200)]
drm/i915/gt: Invalidate TLB of the OA unit at TLB invalidations
Ensure that the TLB of the OA unit is also invalidated
on gen12 HW, as just invalidating the TLB of an engine is not
enough.
Cc: stable@vger.kernel.org
Fixes:
7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/59724d9f5cf1e93b1620d01b8332ac991555283d.1658924372.git.mchehab@kernel.org
Mauro Carvalho Chehab [Wed, 27 Jul 2022 12:29:52 +0000 (14:29 +0200)]
drm/i915/gt: document with_intel_gt_pm_if_awake()
Add a kernel-doc markup to document this new macro.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b974905bd0f6b5308b91561cc85eeecd94f1452a.1658924372.git.mchehab@kernel.org
Chris Wilson [Wed, 27 Jul 2022 12:29:51 +0000 (14:29 +0200)]
drm/i915/gt: Ignore TLB invalidations on idle engines
Check if the device is powered down prior to any engine activity,
as, on such cases, all the TLBs were already invalidated, so an
explicit TLB invalidation is not needed, thus reducing the
performance regression impact due to it.
This becomes more significant with GuC, as it can only do so when
the connection to the GuC is awake.
Cc: stable@vger.kernel.org
Fixes:
7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/278a57a672edac75683f0818b292e95da583a5fe.1658924372.git.mchehab@kernel.org
Matthew Auld [Wed, 27 Jul 2022 16:43:46 +0000 (17:43 +0100)]
drm/i915/ttm: don't leak the ccs state
The kernel only manages the ccs state with lmem-only objects, however
the kernel should still take care not to leak the CCS state from the
previous user.
Fixes:
48760ffe923a ("drm/i915/gt: Clear compress metadata for Flat-ccs objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220727164346.282407-1-matthew.auld@intel.com
Nirmoy Das [Wed, 27 Jul 2022 17:33:06 +0000 (19:33 +0200)]
drm/i915: disable pci resize on 32-bit machine
PCI bar resize only works with 64 bit BAR so disable
this on 32-bit machine and resolve below compilation error:
drivers/gpu/drm/i915/gt/intel_region_lmem.c:94:23: error: result of
comparison of constant
4294967296 with expression of type
'resource_size_t' (aka 'unsigned int') is always false
[-Werror,-Wtautological-constant-out-of-range-compare]
root_res->start > 0x100000000ull)
Fixes:
a91d1a17cd341 ("drm/i915: Add support for LMEM PCIe resizable bar")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220727173306.16247-1-nirmoy.das@intel.com
Chris Wilson [Wed, 27 Jul 2022 17:40:23 +0000 (19:40 +0200)]
drm/i915: Suppress oom warning for shmemfs object allocation failure
We report object allocation failures to userspace with ENOMEM, yet we
still show the memory warning after failing to shrink device allocated
pages. While this warning is similar to other system page allocation
failures, it is superfluous to the ENOMEM provided directly to
userspace.
v2: Add NOWARN in few more places from where we might return
ENOMEM to userspace.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4936
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220727174023.16766-1-nirmoy.das@intel.com
Jason Wang [Sat, 16 Jul 2022 04:05:20 +0000 (12:05 +0800)]
drm/i915/selftests: Fix comment typo
Fix the double `wait' typo in comment.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220716040520.31676-1-wangborong@cdjrlc.com
Jason Wang [Sat, 16 Jul 2022 18:44:39 +0000 (02:44 +0800)]
drm/i915/gt: Remove unneeded semicolon
The semicolon after the `}' is unneeded.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Removed line mention when pushing]
Link: https://patchwork.freedesktop.org/patch/msgid/20220716184439.72056-1-wangborong@cdjrlc.com
John Harrison [Fri, 15 Jul 2022 00:40:28 +0000 (17:40 -0700)]
drm/i915/guc: Don't use pr_err when not necessary
A bunch of code was copy/pasted using pr_err as the default way to
report errors. However, drm_err is significantly more useful in
identifying where the error came from. So update the code to use that
instead.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220715004028.2126239-1-John.C.Harrison@Intel.com
Daniele Ceraolo Spurio [Mon, 18 Jul 2022 23:07:32 +0000 (16:07 -0700)]
drm/i915/guc: support v69 in parallel to v70
This patch re-introduces support for GuC v69 in parallel to v70. As this
is a quick fix, v69 has been re-introduced as the single "fallback" guc
version in case v70 is not available on disk and only for platforms that
are out of force_probe and require the GuC by default. All v69 specific
code has been labeled as such for easy identification, and the same was
done for all v70 functions for which there is a separate v69 version,
to avoid accidentally calling the wrong version via the unlabeled name.
When the fallback mode kicks in, a drm_notice message is printed in
dmesg to inform the user of the required update. The existing
logging of the fetch function has also been updated so that we no
longer complain immediately if we can't find a fw and we only throw an
error if the fetch of both the base and fallback blobs fails.
The plan is to follow this up with a more complex rework to allow for
multiple different GuC versions to be supported at the same time.
v2: reduce the fallback to platform that require it, switch to
firmware_request_nowarn(), improve logs.
Fixes:
2584b3549f4c ("drm/i915/guc: Update to GuC version 70.1.1")
Link: https://lists.freedesktop.org/archives/intel-gfx/2022-July/301640.html
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220718230732.1409641-1-daniele.ceraolospurio@intel.com
Ashutosh Dixit [Tue, 19 Jul 2022 01:07:08 +0000 (18:07 -0700)]
drm/i915/gt: Expose per-gt RPS defaults in sysfs
Add the following sysfs files to gt/gtN/.defaults/:
* rps_min_freq_mhz
* rps_max_freq_mhz
v2: Correct gt/gtN/.defaults/* file names in commit message
v3: Remove rps_boost_freq_mhz since it is not consumed by userspace
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/cf6e483bf79f871c2c8c74af6005bf6a83a3a1ce.1658192398.git.ashutosh.dixit@intel.com
Ashutosh Dixit [Tue, 19 Jul 2022 01:07:07 +0000 (18:07 -0700)]
drm/i915/gt: Create gt/gtN/.defaults/ for per gt sysfs defaults
Create a gt/gtN/.defaults/ directory (similar to
engine/<engine-name>/.defaults/) to expose default parameter values for
each gt in sysfs. This allows userspace to restore default parameter values
after they have changed. The empty 'struct gt_defaults' will be populated
by subsequent patches.
v2: Changed 'struct intel_rps_defaults rps_defaults' to
'struct gt_defaults defaults' (Andi)
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/be7c30d0ae58be9d8d5b8242ba00a1b2825e63ad.1658192398.git.ashutosh.dixit@intel.com
Chris Wilson [Thu, 30 Jun 2022 04:39:59 +0000 (21:39 -0700)]
drm/i915/reset: Handle reset timeouts under unrelated kernel hangs
When resuming after hibernate sometimes we see hangs in unrelated kernel
subsystems. These hangs often result in the following i915 trace:
i915 0000:00:02.0: [drm] *ERROR* \
intel_gt_reset_global timed out, cancelling all in-flight rendering
implying our reset task has been starved by the hanging kernel subsystem,
causing us to inappropiately declare the system as wedged beyond recovery.
The trace would be caused by our synchronize_srcu_expedited() taking more
than the allowed 5s due to the unrelated kernel hang. But we neither need
to perform that synchronisation inside the reset watchdog, nor do we need
such a short timeout before declaring the device as unrecoverable.
v2: Restore watchdog timeout to the previous 5 seconds (Ashutosh)
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/3575
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220630043959.5708-1-ashutosh.dixit@intel.com
Priyanka Dandamudi [Wed, 13 Jul 2022 13:02:09 +0000 (18:32 +0530)]
drm/i915: Add lmem_bar_size modparam
For testing purposes, support forcing the lmem_bar_size through a new
modparam. In CI we only have a limited number of configurations for DG2,
but we still need to be reasonably sure we get a usable device (also
verifying we report the correct values for things like
probed_cpu_visible_size etc) with all the potential lmem_bar sizes that
we might expect see in the wild.
v2: Update commit message and a minor modification.(Matt)
v3: Optimised lmem bar size code and modified code to resize
bar maximum upto lmem_size instead of maximum supported size.(Nirmoy)
v4: Optimised lmem bar size code.(Nirmoy)
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220713130209.2573233-3-priyanka.dandamudi@intel.com
Akeem G Abodunrin [Wed, 13 Jul 2022 13:02:08 +0000 (18:32 +0530)]
drm/i915: Add support for LMEM PCIe resizable bar
Add support for the local memory PICe resizable bar, so that
local memory can be resized to the maximum size supported by the device,
and mapped correctly to the PCIe memory bar. It is usual that GPU
devices expose only 256MB BARs primarily to be compatible with 32-bit
systems. So, those devices cannot claim larger memory BAR windows size due
to the system BIOS limitation. With this change, it would be possible to
reprogram the windows of the bridge directly above the requesting device
on the same BAR type.
v2:Moved code to gt/intel_region_lmem.c and used only
single underscore for function names.(Jani)
v3: Optimised code.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Michael J Ruhl <michael.j.ruhl@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220713130209.2573233-2-priyanka.dandamudi@intel.com
Matt Roper [Tue, 12 Jul 2022 22:05:13 +0000 (15:05 -0700)]
drm/i915: Correct ss -> steering calculation for pre-Xe_HP platforms
Accidental use of a "SLICE" macro where a "SUBSLICE" macro was intended
causes the group ID for steering to be calculated incorrectly on
pre-Xe_HP platforms.
Fixes:
9a92732f040a ("drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220712220513.3451794-1-matthew.d.roper@intel.com
Matthew Auld [Tue, 12 Jul 2022 17:40:50 +0000 (18:40 +0100)]
drm/i915/ttm: fix 32b build
Since segment_pages is no longer a compile time constant, it looks the
DIV_ROUND_UP(node->size, segment_pages) breaks the 32b build. Simplest
is just to use the ULL variant, but really we should need not need more
than u32 for the page alignment (also we are limited by that due to the
sg->length type), so also make it all u32.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes:
bc99f1209f19 ("drm/i915/ttm: fix sg_table construction")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220712174050.592550-1-matthew.auld@intel.com
Andrzej Hajda [Fri, 24 Jun 2022 11:35:28 +0000 (13:35 +0200)]
drm/i915/selftests: fix subtraction overflow bug
On some machines hole_end can be small enough to cause subtraction
overflow. On the other side (addr + 2 * min_alignment) can overflow
in case of mock tests. This patch should handle both cases.
Fixes:
e1c5f754067b59 ("drm/i915: Avoid overflow in computing pot_hole loop termination")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3674
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624113528.2159210-1-andrzej.hajda@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Chris Wilson [Fri, 8 Jul 2022 14:20:13 +0000 (16:20 +0200)]
drm/i915/gt: Only kick the signal worker if there's been an update
One impact of commit
047a1b877ed4 ("dma-buf & drm/amdgpu: remove
dma_resv workaround") is that it stores many, many more fences. Whereas
adding an exclusive fence used to remove the shared fence list, that
list is now preserved and the write fences included into the list. Not
just a single write fence, but now a write/read fence per context. That
causes us to have to track more fences than before (albeit half of those
are redundant), and we trigger more interrupts for multi-engine
workloads.
As part of reducing the impact from handling more signaling, we observe
we only need to kick the signal worker after adding a fence iff we have
good cause to believe that there is work to be done in processing the
fence i.e. we either need to enable the interrupt or the request is
already complete but we don't know if we saw the interrupt and so need
to check signaling.
References:
047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d7b953c7a4ba747c8196a164e2f8c5aef468d048.1657289332.git.karolina.drobnik@intel.com
Chris Wilson [Fri, 8 Jul 2022 14:20:12 +0000 (16:20 +0200)]
drm/i915: Bump GT idling delay to 2 jiffies
In monitoring a transcode pipeline that is latency sensitive (it waits
between submitting frames, and each frame requires work on rcs/vcs/vecs
engines), it is found that it took longer than a single jiffy for it to
sustain its workload. Allowing an extra jiffy headroom for the userspace
prevents us from prematurely parking and having to exit powersaving
immediately.
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/6284
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e37911ec087a9ce50630d6faf61fa2c0d5f96d44.1657289332.git.karolina.drobnik@intel.com
Chris Wilson [Fri, 8 Jul 2022 14:20:11 +0000 (16:20 +0200)]
drm/i915/gem: Look for waitboosting across the whole object prior to individual waits
We employ a "waitboost" heuristic to detect when userspace is stalled
waiting for results from earlier execution. Under latency sensitive work
mixed between the gpu/cpu, the GPU is typically under-utilised and so
RPS sees that low utilisation as a reason to downclock the frequency,
causing longer stalls and lower throughput. The user left waiting for
the results is not impressed.
On applying commit
047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv
workaround") it was observed that deinterlacing h264 on Haswell
performance dropped by 2-5x. The reason being that the natural workload
was not intense enough to trigger RPS (using HW evaluation intervals) to
upclock, and so it was depending on waitboosting for the throughput.
Commit
047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
changes the composition of dma-resv from keeping a single write fence +
multiple read fences, to a single array of multiple write and read
fences (a maximum of one pair of write/read fences per context). The
iteration order was also changed implicitly from all-read fences then
the single write fence, to a mix of write fences followed by read
fences. It is that ordering change that belied the fragility of
waitboosting.
Currently, a waitboost is inspected at the point of waiting on an
outstanding fence. If the GPU is backlogged such that we haven't yet
stated the request we need to wait on, we force the GPU to upclock until
the completion of that request. By changing the order in which we waited
upon requests, we ended up waiting on those requests in sequence and as
such we saw that each request was already started and so not a suitable
candidate for waitboosting.
Instead of asking whether to boost each fence in turn, we can look at
whether boosting is required for the dma-resv ensemble prior to waiting
on any fence, making the heuristic more robust to the order in which
fences are stored in the dma-resv.
Reported-by: Thomas Voegtle <tv@lio96.de>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6284
Fixes:
047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com>
Tested-by: Thomas Voegtle <tv@lio96.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/07e05518d9f6620d20cc1101ec1849203fe973f9.1657289332.git.karolina.drobnik@intel.com
Chris Wilson [Tue, 12 Jul 2022 15:21:33 +0000 (16:21 +0100)]
drm/i915/gt: Serialize TLB invalidates with GT resets
Avoid trying to invalidate the TLB in the middle of performing an
engine reset, as this may result in the reset timing out. Currently,
the TLB invalidate is only serialised by its own mutex, forgoing the
uncore lock, but we can take the uncore->lock as well to serialise
the mmio access, thereby serialising with the GDRST.
Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with
i915 selftest/hangcheck.
Cc: stable@vger.kernel.org # v4.4 and upper
Fixes:
7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Reported-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1e59a7c45dd919a530256b9ac721ac6ea86c0677.1657639152.git.mchehab@kernel.org
Chris Wilson [Tue, 12 Jul 2022 15:21:32 +0000 (16:21 +0100)]
drm/i915/gt: Serialize GRDOM access between multiple engine resets
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable@vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7dbac3011729.1657639152.git.mchehab@kernel.org
Matt Roper [Fri, 8 Jul 2022 21:58:03 +0000 (14:58 -0700)]
drm/i915/dg2: Add Wa_15010599737
This workaround may need to be extended to other platforms soon, but for
now it's marked as DG2-specific.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708215804.2889246-1-matthew.d.roper@intel.com
Matthew Auld [Mon, 11 Jul 2022 08:58:59 +0000 (09:58 +0100)]
drm/i915/ttm: fix sg_table construction
If we encounter some monster sized local-memory page that exceeds the
maximum sg length (UINT32_MAX), ensure that don't end up with some
misaligned address in the entry that follows, leading to fireworks
later. Also ensure we have some coverage of this in the selftests.
v2(Chris):
- Use round_down consistently to avoid udiv errors
v3(Nirmoy):
- Also update the max_segment in the selftest
Fixes:
f701b16d4cc5 ("drm/i915/ttm: add i915_sg_from_buddy_resource")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6379
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220711085859.24198-1-matthew.auld@intel.com
Dan Carpenter [Fri, 8 Jul 2022 09:41:04 +0000 (12:41 +0300)]
drm/i915/selftests: fix a couple IS_ERR() vs NULL tests
The shmem_pin_map() function doesn't return error pointers, it returns
NULL.
Fixes:
be1cb55a07bf ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708094104.GL2316@kadam
Radhakrishna Sripada [Fri, 8 Jul 2022 00:03:35 +0000 (17:03 -0700)]
drm/i915/mtl: Add MeteorLake PCI IDs
Add Meteorlake PCI IDs. Split into M, and P subplatforms.
v2: Update PCI id's
v3: Move id 7d60 under MTL_M(MattR)
Bspec: 55420
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708000335.2869311-3-radhakrishna.sripada@intel.com
Radhakrishna Sripada [Fri, 8 Jul 2022 00:03:34 +0000 (17:03 -0700)]
drm/i915/mtl: Add MeteorLake platform info
MTL has Xe_LPD+ display IP (version = 14), MTL graphics IP
(version = 12.70), and Xe_LPM+ media IP (version = 13).
Bspec: 55413
Bspec: 55416
Bspec: 55417
Bspec: 55418
Bspec: 55726
Bspec: 45544
Bspec: 65380
v2: rearrange the fields in pci_info(MattR)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mattrope: Moved IS_METEORLAKE() higher in header]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708000335.2869311-2-radhakrishna.sripada@intel.com
Matt Roper [Fri, 1 Jul 2022 23:20:06 +0000 (16:20 -0700)]
drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr
Although all DSS belong to a single pool on Xe_HP platforms (i.e.,
they're not organized into slices from a topology point of view), we do
still need to pass 'group' and 'instance' targets when steering register
accesses to a specific instance of a per-DSS multicast register. The
rules for how to determine group and instance IDs (which previously used
legacy terms "slice" and "subslice") varies by platform. Some platforms
determine steering by gslice membership, some platforms by cslice
membership, and future platforms may have other rules.
Since looping over each DSS and performing steered unicast register
accesses is a relatively common pattern, let's add a dedicated iteration
macro to handle this (and replace the platform-specific "instdone" loop
we were using previously. This will avoid the calling code needing to
figure out the details about how to obtain steering IDs for a specific
DSS.
Most of the places where we use this new loop are in the GPU errorstate
code at the moment, but we do have some additional features coming in
the future that will also need to loop over each DSS and steer some
register accesses accordingly.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701232006.1016135-1-matthew.d.roper@intel.com
Umesh Nerlige Ramappa [Thu, 7 Jul 2022 19:30:02 +0000 (12:30 -0700)]
i915/perf: Disable OA sseu config param for gfx12.50+
The global sseu config is applicable only to gen11 platforms where
concurrent media, render and OA use cases may cause some subslices to be
turned off and hence lose NOA configuration. Ideally we want to return
ENODEV for non-gen11 platforms, however, this has shipped with gfx12, so
disable only for gfx12.50+.
v2: gfx12 is already shipped with this, disable for gfx12.50+ (Lionel)
v3: (Matt)
- Update commit message and replace "12.5" with "12.50"
- Replace DRM_DEBUG() with driver specific drm_dbg()
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707193002.2859653-2-umesh.nerlige.ramappa@intel.com
Umesh Nerlige Ramappa [Thu, 7 Jul 2022 19:30:01 +0000 (12:30 -0700)]
i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call
DRM_DEBUG is not the right debug call to use in i915 OA, replace it with
driver specific drm_dbg() call (Matt).
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707193002.2859653-1-umesh.nerlige.ramappa@intel.com
Chris Wilson [Wed, 6 Jul 2022 15:47:38 +0000 (16:47 +0100)]
drm/i915/selftests: Grab the runtime pm in shrink_thp
Since we are not holding a wakeref, shrinking a bound object is not
guaranteed.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6370
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220706154738.235204-1-matthew.auld@intel.com
Alan Previn [Tue, 7 Jun 2022 00:23:14 +0000 (17:23 -0700)]
drm/i915/guc: Asynchronous flush of GuC log regions
Both error-capture and relay-logging mechanism use the GuC
log infrastructure. That means the KMD must send a log flush
complete notification back to GuC after reading the data out.
This call is currently being sent synchronously.
However, synchronous H2Gs cause problems when the system is
backed up. There is no need for this to be synchronous. The
KMD wasn't even looking at the return status from it. So make
it asynchronous and then there is no issue about time outs.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220607002314.1451656-2-alan.previn.teres.alexis@intel.com
Thomas Hellström [Mon, 20 Jun 2022 12:36:59 +0000 (14:36 +0200)]
drm/i915: Fix vm use-after-free in vma destruction
In vma destruction, the following race may occur:
Thread 1: Thread 2:
i915_vma_destroy();
...
list_del_init(vma->vm_link);
...
mutex_unlock(vma->vm->mutex);
__i915_vm_release();
release_references();
And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.
However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().
v2: Fix a typo in the commit message (Andi Shyti)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes:
e1a7ab4fca0c ("drm/i915: Remove the vm open count")
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.con>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
Niranjana Vishwanathapura [Fri, 1 Jul 2022 00:31:10 +0000 (17:31 -0700)]
drm/doc/rfc: VM_BIND uapi definition
VM_BIND and related uapi definitions
v2: Reduce the scope to simple Mesa use case.
v3: Expand VM_UNBIND documentation and add
I915_GEM_VM_BIND/UNBIND_FENCE_VALID
and I915_GEM_VM_BIND_TLB_FLUSH flags.
v4: Remove I915_GEM_VM_BIND_TLB_FLUSH flag and add additional
documentation for vm_bind/unbind.
v5: Remove TLB flush requirement on VM_UNBIND.
Add version support to stage implementation.
v6: Define and use drm_i915_gem_timeline_fence structure for
all timeline fences.
v7: Rename I915_PARAM_HAS_VM_BIND to I915_PARAM_VM_BIND_VERSION.
Update documentation on async vm_bind/unbind and versioning.
Remove redundant vm_bind/unbind FENCE_VALID flag, execbuf3
batch_count field and I915_EXEC3_SECURE flag.
v8: Remove I915_GEM_VM_BIND_READONLY and minor documentation
updates.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701003110.24843-4-niranjana.vishwanathapura@intel.com
Niranjana Vishwanathapura [Fri, 1 Jul 2022 00:31:09 +0000 (17:31 -0700)]
drm/i915: Update i915 uapi documentation
Add some missing i915 uapi documentation which the new
i915 VM_BIND feature documentation will be refer to.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701003110.24843-3-niranjana.vishwanathapura@intel.com
Niranjana Vishwanathapura [Fri, 1 Jul 2022 00:31:08 +0000 (17:31 -0700)]
drm/doc/rfc: VM_BIND feature design document
VM_BIND design document with description of intended use cases.
v2: Reduce the scope to simple Mesa use case.
v3: Expand documentation on dma-resv usage, TLB flushing and
execbuf3.
v4: Remove vm_bind tlb flush request support.
v5: Update TLB flushing documentation.
v6: Update out of order completion documentation.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701003110.24843-2-niranjana.vishwanathapura@intel.com
Matt Roper [Fri, 1 Jul 2022 15:22:31 +0000 (08:22 -0700)]
drm/i915: DG2 and ATS-M device ID updates
Small BAR support has now landed, which allows us to add the PCI IDs
that correspond to add-in card designs of DG2 and ATS-M. There's also
one additional MB-down PCI ID that recently appeared (0x5698) so we add
it too.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701152231.529511-2-matthew.d.roper@intel.com
Gustavo Sousa [Thu, 30 Jun 2022 20:14:07 +0000 (17:14 -0300)]
drm/i915/pvc: Implement w/a
16016694945
A new PVC-specific workaround has just been added to the BSpec.
BSpec: 64027
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220630201407.16770-1-gustavo.sousa@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:50 +0000 (18:43 +0100)]
drm/i915: turn on small BAR support
With the uAPI in place we should now have enough in place to ensure a
working system on small BAR configurations.
v2: (Nirmoy & Thomas):
- s/full BAR/Resizable BAR/ which is hopefully more easily
understood by users.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-13-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:49 +0000 (18:43 +0100)]
drm/i915/ttm: disallow CPU fallback mode for ccs pages
Falling back to memcpy/memset shouldn't be allowed if we know we have
CCS state to manage using the blitter. Otherwise we are potentially
leaving the aux CCS state in an unknown state, which smells like an info
leak.
Fixes:
48760ffe923a ("drm/i915/gt: Clear compress metadata for Flat-ccs objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-12-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:48 +0000 (18:43 +0100)]
drm/i915/ttm: handle blitter failure on DG2
If the move or clear operation somehow fails, and the memory underneath
is not cleared, like when moving to lmem, then we currently fallback to
memcpy or memset. However with small-BAR systems this fallback might no
longer be possible. For now we use the set_wedged sledgehammer if we
ever encounter such a scenario, and mark the object as borked to plug
any holes where access to the memory underneath can happen. Add some
basic selftests to exercise this.
v2:
- In the selftests make sure we grab the runtime pm around the reset.
Also make sure we grab the reset lock before checking if the device
is wedged, since the wedge might still be in-progress and hence the
bit might not be set yet.
- Don't wedge or put the object into an unknown state, if the request
construction fails (or similar). Just returning an error and
skipping the fallback should be safe here.
- Make sure we wedge each gt. (Thomas)
- Peek at the unknown_state in io_reserve, that way we don't have to
export or hand roll the fault_wait_for_idle. (Thomas)
- Add the missing read-side barriers for the unknown_state. (Thomas)
- Some kernel-doc fixes. (Thomas)
v3:
- Tweak the ordering of the set_wedged, also add FIXME.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:47 +0000 (18:43 +0100)]
drm/i915/selftests: ensure we reserve a fence slot
We should always be explicit and allocate a fence slot before adding a
new fence.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-10-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:46 +0000 (18:43 +0100)]
drm/i915/selftests: skip the mman tests for stolen
It's not supported, and just skips later anyway. With small-BAR things
get more complicated since all of stolen is likely not even CPU
accessible, hence not passing I915_BO_ALLOC_GPU_ONLY just results in the
object create failing.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-9-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:45 +0000 (18:43 +0100)]
drm/i915/uapi: tweak error capture on recoverable contexts
A non-recoverable context must be used if the user wants proper error
capture on discrete platforms. In the future the kernel may want to blit
the contents of some objects when later doing the capture stage. Also
extend to newer integrated platforms.
v2(Thomas):
- Also extend to newer integrated platforms, for capture buffer memory
allocation purposes.
v3 (Reported-by: kernel test robot <lkp@intel.com>):
- Fix build on !CONFIG_DRM_I915_CAPTURE_ERROR
Testcase: igt@gem_exec_capture@capture-recoverable
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-8-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:44 +0000 (18:43 +0100)]
drm/i915/error: skip non-mappable pages
Skip capturing any lmem pages that can't be copied using the CPU. This
in now only best effort on platforms that have small BAR.
Testcase: igt@gem-exec-capture@capture-invisible
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-7-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:43 +0000 (18:43 +0100)]
drm/i915/uapi: add NEEDS_CPU_ACCESS hint
If set, force the allocation to be placed in the mappable portion of
I915_MEMORY_CLASS_DEVICE. One big restriction here is that system memory
(i.e I915_MEMORY_CLASS_SYSTEM) must be given as a potential placement for the
object, that way we can always spill the object into system memory if we
can't make space.
Testcase: igt@gem-create@create-ext-cpu-access-sanity-check
Testcase: igt@gem-create@create-ext-cpu-access-big
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-6-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:42 +0000 (18:43 +0100)]
drm/i915/uapi: apply ALLOC_GPU_ONLY by default
On small BAR configurations, when dealing with I915_MEMORY_CLASS_DEVICE
allocations, we assume that by default, all userspace allocations should
be placed in the non-CPU visible portion. Note that dumb buffers are
not included here, since these are not "GPU accelerated" and likely need
CPU access. We choose to just always set GPU_ONLY, and let the backend
figure out if that should be ignored or not, for example on full BAR
systems.
In a later patch userspace will be able to provide a hint if CPU access
to the buffer is needed.
v2(Thomas)
- Apply GPU_ONLY on all discrete devices, but only if the BO can be
placed in LMEM. Down in the depths this should be turned into a noop,
where required, and as an annotation it still make some sense. If we
apply it regardless of the placements then we end up needing to check
the placements during exec capture. Also it's slightly inconsistent
since the NEEDS_CPU_ACCESS can only be applied on objects that can be
placed in LMEM. The other annoyance would be gem_create_ext vs plain
gem_create, if we were to always apply GPU_ONLY.
Testcase: igt@gem-create@create-ext-cpu-access-sanity-check
Testcase: igt@gem-create@create-ext-cpu-access-big
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-5-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:41 +0000 (18:43 +0100)]
drm/i915: remove intel_memory_region avail
No longer used.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-4-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:40 +0000 (18:43 +0100)]
drm/i915/uapi: expose the avail tracking
Vulkan would like to have a rough measure of how much device memory can
in theory be allocated. Also add unallocated_cpu_visible_size to track
the visible portion, in case the device is using small BAR. Also tweak
the locking so we nice consistent values for both the mm->avail and the
visible tracking.
v2: tweak the locking slightly so we update the mm->avail and visible
tracking as one atomic operation, such that userspace doesn't get
strange values when sampling the values.
Testcase: igt@i915_query@query-regions-unallocated
Testcase: igt@i915_query@query-regions-sanity-check
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-3-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:39 +0000 (18:43 +0100)]
drm/i915/uapi: add probed_cpu_visible_size
Userspace wants to know the size of CPU visible portion of device
local-memory, and on small BAR devices the probed_size is no longer
enough. In Vulkan, for example, it would like to know the size in bytes
for CPU visible VkMemoryHeap. We already track the io_size for each
region, so plumb that through to the region query.
v2: Drop the ( -1 = unknown ) stuff, which is confusing since nothing
can currently ever return such a value.
Testcase: igt@i915_query@query-regions-sanity-check
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-2-matthew.auld@intel.com
Matthew Auld [Wed, 29 Jun 2022 17:43:38 +0000 (18:43 +0100)]
drm/doc: add rfc section for small BAR uapi
Add an entry for the new uapi needed for small BAR on DG2+.
v2:
- Some spelling fixes and other small tweaks. (Akeem & Thomas)
- Rework error capture interactions, including no longer needing
NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
- Add probed_cpu_visible_size. (Lionel)
v3:
- Drop the vma query for now.
- Add unallocated_cpu_visible_size as part of the region query.
- Improve the docs some more, including documenting the expected
behaviour on older kernels, since this came up in some offline
discussion.
v4:
- Various improvements all over. (Tvrtko)
v5:
- Include newer integrated platforms when applying the non-recoverable
context and error capture restriction. (Thomas)
Mesa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16739
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: mesa-dev@lists.freedesktop.org
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Acked-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jordan Justen <jordan.l.justen@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-1-matthew.auld@intel.com
Daniele Ceraolo Spurio [Tue, 21 Jun 2022 23:30:05 +0000 (16:30 -0700)]
drm/i915/guc: ADL-N should use the same GuC FW as ADL-S
The only difference between the ADL S and P GuC FWs is the HWConfig
support. ADL-N does not support HWConfig, so we should use the same
binary as ADL-S, otherwise the GuC might attempt to fetch a config
table that does not exist. ADL-N is internally identified as an ADL-P,
so we need to special-case it in the FW selection code.
Fixes:
7e28d0b26759 ("drm/i915/adl-n: Enable ADL-N platform")
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621233005.3952293-1-daniele.ceraolospurio@intel.com
Vinay Belgaumkar [Mon, 27 Jun 2022 23:03:46 +0000 (16:03 -0700)]
drm/i915/guc/slpc: Add a new SLPC selftest
This test will validate we can achieve actual frequency of RP0. Pcode
grants frequencies based on what GuC is requesting. However, thermal
throttling can limit what is being granted. Add a test to request for
max, but don't fail the test if RP0 is not granted due to throttle
reasons.
Also optimize the selftest by using a common run_test function to avoid
code duplication. Rename the "clamp" tests to vary_max_freq and
vary_min_freq.
v2: Fix compile warning
v3: Review comments (Ashutosh). Added a FIXME for the media RP0 case.
v4: Checkpatch (strict) fixes, remove FIXME and other comments (Ashutosh)
Fixes commit
8ee2c227822e ("drm/i915/guc/slpc: Add SLPC selftest")
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220627230346.27720-1-vinay.belgaumkar@intel.com
Nirmoy Das [Fri, 24 Jun 2022 11:08:21 +0000 (13:08 +0200)]
drm/i915: Fix a lockdep warning at error capture
For some platfroms we use stop_machine version of
gen8_ggtt_insert_page/gen8_ggtt_insert_entries to avoid a
concurrent GGTT access bug but this causes a circular locking
dependency warning:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ggtt->error_mutex);
lock(dma_fence_map);
lock(&ggtt->error_mutex);
lock(cpu_hotplug_lock);
Fix this by calling gen8_ggtt_insert_page/gen8_ggtt_insert_entries
directly at error capture which is concurrent GGTT access safe because
reset path make sure of that.
v2: Fix rebase conflict and added a comment.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5595
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624110821.29190-1-nirmoy.das@intel.com
Vinay Belgaumkar [Thu, 23 Jun 2022 00:32:25 +0000 (17:32 -0700)]
drm/i915/guc/slpc: Use non-blocking H2G for waitboost
SLPC min/max frequency updates require H2G calls. We are seeing
timeouts when GuC channel is backed up and it is unable to respond
in a timely fashion causing warnings and affecting CI.
This is seen when waitboosting happens during a stress test.
this patch updates the waitboost path to use a non-blocking
H2G call instead, which returns as soon as the message is
successfully transmitted.
v2: Use drm_notice to report any errors that might occur while
sending the waitboost H2G request (Tvrtko)
v3: Add drm_notice inside force_min_freq (Ashutosh)
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623003225.23301-1-vinay.belgaumkar@intel.com
Umesh Nerlige Ramappa [Tue, 21 Jun 2022 19:21:05 +0000 (12:21 -0700)]
drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.
In addition, extend the WA to gen11.
v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks
v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes:
f6aa0d713c88 ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com
Alan Previn [Thu, 23 Jun 2022 02:31:57 +0000 (19:31 -0700)]
drm/i915/guc: Don't update engine busyness stats too frequently
Using two different types of workoads, it was observed that
guc_update_engine_gt_clks was being called too frequently and/or
causing a CPU-to-lmem bandwidth hit over PCIE. Details on
the workloads and numbers are in the notes below.
Background: At the moment, guc_update_engine_gt_clks can be invoked
via one of 3 ways. #1 and #2 are infrequent under normal operating
conditions:
1.When a predefined "ping_delay" timer expires so that GuC-
busyness can sample the GTPM clock counter to ensure it
doesn't miss a wrap-around of the 32-bits of the HW counter.
(The ping_delay is calculated based on 1/8th the time taken
for the counter go from 0x0 to 0xffffffff based on the
GT frequency. This comes to about once every 28 seconds at a
GT frequency of 19.2Mhz).
2.In preparation for a gt reset.
3.In response to __gt_park events (as the gt power management
puts the gt into a lower power state when there is no work
being done).
Root-cause: For both the workloads described farther below, it was
observed that when user space calls IOCTLs that unparks the
gt momentarily and repeats such calls many times in quick succession,
it triggers calling guc_update_engine_gt_clks as many times. However,
the primary purpose of guc_update_engine_gt_clks is to ensure we don't
miss the wraparound while the counter is ticking. Thus, the solution
is to ensure we skip that check if gt_park is calling this function
earlier than necessary.
Solution: Snapshot jiffies when we do actually update the busyness
stats. Then get the new jiffies every time intel_guc_busyness_park
is called and bail if we are being called too soon. Use half of the
ping_delay as a safe threshold.
NOTE1: Workload1: IGTs' gem_create was modified to create a file handle,
allocate memory with sizes that range from a min of 4K to the max supported
(in power of two step-sizes). Its maps, modifies and reads back the
memory. Allocations and modification is repeated until total memory
allocation reaches the max. Then the file handle is closed. With this
workload, guc_update_engine_gt_clks was called over 188 thousand times
in the span of 15 seconds while this test ran three times. With this patch,
the number of calls reduced to 14.
NOTE2: Workload2: 30 transcode sessions are created in quick succession.
While these sessions are created, pcm-iio tool was used to measure I/O
read operation bandwidth consumption sampled at 100 milisecond intervals
over the course of 20 seconds. The total bandwidth consumed over 20 seconds
without this patch was measured at average at 311KBps per sample. With this
patch, the number went down to about 175Kbps which is about a 43% savings.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220623023157.211650-2-alan.previn.teres.alexis@intel.com
Niranjana Vishwanathapura [Tue, 14 Jun 2022 18:43:47 +0000 (00:13 +0530)]
Revert "drm/i915: Hold reference to intel_context over life of i915_request"
This reverts commit
1e98d8c52ed5dfbaf273c4423c636525c2ce59e7.
The problem with this patch is that it makes i915_request to hold a
reference to intel_context, which in turn holds a reference on the VM.
This strong back referencing can lead to reference loops which leads
to resource leak.
An example is the upcoming VM_BIND work which requires VM to hold
a reference to some shared VM specific BO. But this BO's dma-resv
fences holds reference to the i915_request thus leading to reference
loop.
v2:
Do not use reserved requests for virtual engines
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Mathew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220614184348.23746-3-ramalingam.c@intel.com
Niranjana Vishwanathapura [Tue, 14 Jun 2022 18:43:46 +0000 (00:13 +0530)]
drm/i915: Do not access rq->engine without a reference
In i915_fence_get_driver_name(), user may not hold a
reference to rq->engine. Hence do not access it. Instead,
store required device private pointer in 'rq->i915' and use it.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220614184348.23746-2-ramalingam.c@intel.com
Matt Roper [Fri, 24 Jun 2022 21:03:28 +0000 (14:03 -0700)]
drm/i915: Prefer "XEHP_" prefix for registers
We've been introducing new registers with a mix of "XEHP_"
(architecture) and "XEHPSDV_" (platform) prefixes. For consistency,
let's settle on "XEHP_" as the preferred form.
XEHPSDV_RP_STATE_CAP stays with its current name since that's truly a
platform-specific register and not something that applies to the Xe_HP
architecture as a whole.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Caz Yokoyama <caz@caztech.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-2-matthew.d.roper@intel.com
Matt Roper [Fri, 24 Jun 2022 21:03:27 +0000 (14:03 -0700)]
drm/i915: Correct duplicated/misplaced GT register definitions
XEHPSDV_FLAT_CCS_BASE_ADDR, GEN8_L3_LRA_1_GPGPU, and MMCD_MISC_CTRL were
duplicated between i915_reg.h and intel_gt_regs.h. These are all GT
registers, so we should drop the copy from i915_reg.h.
XEHPSDV_TILE0_ADDR_RANGE was defined in i915_reg.h, but really belongs
in intel_gt_regs.h. Move it.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-1-matthew.d.roper@intel.com
Matthew Auld [Wed, 22 Jun 2022 15:59:19 +0000 (16:59 +0100)]
drm/i915: tweak the ordering in cpu_write_needs_clflush
For imported dma-buf objects we leave the object as cache_coherent = 0
across all platforms, which is reasonable given that have no clue what
the memory underneath is, and its not like the driver can ever manually
clflush the pages anyway (like with i915_gem_clflush_object) for such
objects. However on discrete we choose to treat cache_dirty = true as a
programmer error, leading to a warning. The simplest fix looks to be to
just change the ordering in cpu_write_needs_clflush to prevent ever
setting cache_dirty for dma-buf objects on discrete.
Fixes:
d028a7690d87 ("drm/i915/dmabuf: Fix prime_mmap to work when using LMEM")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5266
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622155919.355081-1-matthew.auld@intel.com
Akeem G Abodunrin [Wed, 22 Jun 2022 14:11:04 +0000 (15:11 +0100)]
drm/i915/selftests: Increase timeout for live_parallel_switch
With GuC submission, it takes a little bit longer switching contexts
among all available engines simultaneously, when running
live_parallel_switch subtest. Increase the timeout.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5885
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622141104.334432-1-matthew.auld@intel.com