review.tizen.org Git - platform/kernel/linux-starfive.git/log

drm/i915/perf: Handle non-power-of-2 reports

Some of the newer OA formats are not powers of 2. For those formats,
adjust the hw_tail accordingly when checking for new reports.

v2: (Ashutosh)
- Switch to OA_TAKEN for diff calculation
- Use OA_BUFFER_SIZE instead of the vma size
- Update comments

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-8-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Parse 64bit report header formats correctly

Now that OA formats come in flavor of 64 bit reports, the report header
has 64 bit report-id, timestamp, context-id and gpu-ticks fields. When
filtering these reports, use the right width for these fields.

Note that upper dword of context id is reserved, so squash lower dword
only.

v2: (Ashutosh)
- Drop inline
- Update comment with dword definitions - report id and timestamp

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-7-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Fail modprobe if i915_perf_init fails on OOM

i915_perf_init can fail due to OOM. Fail driver init if i915_perf_init
fails.

v2: (Jani)
- Reorder patch in the series

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-6-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Group engines into respective OA groups

Now that we may have multiple OA units in a single GT as well as on
separate GTs, create an engine group that maps to a single OA unit.

v2: (Jani)
- Drop warning on ENOMEM
- Reorder patch in the series

v3: (Ashutosh)
- Remove unused members from perf structs
- Update comments
- Update engine_supports_oa check
- Just return 1 in num_perf_groups_per_gt for now
- Set engine->oa_group to NULL to begin with

v4: Use engine_supports_oa() check in oa_init_reg_state (Ashutosh)
v5: Rebase after dropping engine_supports_oa helper

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-5-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Validate OA sseu config outside switch

Once OA supports media engine class:instance, the engine can only be
validated outside the switch since class and instance parameters are
separate entities. Since OA sseu config depends on engine
class:instance, validate OA sseu config outside the switch.

v2: (Ashutosh)
- Clarify commit message
- Use drm_dbg instead of DRM_DEBUG
- Reorder stack variables

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-4-umesh.nerlige.ramappa@intel.com

drm/i915/mtl: Synchronize i915/BIOS on C6 enabling

If BIOS enables/disables C6, i915 should do the same. Also, retain
this value across driver reloads. This is needed only for MTL as
of now due to an existing bug in OA which needs C6 disabled for
it to function. BIOS behavior is also different across platforms
in terms of how C6 is enabled.

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-3-umesh.nerlige.ramappa@intel.com

drm/i915/perf: Drop wakeref on GuC RC error

If we fail to adjust the GuC run-control on opening the perf stream,
make sure we unwind the wakeref just taken.

v2: Retain old goto label names (Ashutosh)
v3: Drop bitfield boolean

Fixes: 01e742746785 ("drm/i915/guc: Support OA when Wa_16011777198 is enabled")
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-2-umesh.nerlige.ramappa@intel.com

drm/i915/guc: Allow for very slow GuC loading

A failure to load the GuC is occasionally observed where the GuC log
actually showed that the GuC had loaded just fine. The implication
being that the load just took ever so slightly longer than the 200ms
timeout. Given that the actual time should be tens of milliseconds at
the slowest, this should never happen. So far the issue has generally
been caused by a bad IFWI resulting in low frequencies during boot
(depsite the KMD requesting max frequency). However, the issue seems
to happen more often than one would like.

So a) increase the timeout so that the user still gets a working
system even in the case of slow load. And b) report the frequency
during the load to see if that is the case of the slow down.

v2: Reduce timeout in non-debug builds, add references (Daniele)

References: https://gitlab.freedesktop.org/drm/intel/-/issues/7931
References: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
References: https://gitlab.freedesktop.org/drm/intel/-/issues/8083
References: https://gitlab.freedesktop.org/drm/intel/-/issues/8136
References: https://gitlab.freedesktop.org/drm/intel/-/issues/8137
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Tested-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316220632.3312218-3-John.C.Harrison@Intel.com

drm/i915/guc: Improve GuC load error reporting

There are multiple ways in which the GuC load can fail. The driver was
reporting the status register as is, but not everyone can read the
matrix unfiltered. So add decoding of the common error cases.

Also, remove the comment about interrupt based load completion
checking being not recommended. The interrupt was removed from the GuC
firmware some time ago so it is no longer an option anyway. While at
it, also abort the timeout if a known error code is reported. No need
to keep waiting if the GuC has already given up the load.

v2: Fix mis-matched case and confusing 'success' variable (Daniele).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316220632.3312218-2-John.C.Harrison@Intel.com

drm/i915/selftests: Drop igt_cs_tlb

The gt_tlb live selftest has the same code coverage as the
igt_cs_tlb subtest of gtt, except it is better at detecting
TLB bugs. Furthermore, while igt_cs_tlb is hitting some
unforeseen issues, these issues are either false positives
due to the test being poorly formatted, or are true
positives that can be more easily diagnosed with smaller
tests. As such, igt_cs_tlb is superceded by and obsoleted
by gt_tlb, meaning it can be removed.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230320192117.287374-1-andi.shyti@linux.intel.com

drm/i915/gem: Flush lmem contents after construction

i915_gem_object_create_lmem_from_data() lacks the flush of the data
written to lmem to ensure the object is marked as dirty and the writes
flushed to the backing store. Once created, we can immediately release
the obj->mm.mapping caching of the vmap.

Fixes: 7acbbc7cf485 ("drm/i915/guc: put all guc objects in lmem when available")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316165918.13074-1-nirmoy.das@intel.com

drm/i915: Use i915 instead of dev_priv insied the file_priv structure

In the process of renaming all instances of 'dev_priv' to 'i915',
start using 'i915' within the 'drm_i915_file_private' structure.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230322001611.632321-1-andi.shyti@linux.intel.com

drm/i915/debugfs: Enable upper layer interfaces to act on all gt's

The commit 82a149a62b6b ("drm/i915/gt: move remaining debugfs
interfaces into gt") moved gt-related debugfs files in the gtX/
directories to operate on individual gt's.

However, the original files were only functioning on the root
GT (GT 0) and have been left in the same location to maintain
compatibility with userspace users.

Add multiplexing functionality to the higher directories' files.
This enables the operations to be performed on all the GTs with
a single write. In the case of reads, the files provide an or'ed
value across all the tiles.

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230318203616.183765-4-andi.shyti@linux.intel.com

drm/i915/gt: Create per-gt debugfs files

To support multi-GT configurations, we need to generate
independent debug files for each GT.

To achieve this create a separate directory for each GT under the
debugfs directory. For instance, in a system with two GTs, the
debugfs structure would look like this:

/sys/kernel/debug/dri
                  └── 0
                      ├── gt0
                      │   ├── drpc
                      │   ├── engines
                      │   ├── forcewake
                      │   ├── frequency
                      │   └── rps_boost
                      └── gt1
                      :   ├── drpc
                      :   ├── engines
                      :   ├── forcewake
                          ├── frequency
                          └── rps_boost

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230318203616.183765-2-andi.shyti@linux.intel.com

drm/i915/uapi: Replace fake flex-array with flexible-array member

Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Address the following warning found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/gpu/drm/i915/gem/i915_gem_context.c: In function ‘set_proto_ctx_engines.isra’:
drivers/gpu/drm/i915/gem/i915_gem_context.c:769:41: warning: array subscript n is outside array bounds of ‘struct i915_engine_class_instance[0]’ [-Warray-bounds=]
  769 |                 if (copy_from_user(&ci, &user->engines[n], sizeof(ci))) {
      |                                         ^~~~~~~~~~~~~~~~~
./include/uapi/drm/i915_drm.h:2494:43: note: while referencing ‘engines’
2494 |         struct i915_engine_class_instance engines[0];

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/271
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZBSu2QsUJy31kjSE@work

drm/i915/pmu: Use functions common with sysfs to read actual freq

Expose intel_rps_read_actual_frequency_fw to read the actual freq without
taking forcewake for use by PMU. The code is refactored to use a common set
of functions across sysfs and PMU. Using common functions with sysfs in PMU
solves the issues of missing support for MTL and missing support for older
generations (prior to Gen6). It also future proofs the PMU where sometimes
code has been updated for sysfs and PMU has been missed.

v2: Remove runtime_pm_if_in_use from read_actual_frequency_fw (Tvrtko)

v3: (Tvrtko)
- Remove goto in __read_cagf
- Unexport intel_rps_get_cagf and intel_rps_read_punit_req

Fixes: 22009b6dad66 ("drm/i915/mtl: Modify CAGF functions for MTL")
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8280
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316004800.2539753-1-ashutosh.dixit@intel.com

drm/i915: Fix format for perf_limit_reasons

Use hex format so that it is easier to decode.

Fixes: fe5979665f64 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230315022906.2467408-1-vinay.belgaumkar@intel.com

drm/i915/selftests: keep same cache settings as timeline

On MTL, objects allocated through i915_gem_object_create_internal() are
mapped as uncached in GPU by default because HAS_LLC is false. However
in the live_hwsp_read selftest these watcher objects are mapped as WB
on CPU side. The conseqence is that the updates done by the GPU are not
immediately visible to CPU, thus the selftest is randomly failing due to
the stale data in CPU cache. Solution can be either setting WC for CPU +
UC for GPU, or WB for CPU + 1-way coherent WB for GPU.
To keep the consistency, let's simply inherit the same cache settings
from the timeline, which is WB for CPU + 1-way coherent WB for GPU,
because this test is supposed to emulate the behavior of the timeline
anyway.

Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230315180800.2632766-1-fei.yang@intel.com

drm/i915/gt: perform uc late init after probe error injection

Probe pseudo errors should be injected only in places where real errors
can be encountered, otherwise unwinding code can be broken.
Placing intel_uc_init_late before i915_inject_probe_error violated
this rule, resulting in following bug:
__intel_gt_disable:655 GEM_BUG_ON(intel_gt_pm_is_awake(gt))

Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230314151920.1065847-1-andrzej.hajda@intel.com

drm/i915: Simplify vcs/bsd engine selection

No need to look at the mask of present engines when we already have a
count stored ever since e2d0ff3525b9 ("drm/i915: Count engine instances
per uabi class").

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316142728.1335239-1-tvrtko.ursulin@linux.intel.com
[tursulin: fixup typo in patch title]

drm/i915: add guard page to ggtt->error_capture

Write-combining memory allows speculative reads by CPU.
ggtt->error_capture is WC mapped to CPU, so CPU/MMU can try
to prefetch memory beyond the error_capture, ie it tries
to read memory pointed by next PTE in GGTT.
If this PTE points to invalid address DMAR errors will occur.
This behaviour was observed on ADL and RPL platforms.
To avoid it, guard scratch page should be added after error_capture.
The patch fixes the most annoying issue with error capture but
since WC reads are used also in other places there is a risk similar
problem can affect them as well.

v2:
  - modified commit message (I hope the diagnosis is correct),
  - added bug checks to ensure scratch is initialized on gen3 platforms.
    CI produces strange stacktrace for it suggesting scratch[0] is NULL,
    to be removed after resolving the issue with gen3 platforms.
v3:
  - removed bug checks, replaced with gen check.
v4:
  - change code for scratch page insertion to support all platforms,
  - add info in commit message there could be more similar issues
v5:
  - check for nop_clear_range instead of gen8 (Tvrtko),
  - re-insert scratch pages on resume (Tvrtko)
v6:
  - use scratch_range callback to set scratch pages (Chris)

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230308-guard_error_capture-v6-2-1b5f31422563@intel.com

drm/i915/gt: introduce vm->scratch_range callback

The callback will be responsible for setting scratch page PTEs for
specified range. In contrast to clear_range it cannot be optimized to nop.
It will be used by code adding guard pages.

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230308-guard_error_capture-v6-1-1b5f31422563@intel.com

drm/i915/gt: prevent forcewake releases during BAR resize

Tests on DG2 machines show that releasing forcewakes during BAR resize
results later in forcewake ack timeouts. Since forcewakes can be realeased
asynchronously the simplest way to prevent it is to get all forcewakes
for duration of BAR resizing.

v2: hold rpm as well during resizing (Rodrigo)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6530
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7853
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230308133624.2131582-1-andrzej.hajda@intel.com

drm/i915/gt: Update engine_init_common documentation

Change the function doc to reflect updated name.

v2: un-kerneldoc the comment(Matt).
:s/engines_init_common/engine_init_common(Andi)

Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310101024.4700-1-nirmoy.das@intel.com

drm/i915/gt: make kobj attributes const

There's no need for any of these to be mutable, constify:

drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000020 files.0
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000050 files.1
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 preempt_timeout_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 timeslice_duration_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 timeslice_duration_def
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 preempt_timeout_def
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 max_spin_def
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 stop_timeout_def
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 heartbeat_interval_def
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 name_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 class_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 inst_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 mmio_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 caps_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 all_caps_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 max_spin_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 stop_timeout_attr
drivers/gpu/drm/i915/gt/sysfs_engines.o: .data 0000000000000038 heartbeat_interval_attr

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230309081645.385650-1-jani.nikula@intel.com

drm/i915/active: Fix missing debug object activation

debug_active_activate() expected ref->count to be zero
which is not true anymore as __i915_active_activate() calls
debug_active_activate() after incrementing the count.

v2: No need to check for "ref->count == 1" as __i915_active_activate()
already make sure of that(Janusz).

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6733
Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230313114613.9874-1-nirmoy.das@intel.com

drm/i915: Include timeline seqno in error capture

The seqno value actually written out to memory is no longer in the
regular HWSP. Instead, it is now in its own private timeline buffer.
Thus, it is no longer visible in an error capture. So, explicitly read
the value and include that in the capture.

v2: %d -> %u (Alan)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230311063714.570389-4-John.C.Harrison@Intel.com

drm/i915/guc: Clean up of register capture search

The comparison in the search for a matching register capture node was
not the most readable. It was also assuming that a zero GuC id means
invalid, which it does not. So remove one invalid term, one redundant
term and re-format to keep each term on a single line, and only one
term per line.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230311063714.570389-3-John.C.Harrison@Intel.com

drm/i915/guc: Fix missing ecodes

Error captures are tagged with an 'ecode'. This is a pseduo-unique magic
number that is meant to distinguish similar seeming bugs with
different underlying signatures. It is a combination of two ring state
registers. Unfortunately, the register state being used is only valid
in execlist mode. In GuC mode, the register state exists in a separate
list of arbitrary register address/value pairs rather than the named
entry structure. So, search through that list to find the two exciting
registers and copy them over to the structure's named members.

v2: if else if instead of if if (Alan)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Fixes: a6f0f9cf330a ("drm/i915/guc: Plumb GuC-capture into gpu_coredump")
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230311063714.570389-2-John.C.Harrison@Intel.com

drm/i915/mtl: Disable MC6 for MTL A step

The Wa_14017073508 require to send Media Busy/Idle mailbox while
accessing Media tile. As of now it is getting handled while __gt_unpark,
__gt_park. But there are various corner cases where forcewakes are taken
without __gt_unpark i.e. without sending Busy Mailbox especially during
register reads. Forcewakes are taken without busy mailbox leads to
GPU HANG. So bringing mailbox calls under forcewake calls are no feasible
option as forcewake calls are atomic and mailbox calls are blocking.
The issue already fixed in B step so disabling MC6 on A step and
reverting previous commit which handles Wa_14017073508

Fixes: 8f70f1ec587d ("drm/i915/mtl: Add Wa_14017073508 for SAMedia")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310061339.2495416-2-badal.nilawar@intel.com

drm/i915: Move DG2 tuning to the right function

Use gt_tuning_settings() for the recommended tunings rather than the one
for workarounds.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306204954.753739-2-lucas.demarchi@intel.com

drm/i915: Remove redundant check for DG1

dg1_gt_workarounds_init() is only ever called for DG1, so there is no
point checking it again.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306204954.753739-1-lucas.demarchi@intel.com

drm/i915/guc: Fix missing return code checks in submission init

The CI results for the 'fast request' patch set (enables error return
codes for fire-and-forget H2G messages) hit an issue with the KMD
sending context submission requests on an invalid context. That was
caused by a fault injection probe failing the context creation of a
kernel context. However, there was no return code checking on any of
the kernel context registration paths. So the driver kept going and
tried to use the kernel context for the record defaults process.

This would not cause any actual problems. The invalid requests would
be rejected by GuC and ultimately the start up sequence would
correctly wedge due to the context creation failure. But fixing the
issue correctly rather ignoring it means we won't get CI complaining
when the fast request patch lands and enables the extra error checking.

So fix it by checking for errors and aborting as appropriate when
creating kernel contexts. While at it, clean up some other submission
init related failure cleanup paths. Also, rename guc_init_lrc_mapping
to guc_init_submission as the former name hasn't been valid in a long
time.

v2: Add another wrapper to keep the flow balanced (Daniele)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230217223308.3449737-3-John.C.Harrison@Intel.com

drm/i915/guc: Improve clean up of busyness stats worker

The stats worker thread management was mis-matched between
enable/disable call sites. Fix those up. Also, abstract the
cancel/enable code into a helper function rather than replicating in
multiple places.

v2: Rename the helpers and wrap the enable as well as the cancel
(review feedback from Daniele).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230217223308.3449737-2-John.C.Harrison@Intel.com

drm/i915/gsc: Fix race between HW init and GSC FW load

The GSC FW load is a slow process (up to 250 ms), so we defer it to a
dedicated worker to avoid stalling the init flow for that long. However,
we currently start this worker before the HW init is complete, so there
is a chance that the GSC loading code submits to the HW before the
engine initialization has completed. We can easily fix this by starting
the thread later in the gt_resume flow.
From this later spot, the GSC code can still race with the default
submission code; we functionally don't care who wins the race (the GSC
load doesn't need any state), but since the whole point of the separate
worker is to make the main thread faster, we prefer the default
submission code to run first. Therefore, make an exception for driver
probe and only and start the gsc load from uc_init_late.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230223172120.3304293-3-daniele.ceraolospurio@intel.com

drm/i915/gsc: flush the GSC worker before wedging on unload

If we unload the driver and wedge before the GSC worker is complete,
the worker will hit an error on its submission to the GSC engine and
then exit. This is hard to hit for a user, but it is reproducible
with skipping selftests. The error is handled gracefully by the
worker, so there are no functional issues, but we still end up with
an error message in dmesg, which is something we want to avoid as
this is a supported scenario. We could modify the worker to better
handle a wedging occurring during its execution, but that gets
complicated for a couple of reasons:
- We do want the error on runtime wedging, because there are
  implications for subsystems outside of GT (i.e., PXP, HDCP), it's
  only the error on driver unload that we want to silence.
- The worker is responsible for multiple submissions (GSC FW load,
  HuC auth, SW proxy), so all of those will have to be adapted to
  handle the wedged_on_fini scenario.
Therefore, it's much simpler to just wait for the worker to be done
before wedging on driver removal, also considering that the worker
will likely already be idle in the great majority of non-selftest
scenarios.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230223172120.3304293-2-daniele.ceraolospurio@intel.com

drm/i915/xelp: Implement Wa_1606376872

Wa_1606376872 applies to all Xe_LP IPs except DG1.

Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230307032238.300674-1-gustavo.sousa@intel.com

drm/i915/selftest: Remove avoidable init assignment

We can skip the assignment and i915 variable
altogether and use refernce directly. Also used at
single place only.

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230307094643.532271-1-tejas.upadhyay@intel.com

drm/i915/selftest: Fix ktime_get() and h/w access order

Use ktime_get() after accessing the mmio or any driver resource,
while using wall time for various calculation that depends on
the inserted delay in order to account any mmio and resource
access latency.

Cc: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230223100503.3323627-3-anshuman.gupta@intel.com

drm/i915/selftest: Fix engine timestamp and ktime disparity

While reading the engine timestamps there can be uncontrollable
concurrent mmio access via other i915 child drivers and by GuC,
which is not truly atomic context as expected by this selftest,
which may cause mmio latency to read the engine timestamps,
Account such latency to calculate time to read engine timestamp
such that selftest can validate the timestamp and ktime pair.

Cc: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230223100503.3323627-2-anshuman.gupta@intel.com

drm/i915/active: Fix misuse of non-idle barriers as fence trackers

Users reported oopses on list corruptions when using i915 perf with a
number of concurrently running graphics applications.  Root cause analysis
pointed at an issue in barrier processing code -- a race among perf open /
close replacing active barriers with perf requests on kernel context and
concurrent barrier preallocate / acquire operations performed during user
context first pin / last unpin.

When adding a request to a composite tracker, we try to reuse an existing
fence tracker, already allocated and registered with that composite.  The
tracker we obtain may already track another fence, may be an idle barrier,
or an active barrier.

If the tracker we get occurs a non-idle barrier then we try to delete that
barrier from a list of barrier tasks it belongs to.  However, while doing
that we don't respect return value from a function that performs the
barrier deletion.  Should the deletion ever fail, we would end up reusing
the tracker still registered as a barrier task.  Since the same structure
field is reused with both fence callback lists and barrier tasks list,
list corruptions would likely occur.

Barriers are now deleted from a barrier tasks list by temporarily removing
the list content, traversing that content with skip over the node to be
deleted, then populating the list back with the modified content.  Should
that intentionally racy concurrent deletion attempts be not serialized,
one or more of those may fail because of the list being temporary empty.

Related code that ignores the results of barrier deletion was initially
introduced in v5.4 by commit d8af05ff38ae ("drm/i915: Allow sharing the
idle-barrier from other kernel requests").  However, all users of the
barrier deletion routine were apparently serialized at that time, then the
issue didn't exhibit itself.  Results of git bisect with help of a newly
developed igt@gem_barrier_race@remote-request IGT test indicate that list
corruptions might start to appear after commit 311770173fac ("drm/i915/gt:
Schedule request retirement when timeline idles"), introduced in v5.5.

Respect results of barrier deletion attempts -- mark the barrier as idle
only if successfully deleted from the list.  Then, before proceeding with
setting our fence as the one currently tracked, make sure that the tracker
we've got is not a non-idle barrier.  If that check fails then don't use
that tracker but go back and try to acquire a new, usable one.

v3: use unlikely() to document what outcome we expect (Andi),
  - fix bad grammar in commit description.
v2: no code changes,
  - blame commit 311770173fac ("drm/i915/gt: Schedule request retirement
    when timeline idles"), v5.5, not commit d8af05ff38ae ("drm/i915: Allow
    sharing the idle-barrier from other kernel requests"), v5.4,
  - reword commit description.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6333
Fixes: 311770173fac ("drm/i915/gt: Schedule request retirement when timeline idles")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org # v5.5
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230302120820.48740-1-janusz.krzysztofik@linux.intel.com

drm/i915/gsc: Fix the Driver-FLR completion

The Driver-FLR flow may inadvertently exit early before the full
completion of the re-init of the internal HW state if we only poll
GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
we need a two-step completion wait-for-completion flow that also
involves GU_CNTL. See the patch and new code comments for detail.
This is new direction from HW architecture folks.

   v2: - Add error message for the teardown timeout (Anshuman)
       - Don't duplicate code in comments (Jani)
   v3: - Add get/put runtime-pm for this function. Though
         not functionally required during unload, its so the uncore
doesn't complain.
   v4: - Remove the get/put runtime-pm - that was for a prior
         version of this patch (not needed for drm-managed callback).
       - Remove the fixes tag since this is only for MTL and MTL
         still needs force probe (Daniele).
       - Bit 31 of GU_CNTL should be DRIVERFLR instead of
         DRIVERFLR_STATUS (Daniele).

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230224001758.544817-1-alan.previn.teres.alexis@intel.com

drm/i915/sseu: fix max_subslices array-index-out-of-bounds access

It seems that commit bc3c5e0809ae ("drm/i915/sseu: Don't try to store EU
mask internally in UAPI format") exposed a potential out-of-bounds
access, reported by UBSAN as following on a laptop with a gen 11 i915
card:

  UBSAN: array-index-out-of-bounds in drivers/gpu/drm/i915/gt/intel_sseu.c:65:27
  index 6 is out of range for type 'u16 [6]'
  CPU: 2 PID: 165 Comm: systemd-udevd Not tainted 6.2.0-9-generic #9-Ubuntu
  Hardware name: Dell Inc. XPS 13 9300/077Y9N, BIOS 1.11.0 03/22/2022
  Call Trace:
   <TASK>
   show_stack+0x4e/0x61
   dump_stack_lvl+0x4a/0x6f
   dump_stack+0x10/0x18
   ubsan_epilogue+0x9/0x3a
   __ubsan_handle_out_of_bounds.cold+0x42/0x47
   gen11_compute_sseu_info+0x121/0x130 [i915]
   intel_sseu_info_init+0x15d/0x2b0 [i915]
   intel_gt_init_mmio+0x23/0x40 [i915]
   i915_driver_mmio_probe+0x129/0x400 [i915]
   ? intel_gt_probe_all+0x91/0x2e0 [i915]
   i915_driver_probe+0xe1/0x3f0 [i915]
   ? drm_privacy_screen_get+0x16d/0x190 [drm]
   ? acpi_dev_found+0x64/0x80
   i915_pci_probe+0xac/0x1b0 [i915]
   ...

According to the definition of sseu_dev_info, eu_mask->hsw is limited to
a maximum of GEN_MAX_SS_PER_HSW_SLICE (6) sub-slices, but
gen11_sseu_info_init() can potentially set 8 sub-slices, in the
!IS_JSL_EHL(gt->i915) case.

Fix this by reserving up to 8 slots for max_subslices in the eu_mask
struct.

Reported-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Fixes: bc3c5e0809ae ("drm/i915/sseu: Don't try to store EU mask internally in UAPI format")
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230220171858.131416-1-andrea.righi@canonical.com

drm/i915/selftests: Fix live_requests for all engines

After the abandonment of i915->kernel_context and since we have started to
create per-gt engine->kernel_context, these tests need to be updated to
instantiate the batch buffer VMA in the correct PPGTT for the context used
to execute each spinner.

v2(Tejas):
  - Clean commit message - Matt
  - Add BUG_ON to match vm
v3(Tejas):
  - Fix dim checkpatch warnings

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230228044307.191639-1-tejas.upadhyay@intel.com

drm/i915: Stop whitelisting CS_CTX_TIMESTAMP on Xe_HP platforms

Xe_HP architecture already makes the CS_CTX_TIMESTAMP readable by
userspace on all engines; there's no longer a need to add it to the
software-managed whitelist for the non-RCS engines.

Bspec: 45545
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230224002300.3578985-2-matthew.d.roper@intel.com

drm/i915: Whitelist COMMON_SLICE_CHICKEN3 for UMD access

A recommended tuning setting for both gen12 and Xe_HP platforms requires
that we grant userspace r/w access to the COMMON_SLICE_CHICKEN3
register.

Bspec: 73993, 73994, 31870, 68331
Cc: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230224002300.3578985-1-matthew.d.roper@intel.com

drm/i915/gt: Rename dev_priv to i915 for private data naming consistency

It has become common practice to refer to the drm_i915_private
structures as "i915". However, there are still instances where
they are referred to as "dev_priv". This inconsistency can make
grepping for information more difficult and does not maintain a
cohesive style throughout the code.

Rename all the "dev_priv" structures in the gt/* directory to
"i915".

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230210150344.1066991-1-andi.shyti@linux.intel.com

drm/i915/mtl: Add engine TLB invalidation

MTL's primary GT can continue to use the same engine TLB invalidation
programming as past Xe_HP-based platforms.  However the media GT needs
some special handling:
* Invalidation registers on the media GT are singleton registers
   (unlike the primary GT where they are still MCR).
* Since the GSC is now exposed as an engine, there's a new register to
   use for TLB invalidation.  The offset is identical to the compute
   engine offset, but this is expected --- compute engines only exist on
   the primary GT while the GSC only exists on the media GT.
* Although there's only a single GSC engine instance, it inexplicably
   uses bit 1 to request invalidations rather than bit 0.

v2:
- Add a 'regs == xelpmp_regs' condition to the GSC instance handling.
   If the registers change on a future platform, the GSC-specific
   handling is likely to change as well.  (Andrzej)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230224012009.3594691-1-matthew.d.roper@intel.com

drm/i915/mtl: X-Tile support changes to client blits

Refactor the supports_x_tiling and fast_blit_ok helper
functions in the live client selftest to better reflect
when XY_FAST_COPY_BLT supports X-tile and can be used.

Bspec: 47982
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230223183954.1817632-1-jonathan.cavitt@intel.com

drm/i915/ttm: remove the virtualized start hack

This was mostly needed to differentiate between mappable and
non-mappable lmem, such that ttm would understand non-mappable ->
mappable moves (or vice versa), and not just turn them into noops. We
have since gained proper .intersects() and .compatible() hooks for the
resource manager, which takes care of this for us.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221220112736.161642-1-matthew.auld@intel.com

drm/i915: probe lmem before the stolen portion

At the very least, we have some tests that force the BAR size for
testing purposes, which would result in different BAR size with
stolen-lmem vs normal lmem, since the BAR is only resized as part of the
normal lmem probing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127160321.374350-1-matthew.auld@intel.com

drm/i915: Don't use BAR mappings for ring buffers with LLC

Direction from hardware is that ring buffers should never be mapped
via the BAR on systems with LLC. There are too many caching pitfalls
due to the way BAR accesses are routed. So it is safest to just not
use it.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Fixes: 9d80841ea4c9 ("drm/i915: Allow ringbuffers to be bound anywhere")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.9+
Tested-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216011101.1909009-3-John.C.Harrison@Intel.com

drm/i915: Don't use stolen memory for ring buffers with LLC

Direction from hardware is that stolen memory should never be used for
ring buffer allocations on platforms with LLC. There are too many
caching pitfalls due to the way stolen memory accesses are routed. So
it is safest to just not use it.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Fixes: c58b735fc762 ("drm/i915: Allocate rings from stolen")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.9+
Tested-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216011101.1909009-2-John.C.Harrison@Intel.com

drm/i915: Consolidate TLB invalidation flow

As the logic for selecting the register and corresponsing values grew, the
code become a bit unsightly. Consolidate by storing the required values at
engine init time in the engine itself, and by doing so minimise the amount
of invariant platform and engine checks during each and every TLB
invalidation.

v2:
* Fail engine probe if TLB invlidations registers are unknown.

v3:
* Rebase.

v4:
* Fix handling of GEN8_M2TCR. (Andrzej)

v5:
* Tidy checkpatch warnings.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> # v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216092123.159085-1-tvrtko.ursulin@linux.intel.com

drm/i915: Make kobj_type structures constant

Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216-kobj_type-i915-v1-1-ca65c9b93518@weissschuh.net

drm/i915/xelpmp: Consider GSI offset when doing MCR lookups

MCR range tables use the final MMIO offset of a register (including the
0x380000 GSI offset when applicable). Since the i915_mcr_reg_t passed
as a parameter during steering lookup does not include the GSI offset,
we need to add it back in for GSI registers before searching the tables.

Fixes: a7ec65fc7e83 ("drm/i915/xelpmp: Add multicast steering for media GT")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230214001906.1477370-1-matthew.d.roper@intel.com

drm/i915/xehp: LNCF/LBCF workarounds should be on the GT list

Although registers in the L3 bank/node configuration ranges are marked
as having "DEV" reset characteristics in the bspec, this appears to be a
hold-over from pre-Xe_HP platforms. In reality, these registers
maintain their values across engine resets, meaning that workarounds
and tuning settings targeting them should be placed on the GT
workaround list rather than an engine workaround list.

Note that an extra clue here is that these registers moved from the
RENDER forcewake domain to the GT forcewake domain in Xe_HP; generally
RCS/CCS engine resets should not lead to the reset of a register that
lives outside the RENDER domain.

Re-applying these registers on engine resets wouldn't actually hurt
anything, but is unnecessary and just makes it more confusing to anyone
trying to decipher how these registers really work.

v2:
- Also move DG2's Wa_14010648519 to the GT list. (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230209232228.859317-1-matthew.d.roper@intel.com

drm/i915/guc: More debug print updates - GuC logging

Update a bunch more debug prints to use the new GT based scheme.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-7-John.C.Harrison@Intel.com

drm/i915/guc: More debug print updates - GuC SLPC

Update a bunch more debug prints to use the new GT based scheme.

v2: Also change prints to use %pe for error values (MichalW).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-6-John.C.Harrison@Intel.com

drm/i915/guc: More debug print updates - GuC selftests

Update a bunch more debug prints to use the new GT based scheme.

v2: Also change prints to use %pe for error values (MichalW).
Fix a context leak on error due to a -- being too early.
Use the correct header file for the debug macros.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-5-John.C.Harrison@Intel.com

drm/i915/guc: More debug print updates - GuC reg capture

Update a bunch more debug prints to use the new GT based scheme.

v2: Upgrade the no node found message to a warning on the grounds of
it being quite important if the error capture can't find any register
state information.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-4-John.C.Harrison@Intel.com

drm/i915/guc: More debug print updates - GSC firmware

Update a bunch more debug prints to use the new GT based scheme.

v2: Also change prints to use %pe for error values (MichalW).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-3-John.C.Harrison@Intel.com

drm/i915/guc: More debug print updates - UC firmware

Update a bunch more debug prints to use the new GT based scheme.

v2: Also change prints to use %pe for error values (MichalW).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207050717.1833718-2-John.C.Harrison@Intel.com

drm/i915: Remove unused/wrong INF_UNIT_LEVEL_CLKGATE

INF_UNIT_LEVEL_CLKGATE is not replicated, but since it's not actually
used it can just be removed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230206165410.3056073-2-lucas.demarchi@intel.com

drm/i915: Fix GEN8_MISCCPCTL

Register 0x9424 is not replicated on any platform, so it shouldn't be
declared with REG_MCR(). Declaring it with _MMIO() is basically
duplicate of the GEN7 version, so just remove the GEN8 and change all
the callers to use the right functions.

Old versions of the gen8 bspec page used to contain a table with MCR
registers, apparently implying 0x9400 - 0x94ff registers were
replicated. However that table went away and there is no information
related to the ranges for gen8 anymore. Moreover the current behavior of
the driver wouldn't do anything special for 0x9424 since there is no
equivalent table in intel_gt_mcr.c: the driver would just fallback to
intel_uncore_{read,write}(). Therefore, do not care about the possible
special case for gen8 and just use the register as non-MCR for all the
platforms.

One place doing read + write is also converted to intel_uncore_rmw().

v2: Reword commit message adding the justification wrt gen8

Fixes: a9e69428b1b4 ("drm/i915: Define MCR registers explicitly")
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230206165410.3056073-1-lucas.demarchi@intel.com

drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list

The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround
has 'BUS' style reset, indicating that it does not lose its value on
engine resets. Furthermore, this register is part of the GT forcewake
domain rather than the RENDER domain, so it should not be impacted by
RCS engine resets. As such, we should implement this on the GT
workaround list rather than an engine list.

Bspec: 19219
Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthew.d.roper@intel.com

drm/i915/pvc: Annotate two more workaround/tuning registers as MCR

XEHPC_LNCFMISCCFGREG0 and XEHPC_L3SCRUB are both in MCR register ranges
on PVC (with HALFBSLICE and L3BANK replication respectively), so they
should be explicitly declared as MCR registers and use MCR-aware
workaround handlers.

The workarounds/tuning settings should still be applied properly on PVC
even without the MCR annotation, but readback verification on
CONFIG_DRM_I915_DEBUG_GEM builds could potentitally give false positive
"workaround lost on load" warnings on parts fused such that a unicast
read targets a terminated register instance.

Fixes: a9e69428b1b4 ("drm/i915: Define MCR registers explicitly")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-1-matthew.d.roper@intel.com

drm/i915/pxp: fix __le64 access to get rid of sparse warning

__le64 and friends should go through the cpu_to_* and *_to_cpu
accessors:

drivers/gpu/drm/i915/pxp/intel_pxp_huc.c:41:35: warning: incorrect type in assignment (different base types)
drivers/gpu/drm/i915/pxp/intel_pxp_huc.c:41:35: expected restricted __le64 [assigned] [usertype] huc_base_address
drivers/gpu/drm/i915/pxp/intel_pxp_huc.c:41:35: got unsigned long long [assigned] [usertype] huc_phys_addr

Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207124026.2105442-4-jani.nikula@intel.com

drm/i915/gt: add sparse lock annotation to avoid warnings

Annotate intel_gt_mcr_lock() and intel_gt_mcr_unlock() to fix sparse
warnings:

drivers/gpu/drm/i915/gt/intel_gt_mcr.c:397:9: warning: context imbalance in 'intel_gt_mcr_lock' - wrong count at exit
drivers/gpu/drm/i915/gt/intel_gt_mcr.c:412:6: warning: context imbalance in 'intel_gt_mcr_unlock' - unexpected unlock

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230207124026.2105442-1-jani.nikula@intel.com

drm/i915/pcode: Give the punit time to settle before fatally failing

During module load the punit might still be busy with its booting
routines. During this time we try to communicate with it but we
fail because we don't receive any feedback from it and we return
immediately with a -EINVAL fatal error.

At this point the driver load is "dramatically" aborted. The
following error message notifies us about it.

i915 0000:4d:00.0: drm_WARN_ON_ONCE(timeout_base_ms > 3)

It would be enough to wait a little in order to give the punit
the chance to come up bright and shiny, ready to interact with
the driver.

Wait up 10 seconds for the punit to settle and complete any
outstanding transactions upon module load. If it still fails try
again with a longer timeout, 180s, 3 minutes. If it still fails
then return -EPROBE_DEFER, in order to give the punit a second
chance.

Even if these timers might look long, we should consider that the
punit, depending on the platforms, might need long times to
complete its routines. Besides we want to try anything possible
to move forward before deciding to abort the driver's load.

The issue has been reported in:

https://gitlab.freedesktop.org/drm/intel/-/issues/7814

The changes in this patch are valid only and uniquely during
boot. The common transactions with the punit during the driver's
normal operation are not affected.

Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Co-developed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230206183236.109908-1-andi.shyti@linux.intel.com

drm/i915/huc: Add and use HuC oriented print macros

Like we did it for GuC, introduce some helper print macros for
HuC to have unified format of messages that also include GT#.

While around improve some messages and use %pe if possible.

v2: update GSC/PXP timeout message

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203085912.1963-1-michal.wajdeczko@intel.com

drm/i915/doc: Escape wildcard in method names

Stephen Rothwell reported htmldocs warnings:

Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:32: WARNING: Inline emphasis start-string without end-string.
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:57: WARNING: Inline emphasis start-string without end-string.
Documentation/gpu/i915:64: drivers/gpu/drm/i915/gt/intel_workarounds.c:66: WARNING: Inline emphasis start-string without end-string.

Escape wildcards in *_ctx_workarounds_init(), *_gt_workarounds_init(), and
*_whitelist_build() to fix above warnings.

Link: https://lore.kernel.org/linux-next/20230203134622.0b6315b9@canb.auug.org.au/
Fixes: 0c3064cf33fbfa ("drm/i915/doc: Document where to implement register workarounds")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203100215.31852-2-bagasdotme@gmail.com

drm/i915: Initialize the obj flags for shmem objects

Obj flags for shmem objects is not being set correctly. Fixes in setting
BO_ALLOC_USER flag which applies to shmem objs as well.

v2: Add fixes tag (Tvrtko, Matt A)

Fixes: 13d29c823738 ("drm/i915/ehl: unconditionally flush the pages on acquire")
Cc: <stable@vger.kernel.org> # v5.15+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[tursulin: Grouped all tags together.]
Link: https://patchwork.freedesktop.org/patch/msgid/20230203135205.4051149-1-aravind.iddamsetty@intel.com

drm/i915: Move fd_install after last use of fence

Because eb_composite_fence_create() drops the fence_array reference
after creation of the sync_file, only the sync_file holds a ref to the
fence. But fd_install() makes that reference visable to userspace, so
it must be the last thing we do with the fence.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: 00dae4d3d35d ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)")
Cc: <stable@vger.kernel.org> # v5.15+
[tursulin: Added stable tag.]
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230203164937.4035503-1-robdclark@gmail.com

drm/i915: Make sure dsm_size has correct granularity

DSM granularity is 1MB so make sure we stick to that.

The address set by firmware in GEN12_DSMBASE in driver initialization
doesn't mean "anything above that and until end of lmem is part of DSM".
In fact, there may be a few KB that is not part of DSM on the end of
lmem. How large is that space is platform-dependent, but since it's
always less than the DSM granularity, it can be simplified by simply
aligning the size down.

v2: replace "1 * SZ_1M" with SZ_1M (Andrzej).
v3: reword commit message to explain why the round down is needed
(Lucas)

Cc: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230202180243.23637-1-nirmoy.das@intel.com

drm/i915/guc: Improve debug message on context reset notification

Just recently we switched over to new GuC oriented log macros but in
the meantime yet another message was added that we missed to update.

While around improve that new message by adding engine name and use
existing helpers to check for context state.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230131214413.1879-1-michal.wajdeczko@intel.com

drm/i915/gt: Use sysfs_emit() and sysfs_emit_at()

Use sysfs_emit() and sysfs_emit_at() in show() callback
as recommended by Documentation/filesystems/sysfs.rst

Cc: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230130131358.16800-1-nirmoy.das@intel.com

drm/i915/gt: Add selftests for TLB invalidation

Check that we invalidate the TLB cache, the updated physical addresses
are immediately visible to the HW, and there is no retention of the old
physical address for concurrent HW access.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[ahajda: adjust to upstream driver, v2+]
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[tursulin: Small indentation fix.]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230130165058.1647414-1-andrzej.hajda@intel.com

drm/i915/mtl: Wa_22011802037: don't complain about missing regs on MTL

Wa_22011802037 requires waiting for an engine-specific register to
clear. A missing entry for GSC engine in the register table is flagged
as a drm_err. The drm_err was originally intended to catch missing
register entries for newer engines, however, it was later found that the
WA is only required for 'legacy' engines. So just drop the drm_err.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230124231111.1786429-1-umesh.nerlige.ramappa@intel.com

drm/i915/guc: Update GT/GuC messages in intel_uc.c

Use new macros to have common prefix that also include GT#.

v2: pass gt to print_fw_ver
v3: prefer guc_dbg in suspend/resume logs

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-9-michal.wajdeczko@intel.com

drm/i915/guc: Update GuC messages in intel_guc_submission.c

Use new macros to have common prefix that also include GT#.

v2: improve few existing messages

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-8-michal.wajdeczko@intel.com

drm/i915/guc: Update GuC messages in intel_guc_log.c

Use new macros to have common prefix that also include GT#.

v2: drop redundant GuC strings, minor improvements
v3: more message improvements

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-7-michal.wajdeczko@intel.com

drm/i915/guc: Update GuC messages in intel_guc_fw.c

Use new macros to have common prefix that also include GT#.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-6-michal.wajdeczko@intel.com

drm/i915/guc: Update GuC messages in intel_guc_ct.c

Use new macros to have common prefix that also include GT#.

v2: drop unused helpers

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-5-michal.wajdeczko@intel.com

drm/i915/guc: Update GuC messages in intel_guc_ads.c

Use new macros to have common prefix that also include GT#.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-4-michal.wajdeczko@intel.com

drm/i915/guc: Update GuC messages in intel_guc.c

Use new macros to have common prefix that also include GT#.

v2: drop now redundant "GuC" word from the message

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-3-michal.wajdeczko@intel.com

drm/i915/guc: Add GuC oriented print macros

While we do have GT oriented print macros, add few more GuC
specific to have common look and feel across all messages
related to the GuC and to avoid chasing the gt pointer.

We will use these macros shortly in upcoming patches.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-2-michal.wajdeczko@intel.com

drm/i915: Fix potential bit_17 double-free

A userspace with multiple threads racing I915_GEM_SET_TILING to set the
tiling to I915_TILING_NONE could trigger a double free of the bit_17
bitmask. (Or conversely leak memory on the transition to tiled.) Move
allocation/free'ing of the bitmask within the section protected by the
obj lock.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Cc: <stable@vger.kernel.org> # v5.5+
[tursulin: Correct fixes tag and added cc stable.]
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127200550.3531984-1-robdclark@gmail.com

drm/i915/guc: Rename GuC register state capture node to be more obvious

The GuC specific register state entry in the error capture object was
just called 'capture'. Although the companion 'node' entry was called
'guc_capture_node'. Rename the base entry to be 'guc_capture' instead
so that it is a) more consistent and b) more obvious what it is.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com

drm/i915/guc: Add a debug print on GuC triggered reset

For understanding bug reports, it can be useful to have an explicit
dmesg print when a reset notification is received from GuC. As opposed
to simply inferring that this happened from other messages.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-8-John.C.Harrison@Intel.com

drm/i915/guc: Look for a guilty context when an engine reset fails

Engine resets are supposed to never fail. But in the case when one
does (due to unknown reasons that normally come down to a missing
w/a), it is useful to get as much information out of the system as
possible. Given that the GuC intentionally dies on such a situation,
it is not possible to get a guilty context notification back. So do a
manual search instead. Given that GuC is dead, this is safe because
GuC won't be changing the engine state asynchronously.

v2: Change comment to be less alarming (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-7-John.C.Harrison@Intel.com

drm/i915: Allow error capture of a pending request

A hang situation has been observed where the only requests on the
context were either completed or not yet started according to the
breaadcrumbs. However, the register state claimed a batch was (maybe)
in progress. So, allow capture of the pending request on the grounds
that this might be better than nothing.

v2: Reword 'not started' warning message (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-6-John.C.Harrison@Intel.com

drm/i915: Allow error capture without a request

There was a report of error captures occurring without any hung
context being indicated despite the capture being initiated by a 'hung
context notification' from GuC. The problem was not reproducible.
However, it is possible to happen if the context in question has no
active requests. For example, if the hang was in the context switch
itself then the breadcrumb write would have occurred and the KMD would
see an idle context.

In the interests of attempting to provide as much information as
possible about a hang, it seems wise to include the engine info
regardless of whether a request was found or not. As opposed to just
prentending there was no hang at all.

So update the error capture code to always record engine information
if a context is given. Which means updating record_context() to take a
context instead of a request (which it only ever used to find the
context anyway). And split the request agnostic parts of
intel_engine_coredump_add_request() out into a seaprate function.

v2: Remove a duplicate 'if' statement (Umesh) and fix a put of a null
pointer.
v3: Tidy up request locking code flow (Tvrtko)
v4: Pull in improved info message from next patch and fix up potential
leak of GuC register state (Daniele)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> (v2)
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-5-John.C.Harrison@Intel.com

drm/i915: Fix up locking around dumping requests lists

The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.

So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.

v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com

drm/i915: Fix request ref counting during error capture & debugfs dump

When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.

The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.

The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.

v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com

drm/i915/guc: Fix locking when searching for a hung request

intel_guc_find_hung_context() was not acquiring the correct spinlock
before searching the request list. So fix that up. While at it, add
some extra whitespace padding for readability.

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com

drm/i915: Avoid potential vm use-after-free

Adding the vm to the vm_xa table makes it visible to userspace, which
could try to race with us to close the vm. So we need to take our extra
reference before putting it in the table.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 9ec8795e7d91 ("drm/i915: Drop __rcu from gem_context->vm")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230119173321.2825472-1-robdclark@gmail.com

drm/i915/xehp: Annotate a couple more workaround registers as MCR

GAMSTLB_CTRL and GAMCNTRL_CTRL became multicast/replicated registers on
Xe_HP. They should be defined accordingly and use MCR-aware operations.

These registers have only been used for some dg2/xehpsdv workarounds, so
this fix is mostly just for consistency/future-proofing; even lacking
the MCR annotation, workarounds will always be properly applied in a
multicast manner on these platforms.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Fixes: 58bc2453ab8a ("drm/i915: Define multicast registers as a new type")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-3-matthew.d.roper@intel.com

drm/i915/mtl: Correct implementation of Wa_18018781329

Workaround Wa_18018781329 has applied to several recent Xe_HP-based
platforms.  However there are some extra gotchas to implementing this
properly for MTL that we need to take into account:

* Due to the separation of media and render/compute into separate GTs,
   this workaround needs to be implemented on each GT, not just the
   primary GT.  Since each class of register only exists on one of the
   two GTs, we should program the appropriate registers on each GT.

* As with past Xe_HP platforms, the registers on the primary GT (Xe_LPG
   IP) are multicast/replicated registers and should be handled with the
   MCR-aware functions.  However the registers on the media GT (Xe_LPM+
   IP) are regular singleton registers and should _not_ use MCR
   handling.  We need to create separate register definitions for the
   Xe_HP multicast form and the Xe_LPM+ singleton form and use each in
   the appropriate place.

* Starting with MTL, workarounds documented by the hardware teams are
   technically associated with IP versions/steppings rather than
   top-level platforms.  That means we should take care to check the
   media IP version rather than the graphics IP version when deciding
   whether the workaround is needed on the Xe_LPM+ media GT (in this
   case the workaround applies to both IPs and the stepping bounds are
   identical, but we should still write the code appropriately to set a
   proper precedent for future workaround implementations).

* It's worth noting that the GSC register and the CCS register are
   defined with the same MMIO offset (0xCF30).  Since the CCS is only
   relevant to the primary GT and the GSC is only relevant to the media
   GT there isn't actually a clash here (the media GT automatically adds
   the additional 0x380000 GSI offset).  However there's currently a
   glitch in the bspec where the CCS register doesn't show up at all and
   the GSC register is listed as existing on both GTs.  That's a known
   documentation problem for several registers with shared GSC/CCS
   offsets; rest assured that the CCS register really does still exist.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Fixes: 41bb543f5598 ("drm/i915/mtl: Add initial gt workarounds")
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-2-matthew.d.roper@intel.com

drm/i915/xehp: GAM registers don't need to be re-applied on engine resets

Register reset characteristics (i.e., whether the register maintains or
loses its value on engine reset) is an important factor that determines
which wa_list we want to add workarounds to.  We recently found out that
the bspec documentation for the Xe_HP's "GAM" registers in the 0xC800 -
0xCFFF range was misleading; these registers do not actually lose their
value on engine resets as the documentation implied.  This means there's
no need to re-apply workarounds touching these registers after a reset,
and the corresponding workarounds should be moved from the 'engine'
lists back to the 'gt' list.

v2:
- Don't add Wa_18018781329 to xehpsdv; the original condition didn't
   include that platform.  (Gustavo)
- Move the MTL code to the GT function as-is for now; we'll take care
   of the additional fixes needed in a follow-up patch.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Fixes: edf176f48d87 ("drm/i915/dg2: Move misplaced 'ctx' & 'gt' wa's to engine wa list")
Fixes: b2006061ae28 ("drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds")
Fixes: 41bb543f5598 ("drm/i915/mtl: Add initial gt workarounds")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-1-matthew.d.roper@intel.com