review.tizen.org Git - platform/kernel/linux-rpi.git/log

drm/i915: Pull GAMMA_MODE write out from haswell_load_luts()

For bdw+ let's move the GAMMA_MODE write for the legacy LUT
mode into the .load_luts() funciton directly, rather than
relying on haswell_load_luts(). We'll be getting rid of
haswell_load_luts() entirely soon, and it's anyway cleaner
to have the GAMMA_MODE write in a single place.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-5-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Constify the state arguments to the color management stuff

Pass the crtc state etc. as const to the color management commit
functions. And while at it polish some of the local variables.

v2: Rebase

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-4-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Precompute gamma_mode

We shouldn't be computing gamma mode during the commit phase.
Move it to the check phase.

v2: Reword comments a bit (Matt)
Rebase

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Split the gamma/csc enable bits from the plane_ctl() function

On g4x+ the pipe gamma enable bit for the primary plane affects
the pipe bottom color as well. The same for the pipe csc enable
bit on ilk+. Thus we must configure those bits correctly even
when the primary plane is disabled.

To make the feasible let's split those settings from the
plane_ctl() function into a seprate funciton that we can
call from the ->disable_plane() hook as well.

For consistency we'll do that on all the plane types. While
that has no real benefits at this time, it'll become useful
when we start to control the pipe gamma/csc enable bits
dynamically when we overhaul the color management code.

On pre-g4x there doesn't appear to be any way to gamma
correct the pipe bottom color, but sticking to the same
pattern doesn't hurt. And it'll still help us to do
crtc state readout correctly for the pipe gamma enable
bit for the color management overhaul.

An alternative apporach would be to still precompute these
bits into plane_state->ctl, but that would require that we
run through the plane check even when the plane isn't logically
enabled on any crtc. Currently that condition causes us to
short circuit the entire thing and not call ->check_plane().
There would also be some chicken and egg problems with
->check_plane() vs. crtc color state check that would
requite splitting certain things into multiple steps.
So all in all this seems like the easier route.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205160848.24662-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Don't set update_wm_post on g4x+

update_wm_post is meant for pre-g4x only. Don't ever set
it on g4x+.

The only effect of a bogus update_wm_post on g4x+ could
be that we clear the legacy_cursor_update flag in
intel_atomic_commit(). Since legacy_cursor_update is
only set for legacy cursor updates (as the name suggests)
and we only set update_wm_post for a modeset the two
cases should never occur at the same time. But let's
be consistent in setting update_wm_post so we don't
end up confusing so many people.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190206185433.8116-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Hack and slash, throttle execbuffer hogs

Apply backpressure to hogs that emit requests faster than the GPU can
process them by waiting for their ring to be less than half-full before
proceeding with taking the struct_mutex.

This is a gross hack to apply throttling backpressure, the long term
goal is to remove the struct_mutex contention so that each client
naturally waits, preferably in an asynchronous, nonblocking fashion
(pipelined operations for the win), for their own resources and never
blocks another client within the driver at least. (Realtime priority
goals would extend to ensuring that resource contention favours high
priority clients as well.)

This patch only limits excessive request production and does not attempt
to throttle clients that block waiting for eviction (either global GTT or
system memory) or any other global resources, see above for the long term
goal.

No microbenchmarks are harmed (to the best of my knowledge).

Testcase: igt/gem_exec_schedule/pi-ringfull-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190207071829.5574-1-chris@chris-wilson.co.uk

drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set

Add err goto label and use it when VMA can't be established or changes
underneath.

v2:
- Dropping Fixes: as it's indeed impossible to race an object to the
error address. (Chris)
v3:
- Use IS_ERR_VALUE (Chris)

Reported-by: Adam Zabrocki <adamza@microsoft.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Adam Zabrocki <adamza@microsoft.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v2
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-2-joonas.lahtinen@linux.intel.com

drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set

Make sure the underlying VMA in the process address space is the
same as it was during vm_mmap to avoid applying WC to wrong VMA.

A more long-term solution would be to have vm_mmap_locked variant
in linux/mmap.h for when caller wants to hold mmap_sem for an
extended duration.

v2:
- Refactor the compare function

Fixes: 1816f9236303 ("drm/i915: Support creation of unbound wc user mappings for objects")
Reported-by: Adam Zabrocki <adamza@microsoft.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Adam Zabrocki <adamza@microsoft.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190207085454.10598-1-joonas.lahtinen@linux.intel.com

drm/i915: Don't send hotplug in intel_dp_check_mst_status()

This hotplug also isn't needed: drm_dp_mst_topology_mgr_set_mst()
already sends a hotplug on its own from drm_dp_destroy_connector_work()
after destroying connectors in the MST topology.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129191001.442-4-lyude@redhat.com

drm/i915: Don't send MST hotplugs during resume

We have a bad habit of calling drm_fb_helper_hotplug_event() far more
then we actually need to. MST appears to be one of these cases, where we
call drm_fb_helper_hotplug_event() if we fail to resume a connected MST
topology in intel_dp_mst_resume(). We don't actually need to do this at
all though since hotplug events are already sent from
drm_dp_connector_destroy_work() every time connectors are unregistered
from userspace's PoV. Additionally, extra calls to
drm_fb_helper_hotplug_event() also just mean more of a chance of doing a
connector probe somewhere we shouldn't.

So, don't send any hotplug events during resume if the MST topology
fails to come up. Just rely on the DP MST helpers to send them for us.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129191001.442-3-lyude@redhat.com

drm/i915: Block fbdev HPD processing during suspend

When resuming, we check whether or not any previously connected
MST topologies are still present and if so, attempt to resume them. If
this fails, we disable said MST topologies and fire off a hotplug event
so that userspace knows to reprobe.

However, sending a hotplug event involves calling
drm_fb_helper_hotplug_event(), which in turn results in fbcon doing a
connector reprobe in the caller's thread - something we can't do at the
point in which i915 calls drm_dp_mst_topology_mgr_resume() since
hotplugging hasn't been fully initialized yet.

This currently causes some rather subtle but fatal issues. For example,
on my T480s the laptop dock connected to it usually disappears during a
suspend cycle, and comes back up a short while after the system has been
resumed. This guarantees pretty much every suspend and resume cycle,
drm_dp_mst_topology_mgr_set_mst(mgr, false); will be caused and in turn,
a connector hotplug will occur. Now it's Rute Goldberg time: when the
connector hotplug occurs, i915 reprobes /all/ of the connectors,
including eDP. However, eDP probing requires that we power on the panel
VDD which in turn, grabs a wakeref to the appropriate power domain on
the GPU (on my T480s, this is the PORT_DDI_A_IO domain). This is where
things start breaking, since this all happens before
intel_power_domains_enable() is called we end up leaking the wakeref
that was acquired and never releasing it later. Come next suspend/resume
cycle, this causes us to fail to shut down the GPU properly, which
causes it not to resume properly and die a horrible complicated death.

(as a note: this only happens when there's both an eDP panel and MST
topology connected which is removed mid-suspend. One or the other seems
to always be OK).

We could try to fix the VDD wakeref leak, but this doesn't seem like
it's worth it at all since we aren't able to handle hotplug detection
while resuming anyway. So, let's go with a more robust solution inspired
by nouveau: block fbdev from handling hotplug events until we resume
fbdev. This allows us to still send sysfs hotplug events to be handled
later by user space while we're resuming, while also preventing us from
actually processing any hotplug events we receive until it's safe.

This fixes the wakeref leak observed on the T480s and as such, also
fixes suspend/resume with MST topologies connected on this machine.

Changes since v2:
* Don't call drm_fb_helper_hotplug_event() under lock, do it after lock
(Chris Wilson)
* Don't call drm_fb_helper_hotplug_event() in
intel_fbdev_output_poll_changed() under lock (Chris Wilson)
* Always set ifbdev->hpd_waiting (Chris Wilson)

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)")
Cc: Todd Previte <tprevite@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v3.17+
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129191001.442-2-lyude@redhat.com

drm/i915: Just use icl+ definition for PLANE_WM blocks field

The unused bits on PLANE_WM & co. are hardwired to zero. So no
need to worry about reading the extra bit on pre-icl.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205205056.30081-2-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

drm/i915: Bump skl+ wm blocks to 11 bits

On icl the plane watermark blocks field is 11 bits. Bump our define to
match so that readout won't ignore the extra bit. We can safely do this
for older platforms too since the unused bits are hardwired to zero.

Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205205056.30081-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

drm/i915/pmu: Fix enable count array size and bounds checking

Enable count array is supposed to have one counter for each possible
engine sampler. As such, array sizing and bounds checking is not correct
and would blow up the asserts if more samplers were added.

No ill-effect in the current code base but lets fix it for correctness.

At the same time tidy the assert for readability and robustness.

v2:
* One check per assert. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130353.21105-1-tvrtko.ursulin@linux.intel.com

drm/i915: W/A for underruns with WM1+ disabled on icl

Disabling WM1+ on ICL causes tons of underruns with
linear/X-tiled framebuffers. We can avoid this by flipping
on a chicken bit affecting the way the hw fill the FIFO.
This may not be the final solution but should hopefully
avoid some underruns in the meantime.

v2: Apparently PIPE_CHICKEN is icl+ only

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204202232.27153-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Setup PIPE_CHICKEN for fastsets too

Configure PIPE_CHICKEN during intel_update_pipe_config() to make
sure we have our chickens in a row with fastboot too.

v2: Apparently PIPE_CHICKEN is icl+ only

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204202214.27051-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Extract icl_set_pipe_chicken()

We need configure PIPE_CHICKEN during fastboot as well. Let's extract
it to a helper.

v2: Apparently PIPE_CHICKEN is icl+ only

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204202139.26884-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Fix wm latency==0 disable on skl+

When adding the early latency==0 check back I neglected to
realize that we no longer have a way to return a failure
from the wm computation like we had in the past (since we
now calculate wms before ddb allocations). Also plane_en
being false doesn't actually indicate that the level is
invalid as it wil also happen when the plane is not
enabled.

skl_allocate_pipe_ddb() starts scanning from the maximum
watermark level and it stops as soon as it finds a level
that is deemed viable. The assumption being that if level
n+1 is valid then level n is valid as well. Thus if we
now disable any watermark level by zeroing its latency
the code will think that level to be actually valid
and won't confirm whether the actually enabled lower
watermark level(s) actually fit into the allotted ddb
space. This results in hilarious watermark values that
exceed the ddb allocation of the plane.

The way we must now indicate a failure is to assign an
unreasoanbly big value to min_ddb_alloc which will then
make skl_allocate_pipe_ddb() reject the entire level.

v2: Also do the same for the lines>31 case (Matt)
v3: Make 'blocks' u32 (Matt)

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205155053.10081-1-ville.syrjala@linux.intel.com

drm/i915: Push clear_intel_crtc_state() onto the heap

clear_intel_crtc_state() uses the stack for saving a temporary copy of
certain bits of the inherited crtc_state before clearing the unwanted
bits. This pushes it over the stack limit for my little 32b Pineview,
so move the temporary allocation to the heap instead. As we now use a
zeroed struct, we can copy the whole extended state back to both
preserve what bits need to be preserved and zero the rest.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205092759.16018-1-chris@chris-wilson.co.uk

drm/i915: Include register polling in reg_rw traces

We generally omit register polling from the i915_reg_rw tracepoint.
Understandable since polling could generate a lot of noise in the
trace. The downside is that the trace is incomplete. As a compromise
let's trace the final register value observed while polling. That
should be generally sufficient to observe what the code should be
doing next.

I suppose in some cases it might make sense to also trace the initial
register value, and maybe the number of times we polled. But that
would require a separate tracepoint so let's leave it for the future.

The other users of _NOTRACE() are i915_pmu and i2c bitbanging,
which I decided to leave alone.

Next we should do something to claw back the tracepoints for
planes and whatnot which were switched to _FW() a while back.
I guess just new macros for raw_rw+trace. The question is
what to call it?

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204211644.21967-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: do not return invalid pointers as a *dentry

When calling debugfs functions, they can now return error values if
something went wrong. If that happens, return a NULL as a *dentry to
the relay core instead of passing it an illegal pointer.

The relay core should be able to handle an illegal pointer, but add this
check to be safe.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190131131507.GA19807@kroah.com

drm/i915: Rename HAS_GMCH

First of all GMCH can be considered a feature by itself
since it is a chip present in some platforms that connects
the IA processor to memory and other components in PC.

Also with the introduction of display block at device info,
we got a redundant definition:

.display.has_gmch_display = 1,

So, let's clean up things a bit and use the standardized
way of has_feature on displays side.

No functional change and no manual interaction to generate
this patch.

It is only:

sed -si -e 's/has_gmch_display/has_gmch/g' \
-e 's/HAS_GMCH_DISPLAY/HAS_GMCH/g' drivers/gpu/drm/i915/*{c,h}

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204222538.15842-1-rodrigo.vivi@intel.com

drm/i915: Pull i915_gem_active into the i915_active family

Looking forward, we need to break the struct_mutex dependency on
i915_gem_active. In the meantime, external use of i915_gem_active is
quite beguiling, little do new users suspect that it implies a barrier
as each request it tracks must be ordered wrt the previous one. As one
of many, it can be used to track activity across multiple timelines, a
shared fence, which fits our unordered request submission much better. We
need to steer external users away from the singular, exclusive fence
imposed by i915_gem_active to i915_active instead. As part of that
process, we move i915_gem_active out of i915_request.c into
i915_active.c to start separating the two concepts, and rename it to
i915_active_request (both to tie it to the concept of tracking just one
request, and to give it a longer, less appealing name).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk

drm/i915: Allocate active tracking nodes from a slabcache

Wrap the active tracking for a GPU references in a slabcache for faster
allocations, and hopefully better fragmentation reduction.

v3: Nothing device specific left, it's just a slabcache that we can
make global.
v4: Include i915_active.h and don't put the initfunc under DEBUG_GEM

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-4-chris@chris-wilson.co.uk

drm/i915: Release the active tracker tree upon idling

As soon as we detect that the active tracker is idle and we prepare to
call the retire callback, release the storage for our tree of
per-timeline nodes. We expect these to be infrequently used and quick
to allocate, so there is little benefit in keeping the tree cached and
we would prefer to return the pages back to the system in a timely
fashion.

This also means that when we finalize the struct as a whole, we know as
the activity tracker must be idle, the tree has already been released.
Indeed we can reduce i915_active_fini() just to the assertions that there
is nothing to do.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-3-chris@chris-wilson.co.uk

drm/i915: Generalise GPU activity tracking

We currently track GPU memory usage inside VMA, such that we never
release memory used by the GPU until after it has finished accessing it.
However, we may want to track other resources aside from VMA, or we may
want to split a VMA into multiple independent regions and track each
separately. For this purpose, generalise our request tracking (akin to
struct reservation_object) so that we can embed it into other objects.

v2: Tweak error handling during selftest setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk

drm/i915/selftests: Exercise some AB...BA preemption chains

Build a chain using 2 contexts (A, B) then request a preemption such
that a later A request runs before the spinner in B.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205123835.25331-1-chris@chris-wilson.co.uk

drm/i915/selftests: Context SSEU reconfiguration tests

Exercise the context image reconfiguration logic for idle and busy
contexts, with the resets thrown into the mix as well.

Free from the uAPI restrictions this test runs on all Gen9+ platforms
with slice power gating.

v2:
* Rename some helpers for clarity.
* Include subtest names in error logs.
* Remove unnecessary function export.

v3:
* Rebase for RUNTIME_INFO.

v4:
* Fix incomplete unexport from v2. (Chris Wilson)

v5:
* Rebased for runtime pm api changes.

v6:
* Rebased for i915_reset.c.

v7:
* Tidy checkpatch warnings.
* Consolidate error checking and logging a bit.
* Skip idle test phase if something failed before it.

v8:
(Chris Wilson)
* Fix i915_request_wait error handling.
* No need to PIN_HIGH the VMA.
* Remove pointless GEM_BUG_ON before pointer dereference.

v9:
* Avoid rq leak if rpcs query fails. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> # v6
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-5-tvrtko.ursulin@linux.intel.com

drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only)

We want to allow userspace to reconfigure the subslice configuration on a
per context basis.

This is required for the functional requirement of shutting down non-VME
enabled sub-slices on Gen11 parts.

To do so, we expose a context parameter to allow adjustment of the RPCS
register stored within the context image (and currently not accessible via
LRI).

If the context is adjusted before first use or whilst idle, the adjustment
is for "free"; otherwise if the context is active we queue a request to do
so (using the kernel context), following all other activity by that
context, which is also marked as barrier for all following submission
against the same context.

Since the overhead of device re-configuration during context switching can
be significant, especially in multi-context workloads, we limit this new
uAPI to only support the Gen11 VME use case. In this use case either the
device is fully enabled, and exactly one slice and half of the subslices
are enabled.

Example usage:

struct drm_i915_gem_context_param_sseu sseu = { };
struct drm_i915_gem_context_param arg = {
.param = I915_CONTEXT_PARAM_SSEU,
.ctx_id = gem_context_create(fd),
.size = sizeof(sseu),
.value = to_user_pointer(&sseu)
};

/* Query device defaults. */
gem_context_get_param(fd, &arg);

/* Set VME configuration on a 1x6x8 part. */
sseu.slice_mask = 0x1;
sseu.subslice_mask = 0xe0;
gem_context_set_param(fd, &arg);

v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu()
    (Lionel)

v3: Add ability to program this per engine (Chris)

v4: Move most get_sseu() into i915_gem_context.c (Lionel)

v5: Validate sseu configuration against the device's capabilities (Lionel)

v6: Change context powergating settings through MI_SDM on kernel context
    (Chris)

v7: Synchronize the requests following a powergating setting change using
    a global dependency (Chris)
    Iterate timelines through dev_priv.gt.active_rings (Tvrtko)
    Disable RPCS configuration setting for non capable users
    (Lionel/Tvrtko)

v8: s/union intel_sseu/struct intel_sseu/ (Lionel)
    s/dev_priv/i915/ (Tvrtko)
    Change uapi class/instance fields to u16 (Tvrtko)
    Bump mask fields to 64bits (Lionel)
    Don't return EPERM when dynamic sseu is disabled (Tvrtko)

v9: Import context image into kernel context's ppgtt only when
    reconfiguring powergated slice/subslices (Chris)
    Use aliasing ppgtt when needed (Michel)

Tvrtko Ursulin:

v10:
* Update for upstream changes.
* Request submit needs a RPM reference.
* Reject on !FULL_PPGTT for simplicity.
* Pull out get/set param to helpers for readability and less indent.
* Use i915_request_await_dma_fence in add_global_barrier to skip waits
   on the same timeline and avoid GEM_BUG_ON.
* No need to explicitly assign a NULL pointer to engine in legacy mode.
* No need to move gen8_make_rpcs up.
* Factored out global barrier as prep patch.
* Allow to only CAP_SYS_ADMIN if !Gen11.

v11:
* Remove engine vfunc in favour of local helper. (Chris Wilson)
* Stop retiring requests before updates since it is not needed
   (Chris Wilson)
* Implement direct CPU update path for idle contexts. (Chris Wilson)
* Left side dependency needs only be on the same context timeline.
   (Chris Wilson)
* It is sufficient to order the timeline. (Chris Wilson)
* Reject !RCS configuration attempts with -ENODEV for now.

v12:
* Rebase for make_rpcs.

v13:
* Centralize SSEU normalization to make_rpcs.
* Type width checking (uAPI <-> implementation).
* Gen11 restrictions uAPI checks.
* Gen11 subslice count differences handling.
Chris Wilson:
* args->size handling fixes.
* Update context image from GGTT.
* Postpone context image update to pinning.
* Use i915_gem_active_raw instead of last_request_on_engine.

v14:
* Add activity tracker on intel_context to fix the lifetime issues
   and simplify the code. (Chris Wilson)

v15:
* Fix context pin leak if no space in ring by simplifying the
   context pinning sequence.

v16:
* Rebase for context get/set param locking changes.
* Just -ENODEV on !Gen11. (Joonas)

v17:
* Fix one Gen11 subslice enablement rule.
* Handle error from i915_sw_fence_await_sw_fence_gfp. (Chris Wilson)

v18:
* Update commit message. (Joonas)
* Restrict uAPI to VME use case. (Joonas)

v19:
* Rebase.

v20:
* Rebase for ce->active_tracker.

v21:
* Rebase for IS_GEN changes.

v22:
* Reserve uAPI for flags straight away. (Chris Wilson)

v23:
* Rebase for RUNTIME_INFO.

v24:
* Added some headline docs for the uapi usage. (Joonas/Chris)

v25:
* Renamed class/instance to engine_class/engine_instance to avoid clash
   with C++ keyword. (Tony Ye)

v26:
* Rebased for runtime pm api changes.

v27:
* Rebased for intel_context_init.
* Wrap commit msg to 75.

v28:
(Chris Wilson)
* Use i915_gem_ggtt.
* Use i915_request_await_dma_fence to show a better example.

v29:
* i915_timeline_set_barrier can now fail. (Chris Wilson)

v30:
* Capture some acks.

v31:
* Drop the WARN_ON from use controllable paths. (Chris Wilson)
* Use overflows_type for all checks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107634
Issue: https://github.com/intel/media-driver/issues/267
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Zhipeng Gong <zhipeng.gong@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Stéphane Marchesin <marcheu@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-4-tvrtko.ursulin@linux.intel.com

drm/i915: Add timeline barrier support

Timeline barrier allows serialization between different timelines.

After calling i915_timeline_set_barrier with a request, all following
submissions on this timeline will be set up as depending on this request,
or barrier. Once the barrier has been completed it automatically gets
cleared and things continue as normal.

This facility will be used by the upcoming context SSEU code.

v2:
* Assert barrier has been retired on timeline_fini. (Chris Wilson)
* Fix mock_timeline.

v3:
* Improved comment language. (Chris Wilson)

v4:
* Maintain ordering with previous barriers set on the timeline.

v5:
* Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-3-tvrtko.ursulin@linux.intel.com

drm/i915/perf: lock powergating configuration to default when active

If some of the contexts submitting workloads to the GPU have been
configured to shutdown slices/subslices, we might loose the NOA
configurations written in the NOA muxes.

One possible solution to this problem is to reprogram the NOA muxes
when we switch to a new context. We initially tried this in the
workaround batchbuffer but some concerns where raised about the cost
of reprogramming at every context switch. This solution is also not
without consequences from the userspace point of view. Reprogramming
of the muxes can only happen once the powergating configuration has
changed (which happens after context switch). This means for a window
of time during the recording, counters recorded by the OA unit might
be invalid. This requires userspace dealing with OA reports to discard
the invalid values.

Minimizing the reprogramming could be implemented by tracking of the
last programmed configuration somewhere in GGTT and use MI_PREDICATE
to discard some of the programming commands, but the command streamer
would still have to parse all the MI_LRI instructions in the
workaround batchbuffer.

Another solution, which this change implements, is to simply disregard
the user requested configuration for the period of time when i915/perf
is active.

On most platforms there are no issues with this apart from a performance
penality for some media workloads that benefit from running on a partially
powergated GPU. We already prevent RC6 from affecting the programming so
it doesn't sound completely unreasonable to hold on powergating for the
same reason.

On Icelake however there would a functional problem if the slices not-
containing the VME block were left enabled with a running media workload
which explicitly disabled them. To avoid a GPU hang in this case, on
Icelake we lock the enablement to only slices which contain VME blocks.
Downside is that it means degraded GPU performance when OA is active but
there is no known alternative solution for this.

v2: Leave RPCS programming in intel_lrc.c (Lionel)

v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel)
More to_intel_context() (Tvrtko)
s/dev_priv/i915/ (Tvrtko)

Tvrtko Ursulin:

v4:
* Rebase for make_rpcs changes.

v5:
* Apply OA restriction from make_rpcs directly.

v6:
* Rebase for context image setup changes.

v7:
* Move stream assignment before metric enable.

v8-9:
* Rebase.

v10:
* Squashed with ICL support patch.

Bspec: 21140
Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v9
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-2-tvrtko.ursulin@linux.intel.com

drm/i915: Record the sseu configuration per-context & engine

We want to expose the ability to reconfigure the slices, subslice and
eu per context and per engine. To facilitate that, store the current
configuration on the context for each engine, which is initially set
to the device default upon creation.

v2: record sseu configuration per context & engine (Chris)

v3: introduce the i915_gem_context_sseu to store powergating
    programming, sseu_dev_info has grown quite a bit (Lionel)

v4: rename i915_gem_sseu into intel_sseu (Chris)
    use to_intel_context() (Chris)

v5: More to_intel_context() (Tvrtko)
    Switch intel_sseu from union to struct (Tvrtko)
    Move context default sseu in existing loop (Chris)

v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko)

Tvrtko Ursulin:

v7:
* Pass intel_sseu by pointer instead of value to make_rpcs.
* Rebase for make_rpcs changes.

v8:
* Rebase for RPCS edit on pin.

v9:
* Rebase for context image setup changes.

v10:
* Rename dev_priv to i915. (Chris Wilson)

v11:
* Rebase.

v12:
* Rebase for IS_GEN changes.

v13:
* Rebase for RUNTIME_INFO.

v14:
* Rebase for intel_context_init.

v15:
* Rebase for drm-tip changes.

v16:
* Moved struct intel_sseu definition to i915_gem_context.h.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-1-tvrtko.ursulin@linux.intel.com

drm/i915: Trim NEWCLIENT boosting

Limit the NEWCLIENT boost to only give its small priority boost to fresh
clients only that have no dependencies.

The idea for using NEWCLIENT boosting, commit b16c765122f9 ("drm/i915:
Priority boost for new clients"), is that short-lived streams are often
interactive and require lower latency -- and that by executing those
ahead of the long running hogs, the short-lived clients do little to
interfere with the system throughput by virtue of their short-lived
nature. However, we were only considering the client's own timeline for
determining whether or not it was a fresh stream. This allowed for
compositors to wake up before their vblank and bump all of its client
streams. However, in testing with media-bench this results in chaining
all cooperating contexts together preventing us from being able to
reorder contexts to reduce bubbles (pipeline stalls), overall increasing
latency, and reducing system throughput. The exact opposite of our
intent. The compromise of applying the NEWCLIENT boost to strictly fresh
clients (that do not wait upon anything else) should maintain the
"real-time response under load" characteristics of FQ_CODEL, without
locking together the long chains of dependencies across the system.

References: b16c765122f9 ("drm/i915: Priority boost for new clients")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204150101.30759-1-chris@chris-wilson.co.uk

drm/i915: Allow normal clients to always preempt idle priority clients

When first enabling preemption, we hesitated from making it a free-for-all
where every higher priority client would force a preempt-to-idle cycle
and take over from all lower priority clients. We hesitated because we
were uncertain just how well preemption would work in practice, whether
the preemption latency itself would detract from the latency gains for
higher priority tasks and whether it would work at all. Since
introducing preemption, we have been enabling it for more common tasks,
even giving normal clients a small preemptive boost when they first
start (to aide fairness and improve interactivity). Now lets take one
step further and give permission for all normal (priority:0) clients to
preempt any idle (priority:<0) task so that users running long compute
jobs do not overly impact other jobs (i.e. their desktop) and the system
remains responsive under such idle loads.

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
References: b16c765122f9 ("drm/i915: Priority boost for new clients")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: "Stead, Alan" <alan.stead@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190204084116.3013-1-chris@chris-wilson.co.uk

drm/i915: Update DRIVER_DATE to 20190202

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/cfl: Adding another PCI Device ID.

While cross checking PCI IDs from Intel Media SDK
and kernel Dmitry noticed this gap. So we checked the
spec and this new ID had been recently added.

v2: Adding new H_GT1 entry to i915_pci.c (Jose)

Reported-by: Dmitry Rogozhkin<dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Rogozhkin<dmitry.v.rogozhkin@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190201235049.27206-1-rodrigo.vivi@intel.com

Merge tag 'gvt-next-2019-02-01' of https://github.com/intel/gvt-linux into drm-intel-next-queued

gvt-next-2019-02-01

- new VFIO EDID region support (Henry)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190201061523.GE5588@zhen-hp.sh.intel.com

drm/i915: Enable fastboot by default on VLV and CHV

We really want to have fastboot enabled by default to avoid an ugly
modeset during boot.

Currently we are enabling fastboot by default on gen9+ (Skylake and newer).
The intention is to enable it on older generations after it has seen more
testing on gen9+.

VLV and CHV devices are still being sold in stores today, as such it is
desirable to also enable fastboot by default on these now.

I've extensively tested fastboot=1 support on over 50 different
Bay- and Cherry-Trail devices. Testing DSI and eDP panels as well as
HDMI output (and even DP over Type-C on one device).

All 50 devices work fine with fastboot=1. On 2 devices their DSI panel
turns black as soon as the i915 driver loads when fastboot=0, so having
fastboot enabled is required for these 2 to work properly (for lack of
a better fix).

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129142237.8684-1-hdegoede@redhat.com

drm/i915/icl: restore WaEnableFloatBlendOptimization

Enables blend optimization for floating point RTs

This restores the workaround that was reverted in c358514ba8da
("Revert "drm/i915/icl: WaEnableFloatBlendOptimization"").

The revert was due to the register write seemingly not sticking,
but the HW team has confirmed that this is because the
register is WO and that the workaround is indeed required.

Here the wa is added with a mask of 0 since the register is WO.

References: https://hsdes.intel.com/resource/1408134172
References: https://bugs.freedesktop.org/show_bug.cgi?id=107338
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Talha Nassar <talha.nassar@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1548983324-15344-4-git-send-email-talha.nassar@intel.com

drm/i915: Save some lines of source code in workarounds

No functional or code size change - just notice we can compact the source
by re-using a single helper for adding workarounds.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1548983324-15344-3-git-send-email-talha.nassar@intel.com

drm/i915: Move workaround infrastructure code up

Top comment in intel_workarounds.c says common code should come first so
lets respect that. Also, by moving the common code together opportunities
to reduce duplication will become more obvious.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1548983324-15344-2-git-send-email-talha.nassar@intel.com

drm/i915/icl: Work around broken VBTs for port F detection

VBT may include incorrect information about the presence of port F. Work
around this on SKUs where we know the port is not present.

v2:
- Fix IS_ICL_WITH_PORT_F, so it's useable from any context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108915
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181220155211.31456-1-imre.deak@intel.com

drm/i915/icl: Add TypeC ports only if VBT is present

We can't safely probe Type C ports, whether they are a legacy or a
USB/Thunderbolt DP Alternate Type C port. This would require performing
the TypeC connect sequence - as described by the specification - but
that may have unwanted side-effects. These side-effects include at least
- without completeness - timeouts during AUX power well enabling and
subsequent PLL enabling errors.

To safely identify these ports we really need VBT, which has the proper
flag for this (ddi_vbt_port_info::supports_typec_usb, supports_tbt).
Based on the above disable Type C ports if we can't load VBT for some
reason.

v2:
- Notice that we disable TypeC ports completely and simplify accordingly
(Jose).
- Add code comment explaining why we disabled the ports. (Jani)

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Jose Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128114242.28666-1-imre.deak@intel.com

drm/i915: Pick the first unused PLL once again

commit 5b0bd14dcc6b ("drm/i915/icl: keep track of unused pll while
looping") inadvertently (I presume) changed the code to pick the
last unused dpll rather than the first unused one like we did before.

While there should most likely be no harm in changing the order
let's change back just to avoid a change in the behaviour. At
least it might reduce the confusion when staring at logs (took
me a while to figure out why DPLL1 being picked over DPLL0
when the latter was most definitely available).

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190130181359.20693-1-ville.syrjala@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: Don't use the second dbuf slice on icl

The code managing the dbuf slices is borked and needs some
real work to fix. In the meantime let's just stop using the
second slice.

v2: Drop the change to intel_enabled_dbuf_slices_num() (Mahesh)

Cc: Mahesh Kumar <mahesh1.sh.kumar@gmail.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190130155110.12918-1-ville.syrjala@linux.intel.com
Reviewed-by: Mahesh Kumar <mahesh1.sh.kumar@gmail.com>

drm/i915/gvt: add VFIO EDID region

Implement VFIO EDID region for vgpu. Support EDID blob update and notify
guest on link state change via hotplug event.

v3: move struct edid_region to kvmgt.c <zhenyu>
v2: add EDID sanity check and size update <zhenyu>

Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: add hotplug emulation

Add function to emulate hotplug interrupt for SKL/KBL platforms

Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: add functions to get default resolution

These functions will get default resolution according to vgpu type.

Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Apply LUT validation checks to platforms more accurately (v3)

Use of the new DRM_COLOR_LUT_NON_DECREASING test was a bit over-zealous;
it doesn't actually need to be applied to the degamma on "bdw-style"
platforms.  Likewise, we overlooked the fact that CHV should have that
test applied to the gamma LUT as well as the degamma LUT.

Rather than adding more complicated platform checking to
intel_color_check(), let's just store the appropriate set of LUT
validation flags for each platform in the intel_device_info structure.

v2:
- Shuffle around LUT size tests so that the hardware-specific tests
   won't be applied to legacy gamma tables.  (Ville)
- Add a debug message so that it will be easier to understand why an
   atomic transaction involving incorrectly-sized LUT's got rejected
   by the driver.

v3:
- Switch size_t's to int's.  (Ville)

Fixes: 85e2d61e4976 ("drm/i915: Validate userspace-provided color management LUT's (v4)")
References: https://lists.freedesktop.org/archives/intel-gfx/2019-January/187634.html
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190130181022.4291-1-matthew.d.roper@intel.com

drm/i915: Force background color to black for gen9+ (v2)

We don't yet allow userspace to control the CRTC background color, but
we should manually program the color to black to ensure the BIOS didn't
leave us with some other color. We should also set the pipe gamma and
pipe CSC bits so that the background color goes through the same color
management transformations that a plane with black pixels would.

v2: Rename register to SKL_BOTTOM_COLOR to more closely follow
bspec naming. (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190130185122.10322-2-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Update DRIVER_DATE to 20190129

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Use IS_GEN9_LP() for the linetime w/a check

IS_GLK||IS_BXT == IS_GEN9_LP

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-10-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Drop the pointless linetime==0 check

0*whatever==0 so this check is pointless. Remove it.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-9-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Drop the definite article in front of SAGV

The spec doesn't use a definite article in front of SAGV. The
rules regarding articles and initialisms are super fuzzy, but
at least to my ears it sounds much more natural to not have
the article. Perhaps because I tend to pronounce it as
"sag-vee" instead of spelling out the letters one at a time.
Actually I might still prefer to leave out the article if I
did spell them out.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-8-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Pass dev_priv to skl_needs_memory_bw_wa()

skl_needs_memory_bw_wa() doesn't look at the passed in state at all.
Possibly it should, but for now let's make life simpler by just
passing in dev_priv.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-7-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Account for minimum ddb allocation restrictions

On icl+ bspec tells us to calculate a separate minimum ddb
allocation from the blocks watermark. Both have to be checked
against the actual ddb allocation, but since we do things the
other way around we'll just calculat the minimum acceptable
ddb allocation by taking the maximum of the two values.

We'll also replace the memcmp() with a full trawl over the
the watermarks so that it'll ignore the min_ddb_alloc
because we can't directly read that out from the hw. I suppose
we could reconstruct it from the other values, but I was
too lazy to do that now.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-6-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Fix > vs >= mismatch in watermark/ddb calculations

Bspec says we have to reject the watermark if it's >= the ddb
allocation. Fix the code to reject the == case as it should.
For transition watermarks we can just use >=, for the rest
we'll do +1 when calculating the minimum ddb allocation size.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-5-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Fix bits vs. bytes mixup in dbuf block size computation

The spec used to say "8bpp" which someone took to mean 8 bytes per
pixel when in fact it was supposed to be 8 bits per pixel. The
spec has been updated to make it more clear now. Fix the code
to match.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-4-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Reinstate an early latency==0 check for skl+

I thought we could remove all the early latency==0 checks
and rely on skl_wm_method{1,2}() checking for it. But
skl_compute_plane_wm() applies a bunch of workarounds to bump
up the latency before calling those guys so clearly it won't
end up doing the right thing. Also not sure if the calculations
based on the method1/2 results are safe agaisnt overflows so
it might not work all that well in any case. Let's put the
early check back.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-3-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Don't ignore level 0 lines watermark for glk+

On glk+ the level 0 lines watermark actually matters. Do not ignore it.
And while at it let's change things so that we always program a
consistnet 0 to the register when the lines watermarks is ignored
by the hardware.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-2-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915/icl: keep track of unused pll while looping

Instead of looping again on the range of plls, just keep track of one
unused one and use it later.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125222444.19926-5-lucas.demarchi@intel.com

drm/i915/icl: remove dpll from clk_sel

We should not pass DPLL_ID_ICL_DPLL0 or DPLL_ID_ICL_DPLL1 to this
function because the path is only taken for non-combophy ports. Let the
warning trigger if improper value is given.

While at it, rename the function to match the register name we are
trying to program.

v2: fix typo in comment

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125222444.19926-4-lucas.demarchi@intel.com

drm/i915: always return something on DDI clock selection

Even if we don't have the correct clock and get a warning, we should not
skip the return.

v2: improve commit message (from Joonas)

Fixes: 1fa11ee2d9d0 ("drm/i915/icl: start adding the TBT pll")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125222444.19926-3-lucas.demarchi@intel.com

drm/i915/icl: use tc_port in MG_PLL macros

Fix the TODO leftover in the code by changing the argument in MG_PLL
macros. The MG_PLL ids used to access the register values can be
converted from tc_port rather than port.

All these registers can use the TC port to calculate the right offsets
because they are only available for TC ports. The range (PORT_C onwards)
may not be stable and change from platform to platform. So by using the
TC id directly we avoid having to check for the platform in the "leaf
functions" and thus passing dev_priv around.

The helper functions were also renamed to use "tc" as prefix to make
them more generic.

v2: Improve commit message and fix checkpatch warning (from Paulo)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125222444.19926-2-lucas.demarchi@intel.com

drm/i915: Drop fake breadcrumb irq

Missed breadcrumb detection is defunct due to the tight coupling with
dma_fence signaling and the myriad ways we may signal fences from
everywhere but from an interrupt, i.e. we frequently signal a fence
before we even see its interrupt. This means that even if we miss an
interrupt for a fence, it still is signaled before our breadcrumb
hangcheck fires, so simplify the breadcrumb hangchecking by moving it
into the GPU hangcheck and forgo fake interrupts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-3-chris@chris-wilson.co.uk

drm/i915: Replace global breadcrumbs with per-context interrupt tracking

A few years ago, see commit 688e6c725816 ("drm/i915: Slaughter the
thundering i915_wait_request herd"), the issue of handling multiple
clients waiting in parallel was brought to our attention. The
requirement was that every client should be woken immediately upon its
request being signaled, without incurring any cpu overhead.

To handle certain fragility of our hw meant that we could not do a
simple check inside the irq handler (some generations required almost
unbounded delays before we could be sure of seqno coherency) and so
request completion checking required delegation.

Before commit 688e6c725816, the solution was simple. Every client
waiting on a request would be woken on every interrupt and each would do
a heavyweight check to see if their request was complete. Commit
688e6c725816 introduced an rbtree so that only the earliest waiter on
the global timeline would woken, and would wake the next and so on.
(Along with various complications to handle requests being reordered
along the global timeline, and also a requirement for kthread to provide
a delegate for fence signaling that had no process context.)

The global rbtree depends on knowing the execution timeline (and global
seqno). Without knowing that order, we must instead check all contexts
queued to the HW to see which may have advanced. We trim that list by
only checking queued contexts that are being waited on, but still we
keep a list of all active contexts and their active signalers that we
inspect from inside the irq handler. By moving the waiters onto the fence
signal list, we can combine the client wakeup with the dma_fence
signaling (a dramatic reduction in complexity, but does require the HW
being coherent, the seqno must be visible from the cpu before the
interrupt is raised - we keep a timer backup just in case).

Having previously fixed all the issues with irq-seqno serialisation (by
inserting delays onto the GPU after each request instead of random delays
on the CPU after each interrupt), we can rely on the seqno state to
perfom direct wakeups from the interrupt handler. This allows us to
preserve our single context switch behaviour of the current routine,
with the only downside that we lose the RT priority sorting of wakeups.
In general, direct wakeup latency of multiple clients is about the same
(about 10% better in most cases) with a reduction in total CPU time spent
in the waiter (about 20-50% depending on gen). Average herd behaviour is
improved, but at the cost of not delegating wakeups on task_prio.

v2: Capture fence signaling state for error state and add comments to
warm even the most cold of hearts.
v3: Check if the request is still active before busywaiting
v4: Reduce the amount of pointer misdirection with list_for_each_safe
and using a local i915_request variable inside the loops
v5: Add a missing pluralisation to a purely informative selftest message.

References: 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk

drm/i915: Remove the intel_engine_notify tracepoint

The global seqno is defunct and so we have no meaningful indicator of
forward progress for an engine. You need to listen to the request
signaling tracepoints instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-1-chris@chris-wilson.co.uk

drm/i915/tv: Bypass the vertical filter if possible

Let's switch the pipe into interlaced mode and switch off
the TV encoder vertical filter if the pipe vdisplay
matches the TV YSIZE exactly.

While I didn't measure it I presume this might reduce
the power consumption a little bit, and the pixel rate
is halved as the pipe will now fetching in interlaced
mode rather than in progressive mode (effectively the
same difference as between IF-ID vs. PF-ID pfit modes
on more modern hardware) so a bit easier on the memory
bandwidth.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129141913.5515-2-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/tv: Fix adjusted_mode dotclock for interlaced modes

intel_tv_mode_to_mode() assumes the pipe will be in progressive
fetch mode, and thus when programming the pipe into interlaced
mode we have to halve the calculated dotclock to get the correct
field duration.

This becomes more important when we start to program the pipe
into interlaced mode on i965gm as we depend on the timestamps
to get accurate frame counter values. Withot halving the clock
our guesstimated frame counter would tick at twice the expected
speed.

Cc: Imre Deak <imre.deak@intel.com>
Fixes: 690157f0a9e7 ("drm/i915/tv: Fix >1024 modes on gen3")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129141913.5515-1-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm: Constify drm_color_lut_check()

drm_color_lut_check() doens't modify the passed in blob so
let's make it const.

Also s/uint32_t/u32/ while at it.

v2: Reduce line wraps (Sam)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129170609.5718-1-ville.syrjala@linux.intel.com
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

drm/i915/execlists: Suppress preempting self

In order to avoid preempting ourselves, we currently refuse to schedule
the tasklet if we reschedule an inflight context. However, this glosses
over a few issues such as what happens after a CS completion event and
we then preempt the newly executing context with itself, or if something
else causes a tasklet_schedule triggering the same evaluation to
preempt the active context with itself.

However, when we avoid preempting ELSP[0], we still retain the preemption
value as it may match a second preemption request within the same time period
that we need to resolve after the next CS event. However, since we only
store the maximum preemption priority seen, it may not match the
subsequent event and so we should double check whether or not we
actually do need to trigger a preempt-to-idle by comparing the top
priorities from each queue. Later, this gives us a hook for finer
control over deciding whether the preempt-to-idle is justified.

The sequence of events where we end up preempting for no avail is:

1. Queue requests/contexts A, B
2. Priority boost A; no preemption as it is executing, but keep hint
3. After CS switch, B is less than hint, force preempt-to-idle
4. Resubmit B after idling

v2: We can simplify a bunch of tests based on the knowledge that PI will
ensure that earlier requests along the same context will have the highest
priority.
v3: Demonstrate the stale preemption hint with a selftest

References: a2bf92e8cc16 ("drm/i915/execlists: Avoid kicking priority on the current context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-4-chris@chris-wilson.co.uk

drm/i915: Rename execlists->queue_priority to queue_priority_hint

After noticing that we trigger preemption events for currently executing
requests, as well as requests that complete before the preemption and
attempting to suppress those preemption events, it is wise to not
consider the queue_priority to be authoritative. As we only track the
maximum priority seen between dequeue passes, if the maximum priority
request is no longer available for dequeuing (it completed or is even
executing on another engine), we have no knowledge of the previous
queue_priority as it would require us to keep a full history of enqueued
requests -- but we already have that history in the priolists!

Rename the queue_priority to queue_priority_hint so that we do not
confuse it as being exactly the maximum priority in the queue, but merely
an indication that we have seen a new maximum priority value and as such
we should check whether it should preempt the currently running request.

v2: s/preempt_priority_hint/queue_priority_hint/ as preempt implies it
being only used for the singular task of preemption and not the wider
question of waking up due to a change in the queue.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-3-chris@chris-wilson.co.uk

drm/i915: Identify active requests

To allow requests to forgo a common execution timeline, one question we
need to be able to answer is "is this request running?". To track
whether a request has started on HW, we can emit a breadcrumb at the
beginning of the request and check its timeline's HWSP to see if the
breadcrumb has advanced past the start of this request. (This is in
contrast to the global timeline where we need only ask if we are on the
global timeline and if the timeline has advanced past the end of the
previous request.)

There is still confusion from a preempted request, which has already
started but relinquished the HW to a high priority request. For the
common case, this discrepancy should be negligible. However, for
identification of hung requests, knowing which one was running at the
time of the hang will be much more important.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk

drm/i915/selftests: Apply a subtest filter

In bringup on simulated HW even rudimentary tests are slow, and so many
may fail that we want to be able to filter out the noise to focus on the
specific problem. Even just the tests groups provided for igt is not
specific enough, and we would like to isolate one particular subtest
(and probably subsubtests!). For simplicity, allow the user to provide a
command line parameter such as

i915.st_filter=i915_timeline_mock_selftests/igt_sync

to restrict ourselves to only running on subtest. The exact name to use
is given during a normal run, highlighted as an error if it failed,
debug otherwise. The test group is optional, and then all subtests are
compared for an exact match with the filter (most subtests have unique
names). The filter can be negated, e.g. i915.st_filter=!igt_sync and
then all tests but those that match will be run. More than one match can
be supplied separated by a comma, e.g.

i915.st_filter=igt_vma_create,igt_vma_pin1

to only run those specified, or

i915.st_filter=!igt_vma_create,!igt_vma_pin1

to run all but those named. Mixing a blacklist and whitelist will only
execute those subtests matching the whitelist so long as they are
previously excluded in the blacklist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-1-chris@chris-wilson.co.uk

Merge drm/drm-next into drm-intel-next-queued

A backmerge to unblock gen8+ semaphores.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Fix skl srckey mask bits

We're incorrectly masking off the R/V channel enable bit from
KEYMSK. Fix it up.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: b20815255693 ("drm/i915: Add plane alpha blending support, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125183846.28755-1-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

drm/i915: Enable fastboot by default on Skylake and newer

We really want to have fastboot enabled by default to avoid an ugly
modeset during boot.

Rather then enabling it everywhere, lets start with enabling it on
Skylake and newer.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190124130114.3967-1-maarten.lankhorst@linux.intel.com

drm/i915: Track active timelines

Now that we pin timelines around use, we have a clearly defined lifetime
and convenient points at which we can track only the active timelines.
This allows us to reduce the list iteration to only consider those
active timelines and not all.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-6-chris@chris-wilson.co.uk

drm/i915: Track the context's seqno in its own timeline HWSP

Now that we have allocated ourselves a cacheline to store a breadcrumb,
we can emit a write from the GPU into the timeline's HWSP of the
per-context seqno as we complete each request. This drops the mirroring
of the per-engine HWSP and allows each context to operate independently.
We do not need to unwind the per-context timeline, and so requests are
always consistent with the timeline breadcrumb, greatly simplifying the
completion checks as we no longer need to be concerned about the
global_seqno changing mid check.

One complication though is that we have to be wary that the request may
outlive the HWSP and so avoid touching the potentially danging pointer
after we have retired the fence. We also have to guard our access of the
HWSP with RCU, the release of the obj->mm.pages should already be RCU-safe.

At this point, we are emitting both per-context and global seqno and
still using the single per-engine execution timeline for resolving
interrupts.

v2: s/fake_complete/mark_complete/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk

drm/i915: Share per-timeline HWSP using a slab suballocator

If we restrict ourselves to only using a cacheline for each timeline's
HWSP (we could go smaller, but want to avoid needless polluting
cachelines on different engines between different contexts), then we can
suballocate a single 4k page into 64 different timeline HWSP. By
treating each fresh allocation as a slab of 64 entries, we can keep it
around for the next 64 allocation attempts until we need to refresh the
slab cache.

John Harrison noted the issue of fragmentation leading to the same worst
case performance of one page per timeline as before, which can be
mitigated by adopting a freelist.

v2: Keep all partially allocated HWSP on a freelist

This is still without migration, so it is possible for the system to end
up with each timeline in its own page, but we ensure that no new
allocation would needless allocate a fresh page!

v3: Throw a selftest at the allocator to try and catch invalid cacheline
reuse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-4-chris@chris-wilson.co.uk

drm/i915: Allocate a status page for each timeline

Allocate a page for use as a status page by a group of timelines, as we
only need a dword of storage for each (rounded up to the cacheline for
safety) we can pack multiple timelines into the same page. Each timeline
will then be able to track its own HW seqno.

v2: Reuse the common per-engine HWSP for the solitary ringbuffer
timeline, so that we do not have to emit (using per-gen specialised
vfuncs) the breadcrumb into the distinct timeline HWSP and instead can
keep on using the common MI_STORE_DWORD_INDEX. However, to maintain the
sleight-of-hand for the global/per-context seqno switchover, we will
store both temporarily (and so use a custom offset for the shared timeline
HWSP until the switch over).

v3: Keep things simple and allocate a page for each timeline, page
sharing comes next.

v4: I was caught repeating the same MI_STORE_DWORD_IMM over and over
again in selftests.

v5: And caught red handed copying create timeline + check.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-3-chris@chris-wilson.co.uk

drm/i915: Enlarge vma->pin_count

Previously we only accommodated having a vma pinned by a small number of
users, with the maximum being pinned for use by the display engine. As
such, we used a small bitfield only large enough to allow the vma to
be pinned twice (for back/front buffers) in each scanout plane. Keeping
the maximum permissible pin_count small allows us to quickly catch a
potential leak. However, as we want to split a 4096B page into 64
different cachelines and pin each cacheline for use by a different
timeline, we will exceed the current maximum permissible vma->pin_count
and so time has come to enlarge it.

Whilst we are here, try to pull together the similar bits:

Address/layout specification:
- bias, mappable, zone_4g: address limit specifiers
- fixed: address override, limits still apply though
- high: not strictly an address limit, but an address direction to search

Search controls:
- nonblock, nonfault, noevict

v2: Rewrite the guideline comment on bit consumption.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <john.C.Harrison@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-2-chris@chris-wilson.co.uk

drm/i915: Introduce concept of per-timeline (context) HWSP

Supplement the per-engine HWSP with a per-timeline HWSP. That is a
per-request pointer through which we can check a local seqno,
abstracting away the presumption of a global seqno. In this first step,
we point each request back into the engine's HWSP so everything
continues to work with the global timeline.

v2: s/i915_request_hwsp/hwsp_seqno/ to emphasis that this is the current
HW value and that we are accessing it via i915_request merely as a
convenience.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-1-chris@chris-wilson.co.uk

drm/i915: Move list of timelines under its own lock

Currently, the list of timelines is serialised by the struct_mutex, but
to alleviate difficulties with using that mutex in future, move the
list management under its own dedicated mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-5-chris@chris-wilson.co.uk

drm/i915: Always allocate an object/vma for the HWSP

Currently we only allocate an object and vma if we are using a GGTT
virtual HWSP, and a plain struct page for a physical HWSP. For
convenience later on with global timelines, it will be useful to always
have the status page being tracked by a struct i915_vma. Make it so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk

drm/i915: Move vma lookup to its own lock

Remove the struct_mutex requirement for looking up the vma for an
object.

v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk

drm/i915: Pull VM lists under the VM mutex.

A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk

drm/i915: Stop tracking MRU activity on VMA

Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.

Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).

v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk

drm/i915: Try to sanitize bogus DPLL state left over by broken SNB BIOSen

Certain SNB machines (eg. ASUS K53SV) seem to have a broken BIOS
which misprograms the hardware badly when encountering a suitably
high resolution display. The programmed pipe timings are somewhat
bonkers and the DPLL is totally misprogrammed (P divider == 0).
That will result in atomic commit timeouts as apparently the pipe
is sufficiently stuck to not signal vblank interrupts.

IIRC something like this was also observed on some other SNB
machine years ago (might have been a Dell XPS 8300) but a BIOS
update cured it. Sadly looks like this was never fixed for the
ASUS K53SV as the latest BIOS (K53SV.320 11/11/2011) is still
broken.

The quickest way to deal with this seems to be to shut down
the pipe+ports+DPLL. Unfortunately doing this during the
normal sanitization phase isn't quite soon enough as we
already spew several WARNs about the bogus hardware state.
But it's better than hanging the boot for a few dozen seconds.
Since this is limited to a few old machines it doesn't seem
entirely worthwile to try and rework the readout+sanitization
code to handle it more gracefully.

v2: Fix potential NULL deref (kbuild test robot)
Constify has_bogus_dpll_config()

Cc: stable@vger.kernel.org # v4.20+
Cc: Daniel Kamil Kozar <dkk089@gmail.com>
Reported-by: Daniel Kamil Kozar <dkk089@gmail.com>
Tested-by: Daniel Kamil Kozar <dkk089@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109245
Fixes: 516a49cc1946 ("drm/i915: Fix assert_plane() warning on bootup with external display")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111174950.10681-1-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

drm/i915/tv: Use the scanline counter for timestamps on i965gm TV output

Just like the frame counter, the pixel counter also reads zero
all the time when the TV encoder is used. Fortunately the
scanline counter still works sufficiently well so let's use that
to correct the vblank timestamps. Otherwise the timestamps may
en up out of whack, and since we use them to guesstimate the
vblank counter value that may end up incorrect as well.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125181931.19482-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>

drm/i915/tv: Fix return value for intel_tv_compute_config()

Ever since commit 204474a6b859 ("drm/i915: Pass down rc in
intel_encoder->compute_config()") we're supposed to return an
errno from .compute_config(). I failed to notice that when
pushing the TV encoder fixes which were written before said
commmit. Fix up the return value for the error case.

Cc: Imre Deak <imre.deak@intel.com>
Fixes: 690157f0a9e7 ("drm/i915/tv: Fix >1024 modes on gen3")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125181931.19482-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>

drm/i915: Wait for a moment before forcibly resetting the device

During igt, we ask to reset the device if any requests are still
outstanding at the end of a test, as this quickly kills off any
erroneous hanging request streams that may escape a test. However, since
it may take the device a few milliseconds to flush itself after the end
of a normal test, *cough* guc *cough*, we may accidentally tell the
device to reset itself after it idles. If we wait a moment, our usual
I915_IDLE_ENGINES_TIMEOUT of 200ms (seems a bit high, but still better
than umpteen hangchecks!), we can differentiate better between a stuck
engine and a healthy one, and so avoid prematurely forcing the reset and
any extra complications that may entail.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128010245.20148-1-chris@chris-wilson.co.uk

drm/i915: Disable -Wuninitialized

This warning is disabled by default in scripts/Makefile.extrawarn when
W= is not provided but this Makefile adds -Wall after this warning is
disabled so it shows up in the build when it shouldn't:

In file included from drivers/gpu/drm/i915/intel_breadcrumbs.c:895:
drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c:350:34: error:
variable 'wq' is uninitialized when used within its own initialization
[-Werror,-Wuninitialized]
        DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
                                        ^~
./include/linux/wait.h:74:63: note: expanded from macro
'DECLARE_WAIT_QUEUE_HEAD_ONSTACK'
        struct wait_queue_head name = __WAIT_QUEUE_HEAD_INIT_ONSTACK(name)
                               ~~~~                                  ^~~~
./include/linux/wait.h:72:33: note: expanded from macro
'__WAIT_QUEUE_HEAD_INIT_ONSTACK'
        ({ init_waitqueue_head(&name); name; })
                                       ^~~~
1 error generated.

Explicitly disable the warning like commit 46e2068081e9 ("drm/i915:
Disable some extra clang warnings").

Link: https://github.com/ClangBuiltLinux/linux/issues/220
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190126071122.24557-1-natechancellor@gmail.com

drm/i915: correct the pitch check for NV12 framebuffer

framebuffer for NV12 requires the pitch to the multiplier of 4, instead
of the width. This patch corrects it.

For instance, a 480p video, whose width and pitch are 854 and 896
respectively, is excluded for NV12 plane so far.

Changes since v1:
- Removed check for NV12 buffer dimensions since additional checks
are done for viewport size in intel_sprite.c

Signed-off-by: Dongseong Hwang <dongseong.hwang@intel.com>
Signed-off-by: P Raviraj Sitaram <raviraj.p.sitaram@intel.com>
Cc: Chandra Konduru <chandra.konduru@intel.com>
Cc: Vidya Srinivas <vidya.srinivas@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1545208152-22658-1-git-send-email-raviraj.p.sitaram@intel.com

drm/i915: Clean up intel_plane_atomic_check_with_state()

Rename some of the state variables in
intel_plane_atomic_check_with_state() to make it less confusing.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190111170823.4441-2-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>

drm/i915/tv: Filter out >1024 wide modes that would need vertical scaling on gen3

Since gen3 can't handle >1024 wide sources with vertical scaling
let's not advertize such modes in the mode list. Less tempetation
to the user to try out things that won't work.

v2: s/IS_GEN3(dev_priv/IS_GEN(dev_priv, 3)/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-17-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>

drm/i915/tv: Fix >1024 modes on gen3

On gen3 we must disable the TV encoder vertical filter for >1024
pixel wide sources. Once that's done all we can is try to center
the image on the screen. Naturally the TV mode vertical resolution
must be equal or larger than the user mode vertical resolution
or else we'd have to cut off part of the user mode.

And while we may not be able to respect the user's choice of
top and bottom borders exactly (or we'd have to reject he mode
most likely), we can try to maintain the relative sizes of the
top and bottom border with respect to each orher.

Additionally we must configure the pipe as interlaced if the
TV mode is interlaced.

v2: Make +intel_tv_connector_duplicate_state() static and drop
the badly copy pasted kerneldoc
s/IS_GEN3(dev_priv/IS_GEN(dev_priv, 3)/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-16-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>

drm/i915/tv: Generate better pipe timings for TV encoder

To make vblank timestamps work better with the TV encoder let's
scale the pipe timings such that the relationship between the
TV active and TV blanking periods is mirrored in the
corresponding pipe timings.

Note that in reality the pipe runs at a faster speed during the
TV vblank, and correspondigly there are periods when the pipe
is enitrely stopped. We pretend that this isn't the case and
as such we incur some error in the vblank timestamps during
the TV vblank. Further explanation of the issues in a big
comment in the code.

This makes the vblank timestamps good enough to make
i965gm (which doesn't have a working frame counter with
the TV encoder) report correct frame numbers. Previously
you could get all kinds of nonsense which resulted in
eg. glxgears reporting that it's running at twice the
actual framerate in most cases.

v2: s/IS_GEN4(dev_priv)/IS_GEN(dev_priv, 4)/ in the comment
for consistency

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-15-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>

drm/i915/tv: Add 1080p30/50/60 TV modes

Add the missing 1080p TV modes. On gen4 all of them work just fine,
whereas on gen3 only the 30Hz mode actually works correctly.

v2: s/IS_GEN3(dev_priv)/IS_GEN(dev_priv, 3)/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-14-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>

drm/i915/tv: Nuke reported_modes[]

Remove the silly reported_modes[] array. I suppse once upon a time
this actually had something to do with modes we reported to userspace.
Now it is just the placeholder for the mode we use for load detection.
We don't need it even for that, and instead we can just rely on
the fallback mode in intel_get_load_detect_pipe().

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181112170000.27531-13-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>