platform/upstream/mesa.git
Vinson Lee [Mon, 9 Aug 2021 22:48:25 +0000 (15:48 -0700)]
freedreno: Require C++17.

Commit 3a772be026c ("freedreno: Add perfetto renderpass support")
uses the C++17 init-statement feature.

GCC
../src/gallium/drivers/freedreno/freedreno_perfetto.cc: In lambda function:
../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: init-statement in selection statements only available with ‘-std=c++17’ or ‘-std=gnu++17’
  148 |       if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
      |           ^~~~

Clang
../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: 'if' initialization statements are a C++17 extension [-Wc++17-extensions]
      if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
          ^

Intel C++ Compiler
../src/gallium/drivers/freedreno/freedreno_perfetto.cc(148): error: expected a ")"
        if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
                                                   ^
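
For reference, a minimal sketch of the construct the three compilers complain about, next to a form that older language modes accept. IncrementalState here is a hypothetical stand-in, not the perfetto type, and both helper functions are illustrative names only.

```cpp
#include <cassert>

// Hypothetical stand-in for the state object returned in the driver code.
struct IncrementalState {
    bool was_cleared;
};

// C++17 form: the init-statement scopes `state` to the if statement.
inline int handle_cpp17(IncrementalState *s) {
    assert(s != nullptr);
    if (auto state = s; state->was_cleared) {
        return 1;
    }
    return 0;
}

// Pre-C++17 equivalent: an extra block provides the same scoping.
inline int handle_pre_cpp17(IncrementalState *s) {
    assert(s != nullptr);
    {
        auto state = s;
        if (state->was_cleared)
            return 1;
    }
    return 0;
}
```

The first function needs -std=c++17 (or gnu++17) to build without the diagnostics quoted above; the second builds in C++11 and later.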

Fixes: 3a772be026c ("freedreno: Add perfetto renderpass support")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5193
Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12293>

Jason Ekstrand [Fri, 5 Feb 2021 01:36:16 +0000 (19:36 -0600)]
intel/compiler: Add unified barrier support for CS

Program CS barrier message fields for producers/consumers.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>

Jordan Justen [Wed, 2 Sep 2020 22:07:02 +0000 (15:07 -0700)]
intel/compiler: Add unified barrier support for TCS

Program the producer/consumer fields for TCS Barrier messages.
The producer and consumer fields are set to the number of TCS threads.

Ref: Bspec 54006 for Barrier Data Payload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>

Jordan Justen [Wed, 2 Sep 2020 21:59:29 +0000 (14:59 -0700)]
intel/compiler: Regroup TCS barrier code paths

Rearrange if/else fragments to unify the case for Gen11 or later
platforms. This will help the code look cleaner for adding
unified barrier support to TCS.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>

Alyssa Rosenzweig [Mon, 23 Aug 2021 16:10:53 +0000 (12:10 -0400)]
panfrost: Rip out primconvert code

This is handled in common Gallium code if we set the appropriate CAP.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12509>

Alyssa Rosenzweig [Tue, 24 Aug 2021 00:18:25 +0000 (20:18 -0400)]
panfrost: Fix NULL dereference in allowlist code

If a user attempts to run Panfrost on an unsupported GPU (e.g. Mali
T604), Panfrost will refuse to load and will destroy the screen
immediately, allowing for a graceful fallback to a software rasterizer.
However, the screen destroy code calls a screen_destroy function in the
GenXML vtbl -- and this function is still NULL when the allowlist is
checked. This manifests as crashes on unsupported GPUs.

Issue tracked down with Icecream95's mad Ghidra skills.

Closes: #5269
Fixes: 88dc4db6be7 ("panfrost: Init/destroy blitter from per-gen file")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12512>

Nanley Chery [Wed, 21 Jul 2021 23:50:32 +0000 (16:50 -0700)]
intel: Parse INTEL_NO_HW for devinfo construction

This commit does several things:

* Unify code common to several drivers by evaluating INTEL_NO_HW within
  intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
  separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
  code simplification, but we may find reason to undo this later on.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>

Nanley Chery [Wed, 21 Jul 2021 21:37:04 +0000 (14:37 -0700)]
intel: Use env_var_as_boolean for INTEL_NO_HW

The prior method of checking the result of getenv() for NULL would cause
the feature to be enabled for INTEL_NO_HW=0.
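
A minimal sketch of the difference, assuming a simplified boolean parser. parse_bool_value is a hypothetical stand-in for env_var_as_boolean, and the set of accepted spellings is an assumption, not the helper's actual list.

```cpp
#include <cassert>
#include <cstring>

// Old-style check: any non-NULL value counts as "set", including "0".
inline bool enabled_if_present(const char *value) {
    return value != nullptr;
}

// Hypothetical boolean parser; unknown strings fall back to the default.
inline bool parse_bool_value(const char *value, bool default_value) {
    if (value == nullptr)
        return default_value;
    if (std::strcmp(value, "1") == 0 || std::strcmp(value, "true") == 0)
        return true;
    if (std::strcmp(value, "0") == 0 || std::strcmp(value, "false") == 0)
        return false;
    return default_value;
}

// Records the behavioural difference for the string "0".
inline void demo_no_hw_parsing() {
    assert(enabled_if_present("0"));        // presence check: enabled
    assert(!parse_bool_value("0", false));  // boolean parse: disabled
}
```

In real use the value would come from std::getenv("INTEL_NO_HW"); taking the string as a parameter keeps the sketch free of process-environment state.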

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>

Alyssa Rosenzweig [Mon, 23 Aug 2021 18:06:41 +0000 (14:06 -0400)]
panfrost: Port v5 blend shader issue to blitter

This is a presumed erratum workaround. Fixes INSTR_INVALID_PC faults on
some draw_buffers_indexed.* cases on Midgard, where a blend shader is
required to pack RT n > 0.

Backport the workaround from the GL driver. The helper is now in common
code for panvk to use as well; it has the same bug.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Mon, 23 Aug 2021 17:42:23 +0000 (13:42 -0400)]
panfrost: Zero initialize blend_shaders

Fixes an invalid read caught by valgrind when there is a hole in the
valid render target mask:

==6749== Conditional jump or move depends on uninitialised value(s)
==6749==    at 0x5E88EC0: panfrost_prepare_fs_state (pan_cmdstream.c:417)
==6749==    by 0x5E88EC0: panfrost_emit_frag_shader (pan_cmdstream.c:501)
==6749==    by 0x5E88EC0: panfrost_emit_frag_shader_meta (pan_cmdstream.c:573)
==6749==    by 0x5E88EC0: panfrost_update_state_fs (pan_cmdstream.c:2593)
==6749==    by 0x5E8B0BF: panfrost_direct_draw (pan_cmdstream.c:2839)
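
A rough sketch of the failure mode, under assumed names: kMaxRenderTargets and the fake addresses are illustrative and not taken from pan_cmdstream.c. Only slots whose bit is set in the mask get written, so a hole would be read uninitialised unless the array starts out zeroed.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kMaxRenderTargets = 8;  // assumed limit for this sketch

// Counts the slots that hold a (pretend) blend shader address.
inline unsigned count_blend_shaders(uint32_t rt_mask) {
    assert(rt_mask < (1u << kMaxRenderTargets));

    // `= {}` zero-initialises every slot, including holes in the mask.
    uint64_t blend_shaders[kMaxRenderTargets] = {};

    for (unsigned i = 0; i < kMaxRenderTargets; i++) {
        if (rt_mask & (1u << i))
            blend_shaders[i] = 0x1000u + i;  // pretend GPU address
    }

    unsigned count = 0;
    for (unsigned i = 0; i < kMaxRenderTargets; i++) {
        if (blend_shaders[i] != 0)  // well-defined thanks to the zero init
            count++;
    }
    return count;
}
```

With a mask of 0b101 the middle slot is a hole; without the `= {}` the final loop would branch on an indeterminate value, which is the class of error valgrind reports above.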

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Fixes: a124c47b9f9 ("panfrost: Fix NULL derefs in pan_cmdstream.c")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 17:10:35 +0000 (13:10 -0400)]
pan/mdg: Handle swapped 565 and 1010102 unorm

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 16:45:31 +0000 (12:45 -0400)]
pan/lower_framebuffer: Don't open-code pan_unpacked_type_for_format

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 16:40:29 +0000 (12:40 -0400)]
pan/lower_framebuffer: Don't open-code pad_vec4

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 16:38:19 +0000 (12:38 -0400)]
pan/lower_framebuffer: Don't treat UNORM 4 special

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 16:37:37 +0000 (12:37 -0400)]
pan/lower_framebuffer: Unify UNORM handling

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 16:26:49 +0000 (12:26 -0400)]
pan/lower_framebuffer: Use fmul_imm

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 16:18:26 +0000 (12:18 -0400)]
pan/lower_framebuffer: Don't replicate so much

We need to replicate to deal with multisampling, but not otherwise.
Simplify the logic substantially.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Fri, 11 Jun 2021 22:48:09 +0000 (18:48 -0400)]
pan/mdg: Insert moves before writeout when needed

Otherwise we end up accessing overwritten registers. Fixes

dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_enable_buffer_enable

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Tue, 15 Jun 2021 17:15:15 +0000 (13:15 -0400)]
panfrost: Delete unpacks for blendable formats

Unnecessary.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Fri, 11 Jun 2021 22:02:11 +0000 (18:02 -0400)]
panfrost: Use blendable check for tib read check

These are the same! Either you're blendable and can use f32/f16
conversion, or you're raw and you can only get raw. It's that simple!

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Fri, 11 Jun 2021 21:27:37 +0000 (17:27 -0400)]
panfrost: Fix UNORM 10 sizes

Fixes: 56047fb64d7 ("panfrost: Fix UNORM 16 rendering")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Alyssa Rosenzweig [Mon, 23 Aug 2021 18:53:00 +0000 (14:53 -0400)]
panfrost: Remove unneeded quirks from T760

Will cause trouble later in the series when we start garbage collecting
unneeded code.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Boris Brezillon [Fri, 4 Jun 2021 12:41:06 +0000 (14:41 +0200)]
panfrost: Add explicit padding to pan_blend_shader_key

So the hash function doesn't end up hashing uninitialized values.
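
A minimal sketch of the idea, assuming a made-up two-field key; the real pan_blend_shader_key has different members. With the padding spelled out as a member, every byte of the struct is a named field, so zeroing and byte-wise hashing are deterministic. The FNV-1a hash is only a stand-in for the hash table's own function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical key: explicit pad bytes replace compiler-inserted padding.
struct ToyBlendKey {
    uint8_t rt;
    uint8_t pad[3];
    uint32_t format;
};
static_assert(sizeof(ToyBlendKey) == 8, "no hidden padding expected");
static_assert(offsetof(ToyBlendKey, format) == 4, "format follows the pad");

// FNV-1a over raw bytes, standing in for a hash that reads the whole struct.
inline uint32_t hash_bytes(const void *data, std::size_t size) {
    assert(data != nullptr || size == 0);
    const unsigned char *p = static_cast<const unsigned char *>(data);
    uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < size; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Zero the whole key first so the pad bytes have a defined value.
inline ToyBlendKey make_key(uint8_t rt, uint32_t format) {
    ToyBlendKey key;
    std::memset(&key, 0, sizeof(key));
    key.rt = rt;
    key.format = format;
    return key;
}
```

Two keys built from the same field values then hash identically, which is what a byte-wise hash table lookup relies on.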

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reported-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Fixes: bbff09b9521f ("panfrost: Move the blend shader cache at the device level")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Tomeu Vizoso [Tue, 6 Jul 2021 08:31:38 +0000 (10:31 +0200)]
panfrost: Add padding to pan_blit_blend_shader_key

So the hashtable helpers know the correct size of the struct.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>

Kenneth Graunke [Sun, 22 Aug 2021 00:36:34 +0000 (17:36 -0700)]
iris: Mark the aux table buffers with EXEC_OBJECT_CAPTURE.

Having these could be useful when tracking down GPU hangs.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12420>

Kenneth Graunke [Mon, 16 Aug 2021 19:54:47 +0000 (12:54 -0700)]
iris: Bypass the BO cache when allocating buffers for aux map tables

When freeing a buffer, we may return a non-idle buffer to the cache,
which means we cannot unmap aux entries at that time.  Instead, we
defer unmapping the stale aux entry until we reuse a BO from the cache.

Unfortunately, this can lead to a recursive locking issue:

1. intel_aux_map_add_mapping wants to set up a new aux entry

   It takes the intel_aux_map_context::mutex lock, then calls:

   add_mapping -> get_aux_entry -> add_sub_table -> add_buffer ->
   intel_aux_map_buffer_alloc -> iris_bo_alloc

2. iris_bo_alloc tries to allocate a BO from the cache, doing:

   alloc_bo_from_cache -> intel_aux_map_unmap_range ->
   intel_aux_unmap_range

   ...which then tries to take the intel_aux_map_context::mutex lock.
   But it is already locked.

One solution would be to rework the aux map handling code to allocate
BOs without holding its lock, but that looks to be painful.  Another
is to make the lock recursive, but we try and avoid that.  A third
option would be to add a BO_ALLOC flag that makes alloc_bo_from_cache
skip any buffers with aux_map_address != 0 so we don't have to unmap,
making the cache less effective but fixing the recursive lock.

A fourth option is to simply bypass the BO cache altogether for the
buffers that hold the aux map itself.  Allocating new BOs for the aux
tables should be relatively rare, so there's probably not a lot of
benefit in using the BO cache.
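
The locking problem and the fourth option can be modelled with a toy, single-threaded sketch. Everything here is hypothetical: ToyLock stands in for a non-recursive mutex, and the function names only mirror the roles described above, not the iris code.

```cpp
#include <cassert>

// Toy stand-in for a non-recursive mutex: a second acquire fails instead
// of deadlocking, so the sketch can report the problem.
struct ToyLock {
    bool held = false;
    bool try_acquire() {
        if (held)
            return false;
        held = true;
        return true;
    }
    void release() { held = false; }
};

enum class AllocResult { Ok, WouldDeadlock };

// Reusing a cached BO must unmap its stale aux entry, which needs the lock.
inline AllocResult alloc_from_cache(ToyLock &aux_lock) {
    if (!aux_lock.try_acquire())
        return AllocResult::WouldDeadlock;
    // ... the stale aux entry would be unmapped here ...
    aux_lock.release();
    return AllocResult::Ok;
}

// A fresh allocation never touches aux-map state, so it needs no lock.
inline AllocResult alloc_fresh() {
    return AllocResult::Ok;
}

// Models adding an aux table: take the lock, then allocate backing storage.
inline AllocResult add_aux_table(ToyLock &aux_lock, bool bypass_cache) {
    bool acquired = aux_lock.try_acquire();
    assert(acquired);
    (void)acquired;
    AllocResult result =
        bypass_cache ? alloc_fresh() : alloc_from_cache(aux_lock);
    aux_lock.release();
    return result;
}
```

With bypass_cache set to false the inner allocation hits the already-held lock; with it set to true the allocation never re-enters the aux-map code.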

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5191
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12420>

Yiwei Zhang [Sat, 21 Aug 2021 23:44:47 +0000 (23:44 +0000)]
venus: scrub ignored fields of pipeline info when rasterization is disabled

v2: use vk_alloc instead of vk_zalloc because of full memcpy

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12499>

Yiwei Zhang [Sat, 21 Aug 2021 22:21:17 +0000 (22:21 +0000)]
venus: fix all missing vn_object_base_fini

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12498>

Matt Turner [Sat, 21 Aug 2021 01:48:45 +0000 (18:48 -0700)]
tu: Enable VK_KHR_uniform_buffer_standard_layout

This extension relaxes the alignment requirements to allow the GL std430
layout to be used. freedreno/ir3 already supports this (via
PIPE_CAP_LOAD_CONSTBUF).

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12495>

Samuel Pitoiset [Thu, 1 Jul 2021 06:41:04 +0000 (08:41 +0200)]
nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)

Found with Cyberpunk 2077.

fossils-db (GFX10.3):
Totals from 128 (2.34% of 5465) affected shaders:
CodeSize: 769720 -> 767656 (-0.27%); split: -0.27%, +0.00%
Instrs: 145748 -> 145229 (-0.36%)
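
A small numeric check of the rewrite, offered as a sketch. Since -fmin(b, a) equals fmax(-b, -a), the original expression is the maximum of -b, -a and b, i.e. fmax(|b|, -a). Read literally, the subject line's fmax(b, -a) matches this only when b is non-negative or a <= b; whether the actual pattern carries an abs modifier on b is not stated in this message, so that reading is an assumption.

```cpp
#include <cassert>
#include <cmath>

// The expression being matched: fmax(-fmin(b, a), b).
inline float original_expr(float a, float b) {
    assert(std::isfinite(a) && std::isfinite(b));
    return std::fmax(-std::fmin(b, a), b);
}

// The subject line taken literally: fmax(b, -a).
inline float literal_rewrite(float a, float b) {
    return std::fmax(b, -a);
}

// The form that equals the original for all finite inputs: fmax(|b|, -a).
inline float abs_rewrite(float a, float b) {
    return std::fmax(std::fabs(b), -a);
}
```

For a = 5, b = -1 the original evaluates to 1 while the literal form gives -1; the abs form gives 1.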

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11604>

Dave Airlie [Mon, 23 Aug 2021 03:00:56 +0000 (13:00 +1000)]
vulkan/wsi/sw: wait for image fence before submitting to queue

With hw devices, when you submit a present, implicit sync will
make sure the work submitted to the gpu on the client will end
up happening before the present work submitted on the server.

However with sw paths there is no real GPU, the lavapipe fake
GPU thread is client side only and presenting is done directly
from the pixmap (or later shared pixmap). In order for this to
make sense the wsi common code should wait for the fence on the
image before queueing the submit to the server so that all
client work has been flushed to the pixmap before the copy or
present operation is submitted.

Fixes: 8004fa9c9501 ("vulkan/wsi: add sw support. (v2)")
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12502>

Rhys Perry [Thu, 5 Aug 2021 10:42:39 +0000 (11:42 +0100)]
aco/scheduler: allow moving down VMEM stores to below VMEM loads

fossil-db (Vega10):
Totals from 93 (0.06% of 150305) affected shaders:
SGPRs: 4832 -> 4768 (-1.32%)
VGPRs: 4084 -> 4144 (+1.47%)
CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47%
MaxWaves: 589 -> 580 (-1.53%)
Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61%
Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10%
InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80%
VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57%
SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16%
Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97%

fossil-db (Sienna Cichlid):
Totals from 88 (0.06% of 150170) affected shaders:
VGPRs: 3840 -> 3872 (+0.83%)
CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23%
Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35%
Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61%
InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35%
VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81%
SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12%
Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211>

Erik Faye-Lund [Wed, 9 Jun 2021 12:15:54 +0000 (14:15 +0200)]
llvmpipe: use preferred attribute interpolation for wide lines

When rasterizing legacy-lines, OpenGL defines the width as being an
extrusion along the minor axis, repeating varyings. While the spec
*does* allow for an alternative method that matches our current results,
the OpenGL ES CTS doesn't allow these results even if OpenGL ES has the
same wording of an alternative method.

This is technically speaking a bug in the OpenGL ES CTS, but it seems
like nobody else is using the alternative formulation, at least not
while passing the OpenGL ES CTS. On top of this, the OpenGL specification
explicitly lists the extrusion results as the preferred method.

So it seems like a good idea for us to do this the way the OpenGL
specification prefers regardless; it's going to give less surprising
results to applications, and it's helping us pass some tests.

The math to set these up would "trivially" be:

dx = (dx * dx + dy * dy) / dx
dy = 0

and:

dy = (dx * dx + dy * dy) / dy
dx = 0

...but since we've already calculated dxdy, we can reformulate this to
save a division.
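
One way to see the reformulation, as a sketch: (dx * dx + dy * dy) / dx equals dx + dy * (dy / dx), and likewise (dx * dx + dy * dy) / dy equals dy + dx * (dx / dy), so a slope that has already been computed replaces the division. The parameter names dydx and dxdy below are assumptions about which precomputed slope is at hand.

```cpp
#include <cassert>
#include <cmath>

// Direct form from the message: one division per call.
inline float extrude_x_direct(float dx, float dy) {
    assert(dx != 0.0f);
    return (dx * dx + dy * dy) / dx;
}

// Same value using a precomputed slope dydx = dy / dx: no division here.
inline float extrude_x_with_slope(float dx, float dy, float dydx) {
    return dx + dy * dydx;
}

// Direct form for the other axis.
inline float extrude_y_direct(float dx, float dy) {
    assert(dy != 0.0f);
    return (dx * dx + dy * dy) / dy;
}

// Same value using a precomputed slope dxdy = dx / dy.
inline float extrude_y_with_slope(float dx, float dy, float dxdy) {
    return dy + dx * dxdy;
}

// Relative comparison, since the two forms round differently.
inline bool nearly_equal(float a, float b) {
    float scale = std::fmax(1.0f, std::fmax(std::fabs(a), std::fabs(b)));
    return std::fabs(a - b) <= 1e-5f * scale;
}
```

For dx = 3, dy = 4 both x forms give 25/3 and both y forms give 6.25, up to rounding.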

This fixes the following dEQP test-cases:
- dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide
- dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide
- dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide
- dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide
- dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide
- dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide
- dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.interpolation.lines_wide
- dEQP-GLES3.functional.rasterization.fbo.texture_2d.interpolation.lines_wide
- dEQP-GLES3.functional.rasterization.interpolation.basic.line_loop_wide
- dEQP-GLES3.functional.rasterization.interpolation.basic.line_strip_wide
- dEQP-GLES3.functional.rasterization.interpolation.basic.lines_wide
- dEQP-GLES3.functional.rasterization.interpolation.projected.line_loop_wide
- dEQP-GLES3.functional.rasterization.interpolation.projected.line_strip_wide
- dEQP-GLES3.functional.rasterization.interpolation.projected.lines_wide

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11315>

Rhys Perry [Thu, 29 Jul 2021 15:55:51 +0000 (16:55 +0100)]
aco: remove label_extract if the extract is used by a non-VALU

If an extract is used by a non-VALU instruction, it can't be applied to
all instructions, so it's not beneficial to try to apply it.

This check isn't needed because can_apply_extract()/can_use_SDWA() should
already handle non-VALU instructions.

fossil-db (Sienna Cichlid):
Totals from 1020 (0.68% of 150170) affected shaders:
SpillSGPRs: 1577 -> 1571 (-0.38%)
CodeSize: 7863668 -> 7858336 (-0.07%); split: -0.07%, +0.00%
Instrs: 1431583 -> 1431083 (-0.03%); split: -0.04%, +0.01%
Latency: 25891250 -> 25890916 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7248683 -> 7248655 (-0.00%); split: -0.01%, +0.01%
SClause: 49072 -> 49071 (-0.00%)
Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06%
Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02%
PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01%

fossil-db (Polaris10):
Totals from 654 (0.43% of 151696) affected shaders:
CodeSize: 5814552 -> 5811568 (-0.05%); split: -0.05%, +0.00%
Instrs: 1105783 -> 1105049 (-0.07%); split: -0.07%, +0.00%
Latency: 20261458 -> 20259744 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 9011785 -> 9011749 (-0.00%); split: -0.00%, +0.00%
Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00%
PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01%
PreVGPRs: 43813 -> 43809 (-0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212>

Samuel Pitoiset [Thu, 19 Aug 2021 07:04:46 +0000 (09:04 +0200)]
radv: allocate shaders to 32-bit address to skip PGM_HI

This reduces the number of emitted registers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466>

Samuel Pitoiset [Thu, 19 Aug 2021 06:40:19 +0000 (08:40 +0200)]
radv: don't use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x

This seems to make performance worse.
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466>

Daniel Schürmann [Wed, 18 Aug 2021 19:42:15 +0000 (21:42 +0200)]
aco: add more validation rules for SDWA operands

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Fri, 13 Aug 2021 12:39:29 +0000 (14:39 +0200)]
aco/opcodes: remove definition_size[]

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Fri, 13 Aug 2021 12:38:40 +0000 (14:38 +0200)]
aco/validate: simplify get_subdword_bytes_written()

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Thu, 12 Aug 2021 16:04:01 +0000 (18:04 +0200)]
aco/ra: refactor subdword operand stride

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Fri, 13 Aug 2021 10:54:59 +0000 (12:54 +0200)]
aco/ra: refactor subdword definition info

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Wed, 18 Aug 2021 16:56:59 +0000 (18:56 +0200)]
aco: add instr_is_16bit() helper function

Indicates whether an instruction only writes partial registers.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Wed, 7 Jul 2021 09:37:49 +0000 (11:37 +0200)]
aco: use VOPC_SDWA on GFX9+

Totals from 5138 (3.42% of 150170) affected shaders: (GFX10.3)
VGPRs: 409520 -> 409416 (-0.03%); split: -0.03%, +0.00%
CodeSize: 43056360 -> 43035696 (-0.05%); split: -0.06%, +0.02%
MaxWaves: 69296 -> 69310 (+0.02%)
Instrs: 8161016 -> 8153365 (-0.09%); split: -0.10%, +0.01%
Latency: 109397002 -> 109756208 (+0.33%); split: -0.05%, +0.38%
InvThroughput: 23238920 -> 23310761 (+0.31%); split: -0.11%, +0.42%
VClause: 135141 -> 135100 (-0.03%); split: -0.05%, +0.02%
SClause: 349511 -> 349489 (-0.01%); split: -0.01%, +0.00%
Copies: 388107 -> 387754 (-0.09%); split: -0.48%, +0.38%
Branches: 184629 -> 184503 (-0.07%); split: -0.08%, +0.01%
PreSGPRs: 258807 -> 258839 (+0.01%)
PreVGPRs: 372561 -> 372184 (-0.10%); split: -0.10%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Daniel Schürmann [Fri, 13 Aug 2021 13:07:16 +0000 (15:07 +0200)]
aco/print_ir: fix printing of VOPC_SDWA definitions

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>

Rhys Perry [Fri, 20 Aug 2021 13:03:29 +0000 (14:03 +0100)]
aco: fix vectorized 16-bit load_input/load_interpolated_input

Seems we haven't encountered this before because
nir_lower_io_to_scalar_early usually scalarizes this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12486>

Samuel Pitoiset [Wed, 11 Aug 2021 11:35:13 +0000 (13:35 +0200)]
radv: remove useless DISABLE_{ZMASK,SMEM}_EXPCLEAR_OPTIMIZATION state

This has no effect without enabling EXPCLEAR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12326>

Samuel Pitoiset [Wed, 11 Aug 2021 11:33:30 +0000 (13:33 +0200)]
radv: remove unused fast depth-stencil gfx clear path with expclear

This has never been used because it requires knowing the previous
clear values, which is not really possible in Vulkan.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12326>

Michel Zou [Fri, 20 Aug 2021 08:42:06 +0000 (10:42 +0200)]
lavapipe: fix missing VKAPI_CALL attribute

Fixes build on mingw

Fixes: c198adf7

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12484>

Ian Romanick [Mon, 16 Aug 2021 18:20:56 +0000 (11:20 -0700)]
util/xmlconfig: Test values set via the environment

driconf options can also be set via environment variables.  This is a
simple touch-test of that feature.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12477>

Ian Romanick [Mon, 16 Aug 2021 18:20:00 +0000 (11:20 -0700)]
util/xmlconfig: Make unit tests more resilient against user env settings

Before this, setting 'vblank_mode=0' in the environment would cause a
unit test to fail.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12477>

Marek Olšák [Sun, 7 Mar 2021 15:28:52 +0000 (10:28 -0500)]
frontend/dri: add environment variable DRI_NO_MSAA for performance comparisons

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12491>

Marek Olšák [Fri, 13 Aug 2021 07:24:38 +0000 (03:24 -0400)]
radeonsi: remove vertices_per_patch parameter from draw-related functions

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>

Marek Olšák [Fri, 13 Aug 2021 06:29:56 +0000 (02:29 -0400)]
gallium: remove vertices_per_patch, add pipe_context::set_patch_vertices

We would like draw-only display lists to have immutable draw info and
this is the only GL non-draw state in pipe_draw_info (not counting
view_mask).

It also allows removing some code from draw_vbo for tessellation.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>

Connor Abbott [Fri, 20 Aug 2021 15:24:45 +0000 (17:24 +0200)]
tu: Remove some stale bypass xfails

These were fixed by 09e0b29bb63f60231b26b4c8f02eadb68e51b623 which was
missed during the suite conversion. For the remaining still-valid fail,
there is a CTS patch in progress.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12488>

Rob Clark [Fri, 20 Aug 2021 20:46:03 +0000 (13:46 -0700)]
freedreno/crashdec: Quiet spammy print in query mode

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>

Rob Clark [Fri, 20 Aug 2021 18:13:44 +0000 (11:13 -0700)]
freedreno/crashdec: Decode full RB in verbose mode

This is useful to get a better view of previous commands in the
ringbuffer.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>

Rob Clark [Fri, 20 Aug 2021 17:48:57 +0000 (10:48 -0700)]
freedreno/cffdec: Fix gpuaddr comparison

gpuaddrs are 64b, and they can be more than 2^^32 apart.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>
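
The commit message above does not show the old code, so the following is only a minimal sketch of how a 64b address comparison can go wrong when the distance is narrowed to 32 bits. The struct and function names are hypothetical and are not taken from cffdec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buffer record; not the actual cffdec structure. */
struct buffer {
   uint64_t gpuaddr; /* 64-bit GPU address of the first byte */
   uint32_t len;     /* size in bytes */
};

/* Buggy variant: the 64-bit distance is truncated to 32 bits before the
 * range check, so an address exactly 2^32 further away aliases into the
 * buffer. */
static bool
buffer_contains_truncated(const struct buffer *buf, uint64_t gpuaddr)
{
   assert(buf);
   uint32_t offset = (uint32_t)(gpuaddr - buf->gpuaddr);
   return offset < buf->len;
}

/* Fixed variant: keep the whole comparison in 64 bits. */
static bool
buffer_contains(const struct buffer *buf, uint64_t gpuaddr)
{
   assert(buf);
   return gpuaddr >= buf->gpuaddr && (gpuaddr - buf->gpuaddr) < buf->len;
}
```

With a buffer at 0x1000 of length 0x100, the address 0x100001010 is wrongly reported as inside the buffer by the truncating variant and correctly rejected by the 64-bit one.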

3 years agofreedreno/cffdec: Fix indentation
Rob Clark [Fri, 20 Aug 2021 17:48:28 +0000 (10:48 -0700)]
freedreno/cffdec: Fix indentation

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>

3 years agopan/bi: Extend bi_add_nop_for_atest for tilebuffer loads
Icecream95 [Sat, 14 Aug 2021 11:36:27 +0000 (23:36 +1200)]
pan/bi: Extend bi_add_nop_for_atest for tilebuffer loads

Fixes framebuffer_fetch and blend_equation_advanced dEQP tests on v6.

v2: Use clause dependencies rather than comparing the message type
v3: Shift the BIFROST_SLOT_* constants before using them as a mask

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12375>

3 years agotu: Free device->bo_idx and device->bo_list on init failure
Matt Turner [Fri, 20 Aug 2021 03:12:57 +0000 (20:12 -0700)]
tu: Free device->bo_idx and device->bo_list on init failure

Two related changes:

- in tu_device.c:tu_CreateDevice we need to free both pointers in the
  teardown path after tu_bo_finish(global_bo), which uses the pointers.
  They are allocated in the first call to tu_bo_init(), which happens
  when global_bo is allocated.

- in tu_drm.c:tu_bo_init we need to free bo_list if the bo_idx
  allocation fails. Convert to the goto teardown pattern as well.

Fixes the following dEQP-VK tests:
  dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail
  dEQP-VK.api.object_management.alloc_callback_fail.device
  dEQP-VK.api.object_management.alloc_callback_fail.device_group

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12481>
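
The goto teardown pattern mentioned for tu_bo_init can be sketched roughly as below. This is an illustration only: the struct, the allocator hook, and the function names are hypothetical stand-ins, and the real driver allocates through Vulkan allocation callbacks rather than malloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two device fields named in the commit. */
struct device {
   uint32_t *bo_idx;
   uint64_t *bo_list;
};

/* Allocator hook so that a caller can inject an allocation failure. */
typedef void *(*alloc_fn)(size_t size);

/* Hypothetical test helper: succeeds once, then fails the second call. */
static int alloc_calls;

static void *
fail_second_alloc(size_t size)
{
   return ++alloc_calls == 2 ? NULL : malloc(size);
}

static int
device_init_lists(struct device *dev, size_t count, alloc_fn alloc)
{
   assert(dev && alloc);

   dev->bo_list = alloc(count * sizeof(*dev->bo_list));
   if (!dev->bo_list)
      goto fail_bo_list;

   dev->bo_idx = alloc(count * sizeof(*dev->bo_idx));
   if (!dev->bo_idx)
      goto fail_bo_idx;

   return 0;

   /* Labels unwind in reverse order of allocation. */
fail_bo_idx:
   free(dev->bo_list);
   dev->bo_list = NULL;
fail_bo_list:
   return -1;
}
```

Each failure label releases only what was allocated before the failing step, so a failed bo_idx allocation no longer leaks bo_list.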

3 years agopan/bi: Use CLPER_V6 on Mali G31
Alyssa Rosenzweig [Thu, 19 Aug 2021 22:09:32 +0000 (22:09 +0000)]
pan/bi: Use CLPER_V6 on Mali G31

Apparently, CLPER_V7 is missing from Mali G31, but CLPER_V6 works. Fixes
INSTR_INVALID_ENC faults and failures in
dEQP-GLES3.functional.shaders.derivate.* on Dvalin.

Technically not an erratum but an implementation difference. I suspect
Mali G51 will need this as well, should we ever allowlist it.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopan/bi: Use ST_TILE for multisampled blend output
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:12:02 +0000 (22:12 +0000)]
pan/bi: Use ST_TILE for multisampled blend output

ST_TILE lets us specify an explicit sample, whereas BLEND replicates to
all samples. This fully fixes the interaction between blend shaders and
multisampling on Bifrost, manifesting as
dEQP-GLES3.functional.fragment_ops.random.* failures with the
configuration rgba8888d24s8ms4.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopanfrost: Evaluate blend shaders per-sample
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:10:49 +0000 (22:10 +0000)]
panfrost: Evaluate blend shaders per-sample

This varies the sample ID value, which will be used in the next commit.
This is less complicated than keying blend shaders to the content of
this flag and trying to make mega blend shaders covering all samples at
once ... complexity I'd rather not think about right now. The DDK does
it this way.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopan/bi: Set the sample ID for blend shader LD_TILE
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:08:31 +0000 (22:08 +0000)]
pan/bi: Set the sample ID for blend shader LD_TILE

Use the explicit sample mode and set the sample ID in the pixel indices
structure to the current sample ID. This fixes tilebuffer loads in blend
shaders on multisampled framebuffers.

Make sure the new routine is broken out to a helper for use with ST_TILE
in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopan/bi: Extract load_sample_id to a helper
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:07:50 +0000 (22:07 +0000)]
pan/bi: Extract load_sample_id to a helper

Will be reused in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopan/bi: Correct the sr_count on +ST_TILE
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:05:17 +0000 (22:05 +0000)]
pan/bi: Correct the sr_count on +ST_TILE

Otherwise we'll get validator fails when emitting +ST_TILE.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopan/bi: Don't set td in blend shaders
Alyssa Rosenzweig [Wed, 18 Aug 2021 21:38:07 +0000 (21:38 +0000)]
pan/bi: Don't set td in blend shaders

This breaks screen-space derivatives in a shader that uses multiple
render targets, if the derivative calculation is scheduled after a BLEND
instruction calling into a blend shader.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopan/bi: Set eldest_colour dependency for ST_TILE
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:05:52 +0000 (22:05 +0000)]
pan/bi: Set eldest_colour dependency for ST_TILE

I don't think we'll ever hit this in practice, since it's not needed for
blend shaders, but better to correct the code anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agopanfrost: Disable shader-assisted indirect draws
Alyssa Rosenzweig [Fri, 13 Aug 2021 23:11:39 +0000 (23:11 +0000)]
panfrost: Disable shader-assisted indirect draws

Although it is passing all of dEQP-GLES31, it is failing a few
KHR-GLES31.* tests. It also has performance issues at the moment. Invert
the existing noindirect debug flag to become an indirect debug flag. Set
this flag for dEQP-GLES31 CI on G52, to make sure the code doesn't bit
rot, in the hope that someone will pick this up later on.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>

3 years agovulkan/wsi/wayland: memset members of image to zero
Leandro Ribeiro [Wed, 18 Aug 2021 14:43:48 +0000 (11:43 -0300)]
vulkan/wsi/wayland: memset members of image to zero

struct wsi_wl_image is only used as member of the swapchain, and during
the swapchain creation the image is already initialized to zero. So we
have no problems with members of the image being used uninitialized.

But for consistency, memset the members of this struct to zero in
wsi_wl_image_init(). This can help to avoid problems in the future.

Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12451>

3 years agovulkan/wsi/wayland: create swapchain using vk_zalloc()
Leandro Ribeiro [Tue, 6 Jul 2021 19:00:12 +0000 (16:00 -0300)]
vulkan/wsi/wayland: create swapchain using vk_zalloc()

In wsi_wl_surface_create_swapchain() we have a piece of code to init
some members of the chain to 0, in order to allow us to call
wsi_wl_swapchain_destroy() for cleanup.

Instead, we can use vk_zalloc() to allocate the chain, as it initializes
all members of the struct to zero. This helps us avoid problems when
people add new members to the struct and forget to initialize them.
It also makes the code look better.

Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12451>
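
A minimal sketch of why zero-initializing allocation makes the cleanup path safe, assuming calloc() as a stand-in for vk_zalloc() and a hypothetical struct that is not the real wsi_wl_swapchain:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical swapchain; the member names are illustrative only. */
struct swapchain {
   void *surface;
   void *frame;
   int fifo_ready;
};

static struct swapchain *
swapchain_create(void)
{
   /* calloc() stands in for vk_zalloc(): every member starts out as zero,
    * including members added later that nobody remembered to initialize. */
   return calloc(1, sizeof(struct swapchain));
}

static void
swapchain_destroy(struct swapchain *chain)
{
   if (!chain)
      return;

   /* Safe on a partially constructed chain: free(NULL) is a no-op. */
   free(chain->frame);
   free(chain->surface);
   free(chain);
}
```

Because every member is already zero, swapchain_destroy() can be called from any error path of the create function without a block of manual member initialization beforehand.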

3 years agoci/lavapipe: Add a fractional run with ASan
Emma Anholt [Mon, 12 Jul 2021 15:07:07 +0000 (16:07 +0100)]
ci/lavapipe: Add a fractional run with ASan

This catches use-after-frees and buffer overflows, but not leaks (which we
disable the checking for since the library gets dlclose()d and we end up
with useless backtraces).

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8889>

3 years agotu: Add a650-specific CCU flush workaround
Connor Abbott [Thu, 19 Aug 2021 15:11:20 +0000 (17:11 +0200)]
tu: Add a650-specific CCU flush workaround

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12475>

3 years agotu: Properly handle waiting on an earlier pipeline stage
Connor Abbott [Thu, 19 Aug 2021 13:49:00 +0000 (15:49 +0200)]
tu: Properly handle waiting on an earlier pipeline stage

I never implemented this properly, because I wasn't aware of the
clusters when writing the original pipeline barrier implementation. It
turns out that the Vulkan stages we get as part of the barriers are
actually good for something: the pipeline state is split into stages,
so earlier stages can run ahead of later stages, and sometimes we need
to wait when an earlier stage depends on the result of a later stage.
This happens most often when a shader reads the result of a color/depth
attachment write, because attachment writes happen in a logically later
stage. However, this could also happen for a FS -> VS dependency.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12475>

3 years agoanv: Optimize genX(cmd_buffer_emit_gfx12_depth_wa)
Nanley Chery [Wed, 16 Jun 2021 17:17:38 +0000 (10:17 -0700)]
anv: Optimize genX(cmd_buffer_emit_gfx12_depth_wa)

Only emit the workaround as needed.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoiris: Optimize genX(emit_depth_state_workarounds)
Nanley Chery [Wed, 16 Jun 2021 17:22:48 +0000 (10:22 -0700)]
iris: Optimize genX(emit_depth_state_workarounds)

Only emit the workaround as needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoiris: Use constants for emitting cso_z->packets
Nanley Chery [Thu, 17 Jun 2021 17:12:12 +0000 (10:12 -0700)]
iris: Use constants for emitting cso_z->packets

This should be a bit faster and easier to follow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agointel: Move the D16 workarounds out of ISL
Nanley Chery [Tue, 15 Jun 2021 17:38:38 +0000 (10:38 -0700)]
intel: Move the D16 workarounds out of ISL

Implement the workarounds in anv and iris instead.

Before this commit, ISL unconditionally modified workaround registers
while filling out depth stencil state. To account for this, drivers
unconditionally stalled prior to emitting depth stencil packets. This
hurt performance.

By having the drivers perform the workarounds, they can choose when to
modify the relevant registers. The drivers now avoid emitting the
workaround for NULL depth buffers. This reduces stalls and leads to
better performance.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (the ISL/Anv bits)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (the Iris bits)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoiris: Update clear_params only when HiZ is enabled
Nanley Chery [Thu, 17 Jun 2021 16:48:20 +0000 (09:48 -0700)]
iris: Update clear_params only when HiZ is enabled

This more closely matches ISL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoiris: Emit clear_params as part of cso_z->packets
Nanley Chery [Thu, 17 Jun 2021 16:39:50 +0000 (09:39 -0700)]
iris: Emit clear_params as part of cso_z->packets

This should be a bit faster.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoiris: Update the clear value in cso_z->packets
Nanley Chery [Thu, 17 Jun 2021 16:34:00 +0000 (09:34 -0700)]
iris: Update the clear value in cso_z->packets

Enables emitting the packets all at once later on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoiris: Add genX(emit_depth_state_workarounds)
Nanley Chery [Tue, 15 Jun 2021 15:37:56 +0000 (08:37 -0700)]
iris: Add genX(emit_depth_state_workarounds)

This will replace the workaround built into ISL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoanv: Add genX(cmd_buffer_emit_gfx12_depth_wa)
Nanley Chery [Tue, 15 Jun 2021 16:52:58 +0000 (09:52 -0700)]
anv: Add genX(cmd_buffer_emit_gfx12_depth_wa)

This will replace the workaround built into ISL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>

3 years agoradv: fix copying depth+stencil images on compute
Samuel Pitoiset [Mon, 9 Aug 2021 19:57:41 +0000 (21:57 +0200)]
radv: fix copying depth+stencil images on compute

Using separate aspects is required.

Fixes a few CTS failures (dEQP-VK.api.copy_and_blit.*) when the compute
path is forced in the driver. Note that CTS coverage of the compute
queue is rather limited.

Cc: 21.2 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12287>

3 years agoglsl: fix variable scope for instructions inside case statements
Timothy Arceri [Wed, 18 Aug 2021 03:57:14 +0000 (13:57 +1000)]
glsl: fix variable scope for instructions inside case statements

Fixes: 665d75cc5a23 ("glsl: Fix scoping bug in if statements.")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5247

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12435>

3 years agoradv: remove incorrect comment about compressed writes to HTILE on GFX10+
Samuel Pitoiset [Wed, 18 Aug 2021 14:00:50 +0000 (16:00 +0200)]
radv: remove incorrect comment about compressed writes to HTILE on GFX10+

This seems to be unsupported.
COMPRESSION_EN=1 and WRITE_COMPRESS_ENABLE=1 don't update HTILE
with image stores.

Note that there is no issue because depth/stencil images will be
decompressed for image stores, and TC-compat HTILE is disabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12450>

3 years agoradv: remove unnecessary check in radv_layout_is_htile_compressed()
Samuel Pitoiset [Wed, 18 Aug 2021 13:58:09 +0000 (15:58 +0200)]
radv: remove unnecessary check in radv_layout_is_htile_compressed()

The driver doesn't enable TC-compat HTILE for storage images, so this
was actually always TRUE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12450>

3 years agost/mesa: move handling CubeMapSeamless into st_convert_sampler where it belongs
Marek Olšák [Mon, 7 Jun 2021 12:51:41 +0000 (08:51 -0400)]
st/mesa: move handling CubeMapSeamless into st_convert_sampler where it belongs

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12334>

3 years agost/mesa: set take_ownership = true in set_sampler_views
Marek Olšák [Sun, 6 Jun 2021 06:28:14 +0000 (02:28 -0400)]
st/mesa: set take_ownership = true in set_sampler_views

update_textures_local is removed because the only thing it did was
unreferencing sampler views, which is being removed.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12334>

3 years agogallium: add take_ownership into set_sampler_views to skip reference counting
Marek Olšák [Sun, 6 Jun 2021 06:23:31 +0000 (02:23 -0400)]
gallium: add take_ownership into set_sampler_views to skip reference counting

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12334>

3 years agoac/surface: allow arbitrary swizzle modes for displayable DCC
Marek Olšák [Tue, 17 Aug 2021 16:57:03 +0000 (12:57 -0400)]
ac/surface: allow arbitrary swizzle modes for displayable DCC

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>

3 years agoradv: allow arbitrary swizzle modes for displayable DCC
Marek Olšák [Tue, 17 Aug 2021 16:57:03 +0000 (12:57 -0400)]
radv: allow arbitrary swizzle modes for displayable DCC

by adding retile pipeline variants

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>

3 years agoradeonsi: allow arbitrary swizzle modes for displayable DCC
Marek Olšák [Tue, 17 Aug 2021 16:57:03 +0000 (12:57 -0400)]
radeonsi: allow arbitrary swizzle modes for displayable DCC

by adding retile shader variants

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>

3 years agoir3: prohibit folding of half->full conversion into mul.s24/u24
Danylo Piliaiev [Thu, 19 Aug 2021 11:53:23 +0000 (14:53 +0300)]
ir3: prohibit folding of half->full conversion into mul.s24/u24

mul.s24/u24 always return a 32b result regardless of the size of their
sources, hence we cannot guarantee that the high 16b of dst are zero or
sign extended.

Fixes CTS tests on a650:
 dEQP-VK.spirv_assembly.type.scalar.i16.mul_test_high_part_zero_*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12471>
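
The arithmetic behind this can be sketched in plain C. This is a model under the assumption stated in the commit message (the multiply always writes a full 32b result); the function names are hypothetical and nothing here is ir3 code.

```c
#include <assert.h>
#include <stdint.h>

/* Models a multiply that always writes a full 32-bit destination, as the
 * commit message describes for mul.u24, given 16-bit sources. */
static uint32_t
mul_full32(uint16_t a, uint16_t b)
{
   return (uint32_t)a * (uint32_t)b;
}

/* What the unfolded sequence computes: a 16-bit multiply whose result is
 * then zero-extended to 32 bits by a separate conversion. */
static uint32_t
mul16_then_zext(uint16_t a, uint16_t b)
{
   uint16_t lo = (uint16_t)((uint32_t)a * (uint32_t)b);
   return (uint32_t)lo;
}
```

For 0x100 * 0x100 the full multiply yields 0x10000 while the 16-bit multiply followed by zero extension yields 0, so folding the conversion into the multiply would leave a nonzero high half.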

3 years agofreedreno/ci: Add spillall tests
Connor Abbott [Wed, 18 Aug 2021 11:06:37 +0000 (13:06 +0200)]
freedreno/ci: Add spillall tests

Only test shader tests, because the others are unlikely to have
interesting shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

3 years agoir3, turnip, freedreno: Report stp/ldp in shader stats
Connor Abbott [Fri, 23 Jul 2021 12:06:04 +0000 (14:06 +0200)]
ir3, turnip, freedreno: Report stp/ldp in shader stats

This is important after spilling, so that we get an indication when a
change causes spilling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

3 years agoir3: Fix getting stp/ldp components in ir3_info
Connor Abbott [Fri, 23 Jul 2021 11:57:24 +0000 (13:57 +0200)]
ir3: Fix getting stp/ldp components in ir3_info

Noticed by inspection when adding stp_count/ldp_count.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

3 years agoir3: Initial support for spilling non-shared registers
Connor Abbott [Fri, 23 Jul 2021 11:12:30 +0000 (13:12 +0200)]
ir3: Initial support for spilling non-shared registers

Support for spilling shared registers to normal registers is still TODO.
There are also several improvements to be made, like rematerialization.

Note, there is one behavior change to register pressure accounting: we
now include half registers in the current full pressure directly in
mergedregs mode, rather than adding the max half pressure to the max
full pressure afterwards, which might result in lower calculated max
pressure in some cases with half registers. This is needed for spilling,
since we need to make sure the total pressure including half registers
is below the maximum at each instruction. Because the entire pass is
rewritten, including the register pressure calculating parts, it didn't
seem worth it to separate out this change.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>
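
The difference between the two accounting schemes can be illustrated with a small sketch. The struct and function names are hypothetical, and pressure is expressed in abstract units rather than in the units the real pass uses.

```c
#include <assert.h>
#include <stddef.h>

/* Live register pressure at one instruction, in abstract units. */
struct pressure {
   unsigned full;
   unsigned half;
};

/* Old scheme as described above: track the two maxima separately and add
 * them afterwards. */
static unsigned
max_pressure_separate(const struct pressure *p, size_t n)
{
   unsigned max_full = 0, max_half = 0;

   assert(p || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (p[i].full > max_full)
         max_full = p[i].full;
      if (p[i].half > max_half)
         max_half = p[i].half;
   }
   return max_full + max_half;
}

/* New scheme: add half pressure to full pressure at each instruction and
 * take the maximum of the totals. */
static unsigned
max_pressure_combined(const struct pressure *p, size_t n)
{
   unsigned max_total = 0;

   assert(p || n == 0);
   for (size_t i = 0; i < n; i++) {
      unsigned total = p[i].full + p[i].half;
      if (total > max_total)
         max_total = total;
   }
   return max_total;
}
```

For the sequence (8, 0), (2, 4), (0, 6) the separate scheme reports 8 + 6 = 14 while the combined scheme reports 8, because the full and half peaks never occur at the same instruction.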

3 years agoir3: Fix compress_regs_left accounting for half-regs
Connor Abbott [Thu, 19 Aug 2021 16:50:07 +0000 (18:50 +0200)]
ir3: Fix compress_regs_left accounting for half-regs

This was just wrong - we need to check against the entire register file,
and we need to include removed full regs even if the register we're
trying to insert is a half-reg, or else we could run out of space when
reinserting full regs after it. There does need to be an additional
check so that we don't try to insert a half-reg beyond the half-reg
limit, but that has to happen in addition to the normal check.

This fixes KHR-GLES31.core.arrays_of_arrays.InteractionArgumentAliasing6
once spilling is added.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

3 years agoir3: Properly validate pcopy reg sizes
Connor Abbott [Wed, 18 Aug 2021 12:43:48 +0000 (14:43 +0200)]
ir3: Properly validate pcopy reg sizes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>