platform/upstream/mesa.git
Bas Nieuwenhuizen [Mon, 10 Jan 2022 22:15:34 +0000 (23:15 +0100)]
radv: Use MAX_PUSH_CONSTANTS_SIZE for saved push constants.

So that it can never again get out of sync.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14485>
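
A minimal sketch of the idea, assuming two hypothetical structs (`cmd_state`, `saved_state`) and an illustrative value for MAX_PUSH_CONSTANTS_SIZE; none of these names or values are taken from the radv sources. Sizing both arrays from one macro, plus a compile-time check, is one way to keep a saved copy from drifting out of sync with the live buffer:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value for illustration; the real limit is driver-defined. */
#define MAX_PUSH_CONSTANTS_SIZE 256

/* Live state: sized by the shared macro. */
struct cmd_state {
   uint8_t push_constants[MAX_PUSH_CONSTANTS_SIZE];
};

/* Saved copy: sized by the same macro rather than a separate literal. */
struct saved_state {
   uint8_t push_constants[MAX_PUSH_CONSTANTS_SIZE];
};

/* Compile-time guard: the build fails if the two sizes ever diverge. */
static_assert(sizeof(((struct saved_state *)0)->push_constants) ==
              sizeof(((struct cmd_state *)0)->push_constants),
              "saved push constants out of sync");

/* Copy the live push constants into the saved state. */
static void save_push_constants(struct saved_state *dst,
                                const struct cmd_state *src)
{
   assert(dst && src);
   memcpy(dst->push_constants, src->push_constants,
          sizeof(dst->push_constants));
}
```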

Mike Blumenkrantz [Tue, 4 Jan 2022 16:31:25 +0000 (11:31 -0500)]
zink: use device-local heap for sparse backing allocations

backing allocations are real allocations, so they shouldn't be initialized
as sparse containers

Fixes: 40fdb3212c3 ("zink: add a suballocator")

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14394>

Marcin Ślusarz [Fri, 17 Dec 2021 15:11:16 +0000 (16:11 +0100)]
nir: handle per-view clip/cull distances

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Fri, 17 Dec 2021 15:09:43 +0000 (16:09 +0100)]
spirv: mark [Clip|Cull]DistancePerViewNV variables as compact

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Thu, 28 Oct 2021 10:54:59 +0000 (12:54 +0200)]
nir: remove invalid assert affecting per-view variables

per-view variables can have an arbitrary (but > 0) number of array levels

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Fri, 17 Dec 2021 16:02:18 +0000 (17:02 +0100)]
spirv: handle multiview bits of SPV_NV_mesh_shader

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Fri, 10 Sep 2021 14:42:01 +0000 (16:42 +0200)]
nir: add load_mesh_view_count and load_mesh_view_indices intrinsics

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Thu, 16 Dec 2021 13:28:58 +0000 (14:28 +0100)]
spirv: add MeshViewCountNV/MeshViewIndicesNV builtins from SPV_NV_mesh_shader

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Thu, 22 Jul 2021 11:47:42 +0000 (13:47 +0200)]
compiler: add new MESH_VIEW_COUNT/MESH_VIEW_INDICES system values

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Marcin Ślusarz [Fri, 17 Dec 2021 16:00:08 +0000 (17:00 +0100)]
spirv: handle ViewportMaskNV builtin/cap from SPV_NV_mesh_shader

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263>

Nanley Chery [Wed, 8 Dec 2021 19:45:42 +0000 (14:45 -0500)]
intel/isl: Return false more in isl_surf_get_hiz_surf

Follow the CCS and MCS functions by returning false for unsupported
cases. This reduces the burden on the caller.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Nanley Chery [Sat, 4 May 2019 00:40:54 +0000 (17:40 -0700)]
intel/isl: Allow HiZ with Tile4/64 surfaces

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Nanley Chery [Wed, 8 Dec 2021 18:28:46 +0000 (13:28 -0500)]
intel/isl: Require Y-tiling for depth on gfx4-5

This enables isl_surf_get_hiz_surf to be simplified.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Nanley Chery [Mon, 6 Dec 2021 03:29:44 +0000 (22:29 -0500)]
intel/isl: Use a new HiZ format on XeHP+

The new HiZ compresses twice as many rows of the depth surface compared
to TGL (Bspec 47009). Also, its tiling needs to be specified in
3DSTATE_HIER_DEPTH_BUFFER_BODY::TiledMode.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Nanley Chery [Mon, 6 Dec 2021 03:29:44 +0000 (22:29 -0500)]
intel/isl: Update comment for the XeHP HiZ block

An 8x4 HiZ block doesn't fit in with the new formulas for sizing HiZ on
XeHP. Update a comment which assumed this block size on SKL+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Nanley Chery [Fri, 3 Dec 2021 20:00:24 +0000 (15:00 -0500)]
intel/isl: Rework HiZ image align calculations

* Check the format's compression type instead of the format directly to
  prepare for a new HiZ format on XeHP.
* Adjust the gfx12+ calculations so that XeHP will automatically be
  handled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Nanley Chery [Mon, 6 Dec 2021 01:46:16 +0000 (20:46 -0500)]
blorp: Drop multisampled code in blorp_can_hiz_clear_depth

Anv allows non-8x4-aligned depth buffer clears, but it has multisampled
HiZ disabled for BDW. iris allows multisampled HiZ on BDW, but disallows
non-8x4-aligned depth buffer clears.

Drop the unused optimization for non-8x4-aligned clears of multisampled
surfaces on BDW and use this opportunity to use some PRM text in the
code comment.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14091>

Felix DeGrood [Thu, 6 Jan 2022 22:23:50 +0000 (14:23 -0800)]
anv: increase binding table pool size to 64KB

The binding table pool runs out of capacity quickly in modern games,
requiring new Surface Base Address instructions to be sent. That
is costly due to flushes and stalls.  Increasing BT pool capacity
to 64KB improves performance in several workloads.

Fallout4 +4%
Shadow of the Tomb Raider +4%
Borderlands3 +3%

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14483>

Lionel Landwerlin [Tue, 11 Jan 2022 12:03:47 +0000 (14:03 +0200)]
intel/dev: fixup chv workaround

We're using the wrong helper to get the subslice total count.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c24ba6cecbacf2 ("intel/dev: Handle CHV CS thread weirdness in get_device_info_from_fd")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14492>

Jason Ekstrand [Sun, 2 Jan 2022 05:12:43 +0000 (23:12 -0600)]
turnip: Use vk_common_QueueSignalReleaseImageANDROID for DRM

It's identical to the one turnip copy+pasted from RADV.  For KGSL, we
still need to hand-roll because of all the emulated stuff.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14411>

Jason Ekstrand [Sun, 2 Jan 2022 05:10:31 +0000 (23:10 -0600)]
turnip: Use vk_common_AcquireImageANDROID

It's got some bug fixes that turnip never picked up.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14411>

Pavel Ondračka [Tue, 4 Jan 2022 11:41:55 +0000 (12:41 +0100)]
r300: use point sprite coordinates only when drawing points (v5)

Fixes piglit arb_point_sprite-interactions
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/364
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/370

Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14389>

Mike Blumenkrantz [Thu, 6 Jan 2022 18:01:41 +0000 (13:01 -0500)]
zink: add extra synchronization for buffer descriptor binds

"most" times it isn't necessary to insert any pipeline barriers when binding
descriptors, as GL requires explicit barrier usage which comes through a different
codepath

the exception here is when the following scenario occurs:
* have buffer A
* buffer_subdata is called on A
* discard path is taken || A is not host-visible
* stream uploader is used for host write
* CmdCopyBuffer is used to copy the data back to A
buffer A now has a pending TRANSFER write that must complete before the buffer is
used in a shader, so synchronization is required any time TRANSFER usage is detected
in a bind

there are also going to be more exceptions as more internal usage is added,
so just remove the whole fake-barrier mechanism since it'll become more problematic
going forward

Cc: 21.3 mesa-stable
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14496>
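
The scenario above can be sketched as follows. This is only an illustration under stated assumptions: the access bits, `struct buffer_state` and `bind_buffer_descriptor` are hypothetical stand-ins, not zink code, and the actual pipeline barrier is reduced to a boolean result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access bits standing in for the Vulkan access flags. */
enum {
   ACCESS_TRANSFER_WRITE = 1u << 0,
   ACCESS_SHADER_READ    = 1u << 1,
};

/* Hypothetical per-buffer tracking state. */
struct buffer_state {
   uint32_t pending_access; /* accesses recorded but not yet synchronized */
};

/* Bind a buffer as a descriptor. Returns true when a pending TRANSFER
 * write was detected, i.e. when a barrier is required before shader use. */
static bool bind_buffer_descriptor(struct buffer_state *buf)
{
   assert(buf);
   bool need_barrier = (buf->pending_access & ACCESS_TRANSFER_WRITE) != 0;
   /* A real driver would record the pipeline barrier here. */
   buf->pending_access = ACCESS_SHADER_READ;
   return need_barrier;
}
```

With this shape, a buffer that was just written through a copy reports a required barrier on its first bind and none on the next one.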

Jesse Natalie [Tue, 11 Jan 2022 01:08:56 +0000 (17:08 -0800)]
d3d12/ci: Skip flaky tex-miplevel-selection and timestamp tests

Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14494>

Mike Blumenkrantz [Mon, 10 Jan 2022 16:36:25 +0000 (11:36 -0500)]
zink: always unset vertex shader variant key data when changing last vertex stage

ensure that vertex key data is always zeroed when changing last stage since it will
be updated before draw anyway and can only cause problems if left alone here

fixes the following caselist:
dEQP-GLES31.functional.shaders.builtin_constants.tessellation_shader.max_tess_evaluation_texture_image_units
dEQP-GLES31.functional.tessellation_geometry_interaction.feedback.tessellation_output_quads_geometry_output_points
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.25

cc: mesa-stable

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14482>

Mike Blumenkrantz [Thu, 6 Jan 2022 18:56:14 +0000 (13:56 -0500)]
zink: add some wsi instance extensions

not used for now

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14426>

Mike Blumenkrantz [Mon, 3 Jan 2022 17:02:24 +0000 (12:02 -0500)]
zink: add missing assert for 8bit vertex decompose

verify that this bit was set above

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14380>

Pierre-Eric Pelloux-Prayer [Tue, 4 Jan 2022 10:57:38 +0000 (11:57 +0100)]
radv: implement wsi's private transfer queue using SDMA

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>

Pierre-Eric Pelloux-Prayer [Wed, 8 Dec 2021 13:05:15 +0000 (14:05 +0100)]
vulkan/wsi: add a private transfer pool to exec the DRI_PRIME blit

The idea is to offer the driver a way to execute on a different queue
than the one the app is using for Present.

For instance, this could be used to make the DRI_PRIME blit asynchronous,
by using a transfer queue.

So instead of creating a command buffer to be executed on present using
the supplied queue, this commit uses an internal transfer queue to perform
the blit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>

Pierre-Eric Pelloux-Prayer [Fri, 7 Jan 2022 13:49:30 +0000 (14:49 +0100)]
vulkan/wsi: add use_prime_blit param to wsi_swapchain_init

Instead of initializing it to false and overriding it later if
needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>

Pierre-Eric Pelloux-Prayer [Mon, 6 Dec 2021 09:47:34 +0000 (10:47 +0100)]
radv: allocate the prime buffer as uncached

This is a write-only buffer so caches aren't needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>

Pierre-Eric Pelloux-Prayer [Thu, 25 Nov 2021 09:12:56 +0000 (10:12 +0100)]
radv: partial sdma support

SDMA code adapted from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12763

The only supported use case is image (linear or tiled) -> buffer and only GFX9+ is
supported (for now).

Since RADV_QUEUE_TRANSFER isn't exposed to applications, this cannot be used,
except by the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>

Pierre-Eric Pelloux-Prayer [Thu, 6 Jan 2022 13:38:24 +0000 (14:38 +0100)]
amd: add SDMA_NOP_PAD

And use it in amdgpu_cs.c.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13959>

Daniel Schürmann [Sat, 8 Jan 2022 21:57:29 +0000 (21:57 +0000)]
aco: validate VOP3P opsel correctly

Before RA, subdword operands must use .xx
After RA, opsel can either be .xx or .yy

Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14472>
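
A rough sketch of the two rules above, assuming they apply to sub-dword operands only. The types and the function `validate_vop3p_operand` are hypothetical and do not mirror the ACO validator; a swizzle is modeled as the half selected for the low and high result lanes.

```c
#include <assert.h>
#include <stdbool.h>

/* Which 16-bit half of the source a result lane reads. */
enum half { HALF_X, HALF_Y };

/* Hypothetical per-operand description. */
struct operand_sel {
   enum half lo;   /* half read for the low result lane */
   enum half hi;   /* half read for the high result lane */
   bool subdword;  /* operand is narrower than a dword */
};

/* Before register allocation only .xx is accepted for sub-dword
 * operands; after register allocation .xx and .yy are both accepted. */
static bool validate_vop3p_operand(const struct operand_sel *op, bool after_ra)
{
   assert(op);
   if (!op->subdword)
      return true;              /* rule only concerns sub-dword operands */
   if (op->lo != op->hi)
      return false;             /* mixed swizzles such as .xy are rejected */
   if (!after_ra && op->lo != HALF_X)
      return false;             /* .yy is not allowed before RA */
   return true;
}
```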

Tapani Pälli [Mon, 10 Jan 2022 11:40:27 +0000 (13:40 +0200)]
mesa: free vbo_save_vertex_list store prims

Fixes a leak:
  ==47470== 60 bytes in 1 blocks are definitely lost in loss record 1,790 of 1,904
  ==47470==    at 0x484186F: malloc (vg_replace_malloc.c:381)
  ==47470==    by 0x58EBA6A: compile_vertex_list (vbo_save_api.c:535)
  ==47470==    by 0x58EDABF: wrap_buffers (vbo_save_api.c:1021)
  ==47470==    by 0x58EDF97: upgrade_vertex (vbo_save_api.c:1134)
  ==47470==    by 0x58EE52F: fixup_vertex (vbo_save_api.c:1251)
  ==47470==    by 0x58EFE9E: _save_Normal3f (vbo_attrib_tmp.h:315)

Fixes: 69615d92a0e ("vbo/dlist: realloc prims array instead of free/malloc")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14474>

Tapani Pälli [Mon, 10 Jan 2022 11:23:31 +0000 (13:23 +0200)]
mesa: free idalloc storage for display lists

Fixes a leak:
  ==46154== 48 bytes in 1 blocks are definitely lost in loss record 1,571 of 1,905
  ==46154==    at 0x48466AF: realloc (vg_replace_malloc.c:1437)
  ==46154==    by 0x5FC98EC: util_idalloc_resize (u_idalloc.c:43)
  ==46154==    by 0x5FC9C16: util_idalloc_alloc_range (u_idalloc.c:125)
  ==46154==    by 0x56FDB9F: _mesa_EndList (dlist.c:13681)

Fixes: b703d7c15f4 ("dlist: store all dlist in a continuous memory block")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14474>

Francisco Jerez [Wed, 13 Oct 2021 06:57:53 +0000 (23:57 -0700)]
intel/xehp: Switch to coarser cross-slice pixel hashing with table permutation.

The coarser 32x32 cross-slice hashing mode seems to lead to better L1
and L2 utilization due to the improved execution locality.  However, it
can also lead to a bottleneck in a single slice, especially in
workloads that concentrate heavy rendering in small areas of the
screen (e.g. SynMark2 OglGeomPoint, OglTerrain*).  This effect is
mitigated here by performing a permutation of the pixel pipe hashing
tables that ensures that adjacent rows map to pixel pipes as far away
as possible in the caching hierarchy.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Wed, 6 Oct 2021 21:45:35 +0000 (14:45 -0700)]
anv: Program pixel hashing tables on XeHP.

Note that this has an effect even for unfused native die platforms,
since the pixel pipe hashing tables we intend to program aren't
equivalent to the hardware's defaults on such configs.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Wed, 6 Oct 2021 21:45:02 +0000 (14:45 -0700)]
iris: Program pixel hashing tables on XeHP.

Unlike the Gen11 code, this requires us to allocate a pipe_resource
for the pixel pipe hashing tables and hold a reference to it from the
context, since we need to add it to the validation list of every
batch: the tables may be accessed by the hardware at any time after
they're specified via 3DSTATE_SLICE_TABLE_STATE_POINTERS.

Note that this has an effect even for unfused native die platforms,
since the pixel pipe hashing tables we intend to program aren't
equivalent to the hardware's defaults on such configs.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Tue, 26 Oct 2021 23:51:41 +0000 (16:51 -0700)]
intel: Rename intel_compute_pixel_hash_table() to intel_compute_pixel_hash_table_3way().

For consistency with intel_compute_pixel_hash_table_nway().

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Wed, 6 Oct 2021 21:42:18 +0000 (14:42 -0700)]
intel: Minimal calculation of pixel hash table for arbitrary number of pixel pipes.

This starts off with the simplest possible pixel hashing table
calculation that just assigns consecutive indices (modulo N) to
adjacent entries of the table, along the lines of the existing
intel_compute_pixel_hash_table().  The same function will be improved
in a future commit with a more optimal calculation.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
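
The "consecutive indices modulo N" scheme described above might look roughly like this. The helper name, its signature and the row-major table layout are assumptions made for illustration, not the Mesa function.

```c
#include <assert.h>
#include <stdint.h>

/* Fill a width x height table (row-major) so that adjacent entries get
 * consecutive pixel pipe indices, wrapping around modulo n_pipes.
 * Hypothetical helper, not the function added by this commit. */
static void fill_hash_table_modulo(uint32_t *table, unsigned width,
                                   unsigned height, unsigned n_pipes)
{
   assert(table && n_pipes > 0);
   for (unsigned y = 0; y < height; y++) {
      for (unsigned x = 0; x < width; x++) {
         unsigned i = y * width + x;   /* running entry index */
         table[i] = i % n_pipes;
      }
   }
}
```

For a 4x2 table and three pipes this yields 0, 1, 2, 0 in the first row and 1, 2, 0, 1 in the second. A simple fill like this makes no attempt to spread neighbouring rows across distant pipes.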

Francisco Jerez [Tue, 26 Oct 2021 23:50:35 +0000 (16:50 -0700)]
intel: Move pixel hashing table computation into common header file.

This avoids some duplication between the GL and Vulkan drivers,
which would get worse as we introduce additional code to
handle more recent generations.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Wed, 21 Jul 2021 21:50:12 +0000 (14:50 -0700)]
iris: Merge gfx11_ and gfx12_upload_pixel_hashing_tables() into the same function.

Will save some boilerplate as we introduce another variant of this
function.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Wed, 21 Jul 2021 21:30:28 +0000 (14:30 -0700)]
intel/genxml: Fix SLICE_HASH_TABLE struct on XeHP.

It's now an array of 7 tables, each intended to specify the
pixel pipe hashing behavior for one possible slice count between 2
and 8.  However, that doesn't actually work, among other reasons due to
hardware bugs that cause the GPU to erroneously access the table
at the wrong index in some cases, so in practice all 7 tables need to
be initialized to the same value.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>
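
A small sketch of "initialize all 7 tables to the same value". The struct layout and the table size of 16 entries are hypothetical and chosen only for illustration; they do not reflect the genxml definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SLICE_TABLES 7   /* one per slice count from 2 to 8 */
#define TABLE_ENTRIES    16  /* hypothetical size, for illustration only */

/* Hypothetical container for the per-slice-count tables. */
struct slice_hash_tables {
   uint32_t entry[NUM_SLICE_TABLES][TABLE_ENTRIES];
};

/* Copy one computed table into every slot, so that a lookup at the
 * wrong index still reads the intended contents. */
static void init_all_tables(struct slice_hash_tables *out,
                            const uint32_t *table)
{
   assert(out && table);
   for (unsigned i = 0; i < NUM_SLICE_TABLES; i++)
      memcpy(out->entry[i], table, sizeof(out->entry[i]));
}
```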

Francisco Jerez [Tue, 26 Oct 2021 23:51:19 +0000 (16:51 -0700)]
intel/blorp/gfx12+: Drop unnecessary state cache invalidation from binding table setup.

The state cache invalidation shouldn't be necessary on recent
platforms.  On ICL it *seems* to be required to get the hardware to
pick up an updated indirect clear color, so this change is only
applied to TGL platforms and later for the moment.

On some DG2 configs this seems to improve SynMark2/OglDrvRes by 16.0%
±0.1%, n=8.

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Sat, 16 Oct 2021 21:33:51 +0000 (14:33 -0700)]
intel/fs: Don't assume packed dispatch for fragment shaders on XeHP.

The current packed dispatch assumptions for fragment shaders seem to
be the reason that the fs-readFirstInvocation-uint-loop Piglit
test-case for the ARB_shader_ballot extension fails on DG2 in
combination with the patches in this series that enable pixel pipe
hashing (thanks Jordan for reporting the regression).  I've confirmed
that the brw_fs_test_dispatch_packing() test fails on DG2 hardware for
fragment shaders, while it succeeds for other shader stages,
indicating that the PSD hardware no longer guarantees packed dispatch.
Disable it.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Francisco Jerez [Mon, 19 Jul 2021 20:51:46 +0000 (13:51 -0700)]
intel/xehp: Update 3DSTATE_PS maximum number of threads per PSD.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>

Jesse Natalie [Fri, 7 Jan 2022 15:52:46 +0000 (07:52 -0800)]
docs: Update d3d12 extension list and new_features.txt

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Thu, 23 Dec 2021 00:06:25 +0000 (16:06 -0800)]
d3d12: Enable compute

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Sat, 1 Jan 2022 16:09:05 +0000 (08:09 -0800)]
d3d12: Run DXIL shared atomic lowering pass

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Sat, 1 Jan 2022 00:56:08 +0000 (16:56 -0800)]
d3d12: Handle indirect dispatch

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 22:50:07 +0000 (14:50 -0800)]
d3d12: Implement num workgroups as a state var

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 21:50:42 +0000 (13:50 -0800)]
d3d12: Implement launch_grid

Some more refactoring in d3d12_draw.cpp to re-use a bunch of state
and descriptor management, and some refactoring of the dirty states.

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 20:54:04 +0000 (12:54 -0800)]
d3d12: Hook up compute shader variations

Currently only variable workgroup size is implemented

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 18:08:54 +0000 (10:08 -0800)]
d3d12: Support compute root signatures

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 17:58:50 +0000 (09:58 -0800)]
d3d12: Compile, bind, and cache compute PSOs

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 21:47:04 +0000 (13:47 -0800)]
d3d12: Stop trying to set D3D12_DIRTY_SHADER during bindings

We don't key off of it to try to figure out if we need to produce
a new shader variant, so there's no need to set it when changing
properties that feed into variants. If we do have a new shader or
variant at draw time, we'll produce a new PSO without this.

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 20:52:05 +0000 (12:52 -0800)]
d3d12: Remove draw_info from selection_context

It's not needed, and having it there can be misleading since sometimes it's null

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 19:52:31 +0000 (11:52 -0800)]
d3d12: Keep state vars last in the per-stage root parameters

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Mon, 10 Jan 2022 23:27:33 +0000 (15:27 -0800)]
d3d12: Limit sampler view count to 32

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Sat, 1 Jan 2022 16:08:48 +0000 (08:08 -0800)]
microsoft/compiler: Handle more GL memory barriers

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Jesse Natalie [Fri, 31 Dec 2021 22:28:28 +0000 (14:28 -0800)]
microsoft/compiler: Move workgroup_size lowering from clc

It doesn't depend on the clc data being provided externally, so there's no
need to tie it there; we can re-use it for GL and Vulkan compute.

Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>

Rob Clark [Mon, 10 Jan 2022 16:49:33 +0000 (08:49 -0800)]
freedreno: Report system memory as video memory

This seems to be the approach that other UMA drivers have settled on,
when there aren't some other constraints.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5675
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14478>

Emma Anholt [Tue, 28 Dec 2021 19:04:28 +0000 (11:04 -0800)]
nir_to_tgsi: Fix a bug in TXP detection after backend lowering.

TGSI reserves 2 components for the coord in the first operand vector, even
for 1D.  Fixes r600 failure with shadow1d.

Fixes: 390a3fcdc45e ("nir_to_tgsi: Add support for TXP.")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14322>

Francisco Jerez [Sat, 19 Jun 2021 02:40:10 +0000 (19:40 -0700)]
intel/xehp: Implement XeHP workaround Wa_14014148106.

Actually, no, there's no need to do anything, just update some
comments for the record.  An earlier revision of this change that
implemented the workaround text to the letter required no less than 8
new PIPE_CONTROLs throughout the tree.  However Felix DeGrood noticed
that the cost of some of the PIPE_CONTROLs was showing up in workloads
like Shadow of the Tomb Raider.  The Windows driver wasn't emitting
many of those pipe controls, contrary to the W/A instructions, so we
engaged in a back and forth with the hardware team, who concluded that
the original suggested workaround was unnecessarily strict, and the
Windows driver's behavior acceptable.  It turns out that Wa_1408224581
we had already implemented for TGL is roughly equivalent to the
Windows behavior, so no need to do anything new after all.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>

2 years agointel/xehp: Implement XeHP workaround Wa_14013910100.
Francisco Jerez [Sat, 19 Jun 2021 02:39:08 +0000 (19:39 -0700)]
intel/xehp: Implement XeHP workaround Wa_14013910100.

XeHP platforms require the invalidation of the instruction cache after
a STATE_BASE_ADDRESS change due to a hardware bug potentially leading
to instruction cache pollution.  Note that the workaround text says
it's applicable to "DG2 128/256/512-A/B", however it's also marked as
permanent and not confirmed to be fixed in any specific stepping, so we
apply it to all Gfx12HP platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>

2 years agovc4: Use u_box_pixels_to_blocks helper
Alyssa Rosenzweig [Tue, 4 Jan 2022 21:41:33 +0000 (16:41 -0500)]
vc4: Use u_box_pixels_to_blocks helper

Eliminates an ETC1 special case. In fact this unit conversion applies to
all formats; the original code path works since ETC1 is the only format
with blocks bigger than 1x1 supported by vc4 (I assume).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>

2 years agov3d: Use u_box_pixels_to_blocks helper
Alyssa Rosenzweig [Tue, 4 Jan 2022 21:18:42 +0000 (16:18 -0500)]
v3d: Use u_box_pixels_to_blocks helper

Rather than open-coding.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>

2 years agolima,panfrost: Correct pixel vs block mismatches
Alyssa Rosenzweig [Tue, 4 Jan 2022 20:31:25 +0000 (15:31 -0500)]
lima,panfrost: Correct pixel vs block mismatches

Different parts of our codebase disagree on whether spatial
coordinates/dimensions are given in pixels or blocks, which differ by a
constant factor for block-compressed formats. This disagreement
manifests as incorrect results accessing block-compressed formats.

To resolve this, define the public tiling routines to take their
coordinates in pixels, and align the relevant code in Panfrost
accordingly.

Fixes rendering glitches in Factorio, as well as a pile of piglits on
Panfrost. It should also fix glTexSubImage() with ETC1 on Lima, but
there are no tests for this in dEQP/Piglit.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> [dEQP/Lima]
Tested-by: Erico Nunes <nunes.erico@gmail.com> [Piglit/Lima]
Reported-by: Icecream95 <ixn@disroot.org>
Closes: #5560
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>

2 years agogallium/util: Add pixel->blocks box helper
Alyssa Rosenzweig [Tue, 4 Jan 2022 21:16:06 +0000 (16:16 -0500)]
gallium/util: Add pixel->blocks box helper

There is a lot of unit confusion in Gallium due to pixels versus blocks
matching only with uncompressed textures. Add a helper to do a common
pixels->blocks unit conversion required in multiple drivers.
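
As an illustration of the unit conversion described above, here is a minimal
sketch in C. It is not the helper added by this commit: the struct box2d type
and the function names are hypothetical stand-ins, and the block dimensions
are passed in directly rather than looked up from a format.

```c
#include <assert.h>

/* Hypothetical stand-in for a 2D box; field names are illustrative only. */
struct box2d {
   int x, y;
   int width, height;
};

/* Integer division rounding up, for non-negative n and positive d. */
static int div_round_up(int n, int d)
{
   return (n + d - 1) / d;
}

/* Convert a box given in pixels to the box of blocks that covers it.
 * The origin rounds down and the size rounds up, so a partially covered
 * block at the right or bottom edge is still counted. */
static struct box2d
box_pixels_to_blocks(struct box2d pixels, int block_w, int block_h)
{
   assert(block_w > 0 && block_h > 0);

   struct box2d blocks;
   blocks.x = pixels.x / block_w;
   blocks.y = pixels.y / block_h;
   blocks.width = div_round_up(pixels.width, block_w);
   blocks.height = div_round_up(pixels.height, block_h);
   return blocks;
}
```

With 1x1 blocks (uncompressed formats) the box is returned unchanged, which is
why pixels and blocks only agree for uncompressed textures.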

v2: Rename dst->blocks, src->pixels to avoid confusion about the units
to casual readers (Mike).

Note to mesa-stable maintainers: this is marked as Cc: mesa-stable so
the next patch (a set of bug fixes for Lima and Panfrost) can be
backported. It's not a bug fix in its own right, of course.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>

2 years agoreplace 0 with NULL for NULL pointers
Thomas H.P. Andersen [Mon, 2 Aug 2021 21:39:27 +0000 (23:39 +0200)]
replace 0 with NULL for NULL pointers

This updates many places where 0 is used as a NULL pointer.

There are a few warnings left when I build the default
configuration but they either relate to code
outside of mesa or where "None" is used instead.

Found with static analysis (smatch)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12174>

2 years agoaco: remove pack_half_2x16(a, 0) optimization
Rhys Perry [Fri, 7 Jan 2022 16:13:00 +0000 (16:13 +0000)]
aco: remove pack_half_2x16(a, 0) optimization

This makes the compiler less predictable and should only have a very small
effect on performance.

fossil-db (Vega):
Totals from 2410 (1.79% of 134756) affected shaders:
CodeSize: 6911568 -> 6942840 (+0.45%)

Fixes Horizon Zero Dawn artifacts.

If a shader has:
   a = pack_half_2x16(a, 0) //rtne
   store(pack_half_2x16(0, b) | a) //rtne
   a = unpack_2x16(a).x
It will become:
   store(pack_half_2x16(a, b)) //rtz
   a = unpack_2x16(pack_half_2x16(a, 0)).x //rtne

So a later shader with "unpack_2x16(load()).x" will use "a" rounded to
zero, while the previous shader will use "a" rounded to the nearest even.
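
To see why the two rounding modes can disagree, here is a small sketch in C of
a float to half conversion with a selectable rounding mode. It is illustrative
only and is not ACO code: the function name is hypothetical, and it only
handles finite inputs whose result is a normal half (no zeros, denormals, NaN
or infinity inputs).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert a float to IEEE half bits.
 * round_to_zero != 0 selects truncation (rtz); otherwise round to nearest
 * even (rtne). Assumption: the input maps to a normal half value. */
static uint16_t f32_to_f16_bits(float f, int round_to_zero)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));

   uint32_t sign = (bits >> 16) & 0x8000u;
   int32_t exp = (int32_t)((bits >> 23) & 0xffu) - 127 + 15;
   uint32_t mant = bits & 0x7fffffu;
   assert(exp >= 1 && exp <= 30);

   /* Keep the top 10 mantissa bits; the low 13 bits are the remainder. */
   uint32_t half = ((uint32_t)exp << 10) | (mant >> 13);
   uint32_t rem = mant & 0x1fffu;

   if (!round_to_zero) {
      /* Round up above the halfway point, or on a tie when the kept value
       * is odd. A mantissa carry correctly bumps the exponent. */
      if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
         half++;
   }
   return (uint16_t)(sign | half);
}
```

For 1.000732421875f (1 + 2^-11 + 2^-12), truncation yields 0x3c00 while round
to nearest even yields 0x3c01, so a value packed once under each mode no
longer compares equal.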

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2f125908b35 ("radv,aco: lower_pack_half_2x16")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14475>

2 years agoci: Uprev piglit to af1785f31
Christian Gmeiner [Thu, 6 Jan 2022 06:18:28 +0000 (07:18 +0100)]
ci: Uprev piglit to af1785f31

Brings in these changes:

af1785f31 occlusion_query_conform: skip GetQueryCounterBits test if needed
dad078717 occlusion_query_conform: convert to pilgit subtests
b52c1c761 glsl-1.30: test nested preprocessor concat
6c4da153b texture-storage: Fix subtest result handling of skips.
4343f19db fbo-integer: Remove the invalid DrawPixels test.
e3842f2fe arb_dsa: exclude stencil8 textures from test sets.
ce8649be7 spec/ext_external_objects: Fix build on Debian systems
4e553838f glsl: add basic tests for desktop GLSL invariant qualifier linking
7e61e5199 Tests for variable in and out of loop scope
f855ad1c8 fbo-mrt-alphatest: Only require GLSL 1.20
9be2fe999 glx: add glx-multi-display-single-pbuffer test
bfe290725 glx: add glx-swap-pbuffer test
efa64335e framework: Fix build on Windows when using waffle

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14468>

2 years agoisl: Don't enable HDC:L1 caches on DG2
Jordan Justen [Thu, 6 Jan 2022 21:33:07 +0000 (13:33 -0800)]
isl: Don't enable HDC:L1 caches on DG2

The MOCS entry used for this on Tigerlake doesn't exist on DG2.

Ref: aca31baafc0 ("isl: Enable Tigerlake HDC:L1 caches via MOCS in various cases.")
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14467>

2 years agonir/uniform_atomics: fix is_atomic_already_optimized without workgroups
Rhys Perry [Wed, 5 Jan 2022 13:51:50 +0000 (13:51 +0000)]
nir/uniform_atomics: fix is_atomic_already_optimized without workgroups

dims_needed would have been zero, so this would always have returned true for
non-compute stages.

Also fix this for variable workgroup sizes.

Improves Shadow of the Tomb Raider RX 6800 performance by 10.6%, 11.5% and
4.5% (day_of_dead, jungle and paititi scenes).

radv_perf before and after:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '62.913333333333334', 'min_fps': '62.81', 'max_fps': '62.98', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '64.02666666666666', 'min_fps': '63.93', 'max_fps': '64.11', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '74.81666666666666', 'min_fps': '74.72', 'max_fps': '74.88', 'interations': '3'}

{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '69.57', 'min_fps': '69.52', 'max_fps': '69.63', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '71.41000000000001', 'min_fps': '71.31', 'max_fps': '71.5', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '78.16666666666667', 'min_fps': '78.07', 'max_fps': '78.23', 'interations': '3'}

Performance now seems slightly better than AMDVLK 2021.Q4.3:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '68.02666666666666', 'min_fps': '67.95', 'max_fps': '68.16', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '70.24666666666667', 'min_fps': '69.83', 'max_fps': '70.51', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '77.19', 'min_fps': '77.18', 'max_fps': '77.2', 'interations': '3'}

fossil-db (Sienna Cichlid):
Totals from 40 (0.03% of 134621) affected shaders:
CodeSize: 62676 -> 65996 (+5.30%)
Instrs: 11372 -> 12111 (+6.50%)
Latency: 144122 -> 142848 (-0.88%); split: -1.09%, +0.21%
InvThroughput: 19686 -> 19847 (+0.82%); split: -0.06%, +0.87%
VClause: 304 -> 306 (+0.66%)
SClause: 603 -> 604 (+0.17%); split: -0.83%, +1.00%
Copies: 780 -> 858 (+10.00%)
Branches: 235 -> 329 (+40.00%)
PreSGPRs: 1072 -> 1083 (+1.03%); split: -0.37%, +1.40%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14407>

2 years agopanvk: Fixed maxFragmentCombinedOutputResources
Konstantin Seurer [Tue, 28 Dec 2021 19:10:51 +0000 (20:10 +0100)]
panvk: Fixed maxFragmentCombinedOutputResources

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>

2 years agoturnip: Fixed maxFragmentCombinedOutputResources
Konstantin Seurer [Tue, 28 Dec 2021 19:07:34 +0000 (20:07 +0100)]
turnip: Fixed maxFragmentCombinedOutputResources

Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>

2 years agoanv: Fixed maxFragmentCombinedOutputResources
Konstantin Seurer [Tue, 28 Dec 2021 19:04:48 +0000 (20:04 +0100)]
anv: Fixed maxFragmentCombinedOutputResources

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>

2 years agolavapipe: Fixed maxFragmentCombinedOutputResources
Konstantin Seurer [Tue, 28 Dec 2021 18:56:16 +0000 (19:56 +0100)]
lavapipe: Fixed maxFragmentCombinedOutputResources

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>

2 years agoac/nir: fix store_buffer_amd write_masks
Rhys Perry [Thu, 6 Jan 2022 17:43:44 +0000 (17:43 +0000)]
ac/nir: fix store_buffer_amd write_masks

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>

2 years agonir/lower_shader_calls: fix store_scratch write_mask
Rhys Perry [Fri, 7 Jan 2022 10:58:19 +0000 (10:58 +0000)]
nir/lower_shader_calls: fix store_scratch write_mask

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>

2 years agoetnaviv: drm: defer destruction of softpin BOs
Lucas Stach [Fri, 10 Dec 2021 21:46:37 +0000 (22:46 +0100)]
etnaviv: drm: defer destruction of softpin BOs

When destroying a BO with a userspace managed address and thus freeing
the VMA space, we need to make sure that the BO isn't in use by any
active submit anymore, as the kernel will rightfully reject the next
submit that re-uses the still active VMA. Keep the BO alive as long
as it isn't fully idle to prevent the VMA being reused prematurely.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>

2 years agoetnaviv: drm: rename _etna_bo_del
Lucas Stach [Fri, 10 Dec 2021 21:35:34 +0000 (22:35 +0100)]
etnaviv: drm: rename _etna_bo_del

Rename it to a somewhat more descriptive name, which makes it easier
to distinguish between the etna_bo_del function in the public interface
and the internal function. Also remove the duplicated forward declaration
and move it to the common internal header.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>

2 years agoetnaviv: drm: export BO idle check function
Lucas Stach [Fri, 10 Dec 2021 21:17:15 +0000 (22:17 +0100)]
etnaviv: drm: export BO idle check function

The ability to check if a BO is idle is not only useful in the
buffer cache, but also in other parts of the winsys and even the
pipe driver. Make this functionality available in the interface.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>

2 years agoetnaviv: drm: properly handle reviving BOs via a lookup
Lucas Stach [Fri, 10 Dec 2021 22:55:35 +0000 (23:55 +0100)]
etnaviv: drm: properly handle reviving BOs via a lookup

If a BO is removed from a cache bucket list via a lookup, we must
handle it in the same way as if an allocation from the cache happened:
tell valgrind that the buffer is active again and take a reference
to the etna_device, which the BO had given up while being in the
cache.

Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>

2 years agoetnaviv: drm: fix size limit in etna_cmd_stream_realloc
Lucas Stach [Thu, 6 Jan 2022 17:58:01 +0000 (18:58 +0100)]
etnaviv: drm: fix size limit in etna_cmd_stream_realloc

The intended limit for command stream size is 64KB, as this is what old
kernels can reliably do and what allows for maximum number of queued
streams on newer kernels. However, due to unit confusion with the size
member, which is in dwords, the submitted streams could grow up to
~128KB. Fix this by using the proper limit in dwords.
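
The unit arithmetic behind this can be sketched as follows. The macro and
function names are hypothetical and are not taken from the etnaviv sources;
the sketch only shows how a byte limit is expressed in dwords before it is
compared against a size counted in dwords.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Intended cap from the commit message: 64 KiB of command stream. */
#define MAX_STREAM_BYTES  (64u * 1024u)
/* The same cap expressed in 32-bit dwords (hypothetical name). */
#define MAX_STREAM_DWORDS (MAX_STREAM_BYTES / sizeof(uint32_t))

/* Convert a size counted in dwords to bytes. */
static size_t dwords_to_bytes(size_t dwords)
{
   return dwords * sizeof(uint32_t);
}

/* Check a stream size given in dwords against the limit in dwords, so
 * both sides of the comparison use the same unit. */
static bool stream_too_big(size_t size_dwords)
{
   return size_dwords > MAX_STREAM_DWORDS;
}
```

Under these assumptions the limit is 16384 dwords. A bound of 32768 dwords,
by contrast, would correspond to 128 KiB, twice the intended size.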

Flushing due to some limits being exceeded is not an issue, but is
expected with certain workloads, so lower the severity of the message
being emitted in this case to debug level.

Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14425>

2 years agoegl/wayland: break double/tripple buffering feedback loops
Lucas Stach [Fri, 7 Jan 2022 11:53:21 +0000 (12:53 +0100)]
egl/wayland: break double/tripple buffering feedback loops

Currently we dispose any unneeded color buffers immediately if we detect that
there are more unlocked buffers than we need. This can lead to feedback loops
between the compositor and the application causing rapid toggling between
double and triple buffering.

Scenario: 2 buffers already queued to the compositor, egl/wayland allocates a
new back buffer to avoid throttling, slowing down the frame. This allows the
compositor to catch up and unlock both buffers. EGL detects that there are
more buffers than currently needed, freeing the buffer, restarting the loop
shortly after.

To avoid wasting CPU time on rapidly freeing and reallocating color buffers
break those feedback loops by letting the unneeded buffers sit around for a
short while before disposing them.
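
One way to let unneeded buffers "sit around for a short while" is to age them
once per frame and free only those that have stayed idle past a threshold.
The sketch below assumes exactly that. The struct, the function name and the
threshold of 20 frames are hypothetical and are not claimed to match the
egl/wayland implementation.

```c
#include <stdbool.h>

/* Hypothetical threshold: frames a buffer may stay idle before disposal. */
#define TRIM_AGE 20

/* Hypothetical per-buffer bookkeeping. */
struct color_buffer {
   bool allocated;
   bool locked;   /* still in use by the compositor */
   int age;       /* consecutive frames spent idle */
};

/* Call once per frame. Buffers in use have their age reset; idle buffers
 * age by one and are released only after more than TRIM_AGE idle frames.
 * Returns the number of buffers released by this call. */
static int trim_idle_buffers(struct color_buffer bufs[], int count)
{
   int freed = 0;
   for (int i = 0; i < count; i++) {
      if (!bufs[i].allocated)
         continue;
      if (bufs[i].locked) {
         bufs[i].age = 0;
         continue;
      }
      if (++bufs[i].age > TRIM_AGE) {
         bufs[i].allocated = false;
         bufs[i].age = 0;
         freed++;
      }
   }
   return freed;
}
```

A buffer that becomes idle for only a frame or two is therefore kept, so a
brief switch between double and triple buffering does not cause a free
followed by an immediate reallocation.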

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14451>

2 years agotu,ir3: Implement VK_KHR_shader_integer_dot_product
Danylo Piliaiev [Fri, 26 Nov 2021 16:57:52 +0000 (18:57 +0200)]
tu,ir3: Implement VK_KHR_shader_integer_dot_product

- gen4 - has dp4acc and dp2acc, dp4acc is used to implement
  4x8 dot product.
- gen3 - has dp2acc; the OpenCL blob uses dp2acc for dot product
  on both gen3 and gen4.
- gen2 - unknown, lower everything.
- gen1 - no dp2acc, lower everything. OpenCL blob doesn't advertise
  cl_qcom_dot_product8 but still generates code for it.
  The assembly is more verbose and uses yet to be documented
  mad32.u16 instruction.

Passes:
 dEQP-VK.spirv_assembly.instruction.compute.opsdotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opudotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsudotkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsdotaccsatkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.*
 dEQP-VK.spirv_assembly.instruction.compute.opsudotaccsatkhr.*

Only packed 4x8 unsigned and mixed versions are accelerated.
However, in theory we should be able to do better for the signed version
than the current NIR lowering.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>

2 years agoir3: Make nir compiler options a part of ir3_compiler
Danylo Piliaiev [Tue, 30 Nov 2021 16:06:53 +0000 (18:06 +0200)]
ir3: Make nir compiler options a part of ir3_compiler

This would allow for sub-gens to have different options.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>

2 years agonir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
Danylo Piliaiev [Fri, 26 Nov 2021 17:27:03 +0000 (19:27 +0200)]
nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8

Adreno GPUs have native instructions for unsigned and mixed dot_4x8 but
not for signed dot product.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>

2 years agoir3: New cat3 instructions
Danylo Piliaiev [Wed, 24 Nov 2021 12:57:03 +0000 (14:57 +0200)]
ir3: New cat3 instructions

* shrm - (src2 >> src1) & src3
* shlm - (src2 << src1) & src3
* shrg - (src2 >> src1) | src3
* shlg - (src2 << src1) | src3
* andg - (src2 & src1) | src3
* dp2acc - dot product of two {i,u}8vec2 packed into
  SRC1 and SRC2, added to 32b SRC3
* dp4acc - dot product of two {i,u}8vec4 packed into
  SRC1 and SRC2, added to 32b SRC3
* wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is
  duplicated (1 << (SRC3 / 32)) times starting from DST register
* wmm.accu - same as wmm but result is added to DST registers, however
  the first reg in each vec4 result is overwritten instead of
  accumulating.
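
The formulas listed above for the simpler instructions can be written out as a
reference sketch in C. This is only a model of the expressions in the list,
not of the hardware: the function names are hypothetical, shift amounts are
masked to 5 bits as an assumption (C leaves larger shifts undefined), and only
the unsigned variants of dp2acc and dp4acc are shown.

```c
#include <stdint.h>

/* Assumption: only the low 5 bits of the shift amount are used. */
static uint32_t shamt(uint32_t src1) { return src1 & 31u; }

static uint32_t shrm(uint32_t s1, uint32_t s2, uint32_t s3) { return (s2 >> shamt(s1)) & s3; }
static uint32_t shlm(uint32_t s1, uint32_t s2, uint32_t s3) { return (s2 << shamt(s1)) & s3; }
static uint32_t shrg(uint32_t s1, uint32_t s2, uint32_t s3) { return (s2 >> shamt(s1)) | s3; }
static uint32_t shlg(uint32_t s1, uint32_t s2, uint32_t s3) { return (s2 << shamt(s1)) | s3; }
static uint32_t andg(uint32_t s1, uint32_t s2, uint32_t s3) { return (s2 & s1) | s3; }

/* Dot product of n packed unsigned 8-bit lanes, added to a 32-bit value. */
static uint32_t dot_u8_acc(uint32_t s1, uint32_t s2, uint32_t s3, int n)
{
   uint32_t acc = s3;
   for (int i = 0; i < n; i++) {
      uint32_t a = (s1 >> (8 * i)) & 0xffu;
      uint32_t b = (s2 >> (8 * i)) & 0xffu;
      acc += a * b;
   }
   return acc;
}

/* Unsigned models of dp2acc (u8vec2) and dp4acc (u8vec4). */
static uint32_t dp2acc_u(uint32_t s1, uint32_t s2, uint32_t s3) { return dot_u8_acc(s1, s2, s3, 2); }
static uint32_t dp4acc_u(uint32_t s1, uint32_t s2, uint32_t s3) { return dot_u8_acc(s1, s2, s3, 4); }
```

For example, dp4acc_u(0x01020304, 0x01010101, 10) sums the four bytes of the
first operand and adds the accumulator, giving 20.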

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>

2 years agotu: Implement VK_EXT_subgroup_size_control
Connor Abbott [Thu, 25 Nov 2021 16:02:42 +0000 (17:02 +0100)]
tu: Implement VK_EXT_subgroup_size_control

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>

2 years agotu, ir3: Support runtime gl_SubgroupSize in FS
Connor Abbott [Thu, 25 Nov 2021 15:55:01 +0000 (16:55 +0100)]
tu, ir3: Support runtime gl_SubgroupSize in FS

We already supported it in the CS for computing the subgroup ID, but
soon we'll need it in the FS too. Vertex stages will always have it
lowered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>

2 years agoir3: Add wavesize control
Connor Abbott [Thu, 25 Nov 2021 14:17:36 +0000 (15:17 +0100)]
ir3: Add wavesize control

This allows the wavesize to be controlled per-shader. This will be used
by VK_EXT_subgroup_size_control, and freedreno will also need it if
legacy ARB_shader_ballot is to be supported (since it forces a wavesize
of 64 or less).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>

2 years agoir3: Pass shader to ir3_nir_post_finalize()
Connor Abbott [Thu, 25 Nov 2021 14:16:36 +0000 (15:16 +0100)]
ir3: Pass shader to ir3_nir_post_finalize()

We'll need to add shader-specific lowering for gl_SubgroupSize.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>

2 years agoir3, freedreno: Add options struct for ir3_shader_from_nir()
Connor Abbott [Thu, 25 Nov 2021 13:30:46 +0000 (14:30 +0100)]
ir3, freedreno: Add options struct for ir3_shader_from_nir()

We'll expand this in a moment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>

2 years agotu: fix workaround for depth bounds test without depth test
Danylo Piliaiev [Thu, 7 Oct 2021 13:02:16 +0000 (16:02 +0300)]
tu: fix workaround for depth bounds test without depth test

Fixes: bb4db22ff43a708bf80a8f72913ee493313393d1 ("turnip: apply workaround for depth bounds test without depth test")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14390>

2 years agoanv: limit compiler valid color outputs using NIR variables
Lionel Landwerlin [Thu, 6 Jan 2022 09:03:36 +0000 (11:03 +0200)]
anv: limit compiler valid color outputs using NIR variables

This fixes a vkd3d-proton test (test_dual_source_blending_dxbc) which
asserts in the backend with:

   brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend || key->nr_color_regions == 1' failed.

This is because there are 2 color attachments provided by the
renderpass so we initially set nr_color_regions = 2. But once we've
parsed the shader, we can see it's only using one output (with dual
source color blending).

This change looks at the output variables to update the valid output
variables.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417>

2 years agoiris: unref syncobjs and free r/w dependencies array for slab entries
Tapani Pälli [Tue, 4 Jan 2022 09:26:55 +0000 (11:26 +0200)]
iris: unref syncobjs and free r/w dependencies array for slab entries

Fixes memory leak with dependencies array:

  ==5224== 104 (96 direct, 8 indirect) bytes in 3 blocks are definitely lost in loss record 1,954 of 2,035
  ==5224==    at 0x484178A: malloc (vg_replace_malloc.c:380)
  ==5224==    by 0x484670B: realloc (vg_replace_malloc.c:1437)
  ==5224==    by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819)
  ==5224==    by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898)
  ==5224==    by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031)
  ==5224==    by 0x14DB77D0: iris_transfer_map (iris_resource.c:2348)
  ==5224==    by 0x157786FD: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
  ==5224==    by 0x14C479E7: tc_buffer_map (u_threaded_context.c:2252)
  ==5224==    by 0x1434F3F8: pipe_buffer_map_range (u_inlines.h:393)
  ==5224==    by 0x1435094A: _mesa_bufferobj_map_range (bufferobj.c:491)
  ==5224==    by 0x143586D9: map_buffer_range (bufferobj.c:3737)
  ==5224==    by 0x14358DA3: _mesa_MapBuffer (bufferobj.c:3947)

  ==5224== 240 (192 direct, 48 indirect) bytes in 6 blocks are definitely lost in loss record 1,984 of 2,035
  ==5224==    at 0x484178A: malloc (vg_replace_malloc.c:380)
  ==5224==    by 0x484670B: realloc (vg_replace_malloc.c:1437)
  ==5224==    by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819)
  ==5224==    by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898)
  ==5224==    by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031)
  ==5224==    by 0x14FF72CC: iris_get_query_result (iris_query.c:631)
  ==5224==    by 0x14C4396A: tc_get_query_result (u_threaded_context.c:880)
  ==5224==    by 0x1458F4F7: get_query_result (st_cb_queryobj.c:273)
  ==5224==    by 0x1458F7EB: st_WaitQuery (st_cb_queryobj.c:352)
  ==5224==    by 0x144EFF66: get_query_object (queryobj.c:742)
  ==5224==    by 0x144F01AE: _mesa_GetQueryObjectuiv (queryobj.c:811)

And leak with syncobjs:

  ==13644== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1,846
  ==13644==    at 0x484186F: malloc (vg_replace_malloc.c:381)
  ==13644==    by 0x639789B: iris_create_syncobj (iris_fence.c:69)
  ==13644==    by 0x63B213A: iris_batch_reset (iris_batch.c:512)
  ==13644==    by 0x63B3637: _iris_batch_flush (iris_batch.c:1056)
  ==13644==    by 0x65EF2BC: iris_get_query_result (iris_query.c:631)
  ==13644==    by 0x623B970: tc_get_query_result (u_threaded_context.c:880)
  ==13644==    by 0x5B874F7: get_query_result (st_cb_queryobj.c:273)
  ==13644==    by 0x5B877EB: st_WaitQuery (st_cb_queryobj.c:352)
  ==13644==    by 0x5AE7F66: get_query_object (queryobj.c:742)
  ==13644==    by 0x5AE8150: _mesa_GetQueryObjectiv (queryobj.c:801)

Fixes: ce2e2296ab6 ("iris: Suballocate BO using the Gallium pb_slab mechanism")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14387>

2 years agoiris/ci: update piglit fails
Christian Gmeiner [Fri, 7 Jan 2022 06:46:10 +0000 (07:46 +0100)]
iris/ci: update piglit fails

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14442>