platform/upstream/mesa.git
Emma Anholt [Sat, 26 Mar 2022 04:06:35 +0000 (21:06 -0700)]
nouveau/nir: Don't try to emit OP_FMA pre-nvc0.

The TGSI backend avoids TGSI_OPCODE_FMA (and thus OP_FMA) pre-nvc0,
replacing it with TGSI_OPCODE_MAD in that case.

Noticed when looking at native-NIR stats and finding that load
optimization wasn't taking place on the unsupported opcode.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15543>

Gert Wollny [Wed, 13 Apr 2022 10:04:33 +0000 (12:04 +0200)]
virgl: Extend integer write out output fix to all non-move integers ops

The host virglrenderer can only handle moves to integer outputs.  All
ALU ops that create integer outputs are emitted with extra code to convert
to float for the temporaries, and this breaks the output write
handling.

Fixes:
  spec@arb_sample_shading@builtin-gl-sample-mask *
  spec@arb_sample_shading@builtin-gl-sample-mask-simple *

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15921>

Samuel Pitoiset [Wed, 13 Apr 2022 09:12:28 +0000 (11:12 +0200)]
radv: exclude PRIMITIVE_{COUNT,INDICES} from the per-vertex output mask

They should be excluded for the primitive and vertex output masks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15918>

Icecream95 [Wed, 19 Jan 2022 08:41:23 +0000 (21:41 +1300)]
clc: Use stringstream for printing spirv errors

The type of the spv_position_t components can differ across platforms,
so it's simpler to just let C++ overloading handle it.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15437>
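
As an illustration of the clc change above, here is a minimal sketch of formatting an error position with a stringstream. The `position` struct and the `format_spirv_error` helper are hypothetical stand-ins, not the actual clc code. The real `spv_position_t` comes from SPIRV-Tools, and its field types are exactly what may vary. With `operator<<` the compiler picks the right overload for whatever integer type the fields have, whereas a printf-style format specifier would have to match the type exactly.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for spv_position_t; the real field types may vary.
struct position {
   std::size_t line;
   std::size_t column;
   std::size_t index;
};

// Builds "<source>:<line>:<column>: <message>" without format specifiers,
// so the code compiles unchanged whatever integer type the fields have.
std::string format_spirv_error(const char *source, const position &pos,
                               const char *message)
{
   std::ostringstream out;
   out << source << ":" << pos.line << ":" << pos.column << ": " << message;
   return out.str();
}
```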

Dylan Baker [Wed, 13 Apr 2022 22:54:22 +0000 (15:54 -0700)]
docs: truncate new_features.txt

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15935>

Dylan Baker [Wed, 13 Apr 2022 22:52:31 +0000 (15:52 -0700)]
VERSION: bump to 22.2-devel for next cycle

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15935>

Rhys Perry [Fri, 12 Nov 2021 10:28:24 +0000 (10:28 +0000)]
radv: use load_shared2_amd/store_shared2_amd

fossil-db (Sienna Cichlid):
Totals from 376 (0.23% of 162293) affected shaders:
MaxWaves: 9620 -> 9596 (-0.25%); split: +0.08%, -0.33%
Instrs: 207533 -> 203901 (-1.75%); split: -1.76%, +0.01%
CodeSize: 1130904 -> 1106420 (-2.16%); split: -2.17%, +0.01%
VGPRs: 14016 -> 14120 (+0.74%); split: -0.34%, +1.08%
Latency: 2143281 -> 2132212 (-0.52%); split: -0.56%, +0.05%
InvThroughput: 389116 -> 387990 (-0.29%); split: -0.34%, +0.05%
VClause: 4483 -> 4485 (+0.04%); split: -0.11%, +0.16%
SClause: 5780 -> 5778 (-0.03%); split: -0.17%, +0.14%
Copies: 15319 -> 15331 (+0.08%); split: -0.53%, +0.61%
Branches: 5561 -> 5563 (+0.04%)
PreSGPRs: 11776 -> 11775 (-0.01%)
PreVGPRs: 11393 -> 11497 (+0.91%); split: -0.13%, +1.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Fri, 12 Nov 2021 10:45:46 +0000 (10:45 +0000)]
ac/llvm: implement load_shared2_amd/store_shared2_amd

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Fri, 12 Nov 2021 10:28:13 +0000 (10:28 +0000)]
aco: implement load_shared2_amd/store_shared2_amd

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Mon, 15 Nov 2021 16:40:53 +0000 (16:40 +0000)]
aco: handle read2st64/write2st64 in optimizer

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Wed, 10 Nov 2021 15:02:24 +0000 (15:02 +0000)]
aco: fix signedness of DS_instruction::offset0/1

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Fri, 12 Nov 2021 10:27:13 +0000 (10:27 +0000)]
nir/opt_load_store_vectorize: create load_shared2_amd/store_shared2_amd

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Thu, 11 Nov 2021 16:07:20 +0000 (16:07 +0000)]
nir/opt_load_store_vectorize: fix broken indentation

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Rhys Perry [Fri, 12 Nov 2021 10:26:30 +0000 (10:26 +0000)]
nir: add load_shared2_amd and store_shared2_amd

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778>

Konstantin Seurer [Wed, 13 Apr 2022 19:02:55 +0000 (21:02 +0200)]
radv: Fix barriers with cp dma

We need to wait for CP DMA if VK_PIPELINE_STAGE_2_ALL_TRANSFER_BIT or
VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT is set.

Closes: #5911
Fixes: 4b9bc4791b5 ("radv: only sync CP DMA for transfer operations or bottom pipe")

Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15933>

Daniel Schürmann [Tue, 15 Mar 2022 15:28:06 +0000 (16:28 +0100)]
aco: remove register hints entirely

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>

Daniel Schürmann [Tue, 15 Mar 2022 13:49:32 +0000 (14:49 +0100)]
aco: remove occurences of VCC hint

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>

Daniel Schürmann [Wed, 16 Mar 2022 09:56:26 +0000 (10:56 +0100)]
aco: make program->needs_vcc independent of VCC hints

Totals from 5 (0.00% of 135048) affected shaders: (GFX9)
SGPRs: 208 -> 160 (-23.08%)
CodeSize: 2700 -> 2692 (-0.30%)
Instrs: 533 -> 531 (-0.38%)
Latency: 41688 -> 41680 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>

Daniel Schürmann [Tue, 15 Mar 2022 12:06:48 +0000 (13:06 +0100)]
aco/ra: omit VCC affinity on VOPC_SDWA for GFX9+

VOPC_SDWA can also use arbitrary SGPR pairs on GFX9+.

Totals from 5607 (4.16% of 134913) affected shaders: (GFX10.3)
CodeSize: 42470760 -> 42452988 (-0.04%)
Instrs: 7943174 -> 7942883 (-0.00%)
Latency: 102887029 -> 102886305 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 20454456 -> 20454338 (-0.00%); split: -0.00%, +0.00%
Copies: 376818 -> 376865 (+0.01%); split: -0.00%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>

Daniel Schürmann [Tue, 15 Mar 2022 11:15:44 +0000 (12:15 +0100)]
aco/ra: create VCC-affinities during RA

instead of using register hints.

Totals from 88367 (65.50% of 134913) affected shaders: (GFX10.3)
CodeSize: 322492184 -> 322252912 (-0.07%); split: -0.08%, +0.01%
Instrs: 60615809 -> 60541260 (-0.12%); split: -0.12%, +0.00%
Latency: 557067980 -> 557009210 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 109676757 -> 109674804 (-0.00%); split: -0.00%, +0.00%
SClause: 1939703 -> 1939924 (+0.01%); split: -0.01%, +0.02%
Copies: 4557567 -> 4487530 (-1.54%); split: -1.54%, +0.00%
Branches: 1941123 -> 1937453 (-0.19%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>

Daniel Schürmann [Wed, 16 Mar 2022 09:59:52 +0000 (10:59 +0100)]
aco/ra: only use VCC if program->needs_vcc == true

A future commit will make VCC register assignment independent
from register hints. Up to GFX9, VCC can alternatively be used
as regular SGPR, so prevent overlap.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>

Lionel Landwerlin [Wed, 13 Apr 2022 10:06:43 +0000 (13:06 +0300)]
anv: stop using old entrypoint/struct/enum names for 1.3

v2: More replacements

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15920>

Emma Anholt [Tue, 12 Apr 2022 17:33:14 +0000 (10:33 -0700)]
nir_to_tgsi: Do the required cleanup for nir_opt_find_array_copies().

If we made a copy deref, then we need to do dead-write elimination for the
previous writes or we'll just emit the same copy deref again next time
around.  And, at the end of the opt loop, we need to lower copy derefs
because later passes (locals_to_regs, notably) depend on it.

Fixes infinite opt loop on fs-function-inout-array with virgl on NTT.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15899>

Jason Ekstrand [Tue, 12 Apr 2022 18:34:26 +0000 (13:34 -0500)]
iris: More gracefully fail in resource_from_user_memory

rusticl (and clover) would like to get a graceful failure here so they can
fall back to a shadow copy instead of us asserting.  We also start
rejecting arrayed surfaces because isl doesn't allow selecting a QPitch
yet.  Even if it did, QPitch is so restrictive, even for linear
surfaces, that it likely wouldn't be that useful.

Fixes: e81f3edf76b0 ("iris: Allow userptr on 1D and 2D images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15903>

Mike Blumenkrantz [Wed, 13 Apr 2022 15:36:53 +0000 (11:36 -0400)]
zink: set optimal tiling on swapchain images

this otherwise breaks kopper

fixes #6294

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15928>

Louis-Francis Ratté-Boulianne [Thu, 10 Feb 2022 16:03:57 +0000 (11:03 -0500)]
dzn: Add CI target for vulkan driver

A custom branch of `deqp` is used to have proper results when
crashing. See:

https://github.com/KhronosGroup/VK-GL-CTS/issues/311

A custom branch of `deqp-runner` with Windows support is also
used until the changes are merged into the main repository.

The `api`, `info`, `draw`, `query-pool` and `memory` test cases are
executed for now.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15742>

Louis-Francis Ratté-Boulianne [Mon, 4 Apr 2022 18:45:20 +0000 (14:45 -0400)]
dzn: Add a debug flag to enable D3D12 debug layer

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15742>

Karmjit Mahil [Wed, 23 Feb 2022 11:48:05 +0000 (11:48 +0000)]
pvr: Implement vkCreateQueryPool() and vkDestroyQueryPool().

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15880>

Karmjit Mahil [Wed, 23 Feb 2022 15:43:54 +0000 (15:43 +0000)]
pvr: Add pvrsrvkm visibility test heap.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15880>

Karmjit Mahil [Wed, 23 Feb 2022 13:51:55 +0000 (13:51 +0000)]
pvr: Add core count info and pvr_device_runtime_info.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15880>

Jason Ekstrand [Mon, 4 Apr 2022 23:37:26 +0000 (18:37 -0500)]
v3dv: Add emulated timeline semaphore support

This is trivial thanks to the emulated timelines provided in common
code.  "Real" timeline semaphores which can be shared across processes
will require kernel support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Mon, 4 Apr 2022 23:33:55 +0000 (18:33 -0500)]
v3dv: Use the core version property helpers

vulkaninfo is the same before and after.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Mon, 4 Apr 2022 23:24:30 +0000 (18:24 -0500)]
v3dv: Use the core version feature helpers

vulkaninfo is the same before and after.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Tue, 29 Mar 2022 22:52:32 +0000 (17:52 -0500)]
v3dv: Switch to the common submit framework

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Thu, 7 Apr 2022 15:17:26 +0000 (10:17 -0500)]
v3dv: Always wait on last_job_syncs if job->serialize

Even if we're the first job on some queue, there may be no wait
semaphores but we still need to ensure things happen in-order.  (See
the "Implicit Synchronization Guarantees" section of the Vulkan spec.)
The client can submit back-to-back command buffers with no semaphores
between them and it needs to act the same as if there were a semaphore.
If job->serialize is set because of a barrier or something, we still
need to synchronize across HW queues by waiting on last_job_syncs.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Mon, 4 Apr 2022 15:25:15 +0000 (10:25 -0500)]
v3dv: Add a condition variable for queries

In order to properly wait for a query to be complete, we need to first
wait for the end query job to flush through on the queue.  Since query
end is always handled on the CPU, we can do this with a condition
variable.  The 2s timeout is taken from ANV.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>
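
The wait described in the commit above can be sketched with a condition variable and a bounded timeout. This is a generic illustration, not the v3dv code: `query_state`, `mark_query_end_flushed` and `wait_query_end_flushed` are hypothetical names, and only the 2-second default comes from the commit message.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical per-query state; names are illustrative, not v3dv's.
struct query_state {
   std::mutex lock;
   std::condition_variable cond;
   bool end_flushed = false;
};

// Called by the CPU-side queue code once the end-query job has been processed.
void mark_query_end_flushed(query_state &q)
{
   {
      std::lock_guard<std::mutex> guard(q.lock);
      q.end_flushed = true;
   }
   q.cond.notify_all();
}

// Blocks until the end-query job has flushed or the timeout expires.
// Returns true if the job flushed, false on timeout.
bool wait_query_end_flushed(
   query_state &q,
   std::chrono::milliseconds timeout = std::chrono::seconds(2))
{
   std::unique_lock<std::mutex> guard(q.lock);
   return q.cond.wait_for(guard, timeout, [&q] { return q.end_flushed; });
}
```

The predicate form of `wait_for` rechecks the flag after every wakeup, so spurious wakeups don't cause an early return.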

Jason Ekstrand [Mon, 4 Apr 2022 15:22:45 +0000 (10:22 -0500)]
v3dv: Use util/os_time helpers

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Mon, 4 Apr 2022 14:50:26 +0000 (09:50 -0500)]
v3dv: Switch to the common device lost tracking

Vulkan requires that, once the device has been lost, you keep returning
VK_ERROR_DEVICE_LOST.  We've got tracking for this in common code; it
just needs to be wired up.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>
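
The requirement described in the commit above, that a lost device keeps reporting the loss, amounts to a sticky flag. The sketch below shows the idea only. The `result` enum, the `device` struct and both functions are hypothetical and do not reflect the common-code API that the commit wires up.

```cpp
#include <atomic>

// Hypothetical result codes and device type, for illustration only.
enum class result { success, device_lost };

struct device {
   std::atomic<bool> lost{false};
};

// Records the loss; nothing ever clears the flag again.
result set_device_lost(device &dev)
{
   dev.lost.store(true);
   return result::device_lost;
}

// Every entry point checks the sticky flag before doing any work.
result submit_work(device &dev, bool hw_failed)
{
   if (dev.lost.load())
      return result::device_lost;
   if (hw_failed)
      return set_device_lost(dev);
   return result::success;
}
```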

Jason Ekstrand [Mon, 4 Apr 2022 13:44:53 +0000 (08:44 -0500)]
v3dv: Destroy the device mutex on the teardown path

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Mon, 4 Apr 2022 13:40:30 +0000 (08:40 -0500)]
v3dv: Don't use pthread functions on c11 mutexes

This only works because c11/threads.h is typedeffing the c11 stuff to
pthreads.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Fri, 1 Apr 2022 21:10:12 +0000 (16:10 -0500)]
v3dv: Put indirect compute CSD jobs in the job list

Instead of having the CPU job execute the CSD job, put both jobs on the
list, CPU job first.  The CPU job modifies the GPU job, which then gets
kicked off next.  This gives the queue code more visibility into what
types of jobs are actually in the list.  In particular, if an indirect
compute job is the last job in a batch buffer, it currently appears as if
the batch ends with CPU work, which isn't true because it kicks off GPU
work.  With this change, the last job on the list is a GPU job, which
better matches reality.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Tue, 29 Mar 2022 22:55:27 +0000 (17:55 -0500)]
v3dv: Stop directly setting vk_device::alloc

vk_device_init() will do this.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Jason Ekstrand [Thu, 31 Mar 2022 20:29:30 +0000 (15:29 -0500)]
vulkan/drm_syncobj: Implement WAIT_PENDING with a sync_file lookup

The v3dv kernel driver doesn't support timelines yet but we want
threaded submit and that requires WAIT_PENDING.  Fortunately, it should
never sit in this loop for long in practice.  The primary use-case is
sorting out dependencies and these checks will always trivially succeed
for non-shared semaphores because v3dv only has a single queue.

Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>

Rhys Perry [Thu, 2 Dec 2021 14:38:57 +0000 (14:38 +0000)]
aco: remove old global access intrinsics

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Thu, 2 Dec 2021 14:35:15 +0000 (14:35 +0000)]
radv,ac/nir: lower global access to _amd global access intrinsics

fossil-db (Sienna Cichlid):
Totals from 400 (0.30% of 134621) affected shaders:
VGPRs: 18696 -> 18688 (-0.04%)
CodeSize: 2031348 -> 1946640 (-4.17%)
Instrs: 374703 -> 360226 (-3.86%)
Latency: 4200727 -> 4108628 (-2.19%); split: -2.20%, +0.01%
InvThroughput: 1059935 -> 1029441 (-2.88%); split: -2.88%, +0.00%
VClause: 5777 -> 5771 (-0.10%)
SClause: 11890 -> 10891 (-8.40%); split: -8.57%, +0.17%
Copies: 34035 -> 33259 (-2.28%); split: -2.98%, +0.70%
Branches: 11108 -> 11100 (-0.07%); split: -0.08%, +0.01%
PreSGPRs: 15999 -> 15942 (-0.36%); split: -0.44%, +0.08%
PreVGPRs: 16994 -> 16970 (-0.14%)

fossil-db (Polaris10):
Totals from 400 (0.29% of 135668) affected shaders:
SGPRs: 23799 -> 22919 (-3.70%); split: -4.30%, +0.61%
VGPRs: 18480 -> 18472 (-0.04%)
CodeSize: 2090316 -> 2041592 (-2.33%)
Instrs: 395461 -> 385747 (-2.46%); split: -2.46%, +0.00%
Latency: 5045768 -> 5020196 (-0.51%); split: -0.53%, +0.02%
InvThroughput: 2694320 -> 2689886 (-0.16%); split: -0.23%, +0.07%
VClause: 5982 -> 5968 (-0.23%)
SClause: 12064 -> 10823 (-10.29%); split: -10.33%, +0.04%
Copies: 48233 -> 48322 (+0.18%); split: -0.47%, +0.65%
PreSGPRs: 16409 -> 16358 (-0.31%); split: -0.39%, +0.08%

fossil-db (Pitcairn):
Totals from 400 (0.29% of 135668) affected shaders:
SGPRs: 22431 -> 22215 (-0.96%); split: -2.60%, +1.64%
VGPRs: 18776 -> 18560 (-1.15%); split: -1.21%, +0.06%
CodeSize: 2104440 -> 2017708 (-4.12%)
MaxWaves: 2363 -> 2367 (+0.17%)
Instrs: 413099 -> 397446 (-3.79%)
Latency: 5507707 -> 5450251 (-1.04%); split: -1.12%, +0.07%
InvThroughput: 2838867 -> 2786903 (-1.83%); split: -1.83%, +0.00%
VClause: 10334 -> 10097 (-2.29%)
SClause: 12346 -> 11005 (-10.86%); split: -10.89%, +0.02%
Copies: 54034 -> 52065 (-3.64%); split: -3.99%, +0.35%
PreSGPRs: 17916 -> 17857 (-0.33%); split: -0.40%, +0.07%
PreVGPRs: 16917 -> 16893 (-0.14%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Fri, 3 Dec 2021 13:48:28 +0000 (13:48 +0000)]
aco: increase global_load_params.max_const_offset_plus_one

The callback now supports this. This shouldn't have any effect yet except
on GFX6 with 12 byte loads.

fossil-db (Pitcairn):
Totals from 246 (0.18% of 135668) affected shaders:
VGPRs: 14684 -> 14768 (+0.57%); split: -0.44%, +1.01%
CodeSize: 1765792 -> 1738040 (-1.57%)
Instrs: 344605 -> 340055 (-1.32%)
Latency: 4892904 -> 4861942 (-0.63%)
InvThroughput: 2479599 -> 2446070 (-1.35%)
VClause: 8782 -> 8735 (-0.54%)
SClause: 9854 -> 9853 (-0.01%)
Copies: 47327 -> 45401 (-4.07%); split: -4.08%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Thu, 2 Dec 2021 14:34:52 +0000 (14:34 +0000)]
aco: implement _amd global access intrinsics

fossil-db (Sienna Cichlid):
Totals from 7 (0.01% of 134621) affected shaders:
VGPRs: 760 -> 776 (+2.11%)
CodeSize: 222000 -> 222044 (+0.02%); split: -0.01%, +0.03%
Instrs: 40959 -> 40987 (+0.07%); split: -0.01%, +0.08%
Latency: 874811 -> 886609 (+1.35%); split: -0.00%, +1.35%
InvThroughput: 437405 -> 443303 (+1.35%); split: -0.00%, +1.35%
VClause: 1242 -> 1240 (-0.16%)
SClause: 1050 -> 1049 (-0.10%); split: -0.19%, +0.10%
Copies: 4953 -> 4973 (+0.40%); split: -0.04%, +0.44%
Branches: 1947 -> 1957 (+0.51%); split: -0.05%, +0.56%
PreVGPRs: 741 -> 747 (+0.81%)

fossil-db changes seem to be noise.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Fri, 3 Dec 2021 16:07:24 +0000 (16:07 +0000)]
ac/llvm: implement _amd global access intrinsics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Thu, 2 Dec 2021 14:33:17 +0000 (14:33 +0000)]
nir: add _amd global access intrinsics

These are the same as the normal ones, but they take an unsigned 32-bit
offset in BASE and another unsigned 32-bit offset in the last source.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>
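
A small sketch of the address arithmetic the commit above implies. It assumes that the effective address is the 64-bit address plus both offsets, each zero-extended; the commit message only states that the two offsets are unsigned 32-bit values. The function name `global_amd_address` is hypothetical.

```cpp
#include <cstdint>

// Assumed semantics: both 32-bit offsets are zero-extended and added to the
// 64-bit address. `base` models the constant BASE index, `offset` the last
// source of the intrinsic.
std::uint64_t global_amd_address(std::uint64_t addr, std::uint32_t base,
                                 std::uint32_t offset)
{
   return addr + static_cast<std::uint64_t>(base) +
          static_cast<std::uint64_t>(offset);
}
```

Under this assumption, widening each offset before the addition means that the sum of the two offsets can exceed 32 bits without wrapping.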

Rhys Perry [Thu, 2 Dec 2021 10:57:35 +0000 (10:57 +0000)]
aco: don't expand smem/mubuf global loads

For example, dwordx3->dwordx4 or ubyte3->dwordx2.

Unlike buffer loads, global loads don't have the bounds checking that
makes this safe.

The alignment checks are added to global_load_callback() in case
byte_align_loads=false, align=1 and bytes_needed=3. Without them, the
callback will create a dword load.

fossil-db (Sienna Cichlid):
Totals from 267 (0.20% of 134621) affected shaders:
CodeSize: 1603352 -> 1606568 (+0.20%)
Instrs: 294946 -> 295482 (+0.18%); split: -0.00%, +0.18%
Latency: 2997003 -> 2997052 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 526645 -> 526659 (+0.00%)
SClause: 9179 -> 9185 (+0.07%); split: -0.02%, +0.09%
Copies: 25363 -> 25375 (+0.05%); split: -0.08%, +0.13%
Branches: 8298 -> 8299 (+0.01%)

fossil-db (Polaris10):
Totals from 267 (0.20% of 135668) affected shaders:
CodeSize: 1636672 -> 1638756 (+0.13%); split: -0.00%, +0.13%
Instrs: 308484 -> 308733 (+0.08%); split: -0.01%, +0.09%
Latency: 3446045 -> 3446904 (+0.02%); split: -0.00%, +0.03%
InvThroughput: 1206722 -> 1206828 (+0.01%); split: -0.00%, +0.01%
SClause: 9308 -> 9311 (+0.03%); split: -0.08%, +0.11%
Copies: 36933 -> 36921 (-0.03%); split: -0.08%, +0.05%

fossil-db (Pitcairn):
Totals from 275 (0.20% of 135668) affected shaders:
SGPRs: 17616 -> 17520 (-0.54%); split: -0.64%, +0.09%
VGPRs: 15428 -> 15540 (+0.73%); split: -0.23%, +0.96%
CodeSize: 1885792 -> 1929120 (+2.30%); split: -0.00%, +2.30%
MaxWaves: 1284 -> 1285 (+0.08%)
Instrs: 368963 -> 376095 (+1.93%); split: -0.00%, +1.94%
Latency: 5122922 -> 5168398 (+0.89%); split: -0.01%, +0.90%
InvThroughput: 2562866 -> 2604279 (+1.62%)
VClause: 9268 -> 9296 (+0.30%); split: -0.13%, +0.43%
SClause: 10702 -> 10705 (+0.03%); split: -0.05%, +0.07%
Copies: 48620 -> 50629 (+4.13%); split: -0.08%, +4.21%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Tue, 9 Mar 2021 16:09:15 +0000 (16:09 +0000)]
aco: use saddr for global access with sgpr address

fossil-db (Sienna Cichlid):
Totals from 38 (0.03% of 134621) affected shaders:
CodeSize: 237196 -> 237060 (-0.06%); split: -0.09%, +0.03%
Instrs: 43895 -> 43894 (-0.00%); split: -0.02%, +0.01%
Latency: 914633 -> 916263 (+0.18%); split: -0.01%, +0.19%
InvThroughput: 468215 -> 468971 (+0.16%); split: -0.02%, +0.18%
SClause: 1239 -> 1242 (+0.24%)
PreSGPRs: 997 -> 1003 (+0.60%)
PreVGPRs: 936 -> 923 (-1.39%); split: -1.50%, +0.11%

Regression seems to be RA noise, creating a waitcnt.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Tue, 9 Mar 2021 16:40:23 +0000 (16:40 +0000)]
aco: use vcc for 64-bit vgpr addition

fossil-db (Sienna Cichlid):
Totals from 229 (0.17% of 134621) affected shaders:
CodeSize: 1520192 -> 1517644 (-0.17%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Rhys Perry [Thu, 15 Apr 2021 13:22:11 +0000 (14:22 +0100)]
radv: don't require robust vectorization for nir_var_mem_global

Robust vectorization prevents loads at a near-maximum offset from being
vectorized with loads at offset 0. Global loads can't read from offset
0 (NULL) anyway, so this isn't necessary.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

Jason Ekstrand [Tue, 12 Apr 2022 16:45:41 +0000 (11:45 -0500)]
iris: Don't leak scratch BOs

Fixes: 4d219b0eb3d6 ("iris: implement scratch space!")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15897>

Timur Kristóf [Wed, 13 Apr 2022 12:54:30 +0000 (14:54 +0200)]
radv: Only use TES vertex offset 2 for triangles and quads.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15837>

Timur Kristóf [Sat, 9 Apr 2022 20:00:10 +0000 (22:00 +0200)]
radv: Fix gs_vgpr_comp_cnt for NGG VS without passthrough mode.

When not in passthrough mode, the NGG shader needs to calculate the
primitive export value from the input primitive's vertex indices.

So, GS vertex offset 2 is needed when NGG has triangles
and isn't in passthrough mode.

Fixes: 7ad69e2f7ee10c0e7afc302b9324e7a320424dcb "radv: stop loading invocation ID for NGG vertex shaders"
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15837>

Timur Kristóf [Wed, 6 Apr 2022 16:53:20 +0000 (18:53 +0200)]
nir: Handle out of bounds access in nir_vectorize_tess_levels.

Replace out of bounds loads with undef.
Then, delete instructions with out of bounds access.

Fixes: f5adf27fb926a330a13af716f0a03da1a224656d "nir,radv: add and use nir_vectorize_tess_levels()"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6264
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15775>

Timur Kristóf [Wed, 13 Apr 2022 12:11:18 +0000 (14:11 +0200)]
aco: Fix VOP2 instruction format in visit_tex.

There was a v_or_b32 that accidentally used SOP2.
It should use VOP2.

Issue found by looking at a gfxreconstruct trace posted by a user
in this bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5838

Cc: mesa-stable
Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 "aco: Initial commit of independent AMD compiler"

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15923>

2 years agoiris: set a default EDSC flag
Rohan Garg [Tue, 12 Apr 2022 19:07:13 +0000 (21:07 +0200)]
iris: set a default EDSC flag

anv sets the default EDSC flag; do the same for iris.

Fixes: 5ae278da18b6 ("iris: use vtbl to avoid multiple symbols, fix state base address")

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15905>

2 years agointel/fs: add a note on possible optimization of root node address
Lionel Landwerlin [Wed, 13 Apr 2022 06:39:31 +0000 (09:39 +0300)]
intel/fs: add a note on possible optimization of root node address

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>

2 years agointel/fs: fix metadata preserve on trace_ray intrinsic
Lionel Landwerlin [Tue, 12 Apr 2022 18:59:58 +0000 (21:59 +0300)]
intel/fs: fix metadata preserve on trace_ray intrinsic

c78be5da300 ("intel/fs: lower ray query intrinsics") introduced a
helper function using nir_(push|pop)_if which invalidated dominance &
block_index for the replacement of nir_intrinsic_rt_trace_ray.

We can still keep dominance/block_index metadata for the lowering of
nir_intrinsic_rt_execute_callable though.

This change uses two different lowering functions, each with correct
metadata preservation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c78be5da300 ("intel/fs: lower ray query intrinsics")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>

2 years agozink: avoid creating ssbo variable types with multiple runtime arrays
Mike Blumenkrantz [Tue, 12 Apr 2022 13:47:03 +0000 (09:47 -0400)]
zink: avoid creating ssbo variable types with multiple runtime arrays

this is illegal

affects:
KHR-GL46.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-packed-matC

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15894>

2 years agozink: use the calculated last struct member idx for ssbo size in ntv
Mike Blumenkrantz [Tue, 12 Apr 2022 13:46:32 +0000 (09:46 -0400)]
zink: use the calculated last struct member idx for ssbo size in ntv

this may or may not be 1

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15894>

2 years agovirgl: Fix relocating the re-writing the transformation code
Gert Wollny [Wed, 13 Apr 2022 09:59:20 +0000 (11:59 +0200)]
virgl: Fix relocating the re-writing the transformation code

The transformation must come before the code emission.

Fixes: 6a264e7024a29eb7

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15919>

2 years agoiris: Add VF_CACHE_INVALIDATE to IRIS_DOMAIN_OTHER_WRITE flush bits
Kenneth Graunke [Tue, 8 Mar 2022 07:37:58 +0000 (23:37 -0800)]
iris: Add VF_CACHE_INVALIDATE to IRIS_DOMAIN_OTHER_WRITE flush bits

Suggested by Francisco Jerez.

Although including VF invalidation in the flush bits is strange, we
believe this is the only way to guarantee that stream output has
finished.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Demote DC flush to HDC flush in cache tracker
Kenneth Graunke [Wed, 25 Aug 2021 01:09:53 +0000 (18:09 -0700)]
iris: Demote DC flush to HDC flush in cache tracker

FLUSH_HDC is sufficient to flush things out to L3, so we'd rather
use that where possible.  It's also emulated via DATA_CACHE_FLUSH
on platforms where it isn't supported, so we can use it unconditionally.

We still use DATA_CACHE_FLUSH for invalidating the data cache, and to
flush the DC-tagged cachelines in L3 to be globally-observable.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Emit flushes for push constant source buffers
Kenneth Graunke [Fri, 4 Mar 2022 11:47:06 +0000 (03:47 -0800)]
iris: Emit flushes for push constant source buffers

Push constant loading is not coherent with L3 according to the document
that describes the hardware change for the vertex buffer L3 Bypass
Disable field.

If we've updated a push constant buffer with say, a blorp_buffer_copy,
we may need to flush both the render cache and the tile cache.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Use cache-tracker for draw count flushing
Kenneth Graunke [Tue, 8 Mar 2022 08:07:16 +0000 (00:07 -0800)]
iris: Use cache-tracker for draw count flushing

We should be using the cache tracker for this.  We can consider
this access IRIS_DOMAIN_OTHER_READ now that it's the catch-all
non-L3-coherent read-only access domain.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Add pre-draw flushing for stream output targets
Kenneth Graunke [Tue, 8 Mar 2022 06:01:08 +0000 (22:01 -0800)]
iris: Add pre-draw flushing for stream output targets

When stream output is active, we need to let the cache tracker know
about any SO buffers, which we access via IRIS_DOMAIN_OTHER_WRITE.

In particular, we may have written to those buffers via another
mechanism, such as BLORP buffer copies.  In that case, previous writes
happened via IRIS_DOMAIN_RENDER_WRITE, in which case we'd need to flush
both the render cache and the tile cache to make that data globally-
observable before we begin writing via streamout, which is incoherent
with the earlier mechanism.

Fixes misrendering in Ryujinx.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6085
Fixes: d8cb76211c5 ("iris: Fix MOCS for buffer copies")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Extend the cache tracker to handle L3 flushes and invalidates
Kenneth Graunke [Mon, 2 Aug 2021 21:50:19 +0000 (14:50 -0700)]
iris: Extend the cache tracker to handle L3 flushes and invalidates

Most clients are L3-coherent these days.  However, there are some
notable exceptions, such as push constants, stream output, and command
streamer memory reads and writes.

With the advent of the tile cache, flushing the render or depth caches
alone is no longer sufficient for memory to become globally-observable.
For those, we need to flush the tile cache as well.  However, we'd like
to avoid that for L3-coherent clients, as it shouldn't be necessary,
and is expensive.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Add a separate PIPE_CONTROL_L3_READ_ONLY_CACHE_INVALIDATE bit
Kenneth Graunke [Sat, 9 Apr 2022 09:19:15 +0000 (02:19 -0700)]
iris: Add a separate PIPE_CONTROL_L3_READ_ONLY_CACHE_INVALIDATE bit

This will let us use it without performing a VF cache invalidation,
should we want to do that.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Add an iris_is_domain_l3_coherent helper.
Kenneth Graunke [Mon, 2 Aug 2021 19:47:10 +0000 (12:47 -0700)]
iris: Add an iris_is_domain_l3_coherent helper.

The render, depth, sampler, and data (HDC) caches are all coherent
with L3.  We consider OTHER_READ and OTHER_WRITE to be non-coherent,
as they're kitchen-sink domains which include non-L3-clients.

Starting with Tigerlake, the VF cache is coherent with L3 (because we
set the L3BypassDisable bit in the vertex/index buffer packets).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>
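
The coherency rule described in the iris_is_domain_l3_coherent commit
above can be sketched roughly as follows. This is a minimal illustration,
not the driver's code: the enum, its member names, and the function name
are hypothetical stand-ins, and gfx_ver >= 12 is assumed to correspond to
"Tigerlake and later".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the driver's real enum. */
enum cache_domain {
    DOMAIN_RENDER_WRITE,
    DOMAIN_DEPTH_WRITE,
    DOMAIN_DATA_WRITE,
    DOMAIN_SAMPLER_READ,
    DOMAIN_VF_READ,
    DOMAIN_OTHER_READ,
    DOMAIN_OTHER_WRITE,
};

/* Returns true when accesses through the given domain are assumed to be
 * coherent with L3, following the rule stated in the commit message. */
static bool domain_is_l3_coherent(int gfx_ver, enum cache_domain domain)
{
    switch (domain) {
    case DOMAIN_RENDER_WRITE:
    case DOMAIN_DEPTH_WRITE:
    case DOMAIN_DATA_WRITE:
    case DOMAIN_SAMPLER_READ:
        /* Render, depth, data (HDC) and sampler caches are L3 clients. */
        return true;
    case DOMAIN_VF_READ:
        /* Assumption: gfx_ver 12 is Tigerlake, where the VF cache
         * becomes coherent because L3 bypass is disabled. */
        return gfx_ver >= 12;
    case DOMAIN_OTHER_READ:
    case DOMAIN_OTHER_WRITE:
        /* Kitchen-sink domains may include non-L3 clients. */
        return false;
    }
    return false;
}
```

A cache tracker could use such a predicate to decide whether a tile
cache flush is needed in addition to a render or depth cache flush.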

2 years agoiris: Fix UBO cache tracking for the !indirect_ubos_use_sampler case
Kenneth Graunke [Fri, 4 Mar 2022 11:09:36 +0000 (03:09 -0800)]
iris: Fix UBO cache tracking for the !indirect_ubos_use_sampler case

On Tigerlake, we use the data cache for reading indirect UBOs instead
of the sampler.  But we still use the constant cache for direct UBO
access, so unfortunately we may access it through two different domains.

To work around this, we add a new domain for pull constants (UBOs),
which will be either constant+texture or constant+data.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Split out an IRIS_DOMAIN_SAMPLER_READ domain from OTHER_READ
Kenneth Graunke [Fri, 4 Mar 2022 11:27:05 +0000 (03:27 -0800)]
iris: Split out an IRIS_DOMAIN_SAMPLER_READ domain from OTHER_READ

The bulk of IRIS_DOMAIN_OTHER_READ domain usage was the 3D sampler, but
there were also a few oddball cases like command streamer reads, blitter
access, and so on.  The sampler is definitely L3 coherent, but some of
the more esoteric reads may not be, so I'd like to separate them, so
that OTHER_READ can become a non-L3-coherent kitchen-sink domain.

The sampler cases only need TEXTURE_CACHE_INVALIDATE, and can skip the
CONSTANT_CACHE_INVALIDATE we had on IRIS_DOMAIN_OTHER_READ.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Use IRIS_DOMAIN_DEPTH_WRITE for read only depth/stencil.
Kenneth Graunke [Mon, 4 Apr 2022 18:04:40 +0000 (11:04 -0700)]
iris: Use IRIS_DOMAIN_DEPTH_WRITE for read only depth/stencil.

We were using IRIS_DOMAIN_OTHER_READ for read-only depth/stencil access
in an attempt to avoid unnecessary flushing; IRIS_DOMAIN_DEPTH_WRITE
could indicate read-write access.

However, IRIS_DOMAIN_OTHER_READ is clearly the wrong domain.  Depth and
stencil data is read via the depth cache, while IRIS_DOMAIN_OTHER_READ
currently corresponds to the sampler cache and constant cache together
(although this will change in future patches).

It's unclear whether this hack was useful.  For now, just drop it and
use the correct depth cache domain, even if it's marked as read-write.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agovirgl: Apply integer op fix only for ALU ops and clear modifiers
Gert Wollny [Tue, 12 Apr 2022 15:53:12 +0000 (17:53 +0200)]
virgl: Apply integer op fix only for ALU ops and clear modifiers

For texture fetches and buffer loads the fix is not needed,
and the override creates faulty TGSI.

In addition remove all modifiers from the src in the additional mov
instruction.

Fixes: d1c7a7b1317c518e160cc6d37245de22b2bfa60d
  virgl: Add an extra mov for int outputs from constant and immediate inputs

v2: Move workaround after the use of
    virgl_tgsi_rewrite_src_for_input_temp (Emma)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15896>

2 years agor600: Assign shader type when creating a new CS state
Gert Wollny [Mon, 4 Apr 2022 12:37:42 +0000 (14:37 +0200)]
r600: Assign shader type when creating a new CS state

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15898>

2 years agost/mesa: Transcode ASTC to BC7 (BPTC) where possible
Kenneth Graunke [Tue, 15 Feb 2022 20:47:49 +0000 (12:47 -0800)]
st/mesa: Transcode ASTC to BC7 (BPTC) where possible

This patch adds support for transcoding ASTC to BC7 (BPTC) and prefers
it over BC3 (DXT5) when hardware supports that format.

BC7 is a much newer format (~2009 vs. ~1999) and offers higher quality
than the older BC3 format.  Furthermore, our encoder seems to be faster.

Tapani put together a small benchmark for transcoding a 1024x1024 ASTC
texture, and switching from BC3 to BC7 improves performance of that
microbenchmark by 25% on my Tigerlake NUC (with hardware ASTC disabled
so we can test this path).  Presumably, this isn't fundamental to the
formats, but rather reflects the speed of our in-tree compressors.

So, we should use BC7 where possible.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15875>

2 years agost/mesa: Make transcode_astc also check for non-SRGB format support
Kenneth Graunke [Tue, 15 Feb 2022 20:44:30 +0000 (12:44 -0800)]
st/mesa: Make transcode_astc also check for non-SRGB format support

This is probably unnecessary in that all drivers which support the sRGB
format likely also support the non-sRGB format.  But we may as well
check both the formats we use, for documentation if nothing else.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15875>
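
The format preference described in the two st/mesa commits above can be
sketched as follows. This is only an illustration under stated
assumptions: the struct, enum, and function names are hypothetical, and
the state tracker's real query path is not shown. The sketch prefers BC7
over BC3 and requires both the sRGB and non-sRGB variants of a format to
be supported before choosing it.

```c
#include <stdbool.h>

/* Hypothetical result type for illustration. */
enum transcode_target { TRANSCODE_NONE, TRANSCODE_BC3, TRANSCODE_BC7 };

/* Hypothetical summary of what the driver reports as supported. */
struct format_support {
    bool bc7_unorm, bc7_srgb;
    bool bc3_unorm, bc3_srgb;
};

/* Pick the transcode target for ASTC: BC7 when both of its variants are
 * available, otherwise BC3 under the same condition, otherwise none. */
static enum transcode_target
pick_astc_transcode_target(const struct format_support *s)
{
    if (s->bc7_unorm && s->bc7_srgb)
        return TRANSCODE_BC7;
    if (s->bc3_unorm && s->bc3_srgb)
        return TRANSCODE_BC3;
    return TRANSCODE_NONE;
}
```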

2 years agoci: Move most stuff out of root .gitlab-ci.yml
Tomeu Vizoso [Fri, 8 Apr 2022 11:29:04 +0000 (13:29 +0200)]
ci: Move most stuff out of root .gitlab-ci.yml

This file was getting a bit hard to navigate. Split container, build and
test jobs to their own files.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Allow local installations to build additional stuff into the rootfs
Tomeu Vizoso [Tue, 29 Mar 2022 11:48:49 +0000 (13:48 +0200)]
ci: Allow local installations to build additional stuff into the rootfs

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Add env var to add packages to install in debian/arm_build image
Tomeu Vizoso [Tue, 29 Mar 2022 11:47:52 +0000 (13:47 +0200)]
ci: Add env var to add packages to install in debian/arm_build image

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Add env var to add packages to install in rootfs
Tomeu Vizoso [Tue, 29 Mar 2022 11:47:26 +0000 (13:47 +0200)]
ci: Add env var to add packages to install in rootfs

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Allow specifying a different kernel in LAVA jobs
Tomeu Vizoso [Thu, 17 Mar 2022 15:35:30 +0000 (16:35 +0100)]
ci: Allow specifying a different kernel in LAVA jobs

This makes it possible to use a kernel different from the one built
along with the rootfs.

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Use CI_PROJECT_NAME instead of hardcoding 'mesa'
Tomeu Vizoso [Thu, 17 Mar 2022 14:09:18 +0000 (15:09 +0100)]
ci: Use CI_PROJECT_NAME instead of hardcoding 'mesa'

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agonir/lower_shader_calls: name resume shaders
Lionel Landwerlin [Tue, 12 Apr 2022 13:04:52 +0000 (16:04 +0300)]
nir/lower_shader_calls: name resume shaders

Helpful when lost in a sea of NIR :)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15887>

2 years agoci: Disable Google's lab
Tomeu Vizoso [Wed, 13 Apr 2022 06:11:05 +0000 (08:11 +0200)]
ci: Disable Google's lab

The runner is down and pipelines are getting stuck.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15909>

2 years agozink: rework choose_pdev to (finally) be competent
Mike Blumenkrantz [Mon, 11 Apr 2022 15:04:45 +0000 (11:04 -0400)]
zink: rework choose_pdev to (finally) be competent

now zink will init using a priority system if multiple devices are available

multiple devices will ONLY be available if:
* the user does not specify VK_ICD_FILENAMES as they should
* the user does not specify LIBGL_ALWAYS_SOFTWARE
* multiple drivers exist

I've prioritized the virtualized gpu here with the assumption that if
such a thing is detected, the environment is most likely virtualized

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>
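
A priority-based device choice like the one described in the choose_pdev
commit above might look roughly like this. It is a sketch, not zink's
implementation: the enum and function names are hypothetical, and only
"virtual GPU first" comes from the commit message. The relative order of
the remaining device types is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical device classes for illustration. */
enum dev_type { DEV_OTHER, DEV_CPU, DEV_INTEGRATED, DEV_DISCRETE, DEV_VIRTUAL };

/* Higher value means higher priority. Only the top slot for virtual GPUs
 * is taken from the commit message; the rest is an assumed ordering. */
static int dev_priority(enum dev_type t)
{
    switch (t) {
    case DEV_VIRTUAL:    return 4;
    case DEV_DISCRETE:   return 3;
    case DEV_INTEGRATED: return 2;
    case DEV_CPU:        return 1;
    case DEV_OTHER:      return 0;
    }
    return 0;
}

/* Returns the index of the highest-priority device, or -1 if the list is
 * empty. Ties keep the earliest entry. */
static int choose_device(const enum dev_type *types, size_t count)
{
    int best = -1;
    int best_prio = -1;
    for (size_t i = 0; i < count; i++) {
        int prio = dev_priority(types[i]);
        if (prio > best_prio) {
            best = (int)i;
            best_prio = prio;
        }
    }
    return best;
}
```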

2 years agoaux/trace: clean up some zink+lavapipe tracing awfulness
Mike Blumenkrantz [Mon, 11 Apr 2022 14:41:22 +0000 (10:41 -0400)]
aux/trace: clean up some zink+lavapipe tracing awfulness

now that it's easier to determine whether zink is being used (mostly),
this whole thing can be simplified

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agozink: ZINK_USE_LAVAPIPE -> LIBGL_ALWAYS_SOFTWARE
Mike Blumenkrantz [Mon, 11 Apr 2022 14:39:05 +0000 (10:39 -0400)]
zink: ZINK_USE_LAVAPIPE -> LIBGL_ALWAYS_SOFTWARE

this is a documented variable, so reuse it

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agoegl: don't make LIBGL_ALWAYS_SOFTWARE and MESA_LOADER_DRIVER_OVERRIDE=zink exclusive
Mike Blumenkrantz [Mon, 11 Apr 2022 14:31:40 +0000 (10:31 -0400)]
egl: don't make LIBGL_ALWAYS_SOFTWARE and MESA_LOADER_DRIVER_OVERRIDE=zink exclusive

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agoac/gpu_info: disallow displayable DCC for Navi12 and Navi14
Indrajit Kumar Das [Fri, 8 Apr 2022 05:21:54 +0000 (10:51 +0530)]
ac/gpu_info: disallow displayable DCC for Navi12 and Navi14

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15813>

2 years agointel/nir: Lower 8 and 16-bit bitwise unops
Jason Ekstrand [Fri, 8 Apr 2022 20:17:33 +0000 (15:17 -0500)]
intel/nir: Lower 8 and 16-bit bitwise unops

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>

2 years agointel/fs: Implement 16-bit [ui]mul_high
Jason Ekstrand [Fri, 8 Apr 2022 20:17:12 +0000 (15:17 -0500)]
intel/fs: Implement 16-bit [ui]mul_high

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>

2 years agonir/lower_int64: Fix [iu]mul_high handling
Jason Ekstrand [Fri, 8 Apr 2022 20:06:11 +0000 (15:06 -0500)]
nir/lower_int64: Fix [iu]mul_high handling

e551040c602d, which added a new mechanism for 64-bit imul that is more
efficient on BDW and later Intel hardware, also introduced a bug where we
weren't properly walking both X and Y.  No idea how testing didn't find
this.

Fixes: e551040c602d ("nir/glsl: Add another way of doing lower_imul64 for gen8+")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6306
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>
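
To illustrate why both operands have to be walked, here is a sketch of an
unsigned 64-bit mul_high built from 32-bit limbs. It is plain schoolbook
multiplication written for illustration, not the NIR lowering code; the
function name umul_high64 is hypothetical. Every limb of X is multiplied
by every limb of Y, and skipping any of those partial products gives a
wrong high half.

```c
#include <stdint.h>

/* High 64 bits of the 128-bit product x * y, using 32-bit limbs. */
static uint64_t umul_high64(uint64_t x, uint64_t y)
{
    uint32_t xl[2] = { (uint32_t)x, (uint32_t)(x >> 32) };
    uint32_t yl[2] = { (uint32_t)y, (uint32_t)(y >> 32) };
    uint32_t res[4] = { 0, 0, 0, 0 };

    /* Walk both operands: each limb of x times each limb of y. */
    for (int i = 0; i < 2; i++) {
        uint64_t carry = 0;
        for (int j = 0; j < 2; j++) {
            /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) == 2^64 - 1. */
            uint64_t t = (uint64_t)xl[i] * yl[j] + res[i + j] + carry;
            res[i + j] = (uint32_t)t;
            carry = t >> 32;
        }
        res[i + 2] = (uint32_t)carry;
    }

    /* The two upper limbs form the high half of the product. */
    return ((uint64_t)res[3] << 32) | res[2];
}
```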

2 years agokopper: print better error message if loader not detected
Mike Blumenkrantz [Mon, 11 Apr 2022 12:34:36 +0000 (08:34 -0400)]
kopper: print better error message if loader not detected

silently failing on release builds is annoying

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15851>

2 years agolima: fix vector const src referenced multiple times
Erico Nunes [Sun, 3 Apr 2022 16:28:19 +0000 (18:28 +0200)]
lima: fix vector const src referenced multiple times

It can happen that a single vector constant is referenced multiple times
by the same node, with different swizzles.
This needs to be taken into account by checking and updating the
swizzles for all the srcs of a target node when inserting the const
node into the same instruction.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15726>

2 years agofeatures: mark off ARB_seamless_cubemap_per_texture for zink
Mike Blumenkrantz [Tue, 12 Apr 2022 18:42:56 +0000 (14:42 -0400)]
features: mark off ARB_seamless_cubemap_per_texture for zink

forgot to do this with the MR

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15902>

2 years agontt: translate nir_intrinsic_shader_clock
Gert Wollny [Tue, 12 Apr 2022 14:03:59 +0000 (16:03 +0200)]
ntt: translate nir_intrinsic_shader_clock

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15889>

2 years agozink: finish up radv piglit baseline updates
Mike Blumenkrantz [Tue, 12 Apr 2022 18:00:35 +0000 (14:00 -0400)]
zink: finish up radv piglit baseline updates

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15900>