Mike Blumenkrantz [Tue, 17 Nov 2020 23:17:10 +0000 (18:17 -0500)]
zink: add ntv util function for getting image type
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8504>
Mike Blumenkrantz [Tue, 4 Aug 2020 20:36:27 +0000 (16:36 -0400)]
zink: rename zink_context::*image_views -> sampler_views
this is wrongly named and conflicts with things that are actually
image_views (assuming we're using gallium terminology in order to match
the rest of the ecosystem)
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8504>
Rhys Perry [Tue, 25 Aug 2020 16:26:35 +0000 (17:26 +0100)]
radv: sink load_ssbo
fossil-db (GFX10.3):
Totals from 11485 (8.24% of 139391) affected shaders:
SGPRs: 1032456 -> 1033696 (+0.12%); split: -0.69%, +0.81%
VGPRs: 815332 -> 807448 (-0.97%); split: -1.04%, +0.07%
SpillSGPRs: 18014 -> 13497 (-25.07%); split: -28.28%, +3.20%
SpillVGPRs: 1821 -> 1749 (-3.95%)
CodeSize:
101194172 ->
101235028 (+0.04%); split: -0.06%, +0.10%
Scratch: 198656 -> 178176 (-10.31%)
MaxWaves: 86703 -> 87219 (+0.60%); split: +0.67%, -0.07%
Instrs:
19224250 ->
19238562 (+0.07%); split: -0.05%, +0.13%
Cycles:
1486045388 ->
1487481292 (+0.10%); split: -0.03%, +0.13%
VMEM: 2040484 -> 2127647 (+4.27%); split: +6.64%, -2.37%
SMEM: 724060 -> 674966 (-6.78%); split: +1.22%, -8.00%
VClause: 312375 -> 314735 (+0.76%); split: -0.26%, +1.02%
SClause: 702274 -> 711991 (+1.38%); split: -0.77%, +2.15%
Copies: 1413440 -> 1422782 (+0.66%); split: -0.45%, +1.11%
Branches: 658696 -> 658838 (+0.02%); split: -0.12%, +0.14%
PreSGPRs: 884666 -> 879736 (-0.56%); split: -1.30%, +0.74%
PreVGPRs: 777374 -> 769947 (-0.96%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
Rhys Perry [Thu, 21 Jan 2021 17:01:07 +0000 (17:01 +0000)]
nir/sink,nir/move: sink/move reorderable load_ssbo
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
Rhys Perry [Tue, 25 Aug 2020 16:23:36 +0000 (17:23 +0100)]
radv: use nir_opt_access
fossil-db (GFX10.3):
Totals from 3231 (2.32% of 139391) affected shaders:
SGPRs: 168654 -> 167454 (-0.71%); split: -0.72%, +0.00%
VGPRs: 152352 -> 152416 (+0.04%)
CodeSize:
13872836 ->
13806376 (-0.48%); split: -0.50%, +0.02%
MaxWaves: 36640 -> 36634 (-0.02%)
Instrs: 2639959 -> 2626852 (-0.50%); split: -0.52%, +0.03%
Cycles:
77706000 ->
77496792 (-0.27%); split: -0.28%, +0.01%
VMEM: 809496 -> 790847 (-2.30%); split: +2.06%, -4.36%
SMEM: 267843 -> 253187 (-5.47%); split: +0.76%, -6.23%
VClause: 61353 -> 60426 (-1.51%); split: -1.86%, +0.35%
SClause: 95409 -> 92355 (-3.20%); split: -3.24%, +0.04%
Copies: 194951 -> 196702 (+0.90%); split: -0.53%, +1.43%
Branches: 84320 -> 84331 (+0.01%); split: -0.00%, +0.02%
PreSGPRs: 110162 -> 110203 (+0.04%); split: -0.04%, +0.07%
PreVGPRs: 127021 -> 127037 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
Erik Faye-Lund [Thu, 21 Jan 2021 14:16:49 +0000 (15:16 +0100)]
docs: turn non-code into comment
If we want syntax-highlighting to actually work here, we should make
sure the code actually parses.
This fixes a warning during docs build.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8243>
Erik Faye-Lund [Wed, 6 Jan 2021 12:23:35 +0000 (13:23 +0100)]
docs: fix broken link
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8243>
Erik Faye-Lund [Mon, 28 Dec 2020 12:37:13 +0000 (13:37 +0100)]
docs: fix sphinx-warnings due to lacking escaping
There's a few more cases that needs proper quoting for Sphinx. Asterisks
and ticks at the start of words, as well as underscores at the end of
symbols, even when they have trailing escaped characters.
We should really find a way to robustly escape these things when
generating them.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8243>
Alejandro Piñeiro [Tue, 12 Jan 2021 11:09:46 +0000 (12:09 +0100)]
v3dv/descriptor: assert CrateDescriptorPool receives valid count values
Although I assume that this should be caught by the validation layers,
recently while triaging the following tests:
dEQP-VK.ycbcr.query.*r8g8b8a8_unorm*
We found that they were setting a descriptorCount of zero, because it
was not handling correctly the differences between Vulkan 1.0 and
Vulkan 1.1.
So let's just assert, just in case it happens again, as that would
make the bugfixing far easier.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8614>
Arcady Goldmints-Orlov [Tue, 19 Jan 2021 00:16:12 +0000 (18:16 -0600)]
v3dv: Fix uninitialized variable warnings
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8570>
Iago Toral Quiroga [Thu, 21 Jan 2021 10:39:58 +0000 (11:39 +0100)]
v3dv: fix disabling Early Z for the whole frame
The documentation states that if we disable Early Z for the whole
frame in the RCL Tile Rendering Mode packet, then we should not
emit any draw calls with it enabled (which we can do by enabling
it in the CFG_BITS packet).
Since we emit our RCL after recording our draw calls in the BCL
and we were not considering there if any condition for global disable
would be met, it was possible that we end up with an incorrect
configuration when we decide for a global disable in the RCL, which
can cause rendering artifacts. This can be easily observed by simply
forcing the RCL bit to disable early Z in applications that are known
to enable it in CFG_BITS (such as the UE Shooter demo for example).
With this change we keep track of this scenario when we record
draw calls in the BCL and if decide that we need to disable EZ for
the entire job, we make sure we never enable it for any draw calls
in the frame.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Iago Toral Quiroga [Wed, 20 Jan 2021 09:31:01 +0000 (10:31 +0100)]
v3dv: enable early Z/S clears
This is an optimization that should make Z/S clears faster. To enable
this we can't have any Z/S loads or stores in the job. Also, it seems
that enabling early Z/S clearing is independent of whether early Z/S
testing is enabled.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Iago Toral Quiroga [Wed, 20 Jan 2021 08:51:33 +0000 (09:51 +0100)]
v3dv: do not emit full tile buffers clears to handle Z/S clears
There was a misunderstanding regarding the scope of some hardware bugs
that led us to think that:
1. The Clear Tile Buffer Z/S bit was broken
2. The Clear Tile Buffer RTs bit would also clear Z/S.
1) is not really true, what happened was that some other bugs for which
we need workarounds anyway would have that effect. 2) was only true
for V3D 4.1, so it doesn't affect v3dv.
This change makes proper use of the Z/S bit instead of falling back to
clearing all tile buffers every time we have a Z/S clear. This also
allows us to do color clears on the tile store (which is faster) rather
than falling back to the clear all RTs bit every time we have a Z/S clear.
v2: rewrite the original comment about the hardwarebug description to
include recent discussions with Broadcom instead of keeping it as
is and amending it with an update note.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Iago Toral Quiroga [Wed, 20 Jan 2021 08:46:53 +0000 (09:46 +0100)]
v3dv: refactor checks for subpass attachment stores
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Iago Toral Quiroga [Wed, 20 Jan 2021 08:36:06 +0000 (09:36 +0100)]
v3dv: refactor checks for subpass attachment loading
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Iago Toral Quiroga [Tue, 19 Jan 2021 09:01:42 +0000 (10:01 +0100)]
v3dv: refactor checks for subpass attachment clears
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
Rhys Perry [Fri, 18 Dec 2020 13:19:50 +0000 (13:19 +0000)]
radv,aco: use deref_buffer_array_length
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3993
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163>
Rhys Perry [Mon, 7 Dec 2020 14:44:15 +0000 (14:44 +0000)]
nir/lower_io: fix array_length lowering if buffer is smaller than offset
Matches SPIR-V -> NIR implementation of OpArrayLength.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163>
Daniel Schürmann [Thu, 21 Jan 2021 11:13:39 +0000 (12:13 +0100)]
radv: don't vectorize shift operations
Currently, these cannot be vectorized as in NIR
shift operands are 32bit while for 16bit-vectorization
they need to be 16bit.
No fossildb changes.
Fixes:
fcd2ef23e5f1d50008166168e772815c0213e37c ('radv: vectorize 16bit instructions')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8612>
Erik Faye-Lund [Wed, 20 Jan 2021 09:19:32 +0000 (10:19 +0100)]
zink: fix vertex-stride wrangling
Because Gallium and Vulkan disagree on what kind of state strides is, we
need to wrangle this state a bit, and up until now, we've been simply
fixing this up while binding the vertex-buffers.
But this isn't robust, because the vertex element state might be bound
after the vertex-buffer state was bound. We also need to take
binding-map into account, which we're currently missing as well.
Instead, w need to deal with this at a place where we know what's being
used for both of these. So let's do this during draw instead.
Ideally, we'd also do some dirty-tracking to know if this is needed or
not, but I believe Mike has some patches in this areas lined up, so it
might be easier to wait for those.
Fixes:
8d46e35d16e ("zink: introduce opengl over vulkan")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3661
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4125
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8588>
Daniel Schürmann [Fri, 15 Jan 2021 08:23:04 +0000 (09:23 +0100)]
aco/optimizer: convert extract_vector with index 0 into parallelcopies if possible
Totals from 273 (0.20% of 139391) affected shaders (Navi10):
VGPRs: 11600 -> 11792 (+1.66%)
CodeSize: 1389304 -> 1383152 (-0.44%); split: -0.53%, +0.08%
MaxWaves: 3848 -> 3752 (-2.49%)
Instrs: 240228 -> 239478 (-0.31%); split: -0.37%, +0.06%
Cycles:
20637708 ->
20580024 (-0.28%); split: -0.46%, +0.18%
VMEM: 39164 -> 38831 (-0.85%); split: +0.06%, -0.91%
SMEM: 21743 -> 22204 (+2.12%)
VClause: 4787 -> 4783 (-0.08%)
Copies: 39057 -> 38308 (-1.92%); split: -2.28%, +0.37%
Branches: 6556 -> 6557 (+0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Thu, 31 Dec 2020 11:04:11 +0000 (11:04 +0000)]
aco/optimizer: expand subdword vectors with SGPRs on all generations
No fossildb changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Thu, 31 Dec 2020 11:01:08 +0000 (11:01 +0000)]
aco: propagate temporaries into PSEUDO instructions if it can take it
This patch relaxes copy-propagation for PSEUDO instructions with
subdword Operands / Definitions:
general:
- only propagate VGPR temps if the Definition is VGPR (or on p_as_uniform)
parallelcopy/create_vector/phis:
- size has to be the same
extract_vector/split_vector:
- propagate SGPR temps on GFX9+ or if the Definitions are not subdword
- split_vector: size must not increase
Totals from 282 (0.20% of 140985) affected shaders (Polaris10):
VGPRs: 14520 -> 14408 (-0.77%)
CodeSize: 2693956 -> 2694316 (+0.01%); split: -0.20%, +0.21%
Instrs: 512874 -> 512864 (-0.00%); split: -0.16%, +0.16%
Cycles:
26338860 ->
26320652 (-0.07%); split: -0.36%, +0.29%
VMEM: 49460 -> 49634 (+0.35%); split: +0.47%, -0.12%
SMEM: 10035 -> 10036 (+0.01%)
VClause: 7675 -> 7674 (-0.01%)
Copies: 66012 -> 65943 (-0.10%); split: -1.31%, +1.20%
Branches: 17265 -> 17281 (+0.09%); split: -0.10%, +0.19%
PreVGPRs: 12211 -> 12124 (-0.71%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Thu, 31 Dec 2020 10:46:37 +0000 (10:46 +0000)]
aco/validate: relax subdword restrictions
This affects constants/SGPRs on GFX6-8 and
the operand regClass of SDWA instructions.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Thu, 31 Dec 2020 10:43:43 +0000 (10:43 +0000)]
aco/validate: ensure that Operand and Definition size matches for parallelcopies
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Wed, 30 Dec 2020 16:38:08 +0000 (16:38 +0000)]
aco/validate: validate that p_create_vector operands are aligned unless they are subdword operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Wed, 30 Dec 2020 15:06:04 +0000 (15:06 +0000)]
aco: generalize subdword constant copy lowering
This will allow to propagate and emit sub-register constants
on all hardware generations.
Also fixes GFX8 constant emission to not use SDWA.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Mon, 28 Dec 2020 19:15:17 +0000 (19:15 +0000)]
aco/optimizer: don't propagate subdword temps of different size
It could happen that due to inconsistent copy-propagation
v1 = p_parallelcopy v2b
instructions were left after optimization on GFX8.
Cc: 20.3
Cc: 21.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Daniel Schürmann [Fri, 15 Jan 2021 17:54:01 +0000 (18:54 +0100)]
aco/optimizer: don't copy-prop logical phis
This is dangerous w.r.t. LCSSA-phis.
Totals from 746 (0.54% of 139391) affected shaders (Navi10):
CodeSize: 8592160 -> 8568156 (-0.28%); split: -0.30%, +0.02%
MaxWaves: 5172 -> 5171 (-0.02%); split: +0.02%, -0.04%
Instrs: 1653949 -> 1648489 (-0.33%); split: -0.36%, +0.03%
Cycles:
49474892 ->
49329224 (-0.29%); split: -0.33%, +0.03%
VMEM: 137574 -> 137421 (-0.11%); split: +0.18%, -0.29%
SMEM: 42391 -> 42439 (+0.11%); split: +0.12%, -0.01%
VClause: 26946 -> 26943 (-0.01%)
Copies: 130902 -> 126176 (-3.61%); split: -4.05%, +0.43%
Branches: 54891 -> 54556 (-0.61%); split: -0.64%, +0.03%
PreVGPRs: 53941 -> 53939 (-0.00%)
This has a slight effect on RA due to affinity changes.
Cc: 20.3
Cc: 21.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
Samuel Pitoiset [Wed, 20 Jan 2021 16:29:14 +0000 (17:29 +0100)]
radv: fix a sync issue with geometry shader primitives query on GFX10+
When NGG is used, the hw can't know the number of geometry shader
primitives. To fix that, the NGG geometry shader accumulates itself
the number of primitives by using an atomic operation directly to GDS.
Then, begin/query copy the start/stop values from GDS to the
query pool buffer using a PS_DONE event. This was actually wrong
because PS_DONE is completely asynchronous to everything and executed
when the preceding draws finish pixel shaders.
Fix this by using a COPY_DATA packet which is synced with CP. This
fixes random failures on Sienna Cichlid with
dEQP-VK.query_pool.statistics_query.*.geometry_shader_primitives.*.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8590>
Tapani Pälli [Tue, 19 Jan 2021 12:51:11 +0000 (14:51 +0200)]
mesa: add GL_SR8_EXT, GL_SRG8_EXT for color/srgb format queries
This makes glGetInternalformati64v queries work correctly.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4120
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8575>
Vinson Lee [Sun, 17 Jan 2021 01:53:58 +0000 (17:53 -0800)]
nv50/ir: Add InsertConstraintsPass constructor.
Fix defect reported by Coverity Scan.
Uninitialized pointer field (UNINIT_CTOR)
member_not_init_in_gen_ctor: The compiler-generated constructor for this
class does not initialize targ.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8541>
Icecream95 [Tue, 19 Jan 2021 21:46:05 +0000 (10:46 +1300)]
pan/decode: Free mapped memory objects on BO unreference
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8583>
Marek Olšák [Wed, 23 Dec 2020 21:51:01 +0000 (16:51 -0500)]
mesa: simplify terminating display list loops
Remove the done variable and return directly from switches.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 21:42:39 +0000 (16:42 -0500)]
mesa: simplify handling OPCODE_CONTINUE for display lists
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 19:06:02 +0000 (14:06 -0500)]
mesa: optimize glCallLists by using loops inside a switch
This changes glCallLists from using a switch inside a loop to using loops
inside a switch.
Also fix the comments that didn't tell WHY something was important.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 12:16:51 +0000 (07:16 -0500)]
mesa: remove redundant glRect functions for display lists
vbo_save implements this better by putting the vertices into a VBO.
The SET functions removed here were overriding it to use the slower
version that uploads the vertices on every call.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 12:10:06 +0000 (07:10 -0500)]
mesa: remove _mesa_initialize_exec_dispatch from draw.c by autogenerating it
The glapi scripts are fully capable of generating this correctly for all
GL APIs if we don't set exec="dynamic".
exec="dynamic" should only be used for glBegin, glEnd, and all functions
that are legal inside Begin/End.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 22:19:14 +0000 (17:19 -0500)]
glthread: add display list support to fix state tracking with display lists
The existing display list code is reused to call display lists from the app
thread. The util_queue_fence_wait call waits for the last call that modifies
display lists (such as glEndList and glDeleteLists), which ensures that
accessing display lists from a non-mesa thread is thread safe because
the wait guarantees that display lists are immutable during the asynchronous
display list execution.
Display lists are executed just like normal display lists except that they
call glthread functions instead of the default GL dispatch. Many calls in
display lists are skipped because glthread only tracks a few states.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 22:18:08 +0000 (17:18 -0500)]
mesa: add _mesa_get_list helper
glthread will use it.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 17:30:13 +0000 (12:30 -0500)]
glthread: remove if (COMPAT) conditions from functions that are GL-compat-only
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 17:29:24 +0000 (12:29 -0500)]
glthread: rename inside_dlist to ListMode for future use
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Sat, 3 Oct 2020 03:14:13 +0000 (23:14 -0400)]
glthread: implement glGetIntegerv for states that glthread tracks
for viewperf
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Sat, 3 Oct 2020 03:12:52 +0000 (23:12 -0400)]
glthread: track all matrix stack depths
This just tracks matrix stack depths in MatrixStackDepth and everything
else here is needed to make it correct.
Matrix stack depths will be returned by glGetIntegerv without synchronizing.
Display lists will be handled by a separate commit.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 11:21:57 +0000 (06:21 -0500)]
glthread: add specialized versions of unmarshal_Draw funcs without user buffers
This decreases CPU time spent in the unmarshal_DrawElements function
from 0.44% to 0.26% if no user buffers are present.
Instead of converting all calls to either unmarshal_DrawArraysInstanced-
BaseInstance or unmarshal_DrawElementsInstancedBaseVertexBaseInstance,
which both also conditionally bind uploaded user buffers if needed and
call one of:
- DrawArraysInstancedBaseInstance
- DrawElementsInstancedBaseVertexBaseInstance
- DrawRangeElementsBaseVertex,
add 3 unmarshal draw variants that are specialized version of the above that
never bind uploaded user buffers. This removes all conditionals from
the unmarshal functions for the common case when there are no user buffers.
Unused function enums are used for the various draw variants. For example,
CMD_DrawArrays is used to dispatch DrawArraysInstacedBaseInstance without
user buffers, while CMD_DrawArraysInstacedBaseInstance is used to dispatch
the same with user buffers. glthread isn't flexible enough to do it cleanly.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Mon, 4 Jan 2021 01:23:07 +0000 (20:23 -0500)]
glthread: don't sync with NV_half_float vertex attrib functions
Set the pointer sizes, so that glthread knows how much data needs to be
copied.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Marek Olšák [Wed, 23 Dec 2020 07:42:04 +0000 (02:42 -0500)]
glthread: remove marshal="draw" because it doesn't do much
It only checked whether the pointer was indices or indirect, but we can
just determine the same thing manually for each draw call.
Simplify it as follows:
- if a call contains a pointer without count and it's either indirect or
indices, set marshal="async". The marshal_sync attribute still determines
when it syncs.
- if a call doesn't contain any pointer without count, remove the marshal
attribute
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8297>
Icecream95 [Thu, 17 Dec 2020 11:27:14 +0000 (00:27 +1300)]
panfrost: Fix the tile size assertion
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7086>
Icecream95 [Mon, 9 Nov 2020 10:13:21 +0000 (23:13 +1300)]
panfrost: Transaction elimination support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7086>
Icecream95 [Wed, 20 Jan 2021 06:42:12 +0000 (19:42 +1300)]
panfrost: Add a debug flag to disable checksumming
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7086>
Icecream95 [Sat, 12 Dec 2020 05:41:16 +0000 (18:41 +1300)]
panfrost: Only checksum resources when it makes sense to
To simplify tracking the checksum validity, only checksum 2D
resources.
The GPU cannot do checksumming when there is too much data to be
written out, so limit the number of bytes per pixel to an architecture
specific value.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7086>
Icecream95 [Mon, 9 Nov 2020 10:13:43 +0000 (23:13 +1300)]
panfrost: Add a function to determine if a resource is 2D
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7086>
Rob Clark [Wed, 20 Jan 2021 19:04:31 +0000 (11:04 -0800)]
radeonsi: Use util_writes_stencil() helper
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8598>
Rob Clark [Wed, 20 Jan 2021 19:04:09 +0000 (11:04 -0800)]
r300: Use util_writes_depth_stencil() helper
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8598>
Rob Clark [Wed, 20 Jan 2021 18:44:20 +0000 (10:44 -0800)]
freedreno/a6xx: Don't early-z if there are stencil writes
Fixes a similar stencil related misrendering in a couple "RV AppStudios"
titles.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8598>
Rob Clark [Wed, 20 Jan 2021 18:27:55 +0000 (10:27 -0800)]
gallium/util: Add helpers to determine if z/s is written
For drivers that must control various 'early-z'-like state, it is easy
enough to get this wrong. So add helpers so we don't have to duplicate
the logic in each driver.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8598>
Marek Olšák [Mon, 11 Jan 2021 11:35:43 +0000 (06:35 -0500)]
radeonsi: allow instance_count == 0 on chips that handle it correctly
Let's remove this overhead.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Tue, 12 Jan 2021 03:57:32 +0000 (22:57 -0500)]
radeonsi: don't validate inlinable uniforms at draw time
Let's trust the state tracker that it sets inlinable uniforms only
when shaders can use them.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sun, 10 Jan 2021 02:33:35 +0000 (21:33 -0500)]
radeonsi: move variables closer to their use in most draw state functions
for lower register pressure, though I haven't measured this.
si_draw_vbo will be handled in a future commit.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sun, 10 Jan 2021 00:48:54 +0000 (19:48 -0500)]
radeonsi: clear dirty_atoms and dirty_states only if we entered the emit loop
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sat, 9 Jan 2021 21:24:42 +0000 (16:24 -0500)]
radeonsi: enable the GS tri strip adj workaround with primitive_restart
If a primitive restart index occurs after an even number of triangles,
the workaround works.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sat, 9 Jan 2021 12:31:49 +0000 (07:31 -0500)]
radeonsi: evaluate si_get_vs in si_draw_vbo at compile time
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sat, 9 Jan 2021 12:26:39 +0000 (07:26 -0500)]
radeonsi: inline the last use of si_get_vs_state
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sat, 9 Jan 2021 11:48:54 +0000 (06:48 -0500)]
radeonsi: evaluate sh_base in si_emit_vs_state at compile time
This computes the value at compile time because si_get_user_data_base is
always inlined.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sat, 9 Jan 2021 11:42:55 +0000 (06:42 -0500)]
radeonsi: add si_get_user_data_base selecting user data registers
This will be used in templated si_draw_vbo in place of sh_base.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sat, 9 Jan 2021 10:48:28 +0000 (05:48 -0500)]
radeonsi: don't set context_roll for non-gfx9 in templated functions
It's not needed by other chips.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Sun, 27 Dec 2020 02:34:15 +0000 (21:34 -0500)]
radeonsi: don't pass pipe_draw_info into si_emit_draw_registers
Only two fields are used. It's probably better this way.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Marek Olšák [Mon, 21 Dec 2020 06:34:38 +0000 (01:34 -0500)]
radeonsi: unify uploaders on APUs too
const_uploader and stream_uploader point to the same uploader.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8600>
Matt Turner [Wed, 20 Jan 2021 19:40:42 +0000 (14:40 -0500)]
docs/freedreno: Fix a few typos
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8599>
Jesse Natalie [Tue, 19 Jan 2021 19:23:39 +0000 (11:23 -0800)]
nir: Work around MSVC x86 internal compiler error
Fixes:
1fd8b466 ("nir,spirv: add sparse image loads")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4108
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8581>
Lionel Landwerlin [Wed, 13 Jan 2021 14:09:18 +0000 (16:09 +0200)]
anv: Fix stencil layout in render passes
We incorrectly utilize the stencil layouts structures even if we
should stick to the depth_stencil ones if the layout includes stencil.
v2: Don't forget stencil only layout (Nanley)
Simplify callers of new helper functions (Nanley)
v3: Store VK_IMAGE_LAYOUT_UNDEFINED when no stencil is available (Nanley)
Use a switch statement (Nanley)
v4: Consider all layouts but depth only to be potential stencil layouts (Lionel)
v5: Refactor helper in vk_image_layout_depth_only() and discard
VkAttachmentDescriptionStencilLayoutKHR in
VkAttachmentDescription2KHR if format is not depth/stencil.
v5: s/LAYOUT_COLOR_ATTACHMENT_OPTIMAL/LAYOUT_DEPTH_ATTACHMENT_OPTIMAL/ (Nanley)
v6: Fix overly harsh assert()
Fixes:
c1c346f1667375 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4084
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8475>
Eric Anholt [Wed, 9 Dec 2020 23:49:16 +0000 (15:49 -0800)]
nir_to_tgsi: Store directly to TGSI outputs when possible.
Saves emitting a MOV at the end of the program to store the output.
softpipe glmark2 -b buffer +9.73451% +/- 3.17924% (n=6)
softpipe glmark2 -b build +5.57621% +/- 1.35074% (n=9)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8023>
Eric Anholt [Wed, 20 Jan 2021 17:49:50 +0000 (09:49 -0800)]
ci/freedreno: Fix xfail setup for sampler3d_float_vertex.
I mised the ",Fail" status, so it just wasn't interpreted. I should make
the runner throw an error instead.
Fixes:
22bf4831b8ae ("ci/freedreno: Fix up the xfail/flake handling of a3xx texture functions.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8596>
Rhys Perry [Mon, 30 Nov 2020 15:29:31 +0000 (15:29 +0000)]
radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2
Fixes several dEQP-VK.robustness.robustness2.* tests on GFX8. Generations
other than GFX8 don't fail the tests because bounds-checking is done using
the index (making it per-vertex).
fossil-db (Polaris):
Totals from 1387 (0.99% of 140385) affected shaders:
(no statistics affected)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes:
03a0d39366d ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7834>
Samuel Pitoiset [Tue, 19 Jan 2021 16:54:13 +0000 (17:54 +0100)]
radv: remove redundant check in depth_view_can_fast_clear()
We check below if HTILE is in compressed state, so checking if
the image has HTILE is useless because radv_layout_is_htile_compressed()
will return FALSE if no HTILE.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
Samuel Pitoiset [Tue, 19 Jan 2021 16:49:06 +0000 (17:49 +0100)]
radv: remove unnecessary radv_image::tc_compatible_htile
Use the surface flags directly instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
Samuel Pitoiset [Tue, 19 Jan 2021 16:48:18 +0000 (17:48 +0100)]
radv: remove redundant check in radv_process_depth_stencil()
This is already checked in radv_handle_depth_image_transition().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8579>
Rohan Garg [Mon, 18 Jan 2021 13:21:28 +0000 (14:21 +0100)]
virgl: Cache depth and stencil buffers
Expand the list of bind flags to cache depth and stencil
buffers.
Extensive testing shows the following performance improvements:
Game | % difference in FPS
Plague Inc | 7
Portal 2 | 21
Overcooked 2 | 1.2
Hollow Knight | -1.1
Civilization V | 3.8
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8560>
Rhys Perry [Wed, 9 Dec 2020 16:32:58 +0000 (16:32 +0000)]
aco: add affinity for non-sequential MIMG operands
fossil-db (GFX10.3):
Totals from 42008 (30.14% of 139391) affected shaders:
VGPRs: 2139116 -> 2147696 (+0.40%); split: -0.06%, +0.46%
CodeSize:
199109120 ->
198637852 (-0.24%); split: -0.24%, +0.01%
Instrs:
37713901 ->
37714574 (+0.00%); split: -0.02%, +0.03%
Cycles:
1621911328 ->
1621634168 (-0.02%); split: -0.02%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Thu, 14 Jan 2021 20:00:12 +0000 (20:00 +0000)]
aco: only require texture coordinates to be in WQM if NSA is used
From comment in emit_mimg():
We don't need the bias, sample index, compare value or offset to be
computed in WQM but if the p_create_vector copies the coordinates, then it
needs to be in WQM.
fossil-db (GFX10.3):
Totals from 1778 (1.28% of 139391) affected shaders:
SGPRs: 105080 -> 105072 (-0.01%); split: -0.02%, +0.01%
VGPRs: 96800 -> 96776 (-0.02%); split: -0.07%, +0.05%
CodeSize:
10001120 ->
10001384 (+0.00%); split: -0.04%, +0.04%
MaxWaves: 18164 -> 18163 (-0.01%)
Instrs: 1883750 -> 1883598 (-0.01%); split: -0.06%, +0.05%
Cycles:
34800176 ->
34767840 (-0.09%); split: -0.10%, +0.01%
We don't have a p_create_vector if we use NSA.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Thu, 14 Jan 2021 19:58:13 +0000 (19:58 +0000)]
aco: use non-sequential addressing
fossil-db (GFX10.3):
Totals from 70493 (50.57% of 139391) affected shaders:
SGPRs: 4232624 -> 4231808 (-0.02%); split: -0.09%, +0.07%
VGPRs: 2831772 -> 2764740 (-2.37%); split: -2.53%, +0.17%
CodeSize:
225584412 ->
225048740 (-0.24%); split: -0.44%, +0.21%
MaxWaves: 875319 -> 878837 (+0.40%); split: +0.44%, -0.04%
Instrs:
43157803 ->
42496421 (-1.53%); split: -1.54%, +0.01%
Cycles:
1656380132 ->
1641532056 (-0.90%); split: -0.94%, +0.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Thu, 14 Jan 2021 17:46:50 +0000 (17:46 +0000)]
aco: move VADDR to the end of the operand list
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Thu, 14 Jan 2021 17:33:43 +0000 (17:33 +0000)]
aco: add emit_mimg() helper
Some fossil-db noise from slightly different order of instructions.
fossil-db (GFX10.3):
Totals from 73 (0.05% of 139391) affected shaders:
SGPRs: 3424 -> 3440 (+0.47%)
CodeSize: 199076 -> 199064 (-0.01%); split: -0.01%, +0.00%
Instrs: 37303 -> 37300 (-0.01%); split: -0.01%, +0.00%
Cycles: 786328 -> 786316 (-0.00%); split: -0.00%, +0.00%
VMEM: 19448 -> 19454 (+0.03%); split: +0.04%, -0.01%
SMEM: 5241 -> 5305 (+1.22%); split: +1.70%, -0.48%
SClause: 1282 -> 1281 (-0.08%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Thu, 14 Jan 2021 17:35:16 +0000 (17:35 +0000)]
aco: have emit_wqm() take Builder instead of isel_context
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Tue, 19 Jan 2021 11:37:52 +0000 (11:37 +0000)]
aco: fix num_waves on GFX10+
There are half the SIMDs per CU and physical_vgprs should be 512 instead
of 256.
fossil-db (GFX10.3):
Totals from 3622 (2.60% of 139391) affected shaders:
VGPRs: 298192 -> 289732 (-2.84%); split: -3.43%, +0.59%
CodeSize:
29443432 ->
29458388 (+0.05%); split: -0.00%, +0.06%
MaxWaves: 21703 -> 23395 (+7.80%); split: +7.84%, -0.05%
Instrs: 5677920 -> 5681438 (+0.06%); split: -0.01%, +0.07%
Cycles:
280715524 ->
280895676 (+0.06%); split: -0.00%, +0.07%
VMEM: 981142 -> 981894 (+0.08%); split: +0.18%, -0.10%
SMEM: 243315 -> 243454 (+0.06%); split: +0.07%, -0.02%
VClause: 88991 -> 89767 (+0.87%); split: -0.02%, +0.89%
SClause: 200660 -> 200659 (-0.00%); split: -0.00%, +0.00%
Copies: 430729 -> 434160 (+0.80%); split: -0.07%, +0.86%
Branches: 158004 -> 158021 (+0.01%); split: -0.01%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Rhys Perry [Tue, 19 Jan 2021 10:59:47 +0000 (10:59 +0000)]
radv: fix max_waves estimation on GFX10.3
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
Mike Blumenkrantz [Thu, 31 Dec 2020 17:47:58 +0000 (12:47 -0500)]
zink: enable WSI-faking for RADV too
temporary until we get real WSI support
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8284>
Mike Blumenkrantz [Wed, 20 Jan 2021 14:24:43 +0000 (09:24 -0500)]
zink: add VK_KHR_driver_properties
yet another extension that breaks naming conventions for structs/enums,
even if it does so in a very sensible way
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8284>
Andrii Simiklit [Mon, 18 Jan 2021 12:42:05 +0000 (14:42 +0200)]
st/mesa: fix pbo upload/download for arrays of textures with only 1 layer
Having only one layer we can put 0 as third texture coordinate
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4115
Fixes:
36097fc7 ("st/pbo: fix pbo uploads without PIPE_CAP_TGSI_VS_LAYER_VIEWPORT and skip gs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8576>
Samuel Pitoiset [Wed, 20 Jan 2021 09:09:33 +0000 (10:09 +0100)]
ci: exclude one CTS test that timeout most of the time for RADV CI
dEQP is too slow.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8587>
Danylo Piliaiev [Tue, 19 Jan 2021 16:29:03 +0000 (18:29 +0200)]
turnip: don't emit tess consts if they are not used
If tess consts aren't used they don't get included in constlen,
and we risk overrunning consts of the next stage.
Fixes:
dEQP-VK.tessellation.invariance.outer_edge_index_independence.quads_fractional_even_spacing_ccw
dEQP-VK.tessellation.invariance.outer_triangle_set.quads_fractional_odd_spacing
dEQP-VK.tessellation.invariance.primitive_set.isolines_fractional_odd_spacing_ccw
dEQP-VK.tessellation.invariance.primitive_set.quads_fractional_odd_spacing_cw
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4117
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8578>
Alejandro Piñeiro [Wed, 20 Jan 2021 09:50:07 +0000 (10:50 +0100)]
v3d/compiler: enable lower_add_sat NIR option
We are enabling this option for the Vulkan driver, so it makes sense
to enable it for the OpenGL one.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8582>
Alejandro Piñeiro [Tue, 19 Jan 2021 12:48:46 +0000 (13:48 +0100)]
v3dv/pipeline: enable lower_add_sat NIR option
We don't support them by hw, so we would need to get them
lowered. This fix some crashes while using renderdoc with UE4 shooter
demo traces.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8582>
Gert Wollny [Mon, 18 Jan 2021 12:03:57 +0000 (13:03 +0100)]
r600: Enable sb also for NIR
Currently, r600/nir doens't have a proper scheduler or optimizer backend,
to be able to make use of this code path without performance regressions,
we enable the sb optimizer also for NIR.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Gert Wollny [Mon, 18 Jan 2021 16:28:49 +0000 (17:28 +0100)]
r600/sb: fall back to un-optimized byte code when ra_init fails
Some emulated fp64 piglit create code that seems to make the register
allocation with sb worse than the original shader created from NIR, so
fall back to using the un-optimized shader.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Gert Wollny [Mon, 18 Jan 2021 16:09:07 +0000 (17:09 +0100)]
r600/sb: fix boundary assert for mem-instruction decoding
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Gert Wollny [Mon, 18 Jan 2021 12:02:44 +0000 (13:02 +0100)]
r600/sfn: Keep array registers alive for the whole shader
This is needed when using sb as a post-optimizer.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Gert Wollny [Sun, 17 Jan 2021 19:58:32 +0000 (20:58 +0100)]
r600/sfn: update shader array info
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Gert Wollny [Sun, 17 Jan 2021 18:10:36 +0000 (19:10 +0100)]
r600/nir: pass array info to r600_shader for sb
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Gert Wollny [Mon, 18 Jan 2021 17:04:16 +0000 (18:04 +0100)]
r600/sb: Add support for INTERP_X and INTERP_Z ops
v2: Fix parsing the 2-slot instructions
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>