Erik Faye-Lund [Wed, 9 Mar 2022 14:40:25 +0000 (15:40 +0100)]
docs: improve language in zink article
Turns out, this was not proper use of language!
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15300>
Erik Faye-Lund [Wed, 9 Mar 2022 11:03:37 +0000 (12:03 +0100)]
docs: fixup zink gl 4.3 requirements
The multiViewport feature isn't required for GL 4.3, it's required for
GL 4.1. Technically speaking, we could have just dropped it because we
already list the maxViewports requirement. But to me it seems better to
be very clear here.
Fixes: 29f8f21bff6 ("docs: document zink GL 4.3 requirements")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15300>
Iago Toral Quiroga [Mon, 7 Mar 2022 15:27:02 +0000 (16:27 +0100)]
broadcom/compiler: don't always assign r5 if available
Instead, only favor assigning r5 if we have first decided to
assign an accumulator. This helps with assigning r5 to short
lived uniforms, favoring accumulator rotation to facilitate
QPU merges.
total instructions in shared programs: 12656164 -> 12628339 (-0.22%)
instructions in affected programs: 5368373 -> 5340548 (-0.52%)
helped: 17420
HURT: 9996
total uniforms in shared programs: 3704776 -> 3704863 (<.01%)
uniforms in affected programs: 12247 -> 12334 (0.71%)
helped: 23
HURT: 78
total max-temps in shared programs: 2153505 -> 2152684 (-0.04%)
max-temps in affected programs: 26468 -> 25647 (-3.10%)
helped: 569
HURT: 328
total fills in shared programs: 4656 -> 4657 (0.02%)
fills in affected programs: 43 -> 44 (2.33%)
helped: 0
HURT: 1
total sfu-stalls in shared programs: 34728 -> 34403 (-0.94%)
sfu-stalls in affected programs: 3411 -> 3086 (-9.53%)
helped: 842
HURT: 534
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:03:03 +0000 (14:03 +0100)]
broadcom/compiler: add comment on why we don't use r5 with ldunifa
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:47:55 +0000 (14:47 +0100)]
broadcom/compiler: adjust register threshold for 2-thread compiles
We have twice the registers in this case so it makes sense to double
this as well. While this causes slight regressions in shader-db
stats (due to additional register pressure), it helps us hide latency
of memory reads better on 2-thread compiles, where the thread switch
mechanism will be less effective. This shows a ~3% performance
improvement on the UE4 SunTemple demo.
total instructions in shared programs: 12642413 -> 12656164 (0.11%)
instructions in affected programs: 2272652 -> 2286403 (0.61%)
helped: 2924
HURT: 3389
total uniforms in shared programs: 3703861 -> 3704776 (0.02%)
uniforms in affected programs: 213729 -> 214644 (0.43%)
helped: 823
HURT: 1272
total max-temps in shared programs: 2150686 -> 2153505 (0.13%)
max-temps in affected programs: 191332 -> 194151 (1.47%)
helped: 1900
HURT: 1891
total spills in shared programs: 3255 -> 3274 (0.58%)
spills in affected programs: 166 -> 185 (11.45%)
helped: 3
HURT: 6
total fills in shared programs: 4630 -> 4656 (0.56%)
fills in affected programs: 367 -> 393 (7.08%)
helped: 7
HURT: 15
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:42:39 +0000 (14:42 +0100)]
broadcom/compiler: add a strategy to disable scheduling of general TMU reads
This can add quite a bit of register pressure so it makes sense to disable it
to prevent us from dropping to 2 threads or increasing spills:
total instructions in shared programs: 12672813 -> 12642413 (-0.24%)
instructions in affected programs: 256721 -> 226321 (-11.84%)
helped: 719
HURT: 77
total threads in shared programs: 415534 -> 416322 (0.19%)
threads in affected programs: 788 -> 1576 (100.00%)
helped: 394
HURT: 0
total uniforms in shared programs: 3711370 -> 3703861 (-0.20%)
uniforms in affected programs: 28859 -> 21350 (-26.02%)
helped: 204
HURT: 455
total max-temps in shared programs: 2159439 -> 2150686 (-0.41%)
max-temps in affected programs: 32945 -> 24192 (-26.57%)
helped: 585
HURT: 47
total spills in shared programs: 5966 -> 3255 (-45.44%)
spills in affected programs: 2933 -> 222 (-92.43%)
helped: 192
HURT: 4
total fills in shared programs: 9328 -> 4630 (-50.36%)
fills in affected programs: 5184 -> 486 (-90.62%)
helped: 196
HURT: 0
Compared to the stats before adding scheduling of non-filtered
memory reads we see that we have now gotten back all that was
lost and then some:
total instructions in shared programs: 12663186 -> 12642413 (-0.16%)
instructions in affected programs: 2051803 -> 2031030 (-1.01%)
helped: 4885
HURT: 3338
total threads in shared programs: 415870 -> 416322 (0.11%)
threads in affected programs: 896 -> 1348 (50.45%)
helped: 300
HURT: 74
total uniforms in shared programs: 3711629 -> 3703861 (-0.21%)
uniforms in affected programs: 158766 -> 150998 (-4.89%)
helped: 1973
HURT: 499
total max-temps in shared programs: 2138857 -> 2150686 (0.55%)
max-temps in affected programs: 177920 -> 189749 (6.65%)
helped: 2666
HURT: 2035
total spills in shared programs: 3860 -> 3255 (-15.67%)
spills in affected programs: 2653 -> 2048 (-22.80%)
helped: 77
HURT: 21
total fills in shared programs: 5573 -> 4630 (-16.92%)
fills in affected programs: 3839 -> 2896 (-24.56%)
helped: 81
HURT: 15
total sfu-stalls in shared programs: 39583 -> 38154 (-3.61%)
sfu-stalls in affected programs: 8993 -> 7564 (-15.89%)
helped: 1808
HURT: 1038
total nops in shared programs: 324894 -> 323685 (-0.37%)
nops in affected programs: 30362 -> 29153 (-3.98%)
helped: 2513
HURT: 2077
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:04:19 +0000 (14:04 +0100)]
broadcom/compiler: define v3d-specific delays for NIR instructions
We do a few changes over NIR's defaults:
1. Lower delay for texture reads. Empirically, we don't observe any
benefits with delays over 50 and since this delay value is still
used by the scheduler in the "favor register pressure" case it is
beneficial to avoid overestimating it too much.
2. Adjust delay for non-filtered TMU reads to the delay selected for
texture reads.
3. In our case, UBO reads from dynamically uniform addresses don't
use the TMU and have a latency of 1 instruction in the best case
scenario or 4 at worst, so we go with 1 so we don't try to move
this early.
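As a rough illustration of the three adjustments above, the sketch below maps an instruction category to a scheduling delay. The enum, the function name and the fallback value for other instructions are hypothetical and are not taken from the V3D compiler; only the values 50 and 1 come from this message, and using exactly 50 for texture reads is an assumption based on the "no benefits over 50" remark.

```c
/* Hypothetical instruction categories; these are not Mesa's NIR types. */
enum sketch_instr_kind {
   SKETCH_ALU,
   SKETCH_TEXTURE_READ,
   SKETCH_TMU_GENERAL_READ, /* non-filtered TMU read */
   SKETCH_UBO_UNIFORM_READ, /* UBO read from a dynamically uniform address */
};

/* Returns a scheduling delay, in instructions, for the given category. */
static unsigned
sketch_instr_delay(enum sketch_instr_kind kind)
{
   switch (kind) {
   case SKETCH_TEXTURE_READ:
      /* 1. Lowered delay: no benefit was observed above 50. */
      return 50;
   case SKETCH_TMU_GENERAL_READ:
      /* 2. Same delay as the one selected for texture reads. */
      return sketch_instr_delay(SKETCH_TEXTURE_READ);
   case SKETCH_UBO_UNIFORM_READ:
      /* 3. Best-case latency, so the load is not moved early. */
      return 1;
   default:
      /* Assumed placeholder for everything else. */
      return 1;
   }
}
```

Keeping the non-filtered TMU case expressed in terms of the texture case means a later retune of the texture delay carries over without a second edit.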
This helps us get back some of what we lost when updating the
default scheduler configuration to add a delay for non-filtered
memory reads:
total instructions in shared programs: 13126587 -> 12671765 (-3.46%)
instructions in affected programs: 3764097 -> 3309275 (-12.08%)
helped: 14664
HURT: 4244
total threads in shared programs: 407208 -> 415522 (2.04%)
threads in affected programs: 8716 -> 17030 (95.39%)
helped: 4224
HURT: 67
total uniforms in shared programs: 3812698 -> 3711224 (-2.66%)
uniforms in affected programs: 335170 -> 233696 (-30.28%)
helped: 2816
HURT: 3551
total max-temps in shared programs: 2318430 -> 2159345 (-6.86%)
max-temps in affected programs: 539991 -> 380906 (-29.46%)
helped: 13173
HURT: 1440
total spills in shared programs: 49086 -> 5966 (-87.85%)
spills in affected programs: 48306 -> 5186 (-89.26%)
helped: 1655
HURT: 28
total fills in shared programs: 55810 -> 9328 (-83.29%)
fills in affected programs: 54821 -> 8339 (-84.79%)
helped: 1659
HURT: 22
LOST: 0
GAINED: 3
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Thu, 3 Mar 2022 11:18:02 +0000 (12:18 +0100)]
nir/schedule: allow drivers to decide about instruction latency
On V3D reading UBOs from uniform addresses uses a more efficient
mechanism with lower latency. On other platforms there may be
similar scenarios.
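One way such a driver hook could be structured is sketched below: the scheduler consults an optional callback and only falls back to its generic estimate when no callback is installed. All type, field and function names here are hypothetical, and the default delay values are placeholders; this is not the interface added to nir_schedule.

```c
#include <stddef.h>

/* Hypothetical, minimal instruction description. */
struct sketch_instr {
   int is_memory_read;
};

/* Hypothetical scheduler options with an optional per-driver callback. */
struct sketch_sched_options {
   unsigned (*delay_cb)(const struct sketch_instr *instr, void *data);
   void *delay_cb_data;
};

/* Generic estimate used when the driver does not provide one.
 * The numbers are placeholders chosen for illustration only. */
static unsigned
sketch_default_delay(const struct sketch_instr *instr)
{
   return instr->is_memory_read ? 100 : 1;
}

/* Example driver callback: every read is treated as low latency. */
static unsigned
sketch_low_latency_cb(const struct sketch_instr *instr, void *data)
{
   (void)instr;
   (void)data;
   return 1;
}

/* The scheduler asks the driver first, then falls back to the default. */
static unsigned
sketch_get_delay(const struct sketch_sched_options *opts,
                 const struct sketch_instr *instr)
{
   if (opts && opts->delay_cb)
      return opts->delay_cb(instr, opts->delay_cb_data);
   return sketch_default_delay(instr);
}
```

A driver would fill in `delay_cb` before invoking the scheduler; leaving it NULL keeps the generic behaviour.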
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 2 Mar 2022 11:15:15 +0000 (12:15 +0100)]
nir/schedule: use larger delay for non-filtered memory reads
This has been pending for a long time. It is not very consistent to
add a significant delay for textures and not do it for UBOs, etc.
The reason we have not been doing this so far is the accumulated effect
on register pressure for V3D as shown by shader-db results below, but
from the point of view of a generic scheduler it makes sense to do this.
Later patches will address V3D specific issues with register pressure
derived from this by letting the driver control its instruction delay
settings.
total instructions in shared programs: 12662138 -> 13126587 (3.67%)
instructions in affected programs: 1813091 -> 2277540 (25.62%)
helped: 2410
HURT: 10499
total threads in shared programs: 415858 -> 407208 (-2.08%)
threads in affected programs: 17348 -> 8698 (-49.86%)
helped: 8
HURT: 4333
total uniforms in shared programs: 3711483 -> 3812698 (2.73%)
uniforms in affected programs: 128012 -> 229227 (79.07%)
helped: 3474
HURT: 2143
total max-temps in shared programs: 2138763 -> 2318430 (8.40%)
max-temps in affected programs: 318780 -> 498447 (56.36%)
helped: 588
HURT: 11997
total spills in shared programs: 3860 -> 49086 (1171.66%)
spills in affected programs: 709 -> 45935 (6378.84%)
helped: 23
HURT: 1595
total fills in shared programs: 5573 -> 55810 (901.44%)
fills in affected programs: 1067 -> 51304 (4708.25%)
helped: 23
HURT: 1595
LOST: 3
GAINED: 0
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 9 Mar 2022 11:11:55 +0000 (12:11 +0100)]
nir/schedule: handle nir_intrinsic_group_memory_barrier
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 9 Mar 2022 09:38:42 +0000 (10:38 +0100)]
nir/schedule: fix handling of generic memory barrier
We can get a generic nir_intrinsic_memory_barrier to represent a
barrier involving multiple semantics (instead of getting individual
specific barriers for each semantic). This means that we need to
consider these as potentially affecting shared memory access as well.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 2 Mar 2022 10:10:39 +0000 (11:10 +0100)]
broadcom/compiler: stop moving UBO loads before NIR scheduling
This doesn't have any significant impact on shader-db stats and would
reduce our capacity to hide latency from the loads, so it is probably
undesirable:
total instructions in shared programs: 12663189 -> 12663186 (<.01%)
instructions in affected programs: 4222 -> 4219 (-0.07%)
helped: 9
HURT: 4
total uniforms in shared programs: 3711624 -> 3711629 (<.01%)
uniforms in affected programs: 186 -> 191 (2.69%)
helped: 0
HURT: 2
total max-temps in shared programs: 2138822 -> 2138857 (<.01%)
max-temps in affected programs: 569 -> 604 (6.15%)
helped: 1
HURT: 9
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Michel Zou [Thu, 3 Mar 2022 06:12:05 +0000 (07:12 +0100)]
lavapipe: set non-zero device/driver uuid
Closes #5875
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15230>
Danylo Piliaiev [Fri, 11 Feb 2022 16:15:27 +0000 (18:15 +0200)]
turnip: Make autotuner work with reusable command buffers
To achieve it each command buffer now has its own GPU memory.
However, the autotuner's use of BOs is not optimal; the ideal
pattern would be to use some memory pool to suballocate small
GPU memory chunks, since most command buffers have only a few
renderpasses.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5990
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14996>
Gert Wollny [Wed, 3 Nov 2021 13:13:43 +0000 (14:13 +0100)]
virgl: Add a few more formats to the format table
These formats are used by the piglit test
arb_texture_buffer_object-formats fs arb.
Adding them here keeps piglit from crashing, but most of the related
tests don't pass.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13645>
Kenneth Graunke [Wed, 12 Jan 2022 00:17:34 +0000 (16:17 -0800)]
intel: Use 3DSTATE_BINDING_TABLE_POOL_ALLOC exclusively on Gfx11+
On Icelake and later, we can use a new 3DSTATE_BINDING_TABLE_POOL_ALLOC
command to update the location of the binder (buffer containing binding
table entries), rather than having to move Surface State Base Address
via a STATE_BASE_ADDRESS command. This has less stalling and also means
our surface addresses can remain relative to a fixed 4GB address range,
meaning we don't have to re-stream them any time the binder changes.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
Kenneth Graunke [Tue, 11 Jan 2022 23:06:06 +0000 (15:06 -0800)]
intel: Limit Wa_1607854226 to Gfx12.0 only
This workaround is needed on all Gfx12.0 parts, but doesn't appear to be
necessary on XeHP. The other drivers do not appear to be applying this
workaround on those parts. As further evidence, we accidentally added
the 3DSTATE_BINDING_TABLE_POOL_ALLOC commands after switching back to
GPGPU mode, which would be an incorrect way to implement the workaround,
and things seem to be working.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
Kenneth Graunke [Tue, 11 Jan 2022 20:36:26 +0000 (12:36 -0800)]
iris: Rename surface_base_address to binder_address in a few places
On Gfx11+, we're going to stop changing Surface State Base Address
and start changing the Binding Table Pool address instead.
So, rename a few things to track the last binder address, which is
what we're actually changing, regardless of how we program it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
Kenneth Graunke [Tue, 25 Jun 2019 20:16:50 +0000 (13:16 -0700)]
iris: Use more efficient binding table pointer formats on Icelake+.
Skylake and older use a 15:5 binding table pointer format, which means
our binder can be at most 64kB in size. Each binding table within the
binder must be aligned to 32B.
XeHP uses a new 20:5 binding table format, which allows us to increase
the binder size to 1MB while retaining the nice 32B alignment. Larger
binders mean fewer stalls as we update the base address for the binder.
Icelake and Tigerlake can either use the 15:5 format or an 18:8 format.
18:8 mode requires the base of each binding table to be aligned to 256B
instead of 32B, but it gives us a maximum binder size of 512kB.
We can store 64 binding table entries in a 256B chunk (256B / 4B = 64),
but only 8 entries in a 32B chunk (32B / 4B = 8). Assuming that most
binding tables have fewer than 64 entries, this means that with the 18:8
format, we're likely to be able to fit 2048 (512KB / 256B) tables into
a buffer before needing to allocate a new one and stall.
Technically, the old format could also store 2048 binding tables per
buffer (64KB / 32B = 2048). However, tables that needed more
than 8 entries would need multiple 32B chunks. A single table would
take multiple aligned chunks, while with the larger 256B format, it
could fit in a single one.
This cuts binder resets by 6.3% on a Shadow of Mordor benchmark trace.
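The arithmetic above can be checked with a small sketch. The helper names are hypothetical; the 4-byte entry size, the 32B and 256B alignments and the 64KB and 512KB binder sizes are the figures quoted in this message.

```c
/* Size of one binding table entry in bytes, as used in the message. */
#define SKETCH_BT_ENTRY_SIZE 4u

/* Number of entries that fit in one aligned chunk. */
static unsigned
sketch_entries_per_chunk(unsigned alignment)
{
   return alignment / SKETCH_BT_ENTRY_SIZE;
}

/* Number of aligned chunks available in a binder of the given size. */
static unsigned
sketch_chunks_per_binder(unsigned binder_size, unsigned alignment)
{
   return binder_size / alignment;
}

/* Number of aligned chunks one table with `entries` entries occupies. */
static unsigned
sketch_chunks_for_table(unsigned entries, unsigned alignment)
{
   return (entries * SKETCH_BT_ENTRY_SIZE + alignment - 1) / alignment;
}
```

For example, a table with 20 entries occupies three 32B chunks in the 15:5 format but a single 256B chunk in the 18:8 format, which is why the same nominal count of 2048 chunks goes further with the larger alignment.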
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
Jason Ekstrand [Mon, 11 May 2020 18:49:55 +0000 (13:49 -0500)]
blorp: Add a binding_table_offset_to_pointer helper
On Gen11+, we have a feature that requires us to shift binding table
offsets by 3. This adds a helper which gives the driver a hook to do
this if it so chooses.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14507>
Pierre-Eric Pelloux-Prayer [Tue, 8 Mar 2022 10:44:26 +0000 (11:44 +0100)]
gallium/tc: zero alloc transfers
Otherwise this causes trouble with uninitialized memory, e.g. with:
struct si_transfer {
   struct threaded_transfer b;
   struct si_resource *staging;
};
'staging' will not be initialized and this causes #6109.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6109
Cc: mesa-stable
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15298>
Pierre-Eric Pelloux-Prayer [Tue, 8 Mar 2022 10:43:29 +0000 (11:43 +0100)]
util/slab: add slab_zalloc
A variant that clears the allocated object to 0.
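The idea of a zeroing variant can be sketched with a toy fixed-size pool. The pool type and both functions below are hypothetical stand-ins built on malloc; they are not Mesa's util/slab implementation and do not reflect its signatures.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size object pool; a stand-in for a slab allocator. */
struct sketch_pool {
   size_t obj_size;
};

/* Plain allocation: the returned memory has indeterminate contents. */
static void *
sketch_pool_alloc(struct sketch_pool *pool)
{
   return malloc(pool->obj_size);
}

/* Zeroing variant: same allocation, then clear the whole object so that
 * fields the caller never touches start out as 0/NULL. */
static void *
sketch_pool_zalloc(struct sketch_pool *pool)
{
   void *obj = sketch_pool_alloc(pool);

   if (obj)
      memset(obj, 0, pool->obj_size);
   return obj;
}
```

Clearing by the pool's object size, rather than by the size of a base struct, is what covers trailing fields added by a wrapping struct.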
Cc: mesa-stable
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15298>
Danylo Piliaiev [Wed, 12 Jan 2022 19:35:33 +0000 (21:35 +0200)]
tu: Refactor VS DECODE/DEST to be emitted in two pkt4
Refactor to emit VFD_DECODE and VFD_DEST_CNTL in two packets
regardless of attribute count.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14584>
Mike Blumenkrantz [Wed, 2 Mar 2022 18:26:08 +0000 (13:26 -0500)]
mesa/st: make export_point_size shader key clobber existing psiz
this is necessary to upload the API value using the uniform constant
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 16:54:18 +0000 (11:54 -0500)]
mesa/st: check max output components for adding pointsize during precompile
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 17:07:16 +0000 (12:07 -0500)]
mesa/st: count FF shaders as needing psiz export for precompile
this is consistent with logic for regular compiles
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Thu, 3 Mar 2022 02:47:22 +0000 (21:47 -0500)]
mesa/st: precompile with API pointsize only if the shader doesn't have pointsize
this is a more accurate hint, and maintains the existing behavior given
subsequent changes to this area
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 16:25:18 +0000 (11:25 -0500)]
mesa/st: simplify pointsize precompile conditional
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 15:55:57 +0000 (10:55 -0500)]
mesa/st: simplify pointsize shader update conditional
ES contexts have no API toggle for this, so it will never be flagged
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 16:06:55 +0000 (11:06 -0500)]
mesa: always set PointSizeEnabled for API_OPENGLES2
this is implicit, so make it explicit
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 16:48:43 +0000 (11:48 -0500)]
mesa/st: only add pointsize output if it doesn't exceed max component limit
fixes (zink-nvidia):
dEQP-GLES31.functional.geometry_shading.basic*
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 16:33:03 +0000 (11:33 -0500)]
nir/gather_info: check copy_deref instrs for writing outputs
this is a valid way to write an output even though it usually gets rewritten
to some other instruction later on
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 15:54:14 +0000 (10:54 -0500)]
mesa/st: conditionally add pointsize outputs to ES tess/geom shaders
if the driver requires this value to be set, add it if the shader doesn't
use the ext to allow exporting it
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 19:03:42 +0000 (14:03 -0500)]
mesa/st: add a gl_program struct flag to skip psiz exports for xfb
if this output did not exist in the original shader,
then it must not be exported in xfb
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Mike Blumenkrantz [Wed, 2 Mar 2022 15:53:46 +0000 (10:53 -0500)]
glsl: store OES/EXT point_size extension enablement to shader struct
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15228>
Boris Brezillon [Thu, 23 Sep 2021 15:54:08 +0000 (17:54 +0200)]
panvk: Implement vkCmdDispatch()
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Boris Brezillon [Thu, 23 Sep 2021 14:27:06 +0000 (16:27 +0200)]
panvk: Add support for storage image
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Boris Brezillon [Wed, 29 Sep 2021 11:19:47 +0000 (13:19 +0200)]
panvk: Move dummy attribute buffer emission out of emit_{attribute,varying}_bufs
So we can easily add entries after the standard varyings/attributes
(like image descriptors in the attribute array).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Boris Brezillon [Thu, 23 Sep 2021 13:40:46 +0000 (15:40 +0200)]
panvk: Add support for storage/uniform buffers with dynamic offsets
The idea of storing offsets in a separate UBO and lowering accesses to
UBOs/SSBOs with a dynamic offset was not great. Let's apply the offset
at UBO/SSBO emission time instead.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Boris Brezillon [Thu, 23 Sep 2021 13:47:30 +0000 (15:47 +0200)]
panvk: Support creation of compute pipelines
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Boris Brezillon [Thu, 23 Sep 2021 13:13:21 +0000 (15:13 +0200)]
panvk: Add support for storage buffers
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Boris Brezillon [Mon, 13 Sep 2021 14:50:50 +0000 (16:50 +0200)]
panvk: Add support for push constants
Push constants are stored in a separate UBO.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15248>
Mike Blumenkrantz [Thu, 24 Feb 2022 20:43:44 +0000 (15:43 -0500)]
zink: handle spirv xfb insanity
this comes in two flavors:
* streamout of array<struct/block>
* partial streamout of array/struct/block
for the former:
* arrays of structs can just be blasted out in the initial var declaration (easy)
* arrays of blocks must be output to separate xfb buffers for each array block,
which requires skipping initial xfb blast-off and instead propagating the values
using tmp variables at a later point
for the latter:
* the optimal way to do this is to unwrap the struct first to figure out what's being
emitted, at which point the value can be extracted and exported
fixes the rest of spec@arb_gl_spirv@execution@xfb
...except spec@arb_gl_spirv@execution@xfb@vs_block_array, which I'm suspecting is broken
due to vtn bugs
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Thu, 24 Feb 2022 20:43:15 +0000 (15:43 -0500)]
zink: store shader to ntv_context
it's insane the gymnastics that have to be done because this wasn't stored
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Wed, 16 Feb 2022 20:16:12 +0000 (15:16 -0500)]
zink: handle remaining xfb corner cases during analysis
this now handles inlining of stupid types (dvec3, dmatX) and complex
types (goku) as seen in cts
fixes:
KHR-Single-GL46.enhanced_layouts.xfb_explicit_location
KHR-Single-GL46.enhanced_layouts.xfb_struct_explicit_location
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Wed, 16 Feb 2022 22:44:13 +0000 (17:44 -0500)]
zink: fix xfb analysis variable finding for arrays
this fixes clipdistance exports
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Wed, 16 Feb 2022 20:15:55 +0000 (15:15 -0500)]
zink: correctly set xfb packed output offsets
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Wed, 16 Feb 2022 20:15:35 +0000 (15:15 -0500)]
zink: store the correct number of components for xfb packing outputs
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Wed, 16 Feb 2022 20:14:52 +0000 (15:14 -0500)]
zink: use 64bit mask for xfb analysis
I don't know how this worked before since all the values are oob?
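A minimal sketch of why the mask width matters, assuming (the message does not say so explicitly) that slot indices can reach 32 or more. Shifting a 32-bit `1` by such an index is undefined behaviour in C, while a 64-bit constant covers indices up to 63. The helper names are hypothetical.

```c
#include <stdint.h>

/* Bit for a slot index in a 64-bit mask. `slot` must be below 64. */
static inline uint64_t
sketch_slot_bit(unsigned slot)
{
   return UINT64_C(1) << slot;
}

/* Returns 1 if the given slot is set in the mask, 0 otherwise. */
static inline int
sketch_slot_is_set(uint64_t mask, unsigned slot)
{
   return (int)((mask >> slot) & 1);
}
```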
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15224>
Mike Blumenkrantz [Wed, 9 Mar 2022 03:22:11 +0000 (22:22 -0500)]
ci: more stoney flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Wed, 9 Mar 2022 02:43:12 +0000 (21:43 -0500)]
ci: add another stoney flake
from #6109
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Tue, 1 Feb 2022 19:38:19 +0000 (14:38 -0500)]
aux/cso: stop tracing during cso_unbind()
this unnecessarily bloats lavapipe traces
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Thu, 27 Jan 2022 19:31:18 +0000 (14:31 -0500)]
aux/trace: dump more rasterizer state members
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Tue, 8 Feb 2022 14:41:30 +0000 (09:41 -0500)]
aux/trace: dump clear_texture colors
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Tue, 8 Feb 2022 14:41:05 +0000 (09:41 -0500)]
aux/trace: dump clear colors as uints
dumping as float is nice if the clear color is a float, but if it isn't
then the value is useless
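To illustrate the point, the sketch below formats the raw 32-bit words of a four-component color, which stay meaningful whatever type the components really have. The union and function are hypothetical and only loosely modelled on a float/int/uint color union; they are not the trace dumper's code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical clear color that may hold floats, ints or uints. */
union sketch_color {
   float f[4];
   int32_t i[4];
   uint32_t ui[4];
};

/* Writes the four components as unsigned integers into buf.
 * Returns the snprintf result. */
static int
sketch_dump_color_uints(const union sketch_color *color, char *buf,
                        size_t size)
{
   return snprintf(buf, size,
                   "{%" PRIu32 ", %" PRIu32 ", %" PRIu32 ", %" PRIu32 "}",
                   color->ui[0], color->ui[1], color->ui[2], color->ui[3]);
}
```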
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Thu, 27 Jan 2022 18:51:06 +0000 (13:51 -0500)]
aux/trace: rzalloc the context struct
this has problems if pointers are garbage
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Mon, 7 Feb 2022 19:39:25 +0000 (14:39 -0500)]
aux/trace: more screen methods
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14854>
Mike Blumenkrantz [Tue, 22 Feb 2022 20:25:06 +0000 (15:25 -0500)]
lavapipe: fix pipeline creation for blend and zs states
these values are read based on the specified subpass containing the
required attachments, not on the overall renderpass
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15282>
Mike Blumenkrantz [Tue, 8 Mar 2022 15:59:32 +0000 (10:59 -0500)]
lavapipe: update multisample state after blend state
null blend pipeline state will zero the blend struct, which would cause
values set here to be overwritten
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15282>
Rob Clark [Tue, 8 Mar 2022 21:40:14 +0000 (13:40 -0800)]
turnip: Don't call getenv() directly
I noticed it was using getenv directly when I tried to use 'setprop
mesa.tu.debug ..' on android. Use os_get_option() instead so we get
sysprop fallback on android.
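The setprop example above suggests a mapping from an environment variable name such as TU_DEBUG to a property key such as mesa.tu.debug. The sketch below shows one plausible form of that mapping (prefix with "mesa.", lowercase, replace '_' with '.'); the function is hypothetical and the exact rules used by os_get_option() are an assumption here.

```c
#include <ctype.h>
#include <stddef.h>

/* Builds a property key from an option name, e.g. "TU_DEBUG" becomes
 * "mesa.tu.debug". The output is truncated if it does not fit and is
 * always NUL-terminated when key_size > 0. */
static void
sketch_option_to_prop_key(const char *name, char *key, size_t key_size)
{
   static const char prefix[] = "mesa.";
   size_t out = 0;

   if (key_size == 0)
      return;

   for (const char *p = prefix; *p && out + 1 < key_size; p++)
      key[out++] = *p;

   for (const char *p = name; *p && out + 1 < key_size; p++)
      key[out++] = (*p == '_') ? '.' : (char)tolower((unsigned char)*p);

   key[out] = '\0';
}
```

A lookup helper would try the environment first and query the property under this key only when the variable is unset.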
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15289>
Gert Wollny [Wed, 12 Jan 2022 12:01:30 +0000 (13:01 +0100)]
virgl: Fix texture transfers by using a staging resource
This commit fixes the following flaws in the implementation:
* when a resource was re-allocated, the guest side storage
was also allocated
* when a source needs a readback before being written to, then
the call would go through vws->transfer_get, thereby bypassing the
staging resource, and this would fail on the host, because
the allocated IOV was too small (just one byte)
* if the texture write would need neither flush nor readback, the
old code path would be used, which expects guest side backing storage
for the texture.
v2: - actually do a readback to the staging resource when it is required
- fix typo (Lepton)
v3: Don't use staging transfers if the host can't read back the data
by rendering to an FBO or calling getTexImage, because in this case
we rely on the IOV to hold the data.
v4: Also don't use staging transfers if the format is not a readback
format. Otherwise we have to deal with the resolve blit, and
this is currently not working correctly.
v5: add a new flag that indicates whether non-renderable textures can
be read back (either via glGetTexImage or GBM)
v6: Restrict the use of staging texture transfers to textures that can
be read back, and on GLES also if they are bound to scanout and
the host uses minigbm to allocate such textures.
For that replace the flag indicating the capability to read back
non-renderable textures with a cap that indicates whether scanout
textures can be read back.
v7: update virglrenderer version in the CI
v8: update use of staging (Chia-I)
v9: remove superfluous check and assignment (Chia-I)
v10: disable staging textures for arrays with stencil format. This is a
workaround for failures of the CI.
Fixes: cdc480585c9be368ddfdc33e2eb73e3582f25fe7 ("virgl/drm: New optimization for uploading textures")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14495>
Mike Blumenkrantz [Mon, 31 Jan 2022 15:18:10 +0000 (10:18 -0500)]
llvmpipe: clamp surface clear geometry
avoid oob writes to avoid crashing
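A minimal sketch of this kind of clamping, assuming a hypothetical
rectangle type and helper name (neither is taken from llvmpipe itself).
The rectangle is intersected with the surface bounds, and an empty
result tells the caller to skip the clear:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rectangle type for illustration; not llvmpipe's own struct. */
struct demo_clear_rect {
   int32_t x, y;
   uint32_t width, height;
};

/* Clamp *r to a surface of surf_w x surf_h pixels.
 * Returns 1 if a non-empty area remains, 0 if the clear can be skipped. */
static int
demo_clamp_clear_rect(struct demo_clear_rect *r, uint32_t surf_w, uint32_t surf_h)
{
   assert(r);

   /* Use 64-bit math so x + width cannot overflow. */
   int64_t x0 = r->x, y0 = r->y;
   int64_t x1 = x0 + r->width, y1 = y0 + r->height;

   if (x0 < 0) x0 = 0;
   if (y0 < 0) y0 = 0;
   if (x1 > (int64_t)surf_w) x1 = surf_w;
   if (y1 > (int64_t)surf_h) y1 = surf_h;

   if (x1 <= x0 || y1 <= y0) {
      r->width = r->height = 0;
      return 0;
   }

   r->x = (int32_t)x0;
   r->y = (int32_t)y0;
   r->width = (uint32_t)(x1 - x0);
   r->height = (uint32_t)(y1 - y0);
   return 1;
}
```

Clamping before writing keeps every store inside the surface, which is
the property the out-of-bounds fix relies on.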
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14655>
Mike Blumenkrantz [Fri, 21 Jan 2022 19:31:30 +0000 (14:31 -0500)]
lavapipe: clamp clear attachments rects
there is at least one unnamed game which has problems with this, so try
to avoid crashing
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14655>
Mike Blumenkrantz [Fri, 28 Jan 2022 14:46:47 +0000 (09:46 -0500)]
llvmpipe: fix debug print iterating in set_framebuffer_state
this would potentially access garbage memory by checking the existing
state using the incoming state's iterator values
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14857>
Mike Blumenkrantz [Tue, 8 Mar 2022 17:51:50 +0000 (12:51 -0500)]
zink: ci updates
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>
Mike Blumenkrantz [Mon, 7 Mar 2022 14:25:43 +0000 (09:25 -0500)]
zink: fix 64bit float shader ops
this was being set from back before zink actually supported 64bit
natively and only 32bit was functional, but it breaks 64bit support
cc: mesa-stable
fixes (lavapipe):
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec2
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec3
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec4
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>
Mike Blumenkrantz [Tue, 8 Mar 2022 17:12:21 +0000 (12:12 -0500)]
zink: run nir_lower_phis_to_scalar in optimization loop
fixes (lavapipe):
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>
Samuel Pitoiset [Fri, 18 Feb 2022 13:23:42 +0000 (14:23 +0100)]
radv,aco,llvm: lower post shuffle vertex in NIR
fossil-db (Sienna Cichlid):
Totals from 774 (0.57% of 134913) affected shaders:
VGPRs: 26496 -> 26312 (-0.69%)
CodeSize: 1825936 -> 1828812 (+0.16%); split: -0.04%, +0.20%
MaxWaves: 22046 -> 22062 (+0.07%)
Instrs: 347634 -> 347975 (+0.10%); split: -0.05%, +0.15%
Latency: 1363949 -> 1356426 (-0.55%); split: -0.59%, +0.04%
InvThroughput: 221529 -> 221380 (-0.07%); split: -0.10%, +0.04%
VClause: 5682 -> 5676 (-0.11%); split: -1.46%, +1.36%
SClause: 7485 -> 7411 (-0.99%); split: -1.48%, +0.49%
Copies: 30481 -> 30420 (-0.20%); split: -0.51%, +0.31%
PreVGPRs: 19717 -> 19656 (-0.31%)
fossil-db (Polaris10):
Totals from 896 (0.66% of 135960) affected shaders:
SGPRs: 49824 -> 49648 (-0.35%); split: -0.39%, +0.03%
VGPRs: 31040 -> 29948 (-3.52%); split: -3.62%, +0.10%
CodeSize: 875960 -> 875920 (-0.00%); split: -0.06%, +0.05%
MaxWaves: 6380 -> 6429 (+0.77%)
Instrs: 171522 -> 171482 (-0.02%); split: -0.07%, +0.05%
Latency: 1356082 -> 1334386 (-1.60%); split: -1.61%, +0.01%
InvThroughput: 553389 -> 552957 (-0.08%); split: -0.08%, +0.00%
VClause: 4317 -> 4244 (-1.69%); split: -2.41%, +0.72%
SClause: 6157 -> 6139 (-0.29%); split: -0.45%, +0.16%
Copies: 9340 -> 9235 (-1.12%); split: -1.24%, +0.12%
PreVGPRs: 22366 -> 22116 (-1.12%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15113>
Timur Kristóf [Thu, 24 Feb 2022 09:27:30 +0000 (10:27 +0100)]
nir: Introduce workgroup_index and ability to lower workgroup_id to it.
The workgroup_index is intended for situations when a 3 dimensional
workgroup_id is not available on the HW, but a 1 dimensional index is.
In this case, we can lower the 3D ID to use this.
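The arithmetic behind such a lowering can be sketched as follows. This
is a scalar illustration only, not the NIR pass: the struct and function
names are hypothetical, and it assumes the 1D index is linearized with x
varying fastest, then y, then z.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 3-component unsigned vector for illustration. */
struct demo_uvec3 {
   uint32_t x, y, z;
};

/* Rebuild a 3D workgroup ID from a 1D workgroup index.
 * Assumes index = x + y * num.x + z * num.x * num.y. */
static struct demo_uvec3
demo_workgroup_id_from_index(uint32_t index, struct demo_uvec3 num)
{
   assert(num.x > 0 && num.y > 0);

   struct demo_uvec3 id;
   id.x = index % num.x;
   id.y = (index / num.x) % num.y;
   id.z = index / (num.x * num.y);
   return id;
}
```

For a 4x3x2 grid, index 17 maps to (1, 1, 1), since 1 + 1*4 + 1*12 = 17.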
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
Timur Kristóf [Thu, 24 Feb 2022 09:17:36 +0000 (10:17 +0100)]
nir: Extract lower_id_to_index into a separate function.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
Timur Kristóf [Thu, 24 Feb 2022 09:14:08 +0000 (10:14 +0100)]
nir: Fix lowering terminology of compute system values: "from"->"to".
This is to match other NIR terminology.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
Jason Ekstrand [Fri, 28 Jan 2022 21:04:50 +0000 (15:04 -0600)]
panvk: Non-destructively stub GetRenderAreaGranularity
Don't crash. Just print a warning and return 1x1.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15285>
Jason Ekstrand [Fri, 28 Jan 2022 21:04:14 +0000 (15:04 -0600)]
panvk: Advertise zero sparse format properties
This is the correct implementation when you don't support sparse and
fixes piles of CTS crashes.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15285>
Jason Ekstrand [Fri, 28 Jan 2022 21:22:45 +0000 (15:22 -0600)]
panvk: Advertise VK_KHR_get_physical_device_properties2
All the entrypoints are already implemented and a bunch of Vulkan CTS
tests assume this extension exists.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15285>
Rob Clark [Mon, 7 Mar 2022 23:53:59 +0000 (15:53 -0800)]
gallium/dri: Add missing in_fence_fd initialization
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6108
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15272>
Yogesh Mohan Marimuthu [Thu, 13 Jan 2022 10:36:48 +0000 (16:06 +0530)]
vulkan/device_select: add has_vulkan11 flag with has_pci_bus flag
In EnumeratePhysicalDevices(), pci bus info is available only in
vulkan version >= 1.1. Hence add the has_vulkan11 flag in places
where has_pci_bus is used in EnumeratePhysicalDevices() code flow.
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14535>
Yogesh Mohan Marimuthu [Thu, 13 Jan 2022 10:25:20 +0000 (15:55 +0530)]
vulkan/device_select: for vulkan 1.0 use vid/did for boot_vga
In device select layer EnumeratePhysicalDevices() function pci
bus information is available only in case of vulkan >= 1.1.
Hence use vid/did to match boot_vga device in case of vulkan 1.0.
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14535>
Timur Kristóf [Wed, 23 Feb 2022 14:01:05 +0000 (15:01 +0100)]
nir: Fix handling of NV_mesh_shader PRIMITIVE_INDICES output.
PRIMITIVE_INDICES is a flat array in NV_mesh_shader,
not a proper arrayed output, as opposed to D3D-style
mesh shaders where it's addressed by the primitive index.
Prevent assigning several slots to primitive indices,
to avoid issues.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15160>
Rhys Perry [Wed, 23 Feb 2022 17:29:25 +0000 (17:29 +0000)]
aco/insert_exec_mask: optimize top-level transition to exact before demote
fossil-db (Sienna Cichlid):
Totals from 5767 (3.55% of 162293) affected shaders:
Instrs: 3264949 -> 3257527 (-0.23%); split: -0.23%, +0.00%
CodeSize: 17835692 -> 17806004 (-0.17%); split: -0.17%, +0.00%
Latency: 45990060 -> 45987924 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 7643850 -> 7643835 (-0.00%); split: -0.00%, +0.00%
Copies: 193641 -> 186219 (-3.83%); split: -3.84%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
Rhys Perry [Wed, 23 Feb 2022 17:35:33 +0000 (17:35 +0000)]
aco/insert_exec_mask: use get_exec_op
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
Rhys Perry [Wed, 23 Feb 2022 17:21:42 +0000 (17:21 +0000)]
aco/insert_exec_mask: fix top-level to-exact with non-global exact mask
After transitioning to exact after a discard, the exec stack might be:
[exact|global, wqm, exact]
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
Cristian Ciocaltea [Mon, 7 Mar 2022 16:24:55 +0000 (18:24 +0200)]
radeonsi/ci: Mark a bunch of flaky tests on stoney
radeonsi-stoney-gl:amd64 job fails due to random crashes of some
'dEQP-GLES3.functional.buffer.map.write.explicit_flush.*' tests.
Fix the pipeline by adding them to 'radeonsi-stoney-flakes.txt'.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>
Cristian Ciocaltea [Mon, 7 Mar 2022 14:34:39 +0000 (16:34 +0200)]
ci/zink: Report flake test
Mark 'KHR-GL46.shader_image_load_store.advanced-sso-subroutine' as
flake since it failed several times while attempting to merge this MR.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>
Cristian Ciocaltea [Thu, 3 Mar 2022 23:59:55 +0000 (01:59 +0200)]
ci: Improve interrupt signal handling in crosvm-runner.sh
Run crosvm as a background process in order to allow intercepting
interrupt signals (INT, TERM) and properly release/cleanup any allocated
resources.
This is particularly helpful when one or more crosvm tasks hang, which
will eventually prevent subsequent instances from being started - currently
we can handle up to 128 concurrent crosvm instances per runner.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>
Cristian Ciocaltea [Mon, 28 Feb 2022 19:26:31 +0000 (21:26 +0200)]
ci: Increase limit of concurrent crosvm instances per runner
Ensure we can handle up to 128 concurrent crosvm instances per runner
with the current CID generator. This is a safety margin for the new
64-core runners.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>
Dave Airlie [Mon, 7 Mar 2022 07:03:25 +0000 (17:03 +1000)]
gallivm/nir: extract a valid texture index according to exec_mask.
When using indirect textures, some lanes may not be active,
particularly in a loop, so as with some other areas, extracting
the correct lane is needed here. This extracts the last valid one.
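As a scalar model of the idea (the real code emits LLVM IR over vector
masks, which this does not attempt to reproduce), picking the value from
the last active lane could look like the sketch below. The lane count,
the function name, and the lane 0 fallback for an all-inactive mask are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_LANES 8

/* Return the per-lane value held by the highest-numbered active lane.
 * Bit i of exec_mask set means lane i is active.
 * Falls back to lane 0 if no lane is active (an assumption of this sketch). */
static uint32_t
demo_last_active_lane_value(const uint32_t values[DEMO_NUM_LANES],
                            uint32_t exec_mask)
{
   assert(values);

   uint32_t result = values[0];
   for (unsigned lane = 0; lane < DEMO_NUM_LANES; lane++) {
      if (exec_mask & (1u << lane))
         result = values[lane];
   }
   return result;
}
```

Reading only from active lanes matters because inactive lanes may hold
stale or out-of-range texture indices.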
KHR-GL45.texture_barrier.* on zink.
Fixes: e168d148d76d ("gallivm/nir: handle non-uniform texture offsets")
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15259>
Samuel Pitoiset [Mon, 7 Mar 2022 13:49:00 +0000 (14:49 +0100)]
radv/ci: remove unused files
These files are no longer used.
Fixes: cc327a0fe45 ("amd, ci: Remove unused runners.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Charlie Turner <cturner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15264>
Ilia Mirkin [Tue, 16 Nov 2021 23:19:37 +0000 (18:19 -0500)]
freedreno: add a420 deqp-runner files
This doesn't actually get run in CI, but this helps track outstanding
issues / expectations. This is from a run on my IFC6540 with A420.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>
Ilia Mirkin [Sun, 14 Nov 2021 18:06:49 +0000 (13:06 -0500)]
freedreno/a4xx: expose shaders and images, as well as ES 3.1
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>
Ilia Mirkin [Sat, 4 Dec 2021 00:06:12 +0000 (19:06 -0500)]
freedreno/ir3: disable conversion folding on a4xx
Experiments suggest that e.g.
add.u r0.y, hr0.x, hr0.y
will result in the summed value in both the high and low words of r0.y.
This only happens with odd registers, not even ones (r0.x works fine).
Seen in the bit_count lowering (which turns out to be unnecessary, but
this is still a larger problem).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>
Ilia Mirkin [Fri, 3 Dec 2021 08:04:19 +0000 (03:04 -0500)]
freedreno/ir3: no need to count bits 16b at a time for a4xx
This also works out nicely since a4xx has some sort of problem with the
16b-based lowering.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>
Ilia Mirkin [Sat, 20 Nov 2021 07:30:37 +0000 (02:30 -0500)]
freedreno/a4xx: improve condition for disabling early z
This helps some subtests in the early-z piglit test, but leaves one
occlusion-based test still failing.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>
Ilia Mirkin [Wed, 17 Nov 2021 00:10:46 +0000 (19:10 -0500)]
freedreno/a4xx: extend astc and tg4 workarounds to compute shaders
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>
Mike Blumenkrantz [Mon, 7 Mar 2022 23:00:42 +0000 (18:00 -0500)]
Revert "lavapipe: accurately set image/ssbo access based on shader usage"
This reverts commit 821a49981ff386559f8a8fdf6bf3526b8deb2415.
still flaky
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15271>
Matt Turner [Thu, 3 Mar 2022 23:30:37 +0000 (15:30 -0800)]
intel/perf: Destination array calculation into function
Cuts 119 KiB from iris_dri.so and libvulkan_intel.so.
text data bss dec hex filename
917511 0 0 917511 e0007 meson-generated_.._intel_perf_metrics.c.o (before)
796986 0 0 796986 c293a meson-generated_.._intel_perf_metrics.c.o (after)
text data bss dec hex filename
14130948 365708 210004 14706660 e067e4 iris_dri.so (before)
14009332 365708 210004 14585044 de8cd4 iris_dri.so (after)
text data bss dec hex filename
8124225 214264 22820 8361309 7f955d libvulkan_intel.so (before)
8002609 214264 22820 8239693 7dba4d libvulkan_intel.so (after)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
Matt Turner [Thu, 3 Mar 2022 07:28:18 +0000 (23:28 -0800)]
intel/perf: Fix mistake in description string
Along with fixing the grammar, this allows it to be deduplicated since
the properly worded description exists in later generations' XMLs.
Cuts 96 B from iris_dri.so and libvulkan_intel.so.
text data bss dec hex filename
917613 0 0 917613 e006d meson-generated_.._intel_perf_metrics.c.o (before)
917511 0 0 917511 e0007 meson-generated_.._intel_perf_metrics.c.o (after)
text data bss dec hex filename
14131044 365708 210004 14706756 e06844 iris_dri.so (before)
14130948 365708 210004 14706660 e067e4 iris_dri.so (after)
text data bss dec hex filename
8124321 214264 22820 8361405 7f95bd libvulkan_intel.so (before)
8124225 214264 22820 8361309 7f955d libvulkan_intel.so (after)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
Matt Turner [Wed, 2 Mar 2022 02:49:26 +0000 (18:49 -0800)]
intel/perf: Mark intel_perf_counter_* enums as PACKED
Reduces their sizes from 4 bytes to 1. Cuts 6 KiB from iris_dri.so and
libvulkan_intel.so.
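A small sketch of the size effect, assuming the PACKED macro maps to the
GCC/Clang packed attribute. The enum and struct names are hypothetical
and do not mirror the real intel_perf_counter_* definitions; an unpacked
enum is typically 4 bytes on these compilers, while a packed one uses
the smallest integer type that fits its values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enums for illustration only. */
enum demo_type_plain {
   DEMO_PLAIN_EVENT,
   DEMO_PLAIN_DURATION,
   DEMO_PLAIN_RAW,
};

/* GCC/Clang: shrink the enum to the smallest type that holds all values. */
enum __attribute__((__packed__)) demo_type_packed {
   DEMO_PACKED_EVENT,
   DEMO_PACKED_DURATION,
   DEMO_PACKED_RAW,
};

struct demo_counter_plain {
   enum demo_type_plain type;
   enum demo_type_plain data_type;
};

struct demo_counter_packed {
   enum demo_type_packed type;
   enum demo_type_packed data_type;
};

/* Bytes saved per table entry by switching to the packed enums. */
static size_t
demo_bytes_saved_per_counter(void)
{
   assert(sizeof(struct demo_counter_packed) <= sizeof(struct demo_counter_plain));
   return sizeof(struct demo_counter_plain) - sizeof(struct demo_counter_packed);
}
```

The saving is per entry, so it only adds up when the generated tables
hold many counters.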
text data bss dec hex filename
924401 0 0 924401 e1af1 meson-generated_.._intel_perf_metrics.c.o (before)
917613 0 0 917613 e006d meson-generated_.._intel_perf_metrics.c.o (after)
text data bss dec hex filename
14137732 365708 210004 14713444 e08264 iris_dri.so (before)
14131044 365708 210004 14706756 e06844 iris_dri.so (after)
text data bss dec hex filename
8131009 214264 22820 8368093 7fafdd libvulkan_intel.so (before)
8124321 214264 22820 8361405 7f95bd libvulkan_intel.so (after)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
Matt Turner [Mon, 31 Jan 2022 21:16:26 +0000 (13:16 -0800)]
intel/perf: Store indices to strings rather than pointers
The compiler does a good job of deduplicating strings already, but we
can eliminate the pointers to each string by combining the strings into
a single char array and storing only an index into that array.
The longest of the char arrays is the descriptions array, which is a
little over 45 KiB, so still under MSVC's 64 KiB string literal limit
[0]. Because the string length is under 64 KiB we can use uint16_t as
the index type, which roughly doubles our savings as compared to an int.
This cuts 77 KiB from iris_dri.so (0.5%) and libvulkan_intel.so (0.9%).
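The technique can be sketched like this, with made-up counter names and
hand-computed offsets (in practice a generator script would emit both).
Every string lives in one char array, and each entry stores a 2-byte
offset instead of a pointer, which also removes one relocation per
string reference.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names in one array, each NUL-terminated. Names are hypothetical. */
static const char demo_names[] =
   "GPU Busy\0"   /* offset 0  */
   "EU Active\0"  /* offset 9  */
   "EU Stall";    /* offset 19 */

/* One table entry: a 16-bit offset instead of a const char pointer. */
struct demo_counter {
   uint16_t name_idx;
};

static const struct demo_counter demo_counters[] = {
   { 0 }, { 9 }, { 19 },
};

/* Resolve an entry's name by indexing into the shared array. */
static const char *
demo_counter_name(const struct demo_counter *c)
{
   assert(c->name_idx < sizeof(demo_names));
   return &demo_names[c->name_idx];
}
```

uint16_t is only safe while the combined array stays below 64 KiB, which
matches the size limit mentioned above.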
text data bss dec hex filename
926811 25920 0 952731 e899b meson-generated_.._intel_perf_metrics.c.o (before)
924401 0 0 924401 e1af1 meson-generated_.._intel_perf_metrics.c.o (after)
text data bss dec hex filename
14190852 391628 210004 14792484 e1b724 iris_dri.so (before)
14137732 365708 210004 14713444 e08264 iris_dri.so (after)
text data bss dec hex filename
8184097 240184 22820 8447101 80e47d libvulkan_intel.so (before)
8131009 214264 22820 8368093 7fafdd libvulkan_intel.so (after)
relinfo:
iris_dri.so (before): 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (before): 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) : 8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
[0] https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp?view=msvc-170&viewFallbackFrom=vs-2019
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
Matt Turner [Thu, 3 Mar 2022 20:24:02 +0000 (12:24 -0800)]
intel/perf: Use slimmer intel_perf_query_counter_data struct
intel_perf_query_counter contains fields for things we can't or don't
want to store in our static data (like runtime-determined max values) or
oa_read_counter function pointers which are dependent on the GPU gen and
would make deduplication very ineffective.
Cuts 16 KiB from iris_dri.so and libvulkan_intel.so.
text data bss dec hex filename
926811 43200 0 970011 ecd1b meson-generated_.._intel_perf_metrics.c.o (before)
926811 25920 0 952731 e899b meson-generated_.._intel_perf_metrics.c.o (after)
text data bss dec hex filename
14190852 408908 210004 14809764 e1faa4 iris_dri.so (before)
14190852 391628 210004 14792484 e1b724 iris_dri.so (after)
text data bss dec hex filename
8184097 257464 22820 8464381 8127fd libvulkan_intel.so (before)
8184097 240184 22820 8447101 80e47d libvulkan_intel.so (after)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>
Matt Turner [Thu, 3 Mar 2022 01:53:02 +0000 (17:53 -0800)]
intel/perf: Use a function to initialize perf counters
And specifically mark it with ATTRIBUTE_NOINLINE. Otherwise it will be
inlined and actually slightly increase code size.
Cuts 505 KiB from iris_dri.so and libvulkan_intel.so.
text data bss dec hex filename
1538720 0 0 1538720 177aa0 meson-generated_.._intel_perf_metrics.c.o (before)
926811 43200 0 970011 ecd1b meson-generated_.._intel_perf_metrics.c.o (after)
text data bss dec hex filename
14751700 365708 210004 15327412 e9e0b4 iris_dri.so (before)
14190852 408908 210004 14809764 e1faa4 iris_dri.so (after)
text data bss dec hex filename
8744913 214264 22820 8981997 890ded libvulkan_intel.so (before)
8184097 257464 22820 8464381 8127fd libvulkan_intel.so (after)
Relocations increase because the counter initializations are moved from
code (in .text) to pointers (in .text) to .rodata, which require
relocations.
relinfo:
iris_dri.so (before): 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (before): 8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) : 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>