Konstantin Seurer [Tue, 28 Dec 2021 19:10:51 +0000 (20:10 +0100)]
panvk: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
Konstantin Seurer [Tue, 28 Dec 2021 19:07:34 +0000 (20:07 +0100)]
turnip: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
Konstantin Seurer [Tue, 28 Dec 2021 19:04:48 +0000 (20:04 +0100)]
anv: Fixed maxFragmentCombinedOutputResources
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
Konstantin Seurer [Tue, 28 Dec 2021 18:56:16 +0000 (19:56 +0100)]
lavapipe: Fixed maxFragmentCombinedOutputResources
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>
Rhys Perry [Thu, 6 Jan 2022 17:43:44 +0000 (17:43 +0000)]
ac/nir: fix store_buffer_amd write_masks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>
Rhys Perry [Fri, 7 Jan 2022 10:58:19 +0000 (10:58 +0000)]
nir/lower_shader_calls: fix store_scratch write_mask
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>
Lucas Stach [Fri, 10 Dec 2021 21:46:37 +0000 (22:46 +0100)]
etnaviv: drm: defer destruction of softpin BOs
When destroying a BO with a userspace managed address and thus freeing
the VMA space, we need to make sure that the BO isn't in use by any
active submit anymore, as the kernel will rightfully reject the next
submit that re-uses the still active VMA. Keep the BO alive as long
as it isn't fully idle to prevent the VMA being reused prematurely.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
Lucas Stach [Fri, 10 Dec 2021 21:35:34 +0000 (22:35 +0100)]
etnaviv: drm: rename _etna_bo_del
Rename it to a somwhat more descriptive name, which makes it easier
to distinguish between the etna_bo_del function in the public interface
and the internal function. Also remove the duplicated forward declaration
and move it to the common interal header.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
Lucas Stach [Fri, 10 Dec 2021 21:17:15 +0000 (22:17 +0100)]
etnaviv: drm: export BO idle check function
The ability to check if a BO is idle is not only useful in the
buffer cache, but also in other parts of the winsys and even the
pipe driver. Make this functionality available in the interface.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
Lucas Stach [Fri, 10 Dec 2021 22:55:35 +0000 (23:55 +0100)]
etnaviv: drm: properly handle reviving BOs via a lookup
If a BO is removed from a cache bucket list via a lookup, we must
handle it in the same way as if a allocation from the cache happened:
tell valgrind that the buffer is active again and take a reference
to the etna_device, which the BO had given up while being in the
cache.
Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>
Lucas Stach [Thu, 6 Jan 2022 17:58:01 +0000 (18:58 +0100)]
etnaviv: drm: fix size limit in etna_cmd_stream_realloc
The intended limit for command stream size is 64KB, as this is what old
kernels can reliably do and what allows for maximum number of queued
streams on newer kernels. However, due to unit confusion with the size
member, which is in dwords, the submitted streams could grow up to
~128KB. Fix this by using the proper limit in dwords.
Flushing due to some limits being exceeded is not an issue, but is
expected with certain workloads, so lower the severity of the message
being emitted in this case to debug level.
Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14425>
Lucas Stach [Fri, 7 Jan 2022 11:53:21 +0000 (12:53 +0100)]
egl/wayland: break double/tripple buffering feedback loops
Currently we dispose any unneeded color buffers immediately if we detect that
there are more unlocked buffers than we need. This can lead to feedback loops
between the compositor and the application causing rapid toggling between
double and tripple buffering.
Scenario: 2 buffers already queued to the compositor, egl/wayland allocates a
new back buffer to avoid throttling, slowing down the frame. This allows the
compositor to catch up and unlock both buffers. EGL detects that there are
more buffers than currently needed, freeing the buffer, restarting the loop
shortly after.
To avoid wasting CPU time on rapidly freeing and reallocating color buffers
break those feedback loops by letting the unneeded buffers sit around for a
short while before disposing them.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14451>
Danylo Piliaiev [Fri, 26 Nov 2021 16:57:52 +0000 (18:57 +0200)]
tu,ir3: Implement VK_KHR_shader_integer_dot_product
- gen4 - has dp4acc and dp2acc, dp4acc is used to implement
4x8 dot product.
- gen3 - has dp2acc, in OpenCL blob uses dp2acc for dot product
on both get3 and gen4.
- gen2 - unknown, lower everything.
- gen1 - no dp2acc, lower everything. OpenCL blob doesn't advertise
cl_qcom_dot_product8 but still generates code for it.
The assembly is more verbose and uses yet to be documented
mad32.u16 instruction.
Passes:
dEQP-VK.spirv_assembly.instruction.compute.opsdotkhr.*
dEQP-VK.spirv_assembly.instruction.compute.opudotkhr.*
dEQP-VK.spirv_assembly.instruction.compute.opsudotkhr.*
dEQP-VK.spirv_assembly.instruction.compute.opsdotaccsatkhr.*
dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.*
dEQP-VK.spirv_assembly.instruction.compute.opsudotaccsatkhr.*
Only packed 4x8 unsigned and mixed versions are accelerated.
However in theory we should be able to do better for signed version
than current NIR lowering.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
Danylo Piliaiev [Tue, 30 Nov 2021 16:06:53 +0000 (18:06 +0200)]
ir3: Make nir compiler options a part of ir3_compiler
This would allow for sub-gens to have different options.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
Danylo Piliaiev [Fri, 26 Nov 2021 17:27:03 +0000 (19:27 +0200)]
nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
Danylo Piliaiev [Wed, 24 Nov 2021 12:57:03 +0000 (14:57 +0200)]
ir3: New cat3 instructions
* shrm - (src2 >> src1) & src3
* shlm - (src2 << src1) & src3
* shrg - (src2 >> src1) | src3
* shlg - (src2 << src1) | src3
* andg - (src2 & src1) | src3
* dp2acc - dot product of two {i,u}8vec2 packed into
SRC1 and SRC2, added to 32b SRC3
* dp4acc - dot product of two {i,u}8vec4 packed into
SRC1 and SRC2, added to 32b SRC3
* wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is
duplicated (1 << (SRC3 / 32)) times starting from DST register
* wmm.accu - same as wmm but result is added to DST registers, however
the first reg in each vec4 result is overwritten instead of
accumulating.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
Connor Abbott [Thu, 25 Nov 2021 16:02:42 +0000 (17:02 +0100)]
tu: Implement VK_EXT_subgroup_size_control
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
Connor Abbott [Thu, 25 Nov 2021 15:55:01 +0000 (16:55 +0100)]
tu, ir3: Support runtime gl_SubgroupSize in FS
We already supported it in the CS for computing the subgroup ID, but
soon we'll need it in the FS too. Vertex stages will always have it
lowered.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
Connor Abbott [Thu, 25 Nov 2021 14:17:36 +0000 (15:17 +0100)]
ir3: Add wavesize control
This allows the wavesize to be controlled per-shader. This will be used
by VK_EXT_subgroup_size_control, and freedreno will also need it if
legacy ARB_shader_ballot is to be supported (since it forces a wavesize
of 64 or less).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
Connor Abbott [Thu, 25 Nov 2021 14:16:36 +0000 (15:16 +0100)]
ir3: Pass shader to ir3_nir_post_finalize()
We'll need to add shader-specific lowering for gl_SubgroupSize.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
Connor Abbott [Thu, 25 Nov 2021 13:30:46 +0000 (14:30 +0100)]
ir3, freedreno: Add options struct for ir3_shader_from_nir()
We'll expand this in a moment.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>
Danylo Piliaiev [Thu, 7 Oct 2021 13:02:16 +0000 (16:02 +0300)]
tu: fix workaround for depth bounds test without depth test
Fixes:
bb4db22ff43a708bf80a8f72913ee493313393d1
("turnip: apply workaround for depth bounds test without depth test")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14390>
Lionel Landwerlin [Thu, 6 Jan 2022 09:03:36 +0000 (11:03 +0200)]
anv: limit compiler valid color outputs using NIR variables
This fixes a test from the vkd3d-proton test_dual_source_blending_dxbc
test which asserts in the backend with :
brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend || key->nr_color_regions == 1' failed.
This is because there is 2 color attachments provided by the
renderpass so we initially set nr_color_regions = 2. But once we've
parsed the shader, we can see it's only using one output (with dual
source color blending).
This change looks at the output variables to update the valid output
variables.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417>
Tapani Pälli [Tue, 4 Jan 2022 09:26:55 +0000 (11:26 +0200)]
iris: unref syncobjs and free r/w dependencies array for slab entries
Fixes memory leak with dependencies array:
==5224== 104 (96 direct, 8 indirect) bytes in 3 blocks are definitely lost in loss record 1,954 of 2,035
==5224== at 0x484178A: malloc (vg_replace_malloc.c:380)
==5224== by 0x484670B: realloc (vg_replace_malloc.c:1437)
==5224== by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819)
==5224== by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898)
==5224== by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031)
==5224== by 0x14DB77D0: iris_transfer_map (iris_resource.c:2348)
==5224== by 0x157786FD: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
==5224== by 0x14C479E7: tc_buffer_map (u_threaded_context.c:2252)
==5224== by 0x1434F3F8: pipe_buffer_map_range (u_inlines.h:393)
==5224== by 0x1435094A: _mesa_bufferobj_map_range (bufferobj.c:491)
==5224== by 0x143586D9: map_buffer_range (bufferobj.c:3737)
==5224== by 0x14358DA3: _mesa_MapBuffer (bufferobj.c:3947)
==5224== 240 (192 direct, 48 indirect) bytes in 6 blocks are definitely lost in loss record 1,984 of 2,035
==5224== at 0x484178A: malloc (vg_replace_malloc.c:380)
==5224== by 0x484670B: realloc (vg_replace_malloc.c:1437)
==5224== by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819)
==5224== by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898)
==5224== by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031)
==5224== by 0x14FF72CC: iris_get_query_result (iris_query.c:631)
==5224== by 0x14C4396A: tc_get_query_result (u_threaded_context.c:880)
==5224== by 0x1458F4F7: get_query_result (st_cb_queryobj.c:273)
==5224== by 0x1458F7EB: st_WaitQuery (st_cb_queryobj.c:352)
==5224== by 0x144EFF66: get_query_object (queryobj.c:742)
==5224== by 0x144F01AE: _mesa_GetQueryObjectuiv (queryobj.c:811)
And leak with syncobjs:
==13644== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1,846
==13644== at 0x484186F: malloc (vg_replace_malloc.c:381)
==13644== by 0x639789B: iris_create_syncobj (iris_fence.c:69)
==13644== by 0x63B213A: iris_batch_reset (iris_batch.c:512)
==13644== by 0x63B3637: _iris_batch_flush (iris_batch.c:1056)
==13644== by 0x65EF2BC: iris_get_query_result (iris_query.c:631)
==13644== by 0x623B970: tc_get_query_result (u_threaded_context.c:880)
==13644== by 0x5B874F7: get_query_result (st_cb_queryobj.c:273)
==13644== by 0x5B877EB: st_WaitQuery (st_cb_queryobj.c:352)
==13644== by 0x5AE7F66: get_query_object (queryobj.c:742)
==13644== by 0x5AE8150: _mesa_GetQueryObjectiv (queryobj.c:801)
Fixes:
ce2e2296ab6 ("iris: Suballocate BO using the Gallium pb_slab mechanism")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14387>
Christian Gmeiner [Fri, 7 Jan 2022 06:46:10 +0000 (07:46 +0100)]
iris/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14442>
Christian Gmeiner [Fri, 7 Jan 2022 06:41:48 +0000 (07:41 +0100)]
i915g/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14439>
Emma Anholt [Fri, 7 Jan 2022 17:44:43 +0000 (09:44 -0800)]
ci: Shrink container/rootfs sizes.
Cutting the extra VK mustpass files is 315MB out of 1.5GB of the amd64
rootfs. pip was 10MB. The rustup toolchains were massive (over a GB
IIRC) on the x86 container images.
Hopefully helps with #5837
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14460>
Yiwei Zhang [Fri, 7 Jan 2022 18:41:43 +0000 (18:41 +0000)]
venus: subtract appended header size in vn_CreatePipelineCache
Use header->header_size to offset cache data as well in case the header
struct extends on a newer driver but the cache data was appended with
an old header.
Fixes:
723f0bf74a3 ("venus: initial support for module and pipelines")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14463>
Danylo Piliaiev [Tue, 7 Dec 2021 14:43:21 +0000 (16:43 +0200)]
ir3: Assert that we cannot have enough concurrent waves for CS with barrier
If we have a compute shader that has a big workgroup, a barrier, and
a branchstack which limits max_waves - this may result in a situation
when we cannot run concurrently all waves of the workgroup, which
would lead to a hang.
Blob just explodes in such case.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>
Danylo Piliaiev [Tue, 7 Dec 2021 13:15:23 +0000 (15:15 +0200)]
ir3: Be able to reduce register limit for RA when CS has barriers
If barriers are used, it must be possible for all waves in the workgroup
to execute concurrently. Thus we may have to reduce the registers limit.
Fixes a hang in "Digital Combat Simulator".
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>
Hoe Hao Cheng [Fri, 7 Jan 2022 04:51:54 +0000 (12:51 +0800)]
zink/codegen: remove bogus print statement
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>
Hoe Hao Cheng [Fri, 7 Jan 2022 04:50:44 +0000 (12:50 +0800)]
zink/codegen: remove core_since in constructor
the variable is now automatically filled in according to registry values
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>
Hoe Hao Cheng [Fri, 7 Jan 2022 04:36:41 +0000 (12:36 +0800)]
zink/codegen: support platform tags
Some extensions are locked behind certain platforms, don't include them
if the extension is unsupported.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>
Lionel Landwerlin [Thu, 6 Jan 2022 13:58:17 +0000 (15:58 +0200)]
anv: don't leave anv_batch fields undefined
Because the extend_cb vfunc is not initialized, there is a risk that
the emission code calls into a random pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14418>
Gert Wollny [Tue, 14 Dec 2021 11:24:27 +0000 (12:24 +0100)]
ntt: Set the output invariant flag according to the semantics
This is used by virglrenderer to create the correct shaders on the
host. Fixes:
dEQP-GLES31.functional.primitive_bounding_box.triangles.tessellation_set_per_primitive.vertex_tessellation_fragment.fbo
when using ntt with virgl.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
Gert Wollny [Tue, 14 Dec 2021 11:22:48 +0000 (12:22 +0100)]
nir_lower_io: propagate the "invariant" flag to outputs
Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer
to correctly declare output variables.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
Gert Wollny [Wed, 5 Jan 2022 14:17:36 +0000 (15:17 +0100)]
util/primconvert: map only index buffer part that is needed
By putting vertex store and indices all in one buffer the larger part
of the shared buffer might actually only be vertex data we are not
interested in. Hence only map the part of the buffer that contains the
index data for the currently active draw command.
This helps drivers where a mapping operation is expensive, like e.g. virgl.
v2: - add comment about ranged buffer mapping (Pierre-Eric)
- keep passing direct_draws[i].start to direct_draw_func, it looks
like the "start" parameter is properly set in
util_prim_restart_convert_to_direct
v3: Fix ws error (Mike)
Related: #5825
Fixes:
f9d12bf50e9c09a5896d9419f82d9f510982bc5d
vbo/dlist: use a single buffer object
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>
Christian Gmeiner [Fri, 7 Jan 2022 07:24:31 +0000 (08:24 +0100)]
etnaviv/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14441>
Rhys Perry [Fri, 7 Jan 2022 11:17:01 +0000 (11:17 +0000)]
radv: increase maxTaskOutputCount to 65535
This is the minimum required by the spec.
Fixes dEQP-VK.api.info.vulkan1p2_limits_validation.nv_mesh_shader
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14446>
Connor Abbott [Fri, 17 Dec 2021 18:48:49 +0000 (19:48 +0100)]
ir3: Use (ss) for instructions writing shared regs
The blob uses *both* nops and (ss). It turns out that in some rare cases
the hardware does take more than 6 cycles, at least for movmsk, but
adding nops is unnecessary. I believe the extra nops are only there due
to the immaturity of the blob's implementation of subgroup ops, so we
don't have to copy them - just handle shared reg producers the same as
SFU instructions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Connor Abbott [Fri, 17 Dec 2021 19:08:03 +0000 (20:08 +0100)]
ir3/postsched: Rename tex/sfu to sy/ss
Analogous to the previous commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Connor Abbott [Fri, 17 Dec 2021 19:03:02 +0000 (20:03 +0100)]
ir3/sched: Rename tex/sfu to sy/ss
This now covers e.g. cat6 instructions as well, and ss will cover
instructions writing shared regs as well. This is split out from the
previous change to avoid too much churn and shouldn't cause any
functional changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Connor Abbott [Fri, 17 Dec 2021 16:51:36 +0000 (17:51 +0100)]
ir3: Use new (sy)/(ss) stall helpers in the compiler
This fixes a few bad assumptions in the pre-RA and post-RA scheduler,
for example that (sy) is only for texture instructions and (ss) is only
for SFU instructions and (sy) and (ss) producers will always take the
same number of cycles. This means we now start doing latency hiding for
cat6 instructions like ldib and ldc. It also should make us hide latency
more aggressively, since the number used for (sy) stall cycles was way
lower than the real numbers for everything except ldc. Finally it
unifies the various places (ss) soft nops were calculated.
selected shader-db results:
total nops in shared programs: 345278 -> 358959 (3.96%)
nops in affected programs: 215622 -> 229303 (6.34%)
helped: 690
HURT: 2430
helped stats (abs) min: 1 max: 125 x̄: 11.40 x̃: 5
helped stats (rel) min: 0.53% max: 100.00% x̄: 24.19% x̃: 18.52%
HURT stats (abs) min: 1 max: 501 x̄: 8.87 x̃: 5
HURT stats (rel) min: 0.00% max: 9900.00% x̄: 52.36% x̃: 14.29%
95% mean confidence interval for nops value: 3.78 4.99
95% mean confidence interval for nops %-change: 28.21% 42.66%
Nops are HURT.
total mov in shared programs: 75049 -> 74110 (-1.25%)
mov in affected programs: 15754 -> 14815 (-5.96%)
helped: 566
HURT: 455
helped stats (abs) min: 1 max: 36 x̄: 4.52 x̃: 3
helped stats (rel) min: 0.83% max: 100.00% x̄: 35.85% x̃: 30.00%
HURT stats (abs) min: 1 max: 35 x̄: 3.55 x̃: 3
HURT stats (rel) min: 0.00% max: 1100.00% x̄: 63.60% x̃: 25.00%
95% mean confidence interval for mov value: -1.25 -0.58
95% mean confidence interval for mov %-change: 2.92% 14.02%
Inconclusive result (value mean confidence interval and %-change mean
confidence interval disagree).
total last-baryf in shared programs: 80468 -> 67670 (-15.90%)
last-baryf in affected programs: 63676 -> 50878 (-20.10%)
helped: 309
HURT: 147
helped stats (abs) min: 1 max: 260 x̄: 49.20 x̃: 24
helped stats (rel) min: 0.60% max: 98.81% x̄: 37.92% x̃: 40.91%
HURT stats (abs) min: 1 max: 115 x̄: 16.35 x̃: 12
HURT stats (rel) min: 0.96% max: 1933.33% x̄: 45.55% x̃: 7.89%
95% mean confidence interval for last-baryf value: -33.03 -23.10
95% mean confidence interval for last-baryf %-change: -21.52% -0.50%
Last-baryf are helped.
total sstall in shared programs: 133997 -> 126398 (-5.67%)
sstall in affected programs: 86866 -> 79267 (-8.75%)
helped: 1893
HURT: 598
helped stats (abs) min: 1 max: 77 x̄: 6.06 x̃: 4
helped stats (rel) min: 0.71% max: 100.00% x̄: 32.82% x̃: 16.67%
HURT stats (abs) min: 1 max: 65 x̄: 6.47 x̃: 6
HURT stats (rel) min: 0.00% max: 900.00% x̄: 65.51% x̃: 25.00%
95% mean confidence interval for sstall value: -3.39 -2.71
95% mean confidence interval for sstall %-change: -12.19% -6.24%
Sstall are helped.
total systall in shared programs: 350304 -> 288234 (-17.72%)
systall in affected programs: 234855 -> 172785 (-26.43%)
helped: 1456
HURT: 260
helped stats (abs) min: 1 max: 574 x̄: 46.42 x̃: 27
helped stats (rel) min: 0.19% max: 100.00% x̄: 39.43% x̃: 36.06%
HURT stats (abs) min: 1 max: 757 x̄: 21.20 x̃: 8
HURT stats (rel) min: 0.00% max: 180.95% x̄: 24.82% x̃: 12.50%
95% mean confidence interval for systall value: -39.31 -33.03
95% mean confidence interval for systall %-change: -31.49% -27.90%
Systall are helped.
total waves in shared programs: 236732 -> 235142 (-0.67%)
waves in affected programs: 6142 -> 4552 (-25.89%)
helped: 535
HURT: 17
helped stats (abs) min: 2 max: 8 x̄: 3.08 x̃: 2
helped stats (rel) min: 12.50% max: 75.00% x̄: 28.78% x̃: 25.00%
HURT stats (abs) min: 2 max: 6 x̄: 3.53 x̃: 4
HURT stats (rel) min: 16.67% max: 75.00% x̄: 37.35% x̃: 33.33%
95% mean confidence interval for waves value: -3.04 -2.72
95% mean confidence interval for waves %-change: -28.10% -25.39%
Waves are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Connor Abbott [Fri, 17 Dec 2021 16:40:02 +0000 (17:40 +0100)]
ir3: Introduce systall metric and new helper functions
Add new centralized functions which will replace the various places we
hardcode 10 for the number of (ss) nops, add numbers for soft (sy) nops
based on similar computerator experiments with ldc, sam, and ldib (the
most common (sy) producers), and add a "systall" metric which is
analogous to sstall. This also fixes some cases where we'd erroniously
count ldl* as (sy) producers instead of (ss) producers when calculating
sstall.
This only switches over the metric reporting to the new functions, so
there is no behavior change. The following commit will switch over
the rest of the compiler.
While we're at it, remove max_sun as it's never set.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Connor Abbott [Thu, 6 Jan 2022 19:16:44 +0000 (20:16 +0100)]
ir3: Bump type mismatch penalty to 3
After some experimentation with computerator, it seems on a618 that
writing a full register and then reading half of it as a half register
requires a delay of 6, the same as the delay for cat5/cat6 sources. The
other direction only has a delay of 5, but just bump it unconditionally
out of an abundance of caution.
Fixes:
890de1a4360 ("ir3/delay: Fix full->half and half->full delay")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Connor Abbott [Wed, 22 Dec 2021 17:51:25 +0000 (18:51 +0100)]
ir3/ra: Fix logic bug in compress_regs_left
If we're allocating a source then we force is_killed to false, not to
true. Fixes a regression in
dEQP-GLES31.functional.synchronization.in_invocation.image_atomic_write_read
later.
Fixes:
0ffcb19b9d9 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>
Tomeu Vizoso [Fri, 17 Dec 2021 13:57:27 +0000 (14:57 +0100)]
anv/tests: Free BO cache and device mutex
Was getting ASAN errors in CI when trying to add ANV to the
debian-testing job:
==10993==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 4194304 byte(s) in 64 object(s) allocated from:
#0 0x7f763c1bda3c in __interceptor_posix_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:226
#1 0x55f43d28627f in os_malloc_aligned ../src/util/os_memory_aligned.h:58
#2 0x55f43d28627f in _util_sparse_array_node_alloc ../src/util/sparse_array.c:107
#3 0x55f43d28627f in util_sparse_array_get ../src/util/sparse_array.c:143
#4 0x55f43d1fdaba in anv_device_lookup_bo ../src/intel/vulkan/anv_private.h:1335
#5 0x55f43d1fdaba in anv_device_import_bo_from_host_ptr ../src/intel/vulkan/anv_allocator.c:1843
#6 0x55f43d1ff571 in anv_block_pool_expand_range ../src/intel/vulkan/anv_allocator.c:534
#7 0x55f43d1ffcb5 in anv_block_pool_init ../src/intel/vulkan/anv_allocator.c:417
#8 0x55f43d18f082 in run_test ../src/intel/vulkan/tests/block_pool_no_free.c:123
#9 0x55f43d1862b6 in main ../src/intel/vulkan/tests/block_pool_no_free.c:152
#10 0x7f763b942d09 in __libc_start_main ../csu/libc-start.c:308
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14121>
Tomeu Vizoso [Mon, 6 Dec 2021 09:01:52 +0000 (10:01 +0100)]
anv/ci: Test with deqp-vk on Tiger Lake
Run half of the CTS in 10 Volteer Chromebook devices.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14121>
Jesse Natalie [Tue, 4 Jan 2022 20:06:02 +0000 (12:06 -0800)]
shader_info: tess.spacing needs to be unsigned
Otherwise MSVC will treat the bit-packed enum values as signed.
Reviewed-by: Marek Ol\9aák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14402>
Philipp Zabel [Wed, 15 Sep 2021 11:18:18 +0000 (13:18 +0200)]
etnaviv: fix emit_if in case the else block ends in a jump
Fixes piglit test shaders@ssa@fs-if-def-else-break.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12892>
Rohan Garg [Mon, 22 Nov 2021 14:45:14 +0000 (15:45 +0100)]
intel/fs: OpImageQueryLod does not support arrayed images as an operand
When we lower SPIR-V to NIR for textures in vtn_handle_texture, we only
bump the number of coordinate components when the op is not a lod query.
Update the assert to take this into account.
This fixes:
- dEQP-VK.robustness.robustness2.bind.template.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.bind.template.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
- dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag
Fixes:
231337a1 ("intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.")
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13925>
Emma Anholt [Tue, 14 Dec 2021 22:35:03 +0000 (14:35 -0800)]
nir_to_tgsi: Enable fdot_replicates flag.
That's how the TGSI math opcodes work.
This lets lower_vec_to_regs coalesce the DP output into the .yzw channels,
giving an impressive shader-db win on softpipe:
total instructions in shared programs: 2929840 -> 2794036 (-4.64%)
instructions in affected programs: 1651438 -> 1515634 (-8.22%)
total temps in shared programs: 372730 -> 332744 (-10.73%)
temps in affected programs: 118151 -> 78165 (-33.84%)
and a minor one on r300:
total instructions in shared programs: 51238 -> 51149 (-0.17%)
instructions in affected programs: 2621 -> 2532 (-3.40%)
total vinst in shared programs: 15655 -> 15618 (-0.24%)
vinst in affected programs: 468 -> 431 (-7.91%)
total temps in shared programs: 9838 -> 9828 (-0.10%)
temps in affected programs: 59 -> 49 (-16.95%)
and a bigger one on i915g:
total instructions in shared programs: 398064 -> 395901 (-0.54%)
instructions in affected programs: 29271 -> 27108 (-7.39%)
total tex_indirect in shared programs: 12261 -> 12233 (-0.23%)
tex_indirect in affected programs: 98 -> 70 (-28.57%)
LOST: 0
GAINED: 5
The r300 change is less impressive because it does some backend copy-prop,
but also because intermediate storage of DPs now takes a vec4 instead of a
scalar.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14200>
Christian Gmeiner [Thu, 6 Jan 2022 14:35:33 +0000 (15:35 +0100)]
panfrost/ci: update piglit fails
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14428>
Francisco Jerez [Fri, 10 Dec 2021 00:29:06 +0000 (16:29 -0800)]
intel/dev: Implement DG2 restrictions requiring additional DSSes to be disabled.
Note that this causes a geometry slice to be disabled if any DSS is
fused off within that slice, which may seem stricter than the BSpec
quotation implies, but testing shows that pixel pipes with any faulted
DSS don't work at all, and that using a slice with any faulted pixel
pipe leads to serious graphics corruption.
It would be better to query this geometry topology information from
the hardware instead of trying to reconstruct it here, but the kernel
interface for that is not available yet.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14436>
Francisco Jerez [Tue, 8 Jun 2021 23:53:54 +0000 (23:53 +0000)]
intel/dev: Add support for pixel pipe subslice accounting on multi-slice GPUs.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14436>
Francisco Jerez [Wed, 7 Apr 2021 21:23:21 +0000 (14:23 -0700)]
intel/dev: Fix size of device info num_subslices array.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14436>
Dave Airlie [Fri, 7 Jan 2022 02:47:08 +0000 (12:47 +1000)]
glsl/nir: don't pass gl_context to the convertor routine.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 02:37:56 +0000 (12:37 +1000)]
glsl/linker: remove a bunch more gl_context references.
This leaves only one reference used for the ctx->Driver.NewProgram
hook, this should likely be revisited.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 02:29:59 +0000 (12:29 +1000)]
glsl/linker: drop unused gl_context.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 02:28:01 +0000 (12:28 +1000)]
glsl/linker/uniform_blocks: don't pass gl_context around.
just pass the constants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 01:20:14 +0000 (11:20 +1000)]
glsl/nir/linker: avoid passing gl_context inside gl_nir linker
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 01:16:23 +0000 (11:16 +1000)]
glsl/linker: remove gl_context usage from more places.
Just pass in consts, exts, api
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 01:10:40 +0000 (11:10 +1000)]
glsl/linker: remove gl_context from check image resources
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 01:08:33 +0000 (11:08 +1000)]
glsl/linker: get rid of gl_context from atomic counters paths
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 01:00:21 +0000 (11:00 +1000)]
glsl/linker: get rid of gl_context from uniform assign paths
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 00:56:45 +0000 (10:56 +1000)]
glsl/linker: get rid of gl_context from link varyings
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 00:52:30 +0000 (10:52 +1000)]
glsl/linker: remove direct gl_context usage in favour of consts/exts/api
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 00:37:35 +0000 (10:37 +1000)]
glsl/linker: move more ctx->Consts to consts.
Don't pass gl contexts around as much
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 00:35:02 +0000 (10:35 +1000)]
glsl/linker: don't pass gl_context just for constants in xfb code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 00:33:06 +0000 (10:33 +1000)]
glsl: don't pass gl_context to lower shared references.
this uses the consts only
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Dave Airlie [Fri, 7 Jan 2022 00:31:10 +0000 (10:31 +1000)]
glsl/linker: cleanup passing gl_context unnecessarily
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14433>
Jesse Natalie [Tue, 4 Jan 2022 16:39:47 +0000 (08:39 -0800)]
nir_opt_dead_cf: Remove dead ifs
An if that looks like:
if (x) { } else { }
That has no phis following it is dead. Currently these are only
removed by peephole select, but that means that 'x' is considered
used until that pass is run, which can make it difficult to apply
sane lowering in the case where loading 'x' requires complex or
expensive transformations, but 'x' is *really* unused.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14400>
Jesse Natalie [Wed, 29 Dec 2021 00:31:48 +0000 (16:31 -0800)]
d3d12: Set appropriate caps for shader images
Note that currently there's no emulation if the D3D12 driver doesn't
support the "UAV typed load" feature for all of the GL required formats.
This is not a required D3D12 feature, so this support won't light up
on all D3D12 hardware.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Fri, 31 Dec 2021 01:54:05 +0000 (17:54 -0800)]
d3d12: Handle bitcasting of shader images
This is handled in 2 ways:
* For casts in the same "class," we can just do the cast. There's also
limited support for 4x8 => 1x32 and 2x16 => 1x32.
* For casts that are just by size, use a lowering pass. This reads the
data in the appropriate UINT format (except R11G11B10), retrieves
the original bits, re-packs it into the dest, and returns it to the
app for loads. For stores, the process is done in reverse - bits
for the final format are computed and re-expanded to the format that
should be used for the store.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 23:55:27 +0000 (15:55 -0800)]
d3d12: Handle memory barriers
This is a bit fragile. The algorithm is essentially:
- Let the driver track state for non-dual-bound resources.
- For resources that are dual-bound as SSBO/image and a second
bind point, assume they're being used as UAVs.
- When a MemoryBarrier is issued, dirty all destination bind points
so they re-assert their state, and if the destination is not UAV,
temporarily suppress UAV state re-assertion.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 17:59:33 +0000 (09:59 -0800)]
d3d12: Lower cube images to 2D arrays via existing int cubemap lowering pass
Note that the coordinates are not lowered, just types.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 00:29:55 +0000 (16:29 -0800)]
d3d12: Fill out shader image descriptor tables
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 00:29:41 +0000 (16:29 -0800)]
d3d12: Create textures as UAV-capable when appropriate
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 23:32:46 +0000 (15:32 -0800)]
d3d12: Handle set_shader_images
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 23:33:11 +0000 (15:33 -0800)]
d3d12: Handle images in the root signature
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 00:29:13 +0000 (16:29 -0800)]
d3d12: Retrieve shader image dimensions during shader compiles
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 00:28:30 +0000 (16:28 -0800)]
d3d12: Init null UAVs
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 00:28:05 +0000 (16:28 -0800)]
d3d12: Handle format support queries for shader images
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 00:31:42 +0000 (16:31 -0800)]
d3d12: Figure out if we can support GL shader images
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 23:29:37 +0000 (15:29 -0800)]
d3d12: Add missed SSBO binding enum value
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 23:28:38 +0000 (15:28 -0800)]
d3d12: Rename UAV -> SSBO to disambiguate with image UAVs
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Fri, 31 Dec 2021 01:53:23 +0000 (17:53 -0800)]
d3d12: Fix format table typeless-ness for A8 and RGBA1010102
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Fri, 31 Dec 2021 15:23:42 +0000 (07:23 -0800)]
d3d12: Shrink 2D array size so that max-layer cube arrays can be created
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Fri, 31 Dec 2021 01:50:51 +0000 (17:50 -0800)]
microsoft/compiler: Fix handling of fp16-in-32bit-val ops to handle high bits
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 18:38:36 +0000 (10:38 -0800)]
microsoft/compiler: Hook up memory/control barriers
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 01:54:04 +0000 (17:54 -0800)]
microsoft/compiler: Handle forced early depth
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Wed, 29 Dec 2021 01:37:32 +0000 (17:37 -0800)]
microsoft/compiler: Implement atomic image ops
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Fri, 31 Dec 2021 01:51:41 +0000 (17:51 -0800)]
microsoft/compiler: Handle images as derefs for GL
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Fri, 31 Dec 2021 01:52:11 +0000 (17:52 -0800)]
microsoft/compiler: Fix array-of-array handling for derefs of textures/images
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 22:52:15 +0000 (14:52 -0800)]
microsoft/compiler: Emit SRVs/UAVs as arrays
This is done regardless of size so that "dynamic" indexing with a
uniform can work even on arrays of size 1.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 22:17:37 +0000 (14:17 -0800)]
microsoft/compiler: Unify handle retrieval between images and UBO/SSBO
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 21:45:26 +0000 (13:45 -0800)]
microsoft/compiler: Emit GL images in descriptor space 1 with driver_location instead of binding
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 21:39:33 +0000 (13:39 -0800)]
microsoft/compiler: Put SSBO and image handles in separate arrays
In a future change, the bindings for images and SSBOs will start to
overlap in GL (using a separate space to disambiguate them)
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Jesse Natalie [Tue, 28 Dec 2021 21:13:03 +0000 (13:13 -0800)]
microsoft/compiler: Change vulkan_environment bool to an enum
There are currently 3 different "environments" supported by this backend,
and they have slightly different semantics for how resources are accessed,
which is only going to get a little weirder when GL images start getting
supported. To try to make things less confusing, use an explicit enum
with heavy documentation on what's different between them.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14342>
Caio Oliveira [Fri, 24 Dec 2021 07:20:10 +0000 (23:20 -0800)]
anv/blorp: Apply pending pipe flushes after PIPELINE_SELECT
Allows the PIPELINE_SELECT change to consume any outstanding flushes.
In case it doesn't, we still apply the pipe flushses afterwards.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14301>