Iago Toral Quiroga [Fri, 2 Sep 2022 06:49:12 +0000 (08:49 +0200)]
v3dv: implement VK_EXT_depth_clip_control
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18387>
Karmjit Mahil [Thu, 28 Jul 2022 14:05:27 +0000 (15:05 +0100)]
pvr: Add depth_bias_array handling on dbenable.
On dbenable depth bias is enabled so we need to write the depth
bias data into the depth_bias_array (which gets uploaded to the
device) and also setup the depth bias index (used in the control
stream).
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18438>
Jason Ekstrand [Fri, 25 Mar 2022 18:51:14 +0000 (13:51 -0500)]
radv: Switch to dynamic rendering only
Also, update list of expected failures.
dEQP-VK.image.sample_texture.*_bit_compressed_format_two_samplers_*
now reliably pass on Polaris10 (GFX8) and Pitcairn (GFX6).
Stoney has new failures but given there is already a lot of
depth/stencil resolve failures, we shouldn't worry about them.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
Jason Ekstrand [Fri, 2 Sep 2022 15:56:12 +0000 (10:56 -0500)]
radv: Leave image layouts alone when doing HW MSAA resolves
If the current layout supports DCC, we initialize it. There's no reason
why we can't leave it in that layout and need to stomp it to
COLOR_ATTACHMENT_OPTIMAL. If the layout supports DCC, it's effectively
identical to COLOR_ATTACHMENT_OPTIMAL anyway.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
Jason Ekstrand [Tue, 30 Aug 2022 15:44:43 +0000 (10:44 -0500)]
radv: Only copy the render area from VRS to HTILE
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
Jason Ekstrand [Tue, 23 Aug 2022 03:36:49 +0000 (22:36 -0500)]
radv: Set the window scissor to the render area, not framebuffer
With dynamic rendering, the concept of framebuffer dimensions goes away
so this won't make sense. Even with render passes, the render area is
guaranteed to be inside the framebuffer so we may as well clip to the
potentially smaller render area. This commit also moves window scissor
setup to CmdBeginRenderPass2() time. This should be fine, even for meta
ops, as the only meta ops which happen inside a render pass need the
same render area as the render pass itself.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15587>
Martin Roukala (né Peres) [Thu, 8 Sep 2022 06:57:10 +0000 (09:57 +0300)]
radv/ci: document an unstable test
The test seem to fail when run in conjunction with other tests. This
got revealed after I introduced parrallelism in the VKCTS execution on
VanGogh.
This was caught by pre-merge CI, but idiot me thought this was a
flake... and did not try re-running the job to verify...</BrownBag>
Reference: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7220
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18480>
Mark Collins [Wed, 7 Sep 2022 02:56:27 +0000 (02:56 +0000)]
tu: Retain allocated CSes in tu_autotune_on_submit
It was determined that a significant part of queue submission
overhead was from allocation/freeing of CSes constantly inside
`tu_autotune_on_submit`. This has been reduced by retaining
instances of `tu_submission_data` with their corresponding
CSes, this results in entirely eliminating that overhead as
resetting a CS is a very cheap operation compared to allocation
or even freeing it wholly.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18461>
Gert Wollny [Wed, 7 Sep 2022 10:44:27 +0000 (12:44 +0200)]
mesa/glsl: Add support for NV_shader_noperspective_interpolation
With EXT_gpu_shader4 the support is already in place, we just
have to allow it in glsl and expose the extension name.
v2: Check whether the extension is enabled in the shader (Adam Jackson)
v3: Don't check GLES version in lexer (mareko)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18460>
Samuel Pitoiset [Thu, 8 Sep 2022 11:59:18 +0000 (13:59 +0200)]
radv: fix hw remapping of MRT holes with color attachments without export
If a color attachment is used in a render pass but not exported by the
FS, cb_shader_mask would be non-zero for this MRT. Though, to make sure
the hw remapping of SPI_SHADER_COL_FORMAT<->CB_SHADER_MASK works as
expected, we should also clear the unused color attachment in
CB_SHADER_MASK. Otherwise, the hw will remap to the wrong MRT.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7221
Fixes:
8fcb4aa0ebd ("radv: compact MRTs to save PS export memory space")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18491>
Samuel Pitoiset [Thu, 25 Aug 2022 15:19:12 +0000 (17:19 +0200)]
radv: import PS epilog from libraries if present
This enables using PS epilogs with GPL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Thu, 25 Aug 2022 15:14:58 +0000 (17:14 +0200)]
radv: add support for emitting and prefetching PS epilogs
Long jumps seem to be slow and prefetching might help.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Thu, 25 Aug 2022 15:17:32 +0000 (17:17 +0200)]
radv: create a PS epilog from a library without the main FS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Thu, 25 Aug 2022 15:14:23 +0000 (17:14 +0200)]
radv: keep track of the code size for VS prologs and PS epilogs
This will be used to prefetch PS epilogs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Tue, 19 Jul 2022 11:42:00 +0000 (13:42 +0200)]
radv: do not try to remove color exports for FS that need an epilog
The color format would be zero and all exports would be removed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Thu, 25 Aug 2022 14:08:50 +0000 (16:08 +0200)]
radv: add radv_remove_color_exports() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Tue, 19 Jul 2022 11:37:10 +0000 (13:37 +0200)]
radv: do not lower color exports for FS that need an epilog
When building the main FS with GPL we don't know the color export
formats.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18255>
Samuel Pitoiset [Thu, 8 Sep 2022 14:24:25 +0000 (16:24 +0200)]
radv: fix reporting RT shaders in RGP
RGP expects a compute bind point. This allows it to show ISA of RT
shaders and also enables instruction timing.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7213
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
Samuel Pitoiset [Thu, 8 Sep 2022 13:01:51 +0000 (15:01 +0200)]
radv: capture RT pipelines from the SQTT layer
They were just not recorded.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
Samuel Pitoiset [Wed, 7 Sep 2022 12:29:56 +0000 (14:29 +0200)]
radv: emit SQTT markers for RT related commands
This reports RT commands like vkCmdTraceRaysKHR and
vkCmdBuildAccelerationStructuresKHR in RGP.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18496>
Adam Jackson [Wed, 7 Sep 2022 18:03:56 +0000 (14:03 -0400)]
glx: Use XSaveContext, delete glxhash.c
libX11 has a perfectly good XID-based hash table we can be using, let's.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18474>
Yiwei Zhang [Wed, 7 Sep 2022 19:07:43 +0000 (19:07 +0000)]
venus: force synchronous submission for external signal semaphore
This is to ensure semaphore export under globalFencing represents the
correct submission.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18475>
Yiwei Zhang [Wed, 7 Sep 2022 18:26:41 +0000 (18:26 +0000)]
venus: clean up vn_QueueSubmit
and ensure all submissions are synchronous upon NO_ASYNC_QUEUE_SUBMIT
No other intended change in behavior.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18475>
Emma Anholt [Thu, 8 Sep 2022 18:26:15 +0000 (11:26 -0700)]
Revert "ci: disable the freedreno farm."
It's been moved back to the lab network, which hopefully has been
stabilized. This reverts commit
13f36d66ad5ee581740ec13297a33312863e1c56.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18505>
Timur Kristóf [Wed, 7 Sep 2022 16:52:13 +0000 (18:52 +0200)]
nir/gather_info: Clear cross-invocation output mask.
Similar to how other I/O info is cleared at the beginning
of gather_info we should also clear the cross-invocation
mesh shader output mask.
Fixes:
112a856813eb2649ea7ff81768bab594033ce00a
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
Timur Kristóf [Wed, 7 Sep 2022 13:23:46 +0000 (15:23 +0200)]
nir/lower_system_values: Add shortcut for 1D workgroups.
When the workgroup is 1 dimensional, simply use a vec3
filled with zeroes and the local invocation index.
This is is better than lower_id_to_index + constant folding,
because this way we don't leave behind extra ALU instrs.
Note, this is relevant to mesh shaders on RDNA2 because
it enables us to better detect cross-invocation output
access.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
Yiwei Zhang [Thu, 8 Sep 2022 03:17:22 +0000 (03:17 +0000)]
zink: implement fence_get_fd required by EGL android platform
fence_get_fd is required for any kind of surface flush or native fence
sync export on Android. The typical scenarios are:
- eglDupNativeFenceFDANDROID
- eglSwapBuffers*
- eglMakeCurrent
- glFlush/glFinish for front buffer rendering
This change updates zink_flush to handle PIPE_FLUSH_FENCE_FD via a
forced submit to signal an external sync_fd semaphore. fence_get_fd is
implemented to export the sync file from that semaphore.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
Yiwei Zhang [Wed, 7 Sep 2022 00:06:37 +0000 (00:06 +0000)]
zink: fix in-fence lifecycle
For in-fence handling, dri2 has this below sequence in a row:
1. create_fence_fd: import external fence fd
2. fence_server_sync: import the pipe fence into the driver ctx
3. fence_reference: deref the created pipe fence
Before this change, zink pushed the wrapped external semaphore to the
wait semaphores of the next batch but the followed fence_reference will
destroy the imported semaphore immediately. Instead of extending the
lifecycle of the pipe fence throughout the batch state, we can simply
transfer the semaphore ownership to the batch and destroy it upon batch
reset.
Fixes:
32597e116d7 ("zink: implement GL semaphores")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
Yiwei Zhang [Wed, 7 Sep 2022 02:21:14 +0000 (02:21 +0000)]
zink: fix zink_create_fence_fd to properly import
This change fixes below:
1. Dup the fence fd, otherwise, since external semaphore import takes
the ownership of the fd, non-Vulkan part touches the fd leading to
undefined behavior. This can be hit on implementations that defer
the processing of the passed fd.
2. Use VK_SEMAPHORE_IMPORT_TEMPORARY_BIT for importing since that's
required for SYNC_FD handle type because of its copy transference.
Meanwhile, doing temporary import for opaque fd is fine in this path.
Fixes:
32597e116d7 ("zink: implement GL semaphores")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
Yiwei Zhang [Tue, 6 Sep 2022 19:25:10 +0000 (19:25 +0000)]
zink: fix core support on Android
- use legacy EGL driver interface
- use Android Vulkan loader
- avoid unrelated kopper source files
v2: update false #elif to #else
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
Yiwei Zhang [Tue, 6 Sep 2022 17:54:17 +0000 (17:54 +0000)]
loader: use os_get_option for driver override
Android requires this to enable zink.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18453>
Chad Versace [Fri, 12 Aug 2022 16:19:32 +0000 (09:19 -0700)]
venus: Enable VK_EXT_pipeline_creation_feedback
Implement natively by always returning invalid feedback. This is a legal
(but useless) implementation according to the spec.
In the future, I want to return the real feedback values from the host,
but that requires changes to the venus protocol. The protocol does not
know that the VkPipelineCreationFeedback structs in the
VkGraphicsPipelineCreateInfo pNext are output parameters. Before
VK_EXT_pipeline_creation_feedback, the pNext chain was input-only.
Tested with `dEQP-VK.pipeline.*.creation_feedback.*`.
The tests in vulkan-cts-1.3.3.0 are buggy. I submitted a fix to dEQP
upstream; see below.
Results with the bug:
Passed: 0/30 ( 0.0%)
Failed: 12/30 (40.0%)
Not supported: 18/30 (60.0%)
Warnings: 0/30 ( 0.0%)
Results with bugfix:
Passed: 12/30 (40.0%)
Failed: 0/30 ( 0.0%)
Not supported: 18/30 (60.0%)
Warnings: 0/30 ( 0.0%)
See: https://gerrit.khronos.org/c/vk-gl-cts/+/10086
See: https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/909
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Chad Versace <chadversary@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18035>
David Heidelberg [Mon, 5 Sep 2022 21:00:44 +0000 (23:00 +0200)]
ci/lava: print set-job-env-vars.sh as other setups do
Let's be consistent.
Also needed for parsing device and traces yml file from logs.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18422>
David Heidelberg [Wed, 7 Sep 2022 13:43:15 +0000 (15:43 +0200)]
ci: print env as other setups do
Let's be consistent.
Also needed for parsing device and traces yml file from logs.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18422>
Emma Anholt [Thu, 1 Sep 2022 18:05:33 +0000 (11:05 -0700)]
zink: Don't lower indirect derefs of temp arrays.
nir_to_spirv can handle it. Cuts instructions in a turnip CS shader on
Aztec Ruins from 36k to 3k.
Part of #6115
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18374>
Emma Anholt [Thu, 1 Sep 2022 18:14:40 +0000 (11:14 -0700)]
zink: Don't upload shader immediate arrays through UBO 0.
Trust the host vulkan driver to load it through whatever way it finds to
be most efficient. Saves upload on the frontend when other uniforms
change.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18374>
Mike Blumenkrantz [Thu, 7 Apr 2022 14:14:10 +0000 (10:14 -0400)]
zink: flag all assigned output slots as mapped
this ensures types which consume more than 1 slot are effectively tagged
so that the next stage inputs are also assigned properly
fixes:
spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-dvec3
cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18444>
James Park [Sat, 3 Sep 2022 04:37:22 +0000 (21:37 -0700)]
vulkan: Augment _WIN32 stub comparison
Make current check robust to incremental linking.
Compare JMP targets if the first byte is opcode 0xE9.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18400>
Kenneth Graunke [Wed, 7 Sep 2022 00:18:16 +0000 (17:18 -0700)]
intel/compiler: Use subgroup invocation for ICP handle loads
When loading a TCS or GS input, we generate some code to read the URB
handle for a particular input control point (ICP handle), which often
involves indirect addressing due to a non-constant vertex.
For example:
mov(8) vgrf148+0.0:UW, 76543210V
shl(8) vgrf149:UD, vgrf148+0.0:UW, 2u
shl(8) vgrf150:UD, vgrf145:UD, 5u
add(8) vgrf151:UD, vgrf150:UD, vgrf149:UD
mov_indirect(8) vgrf147:UD, g2:UD, vgrf151:UD, 96u
Unfortunately, the first load with 76543210V is considered a partial
write because the 8 channels of 16-bit UW data doesn't fill an entire
register, and we can't allocate VGRFs at sub-register granularity.
This causes none of the above math to be CSE'd, even though the first
two instructions are common to *all* input loads, and the rest may be
reused sometimes as well.
To work around this, we stop emitting 76543210V to a temporary, and
instead use nir_system_values[SYSTEM_VALUE_SUBGROUP_INVOCATION], which
already contains this value, and is unconditionally set up for us.
With all input loads using the same register for the sequence, our
CSE pass is able to eliminate the rest of the common math.
shader-db results on Tigerlake:
total instructions in shared programs:
20748243 ->
20744844 (-0.02%)
instructions in affected programs: 73410 -> 70011 (-4.63%)
helped: 242 / HURT: 21
helped stats (abs) min: 1 max: 37 x̄: 14.17 x̃: 15
helped stats (rel) min: 0.17% max: 19.58% x̄: 6.13% x̃: 6.32%
HURT stats (abs) min: 1 max: 4 x̄: 1.38 x̃: 1
HURT stats (rel) min: 0.18% max: 1.31% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for instructions value: -13.73 -12.12
95% mean confidence interval for instructions %-change: -6.00% -5.19%
Instructions are helped.
total cycles in shared programs:
785828951 ->
785788480 (<.01%)
cycles in affected programs: 597593 -> 557122 (-6.77%)
helped: 227 / HURT: 13
helped stats (abs) min: 6 max: 624 x̄: 182.19 x̃: 185
helped stats (rel) min: 0.24% max: 18.22% x̄: 7.85% x̃: 7.80%
HURT stats (abs) min: 2 max: 153 x̄: 68.08 x̃: 36
HURT stats (rel) min: 0.03% max: 7.79% x̄: 2.97% x̃: 1.25%
95% mean confidence interval for cycles value: -182.55 -154.71
95% mean confidence interval for cycles %-change: -7.84% -6.69%
Cycles are helped.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18455>
Georg Lehmann [Tue, 6 Sep 2022 12:47:19 +0000 (14:47 +0200)]
nir/opt_algebraic: Optimize unpacking of upcasts to 64bit integers.
Foz-DB Navi21:
Totals from 7 (0.01% of 134913) affected shaders:
CodeSize: 213364 -> 213028 (-0.16%)
Instrs: 38347 -> 38319 (-0.07%)
Latency: 780148 -> 779776 (-0.05%)
InvThroughput: 520098 -> 519851 (-0.05%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18435>
Tomeu Vizoso [Thu, 8 Sep 2022 05:44:38 +0000 (07:44 +0200)]
ci: Stop explicitly passing env vars to FDO_DISTRIBUTION_EXEC command
ci-templates will now pass all env vars to the command.
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18467>
Tomeu Vizoso [Wed, 7 Sep 2022 11:24:33 +0000 (13:24 +0200)]
ci: Install sysvinit-core without --no-remove
As it conflicts now with some packages already installed but not
necessary, such as:
libpam-systemd packagekit packagekit-tools policykit-1 systemd-sysv
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18467>
Tomeu Vizoso [Wed, 7 Sep 2022 11:50:45 +0000 (13:50 +0200)]
ci: Use --no-install-recommends to avoid problems with --no-remove
Some packages that are being installed via recommends are conflicting
with already installed packages, causing this error:
E: Packages need to be removed but remove is disabled.
We dont need these packages, so don't install them.
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18467>
Tomeu Vizoso [Wed, 7 Sep 2022 11:38:36 +0000 (13:38 +0200)]
ci: Uprev ci-templates
Uprev it to:
d5aa3941aa03 ("freebsd: Move from 13.0 to 13.1")
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18467>
Karmjit Mahil [Thu, 1 Sep 2022 14:41:35 +0000 (15:41 +0100)]
pvr: Remove unimplemented push descriptor code.
Push descriptors are part of VK_KHR_push_descriptor.
Not supporting it for now.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18430>
Karol Herbst [Thu, 1 Sep 2022 22:10:05 +0000 (00:10 +0200)]
nv50: properly flush the TSC cache on 3D
The change didn't make any sense. `s` will always be
`NV50_SHADER_STAGE_COMPUTE`, because it's used to loop over all shader
stages. And the TSC cache on the compute side is already flushed in
`nv50_compute_validate_samplers`.
Fixes spurious `CACHE_ERROR` dmesg messages.
Fixes:
ba6ba8c9900 ("nv50: adapt texture and constbuf paths for compute shaders")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18382>
Karol Herbst [Thu, 1 Sep 2022 19:45:20 +0000 (21:45 +0200)]
nv50/ir: fix OP_UNION resolving when used for vector values
When an OP_UNION def takes part in a vector source e.g. for a tex
instruction we failed to clean up the OP_UNION instruction as rep() points
towards the coalesced value instead.
This fixes a regression on nv50 moving to NIR, but also potentially issues
with nvc0.
The main reason this is common in nv50 is, that we lower OP_SLCT to a set,
predicated movs and a union.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6406
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7117
Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18377>
Frank Binns [Wed, 31 Aug 2022 16:45:20 +0000 (17:45 +0100)]
pvr: finish render job sample count setup
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18437>
Thomas H.P. Andersen [Mon, 5 Sep 2022 22:25:50 +0000 (00:25 +0200)]
util: avoid deprecated builtin has_trivial_destructor
From clang 16 has_trivial_destructor is deprecated.
Use the replacement __is_trivially_destructible if it
is available.
Fixes new warnings with clang 16 like:
../src/compiler/glsl/list.h:58:4: warning: builtin __has_trivial_destructor is deprecated; use __is_trivially_destructible instead [-Wdeprecated-builtins]
../src/util/ralloc.h:551:4: note: expanded from macro 'DECLARE_RZALLOC_CXX_OPERATORS'
DECLARE_ALLOC_CXX_OPERATORS_TEMPLATE(type, rzalloc_size)
^
../src/util/ralloc.h:542:12: note: expanded from macro 'DECLARE_ALLOC_CXX_OPERATORS_TEMPLATE'
if (!HAS_TRIVIAL_DESTRUCTOR(TYPE)) \
^
../src/util/macros.h:233:44: note: expanded from macro 'HAS_TRIVIAL_DESTRUCTOR'
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18423>
Karmjit Mahil [Thu, 1 Sep 2022 14:07:43 +0000 (15:07 +0100)]
pvr: Finish setting up job resolve info.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18431>
Karmjit Mahil [Wed, 31 Aug 2022 14:11:00 +0000 (15:11 +0100)]
pvr: Set descriptor dirty flag based on other flags.
Set the flag if the descriptor set is updated.
Set the fragment descriptor dirty flag if blend consts are updated.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18429>
Mark Collins [Fri, 2 Sep 2022 10:30:47 +0000 (10:30 +0000)]
tu: Expose VK_EXT_tooling_info using common implementation
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18390>
Mark Collins [Fri, 2 Sep 2022 10:09:11 +0000 (10:09 +0000)]
tu: Clamp priority in DRM submitqueue creation
The kernel driver has a range of valid priority values that can
be supplied to it, submitting any priority value outside these
bounds will result in `-EINVAL`. To avoid this, the priority
value is now clamped to the range that the kernel supports.
Fixes:
0c6fbfca0c91ef012e8ab767a317c07f1f6dc5e6
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18389>
Pavel Ondračka [Sun, 28 Aug 2022 16:41:08 +0000 (18:41 +0200)]
r300: allow presubtract when both ADD sources are negative
Current code doesn't handle this, however it is easy to make it work
by moving the negate to the presubtract source. Minor win in shader-db,
mostly with Unigine shaders.
Shader-db RV530:
total instructions in shared programs: 136382 -> 136236 (-0.11%)
instructions in affected programs: 9911 -> 9765 (-1.47%)
total temps in shared programs: 18939 -> 18942 (0.02%)
temps in affected programs: 37 -> 40 (8.11%)
Reviewed-by: Filip Gawin <filip@gawin.net>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18289>
Gert Wollny [Wed, 7 Sep 2022 12:28:35 +0000 (14:28 +0200)]
virgl: Add some formats that the CTS uses
Otherwise running the CTS emits lots of warnings about
these formats missing in the drivers format table.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18462>
Rob Clark [Thu, 11 Aug 2022 17:52:59 +0000 (10:52 -0700)]
egl: Relax locking
Now that we have the rwlock TerminateLock protecting us against
eglTerminate() yanking the rug from under us, drop the BDL across
calls to driver (or at least the main ones that can potentially
block).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7039
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18050>
Rob Clark [Mon, 29 Aug 2022 17:28:11 +0000 (10:28 -0700)]
egl: Introduce rwlock to protect eglTerminate()
eglTerminate() must be serialized against all other EGL calls. But in
most cases, other EGL calls do not need to be serialized against each
other. Which fits rather well with a rwlock.
One would be tempted to simply replace the existing BDL with a rwlock,
but several portability and debuggability limitations of the rwlock
implementation prevent that, as described in the TerminateLock comment
block.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18050>
Rob Clark [Thu, 18 Aug 2022 17:59:50 +0000 (10:59 -0700)]
egl: Make RefCount atomic
Once we relax the locking, we will be doing _eglPutFoo() outside of the
big display lock.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18050>
Rob Clark [Sat, 13 Aug 2022 17:41:46 +0000 (10:41 -0700)]
egl/dri2: Add display lock
In preperation of relaxing eglapi to not hold a lock across driver
calls, but instead only for protecting it's own state, add our own
lock to protect code paths that need locking or have not been audited
yet. The blocking calls (ClientWaitSyncKHR) or critical path and/or
blocking (MakeCurrent, SwapBuffers*) are lockless, as they have already
been audited for thread safety.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18050>
Rob Clark [Thu, 11 Aug 2022 20:02:29 +0000 (13:02 -0700)]
egl/dri2: Make ref_count atomic
In particular, MakeCurrent can be called on multiple threads in
parallel.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18050>
Rob Clark [Thu, 11 Aug 2022 16:32:29 +0000 (09:32 -0700)]
egl/wgl: Make ref_count atomic
Looks like wgl doesn't have much display state to protect. But it's
ref_count should be atomic before we start removing locking from eglapi
to protect against MakeCurrent being called in parallel on multiple
threads.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18050>
Timothy Arceri [Wed, 10 Aug 2022 02:14:07 +0000 (12:14 +1000)]
glsl: remove GLSL IR inverse comparison optimisations
As per
7d85dc4f350b GLSL IR is not smart enough to handle this
correctly for NANs.
Shader-db radeonsi (RX 6800):
Totals from affected shaders:
SGPRS: 26848 -> 26848 (0.00 %)
VGPRS: 13552 -> 13552 (0.00 %)
Spilled SGPRs: 134 -> 134 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 635000 -> 630988 (-0.63 %) bytes
Max Waves: 5474 -> 5474 (0.00 %)
Shader-db iris (BDW):
total instructions in shared programs:
17538859 ->
17539018 (<.01%)
instructions in affected programs: 29369 -> 29528 (0.54%)
helped: 3
HURT: 126
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
HURT stats (abs) min: 1 max: 2 x̄: 1.29 x̃: 1
HURT stats (rel) min: 0.27% max: 1.32% x̄: 0.61% x̃: 0.54%
95% mean confidence interval for instructions value: 1.13 1.33
95% mean confidence interval for instructions %-change: 0.54% 0.63%
Instructions are HURT.
total loops in shared programs: 4866 -> 4866 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs:
858548230 ->
858548915 (<.01%)
cycles in affected programs: 1331737 -> 1332422 (0.05%)
helped: 0
HURT: 92
HURT stats (abs) min: 2 max: 49 x̄: 7.45 x̃: 6
HURT stats (rel) min: 0.01% max: 1.90% x̄: 0.12% x̃: 0.05%
95% mean confidence interval for cycles value: 5.72 9.17
95% mean confidence interval for cycles %-change: 0.05% 0.19%
Cycles are HURT.
Note: With the addition of "nir/comparison_pre: See through an inot to
apply the optimization", idr's shader-db results are:
All Broadwell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
19940805 ->
19940802 (<.01%)
instructions in affected programs: 582 -> 579 (-0.52%)
helped: 3 / HURT: 0
total cycles in shared programs:
858431633 ->
858431747 (<.01%)
cycles in affected programs: 4938 -> 5052 (2.31%)
helped: 0 / HURT: 3
All older Intel platforms had similar results. (Haswell shown)
total instructions in shared programs:
16715626 ->
16715670 (<.01%)
instructions in affected programs: 9496 -> 9540 (0.46%)
helped: 0 / HURT: 44
total cycles in shared programs:
881224396 ->
881232314 (<.01%)
cycles in affected programs: 600610 -> 608528 (1.32%)
helped: 6 / HURT: 44
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
Ian Romanick [Tue, 6 Sep 2022 17:28:14 +0000 (10:28 -0700)]
nir/comparison_pre: See through an inot to apply the optimization
This also prevents some small regressions in "glsl: remove GLSL IR
inverse comparison optimisations".
shader-db results:
All Sandy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
19941025 ->
19940805 (<.01%)
instructions in affected programs: 52431 -> 52211 (-0.42%)
helped: 188 / HURT: 6
total cycles in shared programs:
858451784 ->
858431633 (<.01%)
cycles in affected programs: 2119134 -> 2098983 (-0.95%)
helped: 183 / HURT: 12
LOST: 2
GAINED: 0
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8364668 -> 8364670 (<.01%)
instructions in affected programs: 753 -> 755 (0.27%)
helped: 2 / HURT: 4
total cycles in shared programs:
248752572 ->
248752238 (<.01%)
cycles in affected programs: 87290 -> 86956 (-0.38%)
helped: 2 / HURT: 4
fossil-db results:
Skylake, Ice Lake, and Tiger Lake had similar results. (Ice Lake shown)
Instructions in all programs:
144909184 ->
144909130 (-0.0%)
Instructions helped: 6
Cycles in all programs:
9138641740 ->
9138640984 (-0.0%)
Cycles helped: 8
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
Timothy Arceri [Thu, 11 Aug 2022 01:34:13 +0000 (11:34 +1000)]
nir: support loop unrolling with inot conditions
Ever since
4246c2869c3c and
7d85dc4f350b loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.
This change avoids 292 loop unrolling regressions with shader-db
once the following patch is applied.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
Timothy Arceri [Thu, 11 Aug 2022 01:09:26 +0000 (11:09 +1000)]
nir: update nir_is_supported_terminator_condition()
Ever since
4246c2869c3c and
7d85dc4f350b loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.
Here we simply check to see if the inot contains a simple
terminator condition we previously handled. We also update
the previous users of this function to use a newly name
copy of the previous behaviour
nir_is_terminator_condition_with_two_inputs().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
Bas Nieuwenhuizen [Fri, 19 Aug 2022 12:58:30 +0000 (14:58 +0200)]
amd/common: Disable DCC retile modifiers on RDNA1
Some claims of corruption, modifier-less Mesa already doesn't do
it. Since these modifiers have no purpose besides being displayed
lets just disable in Mesa.
Cc: mesa-stable
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18140>
Bas Nieuwenhuizen [Fri, 19 Aug 2022 12:55:54 +0000 (14:55 +0200)]
amd/common: Don't rely on DCN support checks with modifiers.
Going to be a bad time if they disagree, which is bound to happen
sometimes. Not asserting and stuff tends to be a better experience
than crashing.
Cc: mesa-stable
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18140>
Jordan Justen [Fri, 2 Sep 2022 06:51:38 +0000 (23:51 -0700)]
intel/pci_ids: Drop non-upstream dg2 pci-ids
These pci-ids should be included in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14523, since
these pci-ids will only be supported by kernels that support the
forked Linux uapi. (Note that !14523 will never be merged into
upstream Mesa.)
Ref: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/drm/i915_pciids.h?h=v6.0-rc3#n695
Fixes:
398a9be94b4 ("intel/dev: Enable remaining DG2 and ATS-M device IDs")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18386>
Yiwei Zhang [Wed, 7 Sep 2022 17:41:39 +0000 (17:41 +0000)]
venus: double the abort timeout
To avoid bumping abort timeout too much. This change also doubles the
busy wait cycles, which would further reduce unnecessary sleeps for
synchronous calls. Ultimately, after we fix the fencing and push all
roundtrip waiting to the renderer side as well as we fixing the abort
logic, we can live with busy wait alone here.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18472>
Iván Briano [Tue, 6 Sep 2022 22:28:26 +0000 (15:28 -0700)]
anv: pipelineStageCreationFeedbackCount is allowed to be 0
Fixes:
6601e5d6fc6 ("anv: implement VK_EXT_pipeline_creation_feedback")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18451>
Georg Lehmann [Mon, 5 Sep 2022 11:10:06 +0000 (13:10 +0200)]
aco: Use plain VOPC for vcmpx when possible.
Foz-DB Navi21:
Totals from 66947 (49.62% of 134913) affected shaders:
CodeSize:
210383024 ->
210033376 (-0.17%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18417>
Konstantin Seurer [Thu, 1 Sep 2022 19:21:19 +0000 (21:21 +0200)]
radv: Deduplicate push constant structs
This patch adds a header that is shared between the accel struct build
kernels and the dispatch code.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18376>
Mike Blumenkrantz [Tue, 6 Sep 2022 18:25:38 +0000 (14:25 -0400)]
zink: fix sharedmem ops with bit_size!=32
* the rewrite_bo_access compiler pass already handles 64bit rewrites as-needed
* sharedmem access is not required to be 32bit
thus, this can use a similar methodology as ssbo/ubo vars to index based on bitsize
and handle operations through sized variables
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18449>
Corentin Noël [Wed, 7 Sep 2022 12:45:25 +0000 (14:45 +0200)]
ci: disable the freedreno farm.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18463>
Eric Engestrom [Sun, 3 Jul 2022 13:35:50 +0000 (14:35 +0100)]
v3dv: implement VK_EXT_shader_module_identifier
Passes `dEQP-VK.*.shader_module_identifier.*`
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18458>
Tomeu Vizoso [Wed, 7 Sep 2022 04:53:19 +0000 (06:53 +0200)]
Revert "Revert "Revert "ci: set venus on lavapipe to manual due to flakes"""
Now the flakiness might have been fixed for good.
This reverts commit
e51c5a18ade928868623be0048fdd24ed158a42c.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18454>
Tomeu Vizoso [Tue, 6 Sep 2022 14:48:04 +0000 (16:48 +0200)]
ci: Crosvm won't remove the control socket file on stop
When sending the stop command to a lingering Crosvm instance, the socket
file will remain and the next instance will fail to start.
Make sure the file is deleted before starting Crosvm with the same path.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18454>
Samuel Pitoiset [Wed, 7 Sep 2022 07:01:41 +0000 (09:01 +0200)]
radv: only expose sparseResidencyImage3D on GFX9+
It's currently broken on Polaris10 and breaks running VKCTS entirely.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18457>
Martin Roukala (né Peres) [Tue, 6 Sep 2022 11:23:31 +0000 (14:23 +0300)]
radv/ci: run vkcts on the two steam decks in parallel
We just added a new Steam Deck to our CI, which should allow us to
halve the execution time of a full VKCTS run from 1h20 to a more
reasonable 40 minutes.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18432>
Samuel Pitoiset [Mon, 15 Aug 2022 09:59:03 +0000 (11:59 +0200)]
radv: compact MRTs to save PS export memory space
If there are holes between color outputs (e.g. a shader exports MRT1,
but not MRT0), we can remove the holes by moving higher MRTs lower. The
hardware will remap the MRTs to their correct locations if we remove
holes in SPI_SHADER_COL_FORMAT but not CB_SHADER_MASK. This is good for
performance because the hardware will allocate less space for color
MRTs.
This also allows to remove even more unused color exports because we no
longer need to force previous targets to be non-zero. Only SotTR seems
affected from our fossils db.
fossils-db (NAVI21):
Totals from 859 (0.64% of 134913) affected shaders:
VGPRs: 24328 -> 24216 (-0.46%)
CodeSize: 1433276 -> 1422576 (-0.75%)
Instrs: 255275 -> 253728 (-0.61%)
Latency: 1666836 -> 1661544 (-0.32%)
InvThroughput: 346038 -> 343406 (-0.76%)
Copies: 16520 -> 16506 (-0.08%)
PreSGPRs: 25934 -> 25920 (-0.05%)
PreVGPRs: 19903 -> 19662 (-1.21%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5786>
Samuel Pitoiset [Wed, 31 Aug 2022 15:15:27 +0000 (17:15 +0200)]
radv: gather MRTs that are written by the fragment shader
This will be used to filter color attachments without exports.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5786>
Erik Faye-Lund [Tue, 6 Sep 2022 12:59:11 +0000 (14:59 +0200)]
docs/zink: remove bptc from required formats for gl4.2
We now have lowering-code in the mesa state-tracker, meaning we no
longer need this feature.
Fixes:
e4ff42684b9 ("mesa/st: enable bptc extension with fallback")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18434>
Erik Faye-Lund [Tue, 6 Sep 2022 09:08:46 +0000 (11:08 +0200)]
vc4: do not attempt to do deep tiled blits
We only copy a single layer, so let's not even try to support deep
blits here.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18427>
Erik Faye-Lund [Tue, 6 Sep 2022 09:03:58 +0000 (11:03 +0200)]
vc4: respect z-offset in tiled blits
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18427>
Erik Faye-Lund [Wed, 10 Aug 2022 12:39:47 +0000 (14:39 +0200)]
v3d: do not pretend to fake rgtc-support
The is_format_support query doesn't pretent to have RGTC support, so
this doesn't seem like it ever did anything useful.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18439>
Tapani Pälli [Mon, 5 Sep 2022 05:08:32 +0000 (08:08 +0300)]
intel/compiler: implement Wa_14014595444 for DG2
According to the workaround, we should setup MLOD as parameter
4 and 5 for the sample_b message.
v2: only SAMPLE_B, not SAMPLE_B_C (Lionel)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18408>
Tapani Pälli [Mon, 5 Sep 2022 05:23:20 +0000 (08:23 +0300)]
anv: implement Wa_14015946265 for DG2
SOL unit issues, wa is to send PC with CS stall after SO_DECL.
v2: emit also in genX_gpu_memcpy (Lionel)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18409>
Tapani Pälli [Mon, 5 Sep 2022 05:19:30 +0000 (08:19 +0300)]
iris: implement Wa_14015946265 for DG2
SOL unit issues, wa is to send PC with CS stall after SO_DECL.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18409>
Bas Nieuwenhuizen [Sat, 20 Aug 2022 18:19:19 +0000 (20:19 +0200)]
radv: Expose 3d sparse images.
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18165>
Bas Nieuwenhuizen [Sat, 20 Aug 2022 18:17:07 +0000 (20:17 +0200)]
radv: Add 3d tile shapes for sparse binding.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18165>
Bas Nieuwenhuizen [Sat, 20 Aug 2022 18:16:30 +0000 (20:16 +0200)]
radv: Add binding code for 3d sparse images.
GFX7-8 code is kinda expected. For GFX9 and GFX10 the entire
mipchain is duplicated by "layer" even though smaller mips also
have less layers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18165>
Alyssa Rosenzweig [Tue, 6 Sep 2022 00:11:26 +0000 (20:11 -0400)]
asahi: Allocate new cmdbufs if out of space
Instead of crashing when we run out of space in the command buffer,
allocate a new buffer, jump to it with the STREAM_LINK command, and
use it to write new commands.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Mon, 5 Sep 2022 21:43:34 +0000 (17:43 -0400)]
asahi: Handle Stream Link VDM commands
Jumps in the command streams, allowing us to chain ("link") command
buffers. Naming is from PowerVR, which contains an identical command.
PowerVR's has conditional jumps and function call support, it's likely
that AGX inherited this too but I haven't tested that. (Those might be
useful for conditional rendering and secondary command buffers
respectively?)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Mon, 5 Sep 2022 19:49:39 +0000 (15:49 -0400)]
asahi: Express VDM commands according to PowerVR
Piles of unknown bits go away, as we find they're either "field present"
bits or block types. And yep, the block type enum lines up between AGX
and RGX.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Mon, 5 Sep 2022 19:17:10 +0000 (15:17 -0400)]
asahi: Annotate VDM/CDM commands as per PVR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Mon, 5 Sep 2022 18:59:19 +0000 (14:59 -0400)]
asahi: Make BO list growable
Back it by a simple dynamic array, ralloc'd off the batch (and make the
context/batch ralloc'd so stuff gets cleaned up).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Mon, 5 Sep 2022 01:47:52 +0000 (21:47 -0400)]
asahi: Dirty track everything
Now that we have fine grained state emit code, let's use it to reduce
driver overhead. Dirty tracking is delicate: while this seems to work,
I've also added an ASAHI_MESA_DEBUG=dirty option in debug builds
to disable the optimizations here for future debug.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Mon, 5 Sep 2022 01:32:50 +0000 (21:32 -0400)]
asahi: Hoist constant PPP state to start of batch
This reduces how much we emit per draw.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Sun, 4 Sep 2022 19:17:22 +0000 (15:17 -0400)]
asahi: Match PPP data structures with PowerVR
Looking at PowerVR's PPP definitions in tree in Mesa
(src/imagination/csbgen/), we find that AGX's "tagged" data structures
are actually sequences of state items prefixed by a header specifying
which state follows. Rather than hardcoding the sequences in which Apple's
driver chooses to bundle state, we need the XML to be flexible enough to
encode or decode any valid combination of state. That means reworking
the XML. While doing so, we find a number of fields that are identical
between RGX and AGX, and fix the names while at it (for example, the W
Clamp floating point).
Names are from the PowerVR code in Mesa where sensible.
Once we've reworked the XML, we need to rework the decoder. Instead of
reading tags and printing the combined state packets, the decoder now
must unpack the header and print the individual state items specified by
the header, with slightly more complicated bounds checking.
Finally, state emission in the driver becomes much more flexible. To
prove the flexibility actually works, we now emit all PPP state (except for
viewport and scissor state) as a single PPP update. This works. After
this we can move onto more interesting arrangements of state for lower
driver overhead.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>
Alyssa Rosenzweig [Sun, 4 Sep 2022 18:51:05 +0000 (14:51 -0400)]
asahi: Don't use lower_wpos_pntc
Instead we can flip point coords with the object type. That means fewer
instructions without shader variants. Thanks, PowerVR ^_^
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18421>