Eric Engestrom [Tue, 14 Feb 2023 21:52:07 +0000 (21:52 +0000)]
meson: allow building GLES without GL
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21343>
Eric Engestrom [Fri, 17 Feb 2023 21:08:25 +0000 (21:08 +0000)]
meson/windows: only build libgl-gdi for desktop gl
Suggested-by: Jesse Natalie <jenatali@microsoft.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21343>
Eric Engestrom [Wed, 15 Feb 2023 19:18:27 +0000 (19:18 +0000)]
meson: make GLX require OpenGL
This isn't strictly true, but making that work isn't worth the effort;
see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21343#note_1774683
Suggested-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21343>
Alyssa Rosenzweig [Tue, 29 Nov 2022 03:28:13 +0000 (22:28 -0500)]
nir/lower_blend,agx,panfrost: Use lowered I/O
This is one step towards lowering I/O during shader preprocess rather than at
variant create time, which helps mitigate shader variant jank. It's also a lot
simpler.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> [v1]
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20836>
Alyssa Rosenzweig [Wed, 15 Feb 2023 15:27:48 +0000 (10:27 -0500)]
nir/lower_blend: Don't handle gl_FragColor
In OpenGL, FRAG_RESULT_COLOR implicitly broadcasts to every render target. Our
existing lower_blend code (somewhat arbitrarily) aliases to the first render
target's format and blend settings. That said, I don't think that works if
different render targets have different settings -- or blend with their
different destinations -- though I don't have relevant spec text right now.
The actual reason this works is that all users of this pass either call
nir_lower_fragcolor first (panfrost, asahi) or don't have FRAG_RESULT_COLOR as
part of their API (panvk, soon agxv). Unless/until we actually have a use case
for nir_lower_blend with gl_FragColor, assert that gl_FragColor is lowered first
so we don't need to worry about this imaginary case.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20836>
Alyssa Rosenzweig [Wed, 15 Feb 2023 15:10:45 +0000 (10:10 -0500)]
nir/lower_blend: Don't touch store->dest
Stores don't have destinations, and if they did, it would be invalid to change
their ssa_def's num_components without also changing the SSA def. Remove the
nonsensical (but harmless) assignment.
This fixes 25249e8be2c ("nir/lower_blend: Expand or shrink output variables as
needed"), but as the bug is harmless in practice, it does not need to be
backported.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20836>
Alyssa Rosenzweig [Mon, 6 Feb 2023 16:27:37 +0000 (11:27 -0500)]
pan/lower_framebuffer: Operate on lowered I/O
This turns the early pass into a late pass, which is important because it
depends on the shader key and therefore should be called by the driver instead
of the compiler preprocessing. It's also simpler this way.
The shader key work is waiting for review in another merge request. In the
meantime, this patch will let us run blend lowering early for blend shaders on
Midgard.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20836>
Alyssa Rosenzweig [Mon, 6 Feb 2023 15:49:35 +0000 (10:49 -0500)]
nir: Augment raw_output_pan with IO_SEMANTICS+BASE
This is a form of lowered I/O, it needs I/O semantics so we can know the
location to store to instead of passing via a sideband.
Over in !20906, we will use the BASE to lower blend shader with multisampling in
NIR instead of passing the number of samples and framebuffer format along a
sideband to the Midgard compiler. That's not needed for this series (this patch
was cherry-picked to avoid regressions in the lower_blend changes) but it's good
to model the full form of the I/O lowered intrinsic here.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20836>
Ian Romanick [Wed, 8 Feb 2023 19:33:45 +0000 (11:33 -0800)]
nir/loop_analyze: Simplify some logic in compute_induction_information
This part now looks more like it did before 0b9639c35d0a.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Tue, 7 Feb 2023 22:32:37 +0000 (14:32 -0800)]
nir/loop_analyze: Track induction variables with uniform initializer
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Tue, 7 Feb 2023 21:11:51 +0000 (13:11 -0800)]
nir/loop_analyze: Eliminate nir_basic_induction_var
No longer used. All of the information that was previously tracked here is
tracked directly in nir_loop_variable... and, technically speaking, has
been tracked there ever since 0b9639c35d0a.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Tue, 7 Feb 2023 21:10:22 +0000 (13:10 -0800)]
nir/loop_analyze: Use nir_loop_variable::init_src instead of nir_basic_induction_var::def_outside_loop
These track the same information in a slightly different way. Since
nir_loop_variable::init_src is visible outside this module, it cannot
be eliminated.
As an intentional side effect, induction variables with constant
initializers will now have their nir_loop_induction_variable::init_src
field point to the load_const source. Previously this pointer would be
NULL.
v2: Update unit tests and commit message. Remove the now unused ind_var
variable in find_trip_count.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Tue, 7 Feb 2023 18:10:59 +0000 (10:10 -0800)]
nir/loop_analyze: Use nir_loop_variable::update_src instead of nir_basic_induction_var::alu
These track the same information in a slightly different way. Since
nir_loop_variable::update_src is visible outside this module, it cannot
be eliminated.
This leads to some nice simplification in find_trip_count. Previously
this code only had access to the ALU instruction that performs the
increment. It had to "search" the parameters to determine which (if any)
was the constant. With this change, this code has access to the
nir_alu_src of the ALU instruction that performs the increment. It no
longer needs to search the parameters for the constant. It's either the
supplied nir_alu_src or nothing.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Tue, 7 Feb 2023 17:18:45 +0000 (09:18 -0800)]
nir/loop_analyze: Track induction variables with uniform increments
As an intentional side effect, induction variables with constant
increments will now have their nir_loop_induction_variable::update_src
field point to the load_const source. Previously this pointer would be
NULL.
v2: Update unit tests and commit message.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Fri, 17 Feb 2023 18:10:41 +0000 (10:10 -0800)]
nir/tests: Add tests for nir_loop_info::induction_vars tracking
Later commits in this MR will change the way some data is tracked, and
these tests will verify this behavior change.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Wed, 1 Feb 2023 02:12:06 +0000 (18:12 -0800)]
nir/tests: Add tests for "inverted" loops
A couple basic tests for loops with the exit condition after the
increment. In compiler literature, the optimization that moves the exit
condition from the top to the bottom is called "loop inversion."
v2: Pass parameters to loop_builder_invert using a struct. Add a comment
describing the loop being constructed to loop_builder_invert. Both
suggested by Caio.
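As a rough illustration of the transformation named above, here is a sketch in plain Python rather than NIR. The function names are made up for this example and do not correspond to anything in the test suite. It shows a top-tested counting loop next to its inverted form, where the exit test sits after the increment; both assume a positive step so that the loops terminate.

```python
def count_top_tested(start, limit, step):
    """Exit condition at the top of the loop, before the body."""
    i = start
    trips = 0
    while True:
        if not (i < limit):      # exit test comes first
            break
        trips += 1               # loop body
        i += step                # increment
    return trips


def count_inverted(start, limit, step):
    """Inverted form: guard once, then test after the increment."""
    i = start
    trips = 0
    if i < limit:                # guard keeps zero-trip loops zero-trip
        while True:
            trips += 1           # loop body
            i += step            # increment
            if not (i < limit):  # exit test comes after the increment
                break
    return trips
```

Both forms execute the body the same number of times; only the position of the exit test relative to the increment differs, which is the shape the new tests exercise.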
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Wed, 1 Feb 2023 01:50:11 +0000 (17:50 -0800)]
nir/tests: Refactor creation of loops for loop_analyze test cases
Inspired heavily by the work by Yevhenii Kolesnikov in the original
versions of !3445.
v2: Pass parameters to loop_builder using a struct. Add a comment
describing the loop being constructed to loop_builder. Both suggested by
Caio.
v3: MSVC C++ designated initializer lolz.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Ian Romanick [Wed, 1 Feb 2023 21:40:13 +0000 (13:40 -0800)]
nir/tests: Don't unconditionally log shaders from this one CF test
All of the other tests only log the shader when validation fails, so
having that shader scroll by in the output is very distracting.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21289>
Eric Engestrom [Tue, 14 Feb 2023 12:42:41 +0000 (12:42 +0000)]
docs: add 23.1 branchpoint & rc dates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21322>
Lionel Landwerlin [Fri, 17 Feb 2023 14:34:10 +0000 (16:34 +0200)]
anv: fix vma heap memory leak
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a5f9e59ce3 ("anv: Use vma_heap for descriptor pool host allocation")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21385>
Eric Engestrom [Fri, 17 Feb 2023 17:38:47 +0000 (17:38 +0000)]
ci: bump tags of deqp images
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21392>
Eric Engestrom [Fri, 17 Feb 2023 17:37:31 +0000 (17:37 +0000)]
ci: fix grouping of image tags
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21392>
Eric Engestrom [Fri, 17 Feb 2023 17:35:59 +0000 (17:35 +0000)]
ci: remove no-op sed
This is a duplicate from the first patch applied above.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21392>
Eric Engestrom [Fri, 17 Feb 2023 17:09:28 +0000 (17:09 +0000)]
ci: simplify adding & removing deqp patches
Instead of everyone having to copy the curl command from somewhere else
when a new deqp version needs new patches, now all they need to do is
paste the commit hash in the array.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21392>
Ryan Neph [Tue, 14 Feb 2023 22:47:36 +0000 (14:47 -0800)]
venus: temporarily redirect VkDrmFormatModifierPropertiesListEXT to "2" variant
Temporarily remove driver-side uses of
VkDrmFormatModifierPropertiesListEXT so the encode/decode procedures can
be fixed asynchronously in a follow-up.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21367>
Alyssa Rosenzweig [Fri, 17 Feb 2023 19:10:47 +0000 (14:10 -0500)]
panfrost: Fix prim restart XML on Valhall
Harmless in practice (so no need to backport) but still very wrong. Noticed
looking at traces of Dolphin trying to debug acute misrendering.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20780>
Chia-I Wu [Wed, 15 Feb 2023 19:11:16 +0000 (11:11 -0800)]
radv: add a size check in radv_create_buffer for Android
This is to make dEQP-VK.api.buffer.basic.size_max_uint64 pass on
android.
The test creates a buffer of size UINT64_MAX and makes sure the memory
requirement for the buffer is sane. It fails because our memory
requirement is "align64(UINT64_MAX, 16)" which is 0 after overflow.
The test checks maintenance4's maxBufferSize and is skipped normally.
But the extension can be disabled on an android build.
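The overflow described above can be reproduced with a small model of 64-bit unsigned arithmetic. This is a sketch in Python, not the driver code: align64 here is a re-implementation written for illustration, and aligned_size_wraps is a hypothetical helper; the actual check added to radv_create_buffer may look different.

```python
UINT64_MAX = (1 << 64) - 1


def align64(value, alignment):
    """Round value up to a power-of-two alignment, wrapping at 64 bits."""
    assert alignment > 0 and (alignment & (alignment - 1)) == 0
    # Mask after the addition to model unsigned 64-bit wraparound.
    bumped = (value + alignment - 1) & UINT64_MAX
    return bumped & ~(alignment - 1) & UINT64_MAX


def aligned_size_wraps(size, alignment):
    """Hypothetical guard: True when aligning size would wrap around."""
    return align64(size, alignment) < size
```

With these definitions, align64(UINT64_MAX, 16) evaluates to 0, which is why the reported memory requirement was not sane for the test's buffer size.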
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21346>
Timur Kristóf [Wed, 1 Feb 2023 00:02:09 +0000 (01:02 +0100)]
radv: Call nir_lower_array_deref_of_vec in radv_lower_io_to_scalar_early.
This fixes an issue when a vector component of an arrayed output has a deref.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8197
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21028>
Konstantin Seurer [Sat, 11 Feb 2023 21:27:45 +0000 (22:27 +0100)]
radv: Advertise ray query support with LLVM
What could go wrong?
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21268>
Konstantin Seurer [Fri, 17 Feb 2023 16:14:32 +0000 (17:14 +0100)]
radv: Pre-compile BVH build shaders if there is a cache
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21268>
Konstantin Seurer [Sat, 11 Feb 2023 21:26:56 +0000 (22:26 +0100)]
radv: Force ACO for BVH build shaders
They hang with LLVM.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21268>
Konstantin Seurer [Tue, 14 Feb 2023 09:55:44 +0000 (10:55 +0100)]
radv: Make accel struct meta state initialization thread safe
Fixes: 0d5570b ("radv: Always compile accel structure shaders on demand.")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21268>
Konstantin Seurer [Sat, 11 Feb 2023 21:26:25 +0000 (22:26 +0100)]
ac/llvm: Implement bvh64_intersect_ray_amd
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21268>
Mike Blumenkrantz [Fri, 17 Feb 2023 13:30:39 +0000 (08:30 -0500)]
zink: handle semi-matching i/o for separate shaders
while separate shaders requires i/o blocks to match between stages,
there are two tricky cases:
* sparse location specification
* variables are required to match in type by location
the first item means user locations must increment if a slot is not used
the second item means that e.g., a mat3x2 can match three vec2 variables
in matching slots
fix both of these cases now
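a simplified model of the two cases, sketched in Python (this is not zink code; the tuple layout and the expand_to_slots name are assumptions made for this example): each variable is expanded into the locations it occupies, so a mat3x2 at location 2 covers slots 2..4 with two components each, and an unused location is simply absent.

```python
def expand_to_slots(variables):
    """Map each occupied location to its per-slot component count.

    Each variable is (location, columns, components_per_column);
    a matrix occupies one consecutive location per column.
    """
    slots = {}
    for location, columns, components in variables:
        for column in range(columns):
            slots[location + column] = components
    return slots


# Producer: vec4 at location 0, mat3x2 at locations 2..4, location 1 unused.
producer = [(0, 1, 4), (2, 3, 2)]
# Consumer: vec4 at location 0, three vec2 variables at locations 2, 3, 4.
consumer = [(0, 1, 4), (2, 1, 2), (3, 1, 2), (4, 1, 2)]

interfaces_match = expand_to_slots(producer) == expand_to_slots(consumer)
```

comparing per slot rather than per variable is what lets the matrix and the three vectors line up in this model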
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21383>
Alyssa Rosenzweig [Sat, 11 Feb 2023 03:19:37 +0000 (22:19 -0500)]
panfrost: Disable CRC by default
Known unsound code.
So far I'm not convinced transaction elimination is doing us much good. Even in
synthetic glmark style benchmarks this seems to be a few % hit at most. Given
that transaction elimination is unsound by design, and that panfrost's
implementation is buggy in several places and getting it right (up to the
unsoundness of the hardware feature itself) would take actual engineering
effort, and the priority is making glamor work... disabling is the obvious
choice here.
For now, we leave the code but gate it behind an env var
flag (PAN_MESA_DEBUG=crc) rather than defaulting to enabled unless
PAN_MESA_DEBUG=nocrc is set. This way, we can still experiment with it if we
need that data ("what performance could we gain if we had this feature,
unsoundness be damned?"). That said, I'm not really ok with having unsoundness
on my devices, y'know? Back of the napkin math suggests that it's not unlikely
that somebody has hit a transaction elimination collision in the wild with the
DDK.
Boils down to values.
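To make the change in default concrete, here is a sketch of the opt-in behaviour in Python. It is not the panfrost implementation: parse_debug_flags and crc_enabled are hypothetical names, the parsing assumes a comma-separated list, and "sync" in the usage note is only a placeholder for some other flag.

```python
import os


def parse_debug_flags(value):
    """Split a comma-separated flag string into a set of flag names."""
    return {flag.strip() for flag in value.split(",") if flag.strip()}


def crc_enabled(environ=os.environ):
    """Opt-in: CRC is on only when the 'crc' flag is explicitly present."""
    flags = parse_debug_flags(environ.get("PAN_MESA_DEBUG", ""))
    return "crc" in flags
```

Under this model an unset variable or PAN_MESA_DEBUG=nocrc both leave CRC off, while PAN_MESA_DEBUG=crc or PAN_MESA_DEBUG=sync,crc turns it on.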
Closes: #8113
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21258>
Lionel Landwerlin [Fri, 17 Feb 2023 12:31:56 +0000 (14:31 +0200)]
anv: track vram only BOs to print things out on ENOMEM execbuf
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21380>
Lionel Landwerlin [Tue, 14 Feb 2023 14:16:56 +0000 (16:16 +0200)]
anv: move debug submit to helper and call it on execbuf failure
Helps tell when you've run out of local memory.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21380>
Samuel Pitoiset [Wed, 15 Feb 2023 15:39:24 +0000 (16:39 +0100)]
radv: stop using a PS epilog when the FS doesn't write any color outputs
This is a small optimization for fragment shaders that only write
depth/stencil/sample mask without any color outputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21341>
Samuel Pitoiset [Fri, 17 Feb 2023 08:24:14 +0000 (09:24 +0100)]
radv: only skip emitting the pipeline blend state if the FS uses an epilog
The blend state is emitted from the command buffer when the FS uses
an epilog (either compiled from a lib with GPL or compiled on-demand).
This shouldn't change anything but it will allow disabling the use of a
PS epilog when the fragment shader doesn't write any color outputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21341>
Karmjit Mahil [Wed, 16 Nov 2022 14:17:08 +0000 (14:17 +0000)]
pvr: Handle VK_QUERY_RESULT_WAIT_BIT.
Not handling device loss currently. That needs to be done
throughout the code base so out of scope for this.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20091>
Tapani Pälli [Tue, 30 Aug 2022 10:59:16 +0000 (13:59 +0300)]
anv: Wa_14016407139, add required pc when SBA programmed
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21374>
Lionel Landwerlin [Fri, 17 Feb 2023 09:35:56 +0000 (11:35 +0200)]
intel/perf: also add the oa timestamp shift on MTL
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 90c86fe63e94 ("intel: add MTL performance metrics")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21378>
Samuel Pitoiset [Thu, 9 Feb 2023 15:07:56 +0000 (16:07 +0100)]
radv/amdgpu: only set a new pstate if the current one is different
AMDGPU pstate is per context but if there are multiple AMDGPU contexts
in flight, the kernel can return -EBUSY.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21222>
Samuel Pitoiset [Thu, 9 Feb 2023 14:16:47 +0000 (15:16 +0100)]
Revert "radv: acquire pstate on-demand when capturing with RGP"
This change is wrong for two reasons:
- it hangs most of the time, maybe because changing PSTATE when the
application is running is broken somehow
- it increases the time between triggering and generating the capture
considerably, because there is a delay for changing PSTATE
This restores previous logic where PSTATE is set to profile_peak at
logical device creation. Though, it also re-introduces an issue with
multiple logical devices (kernel returns -EBUSY) but this will be
fixed in the next commit.
This fixes GPU hangs when trying to record RGP captures on my NAVI21.
Note that profile_peak is only required for some RDNA2 chips (including
VanGogh).
Cc: mesa-stable
This reverts commit 923a864d94517462698c529bdc0e5c056d37b4e1.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21222>
Erico Nunes [Sun, 12 Feb 2023 21:33:30 +0000 (22:33 +0100)]
lima: don't use resource_from_handle while creating scanout
resource_from_handle implementations create an additional reference to
the scanout resource, which caused lima to leak those resources after
commit ad4d7ca8332488be8a75aff001f00306a9f6402e.
Do as the other drivers do and import the bo directly while creating
the scanout resource.
Cc: 22.3 mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8198
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21330>
Karmjit Mahil [Fri, 28 Oct 2022 09:47:02 +0000 (10:47 +0100)]
pvr: Add support to copy descriptors on vkUpdateDescriptorSets()
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21332>
Karmjit Mahil [Mon, 23 Jan 2023 21:58:43 +0000 (21:58 +0000)]
pvr: Move descriptor write into pvr_write_descriptor_set()
Moving descriptor write functionality from
pvr_UpdateDescriptorSets() into pvr_write_descriptor_set().
This is in preparation for adding descriptor copy support.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21332>
Boyuan Zhang [Fri, 10 Feb 2023 20:46:18 +0000 (15:46 -0500)]
virgl: add more formats to conv table
Adding UYVY, YUYV, P010 to formats_conv_table.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21250>
Gert Wollny [Wed, 15 Feb 2023 14:53:30 +0000 (15:53 +0100)]
r600/sfn: Fix Cayman trans from string and add test for copy prop
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Wed, 15 Feb 2023 14:31:27 +0000 (15:31 +0100)]
r600/sfn: Fix alu trans op flag setup
Fixes: 2df023a1f1990aad6c20eca85af19c7d21a43203 ("r600/sfn: pre-evaluate allowed dest mask in Alu instructions")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Wed, 15 Feb 2023 10:17:39 +0000 (11:17 +0100)]
r600/sfn: Fix handling of fetch through texture clause
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Wed, 15 Feb 2023 10:17:00 +0000 (11:17 +0100)]
r600: Don't start new CF for every fetch through tex clause
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Wed, 15 Feb 2023 07:33:18 +0000 (08:33 +0100)]
r600/sfn: Forward setting the block ID and index
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Tue, 14 Feb 2023 16:32:19 +0000 (17:32 +0100)]
r600/sfn: address use in group only if instr can be added
Otherwise the group will signal an address use that may not
be relevant.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Fri, 10 Feb 2023 16:57:16 +0000 (17:57 +0100)]
r600/sfn: rename texture coordinate offset for clarity
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Fri, 10 Feb 2023 15:31:32 +0000 (16:31 +0100)]
r600/sfn: Stop try scheduling in t-slot with empty related v-slot
This requires adding a nop in the related v-slot, and the readport
validation seems to be broken for this case, so drop this for now.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Fri, 10 Feb 2023 14:57:13 +0000 (15:57 +0100)]
r600/sfn: Don't copy propagate indirect loads to more than one dest
Propagating the indirect load to more instructions would result
in more address load instructions. This would (a) remove the advantage
of eliminating one move, and (b) introduce more latency, because between
address load and use two cycles must pass.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Fri, 27 Jan 2023 10:33:46 +0000 (11:33 +0100)]
r600/sfn: Silence warnings about unused parameters
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Wed, 18 Jan 2023 13:41:14 +0000 (14:41 +0100)]
r600/sfn: Fix a typo
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Sat, 28 Jan 2023 17:27:02 +0000 (18:27 +0100)]
r600/sfn: drop useless instr use count
This is handled with the dest registers
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Fri, 10 Feb 2023 14:17:11 +0000 (15:17 +0100)]
r600/sfn: Work around dependency issue when splitting op to group
The instruction that is split may still be referenced as an extra
dependency in other instructions, so add a handle to the instruction
so that it can be set to be scheduled.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Gert Wollny [Tue, 31 Jan 2023 15:49:51 +0000 (16:49 +0100)]
r600/sfn: Use range_base for atomics and images
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21357>
Tapani Pälli [Wed, 25 Jan 2023 07:57:50 +0000 (09:57 +0200)]
mesa/st: support compute shader decoding of ASTC
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19886>
Tapani Pälli [Tue, 24 Jan 2023 07:46:08 +0000 (09:46 +0200)]
mesa: add astc decoder shader template (glsl es version)
This shader originates from Granite 3D engine and has been adapted
to be used with OpenGL and some GLSL ES specifics.
GLSL ES adaptation:
- remove Vulkan specifics: EXT_samplerless_texture_functions usage,
specialization constants, push constant usage
- inline bitextract.h
- always DECODE_8BIT and hardcode error color (for now)
- port to GLSL ES, required some type changes, explicit type
conversions and setting up precisions for types
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19886>
Tapani Pälli [Tue, 24 Jan 2023 06:55:08 +0000 (08:55 +0200)]
mesa/st: initialize resources for ASTC decoding
Generates required resources for ASTC texture decoding pass.
Partition table resources will be cached in a hash table at runtime
as one is required for each block size.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19886>
Tapani Pälli [Tue, 10 Jan 2023 10:41:27 +0000 (12:41 +0200)]
mesa/st: add astc decoder lookup tables
Commit introduces ASTC decoding lookup tables from Granite 3D engine.
These lookup tables will be used during transcoding by a compute
shader in later commits when decoding ASTC textures.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19886>
Samuel Pitoiset [Fri, 10 Feb 2023 12:24:25 +0000 (13:24 +0100)]
radv: add support for rectangularLines
dEQP-VK.*rectangular_line* pass on NAVI21.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21287>
Samuel Pitoiset [Mon, 13 Feb 2023 14:30:44 +0000 (15:30 +0100)]
radv: reduce maximum line width to 8.0
Using 8191.875 seems too big for the hardware to correctly render wide
rectangular lines. This can also be reproduced with AMDVLK by forcing
rectangularLines = True, and fixed by reducing the maximum size as well.
Other drivers seem to expose that value.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21287>
Mike Blumenkrantz [Wed, 15 Feb 2023 12:30:43 +0000 (07:30 -0500)]
zink: more accurately handle i/o for separate shaders
this can be simplified since i/o is required to match exactly between
stages, meaning that assigning in increasing order should always be correct
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21336>
Mike Blumenkrantz [Wed, 15 Feb 2023 12:18:43 +0000 (07:18 -0500)]
zink: delete some now-broken ntv dref sampling code
depth splatting should be handled now by the match_tex_dests() pass
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21336>
Faith Ekstrand [Wed, 8 Feb 2023 16:48:54 +0000 (10:48 -0600)]
vulkan: Update the XML and headers to 1.3.241
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 16 Feb 2023 22:40:23 +0000 (16:40 -0600)]
vulkan/device-select-layer: Include vulkan.h
In the upcoming header update, vk_layer.h starts including vulkan_core.h
instead of vulkan.h. This will break this layer as it needs a couple of
window-system extension #defines.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 16 Feb 2023 18:57:36 +0000 (12:57 -0600)]
vulkan/layers: Use PUBLIC instead of VK_LAYER_EXPORT
VK_LAYER_EXPORT is going away in the next Vulkan header update. We
already have a PUBLIC macro in util/macros.h which does the same thing.
Unlike VK_LAYER_EXPORT, it should work in Windows too.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 9 Feb 2023 18:20:29 +0000 (12:20 -0600)]
vulkan: Properly filter structs in vk_physical_device_features
This uses get_all_required to filter structs and also filters struct
members based on API.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 9 Feb 2023 16:52:30 +0000 (10:52 -0600)]
vulkan: Move the features generator to vulkan/util
This makes it easier to start depending on vk_extensions.py
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 9 Feb 2023 18:25:40 +0000 (12:25 -0600)]
vulkan: Filter out provisional extensions
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 9 Feb 2023 01:05:01 +0000 (19:05 -0600)]
vulkan: Properly filter structs in vk_cmd_queue_gen
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 9 Feb 2023 00:37:04 +0000 (18:37 -0600)]
vulkan: Properly filter by api in enum_to_str
This switches us to using get_all_required() for figuring out which
enum types we care about and then carefully filtering every value as
needed. We also add a number field to Extension so we keep all the
extension XML parsing in one place.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 9 Feb 2023 18:27:34 +0000 (12:27 -0600)]
vulkan: Properly filter entrypoints
We now use get_all_required() to get all required commands and use that
to filter instead of doing it manually. Also, we can pull entrypoint
extension etc. information from the requirements struct. Finally, we
also have to filter the actual commands themselves as well as arguments
per-API because there may be multiple versions or variants depending on
the API being used.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Wed, 8 Feb 2023 22:36:26 +0000 (16:36 -0600)]
vulkan: Add a get_all_required() helper
This searches for the names of everything of a particular type: command,
enum, etc. and returns a Requirements struct with any core version and
extensions that require it.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Wed, 8 Feb 2023 22:17:58 +0000 (16:17 -0600)]
vulkan: Parse the platform in Extensions.from_xml()
This makes handling guards on entrypoints a bit easier.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Wed, 8 Feb 2023 19:47:57 +0000 (13:47 -0600)]
vulkan: Improve extension parsing
This adds an Extension.from_xml() helper for doing the parsing so we can
re-use it in other code. We also improve filtering of extensions. The
Vulkan XML schema is changing to make the supported attribute a comma-
separated list. This is to allow for vulkansc to also exist in the XML
schema.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Wed, 8 Feb 2023 20:27:04 +0000 (14:27 -0600)]
vulkan: Remove unused fields from Extension and ApiVersion
These are left over from when these classes were used by ANV to define
extension enables in python. They haven't been used since we added
extension table structs and moved extension enables to C.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 16 Feb 2023 19:44:35 +0000 (13:44 -0600)]
Revert "vk/util: keep track of extension requirements"
This reverts commit ca98e4446b690709ce517b33d17cb3e2af3f5084. The way
extension requirements are specified is about to change significantly.
Since this is so new, it's easier to just revert for now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Faith Ekstrand [Thu, 16 Feb 2023 19:45:14 +0000 (13:45 -0600)]
Revert "vk/runtime: turn vk.xml extension requirements into asserts"
This reverts commit 6ac830ccb1a54a821c8d035675425f0d97434faa.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21225>
Timothy Arceri [Mon, 14 Nov 2022 04:16:48 +0000 (15:16 +1100)]
glsl: copy prop vars before scalarizing alus
This generally gives us better results and doing it here in nir will
also allow us to remove more glsl optimisation calls that do a similar
thing for us.
(Updated shader-db results by idr.)
Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20246333 -> 20240715 (-0.03%)
instructions in affected programs: 235253 -> 229635 (-2.39%)
helped: 425 / HURT: 114
total cycles in shared programs: 891730115 -> 891631113 (-0.01%)
cycles in affected programs: 37347925 -> 37248923 (-0.27%)
helped: 952 / HURT: 692
total spills in shared programs: 7072 -> 6716 (-5.03%)
spills in affected programs: 505 -> 149 (-70.50%)
helped: 7 / HURT: 0
total fills in shared programs: 9897 -> 8511 (-14.00%)
fills in affected programs: 1674 -> 288 (-82.80%)
helped: 7 / HURT: 0
total sends in shared programs: 1053685 -> 1053411 (-0.03%)
sends in affected programs: 2821 -> 2547 (-9.71%)
helped: 30 / HURT: 2
LOST: 13
GAINED: 13
Broadwell and Haswell had similar results. (Broadwell shown)
total instructions in shared programs: 18149157 -> 18147271 (-0.01%)
instructions in affected programs: 204630 -> 202744 (-0.92%)
helped: 294 / HURT: 121
total cycles in shared programs: 939488196 -> 939508444 (<.01%)
cycles in affected programs: 36394777 -> 36415025 (0.06%)
helped: 718 / HURT: 620
total sends in shared programs: 1005426 -> 1005152 (-0.03%)
sends in affected programs: 2821 -> 2547 (-9.71%)
helped: 30 / HURT: 2
LOST: 2
GAINED: 2
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19715>
Italo Nicola [Thu, 16 Feb 2023 22:12:04 +0000 (22:12 +0000)]
panfrost: fix tiny sample_positions BO memory leak
Fixes a 4KB memory leak that happens once per device creation.
Cc: mesa-stable
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21372>
Faith Ekstrand [Wed, 8 Feb 2023 16:46:07 +0000 (10:46 -0600)]
intel/nir: Use nir_lower_mem_access_bit_sizes()
This drops the Intel-specific pass in favor of the new generic one.
No shader-db changes on Skylake or DG2.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21232>
Faith Ekstrand [Fri, 10 Feb 2023 00:01:16 +0000 (18:01 -0600)]
nir: Add a load/store bit size lowering pass
This is based on brw_nir_lower_mem_access_bit_sizes() but ended up being
substantially different. While the core concepts are all the same, the
brw_* version made a lot of Intel-specific assumptions. The new version
takes a callback which takes a number of bytes of data and an alignment
pair and returns a bit size and number of components to load/store.
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21232>
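To illustrate the callback idea described above, here is a minimal sketch of what such a callback might look like. The struct, the function name and the exact parameter list are hypothetical and are not the actual NIR declarations. The sketch assumes the access address has the form (N * align_mul + align_offset) with align_mul a power of two, and that the hardware wants 32-bit vectors of up to 4 components when the access is 4-byte aligned.

```c
#include <stdint.h>

/* Hypothetical return type for illustration; not the real NIR struct. */
struct mem_access_size {
   uint8_t bit_size;        /* 8, 16 or 32 in this sketch */
   uint8_t num_components;  /* 1..4 in this sketch */
};

/* Hypothetical callback: given the number of bytes still to be accessed
 * and an alignment pair, pick the widest access this imaginary backend
 * can do in one instruction. */
static struct mem_access_size
example_size_cb(uint8_t bytes, uint32_t align_mul, uint32_t align_offset)
{
   /* Largest power of two guaranteed to divide the address: the lowest
    * set bit of the offset, or the multiplier if the offset is zero. */
   uint32_t align = align_offset ? (align_offset & (~align_offset + 1u))
                                 : align_mul;

   if (align >= 4 && bytes >= 4) {
      unsigned comps = bytes / 4;
      if (comps > 4)
         comps = 4;
      return (struct mem_access_size){ 32, (uint8_t)comps };
   }

   if (align >= 2 && bytes >= 2)
      return (struct mem_access_size){ 16, 1 };

   /* Fall back to single bytes for unaligned or tiny accesses. */
   return (struct mem_access_size){ 8, 1 };
}
```

In this sketch a caller would emit one access of the returned size and then ask again for whatever bytes remain, which is one plausible way to keep the hardware-specific policy out of the generic pass.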
Timothy Arceri [Fri, 13 Jan 2023 01:57:23 +0000 (12:57 +1100)]
ci: enable dEQP-VK.ubo.random.all_shared_buffer.48
The previous commits fix the slow compile time, allowing us to
enable this test.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5152
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
Timothy Arceri [Sun, 18 Dec 2022 01:44:16 +0000 (12:44 +1100)]
nir/nir_opt_copy_prop_vars: don't call memset when cloning
This makes the pass significantly faster, cutting execution time
by around 30% in the cts test
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
This 30% improvement is in addition to all the improvements from
the preceding patches.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
Timothy Arceri [Sun, 18 Dec 2022 01:36:15 +0000 (12:36 +1100)]
nir/nir_opt_copy_prop_vars: reorder clone calls
This helps with the reuse of dynamic arrays.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
Timothy Arceri [Fri, 16 Dec 2022 05:29:15 +0000 (16:29 +1100)]
nir/nir_opt_copy_prop_vars: reuse dynamic arrays
As per the previous commit, if we don't reuse these dynamic arrays,
we end up needlessly thrashing the memory handling functions.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
Timothy Arceri [Fri, 16 Dec 2022 03:20:21 +0000 (14:20 +1100)]
nir/nir_opt_copy_prop_vars: reuse hash tables
Due to how this pass works, we can end up thrashing memory if we
keep allocating new hash tables rather than reusing them.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
Timothy Arceri [Thu, 15 Dec 2022 00:48:58 +0000 (11:48 +1100)]
nir/nir_opt_copy_prop_vars: avoid comparison explosion
Previously the pass was comparing every deref to every load/store
causing the pass to slow down more the larger the shader is.
Here we use a hash table so we can simply store everything needed
for comparison of a var separately.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
Timothy Arceri [Tue, 20 Dec 2022 23:29:45 +0000 (10:29 +1100)]
nir/nir_opt_copy_prop_vars: remove extra loop
The fix in 947f7b452a55 introduced an extra loop over the copies
array to find the correct entry in the case it had been moved.
The problem is these loops can be iterated over millions of times,
so let's simply update the entry pointer in the case we change its
location in the array.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>
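The general pattern of updating a pointer when its entry moves, rather than searching for it again, can be sketched as follows. This is not the code from the pass. The entry type and the function are hypothetical, and the sketch assumes entries are removed by moving the last array element into the freed slot.

```c
#include <stddef.h>

/* Hypothetical entry type for illustration only. */
struct entry {
   int key;
};

/* Remove victim from a densely packed array by moving the last element
 * into its slot. tracked is a pointer the caller holds into the same
 * array; it is fixed up here so the caller never has to loop over the
 * array to find its entry again. */
static void
remove_and_fix_up(struct entry *entries, size_t *count,
                  struct entry *victim, struct entry **tracked)
{
   struct entry *last = &entries[*count - 1];

   if (*tracked == victim)
      *tracked = NULL;      /* the tracked entry itself is going away */
   else if (*tracked == last)
      *tracked = victim;    /* the tracked entry is about to move here */

   if (victim != last)
      *victim = *last;
   (*count)--;
}
```

The fix-up is constant time, whereas re-finding the entry costs a pass over the array on every removal.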
Faith Ekstrand [Tue, 14 Feb 2023 16:25:54 +0000 (10:25 -0600)]
nir/from_ssa: Move the loop bounds check in resolve_parallel_copy
We loop, effectively, over two stacks: ready and to_do and finish only
when both are empty. In the case where ready is empty, we pull one off
of to_do, add a copy to a temporary, and push it onto the ready stack.
Previously, we assumed that we would never get to the temporary copy
case if to_do has exactly one entry because that would imply that there
was only one copy left which means there can't possibly be a cycle to
break. This was true until c7fc44f9ebbe ("nir/from_ssa: Respect and
populate divergence information") which changed things such that
temporary copies sometimes get added in the case where a convergent
value is copied both to convergent and divergent destinations.
This patch adjusts our loop iteration to always attempt to clear the
ready stack before checking if there's anything left on the to_do stack.
I also added an assert to make the exit condition more clear.
Fixes: c7fc44f9ebbe ("nir/from_ssa: Respect and populate divergence information")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
Faith Ekstrand [Mon, 13 Feb 2023 23:32:35 +0000 (17:32 -0600)]
nir/from_ssa: Only re-locate values that are destinations
There is an optimization in the parallel copy algorithm where, after a
copy has been performed, we can treat the destination as the new source
for future copies of the same source. In particular, consider the
following parallel copy: A -> B, C -> A, A -> C. In this case, after we
have done the A -> B copy, we can make note that the value in A is now
in B and emit the sequence: A -> B, C -> A, B -> C. This allows us to
resolve the swap cycle between A and C without allocating a temporary
register because we know B is also a copy of A.
When one of the registers involved is convergent and the other is
divergent, this optimization is problematic because, while convergent to
divergent copies are fine, we can't re-use the divergent copy in later
copies if any of those copies are to a convergent variable. We could,
but it would require a read_first_invocation which would get messy. In
c7fc44f9ebbe ("nir/from_ssa: Respect and populate divergence
information"), we attempted to deal with this by limiting the rename
optimization to the case where the divergence matched.
The problem is that we did the re-name part whenever the divergence
matched but only marked it as ready if the thing being copied was a
destination. (We actually left two instances of loc[a] = b, one which
always happened and one which only happened if we also wanted to flag
the source as being ready to use as a destination.) While this
technically doesn't cause any problems, it may result in more inter-mov
dependencies which hurts instruction scheduling. For example, if we had
the parallel copy A -> B, A -> C, A -> D, we now end up emitting the
sequence A -> B, B -> C, C -> D which has many more data hazards between
instructions caused by the constant shuffling.
This commit restores the original logic in which we only perform the
rename optimization if the rename would free up a register we will later
use as a destination. This isn't entirely optimal as it still doesn't
prove that there is a cycle involved first, but it should lead to a
reduction in unnecessary dependencies.
No shader-db changes on SKL or DG2
Fixes: c7fc44f9ebbe ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>
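For readers unfamiliar with the parallel copy algorithm discussed above, here is a simplified, self-contained sketch. It is not the nir_from_ssa code: it ignores divergence entirely, uses plain integer register indices, and all names are hypothetical. It assumes every destination appears in exactly one copy and that there are no self-copies. The ready stack is drained before the to_do stack is checked, and the rename only happens when the source is itself a pending destination.

```c
#define MAX_REGS 16

struct copy {
   int src, dst;
};

/* Turn a parallel copy over registers 0..num_regs-1 into a sequence of
 * ordinary copies. Register index num_regs is used as the temporary.
 * out needs room for 2 * num_copies entries. Returns the number of
 * sequential copies written to out. */
static int
resolve_parallel_copy(const struct copy *pcopy, int num_copies,
                      int num_regs, struct copy *out)
{
   int loc[MAX_REGS + 1];   /* where the original value of a reg lives now */
   int pred[MAX_REGS + 1];  /* pending source of a destination, or -1 */
   int ready[MAX_REGS + 1], to_do[MAX_REGS];
   int ready_n = 0, to_do_n = 0, out_n = 0;
   const int tmp = num_regs;

   for (int i = 0; i <= num_regs; i++)
      loc[i] = pred[i] = -1;

   for (int i = 0; i < num_copies; i++) {
      loc[pcopy[i].src] = pcopy[i].src;
      pred[pcopy[i].dst] = pcopy[i].src;
      to_do[to_do_n++] = pcopy[i].dst;
   }

   /* A destination that no copy reads from can be written immediately. */
   for (int i = 0; i < num_copies; i++) {
      if (loc[pcopy[i].dst] == -1)
         ready[ready_n++] = pcopy[i].dst;
   }

   for (;;) {
      /* Always drain the ready stack before looking at to_do. */
      while (ready_n > 0) {
         int b = ready[--ready_n];
         int a = pred[b];

         out[out_n++] = (struct copy){ loc[a], b };
         pred[b] = -1; /* b has been filled */

         /* Only rename if a still holds its own value and is itself a
          * pending destination, i.e. the rename frees a register that
          * will be written later. */
         if (loc[a] == a && pred[a] != -1) {
            loc[a] = b;
            ready[ready_n++] = a;
         }
      }

      if (to_do_n == 0)
         break;

      int b = to_do[--to_do_n];
      if (pred[b] == -1)
         continue; /* already handled */

      /* Nothing is ready, so b sits on a cycle: save it in the temporary. */
      out[out_n++] = (struct copy){ b, tmp };
      loc[b] = tmp;
      ready[ready_n++] = b;
   }

   return out_n;
}
```

With A, B, C numbered 0, 1, 2, the parallel copy A -> B, C -> A, A -> C from the text comes out of this sketch as A -> B, C -> A, B -> C with no temporary, while A -> B, A -> C, A -> D keeps A as the source of every copy instead of chaining through B and C.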
Rob Clark [Sat, 11 Feb 2023 17:01:28 +0000 (09:01 -0800)]
freedreno/drm: Optimize stateobj re-emit
For long-lived stateobjs, it is common to re-emit to the same submit
multiple times. By giving each submit a unique sequence # we can detect
this case and skip the extra append_bo().
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
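The re-emit check described above can be sketched roughly like this. The struct layouts and function names are hypothetical stand-ins, not the freedreno/drm code. The sketch assumes every submit carries a unique non-zero sequence number and that 0 means "never emitted".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types that only mirror the idea in the commit message. */
struct submit {
   uint32_t seqno;    /* unique, non-zero per submit */
   unsigned nr_bos;   /* number of buffers appended so far */
};

struct stateobj {
   uint32_t last_submit_seqno;   /* 0 means "never emitted" */
};

/* Stand-in for the comparatively expensive append_bo() path. */
static void
append_bo(struct submit *submit)
{
   submit->nr_bos++;
}

/* Returns true if the buffer had to be appended, false if the stateobj
 * was already emitted to this submit and the append could be skipped. */
static bool
emit_stateobj(struct submit *submit, struct stateobj *obj)
{
   if (obj->last_submit_seqno == submit->seqno)
      return false;

   append_bo(submit);
   obj->last_submit_seqno = submit->seqno;
   return true;
}
```

Emitting the same stateobj twice to one submit appends its buffer only once; emitting it to a different submit appends again because the sequence numbers differ.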
Rob Clark [Sat, 11 Feb 2023 20:22:31 +0000 (12:22 -0800)]
freedreno: Add seqno helper
It is a pretty common pattern to allocate a non-zero sequence # for
lightweight checking if an object is the same, changed, for use in cache
keys, etc. (And also pretty common to forget to handle the rollover
zero case.) Add a helper for this.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>
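As an illustration of the rollover case mentioned above, a non-zero sequence number allocator might look like the following. This is a single-threaded sketch with a hypothetical name, not the helper added by the commit; a real implementation would likely need an atomic increment.

```c
#include <stdint.h>

/* Hypothetical helper: hand out the next sequence number from a counter,
 * never returning 0, so that 0 can be kept as the "no seqno assigned yet"
 * value. Not thread-safe in this sketch. */
static inline uint32_t
next_nonzero_seqno(uint32_t *counter)
{
   uint32_t n;

   do {
      n = ++(*counter);   /* unsigned overflow wraps around to 0 */
   } while (n == 0);      /* skip 0 after rollover */

   return n;
}
```

Without the loop, the counter would hand out 0 once after 2^32 - 1 allocations, and an object with that seqno would compare equal to one that was never assigned a seqno at all.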