Lionel Landwerlin [Fri, 4 Mar 2022 10:52:04 +0000 (12:52 +0200)]
anv: rename host only descriptor internal flag
We add an assert to verify that those are not bound.
v2: Drop != 0 (Tapani)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
Lionel Landwerlin [Tue, 8 Mar 2022 14:56:50 +0000 (16:56 +0200)]
anv: don't lazy allocate surface states in descriptor sets
In 4001d9ce1a6e we started lazily allocating surface states in the
descriptor sets rather than upfront in the descriptor pool. This was
to work around vkd3d-proton allocating more than we could handle at the
HW level.
The issue introduced in that change is that we didn't protect the
descriptor pool free list or the anv_state_stream, both of which are
now potentially used from different threads through the descriptor set
write functions.
This reverts the lazy allocation part of that change. Host only
descriptor sets changes remain.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4001d9ce1a6e ("anv: Handle VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_VALVE for descriptor sets")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
Lionel Landwerlin [Mon, 7 Mar 2022 08:29:19 +0000 (10:29 +0200)]
anv: fix acceleration structure descriptor copies
We're not supposed to have a
VkWriteDescriptorSetAccelerationStructureKHR when doing a copy. We
should instead get the acceleration structure object from the source
descriptor.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 03e1e19246da ("anv: Refactor descriptor copy")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15241>
Pierre-Eric Pelloux-Prayer [Mon, 7 Mar 2022 10:02:45 +0000 (11:02 +0100)]
radeonsi: don't clear framebuffer.state before dcc decomp
This causes inconsistencies between sctx->framebuffer.state and other
sctx->framebuffer properties (like compressed_cb_mask).
The point of this code was to fix an issue with vi_separate_dcc_stop_query,
which was removed by 804e2924406, so we can safely drop it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6099
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15261>
Kenneth Graunke [Thu, 10 Mar 2022 03:21:23 +0000 (19:21 -0800)]
iris: Restore flagging of dirty bindings in binder_realloc
When I switched iris over to use 3DSTATE_BINDING_TABLE_POOL_ALLOC, I
stopped flagging things dirty when allocating a new binder, because
the contents of the binding table were still valid, thanks to us not
having to subtract Surface State Base Address anymore.
This unfortunately missed the point that the old binding table is in the
old buffer, which is no longer what the binder pool base address points
to. So we'd either need to copy it over, or just flag it dirty and
re-emit it on the next draw.
Fixes misrendering in Ryujinx.
Fixes: 8b9045e7a45 ("intel: Use 3DSTATE_BINDING_TABLE_POOL_ALLOC exclusively on Gfx11+")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15314>
Samuel Pitoiset [Thu, 10 Mar 2022 08:15:12 +0000 (09:15 +0100)]
radv: optimize the number of loaded components for VS inputs in NIR
fossils-db (Sienna Cichlid):
Totals from 3691 (2.74% of 134913) affected shaders:
VGPRs: 121368 -> 121584 (+0.18%); split: -0.36%, +0.54%
CodeSize: 7597912 -> 7561140 (-0.48%); split: -0.66%, +0.18%
MaxWaves: 104706 -> 104772 (+0.06%)
Instrs: 1441229 -> 1437652 (-0.25%); split: -0.53%, +0.28%
Latency: 5500766 -> 5482101 (-0.34%); split: -0.45%, +0.11%
InvThroughput: 804401 -> 797178 (-0.90%); split: -1.09%, +0.20%
VClause: 25185 -> 25143 (-0.17%); split: -0.50%, +0.33%
SClause: 27486 -> 27445 (-0.15%); split: -0.57%, +0.42%
Copies: 143816 -> 147900 (+2.84%); split: -0.54%, +3.38%
PreSGPRs: 109584 -> 110396 (+0.74%); split: -0.04%, +0.79%
PreVGPRs: 95541 -> 94583 (-1.00%); split: -1.12%, +0.12%
fossils-db (Polaris10):
Totals from 1773 (1.30% of 135960) affected shaders:
SGPRs: 80848 -> 80864 (+0.02%); split: -0.14%, +0.16%
VGPRs: 56424 -> 55600 (-1.46%); split: -1.47%, +0.01%
CodeSize: 1732588 -> 1696840 (-2.06%); split: -2.07%, +0.01%
MaxWaves: 12103 -> 12106 (+0.02%)
Instrs: 347684 -> 341597 (-1.75%); split: -1.76%, +0.01%
Latency: 2542840 -> 2523946 (-0.74%); split: -0.95%, +0.21%
InvThroughput: 924601 -> 905102 (-2.11%); split: -2.13%, +0.02%
VClause: 9565 -> 9545 (-0.21%); split: -0.51%, +0.30%
SClause: 10587 -> 10333 (-2.40%); split: -2.82%, +0.43%
Copies: 19321 -> 20307 (+5.10%); split: -0.78%, +5.88%
PreSGPRs: 30879 -> 30875 (-0.01%); split: -0.20%, +0.18%
PreVGPRs: 41211 -> 41270 (+0.14%); split: -0.73%, +0.87%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15317>
Dave Airlie [Fri, 5 Nov 2021 06:03:24 +0000 (16:03 +1000)]
radv: abstract queue family away from queue family index.
If we introduce another queue type (video decode) we can have a
disconnect between the RADV_QUEUE_ enum and the API queue_family_index.
Currently the driver has
GENERAL, COMPUTE, TRANSFER which would end up at QFI 0, 1, <nothing>
since we don't create transfer.
Now if I add VDEC we get
GENERAL, COMPUTE, TRANSFER, VDEC at QFI 0, 1, <nothing>, 2
or if you do nocompute
GENERAL, COMPUTE, TRANSFER, VDEC at QFI 0, <nothing>, <nothing>, 1
This means we have to add a remapping table between the API qfi
and the internal qf.
This patch tries to do that. In theory it just adds overhead right now,
but I'd like to exercise these paths.
v2: add radv_queue_ring abstraction, and pass physical device in,
as it makes adding uvd later easier.
v3: rename, and drop one direction as unneeded now, drop queue_family_index
from cmd_buffers.
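The remapping described above can be sketched as follows. This is a minimal illustration, not the radv implementation: the enum, struct, and function names are hypothetical, and the table is built simply by skipping internal families that are not exposed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's internal queue family enum. */
enum queue_family {
   QF_GENERAL,
   QF_COMPUTE,
   QF_TRANSFER,
   QF_VDEC,
   QF_COUNT,
   QF_INVALID = QF_COUNT,
};

/* Maps an API queue family index (qfi) to an internal queue family. */
struct qf_map {
   enum queue_family api_to_internal[QF_COUNT];
   uint32_t num_api_families;
};

/* exposed[qf] says whether internal family qf is advertised to the API. */
static void
qf_map_init(struct qf_map *map, const bool exposed[QF_COUNT])
{
   map->num_api_families = 0;
   for (int qf = 0; qf < QF_COUNT; qf++)
      map->api_to_internal[qf] = QF_INVALID;

   /* API indices are dense: each exposed family takes the next free slot. */
   for (int qf = 0; qf < QF_COUNT; qf++) {
      if (exposed[qf])
         map->api_to_internal[map->num_api_families++] = (enum queue_family)qf;
   }
}

static enum queue_family
qf_from_api_index(const struct qf_map *map, uint32_t qfi)
{
   assert(qfi < map->num_api_families);
   return map->api_to_internal[qfi];
}
```

With GENERAL, COMPUTE and VDEC exposed, API index 2 resolves to VDEC; with compute disabled as well, VDEC moves to API index 1, matching the two layouts listed in the message.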
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13687>
Mike Blumenkrantz [Mon, 31 Jan 2022 17:00:27 +0000 (12:00 -0500)]
lavapipe: more descriptor validation
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14656>
Mike Blumenkrantz [Fri, 21 Jan 2022 21:38:08 +0000 (16:38 -0500)]
lavapipe: validate per-stage descriptor limits when creating pipeline layouts
this is super annoying to track down later, so just crash early if it's seen
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14656>
Mike Blumenkrantz [Fri, 21 Jan 2022 21:37:33 +0000 (16:37 -0500)]
lavapipe: make device limits a physical device struct
it's useful to have this info around and a bit simpler to gather
info on init
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14656>
Mike Blumenkrantz [Tue, 8 Mar 2022 01:20:11 +0000 (20:20 -0500)]
anv: fix some dynamic rasterization discard cases in pipeline construction
cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15280>
Mike Blumenkrantz [Tue, 8 Mar 2022 01:15:50 +0000 (20:15 -0500)]
anv: fix CmdSetColorWriteEnableEXT for maximum rts
Fixes: b15bfe92f7f ("anv: implement VK_EXT_color_write_enable")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15280>
Mike Blumenkrantz [Mon, 7 Mar 2022 21:35:04 +0000 (16:35 -0500)]
anv: fix xfb usage with rasterizer discard
in the initial implementation, a stream like:
* CmdBeginTransformFeedbackEXT
* CmdSetRasterizerDiscardEnableEXT
* CmdDraw
* CmdEndTransformFeedbackEXT
* CmdBeginTransformFeedbackEXT
* CmdDraw
* CmdEndTransformFeedbackEXT
would never enable transform feedback, as it only checked for the change
in rasterizer_discard state
Fixes: 4d531c67dfd ("anv: support rasterizer discard dynamic state")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15269>
Dave Airlie [Thu, 10 Mar 2022 05:01:20 +0000 (15:01 +1000)]
crocus: don't map scanout buffers as write-back
This essentially ports
64405230774210488dedbc54d73ba394ec6ae802
Author: Keith Packard <keithp@keithp.com>
Date: Fri Aug 6 16:11:18 2021 -0700
iris: Map scanout buffers WC instead of WB [v2]
to crocus.
Fixes: f3630548f1da ("crocus: initial gallium driver for Intel gfx 4-7")
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15313>
Mike Blumenkrantz [Wed, 9 Mar 2022 16:48:51 +0000 (11:48 -0500)]
llvmpipe: fix occlusion queries with early depth test
for genuine early depth tests, the samplecount must be updated after depth
test but before samplemask is applied
for inferred-early or regular depth tests, the samplemask can be applied
before the depth test
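The ordering rule above can be modelled with plain bit masks. This is a simplified sketch, not llvmpipe's generated fragment code: all names are hypothetical, and each mask holds one bit per sample of a single pixel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count the set bits in a per-pixel sample mask. */
static unsigned
count_samples(uint32_t mask)
{
   unsigned n = 0;
   for (; mask; mask &= mask - 1)
      n++;
   return n;
}

/*
 * coverage:    samples covered by the primitive
 * depth_pass:  samples that pass the depth test
 * shader_mask: sample mask written by the fragment shader
 *
 * Returns how many samples this pixel adds to the occlusion query.
 */
static unsigned
occlusion_samples(uint32_t coverage, uint32_t depth_pass,
                  uint32_t shader_mask, bool genuine_early_depth)
{
   if (genuine_early_depth) {
      /* Count right after the depth test; the shader's sample mask is
       * applied afterwards and must not reduce the count. */
      return count_samples(coverage & depth_pass);
   }

   /* Inferred-early or regular depth test: the sample mask may be
    * applied before the depth test, so it does affect the count. */
   return count_samples(coverage & shader_mask & depth_pass);
}
```

For a fully covered 4-sample pixel whose shader writes a mask of 0x1, the genuine early case counts 4 samples and the other case counts 1.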
Fixes: d9276ae965a ("llvmpipe: handle gl_SampleMask writing.")
fixes:
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_samples_4
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15319>
Jason Ekstrand [Thu, 10 Mar 2022 17:02:08 +0000 (11:02 -0600)]
lavapipe: Use the common vk_enqueue_CmdBindDescriptorSets
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15329>
Jason Ekstrand [Thu, 10 Mar 2022 16:56:30 +0000 (10:56 -0600)]
lavapipe: Reference count pipeline layouts
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15329>
Jason Ekstrand [Thu, 10 Mar 2022 16:52:07 +0000 (10:52 -0600)]
lavapipe: Allocate descriptor set layouts with DEVICE scope
Because they can come and go at any time, we can't use OBJECT scope
because that might confuse the client allocator. Instead, use DEVICE
scope and always allocate off the device allocator.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15329>
Jason Ekstrand [Wed, 9 Mar 2022 21:27:33 +0000 (15:27 -0600)]
vulkan/cmd_queue: Add a common vk_cmd_enqueue_CmdBindDescriptorSets
In order for this to work, the driver must reference-count pipeline
layouts so we can take a reference while the command is in the queue.
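A minimal sketch of why the reference is needed, assuming hypothetical types and a plain non-atomic counter (a real driver would need an atomic one): the queued command holds its own reference, so the layout outlives a destroy call issued by the application while the command is still queued.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted pipeline layout. */
struct layout {
   unsigned refcount;
};

/* Hypothetical queued bind command that keeps the layout alive. */
struct bind_cmd {
   struct layout *layout;
};

static struct layout *
layout_create(void)
{
   struct layout *l = calloc(1, sizeof(*l));
   if (!l)
      return NULL;
   l->refcount = 1; /* reference owned by the creator */
   return l;
}

static struct layout *
layout_ref(struct layout *l)
{
   assert(l->refcount > 0);
   l->refcount++;
   return l;
}

static void
layout_unref(struct layout *l)
{
   assert(l->refcount > 0);
   if (--l->refcount == 0)
      free(l);
}

/* Recording the command takes a reference for the queue entry. */
static void
enqueue_bind(struct bind_cmd *cmd, struct layout *l)
{
   cmd->layout = layout_ref(l);
}

/* Freeing the queued command drops that reference again. */
static void
free_bind(struct bind_cmd *cmd)
{
   layout_unref(cmd->layout);
   cmd->layout = NULL;
}
```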
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15329>
Jason Ekstrand [Wed, 9 Mar 2022 21:15:40 +0000 (15:15 -0600)]
vulkan/cmd_queue: Add a driver_free_cb hook
If a driver sets driver_data but not driver_free_cb, driver_data will
get freed along with the command. If a driver sets driver_free_cb,
driver_data will not get automatically freed but the callback will get
called before the rest of the data structure is freed.
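The two ownership modes can be sketched as below. The struct and function names are hypothetical, the callback signature is simplified, and malloc/free stand in for the Vulkan allocator that the real code would use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical queue entry carrying optional driver-owned data. */
struct cmd_entry {
   void *driver_data;
   void (*driver_free_cb)(struct cmd_entry *cmd);
};

static void
cmd_entry_free(struct cmd_entry *cmd)
{
   if (cmd->driver_free_cb) {
      /* The driver takes over cleanup; driver_data is not freed here. */
      cmd->driver_free_cb(cmd);
   } else {
      /* Default: driver_data is freed with the command.
       * free(NULL) is a no-op, so an unset pointer is fine. */
      free(cmd->driver_data);
   }
   free(cmd);
}

/* Example callback: counts invocations and releases driver_data itself. */
static int release_count;

static void
count_release(struct cmd_entry *cmd)
{
   release_count++;
   free(cmd->driver_data);
}
```

The callback runs while the entry is still intact, so it can inspect the command before the rest of the structure goes away.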
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15329>
Mike Blumenkrantz [Thu, 10 Mar 2022 16:17:40 +0000 (11:17 -0500)]
lavapipe: ci updates
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15322>
Mike Blumenkrantz [Thu, 10 Mar 2022 16:16:03 +0000 (11:16 -0500)]
lavapipe: run nir_opt_copy_prop_vars during optimization loop
this enables better elimination of operations
fixes:
dEQP-VK.graphicsfuzz.spv-stable-mergesort-flatten-selection-dead-continues
fixes #5458
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15322>
Mike Blumenkrantz [Thu, 10 Mar 2022 19:25:51 +0000 (14:25 -0500)]
lavapipe: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15320>
Mike Blumenkrantz [Wed, 9 Mar 2022 19:26:19 +0000 (14:26 -0500)]
lavapipe: skip format checks for EXTENDED_USAGE
we can effectively skip any kind of checks here and just assume that one
of two scenarios is in effect:
* the user is about to attempt some incredibly illegal behavior that VVL will catch
* the user is about to attempt a pro gamer move and we'll be fine
in either case, it's EXTENDED_USAGE, so hopefully we're about to make a texture
view from a compatible and supported format
cc: mesa-stable
fixes:
dEQP-VK.image.extended_usage_bit_compatibility.image_format_properties*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15320>
Mike Blumenkrantz [Thu, 10 Mar 2022 19:05:32 +0000 (14:05 -0500)]
lavapipe: use the correct value for dynamic render resolve attachment indexing
subpass->color_count is (obviously) not set yet, so this would just clobber
the color attachments any time resolves were used
Fixes: 8a6160a3542 ("lavapipe: VK_KHR_dynamic_rendering")
fixes:
dEQP-VK.draw.dynamic_rendering.multiple_interpolation.structured.with_sample_decoration.4_samples
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15330>
Dave Airlie [Wed, 9 Mar 2022 06:31:46 +0000 (16:31 +1000)]
lavapipe: remove broken workaround for zink depth texturing.
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15297>
Dave Airlie [Thu, 10 Mar 2022 03:07:37 +0000 (13:07 +1000)]
zink: workaround depth texture mode alpha.
Since SPIR-V only has single-channel depth sampling, it breaks
with the old-school GL_ALPHA depth mode swizzle, so just detect
that case and smash all the channels.
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15297>
Connor Abbott [Tue, 7 Dec 2021 11:11:31 +0000 (12:11 +0100)]
tu: Expose subgroup arithmetic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Tue, 7 Dec 2021 11:11:05 +0000 (12:11 +0100)]
ir3: Add support for subgroup arithmetic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Mon, 10 Jan 2022 12:34:16 +0000 (13:34 +0100)]
ir3: Track physical edges when inserting (ss) for shared regs
Normally this wouldn't matter, but it will matter for the upcoming scan
macro because the running tally is communicated through a shared
register across a physical edge. It may also matter if a live-range
split occurs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Mon, 10 Jan 2022 13:01:34 +0000 (14:01 +0100)]
util/bitset: Fix off-by-one in __bitset_set_range
Fixes: b3b03e33c9f ("util/bitset: add BITSET_SET_RANGE(..)")
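The message does not say where the off-by-one was, so the following is only a generic sketch of setting a bit range across 32-bit words. It assumes both ends of the range are inclusive; the function name is hypothetical and this is not the Mesa macro.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 32u

/* Set bits [start, end], both inclusive, in an array of 32-bit words. */
static void
bitset_set_range(uint32_t *words, unsigned start, unsigned end)
{
   assert(start <= end);

   for (unsigned w = start / WORD_BITS; w <= end / WORD_BITS; w++) {
      /* First and last bit to set within this word. */
      unsigned lo = (w == start / WORD_BITS) ? start % WORD_BITS : 0;
      unsigned hi = (w == end / WORD_BITS) ? end % WORD_BITS : WORD_BITS - 1;

      /* Bits lo..hi set; both shift counts stay in 0..31, so there is
       * never a shift by the full word width. */
      uint32_t mask = (UINT32_MAX >> (WORD_BITS - 1 - hi)) & (UINT32_MAX << lo);
      words[w] |= mask;
   }
}
```

The edge cases worth checking are a range that ends exactly on the last bit of a word and a range that spans a word boundary.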
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Thu, 10 Mar 2022 10:24:44 +0000 (11:24 +0100)]
ir3/spill: Mark reload destination as early-clobber
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Fri, 3 Dec 2021 11:10:04 +0000 (12:10 +0100)]
ir3/ra: Add IR3_REG_EARLY_CLOBBER
We'll need this to model the subgroup reduction macros.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Thu, 2 Dec 2021 18:41:40 +0000 (19:41 +0100)]
ir3/ra: Add proper support for multiple destinations
We weren't considering the other destinations when allocating a
destination, so we could allocate overlapping destinations. This wasn't
done before because we never had a need for it, but the subgroup
reduction macros will need it.
The trickiest part of this is that we have to rewrite the
compress_regs_left fallback, because we may have to move around the
other already-allocated destinations. We now have a list of destinations
to (re)allocate in addition to the popped live intervals. For the rest
of the destination handling, we can just bail out if the proposed spot
for something overlaps another destination, but for the fallback we have
to handle all the cases gracefully. I also added support for odd
combinations of multiple destinations where some of them are tied, which
we'll use in the next commit to handle early-clobber destinations and
which will actually be used because one of the destinations of the
subgroup reduction macro will be early-clobber. The result is that the
order of intervals to allocate is now a lot more complicated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Mon, 10 Jan 2022 17:16:05 +0000 (18:16 +0100)]
ir3/ra: Sanitize parallel copy flags better
For pcopies we only care about the register's type, i.e. whether it's a
half-register and whether it's an array (plus its size). Copying over
other flags like IR3_REG_RELATIV just leads to sadness and validator
assertions.
Fixes: 0ffcb19b9d9 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Thu, 2 Dec 2021 18:41:12 +0000 (19:41 +0100)]
ir3/ra: Fix ra_foreach_dst_n
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Thu, 2 Dec 2021 13:48:08 +0000 (14:48 +0100)]
ir3/ra: Fix tied destination handling with multiple destinations
Before, we were careful to
1. Get the source physreg.
2. Allocate the destination.
3. Insert a copy with the source being the physreg from step 1.
and this guaranteed that if the tied source were moved in step 2 we'd
still insert a copy from the correct place. However this won't work with
multiple destinations because an earlier destination could've already
moved the tied source around. Instead flip steps 2 and 3 (we'll insert
the copy before we allocate the interval, but that's ok) and run the
first two steps in a separate loop before any destinations are
allocated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Wed, 1 Dec 2021 15:33:57 +0000 (16:33 +0100)]
ir3/sched: Support multiple destinations
Note: this is a behavior change for arrays, because it will count the
entire array instead of just the components written in the register
pressure calculation. However this is more accurate since this matches
how RA works.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Wed, 1 Dec 2021 15:24:46 +0000 (16:24 +0100)]
ir3/dce: Support multiple destinations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Connor Abbott [Wed, 1 Dec 2021 15:13:31 +0000 (16:13 +0100)]
ir3/cp_postsched: Support multiple destinations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>
Jason Ekstrand [Wed, 9 Mar 2022 20:15:58 +0000 (14:15 -0600)]
vulkan,lavapipe: Move some enqueue helpers to common code
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Boris Brezillon [Wed, 9 Mar 2022 12:04:43 +0000 (13:04 +0100)]
lavapipe: Re-use auto-generated vk_cmd_enqueue entrypoints
Re-use auto-generated vk_cmd_enqueue entrypoints instead of generating
our own version doing the same thing. In order to effectively do this,
we also add an allow-list of which entrypoints lavapipe actually handles
to avoid issues where the autogenerated one stomps a vkCmdFoo2 wrapper.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Jason Ekstrand [Thu, 10 Mar 2022 02:14:44 +0000 (20:14 -0600)]
lavapipe: Reset the free_cmd_buffers list in TrimCommandPool
We delete all the command buffers but they're still in the list so
future allocations may try to re-use them post-free and another trim
will re-delete them.
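A minimal sketch of the bug class, using a hypothetical singly linked free list rather than lavapipe's actual list type: trimming must empty the list as well as free its entries, otherwise later allocations or a second trim touch freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical command buffer and pool with a list of recycled buffers. */
struct cmd_buffer {
   struct cmd_buffer *next;
};

struct cmd_pool {
   struct cmd_buffer *free_list;
};

/* Return a buffer to the pool for later re-use. */
static void
pool_recycle(struct cmd_pool *pool, struct cmd_buffer *cb)
{
   cb->next = pool->free_list;
   pool->free_list = cb;
}

/* Re-use a recycled buffer if there is one, otherwise allocate. */
static struct cmd_buffer *
pool_alloc(struct cmd_pool *pool)
{
   struct cmd_buffer *cb = pool->free_list;
   if (cb) {
      pool->free_list = cb->next;
      cb->next = NULL;
      return cb;
   }
   return calloc(1, sizeof(*cb));
}

/* Free every recycled buffer and reset the list head. */
static void
pool_trim(struct cmd_pool *pool)
{
   struct cmd_buffer *cb = pool->free_list;
   while (cb) {
      struct cmd_buffer *next = cb->next;
      free(cb);
      cb = next;
   }
   /* Without this reset the list would still point at freed buffers. */
   pool->free_list = NULL;
}
```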
Fixes: b38879f8c5f5 ("vallium: initial import of the vulkan frontend")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Jason Ekstrand [Tue, 8 Mar 2022 22:45:55 +0000 (16:45 -0600)]
vulkan/cmd_queue: Generate enqueue entrypoints
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Boris Brezillon [Tue, 1 Feb 2022 16:04:42 +0000 (17:04 +0100)]
vulkan/cmd_queue: Properly deconstify array of pointers
When manipulating an array of pointers, what we want to deconstify is
the array, not the entry type.
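One way to read this in plain C, assuming a parameter declared as a const array of pointers to const entries: the queue's copy needs a writable array, while the entries it points to stay const. The type and function names below are hypothetical and this is not the generator's output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type. */
struct info {
   int value;
};

/*
 * src is "const struct info *const *": neither the array slots nor the
 * entries may be written through it. The copy is "const struct info **":
 * the array slots are writable, the entries are still const.
 */
static const struct info **
copy_pointer_array(const struct info *const *src, size_t count)
{
   const struct info **dst = malloc(count * sizeof(*dst));
   if (!dst)
      return NULL;
   memcpy(dst, src, count * sizeof(*dst));
   return dst;
}
```

Dropping const from the entry type instead would give a read-only array of pointers to writable entries, which does not allow filling in the copied array.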
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Jason Ekstrand [Tue, 8 Mar 2022 22:55:50 +0000 (16:55 -0600)]
vulkan/cmd_queue: Stop generating enqueue helpers for INTEL perf queries
They don't return void and they're not used by anyone except the Intel
drivers so there's no point in supporting them.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Jason Ekstrand [Tue, 8 Mar 2022 22:48:03 +0000 (16:48 -0600)]
vulkan/cmd_queue: Re-flow MANUAL_COMMANDS
This just makes it all a bit easier to read.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Boris Brezillon [Wed, 2 Feb 2022 15:13:20 +0000 (16:13 +0100)]
vulkan/cmd_queue: Remove duplicate entries in MANUAL_COMMANDS
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Louis-Francis Ratté-Boulianne [Tue, 14 Dec 2021 14:50:35 +0000 (15:50 +0100)]
vulkan/runtime: Add a vk_cmd_queue object to vk_command_buffer
This is paving the road for generic secondary command buffer support,
where commands are simply recorded in a software queue and replayed
on the primary command buffer when vkCmdExecuteCommands() is called.
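The record-and-replay idea can be sketched with a small software queue. This is an illustration under assumed names, not the vk_cmd_queue API: commands are appended while recording and later handed, in order, to a callback that stands in for the primary command buffer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical command kinds and queue entry. */
enum cmd_type { CMD_DRAW, CMD_DISPATCH };

struct cmd {
   enum cmd_type type;
   unsigned arg;
   struct cmd *next;
};

struct cmd_queue {
   struct cmd *head;
   struct cmd **tail; /* points at the link to fill in next */
};

static void
queue_init(struct cmd_queue *q)
{
   q->head = NULL;
   q->tail = &q->head;
}

/* Record one command; returns 0 on success, -1 on allocation failure. */
static int
queue_record(struct cmd_queue *q, enum cmd_type type, unsigned arg)
{
   struct cmd *c = malloc(sizeof(*c));
   if (!c)
      return -1;
   c->type = type;
   c->arg = arg;
   c->next = NULL;
   *q->tail = c;
   q->tail = &c->next;
   return 0;
}

typedef void (*replay_fn)(void *target, const struct cmd *cmd);

/* Replay every recorded command, in recording order, onto target. */
static void
queue_replay(const struct cmd_queue *q, replay_fn fn, void *target)
{
   for (const struct cmd *c = q->head; c; c = c->next)
      fn(target, c);
}

/* Free all recorded commands and leave the queue empty. */
static void
queue_finish(struct cmd_queue *q)
{
   struct cmd *c = q->head;
   while (c) {
      struct cmd *next = c->next;
      free(c);
      c = next;
   }
   queue_init(q);
}

/* Example replay target: sums the arguments of all commands. */
static void
sum_args(void *target, const struct cmd *cmd)
{
   *(unsigned *)target += cmd->arg;
}
```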
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Louis-Francis Ratté-Boulianne [Tue, 14 Dec 2021 14:51:16 +0000 (15:51 +0100)]
vulkan/cmd_queue: Add an initializer for the vk_cmd_queue object
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Boris Brezillon [Wed, 2 Feb 2022 13:00:33 +0000 (14:00 +0100)]
vulkan/cmd_queue: Constify vk_cmd_queue.alloc
The implementation shouldn't modify the allocator.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311>
Mike Blumenkrantz [Wed, 9 Mar 2022 02:52:53 +0000 (21:52 -0500)]
lavapipe: add the full list of cts fails
easier to keep track this way
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15294>
Akihiko Odaki [Tue, 22 Feb 2022 10:02:24 +0000 (19:02 +0900)]
virgl: Check texture multisample compatibility
v2: Support VIRGL_FORMAT_NONE (Gert Wollny)
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Suggested-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15115>
Akihiko Odaki [Wed, 9 Mar 2022 09:20:09 +0000 (18:20 +0900)]
virgl/ci: Uprev virglrenderer
Suggested-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15115>
Danylo Piliaiev [Thu, 30 Dec 2021 17:59:46 +0000 (19:59 +0200)]
tu: Implement VK_EXT_depth_clip_control
Since negativeOneToOne is a static property of the pipeline and
viewport state could be dynamic, we have to defer viewport state
emission until negativeOneToOne value is known.
See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6070
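The dependency comes from the viewport depth transform itself. The sketch below follows the formulas in the Vulkan specification for this extension; the struct and function names are hypothetical and it is not the turnip code, only an illustration of why the viewport cannot be emitted before the flag is known.

```c
#include <assert.h>
#include <stdbool.h>

/* Window depth is computed as z_w = scale * z_ndc + offset. */
struct depth_transform {
   float scale;
   float offset;
};

static struct depth_transform
viewport_depth_transform(float min_depth, float max_depth,
                         bool negative_one_to_one)
{
   struct depth_transform t;

   if (negative_one_to_one) {
      /* Clip-space z in [-1, 1] maps onto [min_depth, max_depth]. */
      t.scale = 0.5f * (max_depth - min_depth);
      t.offset = 0.5f * (max_depth + min_depth);
   } else {
      /* Default Vulkan convention: clip-space z in [0, 1]. */
      t.scale = max_depth - min_depth;
      t.offset = min_depth;
   }
   return t;
}
```

A dynamic viewport set before the pipeline is bound therefore has two possible encodings, and the choice is only settled once the pipeline's negativeOneToOne value is available.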
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14363>
Iago Toral Quiroga [Wed, 9 Mar 2022 14:58:05 +0000 (15:58 +0100)]
broadcom/compiler: remove unused functions
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15302>
Dylan Baker [Wed, 9 Mar 2022 21:47:07 +0000 (13:47 -0800)]
docs: add release notes for 22.0.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15312>
Dylan Baker [Wed, 9 Mar 2022 21:56:14 +0000 (13:56 -0800)]
docs: Add calendar entries for 22.0 release.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15312>
Dylan Baker [Wed, 9 Mar 2022 21:55:38 +0000 (13:55 -0800)]
docs: update calendar and link releases notes for 22.0.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15312>
Timur Kristóf [Tue, 8 Mar 2022 16:00:31 +0000 (17:00 +0100)]
ac: Query the amdgpu MEC firmware version.
MEC (Micro Engine Compute) is the firmware which is responsible for
the compute-only queues on AMD GPUs. It is present on GFX7 and newer.
This patch will query the version of this firmware and print it
among the others.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15283>
Rob Clark [Wed, 9 Mar 2022 19:32:38 +0000 (11:32 -0800)]
mesa: Fix discard_framebuffer for fbo vs winsys
GL is annoying when it comes to having different enums for winsys vs
fbo.
Note that the issue this closes was only accidentally exposed by a
change that affected whether the sysmem or GMEM path is taken.
Fixes: db2ae511210 ("mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6103
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15308>
Emma Anholt [Mon, 24 Jan 2022 05:40:45 +0000 (21:40 -0800)]
docs/ci: Add docs for using a POE switch to control boards, like nouveau.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Emma Anholt [Mon, 24 Jan 2022 05:39:02 +0000 (21:39 -0800)]
docs/ci: Update some bare-metal CI docs.
We haven't been using initramfs in a long time, so don't point people in
that direction. Do point people at existing instances of these CI
variants, though.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Emma Anholt [Fri, 3 Dec 2021 00:14:30 +0000 (16:14 -0800)]
ci/nouveau: Add a manual run for the Jetson Nano (GM20B).
The test suite is full of flakes around transform feedback, atomics, and
tess. But, I hope it can be useful for regression testing core Mesa
reworks.
This required updating the kernel to 5.16.12 to get a more stable boot
process. That kernel rebuild caused an update of the container with
piglit that was missed in a previous MR, so we got new xfails in x86
swrast.
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> (nouveau)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Emma Anholt [Fri, 3 Dec 2021 00:14:30 +0000 (16:14 -0800)]
ci/nouveau: Add nouveau support to the rootfs.
This required updating the kernel to 5.16.12 to get a more stable boot
process. That kernel rebuild caused an update of the container with
piglit that was missed in a previous MR, so we got new xfails in x86
swrast. Also, including modules on arm64 exposed a bug in v3d's
poe-powered.sh rsyncing of modules.
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Emma Anholt [Tue, 25 Jan 2022 23:27:03 +0000 (15:27 -0800)]
ci: Stop xz-compressing firmware for ramdisks.
This ends up breaking nouveau because the renames break symlinks in the
firmware directory structure. We don't need it any more since we stopped
doing ramdisks.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Emma Anholt [Mon, 28 Feb 2022 22:30:31 +0000 (14:30 -0800)]
ci/bare-metal: Increase maximum retry count for POE boots.
The manual jetson CI job I'm introducing has serious boot reliability
trouble, but also we've seen frequent intermittent failures on bcm where
at least 2 boots don't seem to be enough (#6041).
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Emma Anholt [Sat, 22 Jan 2022 00:46:48 +0000 (16:46 -0800)]
ci/bare-metal: Drop the BM_POE_USERNAME/PASSWORD env var checks.
They're unused since the transition to SNMP in the rpi test farm.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15201>
Mike Blumenkrantz [Wed, 9 Mar 2022 18:44:16 +0000 (13:44 -0500)]
zink: lower dmod on AMD hardware
this hardware won't return the correct value from dmod instructions,
so lower it to ensure that cts passes
nobody else will ever hit this, so perf isn't an issue and regular fmod
can be left alone
fixes (amd):
KHR-GL46.gpu_shader_fp64.builtin.mod_d*
Fixes: 5fae35fb17d6d89c4fe1d9d5a19d827caf25b9fc ('zink: fix 64bit float shader ops')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15306>
Mike Blumenkrantz [Wed, 9 Mar 2022 14:20:06 +0000 (09:20 -0500)]
zink: add another radv fail
it looks like this one was erroneously excluded
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15307>
Mike Blumenkrantz [Wed, 9 Mar 2022 13:49:51 +0000 (08:49 -0500)]
zink: update radv fails
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15307>
Chia-I Wu [Sat, 5 Mar 2022 06:38:25 +0000 (22:38 -0800)]
venus: add VK_EXT_vertex_attribute_divisor
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Chia-I Wu [Sat, 5 Mar 2022 06:56:03 +0000 (22:56 -0800)]
venus: add VK_EXT_shader_stencil_export
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Chia-I Wu [Sat, 5 Mar 2022 06:59:44 +0000 (22:59 -0800)]
venus: add VK_EXT_robustness2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Chia-I Wu [Sat, 5 Mar 2022 06:15:10 +0000 (22:15 -0800)]
venus: add VK_EXT_depth_clip_enable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Chia-I Wu [Sat, 5 Mar 2022 07:03:51 +0000 (23:03 -0800)]
venus: add VK_EXT_conservative_rasterization
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Chia-I Wu [Sat, 5 Mar 2022 06:56:32 +0000 (22:56 -0800)]
venus: add VK_EXT_shader_demote_to_helper_invocation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Chia-I Wu [Sat, 5 Mar 2022 05:52:44 +0000 (21:52 -0800)]
venus: update venus-protocol headers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15265>
Marcin Ślusarz [Thu, 9 Dec 2021 16:10:55 +0000 (17:10 +0100)]
anv: include Primitive Header in mesh shader per-primitive output
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
Marcin Ślusarz [Thu, 9 Dec 2021 16:13:29 +0000 (17:13 +0100)]
anv: set number of viewports in clip state (mesh)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
Marcin Ślusarz [Thu, 9 Dec 2021 16:11:01 +0000 (17:11 +0100)]
intel/compiler: mark some variables as per-primitive in FS if they come from MS
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
Marcin Ślusarz [Tue, 1 Feb 2022 17:09:52 +0000 (18:09 +0100)]
intel/compiler: handle ViewportIndex, PrimitiveID and Layer in MUE setup
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
Marcin Ślusarz [Tue, 1 Feb 2022 17:08:49 +0000 (18:08 +0100)]
intel/compiler: inject MUE initialization
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
Marcin Ślusarz [Thu, 9 Dec 2021 15:51:41 +0000 (16:51 +0100)]
intel/compiler: shift mesh urb read/write window when offset is too large
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15303>
Samuel Pitoiset [Wed, 9 Mar 2022 10:09:08 +0000 (11:09 +0100)]
aco: always emit vk_cvt_pkrtz_f16_f32 for nir_op_pack_half_2x16_split
From the VK_KHR_shader_float_controls extension:
"5) Do any of the “Pack” GLSL.std.450 instructions count as
conversion instructions and have the rounding mode applied?"
"RESOLVED: No, only instructions listed in “section 3.32.11.
Conversion Instructions” of the SPIR-V specification count as
conversion instructions."
This is also the same logic as the LLVM backend.
No fossils-db changes on Sienna Cichlid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15301>
Erik Faye-Lund [Wed, 9 Mar 2022 14:40:25 +0000 (15:40 +0100)]
docs: improve language in zink article
Turns out, this was not proper use of language!
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15300>
Erik Faye-Lund [Wed, 9 Mar 2022 11:03:37 +0000 (12:03 +0100)]
docs: fixup zink gl 4.3 requirements
The multiViewport feature isn't required for GL 4.3, it's required for
GL 4.1. Technically speaking, we could have just dropped it because we
already list the maxViewports requirement. But to me it seems better to
be very clear here.
Fixes: 29f8f21bff6 ("docs: document zink GL 4.3 requirements")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15300>
Iago Toral Quiroga [Mon, 7 Mar 2022 15:27:02 +0000 (16:27 +0100)]
broadcom/compiler: don't always assign r5 if available
Instead, only favor assigning r5 if we have first decided to
assign an accumulator. This helps with assigning r5 to short-lived
uniforms, favoring accumulator rotation to facilitate
QPU merges.
total instructions in shared programs: 12656164 -> 12628339 (-0.22%)
instructions in affected programs: 5368373 -> 5340548 (-0.52%)
helped: 17420
HURT: 9996
total uniforms in shared programs: 3704776 -> 3704863 (<.01%)
uniforms in affected programs: 12247 -> 12334 (0.71%)
helped: 23
HURT: 78
total max-temps in shared programs: 2153505 -> 2152684 (-0.04%)
max-temps in affected programs: 26468 -> 25647 (-3.10%)
helped: 569
HURT: 328
total fills in shared programs: 4656 -> 4657 (0.02%)
fills in affected programs: 43 -> 44 (2.33%)
helped: 0
HURT: 1
total sfu-stalls in shared programs: 34728 -> 34403 (-0.94%)
sfu-stalls in affected programs: 3411 -> 3086 (-9.53%)
helped: 842
HURT: 534
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:03:03 +0000 (14:03 +0100)]
broadcom/compiler: add comment on why we don't use r5 with ldunifa
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:47:55 +0000 (14:47 +0100)]
broadcom/compiler: adjust register threshold for 2-thread compiles
We have twice the registers in this case so it makes sense to double
this as well. While this causes slight regressions in shader-db
stats (due to additional register pressure), it helps us hide latency
of memory reads better on 2-thread compiles, where the thread switch
mechanism will be less effective. This shows a ~3% performance
improvement on the UE4 SunTemple demo.
total instructions in shared programs: 12642413 -> 12656164 (0.11%)
instructions in affected programs: 2272652 -> 2286403 (0.61%)
helped: 2924
HURT: 3389
total uniforms in shared programs: 3703861 -> 3704776 (0.02%)
uniforms in affected programs: 213729 -> 214644 (0.43%)
helped: 823
HURT: 1272
total max-temps in shared programs: 2150686 -> 2153505 (0.13%)
max-temps in affected programs: 191332 -> 194151 (1.47%)
helped: 1900
HURT: 1891
total spills in shared programs: 3255 -> 3274 (0.58%)
spills in affected programs: 166 -> 185 (11.45%)
helped: 3
HURT: 6
total fills in shared programs: 4630 -> 4656 (0.56%)
fills in affected programs: 367 -> 393 (7.08%)
helped: 7
HURT: 15
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:42:39 +0000 (14:42 +0100)]
broadcom/compiler: add a strategy to disable scheduling of general TMU reads
This can add quite a bit of register pressure so it makes sense to disable it
to prevent us from dropping to 2 threads or increasing spills:
total instructions in shared programs: 12672813 -> 12642413 (-0.24%)
instructions in affected programs: 256721 -> 226321 (-11.84%)
helped: 719
HURT: 77
total threads in shared programs: 415534 -> 416322 (0.19%)
threads in affected programs: 788 -> 1576 (100.00%)
helped: 394
HURT: 0
total uniforms in shared programs: 3711370 -> 3703861 (-0.20%)
uniforms in affected programs: 28859 -> 21350 (-26.02%)
helped: 204
HURT: 455
total max-temps in shared programs: 2159439 -> 2150686 (-0.41%)
max-temps in affected programs: 32945 -> 24192 (-26.57%)
helped: 585
HURT: 47
total spills in shared programs: 5966 -> 3255 (-45.44%)
spills in affected programs: 2933 -> 222 (-92.43%)
helped: 192
HURT: 4
total fills in shared programs: 9328 -> 4630 (-50.36%)
fills in affected programs: 5184 -> 486 (-90.62%)
helped: 196
HURT: 0
Compared to the stats before adding scheduling of non-filtered
memory reads, we see that we have now gotten back all that was
lost and then some:
total instructions in shared programs: 12663186 -> 12642413 (-0.16%)
instructions in affected programs: 2051803 -> 2031030 (-1.01%)
helped: 4885
HURT: 3338
total threads in shared programs: 415870 -> 416322 (0.11%)
threads in affected programs: 896 -> 1348 (50.45%)
helped: 300
HURT: 74
total uniforms in shared programs: 3711629 -> 3703861 (-0.21%)
uniforms in affected programs: 158766 -> 150998 (-4.89%)
helped: 1973
HURT: 499
total max-temps in shared programs: 2138857 -> 2150686 (0.55%)
max-temps in affected programs: 177920 -> 189749 (6.65%)
helped: 2666
HURT: 2035
total spills in shared programs: 3860 -> 3255 (-15.67%)
spills in affected programs: 2653 -> 2048 (-22.80%)
helped: 77
HURT: 21
total fills in shared programs: 5573 -> 4630 (-16.92%)
fills in affected programs: 3839 -> 2896 (-24.56%)
helped: 81
HURT: 15
total sfu-stalls in shared programs: 39583 -> 38154 (-3.61%)
sfu-stalls in affected programs: 8993 -> 7564 (-15.89%)
helped: 1808
HURT: 1038
total nops in shared programs: 324894 -> 323685 (-0.37%)
nops in affected programs: 30362 -> 29153 (-3.98%)
helped: 2513
HURT: 2077
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Mon, 7 Mar 2022 13:04:19 +0000 (14:04 +0100)]
broadcom/compiler: define v3d-specific delays for NIR instructions
We do a few changes over NIR's defaults:
1. Lower delay for texture reads. Empirically, we don't observe any
benefits with delays over 50 and since this delay value is still
used by the scheduler in the "favor register pressure" case it is
beneficial to avoid overestimating it too much.
2. Adjust delay for non-filtered TMU reads to the delay selected for
texture reads.
3. In our case, UBO reads from dynamically uniform addresses don't
use the TMU and have a latency of 1 instruction in the best case
scenario or 4 at worst, so we go with 1 so we don't try to move
this early.
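The three adjustments above can be sketched as a single delay function. This is a simplified model and not the driver's code: the enum, the struct and the function name are hypothetical, picking exactly 50 for texture reads is an assumption (the text only says that values over 50 bring no benefit), and so are the default of 1 and the choice to treat non-uniform UBO reads like other TMU reads.

```c
#include <stdbool.h>

/* Hypothetical instruction classes; real NIR instructions carry far
 * more detail than this. */
enum instr_kind {
   INSTR_ALU,
   INSTR_TEXTURE,          /* filtered texture read */
   INSTR_GENERAL_TMU_READ, /* non-filtered memory read through the TMU */
   INSTR_UBO_READ,
};

struct instr {
   enum instr_kind kind;
   bool uniform_address; /* only meaningful for INSTR_UBO_READ */
};

/* Delay, in instructions, that the scheduler should assume. */
unsigned
sketch_v3d_instr_delay(const struct instr *instr)
{
   switch (instr->kind) {
   case INSTR_TEXTURE:
      return 50; /* 1. assumed value: no observed benefit above 50 */
   case INSTR_GENERAL_TMU_READ:
      return 50; /* 2. same delay as texture reads */
   case INSTR_UBO_READ:
      /* 3. uniform-address reads skip the TMU: use the best case of 1.
       * Assumption: other UBO reads behave like general TMU reads. */
      return instr->uniform_address ? 1 : 50;
   default:
      return 1; /* assumption: everything else is cheap */
   }
}
```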
This helps us get back some of what we lost when updating the
default scheduler configuration to add a delay for non-filtered
memory reads:
total instructions in shared programs: 13126587 -> 12671765 (-3.46%)
instructions in affected programs: 3764097 -> 3309275 (-12.08%)
helped: 14664
HURT: 4244
total threads in shared programs: 407208 -> 415522 (2.04%)
threads in affected programs: 8716 -> 17030 (95.39%)
helped: 4224
HURT: 67
total uniforms in shared programs: 3812698 -> 3711224 (-2.66%)
uniforms in affected programs: 335170 -> 233696 (-30.28%)
helped: 2816
HURT: 3551
total max-temps in shared programs: 2318430 -> 2159345 (-6.86%)
max-temps in affected programs: 539991 -> 380906 (-29.46%)
helped: 13173
HURT: 1440
total spills in shared programs: 49086 -> 5966 (-87.85%)
spills in affected programs: 48306 -> 5186 (-89.26%)
helped: 1655
HURT: 28
total fills in shared programs: 55810 -> 9328 (-83.29%)
fills in affected programs: 54821 -> 8339 (-84.79%)
helped: 1659
HURT: 22
LOST: 0
GAINED: 3
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Thu, 3 Mar 2022 11:18:02 +0000 (12:18 +0100)]
nir/schedule: allow drivers to decide about instruction latency
On V3D reading UBOs from uniform addresses uses a more efficient
mechanism with lower latency. On other platforms there may be
similar scenarios.
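A minimal sketch of such a hook, assuming an optional callback in the scheduler options that falls back to generic defaults when the driver does not set it. All type and function names here are hypothetical stand-ins and the delay values are placeholders; the actual NIR interface may differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scheduler instruction. */
struct sched_instr {
   int is_memory_read;
};

typedef unsigned (*instr_delay_fn)(const struct sched_instr *instr,
                                   void *user_data);

struct sched_options {
   instr_delay_fn instr_delay_cb; /* NULL: use the generic defaults */
   void *instr_delay_cb_data;
};

/* Generic fallback; placeholder values, not the real defaults. */
unsigned
generic_instr_delay(const struct sched_instr *instr)
{
   return instr->is_memory_read ? 20 : 1;
}

/* The scheduler asks the driver first and only then falls back. */
unsigned
sched_instr_delay(const struct sched_options *opts,
                  const struct sched_instr *instr)
{
   if (opts->instr_delay_cb)
      return opts->instr_delay_cb(instr, opts->instr_delay_cb_data);
   return generic_instr_delay(instr);
}

/* Example driver callback: reports the latency passed in user_data. */
unsigned
fixed_latency_cb(const struct sched_instr *instr, void *user_data)
{
   (void)instr;
   return *(const unsigned *)user_data;
}
```

Passing user data alongside the callback lets a driver base its answer on per-compile state without resorting to globals.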
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 2 Mar 2022 11:15:15 +0000 (12:15 +0100)]
nir/schedule: use larger delay for non-filtered memory reads
This has been pending for a long time. It is not very consistent to
add a significant delay for textures and not do it for UBOs, etc.
The reason we have not been doing this so far is the accumulated effect
on register pressure for V3D as shown by shader-db results below, but
from the point of view of a generic scheduler it makes sense to do this.
Later patches will address V3D specific issues with register pressure
derived from this by letting the driver control its instruction delay
settings.
total instructions in shared programs: 12662138 -> 13126587 (3.67%)
instructions in affected programs: 1813091 -> 2277540 (25.62%)
helped: 2410
HURT: 10499
total threads in shared programs: 415858 -> 407208 (-2.08%)
threads in affected programs: 17348 -> 8698 (-49.86%)
helped: 8
HURT: 4333
total uniforms in shared programs: 3711483 -> 3812698 (2.73%)
uniforms in affected programs: 128012 -> 229227 (79.07%)
helped: 3474
HURT: 2143
total max-temps in shared programs: 2138763 -> 2318430 (8.40%)
max-temps in affected programs: 318780 -> 498447 (56.36%)
helped: 588
HURT: 11997
total spills in shared programs: 3860 -> 49086 (1171.66%)
spills in affected programs: 709 -> 45935 (6378.84%)
helped: 23
HURT: 1595
total fills in shared programs: 5573 -> 55810 (901.44%)
fills in affected programs: 1067 -> 51304 (4708.25%)
helped: 23
HURT: 1595
LOST: 3
GAINED: 0
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 9 Mar 2022 11:11:55 +0000 (12:11 +0100)]
nir/schedule: handle nir_intrinsic_group_memory_barrier
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 9 Mar 2022 09:38:42 +0000 (10:38 +0100)]
nir/schedule: fix handling of generic memory barrier
We can get a generic nir_intrinsic_memory_barrier to represent a
barrier involving multiple semantics (instead of getting individual
specific barriers for each semantic). This means that we need to
consider these as potentially affecting shared memory access as well.
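The idea can be sketched as a mapping from barrier kind to the memory classes it orders, with the generic barrier treated conservatively. The enums and the function below are hypothetical and only loosely modelled on the NIR intrinsics; they do not reproduce the scheduler's actual dependency tracking.

```c
/* Hypothetical memory classes a scheduler might track separately. */
enum mem_class {
   MEM_GLOBAL = 1 << 0, /* buffers, images, ... */
   MEM_SHARED = 1 << 1, /* workgroup-shared memory */
};

/* Hypothetical barrier kinds. */
enum barrier_kind {
   BARRIER_GENERIC, /* stands in for nir_intrinsic_memory_barrier */
   BARRIER_SHARED,  /* a barrier specific to shared memory */
   BARRIER_BUFFER,  /* a barrier specific to buffer memory */
};

/* Returns a bitmask of the memory classes the barrier must order. */
unsigned
barrier_affected_memory(enum barrier_kind kind)
{
   switch (kind) {
   case BARRIER_GENERIC:
      /* May represent several semantics at once: be conservative. */
      return MEM_GLOBAL | MEM_SHARED;
   case BARRIER_SHARED:
      return MEM_SHARED;
   case BARRIER_BUFFER:
      return MEM_GLOBAL;
   }
   return MEM_GLOBAL | MEM_SHARED;
}
```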
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Iago Toral Quiroga [Wed, 2 Mar 2022 10:10:39 +0000 (11:10 +0100)]
broadcom/compiler: stop moving UBO loads before NIR scheduling
This doesn't have any significant impact on shader-db stats and would
reduce our capacity to hide latency from the loads, so it is probably
undesirable:
total instructions in shared programs: 12663189 -> 12663186 (<.01%)
instructions in affected programs: 4222 -> 4219 (-0.07%)
helped: 9
HURT: 4
total uniforms in shared programs: 3711624 -> 3711629 (<.01%)
uniforms in affected programs: 186 -> 191 (2.69%)
helped: 0
HURT: 2
total max-temps in shared programs: 2138822 -> 2138857 (<.01%)
max-temps in affected programs: 569 -> 604 (6.15%)
helped: 1
HURT: 9
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Michel Zou [Thu, 3 Mar 2022 06:12:05 +0000 (07:12 +0100)]
lavapipe: set non-zero device/driver uuid
Closes #5875
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15230>
Danylo Piliaiev [Fri, 11 Feb 2022 16:15:27 +0000 (18:15 +0200)]
turnip: Make autotuner work with reusable command buffers
To achieve this, each command buffer now has its own GPU memory.
However, the autotuner's BO usage is not optimal: the ideal
pattern would be to use a memory pool to suballocate small
GPU memory chunks, since most command buffers have only a few
renderpasses.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5990
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14996>
Gert Wollny [Wed, 3 Nov 2021 13:13:43 +0000 (14:13 +0100)]
virgl: Add a few more formats to the format table
These formats are used by the piglit
arb_texture_buffer_object-formats fs arb
Adding them here keeps the piglit test from crashing, but most of the related
tests don't pass.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13645>