Mike Blumenkrantz [Fri, 26 Jun 2020 19:16:17 +0000 (15:16 -0400)]
zink: optimize transfer_map for resources with pending reads/writes
we don't need to stall here if we know that we're not about to have any io
conflicts in the buffer
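A minimal sketch of the kind of check this implies, assuming a resource carries a bitmask of its pending batch usage. The enum names and map_needs_stall() are hypothetical and are not zink's actual definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags describing how pending batches use a resource. */
enum res_usage {
   RES_USAGE_READ  = 1u << 0,
   RES_USAGE_WRITE = 1u << 1,
};

/* Hypothetical flags describing what the CPU mapping wants to do. */
enum map_access {
   MAP_READ  = 1u << 0,
   MAP_WRITE = 1u << 1,
};

/* Returns true when the map has to wait for pending batches to finish. */
static bool
map_needs_stall(uint32_t pending_usage, uint32_t access)
{
   /* A CPU read only conflicts with pending GPU writes. */
   if ((access & MAP_READ) && (pending_usage & RES_USAGE_WRITE))
      return true;
   /* A CPU write conflicts with any pending GPU access. */
   if ((access & MAP_WRITE) && pending_usage != 0)
      return true;
   return false;
}
```

With this shape, a read-only map of a resource that is only being read by pending batches would skip the stall.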
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>
Mike Blumenkrantz [Mon, 15 Jun 2020 19:51:05 +0000 (15:51 -0400)]
zink: add a mechanism to track current resource usage in batches
this is really primitive, but it at least gives an idea of whether a
resource has been submitted for writing in a pending batch
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>
Samuel Pitoiset [Mon, 12 Oct 2020 15:56:02 +0000 (17:56 +0200)]
radv: fix ignoring the vertex attribute stride if set as dynamic
The vertex attribute stride should be ignored, so make sure it's
initialized to zero if dynamic to avoid computing a wrong offset.
The fact that each element of pStrides must be greater than or equal
to the maximum extent of all vertex input attributes fetched saves us
one user SGPR for the dynamic stride.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3627
Cc: 20.2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7101>
James Park [Wed, 14 Oct 2020 04:48:25 +0000 (21:48 -0700)]
ac,amd/llvm,radv: Initialize structs with {0}
Necessary to compile with MSVC.
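For illustration, a sketch of the portable form, assuming the code being replaced used an empty "{}" initializer (which is not standard C before C23). The struct is hypothetical:

```c
#include <stdint.h>

/* Hypothetical struct standing in for the driver structs being zeroed. */
struct example_info {
   uint32_t width;
   uint32_t height;
   const void *next;
};

static struct example_info
make_zeroed_info(void)
{
   /* "{0}" zero-initializes every member and is accepted by C89/C99
    * compilers; the remaining members are implicitly zeroed. */
   struct example_info info = {0};
   return info;
}
```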
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7123>
Samuel Pitoiset [Tue, 13 Oct 2020 12:13:15 +0000 (14:13 +0200)]
radv/aco: disable NGG GS support because it randomly hangs the GPU
Disable ACO NGG GS until the random GPU hangs are fixed
(one CTS run == one GPU hang here). No hangs so far after
5 full CTS runs with this disabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7108>
Rhys Perry [Tue, 13 Oct 2020 18:18:52 +0000 (19:18 +0100)]
nir/opt_uniform_atomics: remove useless returns
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7117>
James Park [Sat, 29 Aug 2020 18:35:29 +0000 (11:35 -0700)]
radv: Only close local_fd when valid
Necessary when drm_device is bypassed.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>
James Park [Sat, 8 Aug 2020 02:57:04 +0000 (19:57 -0700)]
util: Hide timespec_passed on Windows
Windows doesn't have clockid_t.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>
James Park [Wed, 19 Aug 2020 06:20:57 +0000 (23:20 -0700)]
radv: Increased const usage
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>
James Park [Mon, 3 Aug 2020 19:12:30 +0000 (12:12 -0700)]
amd/addrlib: Fix warning list for msvc
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>
Jason Ekstrand [Thu, 8 Oct 2020 19:41:43 +0000 (14:41 -0500)]
intel/fs: Rework scratch handling on Gen9+
The current scratch mechanism uses an MRF hack where we reserve a few
GRF registers to treat like the MRF and we collect the data into that
MRF region before doing a scratch write. We also use that region for
the header for scratch reads.
This commit changes things and gets rid of the MRF hack. Instead, we
reserve a single register (which RA is free to pick) for the scratch
header and use split sends for scratch writes to avoid having to do
the copy. This should provide RA with more freedom in the presence of
spilling as well as avoid some unnecessary data moves. In future, the
new GEN9_SCRATCH_HEADER opcode gives us a place where we can do our own
per-thread scratch base address calculations rather than depending on
the scratch base address that gets pushed into g0. Having an opcode for
this lets us do it once at the top of the shader rather than repeating
it at every read/write.
One other noticeable difference is the use of SHADER_OPCODE_SEND. We
can get away with this thanks to the fact that we're now using a set to
track which instructions are generated by spills and don't rely on the
opcodes to find spill/fill instructions. This allows us to avoid adding
more virtual opcodes and let the normal code paths handle things like
scoreboard dependencies between header setup and the SEND. It also
means that post-RA scheduling may be able to space out the header setup
MOV and the SEND for better latency hiding.
Shader-db results on Skylake:
total spills in shared programs: 12137 -> 10604 (-12.63%)
spills in affected programs: 6685 -> 5152 (-22.93%)
helped: 274
HURT: 2
total fills in shared programs: 13065 -> 11515 (-11.86%)
fills in affected programs: 9007 -> 7457 (-17.21%)
helped: 275
HURT: 1
Shader-db results on Ice Lake:
total spills in shared programs: 12482 -> 10953 (-12.25%)
spills in affected programs: 6586 -> 5057 (-23.22%)
helped: 275
HURT: 0
total fills in shared programs: 12819 -> 11234 (-12.36%)
fills in affected programs: 7867 -> 6282 (-20.15%)
helped: 274
HURT: 0
Shader-db results on Tigerlake:
total spills in shared programs: 11689 -> 10233 (-12.46%)
spills in affected programs: 4740 -> 3284 (-30.72%)
helped: 259
HURT: 0
total fills in shared programs: 10840 -> 9443 (-12.89%)
fills in affected programs: 6244 -> 4847 (-22.37%)
helped: 259
HURT: 0
Fossil-db results on Ice Lake:
Spills in all programs: 245249 -> 201633 (-17.8%)
Fills in all programs: 366066 -> 314368 (-14.1%)
More practically, this seems to give about a 0.5-1% perf boost in
Witcher 3 (DXVK) and Shadow of the Tomb Raider (Vulkan native).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Fri, 9 Oct 2020 09:27:35 +0000 (04:27 -0500)]
intel/fs/ra: Use a set to track added spill/fill instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Mon, 12 Oct 2020 20:07:25 +0000 (15:07 -0500)]
intel/fs/ra: Sanity-check our IP counts
Starting with e99081e76d4a, we don't re-construct liveness information
every time we spill a register. Instead, we're very careful to track
which instructions are spill instructions and not count those toward
the IP count so that we can continue to use the old liveness information
even though instructions have been added. This commit adds an assert
that sanity-checks that we count the same number of instructions as our
liveness information is based on.
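A rough sketch of the idea, assuming each instruction can be flagged as spill-generated. The struct and function names are hypothetical (the actual code tracks spill instructions in a set):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction record; only the spill flag matters here. */
struct inst {
   bool is_spill;
};

/* Count IPs the way the stale liveness information expects: instructions
 * added by spilling are skipped. */
static int
count_non_spill_ips(const struct inst *insts, size_t n)
{
   int ip = 0;
   for (size_t i = 0; i < n; i++) {
      if (!insts[i].is_spill)
         ip++;
   }
   return ip;
}

/* Sanity check: the count must match what liveness was computed from. */
static void
check_ip_count(const struct inst *insts, size_t n, int live_instr_count)
{
   assert(count_non_spill_ips(insts, n) == live_instr_count);
}
```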
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Thu, 8 Oct 2020 20:51:13 +0000 (15:51 -0500)]
intel/fs/ra: Store the last non-spill VGRF node
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Thu, 8 Oct 2020 19:32:30 +0000 (14:32 -0500)]
intel/fs/ra: Refactor handling of Gen7 scratch reads
The attempt at de-duplication with the gen7_read Boolean wasn't actually
saving us anything.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Thu, 8 Oct 2020 19:26:57 +0000 (14:26 -0500)]
intel/fs/ra: Increment spill_offset as part of the emit_spill loop
This makes it consistent with our handling of src.offset and with our
handling of spill_offset in emit_unspill.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Fri, 9 Oct 2020 09:13:20 +0000 (04:13 -0500)]
intel/fs: Add a SCRATCH_HEADER opcode
This opcode is responsible for setting up the buffer base address and
per-thread scratch space fields of a scratch message header. For the
most part, it's a copy of g0 but some messages need us to zero out g0.2
and the bottom bits of g0.5.
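As a sketch only, modelling g0 as eight dwords on the CPU. The helper name is hypothetical, and the 10-bit mask for "the bottom bits of g0.5" is an assumption, since the exact width is not stated here:

```c
#include <stdint.h>
#include <string.h>

#define HEADER_DWORDS 8
/* Assumption: how many low bits of g0.5 get cleared is not stated above. */
#define G0_5_LOW_BITS_MASK 0x3ffu

/* Build a scratch message header: mostly a copy of g0, with g0.2 and the
 * low bits of g0.5 zeroed. */
static void
build_scratch_header(uint32_t dst[HEADER_DWORDS],
                     const uint32_t g0[HEADER_DWORDS])
{
   memcpy(dst, g0, HEADER_DWORDS * sizeof(uint32_t));
   dst[2] = 0;
   dst[5] &= ~G0_5_LOW_BITS_MASK;
}
```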
This may actually fix a bug when nir_load/store_scratch is used. The
docs say that the DWORD scattered messages respect the per-thread
scratch size specified in gN.3[3:0] in the message header but we've been
leaving it zero. This may mean that we've been ignoring any scratch
reads/writes from a load/store_scratch intrinsic above the 1KB mark.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Mon, 12 Oct 2020 21:15:02 +0000 (16:15 -0500)]
intel/fs: Copy the PTSS from g0 for scratch reads/writes
In theory, this fixes a bug where we were dropping the PTSS bound on the
floor. The hardware docs claim that the A32 DWORD and BYTE scattered
read/write messages do a PTSS bounds check. However, in practice, it
seems that the hardware ignores the bounds check so this doesn't
actually matter. I verified this with the following couple of piglit
tests:
https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/399
In practice, this prevents the next commit from making a subtle
behavioral change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Jason Ekstrand [Mon, 12 Oct 2020 19:53:01 +0000 (14:53 -0500)]
intel/batch_decoder: Don't claim vec4 vs/gs/tcs shaders on Gen11+
Because we hard-coded the default to vec4, any platform that doesn't
have a "Dispatch Mode" field gets vec4 by default. This includes Gen11+
where vec4 is no longer a thing. Change the default so it works on
newer hardware.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>
Alejandro Piñeiro [Tue, 13 Oct 2020 21:06:35 +0000 (23:06 +0200)]
v3dv/device: Support loader interface version 3.
Port of 1e41d7f7b0855934744fe578ba4eae9209ee69f7:
"anv: Support loader interface version 3 (patch v2)"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 9 Oct 2020 10:06:06 +0000 (12:06 +0200)]
v3dv: fix buffer copies to compressed images on the blit path
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 8 Oct 2020 07:12:23 +0000 (09:12 +0200)]
v3dv: drop a couple of obsolete comments
We only expose a coherent memory heap, so invalidation and flushing
are always no-ops for us.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 6 Oct 2020 06:57:47 +0000 (08:57 +0200)]
v3dv: limit blit framebuffer dimensions to max coordinates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Mon, 5 Oct 2020 08:44:59 +0000 (10:44 +0200)]
v3dv: generate proper UUIDs for device and driver
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 1 Oct 2020 08:02:23 +0000 (10:02 +0200)]
v3dv: fix blit path for copies from 3D compressed images
The aliasing we were using was not always correct. Particularly,
for 3D images, the simulator would complain about image strides
not being large enough in some cases.
This patch fixes this by aliasing both src and dst images and
carefully choosing the alias dimensions taking into account the
format chosen for the copy and the ratio of block sizes between
both images.
Playing a bit with the image dimensions used by the relevant CTS
tests we confirmed this works well for all tile layouts (lineartile,
ublinear1/2 and UIF).
This fixes all CTS tests involving 3D image copies from compressed
formats without needing to force UIF layout for all compressed
images (which would actually not work for all image sizes either).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 25 Sep 2020 13:00:41 +0000 (15:00 +0200)]
v3dv: fixes for barriers in secondary command buffers
This patch addresses various issues, mostly from secondary command buffers
that recorded pipeline barriers that are not consumed in the secondary itself,
so they need to be applied to jobs that come right after the execution of the
secondary in a primary command buffer.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 24 Sep 2020 08:09:00 +0000 (10:09 +0200)]
v3dv: implement workaround for GFXH-1918
Loading depth with odd width/height might cause incorrect loading
of the early-Z buffer.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 24 Sep 2020 07:23:29 +0000 (09:23 +0200)]
v3dv: implement workaround for GFXH-1461
If a subpass clears one aspect of Depth/Stencil but loads the other
the clear might get lost. Fix this by emitting the clear as a draw
call instead of relying on the TLB clear.
Fixes:
dEQP-VK.renderpass.suballocation.attachment.3.307
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 23 Sep 2020 10:38:27 +0000 (12:38 +0200)]
v3dv: flag tmu_dirty_rcl in primaries when linking secondaries that have it set
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 23 Sep 2020 09:28:41 +0000 (11:28 +0200)]
v3dv: only advertise one memory type
Our current implementation is always coherent.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 23 Sep 2020 08:02:26 +0000 (10:02 +0200)]
v3dv: always program a reasonable internal depth type for copies/clears
This doesn't seem to fix anything, but it is the right thing to do.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sun, 20 Sep 2020 20:54:33 +0000 (22:54 +0200)]
v3dv/pipeline_cache: extend pipeline cache envvar
So far V3DV_ENABLE_DEFAULT_PIPELINE_CACHE only let us configure the
driver to avoid any caching through a pipeline cache.
With this change we can be more specific. The envvar is no longer a
boolean. Allowed values:
* "off": no pipeline cache at all. PipelineCache objects behave as
no-op objects.
* "no-default-cache": user PipelineCache caches nir/variants, but we
don't provide a default cache in case the user doesn't provide a
PipelineCache object, nor for internal pipelines.
* "full" (default): we provide a default PipelineCache, used when
the user doesn't provide one when creating a Pipeline, and for
internal Pipelines.
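The three values above could be parsed along these lines. This is a sketch, not the driver's code: the enum and function names are hypothetical, and falling back to "full" for unrecognized values is an assumption:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum mirroring the three documented values. */
enum pipeline_cache_mode {
   PIPELINE_CACHE_OFF,
   PIPELINE_CACHE_NO_DEFAULT,
   PIPELINE_CACHE_FULL,
};

static enum pipeline_cache_mode
parse_pipeline_cache_mode(const char *value)
{
   /* Unset envvar: "full" is the documented default. */
   if (value == NULL)
      return PIPELINE_CACHE_FULL;
   if (strcmp(value, "off") == 0)
      return PIPELINE_CACHE_OFF;
   if (strcmp(value, "no-default-cache") == 0)
      return PIPELINE_CACHE_NO_DEFAULT;
   /* Assumption: anything else is treated like the default. */
   return PIPELINE_CACHE_FULL;
}
```

A caller would pass in the result of getenv("V3DV_ENABLE_DEFAULT_PIPELINE_CACHE").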
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 19 Sep 2020 20:51:58 +0000 (22:51 +0200)]
v3dv/pipeline_cache: set a max size for the pipeline cache
We don't want to let the default pipeline cache grow without limit. We
choose a maximum number of entries that should work for all real world
applications. CTS will exceed that limit, but that is okay, as it will
prevent us from running out of memory.
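A sketch of an entry-count cap, assuming that once the limit is reached new entries are simply not added (the behavior at the limit is not spelled out here). All names are hypothetical:

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for a cache with a fixed entry budget. */
struct bounded_cache {
   unsigned count;
   unsigned max_entries;
};

/* Returns true if the caller may insert one more entry. Once the budget
 * is used up, the entry is just not cached. */
static bool
cache_try_reserve(struct bounded_cache *cache)
{
   if (cache->count >= cache->max_entries)
      return false;
   cache->count++;
   return true;
}
```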
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 10 Sep 2020 07:51:54 +0000 (09:51 +0200)]
v3d/compiler: allow to batch spills
Some shaders that need to spill hundreds of registers can take very long times
to compile as each allocation attempt spills a single register and restarts
the allocation process. We can significantly cut down these times if we allow
the compiler to spill in batches, which should be possible if we are spilling
uniforms, which is in fact the kind of spills that we do first because they
have lower cost than TMU spills.
Doing this could cause us to slightly over spill in some cases (depending on
the chosen batch size) leading to slightly worse performance, so we only
enable this behavior after we have started to spill over a certain threshold,
at which point we assume that performance won't be good and we want to
favor compilation speed instead.
v2:
- Keep it simple and just try to spill a fixed amount of registers in a
batch instead of trying to compute this dynamically based on accumulated
spills and current register pressure. (Eric).
v3:
- Check if the node is valid before doing anything with it.
- Drop the environment variable to select batch size and just fix it to 20.
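The resulting policy can be sketched as below. The batch size of 20 comes from the v3 note; the threshold value and the function name are hypothetical placeholders:

```c
#include <stdbool.h>

#define SPILL_BATCH_SIZE 20        /* fixed batch size, per the v3 note */
#define SPILL_BATCH_THRESHOLD 100  /* hypothetical: real value not stated */

/* How many registers to spill before retrying register allocation. */
static unsigned
spills_per_attempt(unsigned spills_so_far, bool spilling_uniforms)
{
   /* Batching only applies to uniform spills, and only once we have
    * already spilled enough that compile time matters more than the
    * risk of slightly over-spilling. */
   if (spilling_uniforms && spills_so_far > SPILL_BATCH_THRESHOLD)
      return SPILL_BATCH_SIZE;
   return 1;
}
```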
With this we can take this CTS test from 35 minutes down to about 3 minutes:
dEQP-VK.ssbo.layout.random.all_shared_buffer.5
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 18 Sep 2020 16:02:05 +0000 (18:02 +0200)]
v3dv: free noop job if needed when finishing the queue
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 17 Sep 2020 14:12:31 +0000 (16:12 +0200)]
v3dv: clean-up after obtaining an XCB connection
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 11 Sep 2020 10:20:20 +0000 (12:20 +0200)]
v3dv: don't leak dumb BO handles allocated for swapchain images
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Thu, 10 Sep 2020 21:59:47 +0000 (23:59 +0200)]
v3dv/meta_copy: fix TFU blitting when using 3D images
We had some code in blit_tfu to handle 3D images, but it was wrong. For
example, it executed a copy on the 3D image regardless of the depth
component copy needed. This was not detected until vk-gl-cts 1.2.4
introduced more 1D and 3D blitting tests.
Also add checks to rely on blit_shader when needed, such as when
mirroring on the depth component.
Fixes the following tests:
dEQP-VK.api.copy_and_blit.core.blit_image.simple_tests.mirror_z_3d.nearest
dEQP-VK.api.copy_and_blit.core.blit_image.simple_tests.whole_3d.nearest
dEQP-VK.api.copy_and_blit.dedicated_allocation.blit_image.simple_tests.mirror_z_3d.nearest
dEQP-VK.api.copy_and_blit.dedicated_allocation.blit_image.simple_tests.whole_3d.nearest
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 9 Sep 2020 07:18:43 +0000 (09:18 +0200)]
v3dv: honor VkPipelineDepthStencilStateCreateInfo::depthWriteEnable
Fixes:
dEQP-VK.renderpass.suballocation.subpass_dependencies.separate_channels.d24_unorm_s8_uint
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 8 Sep 2020 06:47:23 +0000 (08:47 +0200)]
v3dv: fix sampling from stencil aspect of a combined depth/stencil image
When sampling the stencil aspect we want to reinterpret the D24S8 format
as RGBA8 and read stencil values from the R component.
Fixes:
dEQP-VK.renderpass.suballocation.formats.d24_unorm_s8_uint.input.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Mon, 7 Sep 2020 22:57:31 +0000 (00:57 +0200)]
v3dv/formats: properly return unsupported for 1D compressed textures
Gets tests like the following one properly skipped:
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.1d.etc2_r8g8b8a8_unorm_block.etc2_r8g8b8a8_unorm_block.optimal_general
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Mon, 7 Sep 2020 09:09:23 +0000 (11:09 +0200)]
v3dv: signal semaphore/fence if needed after acquiring a swapchain image
Fixes:
dEQP-VK.wsi.*.swapchain.acquire.too_many
dEQP-VK.wsi.*.swapchain.acquire.too_many_timeout
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 2 Sep 2020 05:58:28 +0000 (07:58 +0200)]
v3dv: do not expose VK_IMAGE_USAGE_SAMPLED_BIT for swapchains
The display pipeline on the Rpi4 requires that images are linear and the
3D pipeline cannot sample from linear images.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 1 Sep 2020 11:02:32 +0000 (13:02 +0200)]
v3dv: fix size computed by vkGetImageSubresourceLayout for 3D images
Fixes:
dEQP-VK.image.subresource_layout.3d.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 1 Sep 2020 11:01:49 +0000 (13:01 +0200)]
v3dv: fix offset computed by vkGetImageSubresourceLayout for array images
Fixes:
dEQP-VK.image.subresource_layout.2d_array.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 1 Sep 2020 06:56:47 +0000 (08:56 +0200)]
v3dv: expose DRM modifiers based on supported features
So far we have only been exposing linear for WSI formats and UIF on
everything else, but we should instead expose linear or UIF based
on whether the underlying format supports any features for the given
layout.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 1 Sep 2020 06:51:37 +0000 (08:51 +0200)]
v3dv: handle VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO
When negotiating DRM modifiers, applications may use this to validate the
features that are supported with a particular modifier. The WSI code in
Mesa relies on this to validate its modifiers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 29 Aug 2020 00:19:19 +0000 (02:19 +0200)]
v3dv/meta_copy: handle mirroring of the z component when blitting 3D images
We do this by basing the tex_coord on the max layer instead of the min
(similar to what we do for mirroring x/y).
This avoids all crashes and gets most of the following tests to pass:
dEQP-VK.api.copy_and_blit.core.blit_image.simple_tests.mirror_z_3d.*
The only one failing is this one:
dEQP-VK.api.copy_and_blit.core.blit_image.simple_tests.mirror_z_3d.nearest
but it looks like the root cause is different, as there are other
3D nearest tests failing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 27 Aug 2020 10:44:54 +0000 (12:44 +0200)]
v3dv: fix color clear pipeline destruction for 32-bit architectures
Command buffer object destruction callbacks take 64-bit object
handles, but we defined the color clear pipeline callback to take
a 32-bit argument.
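To illustrate the mismatch, a sketch assuming destruction callbacks share a signature that takes a 64-bit handle. The typedef and helpers are hypothetical, not the driver's actual declarations:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: the object handle is always 64 bits wide,
 * even when pointers are only 32 bits. */
typedef void (*destroy_cb)(void *device, uint64_t handle);

/* Convert through uintptr_t so the casts are well-defined on both 32-bit
 * and 64-bit targets. */
static uint64_t
handle_from_ptr(const void *ptr)
{
   return (uint64_t)(uintptr_t)ptr;
}

static void *
ptr_from_handle(uint64_t handle)
{
   return (void *)(uintptr_t)handle;
}

/* A callback that matches the 64-bit signature. Declaring the second
 * parameter as a 32-bit type would not match destroy_cb, and calling it
 * through that pointer type would be undefined behavior. */
static void
mark_destroyed_cb(void *device, uint64_t handle)
{
   (void)device;
   int *destroyed = ptr_from_handle(handle);
   *destroyed = 1;
}
```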
Should fix recent crash regressions with some CTS tests on Rpi4.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 27 Aug 2020 08:48:29 +0000 (10:48 +0200)]
v3dv: hook up robust buffer access
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 27 Aug 2020 08:05:53 +0000 (10:05 +0200)]
v3d/compiler: add a lowering pass for robust buffer access
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 26 Aug 2020 10:01:27 +0000 (12:01 +0200)]
broadcom/compiler: rename QUNIFORM_GET_BUFFER_SIZE to QUNIFORM_GET_SSBO_SIZE
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 26 Aug 2020 10:00:43 +0000 (12:00 +0200)]
v3dv: handle QUNIFORM_GET_UBO_SIZE
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 26 Aug 2020 09:58:47 +0000 (11:58 +0200)]
v3d/compiler: implement nir_intrinsic_get_ubo_size
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 26 Aug 2020 09:46:55 +0000 (11:46 +0200)]
nir: add a nir_get_ubo_size intrinsic
This is the same as nir_get_buffer_size but geared towards UBOs instead
of SSBOs. The new intrinsic is useful in Vulkan backends that need to
add bound checks on buffer accesses to honor the robust buffer access
feature.
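As a CPU-side sketch of the kind of bound check a backend could emit using the buffer size, assuming out-of-bounds loads return zero (one behavior robust buffer access permits) and dword-aligned offsets. The function name is hypothetical:

```c
#include <stdint.h>

/* Load a dword at offset_bytes, or return 0 if the access would fall
 * outside the buffer. Assumes offset_bytes is 4-byte aligned. */
static uint32_t
robust_load_u32(const uint32_t *buf, uint32_t size_bytes,
                uint32_t offset_bytes)
{
   /* Written to avoid overflow in "offset + 4". */
   if (offset_bytes > size_bytes || size_bytes - offset_bytes < 4)
      return 0;
   return buf[offset_bytes / 4];
}
```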
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 26 Aug 2020 06:38:41 +0000 (08:38 +0200)]
v3dv: move meta init/finish to meta implementation files
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 25 Aug 2020 12:25:45 +0000 (14:25 +0200)]
v3dv: don't cache subpass color clear pipelines
Subpass color clear pipelines are those used to emit partial attachment
clears as draw calls inside the render pass currently bound by the
application in the command buffer, leading to a huge performance improvement
compared to the case where we emit them in their own render pass.
Unfortunately, because the pipeline references the render pass
object in which it is used and the render pass object is owned by the
application (and can be destroyed at any point), we can't cache these
pipelines (unless we implement a refcounting mechanism or other
similar strategy).
Performance impact looks negligible based on experiments with vkQuake3,
probably because the underlying pipeline cache is preventing the
redundant shader recompiles.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 25 Aug 2020 08:17:29 +0000 (10:17 +0200)]
v3dv: fix 3D image blits
Specifically, we should select the slice to blit from on the source
image to be in the middle of the depth step.
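In other words, something along these lines, sketched under the assumption that the source depth range is divided evenly across the destination layers. The function and parameter names are hypothetical:

```c
/* Source z coordinate to sample for a given destination layer: the centre
 * of that layer's step through the source depth range. */
static float
src_slice_coord(float src_z0, float src_z1,
                unsigned dst_depth, unsigned dst_layer)
{
   float step = (src_z1 - src_z0) / (float)dst_depth;
   return src_z0 + ((float)dst_layer + 0.5f) * step;
}
```

For a source range of 0..8 blitted to 4 destination layers, the step is 2 and layer 0 samples at z = 1 rather than at z = 0.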
This issue was only raised recently after the CTS improved the 3D
blitting tests.
Fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.*.3d.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 25 Aug 2020 06:47:30 +0000 (08:47 +0200)]
v3dv: only require texel-size alignment for linear images
Originally, copies between buffers and images required a buffer offset
that was a multiple of 4 bytes, however, the spec was later fixed to
relax this rule and only require offsets that had texel alignment.
Our implementation of image to buffer copies using the blit path needs
to bind the destination buffer as a linear image and be able to bind
the requested buffer memory at the required offset, so for that to work
we need to change the alignment requirements for linear images to match
the relaxed texel alignment requirement.
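The difference between the two rules, as a small sketch. The function names are hypothetical and the texel size is given in bytes:

```c
#include <stdbool.h>
#include <stdint.h>

/* Original rule: buffer offsets had to be a multiple of 4 bytes. */
static bool
offset_ok_legacy(uint64_t offset)
{
   return offset % 4 == 0;
}

/* Relaxed rule: offsets only need to be a multiple of the texel size. */
static bool
offset_ok_relaxed(uint64_t offset, uint32_t texel_size)
{
   return offset % texel_size == 0;
}
```

With a 2-byte texel, an offset of 6 is valid under the relaxed rule but not under the original one.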
Fixes new tests in Vulkan CTS 1.2.4:
dEQP-VK.api.copy_and_blit.core.image_to_buffer.buffer_offset_relaxed
dEQP-VK.api.copy_and_blit.dedicated_allocation.image_to_buffer.buffer_offset_relaxed
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 6 Aug 2020 12:15:41 +0000 (14:15 +0200)]
v3dv: lower interpolateAt functions in NIR and enable sample rate shading
The lowering will get all the interpolateAt() functions from GLSL lowered to
the corresponding intrinsics we have just implemented in the compiler backend,
which was the last piece we needed to enable the feature.
This gets us to pass all the relevant tests in:
dEQP-VK.pipeline.multisample_interpolation.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 7 Aug 2020 08:34:30 +0000 (10:34 +0200)]
nir/lower_io: add an option to lower interpolateAt functions
The option use_interpolated_input_intrinsics will lower these as well
as regular input loads. This is inconvenient for V3D, where we can
produce optimal code for regular input loads based on the input
variable layout qualifiers, so this change adds an option to only
lower instances of interpolateAt().
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sun, 16 Aug 2020 00:19:10 +0000 (02:19 +0200)]
v3dv/device: enable largePoints
As we have just set proper values for point granularity etc., we can
enable largePoints. With this change, tests like this:
dEQP-VK.rasterization.primitive_size.points.point_size_*
go from Skip to Pass.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 15 Aug 2020 22:05:01 +0000 (00:05 +0200)]
v3dv/device: fix point-related VkPhysicalDeviceLimits
While we are here, we also tweak some line-related limits, as some use
the same values as for points, and so that we use the enum we recently
added in common/v3d_limits.h
Fixes the following test:
dEQP-VK.glsl.builtin_var.simple.pointcoord
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 15 Aug 2020 22:13:20 +0000 (00:13 +0200)]
v3d/limits: add line width and point size limits
They will be the same for the OpenGL and Vulkan drivers, so let's put
them in the common limits header.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Wed, 19 Aug 2020 14:38:34 +0000 (16:38 +0200)]
v3dv/cmd_buffer: set instance id to 0 at start of tile
PTB assumes that the instance id is 0 at the start of a tile, but the hw
does not do that, so we need to set it.
This fixes some Vulkan CTS tests that started to fail after some other
tests used an instance id.
So for example, before this commit, the following tests, executed
in that order, behaved as follows:
dEQP-VK.pipeline.vertex_input.multiple_attributes.binding_one_to_many.attributes.float.mat2.mat3 => Pass
dEQP-VK.draw.indexed_draw.draw_instanced_indexed_triangle_strip => Pass
dEQP-VK.pipeline.vertex_input.multiple_attributes.binding_one_to_many.attributes.float.mat2.mat3 => Fails
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Wed, 19 Aug 2020 21:49:50 +0000 (23:49 +0200)]
v3dv/pipeline: set 16bit return_size for shadows always
So far we were pre-generating two variants, an all 16-bit return_size
and an all 32-bit return_size, as at pipeline creation time we don't
know the texture format that will finally be used.
But it is possible to override or at least refine the 32-bit case, as
we know in advance that all shadow textures can (and in fact should)
use return_size 16bit.
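The refinement amounts to something like the following sketch, where variant_return_size is the size the pre-generated variant would otherwise use. The function name is hypothetical:

```c
#include <stdbool.h>

/* Pick the texture return size in bits for one texture in a variant. */
static unsigned
texture_return_size(bool is_shadow, unsigned variant_return_size)
{
   /* Shadow textures always use 16-bit, even in the 32-bit variant. */
   return is_shadow ? 16 : variant_return_size;
}
```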
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Wed, 19 Aug 2020 21:34:30 +0000 (23:34 +0200)]
v3dv/pipeline: track if texture is shadow
To be used to decide the texture return size. We add it to the
descriptor map because that is the easiest place to do so: as we are
lowering the texture accesses, we can check instr->is_shadow at that
point. Admittedly this is somewhat odd, as so far the descriptor
map held general descriptor info, while is_shadow applies only to
textures. But it doesn't make sense to rework this now, as it is
possible that we will add more descriptor-specific info to the map in
the future. We can revisit that later.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Mon, 21 Sep 2020 21:14:08 +0000 (23:14 +0200)]
v3dv: Call nir_lower_io for push constants
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Wed, 12 Aug 2020 21:35:04 +0000 (23:35 +0200)]
v3dv/pipeline: use derefs for ubo/ssbo
There are some potential advantages to that. Even if we are not
taking advantage of them yet, it is worth using this
path now, especially as the non-deref path could be removed at some point.
Note that instead of returning a vec2 for both resource_index and
vulkan_descriptor, we return a scalar for the first one, as that
is what the v3d backend expects (like for get_ssbo_size). For this to
work, we rebuild the vec2 at vulkan_descriptor using the index and
an unused 0 value.
As far as I can see, turnip avoids that by also lowering load_ssbo/ubo, so
it just gets the lowered index (which in their case is a vec3 with a
fixed 0 in the third component), but for now it is easier to do this.
v2: return a single-component for the index, to avoid the backend
needing to handle it (Eric, Jason).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 8 Aug 2020 22:46:20 +0000 (00:46 +0200)]
v3dv/device: fix compute_heap_size for the simulator
Ask the simulator for the total memory it is using, instead of using
sysinfo (which returned the host system memory).
Fixes the following CTS tests when using the simulator:
dEQP-VK.memory.allocation.basic.percent_1.forward.count_12
dEQP-VK.memory.allocation.basic.percent_1.reverse.count_12
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 8 Aug 2020 22:45:14 +0000 (00:45 +0200)]
v3d/simulator: add v3d_simulator_get_mem_size
Reviewed-by: Iago Toral <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sat, 8 Aug 2020 21:18:41 +0000 (23:18 +0200)]
broadcom/compiler: allow GLSL_SAMPLER_DIM_BUF on txs emission
Although we don't support texture buffers on the OpenGL driver, we are
already doing that for the Vulkan driver. This would be needed for the
OpenGL driver in any case.
Fixes the following tests on v3dv:
dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.*
dEQP-VK.memory.pipeline_barrier.transfer_dst_uniform_texel_buffer.*
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Fri, 7 Aug 2020 13:16:19 +0000 (15:16 +0200)]
v3dv/meta: fix hash table insertion
So far we were using the local variable key directly for the
insertion, but the hash table expects a permanent address. We add a
key field to all the meta structures (which are already basically a
wrapper over v3dv_pipeline).
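The lifetime issue described above can be sketched with a toy table that,
like many C hash tables, stores the key pointer it is given instead of
copying the key. Everything named toy_* is hypothetical and is not the
Mesa hash_table API or the v3dv meta structures; the sketch only shows
why the key is stored inside the cached object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 8

/* Toy open-addressing table: it keeps the key POINTER, not a copy. */
struct toy_table {
   const uint32_t *keys[TOY_SLOTS];
   void *values[TOY_SLOTS];
};

static void
toy_insert(struct toy_table *t, const uint32_t *key, void *value)
{
   for (size_t i = 0; i < TOY_SLOTS; i++) {
      size_t slot = (*key + i) % TOY_SLOTS;
      if (t->keys[slot] == NULL) {
         t->keys[slot] = key;
         t->values[slot] = value;
         return;
      }
   }
   assert(!"toy table is full");
}

static void *
toy_search(const struct toy_table *t, const uint32_t *key)
{
   for (size_t i = 0; i < TOY_SLOTS; i++) {
      size_t slot = (*key + i) % TOY_SLOTS;
      if (t->keys[slot] == NULL)
         return NULL;
      /* Dereferences the stored pointer: it must still be valid here. */
      if (*t->keys[slot] == *key)
         return t->values[slot];
   }
   return NULL;
}

/* Hypothetical meta object: the key lives inside the object, so its
 * address stays valid for as long as the table entry does. */
struct toy_meta_pipeline {
   uint32_t key;
   int pipeline_id;
};

static void
toy_cache_pipeline(struct toy_table *t, struct toy_meta_pipeline *p,
                   uint32_t key)
{
   p->key = key;
   /* Passing the address of a stack-local key here would leave a
    * dangling pointer in the table once the caller returns. */
   toy_insert(t, &p->key, p);
}
```

A stack-local key is still fine for toy_search, since the table only
reads through it during the call.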
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Fri, 7 Aug 2020 00:21:45 +0000 (02:21 +0200)]
v3dv/pipeline: fix combined_index_map insertions
We were inserting the local key variable used to search for entries
directly as the key, but hash_table expects a pointer that stays valid.
Fixed by using the array of keys that we already had in v3dv_pipeline.
Fixes failures on the rpi4 like:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.a1r5g5b5_unorm_pack16.a1r5g5b5_unorm_pack16.general_general_linear
but fwiw, this test on the simulator, and several other tests on both
the simulator and rpi4, were working just by luck.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Thu, 6 Aug 2020 13:45:59 +0000 (15:45 +0200)]
v3dv/debug: add v3dv_print_v3d_key
Useful to print which v3d keys were used for each variant.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Wed, 5 Aug 2020 08:35:16 +0000 (10:35 +0200)]
v3dv/device: warn when the pipeline cache is disabled
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Tue, 4 Aug 2020 21:12:51 +0000 (23:12 +0200)]
v3dv/device: add assert for texture-related limits
There are several limits whose sum must not be greater than
V3D_MAX_TEXTURE_SAMPLERS (defined at common/v3d_limits.h), so let's
assert it.
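As a rough illustration of that kind of check, the sketch below sums a
set of per-type limits and compares the total against a cap. The helper
name and the idea of passing the limits as an array are assumptions made
for the example; which limits the driver actually adds up is not stated
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the sum of all limits fits
 * within max_total (a stand-in for V3D_MAX_TEXTURE_SAMPLERS). */
static bool
toy_limits_fit(const unsigned *limits, size_t count, unsigned max_total)
{
   assert(limits != NULL || count == 0);

   unsigned total = 0;
   for (size_t i = 0; i < count; i++)
      total += limits[i];

   return total <= max_total;
}
```

A driver would typically wrap such a check in an assert at device
creation, so that a mismatch is caught as soon as a limit is changed.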
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 4 Aug 2020 11:11:10 +0000 (13:11 +0200)]
v3dv: handle multisample rasterization with empty framebuffers
If the framebuffer has no attachments then multisample rasterization
is enabled based on the rasterizationSamples multisample state of
the pipelines. It should be noted that since we don't support
the variableMultisampleRate feature, all pipelines in the same
subpass must have matching number of samples.
V3D requires that we specifically set up our frames to enable
multisampling or not, and we do this when we create jobs inside
a subpass. Since we create the first job for a subpass as soon as
the subpass starts, this is problematic: if we don't have any
attachments, we won't enable MSAA at this point, but later
on we might bind an MSAA pipeline, since pipelines can be bound
at any point in the lifespan of a command buffer.
Here, we fix this by testing if the first draw call in a job uses
an MSAA pipeline but the job was set up to not use MSAA, and in
that case we re-start the job with MSAA enabled.
We also take care of a corner case that seems to be tested by CTS
where a framebuffer with no attachments doesn't bind any pipelines
with MSAA enabled (so according to the Vulkan spec, multisample
rasterization must be disabled) but the fragment shader in use
reads gl_SampleID (which enables per-sample shading). This would
lead to enabling per-sample shading with single-sample rasterization,
which doesn't make sense and makes the simulator complain, so we just
disable per-sample shading in that case.
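The two decisions described above can be sketched roughly as follows.
This is a simplified model with hypothetical toy_* types, not the v3dv
job or pipeline structures: it restarts the job when the first draw
needs MSAA, and only keeps per-sample shading when rasterization is
actually multisampled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified job and pipeline state. */
struct toy_job {
   bool msaa;
   unsigned draw_count;
   unsigned restart_count;
};

struct toy_pipeline {
   bool msaa;
   bool fs_reads_sample_id;
};

static void
toy_job_restart(struct toy_job *job, bool msaa)
{
   /* Restarting is only safe before anything was recorded. */
   assert(job->draw_count == 0);
   job->msaa = msaa;
   job->restart_count++;
}

/* Records a draw and returns whether per-sample shading is enabled. */
static bool
toy_job_emit_draw(struct toy_job *job, const struct toy_pipeline *pipeline)
{
   /* First draw uses an MSAA pipeline but the job was set up without
    * MSAA: restart the job with MSAA enabled. */
   if (job->draw_count == 0 && pipeline->msaa && !job->msaa)
      toy_job_restart(job, true);

   job->draw_count++;

   /* Per-sample shading with single-sample rasterization makes no
    * sense, so it is dropped in that case. */
   return pipeline->fs_reads_sample_id && job->msaa;
}
```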
Fixes:
dEQP-VK.pipeline.multisample.mixed_count.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 4 Aug 2020 09:21:14 +0000 (11:21 +0200)]
v3dv: implement nir_texop_texture_samples
Fixes:
dEQP-VK.glsl.texture_functions.query.texturesamples.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 4 Aug 2020 06:39:14 +0000 (08:39 +0200)]
v3dv: enable sample rate shading if fragment shader reads gl_SampleID
According to the spec, if a fragment shader reads gl_SampleID then the
shader must be evaluated per-sample.
Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.write_sample_mask.4_samples
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 4 Aug 2020 06:37:07 +0000 (08:37 +0200)]
broadcom/compiler: track if the fragment shader forces per-sample MSAA
For example, regarding gl_SampleID, the GLSL spec states:
"Any static use of this variable in a fragment shader causes the
entire shader to be evaluated per-sample."
So we need to track if the fragment shader does anything that implicitly
enables per-sample shading in the compiler for the driver to
auto-enable sample rate shading if needed.
v2:
- Instead of tracking reads of gl_SampleID, check SYSTEM_BIT_SAMPLE_ID
and SYSTEM_BIT_SAMPLE_POS as well as the sample layout qualifier like
other drivers are doing to activate this behavior (Eric).
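A minimal sketch of the v2 check, assuming a bitmask of system values
read by the shader plus a flag for sample-qualified inputs. The TOY_*
bit values and the toy_fs_info struct are placeholders for this
example, not the NIR shader info layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bits standing in for the sample ID / sample position
 * system-value bits; the real values live in the compiler headers. */
#define TOY_SYSTEM_BIT_SAMPLE_ID  (1ull << 0)
#define TOY_SYSTEM_BIT_SAMPLE_POS (1ull << 1)

struct toy_fs_info {
   uint64_t system_values_read;
   bool has_sample_qualified_input; /* "sample in" layout qualifier */
};

/* True if the fragment shader implicitly forces per-sample shading. */
static bool
toy_fs_forces_per_sample(const struct toy_fs_info *info)
{
   assert(info != NULL);

   const uint64_t mask =
      TOY_SYSTEM_BIT_SAMPLE_ID | TOY_SYSTEM_BIT_SAMPLE_POS;

   return (info->system_values_read & mask) != 0 ||
          info->has_sample_qualified_input;
}
```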
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Mon, 3 Aug 2020 14:40:06 +0000 (16:40 +0200)]
v3dv/descriptor: remove v3dv_descriptor_map_get_image_view
Now that we added support for texel_buffers, in all the cases where we
were checking for an image_view we end up checking for an image_view or
a buffer_view, so we stopped using it. Remove it as it became
superfluous.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Mon, 3 Aug 2020 14:38:19 +0000 (16:38 +0200)]
v3dv/uniforms: handle texture size for texel buffers
This gets tests like the following one working:
dEQP-VK.image.image_size.buffer.readonly_writeonly_1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Mon, 3 Aug 2020 09:41:28 +0000 (11:41 +0200)]
broadcom/compiler: implement nir_intrinsic_load_sample_pos
This is intended to return the sample location within the pixel.
Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_position.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Sun, 2 Aug 2020 00:23:16 +0000 (02:23 +0200)]
v3dv/formats: fix exposing FEATURE_UNIFORM/STORAGE_TEXEL_BUFFER_BIT
If the formats are not suitable as texture type, then they can't be
used as texel buffers.
Gets tests like the following one:
dEQP-VK.image.load_store.without_format.buffer.r32g32b32_sfloat_minalign_uniform
to be properly skipped (instead of crashing on the simulator)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 31 Jul 2020 16:25:06 +0000 (18:25 +0200)]
v3dv: handle multisample image clears
Fixes:
dEQP-VK.pipeline.framebuffer_attachment.*_ms
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 31 Jul 2020 12:58:42 +0000 (14:58 +0200)]
v3dv: handle multisample resolves for formats that don't support TLB resolves
The TLB multisample resolve feature is limited to specific format types.
For everything else, including sfloat and integer formats, we need to
fall back to a blit resolve. This needs to be handled both for in-pass
resolves as well as for vkCmdResolveImage.
Because these blits would happen after the tile store operations, we need
to make sure we store the multisampled buffers so we can then read them for
the blit resolve.
Fixes the remaining test failures in:
dEQP-VK.renderpass.suballocation.multisample_resolve.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 31 Jul 2020 12:59:09 +0000 (14:59 +0200)]
v3dv: handle multisample resolve of integer formats
The multisample resolve of an integer framebuffer should just take one
of the samples instead of averaging.
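The difference between the two resolve modes can be sketched on the CPU
as below. The function names are hypothetical, and picking sample 0 for
integer data is an assumption of this example; the text above only says
that one of the samples is taken.

```c
#include <assert.h>
#include <stdint.h>

/* Float-like formats: resolve by averaging all samples of a texel. */
static float
toy_resolve_float(const float *samples, unsigned count)
{
   assert(count > 0);

   float sum = 0.0f;
   for (unsigned i = 0; i < count; i++)
      sum += samples[i];

   return sum / (float)count;
}

/* Integer formats: averaging is not meaningful, so a single sample is
 * returned unchanged (sample 0 here, by assumption). */
static int32_t
toy_resolve_int(const int32_t *samples, unsigned count)
{
   assert(count > 0);
   return samples[0];
}
```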
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 31 Jul 2020 12:52:58 +0000 (14:52 +0200)]
v3dv: fix blitting of signed integer formats
For these we want to select a signed integer output format
and a signed sampler type.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 31 Jul 2020 12:50:57 +0000 (14:50 +0200)]
nir/glsl: add a glsl_ivec4_type() helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Fri, 31 Jul 2020 07:26:06 +0000 (09:26 +0200)]
v3dv: amend tile size tables with smallest tile sizes available
We'll need this for some cases involving maximum number of multisampled
color targets.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Thu, 30 Jul 2020 12:35:43 +0000 (14:35 +0200)]
v3dv/device: fix minTexelBufferOffsetAlignment
As we understand it, texture accesses should be aligned to the UIF
block size.
Fixes several of the CTS tests under this pattern:
dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_texel_buffer.*.offset_nonzero
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_texel_buffer.*.offset_nonzero
Note: for those tests, using a lower value (64) was enough to get them
working, but again, we understand that the real alignment is the UIF
block size.
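For reference, the alignment rule an application has to follow can be
sketched like this. The helpers are hypothetical, and the alignment is
passed in as a parameter because the actual UIF block size is not given
here; the 256 used in the usage note below is only a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if offset is a multiple of alignment (a power of two). */
static bool
toy_offset_is_aligned(uint64_t offset, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset & (alignment - 1)) == 0;
}

/* Rounds offset up to the next multiple of alignment. */
static uint64_t
toy_align_up(uint64_t offset, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset + alignment - 1) & ~(alignment - 1);
}
```

With a placeholder alignment of 256, an offset of 64 would be rejected
by toy_offset_is_aligned and rounded up to 256 by toy_align_up.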
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Thu, 30 Jul 2020 12:32:26 +0000 (14:32 +0200)]
v3dv: add v3dv_limits file
There are several definitions for hw limits on v3dv_image that we want
to share, but v3dv_private was already growing bigger and messier.
So let's move them to a specific header. Note that there is already a
broadcom/common/v3d_limits.h. We are not putting them there because
right now they are only used by the Vulkan driver, but are candidates
to be moved.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Alejandro Piñeiro [Tue, 28 Jul 2020 22:28:28 +0000 (00:28 +0200)]
v3dv/descriptor: support for UNIFORM/STORAGE_TEXEL_BUFFER
This gets most uniform/storage_texel buffer tests passing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 30 Jul 2020 09:29:12 +0000 (11:29 +0200)]
broadcom/compiler: handle gl_SampleMask writes in fragment shaders
We didn't need this until now, since this was included with GLES 3.2,
but we need it for Vulkan.
Eric had already done the plumbing for it though, so we just need to
actually emit the mask.
Fixes some tests in:
dEQP-VK.renderpass.suballocation.multisample_resolve.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Thu, 30 Jul 2020 07:00:07 +0000 (09:00 +0200)]
v3dv: handle multisampled image copies with the blit path
This should be able to handle partial copies of multisampled images.
This change extends our blit shader interface to also handle multisampled
destinations so that if the blit destination is a multisampled image,
the blit will rely on sample rate shading to copy all samples from
the source image (which must have a matching number of samples).
I have not found any tests in CTS that do partial copies of
multisampled images, so I tested this with a full multisampled image
copy, using this test:
dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving.4_bit
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 29 Jul 2020 07:49:47 +0000 (09:49 +0200)]
v3dv: add a blit fallback path for vkCmdResolveImage
This fallback is required when we have to do partial resolves. It
works the same way as other blit fallbacks for copy operations: it
will bind the source image as a source texture and blit the selected
region to the destination image.
The difference in this case is that the source image is multisampled
and the blit shader needs to fetch and average individual samples for
each texel.
This gets us to pass all the remaining test cases in
dEQP-VK.api.copy_and_blit.core.resolve_image.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Wed, 29 Jul 2020 07:49:08 +0000 (09:49 +0200)]
v3dv: setup texture shader state correctly for multisampled images
Fixes multisampled cases in:
dEQP-VK.pipeline.multisample.sampled_image.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 28 Jul 2020 08:33:17 +0000 (10:33 +0200)]
v3dv: handle multisampled image copies in the TLB path
vkCmdCopyImage can be used to copy multisampled images. We can
easily support that on the TLB path, which copies full images.
For partial copies we will need to amend our blit shader path
to support multisampling resolve.
Fixes:
dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving.4_bit
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Iago Toral Quiroga [Tue, 28 Jul 2020 08:50:01 +0000 (10:50 +0200)]
v3dv: implement vkCmdResolveImage for whole images
For partial resolves we will need a shader blit & resolve fallback.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>