Qiang Yu [Tue, 28 Jun 2022 03:30:15 +0000 (11:30 +0800)]
nir: fix nir_xfb_info buffer_to_stream length
Fixes: 19064b8c3a8 ("nir: Add a pass for gathering transform feedback info")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
Samuel Pitoiset [Tue, 13 Sep 2022 15:35:04 +0000 (17:35 +0200)]
radv: do not remove PSIZ for VS when the topology is unknown
When compiling only the pre-rast stages in a library, the input
assembly state might not be present and the topology would be 0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
Samuel Pitoiset [Fri, 9 Sep 2022 14:28:50 +0000 (16:28 +0200)]
radv: enable the VS prologs cache if graphicsPipelineLibrary is enabled
GPL will re-use most of the VS prologs code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
Samuel Pitoiset [Fri, 9 Sep 2022 14:55:23 +0000 (16:55 +0200)]
radv: bind the VS input state for prologs created with GPL
If we have a VS that needs a prolog without using the dynamic state,
that means that it comes from a library, so we can overwrite the
cmdbuf VS input state.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
Samuel Pitoiset [Fri, 9 Sep 2022 14:59:26 +0000 (16:59 +0200)]
radv: prepare the VS input state for prologs created with GPL
This state will be bound at pipeline bind time.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
Samuel Pitoiset [Fri, 9 Sep 2022 14:52:55 +0000 (16:52 +0200)]
radv: rename radv_pipeline_key::vs::dynamic_vs_input to has_prolog
With GPL it's possible to create VS prologs without this dynamic state,
so it seems better to rename.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
Samuel Pitoiset [Thu, 15 Sep 2022 07:35:49 +0000 (09:35 +0200)]
radv: disable VK_EXT_graphics_pipeline_library with LLVM
Epilogs/prologs aren't supported at all.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18609>
Iago Toral Quiroga [Wed, 14 Sep 2022 06:44:28 +0000 (08:44 +0200)]
v3dv: don't return incompatible driver if GPU is not present
Instead, we should just return VK_SUCCESS. The physical device
won't be initialized and vkEnumeratePhysicalDevices will not
list it as available, which is the expected behavior here.
Also, VK_ERROR_INCOMPATIBLE_DRIVER is not a valid return code
from vkEnumeratePhysicalDevices, so never return that, instead
we return VK_ERROR_INITIALIZATION_FAILED if a valid device was
found but we failed to create the physical device for it.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-By: Ryan Houdek <Sonicadvance1@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18591>
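The return-code policy described in the v3dv commit above can be sketched as follows. This is a minimal illustration and not the driver's code: `sketch_result`, `sketch_instance` and `sketch_enumerate_devices` are hypothetical names, and the enum only mirrors the Vulkan result codes named in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the names only mirror the Vulkan ones. */
enum sketch_result {
   SKETCH_SUCCESS,
   SKETCH_ERROR_INITIALIZATION_FAILED,
};

/* Hypothetical instance state: how many devices enumeration will list. */
struct sketch_instance {
   int physical_device_count;
};

static enum sketch_result
sketch_enumerate_devices(struct sketch_instance *instance,
                         bool gpu_present, bool create_ok)
{
   assert(instance != NULL);
   instance->physical_device_count = 0;

   /* No GPU: not an error, the device list simply stays empty. */
   if (!gpu_present)
      return SKETCH_SUCCESS;

   /* A device was found but could not be set up. */
   if (!create_ok)
      return SKETCH_ERROR_INITIALIZATION_FAILED;

   instance->physical_device_count = 1;
   return SKETCH_SUCCESS;
}
```

Note that no path returns an "incompatible driver" code, matching the message's point that it is not a valid result for device enumeration.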
Emma Anholt [Wed, 14 Sep 2022 19:29:02 +0000 (12:29 -0700)]
turnip: Keep a host copy of push descriptor sets.
Otherwise, the back-copy on same-layout push descriptor updates would read
from WC memory, which is absurdly slow. Improves performance of
vkoverhead's descriptor_template_12ubo_push from 760k/sec to 2876k/sec.
Improves submit-disabled gfxbench gl_driver2 performance on zink from 79.6
fps to 103.6.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18561>
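The idea behind the host copy in the commit above can be sketched roughly as below. Everything here is an assumption made for illustration (`sketch_push_set`, `sketch_push_update`, the fixed `SKETCH_SET_SIZE`); turnip's real data structures are different. The point is only that each update reads from cached host memory and treats the GPU-visible buffer as write-only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SET_SIZE 64 /* hypothetical size of one push descriptor set */

struct sketch_push_set {
   uint8_t host_copy[SKETCH_SET_SIZE]; /* cached host memory, cheap to read */
};

/* Apply one descriptor write to the host copy, then fill a freshly
 * allocated GPU-visible buffer from it. The previous GPU buffer, which
 * would live in write-combined memory, is never read back. */
static void
sketch_push_update(struct sketch_push_set *set, uint8_t *gpu_dst,
                   size_t offset, const void *data, size_t size)
{
   assert(offset <= SKETCH_SET_SIZE && size <= SKETCH_SET_SIZE - offset);
   memcpy(set->host_copy + offset, data, size);
   memcpy(gpu_dst, set->host_copy, SKETCH_SET_SIZE);
}
```

Two consecutive updates into different destination buffers leave the second buffer with both writes applied, without ever reading the first one.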
Emma Anholt [Wed, 14 Sep 2022 20:02:02 +0000 (13:02 -0700)]
turnip: Ignore pDescriptorCounts[] for non-variable-count layouts.
The spec says "If VkDescriptorSetAllocateInfo::pSetLayouts[i] does not
include a variable-sized descriptor binding, then pDescriptorCounts[i] is
ignored." So, make sure that we ignore it unless there is a
variable-sized binding. And, we can keep it simple by just taking the
variable-sized path for variable-sized bindings with the 0 variable_count
value to handle "If descriptorSetCount is zero or this structure is not
included in the pNext chain, then the variable lengths are considered to
be zero."
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18561>
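A minimal sketch of the rule quoted in the commit above, assuming a hypothetical `sketch_layout` with a single flag and a hypothetical helper `sketch_variable_count`; the real driver derives this from its descriptor set layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor set layout. */
struct sketch_layout {
   bool has_variable_descriptors;
};

/* Returns the variable descriptor count to use for set `set_index`.
 * `counts`/`count_len` model pDescriptorCounts and descriptorSetCount;
 * pass NULL/0 when the structure is absent from the pNext chain. */
static uint32_t
sketch_variable_count(const struct sketch_layout *layout,
                      const uint32_t *counts, uint32_t count_len,
                      uint32_t set_index)
{
   assert(layout != NULL);

   /* No variable-sized binding: the application's entry is ignored. */
   if (!layout->has_variable_descriptors)
      return 0;

   /* Structure missing or empty: variable lengths are considered zero. */
   if (counts == NULL || set_index >= count_len)
      return 0;

   return counts[set_index];
}
```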
Yonggang Luo [Thu, 15 Sep 2022 12:33:45 +0000 (20:33 +0800)]
drm-shim: drop gnu99 override
If we override with gnu99 here, we effectively down-grade from C11,
meaning we can no longer assume static_assert support.
Fixes: 45fb815a756 ("util: implement STATIC_ASSERT using c++11 / c11 primitives")
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Suggested-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18611>
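For context on the commit above, this is the kind of C11 compile-time check that a gnu99 override puts at risk. The struct and function are hypothetical and exist only to give the assertion something to check.

```c
#include <assert.h> /* provides the static_assert macro in C11 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header used only for illustration. */
struct sketch_header {
   uint32_t magic;
   uint32_t size;
};

/* Evaluated at compile time; a failure stops the build. */
static_assert(sizeof(struct sketch_header) == 2 * sizeof(uint32_t),
              "sketch_header must not contain padding");

static size_t
sketch_header_size(void)
{
   return sizeof(struct sketch_header);
}
```

When a target is forced to an older standard such as gnu99, the `static_assert` spelling is not guaranteed to be available, which is the downgrade the message refers to.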
Emma Anholt [Tue, 13 Sep 2022 23:16:30 +0000 (16:16 -0700)]
turnip: Skip rather than invalidate LRZ on gl_FragDepth writes.
As long as the direction is still compatible, if we skip the LRZ use and
updates for this draw, then we can keep using LRZ later in the scene, as
whatever gl_FragDepth will get written by the shader later will still have
to move the depth in the right direction.
Similarly, the no_earlyz flag that contributes to DISABLE_LRZ just wants
to make sure we don't kill fragments before dispatch, not change what Z
eventually lands.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18606>
Emma Anholt [Tue, 13 Sep 2022 23:09:15 +0000 (16:09 -0700)]
turnip: Don't look at RB.Z_READ_ENABLE for setting LRZ.Z_TEST_ENABLE.
It will always be set in HW when RB.Z_WRITE_ENABLE is set (since that
implies RB.Z_TEST_ENABLE), but in the case of dynamic Z the flag gets
computed at emit time and not stored to cmd->state.rb_depth_cntl. This
bug effectively disabled LRZ for zink.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18606>
Emma Anholt [Tue, 13 Sep 2022 22:58:58 +0000 (15:58 -0700)]
turnip: Ignore dynamic color write enables past our number of attachments.
We were always disabling LRZ writes on zink+turnip because it sets all the
color write enables (translating directly from GL turning them all on).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18606>
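The masking described in the commit above might look like the sketch below. The helper names are hypothetical, and the "all attachments written" test is only an assumed stand-in for whatever condition the driver uses when deciding about LRZ writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per color attachment actually in use. */
static uint32_t
sketch_attachment_mask(unsigned num_attachments)
{
   assert(num_attachments <= 32);
   return num_attachments == 32 ? UINT32_MAX
                                : (UINT32_C(1) << num_attachments) - 1;
}

/* Bits past the attachment count are ignored before comparing, so a
 * frontend that sets every enable bit still counts as "all enabled". */
static bool
sketch_all_attachments_written(uint32_t dynamic_enables,
                               unsigned num_attachments)
{
   uint32_t valid = sketch_attachment_mask(num_attachments);
   return (dynamic_enables & valid) == valid;
}
```

With one attachment and all eight enable bits set, a naive comparison of the raw value against the attachment mask would not match, while the masked comparison does.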
Emma Anholt [Tue, 13 Sep 2022 22:57:30 +0000 (15:57 -0700)]
turnip: Add some missing LRZ disable debug.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18606>
Alyssa Rosenzweig [Sat, 20 Aug 2022 16:59:01 +0000 (12:59 -0400)]
u_transfer_helper: Pack Z24S8 to Z24-in-Z32F and S8
Needed on Asahi to pass
dEQP-GLES3.functional.texture.specification.texsubimage2d_depth.depth24_stencil8
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18136>
Alyssa Rosenzweig [Fri, 19 Aug 2022 03:21:52 +0000 (23:21 -0400)]
u_transfer_helper: Handle Z24X8 for drivers that don't use the interleaved transfer_map
Fixes
dEQP-GLES3.functional.texture.format.sized.2d.depth_component24_pot on
Asahi.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18136>
Konstantin Seurer [Wed, 14 Sep 2022 15:13:01 +0000 (17:13 +0200)]
radv: Cleanup radv_GetInstanceProcAddr
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18600>
Juston Li [Wed, 14 Sep 2022 23:30:56 +0000 (16:30 -0700)]
venus: use buffer cache for vkGetDeviceBufferMemoryRequirements
Align with vkGetBufferMemoryRequirements2 and utilize the cache for
retrieving memory requirements before trying the host call.
Fixes
dEQP-VK.api.invariance.memory_requirements_matching
dEQP-VK.memory.requirements.create_info.buffer.regular
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18603>
Chia-I Wu [Wed, 14 Sep 2022 22:25:45 +0000 (15:25 -0700)]
vulkan: update comments to device enumeration callbacks
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18607>
Hans-Kristian Arntzen [Thu, 15 Sep 2022 11:10:27 +0000 (13:10 +0200)]
radv: Implement VK_EXT_mutable_descriptor_type.
Trivial promotion from VALVE, just rename enums and types.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18610>
Hans-Kristian Arntzen [Thu, 15 Sep 2022 11:06:37 +0000 (13:06 +0200)]
vulkan: Update to 1.3.228 headers.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18610>
Sil Vilerino [Wed, 14 Sep 2022 17:14:47 +0000 (13:14 -0400)]
d3d12: d3d12_video_buffer_create_impl make resident after checking for resource creation
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Mon, 12 Sep 2022 17:20:36 +0000 (13:20 -0400)]
d3d12: Add VPBlit processor check for D3D12_FEATURE_VIDEO_PROCESS_MAX_INPUT_STREAMS
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Mon, 12 Sep 2022 15:03:46 +0000 (11:03 -0400)]
d3d12: Allow video processing for formats other than NV12
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Mon, 12 Sep 2022 15:02:53 +0000 (11:02 -0400)]
d3d12: Allow formats other than NV12 in d3d12_video_buffer
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 7 Sep 2022 17:37:18 +0000 (13:37 -0400)]
d3d12: Add support for importing d3d12_video_buffer from handle
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 20 Jul 2022 17:31:19 +0000 (13:31 -0400)]
d3d12: Fix leak in d3d12_resource_from_resource and usage in d3d12 video dec, enc
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 20 Jul 2022 16:21:08 +0000 (12:21 -0400)]
d3d12: Fix winsys displaytarget leak in d3d12_resource
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 14 Sep 2022 11:58:45 +0000 (07:58 -0400)]
d3d12: Fix leak in d3d12_video_proc when re-creating ID3D12VideoProcessor
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 14 Sep 2022 11:56:29 +0000 (07:56 -0400)]
d3d12: Fill feedback in d3d12_video_encoder_encode_bitstream so vaSyncSurface properly populates buf->coded_size
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Fri, 2 Sep 2022 15:01:25 +0000 (11:01 -0400)]
d3d12: Avoid heap allocations on hot path d3d12_video_decoder_dxva_picparams_from_pipe_picparams_hevc
Using pre-allocated storage now.
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Fri, 2 Sep 2022 14:42:14 +0000 (10:42 -0400)]
d3d12: Avoid local allocations for D3D12_RESOURCE_BARRIER on hot paths
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Fri, 2 Sep 2022 14:20:56 +0000 (10:20 -0400)]
d3d12: Avoid extra allocation, copies when generating DXVA_Slice_Hxxx_Short arrays
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 14 Sep 2022 20:13:55 +0000 (16:13 -0400)]
d3d12: Add HEVC Decode/Encode
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 14 Sep 2022 14:12:22 +0000 (10:12 -0400)]
gallium/vl: Rename s_addr variable in vl_idct.c as it conflicts with windows existing inaddr.h keyword definition
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 31 Aug 2022 16:22:56 +0000 (12:22 -0400)]
gallium/vl: Allow vl_zscan.h to be included from C++
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Wed, 14 Sep 2022 11:55:08 +0000 (07:55 -0400)]
d3d12/va: Name convention rename PIPE_VIDEO_SUPPORTS_CONTIGUOUS_PLANES_MAP to PIPE_VIDEO_CAP_SUPPORTS_CONTIGUOUS_PLANES_MAP
Reviewed-by: Giancarlo Devich <gdevich@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Thu, 28 Jul 2022 16:53:01 +0000 (12:53 -0400)]
frontends/va: Support HEVC caps regarding features, block sizes, prediction direction
Add new pipe structures: PIPE_VIDEO_CAP_ENC_HEVC_BLOCK_SIZES, PIPE_VIDEO_CAP_ENC_HEVC_FEATURE_FLAGS, PIPE_VIDEO_CAP_ENC_HEVC_PREDICTION_DIRECTION
Implement new VA caps VAConfigAttribEncHEVCFeatures, VAConfigAttribEncHEVCBlockSizes, VAConfigAttribPredictionDirection
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Tue, 30 Aug 2022 21:09:10 +0000 (17:09 -0400)]
frontends/va: Extend single to multiple L0-L1 references for HEVC Encode
Also fixing refactored variable name for L0/L1 lists in drivers/radeonsi to avoid build break.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Tue, 30 Aug 2022 21:08:24 +0000 (17:08 -0400)]
frontends/va: Add HEVC Encode support multi slice and extend pipe args
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Thu, 1 Sep 2022 14:37:49 +0000 (10:37 -0400)]
frontends/va: Mark IsLongTerm in HEVC decode args
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Tue, 6 Sep 2022 14:32:41 +0000 (10:32 -0400)]
frontends/omx: Fill HEVC Decode param IntraPicFlag
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Tue, 6 Sep 2022 14:32:21 +0000 (10:32 -0400)]
frontends/vdpau: Fill HEVC Decode param IntraPicFlag
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Tue, 6 Sep 2022 14:31:51 +0000 (10:31 -0400)]
frontends/va: Add HEVC decode args: IntraPicFlag, no_pic_reordering_flag, no_bipred_flag
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Sil Vilerino [Thu, 21 Jul 2022 16:08:50 +0000 (12:08 -0400)]
frontends/va: Add HEVC decode slice descriptors
Adds HEVC decoded slice descriptors to the pipe interface and also to the VA frontend
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18328>
Martin Roukala (né Peres) [Wed, 14 Sep 2022 06:08:44 +0000 (09:08 +0300)]
radv/ci: move some tests from the renoir fail to its flake list
This mirrors the change we made for vega10 (6bbe3c6d3) in August...
Seems like the chances of a PASS are indeed slim, but possible.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18590>
Erik Faye-Lund [Wed, 10 Aug 2022 12:35:21 +0000 (14:35 +0200)]
panfrost: do not fake rgtc-support
Panfrost doesn't expose LATC format support at all, so
state-tracker level RGTC support is sufficient to drop the fake RGTC
flag on Panfrost.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Wed, 10 Aug 2022 06:59:28 +0000 (08:59 +0200)]
mesa/st: enable rgtc extension with fallback
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Tue, 23 Aug 2022 12:48:39 +0000 (14:48 +0200)]
mesa/st: do not fall back to uncompressed for rgtc
This logic doesn't really do what it pretends to; we don't expose the
RGTC features unless we actually have RGTC support. This is about to
change, but for that logic to work, we need to be able to tell if we're
using a fallback-format or not, and we can't do that unless we keep the
format as RGTC.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Wed, 10 Aug 2022 06:45:16 +0000 (08:45 +0200)]
mesa/st: implement fallback for rgtc
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Wed, 10 Aug 2022 11:23:08 +0000 (13:23 +0200)]
mesa/main: add _mesa_unpack_rgtc
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Tue, 23 Aug 2022 09:33:57 +0000 (11:33 +0200)]
util/format: implement rgtc -> r8 / r8g8 unpack
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Tue, 23 Aug 2022 09:18:09 +0000 (11:18 +0200)]
util/format: allow unpacking less than a block from rgtc
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Tue, 23 Aug 2022 09:12:32 +0000 (11:12 +0200)]
util/format: fix broken indentation
This file had a mixture of tabs and spaces for indent.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Wed, 10 Aug 2022 06:27:19 +0000 (08:27 +0200)]
mesa: add format-helper for rgtc
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Erik Faye-Lund [Wed, 10 Aug 2022 06:24:29 +0000 (08:24 +0200)]
mesa/st: add context-flag for rgtc
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
Samuel Pitoiset [Wed, 14 Sep 2022 07:52:00 +0000 (09:52 +0200)]
radv/ci: cleanup lists of failures/flakes
When tests are already in the flakes list, it's useless to mark them
as expected failures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18592>
Chia-I Wu [Fri, 19 Aug 2022 20:51:08 +0000 (13:51 -0700)]
turnip: use vk_descriptor_set_layout
Mainly for vk_descriptor_set_layout_{ref,unref}.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18588>
Chia-I Wu [Fri, 19 Aug 2022 20:28:23 +0000 (13:28 -0700)]
turnip: use vk_buffer
Mainly for vk_buffer_range.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18588>
Rob Clark [Wed, 14 Sep 2022 22:06:08 +0000 (15:06 -0700)]
freedreno: We really don't need aligned vbo's
The logic was inverted; we don't need aligned VBOs on later gens.
Fixes: 60912f1ebd3 ("freedreno: we don't need aligned vbo's")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18605>
Rob Clark [Wed, 14 Sep 2022 21:06:15 +0000 (14:06 -0700)]
freedreno/drm/virtio: Handle read after upload
If we get CPU access (such as a read) after an upload transfer, we need
to ensure that the host has handled the upload. Do this by stalling
when the buffer is mapped. (The previous commit ensures we don't try to
do a pointless upload for an already mapped buffer.)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18604>
Rob Clark [Wed, 14 Sep 2022 21:04:38 +0000 (14:04 -0700)]
freedreno/drm/virtio: Don't prefer upload for mapped buffers
The upload path is intended to avoid stalling on host in order to mmap
recently allocated buffers. But if we already had to mmap it, no point
in taking the upload path.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18604>
Rob Clark [Tue, 13 Sep 2022 22:20:58 +0000 (15:20 -0700)]
freedreno/virtio: Don't upload if we have valid range
A transfer that only partially writes the staging buffer could overwrite
valid buffer contents, unless we are told that it is ok to discard the
entire range.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18604>
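The condition in the commit above can be sketched as a small predicate. This is an assumption-laden illustration: `sketch_range`, `sketch_range_intersects` and `sketch_may_upload` are hypothetical, and the real driver tracks valid contents and map flags in its own structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open byte range [start, end) of the buffer that holds valid data.
 * start >= end means nothing is valid yet. */
struct sketch_range {
   unsigned start;
   unsigned end;
};

static bool
sketch_range_intersects(const struct sketch_range *valid,
                        unsigned start, unsigned end)
{
   unsigned lo = start > valid->start ? start : valid->start;
   unsigned hi = end < valid->end ? end : valid->end;
   return lo < hi;
}

/* The upload (staging) path is only safe if it cannot clobber valid
 * contents: either the caller allows discarding the whole range, or the
 * range holds no valid data to begin with. */
static bool
sketch_may_upload(const struct sketch_range *valid,
                  unsigned start, unsigned end, bool discard_range)
{
   assert(start <= end);
   if (discard_range)
      return true;
   return !sketch_range_intersects(valid, start, end);
}
```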
Emma Anholt [Thu, 25 Aug 2022 21:48:01 +0000 (14:48 -0700)]
mesa: Lower mediump temps and CS shared when the driver supports FP16+INT16.
Typically GLSL mediump lowering will have lowered all the ALU ops
generating the values to 16-bit, and once vars_to_ssa happens the mediump
temps disappear. However, if they don't disappear (for example, the var
gets indirected and eventually gets lowered to scratch or indirect
lowering), then you don't want the storage upconverted to 32-bit.
Also, if a CS shared var is declared mediump, then storing it as 16 bit
prevents conversions around the load/store, assuming the ALU ops related to
it are 16 bit. For gfxbench aztec ruins, the CS shared var sizes are
cut in half, improving overall perf by 0.805549% +/- 0.0953482% (n=6) on
gl-5-normal.
freedreno shader-db:
total instructions in shared programs: 2917577 -> 2917743 (<.01%)
instructions in affected programs: 46141 -> 46307 (0.36%)
total last-baryf in shared programs: 109712 -> 109492 (-0.20%)
last-baryf in affected programs: 638 -> 418 (-34.48%)
total full in shared programs: 190275 -> 190218 (-0.03%)
full in affected programs: 156 -> 99 (-36.54%)
total constlen in shared programs: 492596 -> 492600 (<.01%)
constlen in affected programs: 8 -> 12 (50.00%)
total cat6 in shared programs: 33019 -> 33107 (0.27%)
cat6 in affected programs: 3604 -> 3692 (2.44%)
total stp in shared programs: 3626 -> 3670 (1.21%)
stp in affected programs: 3336 -> 3380 (1.32%)
total ldp in shared programs: 1718 -> 1762 (2.56%)
ldp in affected programs: 1680 -> 1724 (2.62%)
(this is all in aztec ruins)
total sstall in shared programs: 195656 -> 195182 (-0.24%)
sstall in affected programs: 3249 -> 2775 (-14.59%)
total (ss) in shared programs: 52823 -> 52966 (0.27%)
(ss) in affected programs: 1733 -> 1876 (8.25%)
total systall in shared programs: 507928 -> 508687 (0.15%)
systall in affected programs: 103010 -> 103769 (0.74%)
total (sy) in shared programs: 23185 -> 23196 (0.05%)
(sy) in affected programs: 1276 -> 1287 (0.86%)
total waves in shared programs: 435290 -> 435302 (<.01%)
waves in affected programs: 12 -> 24 (100.00%)
total loops in shared programs: 407 -> 405 (-0.49%)
loops in affected programs: 9 -> 7 (-22.22%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18452>
Emma Anholt [Wed, 7 Sep 2022 00:09:06 +0000 (17:09 -0700)]
nir/lower_mediump_vars: Don't lower mediump shared vars with atomic access.
I don't know of any GPUs doing 16-bit atomic accesses, nor do I know of
anybody wanting that in shaders. But deqp has GLES CTS cases that set
mediump on shared variables, so just skip lowering for those vars.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18452>
Emma Anholt [Wed, 10 Aug 2022 00:48:14 +0000 (17:48 -0700)]
freedreno/ir3: Consistently lower mediump inputs to 16-bit (when we can).
If every use was a conversion to 16, then ir3_cf would fold it into the
bary instruction. But if something had generated a highp comparison of
the mediump input with a mediump op result, it would get stuck as highp,
even though we could have used 16-bit values without upconverting.
This fixes dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_fragment on
ANGLE on turnip, closing #7043. fossil-db results are mixed:
fossil-db:
Totals from 697 (4.65% of 14988) affected shaders:
MaxWaves: 10712 -> 10736 (+0.22%)
Instrs: 82394 -> 83572 (+1.43%); split: -1.31%, +2.74%
CodeSize: 178280 -> 180118 (+1.03%); split: -0.46%, +1.49%
NOPs: 15887 -> 16067 (+1.13%); split: -7.48%, +8.61%
MOVs: 1297 -> 1328 (+2.39%); split: -6.86%, +9.25%
Full: 3730 -> 3842 (+3.00%); split: -1.80%, +4.80%
(ss): 1877 -> 1849 (-1.49%); split: -5.59%, +4.10%
(sy): 1249 -> 1255 (+0.48%); split: -1.04%, +1.52%
(ss)-stall: 6809 -> 6364 (-6.54%); split: -13.85%, +7.31%
(sy)-stall: 17059 -> 17257 (+1.16%); split: -6.51%, +7.67%
Cat0: 17220 -> 17400 (+1.05%); split: -6.90%, +7.94%
Cat1: 5307 -> 6366 (+19.95%); split: -6.93%, +26.89%
Cat2: 39138 -> 39101 (-0.09%); split: -0.31%, +0.22%
Cat3: 16772 -> 16741 (-0.18%)
Cat5: 1269 -> 1276 (+0.55%)
I tried to pick some apps to test that looked the most impacted, and
indeed the results are mixed:
cookie_run_kingdom: +0.275514% +/- 0.0883816% (n=68)
trex_200: +0.0943847% +/- 0.0297073% (n=1463)
command_and_conquer_rivals: no difference (n=131)
war_planet_online: no difference (n=120)
lego_legacy: -0.192131% +/- 0.152083% (n=99)
among_us: -0.625227% +/- 0.385419% (n=60)
Given that the perf results are small and go both ways, and apparently
we're an outlier in not always lowering mediump inputs to 16-bit, just do
it for consistency with other drivers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18506>
José Roberto de Souza [Fri, 9 Sep 2022 17:27:28 +0000 (10:27 -0700)]
intel/compiler/fs: Use DF to load constants when has_64bit_int is not supported
This was already being done for gen7 platforms, so now extend it to all
platforms without has_64bit_int.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18577>
José Roberto de Souza [Thu, 8 Sep 2022 15:49:05 +0000 (08:49 -0700)]
intel/compiler/fs: Fix compilation of shaders with SHADER_OPCODE_SHUFFLE of float64 type
During the lower_regioning() optimization, required_exec_type()
returns BRW_REGISTER_TYPE_UQ when processing
SHADER_OPCODE_SHUFFLE instructions of type BRW_REGISTER_TYPE_DF. MTL
has float64 support but lacks int64 support, so shader compilation
fails.
To fix that we could make required_exec_type() return
BRW_REGISTER_TYPE_DF in that case, but the SHADER_OPCODE_SHUFFLE virtual
instruction runs in the integer pipeline (inferred_exec_pipe()).
So replace the has_64bit check with has_64bit_int. This properly
handles both older and newer platforms by making this function return
BRW_REGISTER_TYPE_UD.
lower_exec_type() then takes care of generating two 32-bit operations
to accomplish the same result.
While at it, also drop the 'devinfo->verx10 == 70' check, as
GFX7_FEATURES falls into the same category as MTL: it has float64 but no
int64 support.
Fixes at least these crucible tests:
func.uniform-subgroup.exclusive.fadd64.q0
func.uniform-subgroup.exclusive.fmin64.q0
func.uniform-subgroup.exclusive.fmax64.q0
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18577>
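A rough sketch of the selection logic the commit above describes, restricted to 32-bit and 64-bit operands. The types and helpers (`sketch_devinfo`, `sketch_reg_type`, `sketch_shuffle_exec_type`, `sketch_shuffle_op_count`) are hypothetical and only mirror the names used in the message; the compiler's real functions handle more cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical integer execution types: 32-bit (UD) and 64-bit (UQ). */
enum sketch_reg_type {
   SKETCH_TYPE_UD,
   SKETCH_TYPE_UQ,
};

struct sketch_devinfo {
   bool has_64bit_float;
   bool has_64bit_int;
};

/* SHUFFLE runs in the integer pipeline, so a 64-bit integer execution
 * type is only usable when the device supports 64-bit integers,
 * whether or not it supports float64. */
static enum sketch_reg_type
sketch_shuffle_exec_type(const struct sketch_devinfo *devinfo,
                         unsigned type_size_bytes)
{
   assert(type_size_bytes == 4 || type_size_bytes == 8);
   if (type_size_bytes == 8 && devinfo->has_64bit_int)
      return SKETCH_TYPE_UQ;
   return SKETCH_TYPE_UD;
}

/* Number of operations needed once the execution type is fixed: a
 * 64-bit value shuffled with a 32-bit type takes two operations. */
static unsigned
sketch_shuffle_op_count(const struct sketch_devinfo *devinfo,
                        unsigned type_size_bytes)
{
   enum sketch_reg_type t =
      sketch_shuffle_exec_type(devinfo, type_size_bytes);
   unsigned exec_size = (t == SKETCH_TYPE_UQ) ? 8 : 4;
   return type_size_bytes / exec_size;
}
```

A device with float64 but without int64 ends up with the 32-bit type and two operations for a double, while a device with both keeps a single 64-bit operation.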
Samuel Pitoiset [Tue, 13 Sep 2022 07:24:52 +0000 (09:24 +0200)]
radv: stop checking for NULL pipelines in radv_CmdBindPipeline()
This should never happen now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18567>
Samuel Pitoiset [Tue, 13 Sep 2022 07:23:08 +0000 (09:23 +0200)]
radv: stop dirtying the graphics pipeline when restoring it
radv_CmdBindPipeline() does it already.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18567>
Samuel Pitoiset [Tue, 13 Sep 2022 07:21:48 +0000 (09:21 +0200)]
radv: reset the compute pipeline when the saved one was NULL
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18567>
Samuel Pitoiset [Tue, 13 Sep 2022 07:21:17 +0000 (09:21 +0200)]
radv: do not bind NULL graphics pipeline when restoring the meta state
It's invalid to bind NULL pipelines, but make sure to reset it to
its previous NULL state.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18567>
Samuel Pitoiset [Mon, 12 Sep 2022 16:37:04 +0000 (18:37 +0200)]
radv: stop setting redundant viewport/scissor for internal operations
Only emit them when it's needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18567>
David Riley [Fri, 8 Jul 2022 20:23:23 +0000 (13:23 -0700)]
drm-shim: Allow drm-shim to work with glibc fortify.
Signed-off-by: David Riley <davidriley@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18558>
Boris Brezillon [Tue, 13 Sep 2022 14:18:33 +0000 (16:18 +0200)]
ci/panvk: Skip dEQP-VK.api.object_management.max_concurrent.query_pool
This test times out occasionally. Let's disable it for now.
Reported-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18578>
David Heidelberg [Wed, 14 Sep 2022 10:20:57 +0000 (12:20 +0200)]
ci/traces: remove first line with YAML version to prevent failure
Older libyaml (0.2.2) fails with YAML 1.2, so just drop the line.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18595>
David Heidelberg [Wed, 14 Sep 2022 07:36:48 +0000 (09:36 +0200)]
ci: add jq utility
Needed as a dependency for the yq utility.
Also bump x86-build-base image.
Fixes:
f2649b93e29e ("ci: performance traces: make use of no-perf label")
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18595>
David Heidelberg [Wed, 14 Sep 2022 11:03:44 +0000 (13:03 +0200)]
ci: use xargs instead of find -exec
This allows us to see the failure when yamllint returns non-zero.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18595>
Chad Versace [Tue, 30 Aug 2022 23:42:07 +0000 (16:42 -0700)]
venus: Use VkPhysicalDeviceVulkan13{Features,Properties}
Add the structs to vn_physical_device, just like we do for the 1.1 and
1.2 structs.
Prepares for Vulkan 1.3 enablement. No intended change in behavior.
Tested with an Intel Tigerlake GPU on the CrOS device volteer.
I tested only a small subset of dEQP because this branch only touches
the code for VkPhysicalDevice{Features2,Properties2}.
vulkan-cts-1.3.3.0
dEQP-VK.api.info.*
dEQP-VK.api.smoke.*
pass/skip/fail = 3796/9/0
I tested Dota 2 on borealis on volteer, with non-Proton Vulkan. The
game launches and reaches the main menu. Same with Hades with DX on
Proton 7.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18158>
Chad Versace [Fri, 19 Aug 2022 23:45:42 +0000 (16:45 -0700)]
venus: Fix features/properties for unavailable extensions
In vn_physical_device_init_features() and
vn_physical_device_init_properties(), we queried many extension structs
even if the extension was unavailable. Afterwards we copied the
undefined values from the extension structs into the core structs.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18158>
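The bug class described in the commit above can be sketched in a few lines. This is only an illustration under stated assumptions: `ext_features`, `core_features`, `query_ext_features()` and `init_core_features()` are hypothetical stand-ins, not the venus or Vulkan definitions. The point is that the extension struct is zero-initialized, queried only when the extension is available, and copied into the core struct only in that case.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the venus or Vulkan structs. */
struct ext_features {
   bool some_feature;
};

struct core_features {
   bool some_feature;
};

/* Hypothetical query: fills the extension struct only when it is called. */
static void
query_ext_features(struct ext_features *ext)
{
   ext->some_feature = true;
}

static void
init_core_features(struct core_features *core, bool ext_available)
{
   struct ext_features ext;
   memset(&ext, 0, sizeof(ext)); /* never leave the struct undefined */

   if (ext_available)
      query_ext_features(&ext);

   /* Copy into the core struct only if the extension backs the value. */
   core->some_feature = ext_available ? ext.some_feature : false;
}
```

With this shape, an unavailable extension always yields a cleared core field rather than whatever happened to be in the unqueried struct.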
Chad Versace [Fri, 26 Aug 2022 22:29:03 +0000 (15:29 -0700)]
venus: Add macros VN_SET_CORE_*
Used to refactor vn_physical_device.c. The new code is easier to read and
has less duplication.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18158>
Chad Versace [Wed, 17 Aug 2022 23:59:24 +0000 (16:59 -0700)]
venus: Refactor VN_ADD_TO_PNEXT
Motivation is easier sorting and readability.
- In VN_ADD_TO_PNEXT_OF, re-arrange params to allow sorting. Param1 is
invariant in each block. Param2 is sType.
- In VN_ADD_EXT_TO_PNEXT_OF, make its initial params match those of
VN_ADD_TO_PNEXT_OF.
- Then sort the macro calls.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18158>
Chad Versace [Mon, 12 Sep 2022 21:39:29 +0000 (14:39 -0700)]
venus: Rename some feature/property structs
Make the variable name more closely match the type name.
This also allows them to sort correctly.
argb_4444_formats -> _4444_formats
eight_bit_storage -> _8bit_storage
sixteen_bit_storage -> _16bit_storage
While touching vn_physical_device.[ch], also run clang-format.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18158>
Mike Blumenkrantz [Mon, 12 Sep 2022 18:47:39 +0000 (14:47 -0400)]
zink: handle split acquire/present
if the swapchain image is acquired in a different cmdbuf than it gets
presented with, the acquire semaphore will have already been submitted
by this point, and the swapchain should be flagged as such
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18557>
Mike Blumenkrantz [Thu, 8 Sep 2022 21:14:35 +0000 (17:14 -0400)]
radv: avoid bottlenecking on sequential sparse buffer binds
it's more costly to submit individual sparse buffer binds than to
merge them and submit bigger binds, so try to pre-compare and flatten
out the bind array as much as possible to reduce ioctl counts
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18507>
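A minimal sketch of the flattening idea from the commit above, not the actual radv code. The `sparse_bind` struct and `merge_sequential_binds()` are hypothetical, and the sketch assumes two binds can be merged when they target the same memory object and are contiguous in both the buffer and the backing memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified bind record; not the radv or Vulkan struct. */
struct sparse_bind {
   uint64_t resource_offset; /* offset inside the sparse buffer */
   uint64_t size;            /* size of the bound range in bytes */
   int memory_id;            /* stand-in for a memory object handle */
   uint64_t memory_offset;   /* offset inside that memory object */
};

/* Merge runs of binds that are contiguous in both the buffer and the
 * backing memory. Works in place and returns the new bind count. */
static size_t
merge_sequential_binds(struct sparse_bind *binds, size_t count)
{
   assert(binds != NULL || count == 0);
   if (count == 0)
      return 0;

   size_t out = 0;
   for (size_t i = 1; i < count; i++) {
      struct sparse_bind *prev = &binds[out];
      const struct sparse_bind *cur = &binds[i];

      if (cur->memory_id == prev->memory_id &&
          cur->resource_offset == prev->resource_offset + prev->size &&
          cur->memory_offset == prev->memory_offset + prev->size) {
         prev->size += cur->size; /* extend the previous bind */
      } else {
         binds[++out] = *cur; /* start a new merged bind */
      }
   }
   return out + 1;
}
```

Each merged entry then stands for one larger submission instead of several small ones, which is where the reduction in ioctl count would come from.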
Mike Blumenkrantz [Wed, 14 Sep 2022 12:01:00 +0000 (08:01 -0400)]
docs: add more features
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17956>
Mike Blumenkrantz [Thu, 1 Sep 2022 11:04:07 +0000 (07:04 -0400)]
lavapipe: ARM/EXT_rasterization_order_attachment_access
another no-op
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17956>
Mike Blumenkrantz [Tue, 9 Aug 2022 13:16:29 +0000 (09:16 -0400)]
lavapipe: VK_EXT_attachment_feedback_loop_layout
no-op
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17956>
Iago Toral Quiroga [Mon, 12 Sep 2022 10:58:20 +0000 (12:58 +0200)]
v3dv: expose VK_EXT_load_store_op_none
This extension adds new NONE attachment load / store operations,
which are identical to the DONT_CARE variants with the difference
that DONT_CARE doesn't ensure that the original contents of the
memory within the render area are preserved and these new versions
do (with some caveats).
Our implementation was not destroying data with DONT_CARE anyway
so we already support the new semantics. Our implementation is
such that we don't need to do anything specific with the new
operations and the current behavior will do what is expected.
We pass all the tests under:
dEQP-VK.renderpass*.load_store_op_none.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18570>
Iago Toral Quiroga [Tue, 13 Sep 2022 06:52:58 +0000 (08:52 +0200)]
v3dv: don't load an attachment for unaligned render area if we are not storing
If the render area is not aligned to tile boundaries it means we have partially
covered tiles in the framebuffer. In this case, we always need to load the tile
buffer from memory in order to preserve the contents outside the render area
on the tile buffer store. However, if in this scenario we know we won't be
storing the tile buffer we can skip the load safely.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18570>
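The decision described in the commit above can be sketched as a small predicate. This is a simplified illustration, not the v3dv implementation: `needs_tile_load()` and its parameters are hypothetical, and the sketch assumes that an explicit request to load the attachment is always honored regardless of alignment.

```c
#include <stdbool.h>

/* Hypothetical predicate, not the v3dv function.
 *
 * load_requested:    the attachment's load operation asks for the old contents
 * area_tile_aligned: the render area covers only whole tiles
 * will_store:        the tile buffer is written back to memory afterwards
 */
static bool
needs_tile_load(bool load_requested, bool area_tile_aligned, bool will_store)
{
   /* Assumption of this sketch: an explicit load is always honored. */
   if (load_requested)
      return true;

   /* Partially covered tiles must be loaded so that the store does not
    * clobber pixels outside the render area. Without a store there is
    * nothing to clobber, so the load can be skipped. */
   return !area_tile_aligned && will_store;
}
```

The only case that changes relative to "always load when unaligned" is the unaligned render area with no store, which now skips the load.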
Danylo Piliaiev [Mon, 5 Sep 2022 08:12:01 +0000 (11:12 +0300)]
turnip: implement VK_EXT_multi_draw
vkoverhead running:
* draw numbers are reported as thousands of operations per second
* percentages for draw cases are relative to 'draw'
0, draw, 29151, 100.0%
1, draw_multi, 35449, 121.6%
2, draw_vertex, 28907, 99.2%
3, draw_multi_vertex, 56658, 194.4%
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11502>
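The percentages in the vkoverhead table above are each case's throughput relative to the 'draw' baseline of 29151. A small sketch of that arithmetic follows; `relative_permille()` is a hypothetical helper, not part of vkoverhead, and it returns tenths of a percent rounded to nearest so the result can be compared exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value relative to baseline, in tenths of a percent,
 * rounded to nearest. A result of 1216 reads as 121.6%. */
static unsigned
relative_permille(uint64_t value, uint64_t baseline)
{
   assert(baseline != 0);
   return (unsigned)((value * 1000 + baseline / 2) / baseline);
}
```

For example, `relative_permille(56658, 29151)` gives 1944, matching the 194.4% reported for draw_multi_vertex.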
Rajnesh Kanwal [Tue, 13 Sep 2022 14:30:33 +0000 (15:30 +0100)]
pvr: Fix multiple file descriptor leaks.
Signed-off-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Reported-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18594>
Connor Abbott [Tue, 26 Jul 2022 10:25:30 +0000 (12:25 +0200)]
tu: Initial implementation of VK_EXT_inline_uniform_block
This is a trivial implementation where we just insert a UBO descriptor
pointing to the actual data and then treat it as a normal UBO everywhere
else. In theory an indirect CP_LOAD_STATE would be more efficient than
ldc.k to preload inline uniform blocks to constants. However we will
always need the UBO descriptor anyway, even if we lower the limits
enough to always be able to preload them, because with variable pointers
we may have a pointer that could be to either an inline uniform block or
regular uniform block. So, using an indirect CP_LOAD_STATE should be an
optimization on top of this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17960>
Connor Abbott [Tue, 26 Jul 2022 10:20:16 +0000 (12:20 +0200)]
tu: Don't preload variable-count descriptors
We don't know how many descriptors will actually be valid, which could
lead to preloading descriptors out-of-bounds of the descriptor size.
This was leading to GPU hangs on some tests once we enabled inline
uniforms.
Fixes:
d9fcf5de55a ("turnip: Enable nonuniform descriptor indexing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17960>
Connor Abbott [Tue, 26 Jul 2022 10:09:36 +0000 (12:09 +0200)]
tu: Fix descriptor set size bounds
This old code looks like it was left around from anv. Make it use the
limits the rest of the code uses.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17960>
Rhys Perry [Fri, 28 Jan 2022 15:48:39 +0000 (15:48 +0000)]
nir/algebraic: optimize fabs(bcsel(b, fneg(a), a))
fossil-db (Sienna Cichlid):
Totals from 207 (0.15% of 134913) affected shaders:
VGPRs: 7152 -> 6928 (-3.13%)
CodeSize: 762404 -> 752888 (-1.25%)
MaxWaves: 6138 -> 6146 (+0.13%)
Instrs: 144031 -> 142184 (-1.28%)
Latency: 817783 -> 807286 (-1.28%)
InvThroughput: 151031 -> 147497 (-2.34%)
VClause: 1490 -> 1453 (-2.48%)
SClause: 3357 -> 3331 (-0.77%); split: -0.92%, +0.15%
Copies: 9632 -> 9555 (-0.80%); split: -0.81%, +0.01%
Branches: 4306 -> 4270 (-0.84%)
PreSGPRs: 11232 -> 11218 (-0.12%); split: -0.15%, +0.03%
PreVGPRs: 6307 -> 6121 (-2.95%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14772>
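The pattern named in the commit title above selects between `a` and its negation and then takes the absolute value. The optimization presumably rests on the identity that the selection cannot affect the result: fabs(b ? -a : a) equals fabs(a) for either value of b. A scalar sketch of that identity in C, with hypothetical function names and ignoring NaN comparisons:

```c
#include <math.h>
#include <stdbool.h>

/* Scalar model of fabs(bcsel(b, fneg(a), a)). */
static float
select_then_abs(bool b, float a)
{
   return fabsf(b ? -a : a);
}

/* The simpler expression that yields the same value. */
static float
abs_only(float a)
{
   return fabsf(a);
}
```

Dropping the select and the negation leaves fewer instructions and fewer live values, which is consistent with the instruction count and VGPR reductions in the fossil-db numbers.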
Danylo Piliaiev [Mon, 5 Sep 2022 09:18:55 +0000 (12:18 +0300)]
ir3: Prevent reordering movmsk with kill
`kill` changes which fibers are active, so reordering instructions
that depend on which fibers are active is wrong.
The issue was hidden because only `ballot(true)` is translated to movmsk
immediately, while others are passed as MACRO and don't properly
take part in ir3_sched (which does the reordering).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7162
Fixes CTS test (on gen3+):
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.subgroup_ballot
Fixes:
b1b80c06a78e62b2d8477b07f12b0153435b66a8
("ir3: Implement nir subgroup intrinsics")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18413>
Frank Binns [Sat, 20 Aug 2022 17:49:52 +0000 (18:49 +0100)]
pvr: add required pixel formats
As per section 33.3 ("Required Format Support") of the Vulkan 1.0 spec - see
tables 42 to 52.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18572>
Tapani Pälli [Wed, 11 May 2022 10:10:07 +0000 (13:10 +0300)]
iris: disable preemption on VFG, Wa_14015207028 for DG2
This workaround disables batch level preemption for Polygon,
Trifan and Lineloop primitive topologies.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18456>