Samuel Pitoiset [Tue, 25 Jul 2023 06:07:29 +0000 (08:07 +0200)]
radv: pass submit info to radv_check_gpu_hangs()
This will allow to dump preambles/postambles CS and eventually even
more CS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24191>
Samuel Pitoiset [Tue, 25 Jul 2023 06:07:07 +0000 (08:07 +0200)]
radv/amdgpu: rename old_ib to ib in radv_amdgpu_winsys_cs_dump()
Forgot this variable when I renamed the ib_buffers array.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24191>
Samuel Pitoiset [Tue, 25 Jul 2023 06:06:45 +0000 (08:06 +0200)]
radv/amdgpu: fix dumping CS with the chained IBs path
ib_buffer is now NULL in both paths, and the first IB is the beginning
of the chain.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24191>
Samuel Pitoiset [Thu, 20 Jul 2023 15:31:55 +0000 (17:31 +0200)]
radv: use next_stage for determining the stage to lower NGG
If the next stage is FS, it's also the last VGT API stage.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Samuel Pitoiset [Thu, 20 Jul 2023 15:11:50 +0000 (17:11 +0200)]
radv: simplify getting next VS stage for VS prologs
It's the VS shader info stage.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Samuel Pitoiset [Fri, 21 Jul 2023 09:10:37 +0000 (11:10 +0200)]
radv: determine as_ls earlier by using the next stage
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Samuel Pitoiset [Fri, 21 Jul 2023 08:54:40 +0000 (10:54 +0200)]
radv: determine ES info for VS/TES with GS earlier
By using the next stage, it's possible to compute these information
earlier without having to link shaders info.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Samuel Pitoiset [Fri, 21 Jul 2023 08:40:28 +0000 (10:40 +0200)]
radv: use the number of GS linked inputs to compute the ESGS itemsize
It's similar.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Samuel Pitoiset [Fri, 21 Jul 2023 08:39:39 +0000 (10:39 +0200)]
radv: add a helper to compute the ESGS itemsize
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Samuel Pitoiset [Fri, 21 Jul 2023 06:44:39 +0000 (08:44 +0200)]
radv: remove the pipeline dependency for creating a GS copy shader
This is unnecessary. While we are at it, stop passing the array of
shaders and use the GS stage only.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24273>
Jianxun Zhang [Fri, 21 Jul 2023 03:33:34 +0000 (20:33 -0700)]
intel/common: Only set op mask on instructions in decoder
When a default value of a struct's field, which is in the
higher half of the first dword, is specified in a gen xml
file, setting op mask makes decoder treat the field as a
header (intel_field_is_header()). As a result, it won't
output the field in batch dump. This is not a common case
but can happen once a gen xml file includes such fields.
The op mask is only meaningful to instructions, so we fix
the above issue by not setting op mask of structs (also
registers).
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24268>
Nanley Chery [Fri, 21 Apr 2023 21:34:48 +0000 (14:34 -0700)]
iris: Handle clear color compatibility in prepare_render
Before this patch, iris_resource_render_aux_usage would disable
compression when the clear color did not support format
reinterpretation.
With this patch, iris now replaces the clear color with zero and keeps
compression enabled. Disabling fast clears would be enough for most aux
usages, but replacement is also done to handle ISL_AUX_USAGE_FCV_CCS_E.
Note that this also fixes a bug. Format reinterpretation with
incompatible clear colors previously was not handled for the MCS aux
usages.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
Nanley Chery [Wed, 7 Jun 2023 19:49:31 +0000 (15:49 -0400)]
iris: Create BLORP surfaces after resource preparation
iris_resource_prepare_render will soon gain the ability to change a
resource's clear color. iris_blorp_surf_for_resource will keep a copy of
that clear color, so make sure calls to it happen after the render
preparation helper. At the moment, this shouldn't have an impact besides
improving debugging.
While we're here, do the same for the generic access preparation helper.
We may convert those to more specific helpers at a later time.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
Nanley Chery [Wed, 7 Jun 2023 18:47:10 +0000 (14:47 -0400)]
iris: Pass the render format to prepare_render
This will be used in an upcoming patch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
Nanley Chery [Wed, 7 Jun 2023 19:39:27 +0000 (15:39 -0400)]
iris: Reorder render_aux_usage parameters
Match the order of the parameters for iris_resource_texture_aux_usage.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
Nanley Chery [Tue, 13 Jun 2023 12:02:39 +0000 (08:02 -0400)]
intel/blorp: Ambiguate after CCS resolves on gfx7-8
ISL's state-machine of CCS_D describes full resolves as leaving the aux
buffer in the pass-through state. Hardware doesn't behave this way on
gfx8 however. On that platform, full resolves transition the aux buffer
to the resolved state. This was verified by dumping the CCS before and
after a full resolve on BDW (gfx7 is simply assumed to behave the same).
Ambiguate after resolving to match driver expectations.
Prevents iris from failing piglit's fcc-write-after-clear on BDW with a
future patch which relies on fast-clear encodings being removed after a
resolve. The avoided failure is:
Testing implicit read of partial block UNORM -> SNORM
Probe color at (0,1,0)
Expected: 1.000000 1.000000 1.000000 1.000000
Observed: 0.000000 0.000000 0.000000 0.000000
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23676>
Lionel Landwerlin [Wed, 19 Jul 2023 06:22:59 +0000 (09:22 +0300)]
intel/fs: don't try to rebuild sequences of non ssa values
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes:
04777171e0 ("intel/fs: try to rematerialize surface computation code")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9378
Reviewed-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228>
Caio Oliveira [Wed, 19 Jul 2023 21:57:03 +0000 (14:57 -0700)]
meson: Ensure that LLVMSPIRVLib is not required for Clover
Fixes:
cb588d5d6ee ("compiler/clc: Move related NIR passes to the common mesa clc")
Closes: #9391
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24244>
Emma Anholt [Tue, 18 Jul 2023 23:34:14 +0000 (16:34 -0700)]
ci/tgl: Improve the info for ANGLE's MSAA regression on TGL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24200>
Emma Anholt [Mon, 17 Jul 2023 19:26:09 +0000 (12:26 -0700)]
ci: Uprev ANGLE to
0518a3ff4d4e ("Android: Simplify power metrics collection")
There have been some fixes for our drivers that we'd like to bring in.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24200>
Emma Anholt [Mon, 17 Jul 2023 21:57:01 +0000 (14:57 -0700)]
ci/radv: Clarify when the ANGLE GS failures started happening.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24200>
Faith Ekstrand [Fri, 21 Jul 2023 19:52:41 +0000 (14:52 -0500)]
anv,hasvk,iris: sampler_prog_key::swizzles is only used on crocus
The field is no longer consumed by brw_complie_* and is instead handled
directly by the crocus driver. Therefore, it's safe to leave it zero
and not even bother setting it. This removes our reliance on the
SWIZZLE_* macros in prog_instructions.h.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24288>
Christian Gmeiner [Tue, 18 Jul 2023 08:55:18 +0000 (10:55 +0200)]
etnaviv: nir: convert to new-style NIR registers
The initial plan was to use 'nir_legacy' helpers but it turns out
that our RA pass is hard to confince to be happy with it. So we are
useing the 'chasing' helpers now.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Tue, 18 Jul 2023 09:36:08 +0000 (11:36 +0200)]
etnaviv: nir: switch to etna_nir_lower_to_source_mods(..)
nir's source modifiers are going away soon and with it also the lowering
pass. Lets switch to our own lowering pass. We need to run our own
lowering pass almost at the end else opc_cse(..) etc. might do some
wrong needed opts as nir does not see our modifiers.
Also we need to remove the last nir_opt_dce(..) as it will remove not dead
code caused by the used load_const hack.
32 %15 = load_const (0x00000000 = 0.000000)
32 %4 = fabs %15 (0.000000)
nir_opt_dce is correct when it removes the two instructions. But in reality
the load_const is a uniform that should not be removed.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 08:41:16 +0000 (10:41 +0200)]
etnaviv: nir: add etna_nir_lower_to_source_mods(..)
This is more or less a copy of nir_lower_to_source_mods(..) with
the following differences:
- we store the source mods in pass_flags
- we do not try to saturate the destination
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 15:26:16 +0000 (17:26 +0200)]
etnaviv: nir: look at parent instr in lower_alu(..)
When we switch to our own lower_to_source_mods pass we will start
to see such patterns:
32x4 %18 = fneg %5 (-5.125000, -30.000000, 5.500000, -6.500000)
32x4 %19 = ffma %18, %8, %4 (-6.500000, -7.750000, 6.500000, 6.000000)
This is a problem as we will generate instruction that accesses two
different uniforms, which is a problem on GPUs where has_no_oneconst_limit
is false.
Make lower_alu(..) smarter by looking in the parent for for the constant
value.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 08:45:59 +0000 (10:45 +0200)]
etnaviv: do not clear all pass_flags before RA
We only need to clear the 'dead' bits. The others are
used for source mods.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 08:35:39 +0000 (10:35 +0200)]
etnaviv: extend etna_pass_flags with source modifiers
As nir_lower_to_source_mods(..) will be deleted and with it the modifier storage
in nir's core we need to find an other way store the information.
We have have 6 bits left in nir's pass_flags - so lets go that route.
This also adds some small helpers that will be used later.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 07:57:10 +0000 (09:57 +0200)]
etnaviv: add is_dead_instruction(..) helper
As we are going to extend the enum etna_pass_flags it makes sense
to add a small helper to test if an instruction is dead. An instruction
is dead if BYPASS_DST or BYPASS_SRC is set.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 07:52:27 +0000 (09:52 +0200)]
etnaviv: name the enum used for pass_flags
This enum is used for the pass_flags that can be set on a
nir_instr. Name it to make the intention of its usage clear.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
Christian Gmeiner [Wed, 19 Jul 2023 07:46:03 +0000 (09:46 +0200)]
etnaviv: make use of BITFIELD_BIT(..) macro
It helps to make the code easier to read.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24216>
David Rosca [Fri, 21 Jul 2023 09:05:35 +0000 (11:05 +0200)]
frontends/va: Add YUV420 to NV12 postproc conversion
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7853
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24272>
David Rosca [Thu, 20 Jul 2023 19:40:27 +0000 (21:40 +0200)]
gallium/auxiliary/vl: Fix blurry output of compute_shader_yuv
There is a linear sampler used, so add half texel offset
to avoid undesirable blur when input and output resolutions
are the same.
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24272>
David Rosca [Thu, 20 Jul 2023 16:02:08 +0000 (18:02 +0200)]
gallium/auxiliary/vl: Handle UV subsampling in compute_shader_yuv
Also remove the 1px vertical shift as it results
in a black line at the bottom of the picture.
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24272>
Georg Lehmann [Mon, 24 Jul 2023 11:57:16 +0000 (13:57 +0200)]
aco: improve get_gfx11_true16_mask description
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24294>
Georg Lehmann [Sat, 22 Jul 2023 21:17:45 +0000 (23:17 +0200)]
aco/gfx11: fix get_gfx11_true16_mask with v_cmp_class_f16
The second operand is 16bit, so the we need to use VOP3 to address v128-v255.
Closes: #9413
Fixes:
6872f8d861b ("aco/gfx11: allow true 16-bit instructions to access v128+")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24294>
Rhys Perry [Mon, 24 Jul 2023 11:02:48 +0000 (12:02 +0100)]
nir/tests: add nir_opt_dead_cf_test.jump_before_constant_if
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24235>
Rhys Perry [Wed, 19 Jul 2023 13:05:11 +0000 (14:05 +0100)]
nir/opt_dead_cf: remove nodes after a jump earlier
In the case of:
halt
// succs: b9
if %618 {
block b3:// preds:
break
// succs: b6
} else {
block b4: // preds: , succs: b5
}
block b5: // preds: b4
32 %556 = iadd %617, %2 (0x1)
opt_constant_if() doesn't work because stitch_blocks() can't join blocks if the
before ends in a jump and the after isn't empty.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24235>
Konstantin Seurer [Thu, 20 Jul 2023 08:35:09 +0000 (10:35 +0200)]
nir/tests: Use a single binary
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24249>
Konstantin Seurer [Thu, 20 Jul 2023 08:33:52 +0000 (10:33 +0200)]
nir/tests: Refactor boilerplate into a common header
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24249>
Danylo Piliaiev [Fri, 21 Jul 2023 11:27:38 +0000 (13:27 +0200)]
tu,freedreno: Forbid blit event for R8G8_SRGB due to gpu faults
Same cause as for other R8G8 formats - msaa resolve via
blit event causes gpu fault.
Fixes:
dEQP-VK.api.image_clearing.*.clear_color_attachment.*.r8g8_srgb_*
Fixes:
029919f3c83f379065515708188d5c439c3fa6bc
("tu: allow using resolve engine for SRGB MSAA resolves")
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24277>
Charles Giessen [Fri, 21 Jul 2023 20:28:13 +0000 (20:28 +0000)]
panvk: Use 1.0 in ICD Manifest json
PanVK downgraded from supporting Vulkan 1.1 to 1.0, but did not change
their ICD Manifest api_version to reflect that. This cause the Vulkan-Loader
to interpret the ICD as a 1.1 driver erroneously. Originally discussed in this
issue https://github.com/KhronosGroup/Vulkan-Loader/issues/1242
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24289>
Marcin Ślusarz [Fri, 21 Jul 2023 09:50:51 +0000 (11:50 +0200)]
intel/compiler: load debug mesh compaction options once
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
Marcin Ślusarz [Wed, 21 Dec 2022 14:42:55 +0000 (15:42 +0100)]
intel/compiler,anv: put some vertex and primitive data in headers
Both per-primitive and per-vertex space is allocated in MUE in 8 dword
chunks and those 8-dword chunks (granularity of
3DSTATE_SBE_MESH.Per[Primitive|Vertex]URBEntryOutputReadLength)
are passed to fragment shaders as inputs (either non-interpolated
for per-primitive and flat vertex attributes or interpolated
for non-flat vertex attributes).
Some attributes have a special meaning and must be placed in separate
8/16-dword slot called Primitive Header or Vertex Header.
Primitive Header contains 4 such attributes (Cull Primitive,
ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword
slot) potentially unused.
Vertex Header is similar - it starts with 3 unused dwords, 1 dword for
Point Size (but if we declare that shader doesn't produce Point Size
then we can reuse it), followed by 4 dwords for Position and optionally
8 dwords for clip distances.
This means we have an interesting optimization problem - we can put
some user attributes into holes in Primitive and Vertex Headers, which
may lead to smaller MUE size and potentially more mesh threads running
in parallel, but we have to be careful to use those holes only when
we need it, otherwise we could force HW to pass too much data to
fragment shader.
Example 1:
Let's assume that Primitive Header is enabled and user defined
12 dwords of per-primitive attributes.
Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of
MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader.
With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to
fragment shader.
16/16 is better than 24/16, so packing makes sense.
Example 2:
Now let's assume that Primitive Header is enabled and user defined
16 dwords of per-primitive attributes.
Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of
MUE space and pass ALIGN(16, 16) = 16 dwords to fragment shader.
With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to
fragment shader.
24/24 is worse than 24/16, so packing doesn't make sense.
This change doesn't affect vk_meshlet_cadscene in default configuration,
but it speeds it up by up to 25% with "-extraattributes N", where
N is some small value divisible by 2 (by default N == 1) and we
are bound by URB size.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
Marcin Ślusarz [Wed, 21 Dec 2022 14:40:07 +0000 (15:40 +0100)]
intel/compiler/mesh: compactify MUE layout
Instead of using 4 dwords for each output slot, use only the amount
of memory actually needed by each variable.
There are some complications from this "obvious" idea:
- flat and non-flat variables can't be merged into the same vec4 slot,
because flat inputs mask has vec4 stride
- multi-slot variables can have different layout:
float[N] requires N 1-dword slots, but
i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot
- some output variables occur both in single-channel/component split
and combined variants
- crossing vec4 boundary requires generating more writes, so avoiding them
if possible is beneficial
This patch fixes some issues with arrays in per-vertex and per-primitive data
(func.mesh.ext.outputs.*.indirect_array.q0 in crucible)
and by reduction in single MUE size it allows spawning more threads at
the same time.
Note: this patch doesn't improve vk_meshlet_cadscene performance because
default layout is already optimal enough.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>
Samuel Pitoiset [Fri, 21 Jul 2023 12:19:28 +0000 (14:19 +0200)]
radv: add radv_compile_cs() to compile a compute shader
This doesn't rely on the pipeline.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24280>
Samuel Pitoiset [Fri, 21 Jul 2023 12:28:06 +0000 (14:28 +0200)]
radv: stop using an array of binaries when compiling a compute shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24280>
Zhang Ning [Sun, 18 Jun 2023 01:12:16 +0000 (09:12 +0800)]
Revert "intel/ci: disable iris-jsl-deqp because it always fails for an AMD MR"
This reverts commit
da4b5b4a47ca727a7c8892d2bea50739df3b94ed.
Signed-off-by: Zhang Ning <zhangn1985@outlook.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23815>
Timothy Arceri [Tue, 18 Jul 2023 06:28:30 +0000 (16:28 +1000)]
nir/opt_copy_prop_vars: drop reuse of dynamic arrays
After the previous commit there are so few to reuse that this is no
longer worth doing and actually causes compilation to slow down.
The Blender shader compile time in issue #9326 improves as folows:
21.11 seconds -> 9.90 seconds
The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:
0.92 seconds -> 0.68 seconds
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9326
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Timothy Arceri [Thu, 13 Jul 2023 11:12:32 +0000 (21:12 +1000)]
nir/opt_copy_prop_vars: skip cloning of copies arrays until needed
Most of the variables in the hash table will never actually be looked up
for any given block so cloning every possible value just creates a bunch
of unrequired memcpy calls.
Here we change the code to only clone the copies array once it is
actually looked up for the first time.
The Blender shader compile time in issue #9326 improves as folows:
151.09 seconds -> 21.11 seconds
The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:
1.67 seconds -> 0.92 seconds
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Timothy Arceri [Wed, 12 Jul 2023 06:15:29 +0000 (16:15 +1000)]
nir/opt_copy_prop_vars: remove var hash entry on kill alias
If kill alias results in the hash table entry holding an empty
copies array then remove the hash entry and return the dynamic array
to the unused pool.
This helps avoid hash table size getting out of control in very large
shaders.
151.09 seconds -> 118.60 seconds
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Timothy Arceri [Wed, 12 Jul 2023 06:34:11 +0000 (16:34 +1000)]
nir/opt_copy_prop_vars: speedup cloning of copy tables
Here we change things to simply clone the entire hash table. This
is much faster than trying to rebuild it and is needed to avoid
slow compilation of very large shaders.
The Blender shader compile time in issue #9326 improves as folows:
251.29 seconds -> 151.09 seconds
The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:
2.38 seconds -> 1.67 seconds
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Timothy Arceri [Tue, 11 Jul 2023 03:21:57 +0000 (13:21 +1000)]
nir/opt_copy_prop_vars: don't clone copies if branch empty
There is no point doing an expensive clone of the copies if the
if-branch is empty.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Qiang Yu [Thu, 6 Jul 2023 02:39:33 +0000 (10:39 +0800)]
radeonsi: enable aco compile for mono merged ES/GS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Thu, 6 Jul 2023 02:36:48 +0000 (10:36 +0800)]
radeonsi: enable aco compile for mono merged LS/HS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Mon, 17 Jul 2023 14:27:35 +0000 (22:27 +0800)]
radeonsi: calculate lds size for merged shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Thu, 6 Jul 2023 02:27:02 +0000 (10:27 +0800)]
radeonsi: aco compile support merged mono shader
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Wed, 12 Jul 2023 07:34:03 +0000 (15:34 +0800)]
radeonsi: refine si_llvm_es_build_end
1. merge si_set_es_return_value_for_gs into si_llvm_es_build_end
2. stop return value when mono mode in which case GS use ES input as
input instead of ES output
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Wed, 12 Jul 2023 07:21:43 +0000 (15:21 +0800)]
radeonsi: refine si_llvm_ls_build_end
1. merge si_set_ls_return_value_for_tcs into si_llvm_ls_build_end because they
do the same job to return value
2. stop return value when mono mode with different thread count, in which case
TCS use LS input as its input instead of LS output
3. use si_insert_input_ret_float
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Tue, 11 Jul 2023 10:08:19 +0000 (18:08 +0800)]
radeonsi: remove param type check in wrapper function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Thu, 6 Jul 2023 13:35:05 +0000 (21:35 +0800)]
radeonsi: move vertex shader vb desc input sgpr args to last
ACO use same args for merged shader stages, but vb desc input sgpr args
is not present when second stage of merged shader. In order to share
same shaders args, move it to last so other args have same index.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Tue, 11 Jul 2023 09:56:29 +0000 (17:56 +0800)]
radeonsi: simplify si_build_wrapper_function
We only need it to merge LS/HS or ES/GS now, prolog and epilog have
been lowered in nir already. So we just need to handle two parts and
they are sure to be first and second stage of a merged shader.
This also remove the needs SGPRs must be before VGPRs, which is required
by following commits to move some SGPRs after VGPRs.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Thu, 6 Jul 2023 07:11:57 +0000 (15:11 +0800)]
radeonsi: init aco shader info for merged LS/HS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Thu, 6 Jul 2023 02:09:34 +0000 (10:09 +0800)]
radeonsi: extract si_get_prev_stage_nir_shader to be shared with aco
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Wed, 5 Jul 2023 07:48:50 +0000 (15:48 +0800)]
radeonsi: aco does not pass LS outputs to HS by arg
aco has global input/output variables for this.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
Qiang Yu [Thu, 6 Jul 2023 06:45:02 +0000 (14:45 +0800)]
aco,radv: replace tess_input_vertices shader info param
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24204>
David Heidelberg [Sun, 23 Jul 2023 23:37:01 +0000 (01:37 +0200)]
ci/freedreno: cover all texture gather flakes
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24300>
Konstantin Seurer [Wed, 19 Jul 2023 19:19:40 +0000 (21:19 +0200)]
llvmpipe: Fix compiling with LP_USE_TEXTURE_CACHE
Fixes: 36eb75d ("llvmpipe: move to common sampler/image binding code")
Closes: #9359
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24243>
Bas Nieuwenhuizen [Thu, 20 Jul 2023 21:01:01 +0000 (23:01 +0200)]
nir: Fix 16-component nir_replicate.
Fixes:
f534c2c539f ("nir/builder: Add nir_replicate helper")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24286>
Bas Nieuwenhuizen [Fri, 21 Jul 2023 19:00:42 +0000 (21:00 +0200)]
aco: Fix some constant patterns in 16-bit vec4 construction with s_pack.
Fixes:
04e3d7ad930 ("aco: improve nir_op_vec with constant operands")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24286>
Bas Nieuwenhuizen [Thu, 20 Jul 2023 21:04:44 +0000 (23:04 +0200)]
aco: fix nir_op_vec8/16 with 16-bit elements.
Fixes:
5718347c2b4 ("aco: implement vec2/3/4 with subdword operands")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24286>
Alyssa Rosenzweig [Sat, 22 Jul 2023 16:40:49 +0000 (12:40 -0400)]
asahi: Don't depend on glibc to decode
fopencookie is a glibc feature, so we can't use it on macOS (and
probably other libc's?). It's only used for the hypervisor interface,
though, so we can just make the hypervisor piece glibc-only while
otherwise fixing the wrap.dylib build.
Fixes:
ee83453f69f ("asahi: Add a shared library interface for decode")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24293>
Eric Engestrom [Fri, 21 Jul 2023 17:30:53 +0000 (18:30 +0100)]
asahi: drop unused include paths
Signed-off-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24287>
Christian Gmeiner [Fri, 21 Jul 2023 15:00:31 +0000 (17:00 +0200)]
ci/etnaviv: update ci expectations
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24291>
Chia-I Wu [Sat, 22 Jul 2023 01:13:24 +0000 (18:13 -0700)]
amd/ci: update radv-stoney-aco-fails.txt for depth/stencil resolve
image_2d_16_64_6 ones have been fixed by the previous commit. The
others are outdated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23959>
Chia-I Wu [Sat, 1 Jul 2023 00:36:57 +0000 (17:36 -0700)]
radv: disable tc-compat htile for layered images on gfx8
sliceInterleaved may be true for layered images on gfx8. Such a htile
cannot be cleared with radv_clear_htile.
Fixes 24 failures in
dEQP-VK.renderpass2.depth_stencil_resolve.image_2d_16_64_6.* on GFX8.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23959>
Thomas H.P. Andersen [Fri, 21 Jul 2023 15:19:38 +0000 (17:19 +0200)]
tgsi: drop two unused functions
Removes:
* tgsi_util_get_src_from_ind
* tgsi_full_src_register_from_dst
The last usage of these got removed in !24175
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24283>
Yiwei Zhang [Wed, 12 Jul 2023 03:24:31 +0000 (20:24 -0700)]
venus: use in_render_pass to skip present_src counting
It's an early return also benefiting dynamic rendering. We then no
longer need to track the legacy pass from inheritance info.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Wed, 12 Jul 2023 02:49:43 +0000 (19:49 -0700)]
venus: refactor more cmd states into cmd builder
This change:
- adds helpers for cmd begin/end rendering
- simplifies cmd reset
- updates ordering to align with cmd builder
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Mon, 10 Jul 2023 23:15:06 +0000 (16:15 -0700)]
venus: avoid redundant tracking of render pass
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Mon, 10 Jul 2023 22:50:27 +0000 (15:50 -0700)]
venus: add helpers to track subpass view mask
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Fri, 7 Jul 2023 07:14:07 +0000 (00:14 -0700)]
venus: cleanup vn_cmd_begin_render_pass usage
For secondary command buffers, vn_cmd_begin_render_pass was only used to
track inherited render pass previously. So we clean it up.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Mon, 10 Jul 2023 21:58:58 +0000 (14:58 -0700)]
venus: use tracked queue_family_index from the cmd pool
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Thu, 6 Jul 2023 22:43:02 +0000 (15:43 -0700)]
venus: remove redundant fb tracking from cmd builder
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Sun, 9 Jul 2023 05:11:44 +0000 (22:11 -0700)]
venus: move transient storage from cmd to pool
The storage is for command scope usage, so it fits better for the pool.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Wed, 12 Jul 2023 02:09:05 +0000 (19:09 -0700)]
venus: log and doc the broken query feedback in suspended render pass
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Mon, 10 Jul 2023 23:31:48 +0000 (16:31 -0700)]
venus: fix cmd state leak across implicit reset
Reset cmd states during vkBeginCommandBuffer regardless of the
VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT for simplicity.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Yiwei Zhang [Thu, 6 Jul 2023 22:55:46 +0000 (15:55 -0700)]
venus: fix a cmd builder render_pass state leak across reset
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24103>
Dave Airlie [Fri, 21 Jul 2023 01:35:36 +0000 (11:35 +1000)]
gallivm: fix atomic global temporary storage.
Fixes regression on llvm15 with
piglit tests/cl/program/execute/builtin/atomic/atomic_xchg-global.cl
Fixes:
f28129000511 ("gallivm: Fix atomic_global types")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24267>
Charmaine Lee [Wed, 19 Jul 2023 06:09:53 +0000 (09:09 +0300)]
svga: set clear_texture to NULL for vgpu9
With PIPE_CAP_CLEAR_TEXTURE removed, we need to set clear_texture to NULL
on svga vgpu9 device so it can use the fallback path.
Fixes:
a1eabeff660 ("gallium: remove PIPE_CAP_CLEAR_TEXTURE")
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24264>
Yiwei Zhang [Thu, 20 Jul 2023 00:07:48 +0000 (17:07 -0700)]
ci/venus: update venus-lavapipe expectations
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24263>
Yiwei Zhang [Thu, 20 Jul 2023 18:47:52 +0000 (11:47 -0700)]
lvp: avoid reading immutable sampler from desc write info
Lavapipe has switched to layer push descriptor support atop descriptor
updates internally since
12a7fc51c77925a5562fd104a8fbd664a46ffc8b, so
it must skip retrieving immutable samplers from the write info even if
the update call itself is blessed by the spec to not hit that case.
Fixes:
12a7fc51c77 ("lavapipe: Rework descriptor handling")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24263>
Lionel Landwerlin [Mon, 12 Jun 2023 08:09:56 +0000 (11:09 +0300)]
vulkan: bump header register to 1.3.258
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24279>
Connor Abbott [Fri, 14 Jul 2023 16:15:09 +0000 (18:15 +0200)]
tu, freedreno/a6xx: Remove has_ccu_flush_bug
Based on the previous commit, this isn't actually a bug and is expected
behavior. Turnip should already be handling it correctly for user
flushes, we just have to make sure to handle it for flushes we insert
ourselves in turnip and freedreno.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24162>
Connor Abbott [Fri, 14 Jul 2023 15:48:12 +0000 (17:48 +0200)]
tu: Fix and simplify execution dependency handling
When I wrote this code, I was under the impression that at most one
context from each cluster could be executing at a time. This would mean
that we could treat clusters as pipeline stages and only insert a WFI if
there was a bubble where an earlier stage depends on the result of a
later stage.
This mental model was wrong, though. Experiments on a6xx show it's
possible for two contexts to be executing simultaneously, even though
there are only two contexts - register writing is just stalled until the
earliest-launched context finishes.
This means that the mental model is now much simpler. Any draw can, in
theory, execute in parallel with any previous draw, blit, flush, etc,
although it seems that flushes do wait for any earlier work to finish.
Clusters are mostly just an implementation detail that only matter in
some corner cases, like setting a non-context register (written in the
last cluster) that is used by an earlier cluster that can race ahead of
the write.
An example where this makes a difference is a fragment shader that
writes an image via stib followed by a blit from that same image.
Because both operations happen in the same cluster and use the same
cache, we wouldn't emit anything in the barrier, however actually we
still need to WFI.
This was getting worse on a7xx because later clusters now have 4
contexts, making it easier for draws to be executed in parallel. However
AFAICT it was already a problem on a6xx.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24162>
Connor Abbott [Fri, 14 Jul 2023 18:12:27 +0000 (20:12 +0200)]
tu: Fix vk2tu_*_stage flag type
New flags were silently getting dropped.
Fixes:
59259a01671 ("tu: Convert to sync2 entrypoints")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24162>
Eric Engestrom [Fri, 21 Jul 2023 12:56:20 +0000 (13:56 +0100)]
docs: update calendar for 23.1.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24281>
Eric Engestrom [Fri, 21 Jul 2023 12:55:57 +0000 (13:55 +0100)]
docs: add sha256sum for 23.1.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24281>
Eric Engestrom [Fri, 21 Jul 2023 12:42:31 +0000 (13:42 +0100)]
docs: add release notes for 23.1.4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24281>
David Rosca [Wed, 19 Jul 2023 10:10:21 +0000 (12:10 +0200)]
gallium/auxiliary/vl: Fix RGB->YCbCr full range matrix
Also rename it to bt_709_rev_full as there already
is bt_709 which is used for YCbCr->RGB.
Fixes:
8a21efce3a2 ("frontends/va: Add postproc support for converting to full range")
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24238>