Karmjit Mahil [Wed, 21 Sep 2022 15:41:10 +0000 (16:41 +0100)]
pvr: Add layer count support to pvr_clear_vdm_state().
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20055>
Karmjit Mahil [Wed, 21 Sep 2022 15:52:48 +0000 (16:52 +0100)]
pvr: Move clear VDM state into pvr_clear.h .
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20055>
Karmjit Mahil [Wed, 21 Sep 2022 13:40:23 +0000 (14:40 +0100)]
pvr: Add clear rta vert shader pds program.
The rta program will be used in following commits adding support
for vkCmdClearAttachments().
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20055>
Karmjit Mahil [Tue, 20 Sep 2022 13:23:57 +0000 (14:23 +0100)]
pvr: Add pvr_clear.{h,c} .
This moves some clear-related functionality into a new
pvr_clear.{h,c} for better organisation and to allow for
easier reuse.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20055>
Karmjit Mahil [Thu, 15 Sep 2022 15:09:23 +0000 (16:09 +0100)]
pvr: Add multi layer passthrough vert shader upload in device.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20055>
Karmjit Mahil [Thu, 15 Sep 2022 14:43:13 +0000 (15:43 +0100)]
pvr: Change "ID" to "id" in instance_ID_modifier.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20055>
Corentin Noël [Wed, 4 Jan 2023 10:34:46 +0000 (11:34 +0100)]
ci: Remove MESA_ARM_BUILD_TAG environment variable
Its value is already the same as MESA_IMAGE_TAG so no need to duplicate it.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20488>
Corentin Noël [Tue, 3 Jan 2023 10:51:24 +0000 (11:51 +0100)]
ci: Bump crosvm and virglrenderer versions
Update virglrenderer and crosvm to their latest versions.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20488>
Gert Wollny [Wed, 4 Jan 2023 13:31:48 +0000 (14:31 +0100)]
r600/sfn: make sure we return a non-negative number of registers
If a shader doesn't use any registers and only SSA values, we might
end up with zero minimum registers, and because an unsigned value is
returned, that goes wrong.
Fixes: 565816dfa15214abbeef9a9d94e44f30507ca4d7
r600/sfn: Set minimum required registers based on array allocation
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8008
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20516>
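The wrap-around behind the r600/sfn fix above can be illustrated with a minimal sketch. This is not the r600/sfn code: the function names and the "count minus one" computation are assumptions chosen only to show how a zero count can turn into a huge unsigned value, and how clamping before the conversion avoids it.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from Mesa): returns the index of the last
 * register needed. With num_registers == 0 the subtraction yields -1,
 * which wraps to UINT_MAX when converted to the unsigned return type. */
unsigned example_last_register_unclamped(int num_registers)
{
   assert(num_registers >= 0);
   return (unsigned)(num_registers - 1);
}

/* Same computation, but clamped while the value is still signed, so the
 * caller never sees a wrapped-around result. */
unsigned example_last_register_clamped(int num_registers)
{
   assert(num_registers >= 0);
   int last = num_registers - 1;
   return last < 0 ? 0u : (unsigned)last;
}
```

The design point is simply to do the range check in the signed domain; once the value has been converted to unsigned, a negative result can no longer be detected with a `< 0` test.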
Samuel Pitoiset [Fri, 16 Dec 2022 12:26:36 +0000 (13:26 +0100)]
radv: rework generating the PS epilog key
Generating a PS epilog key will also be used when compiling PS epilogs
on-demand. This introduces a new helper that generates it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20485>
Samuel Pitoiset [Fri, 16 Dec 2022 09:01:00 +0000 (10:01 +0100)]
radv: simplify removing unused color exports
If CB_TARGET_MASK (color write mask) is 0 for a given MRT, this implies
that the color format is 0 because the driver compacts MRTs.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20485>
Danylo Piliaiev [Tue, 3 Jan 2023 15:29:50 +0000 (16:29 +0100)]
docs/freedreno: Extract debug tooling docs and improve gpu dbg docs
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20492>
Rhys Perry [Mon, 2 Jan 2023 19:17:16 +0000 (19:17 +0000)]
radv/winsys: set has_3d_cube_border_color_mipmap for null winsys
Without this, NIR->LLVM will set level_zero to false, crashing compilation
of some GFX11 shaders with LLVM (image_gather4_c_o is not supported, while
image_gather4_c_lz_o is).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20483>
Danylo Piliaiev [Tue, 3 Jan 2023 12:34:24 +0000 (13:34 +0100)]
docs/freedreno: Extract LRZ docs from tu_lrz
Most of the docs describe HW and are not specific to Turnip.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20491>
Samuel Pitoiset [Thu, 24 Nov 2022 14:07:32 +0000 (15:07 +0100)]
radv: fix multiple resolves in the same subpass
If there are multiple resolves, the driver shouldn't always select the
fragment path because it doesn't work for all images.
Fixes dEQP-VK.pipeline.monolithic.multisample.misc.*
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19986>
Ian Romanick [Tue, 13 Dec 2022 18:49:41 +0000 (10:49 -0800)]
glsl: Remove bit_count lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets the NIR lower_bit_count
flag.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
Ian Romanick [Tue, 13 Dec 2022 18:42:49 +0000 (10:42 -0800)]
glsl: Remove bitfield_reverse lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets the NIR
lower_bitfield_reverse flag.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
Ian Romanick [Tue, 13 Dec 2022 18:36:18 +0000 (10:36 -0800)]
glsl: Remove bitfield_extract and bitfield_insert lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets some subset of the various
NIR lower_bitfield_extract and lower_bitfield_insert flags.
v2: Declaration of 'result' still needs to be added to the IR. Noticed
by marge.
v3: Fix 'git rebase --autosquash' putting the v2 fix in the wrong
place. I've never seen that happen before. :(
Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
Ian Romanick [Tue, 13 Dec 2022 19:11:13 +0000 (11:11 -0800)]
nir: Don't allow conflicting bitfield lowering passes
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
Ian Romanick [Tue, 13 Dec 2022 17:43:39 +0000 (09:43 -0800)]
intel/compiler: Enable lower_bitfield_extract_to_shifts and lower_bitfield_insert_to_shifts for pre-Gfx7
GLSL IR opcodes generated for bitfieldExtract and bitfieldInsert are
lowered by lower_instructions.
4dff3ff005b ("nir/opt_algebraic: Optimize open coded bfm.") adds an
optimization that can rematerialize nir_op_bfm that was prevented by
the GLSL IR lowering.
It appears that every piece of hardware, except older Intel GPUs, that
has real integers (i.e., lower_bitops is not set) also sets
lower_bitfield_extract_to_shifts and lower_bitfield_insert_to_shifts.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4dff3ff005b ("nir/opt_algebraic: Optimize open coded bfm.")
Closes: #7874
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
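For readers unfamiliar with what lowering bitfield extract and insert "to shifts" means, here is a minimal sketch in plain C. It is not the NIR lowering code; the function names are hypothetical, values are assumed to be 32 bits wide, and callers are assumed to keep offset + bits within 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned bitfield extract using only shifts.
 * Shift left to discard the bits above the field, then shift right to
 * discard the bits below it. */
uint32_t example_ubfe_via_shifts(uint32_t value, unsigned offset, unsigned bits)
{
   assert(offset + bits <= 32);
   if (bits == 0)
      return 0;
   return (value << (32 - bits - offset)) >> (32 - bits);
}

/* Hypothetical helper: bitfield insert using a shifted mask. */
uint32_t example_bfi_via_shifts(uint32_t base, uint32_t insert,
                                unsigned offset, unsigned bits)
{
   assert(offset + bits <= 32);
   if (bits == 0)
      return base;
   /* Build a mask of 'bits' ones and move it to the field position. */
   uint32_t ones = bits == 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
   uint32_t mask = ones << offset;
   return (base & ~mask) | ((insert << offset) & mask);
}
```

The special cases for bits == 0 and bits == 32 are there because shifting a 32-bit value by 32 is undefined in C; a real lowering has to handle the same edge cases in whatever way its target requires.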
Jason Ekstrand [Wed, 21 Dec 2022 22:59:02 +0000 (16:59 -0600)]
util: Drop the ENUM_PACKED macro
We have both PACKED and ENUM_PACKED macros which expand to the same
thing. PACKED was based on a meson check for function attributes while
ENUM_PACKED appears to be a legacy gallium thing which was based on
defined(__GCC__). This changes the one use of ENUM_PACKED to PACKED and
deletes ENUM_PACKED.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20412>
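As a rough illustration of what a PACKED-style macro does to an enum, consider the sketch below. `EXAMPLE_PACKED` and the enum are hypothetical stand-ins, not Mesa's definitions, and the attribute is assumed to be available only on GCC-compatible compilers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PACKED-style macro. On GCC and Clang the
 * packed attribute asks for the smallest integer type that can hold all
 * enumerators; elsewhere the macro expands to nothing. */
#if defined(__GNUC__)
#define EXAMPLE_PACKED __attribute__((__packed__))
#else
#define EXAMPLE_PACKED
#endif

enum EXAMPLE_PACKED example_small_enum {
   EXAMPLE_A,
   EXAMPLE_B,
   EXAMPLE_C,
};

/* Reports the storage size the compiler chose for the enum. */
size_t example_small_enum_size(void)
{
   return sizeof(enum example_small_enum);
}
```

On GCC and Clang the enum above is expected to occupy a single byte, whereas without the attribute it would normally be the size of an int.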
Emma Anholt [Tue, 29 Nov 2022 22:35:19 +0000 (14:35 -0800)]
ci: Update the skqp testing docs and retire the old runner script.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20070>
Emma Anholt [Tue, 29 Nov 2022 22:26:47 +0000 (14:26 -0800)]
ci/intel: Switch skqp testing over to deqp-runner.
The skqp runner gets us parallel execution, automatic caselist handling,
nice reports, and the same xfail/flake handling you know and love from
deqp and piglit.
And, now that we have flake handling, we can turn the tests back on!
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20070>
Emma Anholt [Tue, 29 Nov 2022 22:23:56 +0000 (14:23 -0800)]
ci/amd: Switch raven skqp testing over to deqp-runner.
The skqp runner gets us parallel execution, automatic caselist handling,
nice reports, and the same xfail/flake handling you know and love from
deqp and piglit.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20070>
Emma Anholt [Tue, 29 Nov 2022 22:19:34 +0000 (14:19 -0800)]
ci/freedreno: Switch skqp testing to using deqp-runner.
The skqp runner gets us parallel execution, automatic caselist handling
(which would have prevented a recent regression due to some skqp tests
having been forgotten in the checked in caselists), nice reports, and the
same xfail/flake handling you know and love from deqp and piglit.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20070>
Adam Jackson [Mon, 12 Dec 2022 15:59:44 +0000 (10:59 -0500)]
glx: Remove the GetProcAddress special case for indirect rendering
Some GL entrypoints would be aliased in an API sense but have different
GLX protocol. The only one that matters to us is EXT_texture_object,
which is the pre-GL-1.1 API. We're just going to drop support for that
and assume you have 1.1 or better, since 1.0 + EXT_texture_object is a
vanishingly rare combo at this point.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Thu, 8 Dec 2022 16:44:08 +0000 (11:44 -0500)]
glx: Only compute client GL extensions for indirect contexts
This is sort of a spiky way to do it, but the effect is to send the
appropriate SetClientInfo twice for indirect screens, where the second
one fills in the GL extensions. We can get away with this because the
only place the string is used is when the server computes the reply for
glGetString(GL_EXTENSIONS), which never matters for direct contexts.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Mon, 5 Dec 2022 19:59:22 +0000 (14:59 -0500)]
glx: Require GLX 1.3
GLX is a means to the end of direct rendered GL, really. Our indirect
protocol support has been largely untouched forever, anyone who wants it
can find it in amber. We're not going to drop or intentionally break it
(indirect support), but we're also not going to try super hard to
preserve its quirks anymore.
xserver has typically supported GLX 1.4 since 2009 (xserver 1.8,
ad5c0d9e) and unconditionally since 2016 (xserver 1.19, 36bcbf76).
Assuming GLX 1.3 internally will let us fix some GLX drawable lifetime
issues.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Tue, 13 Dec 2022 17:26:58 +0000 (12:26 -0500)]
glx: Remove pointless GLX_INTEL_swap_event paranoia
It's not our job to filter this out, it's the server's job to not send
events that haven't been selected for. We'll still throw the event away
if we don't have any client-side state for it though.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Mon, 12 Dec 2022 18:50:15 +0000 (13:50 -0500)]
glx: Drop GLX_MESA_{pixmap_colormap,release_buffers} stubs
Whatever compatibility purpose these served has long since passed.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Wed, 14 Dec 2022 18:27:16 +0000 (13:27 -0500)]
glx: Replace FreeB 2.0 text with SPDX-License-Identifier: SGI-B-2.0
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Mon, 12 Dec 2022 15:40:30 +0000 (10:40 -0500)]
glx: Remove dead declarations from <GL/glx.h>
MESA_swap_control is defined in glxext.h now. MESA_swap_frame_usage was
removed in Mesa 7.9 in 2010. The other two were never specified or
implemented.
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Adam Jackson [Mon, 12 Dec 2022 15:35:28 +0000 (10:35 -0500)]
include: Sync <GL/glxext.h> with Khronos
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20300>
Pavel Ondračka [Mon, 2 Jan 2023 16:23:44 +0000 (17:23 +0100)]
r300: don't convert to constant swizzles when translating from TGSI
We currently convert it twice for unknown reasons, first when
translating from TGSI and later in constant folding. Not only is this
unnecessary, the first translation doesn't check for non-native
swizzles, so removing it actually saves a few instructions and gains
a single Unigine shader for R300 at the expense of a few more constant
loads and temps.
Also fixes a few dEQP tests because we could previously generate code like
TEX temp[1], none.01__, 2D[0];
and the native swizzle rewrite pass was not ready for it.
RV370 shader-db:
total instructions in shared programs: 84441 -> 84436 (<.01%)
instructions in affected programs: 63 -> 58 (-7.94%)
helped: 4
HURT: 0
total temps in shared programs: 12398 -> 12400 (0.02%)
temps in affected programs: 10 -> 12 (20.00%)
helped: 1
HURT: 3
total consts in shared programs: 79081 -> 79090 (0.01%)
consts in affected programs: 12 -> 21 (75.00%)
helped: 0
HURT: 7
GAINED: shaders/tropics/465.shader_test FS
No shader-db change with RV530.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20484>
Pavel Ondračka [Mon, 2 Jan 2023 16:17:17 +0000 (17:17 +0100)]
r300: allow copy propagate of RC_FILE_NONE reads to TEX instructions
Texturing instructions can't read from constant sources, however this
can work when the constant was transformed to constant swizzles and
hence RC_FILE_NONE.
Prevents a regression in a single Unigine Tropics shader that uses
constant (0.5,0.5) as a TEX coordinate in a next patch. We now
convert to constant swizzles twice, first when translating from TGSI
and then in constant folding. If we disable the first conversion
rc_transform_tex will emit a mov from constant to temporary. With this
patch, copy propagate will clean it up later.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20484>
Pavel Ondračka [Mon, 2 Jan 2023 20:02:05 +0000 (21:02 +0100)]
r300: don't copy propagate constant swizzles to KIL on R300
Transforming
0: MOV temp[1], -none.1111;
1: KIL temp[1];
to
0: KIL -none.1111;
doesn't work on R300, while it works just fine on R500.
Prevents a regression when we enable the copy propagate of RC_FILE_NONE
to texture instructions in the next commit.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20484>
Georg Lehmann [Sat, 17 Dec 2022 10:55:29 +0000 (11:55 +0100)]
aco: Use v_mov_b16 on GFX11.
Foz-DB GFX1100:
Totals from 4684 (3.47% of 134913) affected shaders:
CodeSize: 41086444 -> 41043476 (-0.10%)
Instrs: 8176019 -> 8175995 (-0.00%)
Latency: 83792071 -> 83792023 (-0.00%)
InvThroughput: 10311371 -> 10311369 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20369>
Jesse Natalie [Thu, 29 Dec 2022 18:32:22 +0000 (10:32 -0800)]
CI/Windows: Use waffle instead of freeglut for piglit
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20466>
Jesse Natalie [Thu, 29 Dec 2022 20:59:03 +0000 (12:59 -0800)]
CI/Windows: Update piglit for Waffle fix
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20466>
Rob Clark [Fri, 30 Dec 2022 21:01:40 +0000 (13:01 -0800)]
docs/freedreno: Add bindless/bindful descriptor docs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20456>
Rob Clark [Thu, 29 Dec 2022 21:29:04 +0000 (13:29 -0800)]
freedreno/registers: Cleanup bindless-base regs
Make it clear that the low two bits of the 64b address are their own
bitfield.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20456>
Rob Clark [Sun, 1 Jan 2023 20:40:41 +0000 (12:40 -0800)]
freedreno/registers: Fix bo fields with low != 0
We need to add the missing left-shift. And a right-shift is negative!
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20456>
Rob Clark [Thu, 29 Dec 2022 19:39:05 +0000 (11:39 -0800)]
freedreno/decode: Improved reg64 decoding
This also (other than for an a5xx hack) gets rid of relying on
type0_reg_vals which isn't updated in all paths.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20456>
Rob Clark [Thu, 29 Dec 2022 19:07:17 +0000 (11:07 -0800)]
freedreno/decode: Add rnn_reginfo_free() helper
Simplify things a bit, and fix a few places that just leaked the
rnndecaddrinfo.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20456>
Rob Clark [Thu, 29 Dec 2022 17:42:13 +0000 (09:42 -0800)]
freedreno/registers: Fix reg64 support
The maximum "high" position depends on 32b vs 64b registers.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20456>
Sil Vilerino [Tue, 3 Jan 2023 17:11:49 +0000 (12:11 -0500)]
ci: Update mingw and vs2019 libva build dependency to libva/releases/tag/2.17.0
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20498>
Konstantin Seurer [Mon, 2 Jan 2023 16:57:17 +0000 (17:57 +0100)]
radv: Use the correct pipeline layout for LBVH IR generation
Fixes: 5ba950e ("radv: Switch to new LBVH implementation.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20481>
Adam Stylinski [Sat, 31 Dec 2022 21:14:38 +0000 (16:14 -0500)]
nv30: Fix an offset for vbos being applied to a buffer twice
Similar to 1387d1d4, this offset was being applied twice (once in
translate_generic, and once when the buffer is mapped).
This fixes 7972, which was initially thought to be an endianness
specific issue.
CC: mesa-stable
Tested-by: Filip Gawin <filip@gawin.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20468>
Yiwei Zhang [Tue, 3 Jan 2023 08:22:23 +0000 (00:22 -0800)]
ci: update venus-lavapipe test expectations
Remove fixed push descriptor tests from expected failures.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20471>
Yiwei Zhang [Sun, 1 Jan 2023 01:21:53 +0000 (17:21 -0800)]
venus: properly ignore the sampler for immutable sampler
This was found while debugging a venus-lavapipe CI failure. It's a real
bug, though no tests have caught it yet. Fixing it would regress the
venus-lavapipe non-templated push tests without the dependent lvp
fix. The sampler in the descriptor write can be garbled if the binding
has immutable samplers.
cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20471>
Yiwei Zhang [Sun, 1 Jan 2023 01:07:19 +0000 (17:07 -0800)]
lvp: properly ignore sampler write for immutable sampler
The issue is hidden due to an overly relaxed CTS:
dEQP-VK.binding_model.shader_access.primary_cmd_buf.with_push*
that doesn't scrub the sampler from descriptor writes for immutable
samplers. The issue is exposed via venus-lavapipe ci because venus must
ignore the potentially garbled sampler. This change aligns the
VK_DESCRIPTOR_TYPE_SAMPLER path with the
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER path by removing a false check
against the provided sampler from push since the sampler can be null. An
alternative is to also check against !binding->immutable_samplers there.
Test: venus-lavapipe with venus push descriptor support
cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20471>
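The rule that both the venus and lvp commits above rely on can be sketched with simplified stand-in types. Nothing below is the venus or lavapipe code; the struct and function names are hypothetical, and the real Vulkan structures carry far more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a descriptor set layout binding.
 * immutable_samplers is NULL when the binding has none. */
struct example_binding {
   const void *const *immutable_samplers;
};

/* Hypothetical, simplified stand-in for the image info of a write. The
 * sampler member may hold garbage when the binding uses immutable
 * samplers, so it must not be inspected in that case. */
struct example_image_write {
   const void *sampler;
};

/* Returns the sampler that should actually be used for array element
 * 'index': the immutable one if the binding has any, otherwise the one
 * supplied in the write. */
const void *example_effective_sampler(const struct example_binding *binding,
                                      const struct example_image_write *write,
                                      size_t index)
{
   assert(binding != NULL && write != NULL);
   if (binding->immutable_samplers != NULL)
      return binding->immutable_samplers[index];
   return write->sampler;
}
```

The key property is that the written sampler is never read, let alone validated, on the immutable path, which is why a null or garbled value there is harmless.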
Jesse Natalie [Tue, 27 Dec 2022 18:47:50 +0000 (10:47 -0800)]
spirv2dxil: Support linking multiple shaders
This probably could/should be split up into multiple commits, but
it's simpler to make this a monolithic change.
This change inlines a bunch of logic from spirv_to_dxil into the
spirv2dxil tool so that linking can be done on the nir shaders.
Probably the linking functionality should be exposed in the lib/dll
form too, which means that a helper for freeing intermediate nir
would be needed too. That's TODO for now.
The tool now requires arguments to be in-order, and once a filename
is encountered, will use the previous arguments to compile the shader.
If multiple graphics shaders are passed, they're linked as if they
were forming a pipeline together.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20440>
Jesse Natalie [Tue, 27 Dec 2022 18:47:16 +0000 (10:47 -0800)]
spirv2dxil: Rename and move prep helper
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20440>
Jesse Natalie [Wed, 28 Dec 2022 23:47:54 +0000 (15:47 -0800)]
CI/Windows: Use deqp-runner for D3D12 piglit
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20454>
Chad Versace [Tue, 8 Nov 2022 17:43:50 +0000 (09:43 -0800)]
vulkan/runtime: Preserve pNext when upgrading to synchronization2 structs
The functions that upgraded VkFooMemoryBarrier to VkFooMemoryBarrier2
dropped the pNext pointers. It loses VkSampleLocationsInfoEXT, and may
lose additional structs too if VkFooMemoryBarrier receives further
extensions in the future.
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20477>
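A minimal sketch of the pNext fix described above, using simplified stand-in structs rather than the real Vulkan types. The struct layouts, the function name, and the way stage masks are passed in are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VkFooMemoryBarrier-style struct. */
struct example_barrier {
   const void *pNext;
   uint32_t srcAccessMask;
   uint32_t dstAccessMask;
};

/* Hypothetical stand-in for a VkFooMemoryBarrier2-style struct. */
struct example_barrier2 {
   const void *pNext;
   uint64_t srcStageMask;
   uint64_t srcAccessMask;
   uint64_t dstStageMask;
   uint64_t dstAccessMask;
};

/* Upgrades the old-style barrier to the new-style one. The extension
 * chain is carried over so chained structs are not silently dropped. */
struct example_barrier2
example_upgrade_barrier(const struct example_barrier *in,
                        uint64_t src_stages, uint64_t dst_stages)
{
   assert(in != NULL);
   struct example_barrier2 out = {
      .pNext = in->pNext, /* the field the buggy version left out */
      .srcStageMask = src_stages,
      .srcAccessMask = in->srcAccessMask,
      .dstStageMask = dst_stages,
      .dstAccessMask = in->dstAccessMask,
   };
   return out;
}
```

Copying the pointer rather than the chained structs themselves is enough here because the upgraded struct only needs to live as long as the original call.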
Samuel Pitoiset [Mon, 12 Dec 2022 17:46:50 +0000 (18:46 +0100)]
radv: determine the gfx scratch size at pipeline bind time
This doesn't need to be in the draw path.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20299>
Samuel Pitoiset [Fri, 16 Dec 2022 13:25:17 +0000 (14:25 +0100)]
radv: dirty all dynamic states when beginning a new cmdbuf
Sounds safer to not rely on other cmdbuf states.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20299>
Samuel Pitoiset [Mon, 12 Dec 2022 17:43:31 +0000 (18:43 +0100)]
radv: dirty states when beginning a cmdbuf instead of when a pipeline is bound
To reduce CPU overhead of radv_emit_graphics_pipeline().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20299>
Samuel Pitoiset [Mon, 12 Dec 2022 15:56:42 +0000 (16:56 +0100)]
radv: move emitting the strmout buffer in CmdDrawIndirectByteCountEXT()
This doesn't need to be in the generic draw path because only one
draw command uses it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20299>
Samuel Pitoiset [Mon, 12 Dec 2022 15:37:21 +0000 (16:37 +0100)]
radv: flush DFSM on CB_TARGET_MASK changes when it's emitted
To avoid performing the same check twice and to emit it at the right
place.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20299>
Sil Vilerino [Tue, 3 Jan 2023 15:16:29 +0000 (10:16 -0500)]
frontends/va: Update state var frame_num disregarding cap check
The frame_num variable must be updated for the encode entrypoint regardless
of the outcome of the PIPE_VIDEO_CAP_REQUIRES_FLUSH_ON_END_FRAME cap check.
Fixes: 229c6f79a660e5c7999ffc94e1fb514692df3b6a ("frontends/va: Implement vaSyncBuffer")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20490>
Rhys Perry [Mon, 2 Jan 2023 18:05:14 +0000 (18:05 +0000)]
radeonsi,radv/llvm: fix amdgpu-color/depth-export with epilogs
The main shader wouldn't use ac_build_export(), and the discard exit would
have no export.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 1174ab6d56e ("ac/llvm: use amdgpu-color-export/amdgpu-depth-export")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7991
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20482>
David Heidelberg [Mon, 2 Jan 2023 23:25:52 +0000 (00:25 +0100)]
postprocess: move the definition of pp_filters into **/pp_init.c
An LTO-friendly move.
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7881
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20374>
David Heidelberg [Sat, 17 Dec 2022 23:30:49 +0000 (00:30 +0100)]
ci: build test LTO
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20374>
Mike Blumenkrantz [Wed, 26 Oct 2022 16:08:08 +0000 (12:08 -0400)]
zink: use EXT_descriptor_buffer with ZINK_DESCRIPTORS=db
this should be bug-free, as it passes cts/piglit/gaming on multiple drivers,
but since it's new, it stays behind an env var for at least one release
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20489>
Mike Blumenkrantz [Mon, 24 Oct 2022 20:40:32 +0000 (16:40 -0400)]
zink: move some descriptor data into a substruct
no functional changes
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20489>
Mike Blumenkrantz [Mon, 2 Jan 2023 14:31:20 +0000 (09:31 -0500)]
Revert "zink: remove descriptor-mode selection infrastructure"
this would've been in-use, but khronos changes while I was on vacation
blocked a merge
This reverts commit 3f371d4e940509c73fa19c4e50ae319e75636eb0.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20489>
Tapani Pälli [Thu, 29 Dec 2022 07:52:36 +0000 (09:52 +0200)]
anv: implement Wa_14015814527 for task shaders
After using task shader, we need to emit a zero URB state and a
nullprim (empty pipe control) before rendering with primitives.
After this, a normal URB state needs to be restored; this will
happen when the pipeline batch is emitted during a pipeline switch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20334>
Pavel Ondračka [Sat, 31 Dec 2022 13:45:22 +0000 (14:45 +0100)]
nir: basic tests for nir_opt_shrink_vectors
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
Pavel Ondračka [Sat, 31 Dec 2022 11:04:28 +0000 (12:04 +0100)]
nir: fix shrinking of load_const for large vectors
Specifically when shrinking load_const with number of components
> 5, if the final number of components is not allowed (for example 8->6)
it would report false for progress even if we actually did some
reshuffling and also it would skip on the rewrite of the readers.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
Pavel Ondračka [Sun, 4 Dec 2022 16:34:37 +0000 (17:34 +0100)]
nir: remove duplicate alu channels in nir_opt_shrink_vectors
This will clean code like:
vec3 32 ssa_8 = frcp ssa_7.www
vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8
into
vec1 32 ssa_8 = frcp ssa_7.w
vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8.xxx
This helps r300 driver because we can only do single channel for math
ops at a time, so the first version would result in three frcp
instructions. The nir_opt_shrink_vectors comments even claim the pass
should be doing this, however it actually does it only for nir_op_vecx
instructions, so extend this for generic alu instructions.
RV530 shader-db:
total instructions in shared programs: 135032 -> 133707 (-0.98%)
instructions in affected programs: 46121 -> 44796 (-2.87%)
helped: 452
HURT: 26
total temps in shared programs: 17051 -> 17033 (-0.11%)
temps in affected programs: 1509 -> 1491 (-1.19%)
helped: 91
HURT: 30
12.02->12.08 (+0.5%) fps gain in Unigine Sanctuary (n=5) with RV530
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7051
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
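The channel deduplication in the frcp example above can be sketched as a small swizzle-remapping routine. This is not the nir_opt_shrink_vectors implementation; the function name is hypothetical, and the sketch assumes a single-source, per-component operation whose source swizzle is given as an array of channel indices (x = 0, y = 1, z = 2, w = 3).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: collects the distinct source channels of
 * 'swizzle' into unique[] in first-seen order, and stores in
 * reswizzle[i] which entry of unique[] output channel i corresponds to.
 * Returns the number of distinct channels, i.e. the shrunken width.
 * Both output arrays must have room for num_channels entries. */
unsigned example_dedup_swizzle(const uint8_t *swizzle, unsigned num_channels,
                               uint8_t *unique, uint8_t *reswizzle)
{
   assert(swizzle != 0 && unique != 0 && reswizzle != 0);
   unsigned num_unique = 0;
   for (unsigned i = 0; i < num_channels; i++) {
      /* Look for an earlier channel that already reads this component. */
      unsigned j;
      for (j = 0; j < num_unique; j++) {
         if (unique[j] == swizzle[i])
            break;
      }
      if (j == num_unique)
         unique[num_unique++] = swizzle[i];
      reswizzle[i] = (uint8_t)j;
   }
   return num_unique;
}
```

For the `.www` source in the commit message, this yields one unique channel (w) and a reswizzle of 0, 0, 0, which corresponds to shrinking the frcp to a vec1 and having its user read `.xxx`.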
Alyssa Rosenzweig [Thu, 29 Dec 2022 19:26:26 +0000 (14:26 -0500)]
pan/bi: Move Bifrost specific C code to src/compiler/bifrost
The goal is to make files at the root of src/compiler/ apply to both Bifrost and
Valhall, while ISA-specific code (e.g. instruction packing) goes in
compiler/bifrost/ or compiler/valhall/. This is what Valhall is already doing,
the Bifrost specific stuff was just grandfathered in.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20455>
Alyssa Rosenzweig [Thu, 29 Dec 2022 19:32:12 +0000 (14:32 -0500)]
pan/bi: Remove standalone compiler
This functionality is now available on Linux with drm-shim + shader-db, and I
suspect the version bundled here is broken anyway. Strictly this drops
Windows/macOS support for the known-broken frontend to the shader compiler but I
can't say I'm terribly worried about that.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20455>
Alyssa Rosenzweig [Thu, 29 Dec 2022 19:37:06 +0000 (14:37 -0500)]
pan/bi: Rename panfrost/bifrost -> panfrost/compiler
This is the compiler for both Bifrost and Valhall, and presumably future
Mali GPUs too. Give it a more generic name so we can use the bifrost/ path for
something a bit more specific.
For historical reasons the compiler's name is still "bifrost" and uses the
prefix `bi_`. I think that's ok in the same way that i915 in the kernel supports
way more than just i915.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20455>
Sviatoslav Peleshko [Thu, 3 Nov 2022 14:01:58 +0000 (16:01 +0200)]
hasvk: Add layer with work-around for Doom 64 texture corruption
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7817
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19502>
Sviatoslav Peleshko [Wed, 2 Nov 2022 23:41:53 +0000 (01:41 +0200)]
anv: Add layer with work-around for Doom 64 texture corruption
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7817
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19502>
Konstantin Seurer [Tue, 27 Dec 2022 15:57:36 +0000 (16:57 +0100)]
radv: Add an app layer driconf and use it for Metro Exodus
To make adding more application layers easier.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20439>
Konstantin Seurer [Tue, 27 Dec 2022 15:32:12 +0000 (16:32 +0100)]
radv: Clean up entrypoints generation
This should make it easier to add new tracing and application layers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20439>
Konstantin Seurer [Tue, 27 Dec 2022 15:25:12 +0000 (16:25 +0100)]
radv: Use multiple dispatch tables for layers
Every layer has its own dispatch table that it can use to call down the
layer stack. This allows us to use RRA and RGP tracing simultaneously.
Using application layers with tracing should work as well.
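
A minimal sketch of the per-layer dispatch table idea, in Python rather than
the driver's C. All names here (build_stack, queue_submit, the rgp and rra
functions) are hypothetical stand-ins and not radv's actual API; the point is
only that each layer keeps a reference to the table below it and calls down
through that table.

```python
def build_stack(driver_entrypoints, layers):
    """Stack layers on top of the driver, bottom layer first.

    Each layer is a function that receives the dispatch table it should use
    to call down, and returns the entrypoints it overrides.
    Returns the top-level dispatch table.
    """
    table = dict(driver_entrypoints)
    for make_overrides in layers:
        down = table  # this layer's private "call down" table
        table = {**down, **make_overrides(down)}
    return table


log = []


def driver_queue_submit():
    log.append("driver")


def rgp(down):
    def queue_submit():
        log.append("rgp")
        down["queue_submit"]()  # continue down the stack
    return {"queue_submit": queue_submit}


def rra(down):
    def queue_submit():
        log.append("rra")
        down["queue_submit"]()
    return {"queue_submit": queue_submit}


top = build_stack({"queue_submit": driver_queue_submit}, [rgp, rra])
top["queue_submit"]()
```

Because every layer holds its own table, two tracing layers can be active at
once without overwriting each other's entrypoints.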
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20439>
Konstantin Seurer [Tue, 27 Dec 2022 13:37:01 +0000 (14:37 +0100)]
radv: Move dispatch table init into a separate function
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20439>
Konstantin Seurer [Tue, 27 Dec 2022 13:35:23 +0000 (14:35 +0100)]
vulkan: Allow passing NULL dispatch tables to vk_device_init
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20439>
Gert Wollny [Thu, 29 Dec 2022 11:14:23 +0000 (12:14 +0100)]
r600: Don't merge alu groups with variable length dot using t-slot
Since the variable length dot must stay in its slot configuration
do not try to merge the group with the previous group when an op may be
moved to the t slot, because this may lead to breaking the multi-slot
operation.
Fixes: 357e5fac9953b26eedc8819ab528b981be6e1b69 ("r600/sfn: Use variable length DOT on Evergreen and Cayman")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Thu, 29 Dec 2022 10:56:38 +0000 (11:56 +0100)]
r600/sfn: Set minimum required registers based on array allocation
In the rare case that after register allocation the highest directly
accessed register index is below the highest value used for an
indirectly accessed array we have to ensure that the shader allocates
enough registers to account for these indices that are not seen by the
assembler.
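
The rule can be illustrated with a small Python sketch. This is an assumption
about the shape of the computation, not the r600/sfn code: the function name
and the (base, length) representation of arrays are hypothetical.

```python
def required_register_count(direct_indices, arrays):
    """Number of registers the shader must allocate.

    direct_indices: register indices the assembler sees accessed directly.
    arrays: (base, length) register ranges that are only accessed indirectly,
            so their upper indices never show up as direct accesses.
    """
    highest = max(direct_indices, default=-1)
    for base, length in arrays:
        highest = max(highest, base + length - 1)
    return highest + 1
```

For example, with direct accesses to registers 0 to 2 and a four-register
array starting at register 4, the shader needs 8 registers rather than 3.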
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7966
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Wed, 28 Dec 2022 16:59:07 +0000 (17:59 +0100)]
r600: enable ARB_gl_spirv
76 out of 86 piglits pass.
Some fail because SSBOs are only supported for FS and CS on r600, but
the piglits try to use SSBOs with VS, and there are piglits that try to
bind SSBO at 8, and only 0-7 is supported as binding point.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Wed, 28 Dec 2022 16:56:41 +0000 (17:56 +0100)]
r600: Fix early exit when setting SSBOs
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Wed, 28 Dec 2022 16:34:45 +0000 (17:34 +0100)]
r600/sfn: Fix FS primid input slot
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Thu, 22 Dec 2022 12:54:36 +0000 (13:54 +0100)]
r600/sfn: Fix warning for mixed use of enum and integer
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Thu, 22 Dec 2022 12:21:03 +0000 (13:21 +0100)]
r600/sfn: pre-evaluate allowed dest mask in Alu instructions
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Gert Wollny [Thu, 22 Dec 2022 12:11:19 +0000 (13:11 +0100)]
r600/sfn: move handling of legacy math rules to assembler
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20451>
Erico Nunes [Tue, 27 Dec 2022 11:32:07 +0000 (12:32 +0100)]
st/mesa: Fix free of non-shareable shaders on context destroy
On drivers that do not set PIPE_CAP_SHAREABLE_SHADERS,
st_destroy_program_variants() may reach st_save_zombie_shader()
which accesses st->zombie_shaders.mutex.
Destroying st->zombie_shaders.mutex before destroying program variants
may result in an invalid access in a multiple context scenario for
those drivers.
Move the mutex destroy call to after program variants destroy so that
it doesn't hit a deadlock on context destroy.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20458>
Rhys Perry [Fri, 16 Dec 2022 12:24:40 +0000 (12:24 +0000)]
ac/llvm: use amdgpu-color-export/amdgpu-depth-export
These are necessary to use the correct export target on GFX11:
https://reviews.llvm.org/D128185
Fixes artifacts on Lara in Rise of the Tomb Raider benchmark and hair in
The Witcher 3 (classic).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20357>
Timur Kristóf [Wed, 21 Dec 2022 16:32:57 +0000 (17:32 +0100)]
radv: Decouple radv_before_taskmesh_draw from radv_before_draw.
radv_before_taskmesh_draw will no longer call radv_before_draw and
instead implement the necessary functionality on its own.
radv_before_draw will no longer have to emit mesh shader descriptors.
As a result, both functions should have a lower CPU overhead now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18829>
Samuel Pitoiset [Fri, 16 Dec 2022 07:12:50 +0000 (08:12 +0100)]
radv: fix missing initialization of radv_resolve_barrier::dst_stage_mask
Otherwise, this value is uninitialized when read in
radv_ace_internal_barrier().
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7909
Fixes: 4c6f83006d4 ("radv: Synchronization for task shaders.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20351>
Marek Olšák [Sun, 18 Dec 2022 22:02:37 +0000 (17:02 -0500)]
iris: implement PIPE_CAP_MAP_UNSYNCHRONIZED_THREAD_SAFE
required by glthread
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20379>
Marek Olšák [Sat, 24 Dec 2022 16:04:08 +0000 (11:04 -0500)]
glthread,gallium: add a CAP to disable glBufferSubData optimization in glthread
it regresses performance on iris
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20379>
Daniel Schürmann [Wed, 2 Nov 2022 12:35:57 +0000 (13:35 +0100)]
aco: Reassign dead definitions of p_split_vector to associated register
Any unused split_vector definition can always use the same register
as the operand. This avoids creating unnecessary copies.
Fossil DB stats on Rembrandt (RDNA2):
Totals from 3904 (2.89% of 134906) affected shaders:
CodeSize: 18326692 -> 18271688 (-0.30%)
Instrs: 3386632 -> 3372888 (-0.41%)
Latency: 42337481 -> 42330085 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 6566731 -> 6566424 (-0.00%); split: -0.01%, +0.00%
Copies: 224301 -> 210559 (-6.13%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
Timur Kristóf [Tue, 26 Apr 2022 09:09:47 +0000 (11:09 +0200)]
aco: Try to reassign split vector registers post-RA.
Eliminate unnecessary copies when the operand registers of a
p_split_vector instruction are not clobbered between the p_split_vector
and the user of its definitions.
This happens when p_split_vector doesn't kill its operand and its
definitions have a shorter lifespan than the operand. It affects every
NGG culling shader among other things.
This optimization exists because it's too difficult to solve it
in RA, and should be removed once we have solved this in RA.
v2 by Daniel Schürmann:
- Rearrange and simplify conditions for the new optimization
- Fix a few bugs
v3 by Daniel Schürmann:
- Check number of encoded ALU operands
Fossil DB stats on Rembrandt (RDNA2):
Totals from 64896 (48.10% of 134906) affected shaders:
CodeSize: 175693348 -> 175434944 (-0.15%)
Instrs: 33333912 -> 33269388 (-0.19%)
Latency: 183766084 -> 183763432 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 28589651 -> 28589340 (-0.00%); split: -0.00%, +0.00%
Copies: 2806550 -> 2742038 (-2.30%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
Timur Kristóf [Thu, 3 Nov 2022 17:30:28 +0000 (18:30 +0100)]
aco/optimizer_postRA: Distinguish overwritten untrackable and subdword.
This allows is_overwritten_since to return false when the last
writer instruction of a register can't be tracked but we know it wasn't
written in the current block.
Fossil DB stats on Rembrandt (RDNA2):
Totals from 1163 (0.86% of 134906) affected shaders:
CodeSize: 9815920 -> 9805016 (-0.11%)
Instrs: 1843688 -> 1840962 (-0.15%)
Latency: 19219153 -> 19209171 (-0.05%)
InvThroughput: 3354375 -> 3353852 (-0.02%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
Daniel Schürmann [Mon, 14 Nov 2022 17:04:23 +0000 (18:04 +0100)]
aco/optimizer_postRA: Initialize loop header with preheader information
This works because of SSA and should be safer than just setting 'not_written_yet'.
No Fossil DB changes on Rembrandt (RDNA2).
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>
Daniel Schürmann [Mon, 14 Nov 2022 15:17:25 +0000 (16:17 +0100)]
aco: fix reset_block_regs() in postRA-optimizer
Accidentally, we picked the index of the predecessors instead of the predecessors.
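
The class of bug can be shown with a small Python sketch. This is not the aco
code: the function and variable names are hypothetical, and per-block state is
reduced to a string per block index.

```python
def predecessor_states(preds, state_by_block, buggy=False):
    """Collect the per-block state of every predecessor of a block.

    preds: block indices of the predecessors, e.g. [2, 3].
    state_by_block: state indexed by block index.
    """
    if buggy:
        # Wrong: uses the position within `preds` (0, 1, ...) as a block index.
        return [state_by_block[i] for i in range(len(preds))]
    # Right: uses the predecessor block index stored in `preds`.
    return [state_by_block[p] for p in preds]


state_by_block = ["s0", "s1", "s2", "s3", "s4"]
preds_of_block_4 = [2, 3]
wrong = predecessor_states(preds_of_block_4, state_by_block, buggy=True)
right = predecessor_states(preds_of_block_4, state_by_block)
```

The buggy variant silently reads the state of blocks 0 and 1 instead of the
real predecessors, which is why such a mistake may go unnoticed: it never
indexes out of range.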
Totals from 8496 (6.30% of 134913) affected shaders: (GFX10.3)
CodeSize: 64070724 -> 64022516 (-0.08%); split: -0.08%, +0.00%
Instrs: 11932750 -> 11920698 (-0.10%); split: -0.10%, +0.00%
Latency: 144040266 -> 144017062 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 29327735 -> 29326421 (-0.00%); split: -0.00%, +0.00%
Fossil DB stats on Rembrandt (RDNA2):
Totals from 4488 (3.33% of 134906) affected shaders:
CodeSize: 42759736 -> 42735392 (-0.06%); split: -0.06%, +0.00%
Instrs: 7960522 -> 7954436 (-0.08%); split: -0.08%, +0.00%
Latency: 96192647 -> 96172571 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 19313576 -> 19312575 (-0.01%); split: -0.01%, +0.00%
Fixes: 75967a4814be7988afc20e59bac4b48bafacab00 ('aco/optimizer_postRA: Speed up reset_block() with predecessors.')
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16161>