Alyssa Rosenzweig [Mon, 20 Jun 2022 15:04:58 +0000 (11:04 -0400)]
r600g: Remove streamout-based buffer copy path
r600g is the only user of util_blitter_copy_buffer in tree, which implements
buffer copies with streamout. This path for r600g was added in
8ac9801669c
("r600g: accelerate buffer copying"), a commit from 2012. At that point there
was no DMA path for buffer copies. Since then, a DMA path has been added,
conditional only on the kernel version -- not the hardware. It appears the
required kernel support has been mainline for at least 4 years now. Mesa 22.2
doesn't need to provide optimal performance on an old kernel -- for performance,
a DMA-capable kernel should be used, and for compatability, the CPU fallback
(used for unaligned buffers as it is) is still available. Remove the streamout
path "in the middle" that appears ~unused today.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17142>
Enrico Galli [Wed, 22 Jun 2022 00:40:51 +0000 (17:40 -0700)]
microsoft/spirv_to_dxil: Fix discard semantics
Unlike in nir, discard is not a super return in DXIL. Therefore, we
will lower discard and terminate to demote + return.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17179>
Ian Romanick [Tue, 21 Jun 2022 23:47:31 +0000 (16:47 -0700)]
nir/range_analysis: Teach range analysis about fdot opcodes
This really, really helps on platforms where fabs() isn't free. A great
many shaders use a * frsq(fabs(fdot(a, a))) to normalize a vector.
Since the result of the fdot must be non-negative, the fabs can be
eliminated by an existing algebraic rule.
shader-db results:
r300 (run on R420 - X800XL)
total instructions in shared programs: 1369807 -> 1368550 (-0.09%)
instructions in affected programs: 59986 -> 58729 (-2.10%)
helped: 609
HURT: 0
total vinst in shared programs: 512899 -> 512861 (<.01%)
vinst in affected programs: 1522 -> 1484 (-2.50%)
helped: 36
HURT: 0
total sinst in shared programs: 260690 -> 260570 (-0.05%)
sinst in affected programs: 1419 -> 1299 (-8.46%)
helped: 120
HURT: 0
total consts in shared programs: 957295 -> 957230 (<.01%)
consts in affected programs: 849 -> 784 (-7.66%)
helped: 65
HURT: 0
LOST: 0
GAINED: 3
The 3 gained shaders are all vertex shaders from XCom: Enemy Unknown.
I'm guessing that game is never going to run on my X800XL. :)
i915
total instructions in shared programs: 791121 -> 780843 (-1.30%)
instructions in affected programs: 220170 -> 209892 (-4.67%)
helped: 2085
HURT: 0
total temps in shared programs: 47765 -> 47766 (<.01%)
temps in affected programs: 9 -> 10 (11.11%)
helped: 0
HURT: 1
total const in shared programs: 93048 -> 92983 (-0.07%)
const in affected programs: 784 -> 719 (-8.29%)
helped: 65
HURT: 0
LOST: 0
GAINED: 36
Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs:
16702250 ->
16697908 (-0.03%)
instructions in affected programs: 119277 -> 114935 (-3.64%)
helped: 1065
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 4.08 x̃: 4
helped stats (rel) min: 0.48% max: 10.17% x̄: 3.66% x̃: 3.94%
95% mean confidence interval for instructions value: -4.26 -3.89
95% mean confidence interval for instructions %-change: -3.76% -3.56%
Instructions are helped.
total cycles in shared programs:
880772068 ->
880734134 (<.01%)
cycles in affected programs: 2134456 -> 2096522 (-1.78%)
helped: 941
HURT: 324
helped stats (abs) min: 2 max: 2180 x̄: 123.06 x̃: 44
helped stats (rel) min: 0.04% max: 49.96% x̄: 7.08% x̃: 3.81%
HURT stats (abs) min: 2 max: 2098 x̄: 240.33 x̃: 35
HURT stats (rel) min: 0.04% max: 77.07% x̄: 12.34% x̃: 3.00%
95% mean confidence interval for cycles value: -47.93 -12.04
95% mean confidence interval for cycles %-change: -2.87% -1.34%
Cycles are helped.
No shader-db changes on any other Intel platform.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17181>
Sebastian Keller [Wed, 22 Jun 2022 00:42:15 +0000 (02:42 +0200)]
egl/wayland: Don't try to access modifiers u_vector as dynarray
The modifiers are u_vectors, but the code was trying to access them
as dynarrays. This resulted in a wrong number of modifiers, which then
later on would also lead to invalid reads used as modifiers.
In the case of the iris driver, a wrongly read number of modifiers > 0
would also trigger an error message.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6643
Fixes:
b5848b2dac1 ("egl/wayland: use surface dma-buf feedback to allocate surface buffers")
Reviewed-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17180>
Jason Ekstrand [Wed, 22 Jun 2022 21:40:34 +0000 (16:40 -0500)]
vulkan: Add a vk_pipeline_shader_stage_to_nir helper
This is similar to vk_shader_module_to_nir only it takes a
VkPipelineShaderStageCreateInfo and handles
VK_KHR_graphics_pipeline_library semantics for when a
VkShaderModuleCreateInfo is provided instead of an actual module.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17196>
Jason Ekstrand [Wed, 22 Jun 2022 21:40:13 +0000 (16:40 -0500)]
vulkan/nir: Make spirv_data const in vk_spirv_to_nir
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17196>
Matt Coster [Thu, 26 May 2022 09:04:40 +0000 (10:04 +0100)]
pvr: csbgen: Make all generated enums unambiguous
This change involves two enums:
* rogue_texstate.xml: All COMPRESSED_* members of FORMAT are moved
to FORMAT_COMPRESSED (without the prefix). A second field is added
to IMAGE_WORD0 (texformat_compressed) which overlaps with the
original (texformat), and
* rogue_pbestate.xml: REG_WORD0_LINESTRIDE was not a real enum; it's
removed entirely. It only has value when feature
pbe_stride_align_1pixel is present, so a FIXME comment was added to
this effect.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17204>
Pavel Ondračka [Mon, 20 Jun 2022 20:17:21 +0000 (22:17 +0200)]
r300: only run merge_movs pass on R500
This pass currently generates some swizzles that the R300 and R400
hardware can't handle, make it R500 for now.
Fixes:
6c2959c0
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17150>
Vinson Lee [Thu, 16 Jun 2022 22:42:29 +0000 (15:42 -0700)]
tu: Check dereferenced value of rop_reads_dst.
Fix defect reported by Coverity Scan.
Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking rop_reads_dst suggests that it may be
null, but it has already been dereferenced on all paths leading to the
check.
Fixes:
94be0dd0b86 ("tu: Implement extendedDynamicState2LogicOp")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17099>
Connor Abbott [Tue, 14 Jun 2022 18:47:29 +0000 (20:47 +0200)]
ir3: Fix vectorizer condition for SSBOs
SSBO access works very differently from UBO access. Straddling
loads/stores isn't an issue, loads/stores instead must be aligned to the
element size and can have up to 4 components.
We support 16-bit access with SSBOs on a650+, and sometimes the
vectorizer tries to create a misaligned 32-bit access when combining
32-bit and 16-bit accesses. The UBO-focused logic didn't reject this,
which is now fixed. This fixes a number of VK-CTS regressions on a650+.
Fixes:
bf49d4a084b ("freedreno/ir3: Enable load/store vectorization for SSBO access, too.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17040>
Rhys Perry [Tue, 21 Jun 2022 17:34:44 +0000 (18:34 +0100)]
aco: don't skip VS->TCS barrier if TCS output vertices doesn't match input
TCS invocations correspond to output patch vertices, not input. If they
differ, TCS invocations can be in a different subgroup than VS invocations
of the input patch.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6564
Fixes:
152092b8ead ("aco: skip s_barrier if TCS patches are within subgroup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17174>
Yonggang Luo [Thu, 21 Apr 2022 18:09:52 +0000 (02:09 +0800)]
meson: Enable wgl tests on mingw
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Thu, 5 May 2022 17:08:59 +0000 (01:08 +0800)]
ci: Building all mesa functional with mingw on debian
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Wed, 1 Jun 2022 08:03:38 +0000 (16:03 +0800)]
d3d12: Turn d3d12_format.h to include d3d12_common.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Thu, 5 May 2022 17:08:59 +0000 (01:08 +0800)]
ci: Trigger the new mingw/linux dockers to be build
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Thu, 5 May 2022 11:44:02 +0000 (19:44 +0800)]
ci: Prepare the container for building all mesa components with mingw under linux
`x86_build-base-wine.sh` are usd to install wine and xvfb
`x86_build-mingw-patch.sh` are used to pull packages from msys2 that can be directly used.
`x86_build-mingw-source-deps.sh` are used to building llvm, libclc, clang, spirv-tools and directx-headers from source
xvfb are used to enable wgl tests on debian
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Fri, 6 May 2022 22:48:57 +0000 (06:48 +0800)]
ci/x86_build: Getting pushd popd be paired, avoid using cd
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Wed, 15 Jun 2022 03:00:14 +0000 (11:00 +0800)]
dzn: Fixes incompatible pointer type error
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Sat, 7 May 2022 22:04:55 +0000 (06:04 +0800)]
microsoft/clc: Disable clc_compiler_test on non-windows platform
The test can compile, but can not pass, so compile it but not running it
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Sat, 7 May 2022 01:46:21 +0000 (09:46 +0800)]
microsoft/clc: Fixes narrowing error in clc_compiler_test.cpp with mingw/gcc
errors:
../../src/microsoft/clc/clc_compiler_test.cpp:563:19: error: narrowing conversion of '
268435457' from 'int' to 'uint16_t' {aka 'short unsigned int'} [-Wnarrowing]
563 | 0x00000000, 0x10000001, 0x20000020, 0x30000300,
| ^~~~~~~~~~
../../src/microsoft/clc/clc_compiler_test.cpp:563:31: error: narrowing conversion of '
536870944' from 'int' to 'uint16_t' {aka 'short unsigned int'} [-Wnarrowing]
563 | 0x00000000, 0x10000001, 0x20000020, 0x30000300,
| ^~~~~~~~~~
../../src/microsoft/clc/clc_compiler_test.cpp:563:43: error: narrowing conversion of '
805307136' from 'int' to 'uint16_t' {aka 'short unsigned int'} [-Wnarrowing]
563 | 0x00000000, 0x10000001, 0x20000020, 0x30000300,
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Tue, 10 May 2022 21:24:10 +0000 (05:24 +0800)]
d3d12: Fixes compiling error in d3d12/wgl/d3d12_wgl_framebuffer.cpp with gcc
error message:
```
../../src/gallium/winsys/d3d12/wgl/d3d12_wgl_framebuffer.cpp:231:42: error: no matching function for call to 'operator new(sizetype, d3d12_wgl_framebuffer*&)'
231 | new (fb) struct d3d12_wgl_framebuffer();
| ^
<built-in>: note: candidate: 'void* operator new(long long unsigned int)'
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Wed, 1 Jun 2022 01:09:55 +0000 (09:09 +0800)]
d3d12: Convert #include <Windows.h> to #include <windows.h> for mingw on linux
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Yonggang Luo [Thu, 19 May 2022 01:37:07 +0000 (09:37 +0800)]
d3d12: Use static_cast instead of dynamic_cast in d3d12_video_enc_h264.cpp
Because we may compile mesa with both rtti=enabled and rtti=disabled because of LLVM
Fixes errors:
../../src/gallium/drivers/d3d12/d3d12_video_enc_h264.cpp:777:7: error: 'dynamic_cast' not permitted with '-fno-rtti'
777 | dynamic_cast<d3d12_video_bitstream_builder_h264 *>(pD3D12Enc->m_upBitstreamBuilder.get());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
Jason Ekstrand [Sat, 12 Mar 2022 18:28:22 +0000 (12:28 -0600)]
turnip: Use the new border color helpers
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
Jason Ekstrand [Sat, 12 Mar 2022 18:19:52 +0000 (12:19 -0600)]
lavapipe: Use the new border color helper
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
Jason Ekstrand [Sat, 12 Mar 2022 18:10:08 +0000 (12:10 -0600)]
panvk: Use the new border color helpers
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
Jason Ekstrand [Sat, 12 Mar 2022 17:00:52 +0000 (11:00 -0600)]
vulkan: Add some border color helpers
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
Zhang, Jianxun [Tue, 10 May 2022 19:55:13 +0000 (12:55 -0700)]
iris: Wa_14016820455 for GFX_VERx10 == 12.5
Reprogram SF CLIP viewport pointer by not skipping its
dirty flag bit.
Many thanks to Lin, Shuicheng <shuicheng.lin@intel.com>,
Jerez Plata, Francisco <francisco.jerez.plata@intel.com>,
Graunke, Kenneth W <kenneth.w.graunke@intel.com>,
and others for their great help.
Signed-off-by: Zhang, Jianxun <jianxun.zhang@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17171>
Lionel Landwerlin [Mon, 4 Apr 2022 20:40:24 +0000 (23:40 +0300)]
anv: limit RT writes to number of color outputs
Not doing so crates skews occlusion queries. Fixes Zink's piglit
occlusion_query tests.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes:
a4f502de3228 ("anv: fix VK_DYNAMIC_STATE_COLOR_WRITE_ENABLE_EXT state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6205
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15740>
Alyssa Rosenzweig [Sat, 11 Jun 2022 23:57:38 +0000 (19:57 -0400)]
agx: Handle loop { if { loop { .. } } }
We need to push loop nesting to handle this correctly -- at the end of
the innermost loop, the correct nesting is 1 (from the if), not 0.
Fixes assertion failure in
dEQP-GLES2.functional.shaders.struct.local.dynamic_loop_nested_struct_array_fragment,UnexpectedPass
dEQP-GLES2.functional.shaders.struct.local.dynamic_loop_nested_struct_array_vertex,UnexpectedPass
dEQP-GLES2.functional.shaders.struct.uniform.dynamic_loop_nested_struct_array_fragment,UnexpectedPass
dEQP-GLES2.functional.shaders.struct.uniform.dynamic_loop_nested_struct_array_vertex,UnexpectedPass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17128>
Emma Anholt [Mon, 20 Jun 2022 19:03:40 +0000 (12:03 -0700)]
ci/bare-metal: Collapse artifacts wget by default.
In trying to figure out why an rpi3 job took so long, I wished that the
wget verbosity was hidden by default, and that it told me how long it took
like other sections do.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17146>
Emma Anholt [Mon, 20 Jun 2022 18:51:40 +0000 (11:51 -0700)]
ci/bare-metal: Consolidate needs declarations in .baremetal-test-*.
We had it set up for arm64 asan already, do it for everyone else too. In
cleaning up the duplication, this fixes a pasteo in rpi3 which had the
"artifacts: false" on the wrong job, causing it to do a slow download of
the mesa build from gitlab.
Doing this required also moving the ".use-debian/arm_test" in as well, so
that its "needs:" didn't overwrite ours if it appeared after us in the
consumer's "extends:"
Should save about 20 seconds on rpi3 jobs.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17146>
Emma Anholt [Tue, 21 Jun 2022 15:41:44 +0000 (08:41 -0700)]
ci/bare-metal: Remove "stage: test" from .baremetal-test.
This is not one of the valid stages in the top level "stages:"
declaration, you're supposed to get your stage from your
test-source-dep.yml include. Avoids issues when .baremetal-test gets
included after your test-source-dep.
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17146>
Mike Blumenkrantz [Thu, 7 Apr 2022 22:03:06 +0000 (18:03 -0400)]
zink: flag dmabufs for foreign queue transition on flush_resource call
this is needed by ext_external_objects
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15810>
Mike Blumenkrantz [Thu, 7 Apr 2022 22:02:57 +0000 (18:02 -0400)]
zink: add flag to indicate if a resource is a dmabuf
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15810>
Emma Anholt [Tue, 7 Jun 2022 00:07:43 +0000 (17:07 -0700)]
ci/freedreno: Turn a530 back on by default and update expectations.
I think it should be fixed since I redid how we manage serial around
restarts. Haven't seen a fail in the manual runs I've done.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17004>
Emma Anholt [Tue, 7 Jun 2022 18:31:03 +0000 (11:31 -0700)]
freedreno/a5xx: Set the buffer bit appropriately in XS_CTRL_REG0.
This seems to be how the bit gets used, from grepping my blob traces.
Hopefully this helps stabilize some stuff.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17004>
Emma Anholt [Tue, 7 Jun 2022 17:53:57 +0000 (10:53 -0700)]
freedreno/ir3: Disable image/ssbo 16-bit conversion folding pre-a6xx.
I don't see it in blob dumps, and the reordered args tripped up validation.
Fixes:
49dc60efa1df ("freedreno/ir3: Fold 16-bit conversions into image load/store src/dsts.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17004>
Ian Romanick [Wed, 19 Jun 2019 01:08:45 +0000 (18:08 -0700)]
nir: Add and use algebraic property "is selection"
There are several places that should have supported the various sized
versions of bcsel and the various nir_op_[fi]csel_* opcodes. Rather
than enumerate the whole list, add a property.
v2: Make the comment for NIR_OP_IS_SELECTION more descriptive.
Suggested by Jason.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
Ian Romanick [Thu, 25 Mar 2021 20:13:21 +0000 (13:13 -0700)]
nir/algebraic: Fix NaN-unsafe fcsel patterns
For example, the proof for this pattern
(('bcsel', ('flt', 'a@32', 0), 'b@32', 'c@32'), ('fcsel_ge', a, c, b)),
would be
bcsel(a < 0, b, c)
bcsel(!(a < 0), c, b)
bcsel(a >= 0, c, b)
fcsel_ge(a, c, b)
However, !(a < 0) => (a >= 0) is well known to produce different
results if `a` is NaN.
Instead of that replacement, use this replacement:
bcsel(a < 0, b, c)
bcsel(-0 < -a, b, c)
bcsel(0 < -a, b, c)
fcsel_gt(-a, b, c)
This is NaN-safe and exact.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes:
0f5b3c37c5d ("nir: Add opcodes for fused comp + csel and optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
Ian Romanick [Wed, 15 Jun 2022 16:21:10 +0000 (09:21 -0700)]
nir: i32csel opcodes should compare with integer zero
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Noticed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes:
0f5b3c37c5d ("nir: Add opcodes for fused comp + csel and optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
Connor Abbott [Wed, 22 Jun 2022 15:42:46 +0000 (17:42 +0200)]
tu: Fix linemode for tessellation with isolines
Fixes:
542211676c5 ("turnip: enable VK_EXT_line_rasterization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17190>
Alyssa Rosenzweig [Fri, 17 Jun 2022 21:21:20 +0000 (17:21 -0400)]
v3d: Drop workaround for u_blitter bug
This doesn't seem to be necessary.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17113>
Danylo Piliaiev [Tue, 21 Jun 2022 15:33:15 +0000 (18:33 +0300)]
tu: Do not expose storage image/buffer features for PACK16 formats
We don't support storing into them.
Fixes GL CTS tests running through ZINK:
KHR-GL46.packed_pixels.pbo_rectangle.*
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17168>
Emma Anholt [Tue, 21 Jun 2022 16:12:50 +0000 (09:12 -0700)]
vc4: Propagate txf_ms's dest_type to the lowered txf.
This was missing, and the added validation caught it.
Fixes:
708c47e663be ("nir: Validate nir_tex_instr::dest_type bitsize")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
Emma Anholt [Tue, 21 Jun 2022 16:10:33 +0000 (09:10 -0700)]
ci/vc4: Turn on deqp-egl testing by default.
Now that we have one less job, let's flip this on.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
Emma Anholt [Tue, 21 Jun 2022 16:03:54 +0000 (09:03 -0700)]
ci/vc4: Merge quick_shader in with deqp-gles
All 4 jobs had a total of about 26 minutes of runner time, so squish them
onto 3 runners and use gbm for the .shader_tests to avoid X overhead and
hopefully succeed with full concurrency.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
Mike Blumenkrantz [Mon, 20 Jun 2022 19:28:24 +0000 (15:28 -0400)]
zink: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Mike Blumenkrantz [Mon, 20 Jun 2022 18:39:36 +0000 (14:39 -0400)]
mesa: explicitly disallow multiple pointsize exports from generating
for the fixedfunc vertex case this is important since the fixedfunc shader
may have already added an (attenuated) pointsize
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Mike Blumenkrantz [Mon, 20 Jun 2022 18:38:06 +0000 (14:38 -0400)]
mesa: enforce pointsize exports if pointsize is being clamped
min/max pointsize clamping affects the value that must be used,
meaning that it may not be 1.0
in the case where clamping changes the value from 1.0, ensure the shader
export path is used if attenuation isn't enabled
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Mike Blumenkrantz [Mon, 20 Jun 2022 18:37:04 +0000 (14:37 -0400)]
mesa: skip pointsize exports if pointsize attenuation is enabled
attenuation has its own method of exporting pointsize in fixedfunc shaders,
so ensure the attenuated size isn't overwritten
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Mike Blumenkrantz [Mon, 20 Jun 2022 18:36:21 +0000 (14:36 -0400)]
mesa: rename PointSizeIsOne -> PointSizeIsSet
this will better convey the meaning of the value
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Mike Blumenkrantz [Mon, 20 Jun 2022 18:35:35 +0000 (14:35 -0400)]
mesa: break out PointSizeIsOne setting to util function
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Mike Blumenkrantz [Mon, 20 Jun 2022 18:33:09 +0000 (14:33 -0400)]
nir/lower_point_size: apply point size clamping
point size min/max values are provided through the state vars, so ensure
these are always applied in order to respect ARB_point_parameters
cc: mesa-stable
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
Italo Nicola [Mon, 20 Jun 2022 17:45:43 +0000 (14:45 -0300)]
virgl: overpropagate precise flags
As it turns out, MOVs weren't the only instructions that blocked precise
flags propagation in the transition to nir-to-tgsi.
This commit fixes some rendering regressions caused by
a4a34cd3.
Fixes:
a4a34cd3
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collanora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17144>
Jason Volk [Wed, 9 Mar 2022 21:14:38 +0000 (13:14 -0800)]
radeon: Support shared memory user pointers.
The RADEON_GEM_USERPTR_ANONONLY flag is hardcoded here which excludes
shared memory pages. DRM is actually capable of supporting shared file-
backed memory, but only if it's read-only. This mutability intent has to
be conveyed through the stack, so a flags argument is added to the winsys
regime to pass RADEON_FLAG_READ_ONLY.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16115>
Marcin Ślusarz [Tue, 21 Jun 2022 08:47:10 +0000 (10:47 +0200)]
intel/compiler: assert that base is 0 for [load|store]_shared intrins
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
Timur Kristóf [Mon, 20 Jun 2022 15:31:06 +0000 (17:31 +0200)]
nir/lower_task_shader: don't use base index for shared memory intrinsics
Intel backend doesn't handle them very well.
Fixes:
8aff8d3dd42 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
Marcin Ślusarz [Mon, 20 Jun 2022 13:41:38 +0000 (15:41 +0200)]
nir/lower_task_shader: insert barrier before/after shared memory read/write
Fixes:
8aff8d3dd42 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
Connor Abbott [Fri, 20 May 2022 11:57:23 +0000 (13:57 +0200)]
ir3/sched: Fix could_sched() determination
This needs to be accurate so that when we split and then schedule a new
a0.x/a1.x/p0.x write we will eventually make progress. It wasn't taking
the kill_path into account which could create an infinite loop as we
keep scheduling writes whose uses are blocked because they are memory
instructions not on the kill_path.
Closes: #6413
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16635>
Danylo Piliaiev [Tue, 21 Jun 2022 18:07:22 +0000 (21:07 +0300)]
meson/tu: Don't compile libdrm paths if KGSL is selected
Even if there is libdrm we shouldn't use it if KGSL is selected.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
Danylo Piliaiev [Tue, 21 Jun 2022 18:05:51 +0000 (21:05 +0300)]
meson/pps: Check if libdrm exists to compile pps
For Turnip with KGSL we may have perffeto enabled but we don't
have libdrm.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
Danylo Piliaiev [Tue, 21 Jun 2022 18:03:22 +0000 (21:03 +0300)]
meson: Define _GNU_SOURCE for android host system
Otherwise sched_getaffinity isn't be defined and util_cpu_detect_once
fails to compile.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
Samuel Pitoiset [Tue, 21 Jun 2022 15:52:29 +0000 (17:52 +0200)]
radv/llvm: always emit a null export even if the FS doesn't discard
Even with a noop FS, the color blend state can still be non-zero, and
then SPI color related registers won't be 0 and this would hang.
Fixes:
bdf3797aeb7 ("ac,radeonsi: don't export null from PS if it has no effect on gfx10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17169>
Pavel Asyutchenko [Wed, 1 Sep 2021 20:26:44 +0000 (23:26 +0300)]
llvmpipe: enable PIPE_CAP_FBFETCH_ZS
Support for it was added in previous commits.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Pavel Asyutchenko [Sun, 19 Jun 2022 18:16:26 +0000 (21:16 +0300)]
llvmpipe: implement FB fetch for depth/stencil
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Pavel Asyutchenko [Sun, 19 Jun 2022 18:13:26 +0000 (21:13 +0300)]
llvmpipe: simplify early/late zs tests selection
This does not change selection logic.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Pavel Asyutchenko [Thu, 7 Oct 2021 21:23:50 +0000 (00:23 +0300)]
llvmpipe: enable per-sample shading when FB fetch is used
This matches specifications of both color and ZS fetch extensions.
Cc: mesa-stable
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Pavel Asyutchenko [Fri, 3 Dec 2021 21:38:56 +0000 (00:38 +0300)]
nir_to_tgsi: Don't count ZS fbfetch vars as outputs
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Pavel Asyutchenko [Wed, 1 Sep 2021 20:26:12 +0000 (23:26 +0300)]
glsl: add language support for GL_ARM_shader_framebuffer_fetch_depth_stencil
This extension adds built-in variables gl_LastFragDepthARM and gl_LastFragStencilARM
which can be implemented almost the same as gl_LastFragData from color fetch extension.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Pavel Asyutchenko [Wed, 1 Sep 2021 20:25:16 +0000 (23:25 +0300)]
gallium: add PIPE_CAP_FBFETCH_ZS and expose extension
st/mesa will expose GL_ARM_shader_framebuffer_fetch_depth_stencil
if this new capability is supported by the driver.
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
Dave Airlie [Mon, 20 Jun 2022 07:33:21 +0000 (17:33 +1000)]
glx/drisw: use xcb instead of X to query connection
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
Dave Airlie [Tue, 21 Jun 2022 19:53:52 +0000 (05:53 +1000)]
wsi/x11: add xcb_put_image support for larger transfers.
This was noticed as a problem in the EGL code, just fixup wsi.
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
Dave Airlie [Tue, 21 Jun 2022 19:49:44 +0000 (05:49 +1000)]
egl/x11: add missing put_image cookie cleanups
These might not be required but be consistent with the wsi code.
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
Dave Airlie [Tue, 21 Jun 2022 00:59:32 +0000 (10:59 +1000)]
egl/x11: split large put image requests to avoid server destroy
wezterm in fullscreen 4k was exceeding the xcb max request size
on the put image with llvmpipe. This fixes it to send sub-images,
the Xlib put image used in glx does this internally, but not
the xcb one, so just do it in sections here.
Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
Mike Blumenkrantz [Tue, 21 Jun 2022 01:31:31 +0000 (21:31 -0400)]
zink: fix dual_src_blend driconf workaround
not sure when this broke but it broke
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17156>
Mike Blumenkrantz [Mon, 20 Jun 2022 19:37:46 +0000 (15:37 -0400)]
glx/drisw: invalidate drawables upon binding context if flush extension exists
this forces surface resize as expected
cc: mesa-stable
fixes #6706
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>
Mike Blumenkrantz [Mon, 20 Jun 2022 19:36:58 +0000 (15:36 -0400)]
glx/drisw: store the flush extension to the screen
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>
Guilherme Gallo [Mon, 20 Jun 2022 22:12:28 +0000 (19:12 -0300)]
ci/lava: Filter out undesired messages
Some LAVA jobs emit lots of messages "Listened to connection for
namespace 'common' for up to 1s" in a row at the end of the logs, making
difficult to see the result of the test script.
This commit removes those lines until a proper solution is deployed on
the LAVA side.
Closes: #6116
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17151>
Jason Ekstrand [Tue, 21 Jun 2022 17:08:06 +0000 (12:08 -0500)]
vulkan/wsi: Use HAVE_LIBDRM to detect DRM instead of !_WIN32
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17170>
Jordan Justen [Tue, 14 Jun 2022 06:45:35 +0000 (23:45 -0700)]
intel/tools: Print memory info in intel_dev_info
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
Jordan Justen [Wed, 18 May 2022 08:21:44 +0000 (01:21 -0700)]
iris/bufmgr: Use memory info from devinfo
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
Jordan Justen [Wed, 18 May 2022 16:52:37 +0000 (09:52 -0700)]
anv: Use memory info from devinfo
Rework:
* Jordan: Drop regions.valid (Lionel implemented a fallback)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
Lionel Landwerlin [Wed, 15 Jun 2022 08:03:29 +0000 (11:03 +0300)]
intel/dev: add a fallback when memory regions are not available
We have this in Anv and it could be reused in Iris for integrated
memory system.
Rework:
* Jordan: Drop regions.valid (Lionel implemented a fallback)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
Lionel Landwerlin [Wed, 15 Jun 2022 07:28:07 +0000 (10:28 +0300)]
intel/dev: add a helper to update memory info
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
Jordan Justen [Mon, 16 May 2022 09:19:48 +0000 (02:19 -0700)]
intel/dev: Add devinfo::mem to store i915 regions information
Reworks:
* Lionel: Change check on memory region valid to vram size
* Jordan: Drop regions.valid (Lionel implemented a fallback)
* Jordan: Rename devinfo::regions to devinfo::mem.
* Jordan: Add devinfo::mem::use_class_instance
* Add mesa_logw for lmem requiring regions. (s-b Lionel)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
Alyssa Rosenzweig [Fri, 17 Jun 2022 16:54:07 +0000 (12:54 -0400)]
panfrost: Bump ESSL_FEATURE_LEVEL on Valhall
This advertises ARB_gpu_shader5 on Valhall, which should be working now. On the
GLES3.1 side, this notably adds support for sample variables and dynamic offsets
for texture gathers, both of which should now be working.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 16:54:03 +0000 (12:54 -0400)]
panfrost: Enable CAP_INDIRECT_TEMP_ADDR on Valhall
For parity with Bifrost. Apparently this pattern is sufficiently obscure that
the shader-db results on Mali-G57 are mostly noise.
total instructions in shared programs: 2675116 -> 2674820 (-0.01%)
instructions in affected programs: 4336 -> 4040 (-6.83%)
helped: 8
HURT: 1
helped stats (abs) min: 1.0 max: 52.0 x̄: 37.88 x̃: 49
helped stats (rel) min: 0.46% max: 8.20% x̄: 5.97% x̃: 7.56%
HURT stats (abs) min: 7.0 max: 7.0 x̄: 7.00 x̃: 7
HURT stats (rel) min: 5.98% max: 5.98% x̄: 5.98% x̃: 5.98%
95% mean confidence interval for instructions value: -52.90 -12.88
95% mean confidence interval for instructions %-change: -8.48% -0.81%
Instructions are helped.
total cvt in shared programs: 14127.08 -> 14126.53 (<.01%)
cvt in affected programs: 33.84 -> 33.30 (-1.62%)
helped: 10
HURT: 1
helped stats (abs) min: 0.015625 max: 0.125 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.71% max: 2.93% x̄: 1.76% x̃: 1.78%
HURT stats (abs) min: 0.09375 max: 0.09375 x̄: 0.09 x̃: 0
HURT stats (rel) min: 7.89% max: 7.89% x̄: 7.89% x̃: 7.89%
95% mean confidence interval for cvt value: -0.09 -0.01
95% mean confidence interval for cvt %-change: -2.89% 1.13%
Inconclusive result (%-change mean confidence interval includes 0).
total sfu in shared programs: 7572 -> 7555.69 (-0.22%)
sfu in affected programs: 37.19 -> 20.88 (-43.87%)
helped: 6
HURT: 3
helped stats (abs) min: 2.75 max: 2.75 x̄: 2.75 x̃: 2
helped stats (rel) min: 47.31% max: 48.89% x̄: 48.63% x̃: 48.89%
HURT stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel) min: 5.56% max: 6.25% x̄: 5.79% x̃: 5.56%
95% mean confidence interval for sfu value: -2.89 -0.73
95% mean confidence interval for sfu %-change: -51.41% -9.57%
Sfu are helped.
total quadwords in shared programs: 1450040 -> 1449896 (<.01%)
quadwords in affected programs: 1992 -> 1848 (-7.23%)
helped: 6
HURT: 0
helped stats (abs) min: 24.0 max: 24.0 x̄: 24.00 x̃: 24
helped stats (rel) min: 6.82% max: 7.50% x̄: 7.24% x̃: 7.32%
95% mean confidence interval for quadwords value: -24.00 -24.00
95% mean confidence interval for quadwords %-change: -7.48% -6.99%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 16:53:49 +0000 (12:53 -0400)]
panfrost: Enable more FP16 caps on Valhall
This brings the FP16 capabilities of Valhall to parity with Bifrost.
Supporting FP16 constant buffers in particular reduces ALU in a ton of GLES
shaders, so that's a nice win. FP16 derivatives get vectorized which is a big
win where that applies, but they are considerably less common.
The lost shaders are from enabling PIPE_SHADER_CAP_FP16_CONST_BUFFERS (these
shaders compile on Midgard but not on Bifrost). The shaders in question declare
the same uniform in linked vertex and fragment shaders with different
precisions. This is contrary to the GLSL ES specification, which states
precisions must match for default uniforms of linked shaders. All the lost
shaders are in 8 Ball Pool and Hill Climb Racing. As those are proprietary
games, if that becomes a problem in the future, drirc is the solution.
total instructions in shared programs: 2697897 -> 2674595 (-0.86%)
instructions in affected programs: 1019922 -> 996620 (-2.28%)
helped: 4838
HURT: 2599
helped stats (abs) min: 1.0 max: 52.0 x̄: 7.13 x̃: 5
helped stats (rel) min: 0.16% max: 46.51% x̄: 8.04% x̃: 5.33%
HURT stats (abs) min: 1.0 max: 36.0 x̄: 4.30 x̃: 3
HURT stats (rel) min: 0.17% max: 133.33% x̄: 10.53% x̃: 3.85%
95% mean confidence interval for instructions value: -3.32 -2.95
95% mean confidence interval for instructions %-change: -1.89% -1.22%
Instructions are helped.
total cycles in shared programs: 141764.61 -> 140602.88 (-0.82%)
cycles in affected programs: 5728.22 -> 4566.48 (-20.28%)
helped: 665
HURT: 89
helped stats (abs) min: 0.015625 max: 15.0 x̄: 1.75 x̃: 0
helped stats (rel) min: 0.30% max: 61.54% x̄: 11.17% x̃: 4.62%
HURT stats (abs) min: 0.015625 max: 0.265625 x̄: 0.04 x̃: 0
HURT stats (rel) min: 0.30% max: 66.67% x̄: 6.77% x̃: 1.94%
95% mean confidence interval for cycles value: -1.77 -1.31
95% mean confidence interval for cycles %-change: -10.11% -7.99%
Cycles are helped.
total fma in shared programs: 22577.56 -> 22575.91 (<.01%)
fma in affected programs: 2422.78 -> 2421.12 (-0.07%)
helped: 533
HURT: 653
helped stats (abs) min: 0.015625 max: 0.0625 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.30% max: 50.00% x̄: 8.25% x̃: 1.35%
HURT stats (abs) min: 0.015625 max: 0.125 x̄: 0.03 x̃: 0
HURT stats (rel) min: 0.19% max: 100.00% x̄: 4.53% x̃: 2.08%
95% mean confidence interval for fma value: -0.00 0.00
95% mean confidence interval for fma %-change: -1.98% -0.44%
Inconclusive result (value mean confidence interval includes 0).
total cvt in shared programs: 14460.95 -> 14122.50 (-2.34%)
cvt in affected programs: 6159.02 -> 5820.56 (-5.50%)
helped: 4827
HURT: 2577
helped stats (abs) min: 0.015625 max: 0.796875 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.20% max: 81.82% x̄: 17.78% x̃: 12.90%
HURT stats (abs) min: 0.015625 max: 0.546875 x̄: 0.07 x̃: 0
HURT stats (rel) min: 0.00% max: 600.00% x̄: 43.66% x̃: 13.04%
95% mean confidence interval for cvt value: -0.05 -0.04
95% mean confidence interval for cvt %-change: 2.28% 4.93%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).
total sfu in shared programs: 7593.56 -> 7571.06 (-0.30%)
sfu in affected programs: 357.19 -> 334.69 (-6.30%)
helped: 149
HURT: 1
helped stats (abs) min: 0.0625 max: 0.25 x̄: 0.15 x̃: 0
helped stats (rel) min: 5.26% max: 36.36% x̄: 6.79% x̃: 5.56%
HURT stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel) min: 3.57% max: 3.57% x̄: 3.57% x̃: 3.57%
95% mean confidence interval for sfu value: -0.16 -0.14
95% mean confidence interval for sfu %-change: -7.51% -5.93%
Sfu are helped.
total v in shared programs: 8722.62 -> 8722.31 (<.01%)
v in affected programs: 1.62 -> 1.31 (-19.23%)
helped: 2
HURT: 0
total ls in shared programs: 129666 -> 128494 (-0.90%)
ls in affected programs: 4163 -> 2991 (-28.15%)
helped: 192
HURT: 0
helped stats (abs) min: 1.0 max: 15.0 x̄: 6.10 x̃: 5
helped stats (rel) min: 4.35% max: 75.00% x̄: 30.23% x̃: 26.32%
95% mean confidence interval for ls value: -6.67 -5.54
95% mean confidence interval for ls %-change: -32.67% -27.79%
Ls are helped.
total quadwords in shared programs: 1461496 -> 1449768 (-0.80%)
quadwords in affected programs: 273592 -> 261864 (-4.29%)
helped: 1992
HURT: 687
helped stats (abs) min: 8.0 max: 24.0 x̄: 8.76 x̃: 8
helped stats (rel) min: 1.43% max: 50.00% x̄: 16.30% x̃: 11.11%
HURT stats (abs) min: 8.0 max: 16.0 x̄: 8.31 x̃: 8
HURT stats (rel) min: 1.92% max: 100.00% x̄: 36.39% x̃: 25.00%
95% mean confidence interval for quadwords value: -4.67 -4.08
95% mean confidence interval for quadwords %-change: -3.95% -1.62%
Quadwords are helped.
total threads in shared programs: 53496 -> 53551 (0.10%)
threads in affected programs: 112 -> 167 (49.11%)
helped: 74
HURT: 19
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.42 0.76
95% mean confidence interval for threads %-change: 56.83% 81.88%
Threads are helped.
total loops in shared programs: 128 -> 127 (-0.78%)
loops in affected programs: 1 -> 0
helped: 1
HURT: 0
total fills in shared programs: 684 -> 672 (-1.75%)
fills in affected programs: 160 -> 148 (-7.50%)
helped: 2
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 16:58:55 +0000 (12:58 -0400)]
pan/bi: Tune lower_vars_to_scratch
Increase the threshold to lower indirect indexing of arrays to scratch memory
all the way up to 256 bytes, which was the lowest power-of-two threshold for
which enabling the pass on Mali-G57 was a win in shaderdb.
It's difficult to tell what threshold is optimal here. The shader-db stats are
based on a rough cycle model that assumes a 16:1 ratio between CVT and
load/store on Valhall, and a 24:1 ratio between arithmetic and load/store on
Bifrost. Those ratios are at most rules of thumb, as the number of cycles
required by a load/store instruction will vary tremendously based on caching and
the memory controller. However, they may well be lower bounds (if those are the
upper bounds on instruction issuing in the Mali shader cores). As such, a large
threshold seems well motivated.
shader-db results on Mali-G52 follow, results on Mali-G57 were similar. Note the
shader that's hurt for spills/fills is *helped* for load/store overall.
cycles helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%))
ldst helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%))
total instructions in shared programs: 2415410 -> 2415372 (<.01%)
instructions in affected programs: 1041 -> 1003 (-3.65%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 31.0 x̄: 12.67 x̃: 5
helped stats (rel) min: 2.08% max: 6.02% x̄: 3.90% x̃: 3.60%
total tuples in shared programs: 1928558 -> 1928527 (<.01%)
tuples in affected programs: 826 -> 795 (-3.75%)
helped: 2
HURT: 1
helped stats (abs) min: 6.0 max: 26.0 x̄: 16.00 x̃: 16
helped stats (rel) min: 3.72% max: 9.68% x̄: 6.70% x̃: 6.70%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.54% max: 1.54% x̄: 1.54% x̃: 1.54%
total clauses in shared programs: 355013 -> 354981 (<.01%)
clauses in affected programs: 220 -> 188 (-14.55%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 27.0 x̄: 10.67 x̃: 3
helped stats (rel) min: 13.99% max: 21.43% x̄: 16.93% x̃: 15.38%
total cycles in shared programs: 166610.27 -> 166574.90 (-0.02%)
cycles in affected programs: 138 -> 102.62 (-25.63%)
helped: 3
HURT: 0
helped stats (abs) min: 0.
4583330000000001 max: 31.0 x̄: 11.79 x̃: 3
helped stats (rel) min: 15.28% max: 65.28% x̄: 34.86% x̃: 24.03%
total arith in shared programs: 73690.13 -> 73690.58 (<.01%)
arith in affected programs: 29.71 -> 30.17 (1.54%)
helped: 1
HURT: 2
helped stats (abs) min: 0.
0833339999999998 max: 0.
0833339999999998 x̄: 0.08 x̃: 0
helped stats (rel) min: 3.85% max: 3.85% x̄: 3.85% x̃: 3.85%
HURT stats (abs) min: 0.125 max: 0.
4166659999999993 x̄: 0.27 x̃: 0
HURT stats (rel) min: 1.66% max: 5.17% x̄: 3.42% x̃: 3.42%
total ldst in shared programs: 135611 -> 135571 (-0.03%)
ldst in affected programs: 138 -> 98 (-28.99%)
helped: 3
HURT: 0
helped stats (abs) min: 3.0 max: 31.0 x̄: 13.33 x̃: 6
helped stats (rel) min: 24.03% max: 100.00% x̄: 74.68% x̃: 100.00%
total quadwords in shared programs: 1674599 -> 1674523 (<.01%)
quadwords in affected programs: 838 -> 762 (-9.07%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 65.0 x̄: 25.33 x̃: 9
helped stats (rel) min: 3.39% max: 15.00% x̄: 9.14% x̃: 9.04%
total spills in shared programs: 37 -> 40 (8.11%)
spills in affected programs: 17 -> 20 (17.65%)
helped: 0
HURT: 1
total fills in shared programs: 190 -> 196 (3.16%)
fills in affected programs: 34 -> 40 (17.65%)
helped: 0
HURT: 1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 15:59:09 +0000 (11:59 -0400)]
pan/va: Replace MKVEC.v4i8 with MKVEC.v2i8
This is the instruction that the hardware actually supports. Do the rename, use
the more specific accurate model in the IR, and rework the Valhall texturing
code to emit MKVEC.v2i8 instead of MKVEC.v4i8.
Will fix:
dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 15:56:21 +0000 (11:56 -0400)]
pan/va: Pack MKVEC.v2i8 byte lanes
They are in a different place, but the encoding is otherwise as usual. This will
be required for texture gathers with dynamic offsets.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 15:55:23 +0000 (11:55 -0400)]
pan/bi: Constant fold MKVEC.v2i8
Constant MKVEC.v2i8 will be generated during texturing on Valhall, just like
constant MKVEC.v4i8 is currently generated.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 15:53:53 +0000 (11:53 -0400)]
pan/bi: Model MKVEC.v2i8
Valhall does not have Bifrost's 4-source MKVEC.v4i8. Instead, it has a (somewhat
limtied) 3-source MKVEC.v2i8. The full MKVEC.v4i8 may be lowered to a pair of
MKVEC.v2i8 instructions.
For good code quality on both Bifrost and Valhall, we need to model both
instructions in their full generality.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 15:35:41 +0000 (11:35 -0400)]
pan/bi: Remove FRSCALE from IR
It's just LDEXP in different clothing.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 15:34:51 +0000 (11:34 -0400)]
pan/va: Rename RSCALE to LDEXP
This avoids needless variation from Bifrost. While at it, fix the opcode
definition: there are no abs/neg/swizzle modifiers on the signed integer source,
and there's no clamp. However, there are round and infinity modes, like on
Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 14:47:23 +0000 (10:47 -0400)]
pan/va: Implement sample positions FAU packing
This will fix:
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Fri, 17 Jun 2022 14:40:27 +0000 (10:40 -0400)]
pan/va: Lower FADD_RSCALE.f32 to FMA_RSCALE.f32
We generate FADD_RSCALE.f32 in our sample variables implementations. Valhall
doesn't have a dedicated FADD_RSCALE.f32 implementation, it should be aliased to
FMA_RSCALE.f32. Handle that alias in isel lowering. This will fix:
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.*
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Thu, 16 Jun 2022 23:14:58 +0000 (19:14 -0400)]
pan/bi: Align accesses with packed TLS
When lowering vars to scratch, we need to be careful with alignment on Valhall,
where packed TLS access must not straddle a 16-byte boundary. Fixes regressions
when enabling indirect access to temps on Valhall.
Fixes:
6761dbf8915 ("panfrost: Use packed TLS on Valhall")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
Alyssa Rosenzweig [Thu, 16 Jun 2022 23:04:13 +0000 (19:04 -0400)]
pan/bi: Fix LD_BUFFER.i16 definition
This was missing the message, breaking UBO-to-push and who-knows-what-else, when
enabling fp16 const buffers.
Fixes:
3dc2095b079 ("pan/bi: Model LD_BUFFER instructions")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>