Alyssa Rosenzweig [Wed, 16 Jun 2021 17:29:53 +0000 (13:29 -0400)]
panfrost: Enable more tiler levels if we can
Boosts glmark2 scores on Mali G52.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Wed, 16 Jun 2021 17:28:58 +0000 (13:28 -0400)]
panfrost: Query tiler features
We need the maximum levels to configure the hierarchy mask correctly. We
should also respect the bin size...
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Wed, 16 Jun 2021 15:29:28 +0000 (11:29 -0400)]
panfrost: Zero depth_source in vertex shaders
Spurious assignment.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Wed, 16 Jun 2021 15:28:09 +0000 (11:28 -0400)]
panfrost: Don't set zs_update_operation in vertex shaders
Spurious assignment.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Tue, 15 Jun 2021 22:59:33 +0000 (18:59 -0400)]
panfrost: Add a performance counter dump utility
This uses Antonio's src/panfrost/perf for all the heavy lifting, just
like the Perfetto producer. Unlike the Perfetto producer, it has no
dependencies and is a lot less useful. But it's a good smoke test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Fri, 11 Jun 2021 19:23:12 +0000 (15:23 -0400)]
panfrost: Fix FPK enable condition
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Mon, 28 Jun 2021 14:49:56 +0000 (10:49 -0400)]
pan/bi: Don't lower fpow
We can fuse the intermediate multiply with the FMA_RSCALE in the
exponent code and save an instruction. Whether this is better than
adding a NIR op remains to be seen.
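For context, the usual way to express fpow for a positive base is through the exp2/log2 pair, which is where the intermediate multiply comes from. The identity below is standard math; how FMA_RSCALE absorbs the multiply is not shown here.

```latex
x^{y} = 2^{\,y \cdot \log_2 x}, \qquad x > 0
```

The product y * log2(x) is the intermediate multiply the message refers to.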
total instructions in shared programs: 146614 -> 146190 (-0.29%)
instructions in affected programs: 40724 -> 40300 (-1.04%)
helped: 157
HURT: 0
helped stats (abs) min: 1.0 max: 9.0 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.22% max: 10.34% x̄: 1.37% x̃: 1.20%
95% mean confidence interval for instructions value: -3.00 -2.40
95% mean confidence interval for instructions %-change: -1.58% -1.15%
Instructions are helped.
total tuples in shared programs: 128116 -> 127696 (-0.33%)
tuples in affected programs: 33421 -> 33001 (-1.26%)
helped: 150
HURT: 0
helped stats (abs) min: 1.0 max: 16.0 x̄: 2.80 x̃: 2
helped stats (rel) min: 0.28% max: 4.37% x̄: 1.36% x̃: 1.07%
95% mean confidence interval for tuples value: -3.24 -2.36
95% mean confidence interval for tuples %-change: -1.50% -1.21%
Tuples are helped.
total clauses in shared programs: 27531 -> 27483 (-0.17%)
clauses in affected programs: 719 -> 671 (-6.68%)
helped: 20
HURT: 0
helped stats (abs) min: 1.0 max: 8.0 x̄: 2.40 x̃: 1
helped stats (rel) min: 1.61% max: 12.90% x̄: 6.96% x̃: 5.33%
95% mean confidence interval for clauses value: -3.48 -1.32
95% mean confidence interval for clauses %-change: -9.10% -4.82%
Clauses are helped.
total cycles in shared programs: 12250.81 -> 12233.69 (-0.14%)
cycles in affected programs: 1251.50 -> 1234.38 (-1.37%)
helped: 141
HURT: 0
helped stats (abs) min: 0.041665999999999315 max: 0.6666670000000003 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.29% max: 5.00% x̄: 1.48% x̃: 1.20%
95% mean confidence interval for cycles value: -0.14 -0.10
95% mean confidence interval for cycles %-change: -1.63% -1.32%
Cycles are helped.
total arith in shared programs: 4840.25 -> 4822.71 (-0.36%)
arith in affected programs: 1324.08 -> 1306.54 (-1.32%)
helped: 151
HURT: 0
helped stats (abs) min: 0.041665999999999315 max: 0.6666670000000003 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.29% max: 5.00% x̄: 1.43% x̃: 1.13%
95% mean confidence interval for arith value: -0.13 -0.10
95% mean confidence interval for arith %-change: -1.59% -1.28%
Arith are helped.
total texture in shared programs: 1666.50 -> 1666.50 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0
total vary in shared programs: 639.06 -> 639.06 (0.00%)
vary in affected programs: 0 -> 0
helped: 0
HURT: 0
total ldst in shared programs: 9682 -> 9682 (0.00%)
ldst in affected programs: 0 -> 0
helped: 0
HURT: 0
total quadwords in shared programs: 116758 -> 116378 (-0.33%)
quadwords in affected programs: 28054 -> 27674 (-1.35%)
helped: 148
HURT: 2
helped stats (abs) min: 1.0 max: 16.0 x̄: 2.58 x̃: 2
helped stats (rel) min: 0.29% max: 5.13% x̄: 1.54% x̃: 1.23%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.67% max: 0.85% x̄: 0.76% x̃: 0.76%
95% mean confidence interval for quadwords value: -2.94 -2.12
95% mean confidence interval for quadwords %-change: -1.69% -1.33%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Mon, 28 Jun 2021 14:33:04 +0000 (10:33 -0400)]
pan/bi: Factor out exp2/log2 code
Will be reused for fpow.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Mon, 28 Jun 2021 14:22:35 +0000 (10:22 -0400)]
pan/bi: Comment the fexp2 implementation
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Fri, 25 Jun 2021 22:31:50 +0000 (18:31 -0400)]
pan/bi: Simplify cube map descriptor generation
We don't need to do the bitwise manipulation ourselves, we can just use
a bitwise MUX instead.
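A bitwise MUX picks each result bit from one of two inputs according to a mask. The sketch below shows the idea in plain C; `mux_bits` and `insert_field_manual` are hypothetical helpers, and the operand order is an assumption rather than the Bifrost instruction's actual encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: for each bit, take `a` where mask is 1, else `b`. */
static inline uint32_t
mux_bits(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

/* The open-coded equivalent that a MUX replaces: clear the field in
 * `word`, then OR in the new bits. */
static inline uint32_t
insert_field_manual(uint32_t word, uint32_t field, uint32_t mask)
{
   word &= ~mask;
   word |= field & mask;
   return word;
}
```

Both helpers compute the same value; the first form maps onto a single select-style operation.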
total instructions in shared programs: 146840 -> 146614 (-0.15%)
instructions in affected programs: 15037 -> 14811 (-1.50%)
helped: 109
HURT: 0
helped stats (abs) min: 2.0 max: 4.0 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.86% max: 4.00% x̄: 1.70% x̃: 1.77%
95% mean confidence interval for instructions value: -2.15 -2.00
95% mean confidence interval for instructions %-change: -1.81% -1.59%
Instructions are helped.
total tuples in shared programs: 128149 -> 128116 (-0.03%)
tuples in affected programs: 2896 -> 2863 (-1.14%)
helped: 16
HURT: 0
helped stats (abs) min: 1.0 max: 5.0 x̄: 2.06 x̃: 1
helped stats (rel) min: 0.65% max: 2.33% x̄: 1.16% x̃: 0.70%
95% mean confidence interval for tuples value: -3.01 -1.12
95% mean confidence interval for tuples %-change: -1.50% -0.83%
Tuples are helped.
total cycles in shared programs: 12257.10 -> 12250.81 (-0.05%)
cycles in affected programs: 449.87 -> 443.58 (-1.40%)
helped: 92
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.20833400000000069 x̄: 0.07 x̃: 0
helped stats (rel) min: 0.93% max: 2.53% x̄: 1.40% x̃: 1.26%
95% mean confidence interval for cycles value: -0.08 -0.06
95% mean confidence interval for cycles %-change: -1.48% -1.32%
Cycles are helped.
total arith in shared programs: 4847.33 -> 4840.25 (-0.15%)
arith in affected programs: 490.37 -> 483.29 (-1.44%)
helped: 109
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.20833400000000069 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.93% max: 5.56% x̄: 1.51% x̃: 1.26%
95% mean confidence interval for arith value: -0.07 -0.06
95% mean confidence interval for arith %-change: -1.64% -1.39%
Arith are helped.
total quadwords in shared programs: 116775 -> 116758 (-0.01%)
quadwords in affected programs: 1331 -> 1314 (-1.28%)
helped: 7
HURT: 0
helped stats (abs) min: 1.0 max: 4.0 x̄: 2.43 x̃: 3
helped stats (rel) min: 0.91% max: 2.38% x̄: 1.65% x̃: 1.39%
95% mean confidence interval for quadwords value: -3.48 -1.38
95% mean confidence interval for quadwords %-change: -2.27% -1.04%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Fri, 25 Jun 2021 15:40:47 +0000 (11:40 -0400)]
pan/bi: Workaround widen restrictions on +FADD.f32
We can use *FADD.f32 for these cases.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Thu, 24 Jun 2021 23:36:11 +0000 (19:36 -0400)]
pan/bi: Add a constant subexpression elimination pass
ALU only. Intended to clean up the lowerings used with complex
texturing. For example, if a shader reads two cube maps at the same
coordinates, this deduplicates the cube map transformation.
This needs to happen in the backend since we do the cube map
transformation with the backend builder, rather than special NIR ops.
This is a tradeoff.
Pass based on ir3's, which in turn is inspired by NIR's.
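To illustrate the general technique, here is a minimal sketch of ALU-only subexpression elimination over a toy straight-line SSA program. The IR, the `toy_` names, and the linear search are all assumptions made for brevity; this is not the pass added by the commit, and it ignores commutativity and control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy SSA instruction: dest = op(src0, src1). Values are small integers. */
enum toy_op { TOY_FADD, TOY_FMUL, TOY_LOAD /* not ALU: never deduplicated */ };

struct toy_instr {
   enum toy_op op;
   int dest, src0, src1;
   bool removed;
};

#define TOY_MAX_VALUES 64

static bool
toy_valid(int value)
{
   return value >= 0 && value < TOY_MAX_VALUES;
}

/* Returns the number of instructions eliminated. Assumes straight-line SSA
 * code where every value index is below TOY_MAX_VALUES. */
static unsigned
toy_cse(struct toy_instr *instrs, size_t count)
{
   int replacement[TOY_MAX_VALUES];
   unsigned eliminated = 0;

   for (int i = 0; i < TOY_MAX_VALUES; ++i)
      replacement[i] = i;

   for (size_t i = 0; i < count; ++i) {
      struct toy_instr *I = &instrs[i];
      assert(toy_valid(I->dest) && toy_valid(I->src0) && toy_valid(I->src1));

      /* Rewrite sources first so chains of duplicates collapse. */
      I->src0 = replacement[I->src0];
      I->src1 = replacement[I->src1];

      if (I->op == TOY_LOAD)
         continue;

      /* Linear search keeps the sketch short; a real pass would hash. */
      for (size_t j = 0; j < i; ++j) {
         const struct toy_instr *J = &instrs[j];

         if (!J->removed && J->op == I->op &&
             J->src0 == I->src0 && J->src1 == I->src1) {
            replacement[I->dest] = J->dest;
            I->removed = true;
            ++eliminated;
            break;
         }
      }
   }

   return eliminated;
}
```

Rewriting sources before the lookup is what lets a second duplicate be found after its inputs were themselves deduplicated.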
total instructions in shared programs: 148799 -> 147348 (-0.98%)
instructions in affected programs: 20509 -> 19058 (-7.07%)
helped: 145
HURT: 0
helped stats (abs) min: 4.0 max: 30.0 x̄: 10.01 x̃: 8
helped stats (rel) min: 1.92% max: 54.55% x̄: 10.87% x̃: 7.41%
95% mean confidence interval for instructions value: -10.73 -9.28
95% mean confidence interval for instructions %-change: -12.81% -8.94%
Instructions are helped.
total tuples in shared programs: 129992 -> 128908 (-0.83%)
tuples in affected programs: 17624 -> 16540 (-6.15%)
helped: 145
HURT: 0
helped stats (abs) min: 2.0 max: 25.0 x̄: 7.48 x̃: 7
helped stats (rel) min: 0.74% max: 42.86% x̄: 9.16% x̃: 7.22%
95% mean confidence interval for tuples value: -7.96 -6.99
95% mean confidence interval for tuples %-change: -10.52% -7.79%
Tuples are helped.
total clauses in shared programs: 27632 -> 27582 (-0.18%)
clauses in affected programs: 1077 -> 1027 (-4.64%)
helped: 44
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.14 x̃: 1
helped stats (rel) min: 2.50% max: 16.67% x̄: 4.99% x̃: 4.45%
95% mean confidence interval for clauses value: -1.26 -1.01
95% mean confidence interval for clauses %-change: -5.70% -4.27%
Clauses are helped.
total cycles in shared programs: 12323 -> 12285.63 (-0.30%)
cycles in affected programs: 618.25 -> 580.88 (-6.05%)
helped: 120
HURT: 0
helped stats (abs) min: 0.08333299999999966 max: 0.5416680000000014 x̄: 0.31 x̃: 0
helped stats (rel) min: 0.77% max: 66.67% x̄: 7.60% x̃: 7.37%
95% mean confidence interval for cycles value: -0.33 -0.29
95% mean confidence interval for cycles %-change: -8.73% -6.47%
Cycles are helped.
total arith in shared programs: 4916.75 -> 4866.88 (-1.01%)
arith in affected programs: 677.79 -> 627.92 (-7.36%)
helped: 145
HURT: 0
helped stats (abs) min: 0.08333299999999966 max: 1.0833329999999997 x̄: 0.34 x̃: 0
helped stats (rel) min: 0.77% max: 66.67% x̄: 12.81% x̃: 7.87%
95% mean confidence interval for arith value: -0.37 -0.32
95% mean confidence interval for arith %-change: -15.33% -10.29%
Arith are helped.
total quadwords in shared programs: 118117 -> 117262 (-0.72%)
quadwords in affected programs: 15283 -> 14428 (-5.59%)
helped: 143
HURT: 0
helped stats (abs) min: 1.0 max: 23.0 x̄: 5.98 x̃: 5
helped stats (rel) min: 0.44% max: 25.71% x̄: 7.56% x̃: 5.56%
95% mean confidence interval for quadwords value: -6.46 -5.50
95% mean confidence interval for quadwords %-change: -8.59% -6.53%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Thu, 24 Jun 2021 22:35:58 +0000 (18:35 -0400)]
pan/bi: Fuse LD_VAR+TEXS_2D -> VAR_TEX
When the LD_VAR is only used once as an input to a texture instruction,
this is an improvement. We handle this case as a backwards pass.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Thu, 24 Jun 2021 15:49:19 +0000 (11:49 -0400)]
pan/bi: Analyze helper invocations
Set the .skip bit on texture instructions and the terminate discarded
threads bit on the clause header based on data flow analysis of helper
invocations. This code is adapted from Midgard, which requires the same
analysis with a few details changed.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Tue, 6 Jul 2021 15:56:16 +0000 (11:56 -0400)]
pan/bi: Track LOD mode even for TEXC
Redundant with the texture operation descriptor, but we don't want to
parse that in the rest of the compiler. Handling it as a pseudo-modifier
lets us share a code path with TEXS.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Thu, 24 Jun 2021 15:04:25 +0000 (11:04 -0400)]
pan/bi: Report cycle counts
Based on analysis of results from the Mali Offline Compiler. I am
uncertain how well these translate to real life, and they are
normalized counts only...
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Wed, 16 Jun 2021 20:13:56 +0000 (16:13 -0400)]
pan/bi: Only spill nodes that could progress in RA
This reduces the number of spills and hence compile-time by avoiding
pointless decisions. In a terrain shader forced to use full threads:
Before: 39:168 spills:fills
After: 23:127 spills:fills
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Wed, 16 Jun 2021 18:35:43 +0000 (14:35 -0400)]
pan/bi: Try to hit full occupancy on v7
Bifrost v7 trades off register pressure and occupancy. If we restrict to
[R0, R15] U [R48, R63], we get full occupancy, but if we use the full
register file, we only get half occupancy. Try to allocate just 32
registers, and only use the full 64 registers if that would spill.
Clever heuristics could make this both more effective (live range
splitting, shuffling, spilling if deemed acceptable) and cheaper at
compile-time (tracking maximum liveness to determine if it's possible to
hit at all). For now, this should suffice.
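The "try 32 registers, then fall back to 64" strategy can be sketched as below. Only the register ranges come from the message above; the callback type, `pick_register_budget`, and `fits_in_budget` are hypothetical stand-ins for a real allocator.

```c
#include <assert.h>
#include <stdbool.h>

/* Per the commit message: R0-R15 and R48-R63 keep full occupancy. */
static bool
reg_allows_full_occupancy(unsigned reg)
{
   assert(reg < 64);
   return reg <= 15 || reg >= 48;
}

/* Hypothetical allocator hook: returns true if allocation succeeded without
 * spilling when limited to `budget` registers. */
typedef bool (*try_alloc_fn)(unsigned budget, void *data);

/* Try the 32-register budget first, then fall back to all 64.
 * Returns the budget that worked, or 0 if even 64 registers would spill. */
static unsigned
pick_register_budget(try_alloc_fn try_alloc, void *data)
{
   static const unsigned budgets[] = { 32, 64 };

   for (unsigned i = 0; i < 2; ++i) {
      if (try_alloc(budgets[i], data))
         return budgets[i];
   }

   return 0;
}

/* Stand-in for a real allocator: succeeds when the shader's peak register
 * demand (passed through `data`) fits in the budget. */
static bool
fits_in_budget(unsigned budget, void *data)
{
   const unsigned *demand = data;
   return *demand <= budget;
}
```

Running the allocator twice in the fallback case is the compile-time cost the message mentions.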
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Tue, 15 Jun 2021 21:19:38 +0000 (17:19 -0400)]
pan/bi: Pack staging_barrier for the -next- clause
Match the semantic in the compiler header.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Thu, 24 Jun 2021 18:07:43 +0000 (14:07 -0400)]
pan/bi: Add bi_foreach_instr_global_rev_safe helper
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Thu, 24 Jun 2021 18:07:27 +0000 (14:07 -0400)]
pan/bi: Fix skip/lod_mode aliasing with VAR_TEX
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Alyssa Rosenzweig [Tue, 15 Jun 2021 21:20:04 +0000 (17:20 -0400)]
pan/bi: Improve clause printing
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11733>
Icecream95 [Fri, 9 Jul 2021 12:31:02 +0000 (00:31 +1200)]
panfrost: Always use a fragment shader when alpha test is enabled
Fixes incorrect rendering with OpenSCAD.
Fixes: 275277a2b48 ("panfrost: Implement alpha testing natively")
Reported-by: Urja Rannikko <urjaman@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11812>
Samuel Pitoiset [Fri, 9 Jul 2021 16:41:47 +0000 (18:41 +0200)]
radv: fix applying radv_disable_dcc for DOOM 2016 again
application_name_match is a regex... and DCC was also disabled for
DOOM Eternal (because DOOMEternal matches DOOM). Fun.
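The pitfall is easy to reproduce with POSIX regular expressions, assuming the matching behaves like an unanchored extended regex search. `name_matches` is a hypothetical helper, and anchoring is only one possible fix; the commit may have solved it differently.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `pattern` (POSIX extended regex) matches anywhere in
 * `name`. Unanchored patterns therefore match substrings. */
static bool
name_matches(const char *pattern, const char *name)
{
   regex_t re;

   if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
      return false;

   bool matched = regexec(&re, name, 0, NULL, 0) == 0;
   regfree(&re);
   return matched;
}
```

With this helper, the pattern `DOOM` matches both `DOOM` and `DOOMEternal`, while `^DOOM$` matches only the former.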
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11805>
Yiwei Zhang [Wed, 7 Jul 2021 20:16:21 +0000 (20:16 +0000)]
egl/android: restore image creation fallback path used by virgl
For virgl backend used in ARCVM, cros buffer info query brings back
real modifier info for the host image, which cannot be resolved by the
gallium virgl backend. Thus the fallback path is used here.
This patch fixes a behavior change introduced by a prior commit.
Fixes: 5d3e64f1 ("egl: android: prepare code for adding more buffer_info getters")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11771>
Heinrich Fink [Thu, 1 Jul 2021 09:26:57 +0000 (11:26 +0200)]
llvmpipe: do not leak display target mapped ptr in cs setup
For compute shader textures that are backed by a display target, do not
leak the mapped pointer and unmap when access to the mapped resource is
not needed anymore.
Also use llvmpipe_resource_[un]map instead of calling winsys map
functions directly.
v2:
- use llvmpipe_resource_[un]map directly instead of winsys DT map
func and unneeded helper function for unmapping.
v3 (Emil Velikov):
- add comment in lp_csctx_set_sampler_views to explain
unmapping current texture early in the loop.
Signed-off-by: Heinrich Fink <hfink@snap.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11741>
Heinrich Fink [Mon, 5 Jul 2021 17:54:20 +0000 (19:54 +0200)]
llvmpipe: do not leak map of display target in fs setup
For fragment shader textures that are backed by a display target, do not
leak the mapped pointer, but unmap before unref'ing its associated
pipe_resource instances.
Also, make sure that the pointer that's mapped into a jit texture stays
valid while rasterization works on a jit context copy by mapping the
display target again during scene setup, and unmapping when finalizing
rasterization.
v2 (Daniel Stone):
- remove redundant helper function for [un]mapping DT, use
llvmpipe_resource_[un]map right away
v3 (Emil Velikov):
- add comment in lp_setup_set_fragment_sampler_views to explain
unmapping current texture early in the loop
Signed-off-by: Heinrich Fink <hfink@snap.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11741>
Heinrich Fink [Thu, 1 Jul 2021 12:39:36 +0000 (14:39 +0200)]
softpipe: unmap display target of shader sampler
Unmap display target in cleanup routine for sampler views that are using
textures backed by a display target.
v2:
- remove obsolete comment
Signed-off-by: Heinrich Fink <hfink@snap.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11741>
Heinrich Fink [Wed, 30 Jun 2021 11:28:17 +0000 (13:28 +0200)]
llvmpipe: unmap display target of shader image/sampler
Revive hooks for cleaning up shader sampler and image data when
finalizing llvmpipe_draw_vbo. In cleanup routines, for any sampler or
image that was set up with displaytarget_map, call displaytarget_unmap.
This fixes leaks of mmap calls of the underlying displaytarget
resources.
v2 (Daniel Stone):
- Use a single cleanup function for sampler/image to simplify patchset
v3 (Emil Velikov):
- use llvmpipe_resource_[un]map instead of open-coding through
winsys
v4:
- check tex/image for NULL before calling into
llvmpipe_resource_unmap (fixes dEQP crash of llvmpipe runner)
Signed-off-by: Heinrich Fink <hfink@snap.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11741>
Jason Ekstrand [Thu, 8 Jul 2021 19:36:15 +0000 (14:36 -0500)]
nir/lower_subgroups: Pad ballot values before bitcasting
Otherwise, if we cast from a uint32_t to a uint64_t, the bitcast will
fail before we pad. This happens on Intel.
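The ordering issue can be shown with plain arrays: a single 32-bit component has too few bits to be reinterpreted as a 64-bit value, so it has to be zero-padded to two components first. This is a sketch of the idea only; the helper names are hypothetical and this is not NIR builder code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Zero-pad `src` (`count` 32-bit components) into `dst`, which holds
 * `padded_count` components. */
static void
pad_components(uint32_t *dst, unsigned padded_count,
               const uint32_t *src, unsigned count)
{
   assert(count <= padded_count);
   memset(dst, 0, padded_count * sizeof(uint32_t));
   memcpy(dst, src, count * sizeof(uint32_t));
}

/* Combine pairs of 32-bit components into 64-bit ones, low word first.
 * Only valid when the total bit size already matches, i.e. `count32` is
 * even; this is why padding has to come first. */
static void
bitcast_32_to_64(uint64_t *dst, const uint32_t *src, unsigned count32)
{
   assert(count32 % 2 == 0);

   for (unsigned i = 0; i < count32 / 2; ++i)
      dst[i] = (uint64_t)src[2 * i] | ((uint64_t)src[2 * i + 1] << 32);
}
```

Calling `bitcast_32_to_64` directly on the one-component ballot would trip the assertion, mirroring the failure described above.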
Fixes: e4e79de2a420 "nir/subgroups: Support > 1 ballot components"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5045
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11786>
Jason Ekstrand [Fri, 9 Jul 2021 12:16:36 +0000 (07:16 -0500)]
android: Restore android/Android.mk
It was accidentally dropped as part of d4b482d378e3, but it's the one
Android makefile we want to keep.
Fixes: d4b482d378e3 "android: Drop the Android.mk build system"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11798>
Samuel Pitoiset [Thu, 1 Jul 2021 11:32:05 +0000 (13:32 +0200)]
ac,radv: implement the cs_regalloc_hang HW bug workaround
Might fix spurious failures on GFX6 and some GFX7 chips.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11675>
Erik Faye-Lund [Fri, 9 Jul 2021 10:16:10 +0000 (12:16 +0200)]
docs: update zink requirements
We currently require VK_EXT_line_rasterization with *all* optional
features to render all kinds of lines required. Because some (if not
all) of these can be emulated, let's make the list explicit, so it's
easy to remove items as we implement emulation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11795>
Erik Faye-Lund [Tue, 20 Apr 2021 16:28:24 +0000 (18:28 +0200)]
zink: fill in the right line-mode based on state
We need to fill in the right line-mode here based on the state to get
the correct rasterization; bresenham isn't always the right one.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11795>
Erik Faye-Lund [Mon, 28 Oct 2019 16:34:38 +0000 (17:34 +0100)]
zink: support line stippling
VK_EXT_line_rasterization allows us to specify a line-stippling pattern.
So let's do that.
While we're at it, use more bit-allocation here.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11795>
Erik Faye-Lund [Fri, 9 Jul 2021 10:23:16 +0000 (12:23 +0200)]
zink: use bit-allocation for boolean rasterizer-state
This reduces the size of the struct a bit, and we're about to add some
more bit-allocated stuff in the next commit.
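As a rough illustration of why bit-allocation shrinks a struct, compare one `bool` per flag with 1-bit fields. The field names here are hypothetical and not zink's actual rasterizer state, and bit-field layout is implementation-defined, so the size difference is what common ABIs produce rather than a guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state with at least one byte per flag. */
struct flags_plain {
   bool depth_clamp;
   bool rasterizer_discard;
   bool force_persample;
   bool pv_last;
   bool line_smooth;
   bool line_stipple;
   bool offset_point;
   bool offset_line;
};

/* Same flags as 1-bit fields, so they can share a storage unit. */
struct flags_packed {
   bool depth_clamp : 1;
   bool rasterizer_discard : 1;
   bool force_persample : 1;
   bool pv_last : 1;
   bool line_smooth : 1;
   bool line_stipple : 1;
   bool offset_point : 1;
   bool offset_line : 1;
};
```

Bit-fields cannot have their address taken, which is the usual tradeoff for the smaller footprint.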
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11795>
Erik Faye-Lund [Mon, 28 Oct 2019 15:42:48 +0000 (16:42 +0100)]
zink: hook up line-rasterization ext
The VK_EXT_line_rasterization extension allows us to specify the
correct line-rasterization rules, which is needed for correct OpenGL
rendering.
So, let's prepare for filling this one out. Right now, it does a whole
lot of nothing, but that's about to change.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11795>
Danylo Piliaiev [Wed, 7 Jul 2021 11:40:52 +0000 (14:40 +0300)]
ir3: add newly found shlg.b16 instruction
Example of blob's output:
(nop3) shlg.b16 hr8.x, (r)8, (r)hr8.x, 12
It does: (src2 << src1) | src2
src1 and src2 could be GPRs, relative GPRs, relative consts,
or immediates. However, they could not be plain const registers.
Blob does use it in conjunction with the "samgq" instruction.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11760>
Samuel Pitoiset [Fri, 9 Jul 2021 09:21:01 +0000 (11:21 +0200)]
aco: use nir_ssa_def_is_unused() to determine if atomic dest is used
Instead of duplicating this chunk everywhere.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11793>
Georg Lehmann [Thu, 8 Jul 2021 09:47:52 +0000 (11:47 +0200)]
vulkan/wsi/wayland: Add support for more SRGB formats.
This is required by the Vulkan specification:
If pSurfaceFormats includes an entry whose value for colorSpace is
VK_COLOR_SPACE_SRGB_NONLINEAR_KHR and whose value for format is a UNORM
(or SRGB) format and the corresponding SRGB (or UNORM) format is a color
renderable format for VK_IMAGE_TILING_OPTIMAL, then pSurfaceFormats must also
contain an entry with the same value for colorSpace and format equal to the
corresponding SRGB (or UNORM) format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11778>
Alejandro Piñeiro [Fri, 9 Jul 2021 09:47:58 +0000 (11:47 +0200)]
v3dv/format: expose properly that some formats are not filterable
Specifically A8B8G8R8_UINT_PACK32, A8B8G8R8_SINT_PACK32, and
A2B10G10R10_UINT_PACK32. They are based on the internal types RGBA8UI,
RGBA8I, and RGB10_A2UI, that are not filterable.
This makes several previously failing CTS tests, such as:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.3d.a8b8g8r8_uint_pack32.a8b8g8r8_uint_pack32.optimal_optimal_linear_stripes_x
skip properly instead of failing, so we also update the CI
expectations.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11794>
Michel Dänzer [Wed, 12 May 2021 14:55:00 +0000 (16:55 +0200)]
ci: Add Fedora release build job
The intention is for this to more or less match the Fedora package
build. The main benefit right now is GCC 11 build test coverage.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
Michel Dänzer [Wed, 12 May 2021 14:42:37 +0000 (16:42 +0200)]
ci: Add Fedora 34 based x86 build docker image
v2:
* Do not install weak dependencies in Fedora docker image.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
Michel Dänzer [Tue, 15 Jun 2021 16:23:33 +0000 (18:23 +0200)]
ci: Rename Debian based build jobs from meson-* to debian-*
meson has been the only build system in tree for some time, so the
meson- prefix was a bit meaningless.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
Michel Dänzer [Tue, 15 Jun 2021 15:42:22 +0000 (17:42 +0200)]
ci: Add debian/ prefix to job names for Debian based docker images
And move the image build scripts to a subdirectory correspondingly.
Preparation for adding images based on other OSs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
Michel Dänzer [Fri, 14 May 2021 13:59:33 +0000 (15:59 +0200)]
turnip: Mark local variable ASSERTED
It's only used in assert. Avoids compiler warning/error with assertions disabled:
../src/freedreno/vulkan/tu_cs.h: In function 'tu_cs_reserve':
../src/freedreno/vulkan/tu_cs.h:208:13: error: unused variable 'result' [-Werror=unused-variable]
208 | VkResult result = tu_cs_reserve_space(cs, reserved_size);
| ^~~~~~
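The pattern behind the warning and its fix can be sketched as follows. `MY_ASSERTED` is a hypothetical stand-in written for this example (Mesa's own ASSERTED macro may be defined differently), and it relies on the GCC/Clang `unused` attribute; `reserve_space` is likewise made up.

```c
#include <assert.h>

/* When assertions are compiled out (NDEBUG), mark the variable unused so
 * -Wunused-variable stays quiet; otherwise expand to nothing. */
#ifdef NDEBUG
#define MY_ASSERTED __attribute__((unused))
#else
#define MY_ASSERTED
#endif

/* Stand-in for a call whose result is only checked in debug builds. */
static int
reserve_space(unsigned size)
{
   return size <= 4096 ? 0 : -1;
}

static void
reserve_checked(unsigned size)
{
   /* Without the annotation, `result` is unused once assert() expands to
    * nothing, which is an error under -Werror=unused-variable. */
   MY_ASSERTED int result = reserve_space(size);
   assert(result == 0);
}
```

The call itself still runs in release builds; only the check disappears.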
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11412>
Pierre-Eric Pelloux-Prayer [Fri, 18 Jun 2021 16:10:05 +0000 (18:10 +0200)]
dlist: skip NOP command at the head of a list
If we build a dlist starting with a NOP (for alignment purpose),
we don't have to execute the NOP.
Instead shift the start value by one.
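A minimal sketch of the idea, using a toy opcode stream: if the first slot is a NOP inserted for alignment, execution can begin one slot later. The types and names below are hypothetical and do not reflect the real display list structures.

```c
#include <assert.h>
#include <stddef.h>

enum toy_opcode { TOY_OP_NOP, TOY_OP_DRAW, TOY_OP_END };

/* Hypothetical list header: `start` is the index of the first instruction
 * to execute inside `code`. */
struct toy_list {
   const enum toy_opcode *code;
   size_t start;
};

/* If the list begins with an alignment NOP, begin execution one slot later. */
static void
toy_skip_leading_nop(struct toy_list *list)
{
   if (list->code[list->start] == TOY_OP_NOP)
      list->start += 1;
}

/* Counts the instructions visited before TOY_OP_END. */
static size_t
toy_execute(const struct toy_list *list)
{
   size_t visited = 0;

   for (size_t i = list->start; list->code[i] != TOY_OP_END; ++i)
      ++visited;

   return visited;
}
```

The adjustment happens once when the list is built, so every later execution saves a dispatch.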
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Fri, 18 Jun 2021 13:11:54 +0000 (15:11 +0200)]
dlist: remove unused _mesa_dlist_alloc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Fri, 18 Jun 2021 13:10:38 +0000 (15:10 +0200)]
dlist: remove _mesa_dlist_alloc_aligned
It was only used in _mesa_dlist_alloc_vertex_list, so inline it there
instead.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Tue, 15 Jun 2021 13:42:59 +0000 (15:42 +0200)]
dlist: store all dlist in a continuous memory block
This reduces cache misses in execute_list for apps using lots of small
dlists, like viewperf.
This is only done for small dlists (those fitting in one block) because
doing this for larger ones wouldn't bring any benefit.
For instance, in vp13/snx test 10, the share of cache-miss events in
_mesa_glthread_execute_list/execute_list goes down from 17%/10% to 4%/3%.
If "struct gl_display_list" were stored in an array, this would also
remove a source of cache misses, since currently they're malloc-ed
individually.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Wed, 16 Jun 2021 15:44:29 +0000 (17:44 +0200)]
dlist: increment/check list nesting when handling OPCODE_CALL_LIST(S)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Wed, 16 Jun 2021 08:57:19 +0000 (10:57 +0200)]
dlist: use a new OPCODE to avoid loading cold data
Also add a 'bool copy_to_current' param to vbo_save_playback_vertex_list:
this way we can decide if we need to call playback_copy_to_current without
loading any cold data.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Wed, 16 Jun 2021 08:30:11 +0000 (10:30 +0200)]
dlist: use a separate opcode for vbo replay using loopback
Remember if the current list needs to fall back to loopback,
and patch the list in glEndList.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Mon, 14 Jun 2021 18:41:25 +0000 (20:41 +0200)]
dlist: split hot/cold data from vertex_list
Store data not used in the hot-path (= vbo_save_playback_vertex_list) in a
pointer, to reduce the size of vbo_save_vertex_list.
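The hot/cold split can be sketched with two toy layouts: one that keeps rarely used data inline and one that moves it behind a pointer. All names and field choices are hypothetical; they only illustrate how the hot struct shrinks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical rarely-used data, kept out of line. */
struct toy_cold {
   char debug_name[64];
   unsigned stats[32];
};

/* Before: everything inline, so the hot path walks over a large struct. */
struct toy_node_fat {
   unsigned vertex_count;
   unsigned mode;
   struct toy_cold cold;
};

/* After: hot fields stay inline, cold fields sit behind one pointer. */
struct toy_node_slim {
   unsigned vertex_count;
   unsigned mode;
   struct toy_cold *cold;
};

static struct toy_node_slim *
toy_node_create(unsigned vertex_count, unsigned mode)
{
   struct toy_node_slim *node = calloc(1, sizeof(*node));
   if (node == NULL)
      return NULL;

   node->cold = calloc(1, sizeof(*node->cold));
   if (node->cold == NULL) {
      free(node);
      return NULL;
   }

   node->vertex_count = vertex_count;
   node->mode = mode;
   return node;
}

static void
toy_node_destroy(struct toy_node_slim *node)
{
   if (node != NULL) {
      free(node->cold);
      free(node);
   }
}
```

The cost is one extra allocation and an indirection on the cold path, which is acceptable when that path is rare.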
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Marek Olšák [Mon, 17 May 2021 20:37:15 +0000 (16:37 -0400)]
util/idalloc: add util_idalloc_alloc_range
v2: fixed infinite loop (Pierre-Eric)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Marek Olšák [Mon, 17 May 2021 19:44:50 +0000 (15:44 -0400)]
util/idalloc: add exists and foreach helpers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Marek Olšák [Tue, 18 May 2021 09:11:41 +0000 (05:11 -0400)]
util/idalloc: hide or remove unused public functions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Marek Olšák [Mon, 17 May 2021 19:43:54 +0000 (15:43 -0400)]
util/idalloc: reserving an ID that already exists should be no-op
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Marek Olšák [Mon, 17 May 2021 19:38:34 +0000 (15:38 -0400)]
util/idalloc: fold the size call into init
It's required, otherwise idalloc would fail.
v2: renamed util_idalloc_(mt_)init param initial_num_ids (Pierre-Eric)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Marek Olšák [Mon, 17 May 2021 19:59:20 +0000 (15:59 -0400)]
util/idalloc: change num_elements to units of elements instead of bits
and use memset in resize().
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Tue, 15 Jun 2021 16:34:14 +0000 (18:34 +0200)]
dlist: always use merged primitive for drawing
OpenGL 4.6 compatibility profile spec, Appendix B:
21. For any GL and framebuffer state, and for any group of GL commands and
arguments, the resulting GL and framebuffer state is identical whether the
GL commands and arguments are executed normally or from a display list.
The only exception to this corollary is for built-in shader variables
gl_VertexID and gl_PrimitiveID, which are not defined when drawing
geometry within a display list.
(thanks Ian Romanick for pointing this out in piglit !419 MR)
Remove the code introduced in ebb228bec52a to determine if merged draws can be used.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Tue, 15 Jun 2021 13:43:57 +0000 (15:43 +0200)]
dlist: use a union instead of allocating a 1-sized array
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Fri, 18 Jun 2021 11:08:49 +0000 (13:08 +0200)]
dlist: unindent code
Use a goto instead of wrapping the main part of the function
in an if () {}
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Tue, 15 Jun 2021 13:52:14 +0000 (15:52 +0200)]
dlist: remove InstSize
Instead store the instruction size alongside the opcode.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Tue, 15 Jun 2021 10:14:22 +0000 (12:14 +0200)]
dlist: remove OPCODE_EXT_0
This should have been removed in bb108bdec73 ("dlist: remove ListExt feature")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Mon, 14 Jun 2021 19:18:50 +0000 (21:18 +0200)]
dlist: prelock ctx->Shared->DisplayList before execute_list
Together with the glCallList change, this transforms this sequence:
lock - execute - unlock - lock - execute - unlock - ...
into this sequence:
lock - execute - execute - execute - ... - unlock
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Mon, 14 Jun 2021 18:43:50 +0000 (20:43 +0200)]
dlist: add locked param to _mesa_lookup_list
This allows taking the lock once, reducing the CPU overhead of
locking/unlocking multiple times when executing multiple lists.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
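The "locked" parameter pattern from the two dlist commits above can be sketched as follows. This is a minimal illustration, not Mesa's code: the sketch_* names are hypothetical, and the mutex is replaced by a counter so the number of lock operations is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared display-list table and its lock. */
struct sketch_shared {
   int lock_depth;       /* 1 while "locked", 0 otherwise */
   unsigned lock_ops;    /* how many times the lock was taken */
   const char *lists[4]; /* NULL means "no such list" */
};

static void sketch_lock(struct sketch_shared *s)
{
   assert(s->lock_depth == 0);
   s->lock_depth = 1;
   s->lock_ops++;
}

static void sketch_unlock(struct sketch_shared *s)
{
   assert(s->lock_depth == 1);
   s->lock_depth = 0;
}

/* Lookup takes the lock itself unless the caller says it already holds it. */
static const char *sketch_lookup_list(struct sketch_shared *s, unsigned id,
                                      bool locked)
{
   const char *list;

   if (!locked)
      sketch_lock(s);
   list = id < 4 ? s->lists[id] : NULL;
   if (!locked)
      sketch_unlock(s);
   return list;
}

/* lock - execute - execute - ... - unlock; returns how many lists were found. */
static unsigned sketch_call_lists(struct sketch_shared *s,
                                  const unsigned *ids, unsigned n)
{
   unsigned executed = 0;

   sketch_lock(s);
   for (unsigned i = 0; i < n; i++) {
      if (sketch_lookup_list(s, ids[i], true))
         executed++;
   }
   sketch_unlock(s);
   return executed;
}
```

Executing n lists this way costs one lock/unlock pair instead of n.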
Pierre-Eric Pelloux-Prayer [Thu, 17 Jun 2021 19:49:44 +0000 (21:49 +0200)]
glthread: merge successive glCallList
When unmarshalling a glCallList command, if the next command(s) are also
glCallList, they are batched into a single glCallLists.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
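The batching of successive glCallList commands can be sketched like this. It is an illustration under assumptions: the command layout and the sketch_* names are hypothetical, and the real glthread command stream is variable-sized rather than an array of fixed structs.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_cmd_id { SKETCH_CMD_CALL_LIST, SKETCH_CMD_OTHER };

/* Hypothetical fixed-size command record. */
struct sketch_cmd {
   enum sketch_cmd_id id;
   unsigned list; /* only meaningful for SKETCH_CMD_CALL_LIST */
};

/*
 * Collect the run of consecutive CALL_LIST commands starting at cmds[0].
 * The list IDs are written to out[] (at most out_cap of them) so that they
 * can be issued as one glCallLists-style call. Returns the number of
 * commands consumed, which tells the caller where to resume.
 */
static size_t sketch_merge_call_lists(const struct sketch_cmd *cmds,
                                      size_t count,
                                      unsigned *out, size_t out_cap)
{
   size_t n = 0;

   assert(cmds && out);
   while (n < count && n < out_cap && cmds[n].id == SKETCH_CMD_CALL_LIST) {
      out[n] = cmds[n].list;
      n++;
   }
   return n;
}
```

Returning the consumed count mirrors why the unmarshal functions in this series report how much of the batch they consumed and learn where the batch ends.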
Pierre-Eric Pelloux-Prayer [Mon, 30 Nov 2020 16:32:06 +0000 (17:32 +0100)]
glthread: use custom marshal/unmarshal for CallList
Will be used in the next commit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Thu, 17 Jun 2021 20:14:13 +0000 (22:14 +0200)]
glthread: return consumed bytes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Pierre-Eric Pelloux-Prayer [Thu, 17 Jun 2021 19:41:33 +0000 (21:41 +0200)]
glthread: add a last parameter to unmarshal functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11493>
Connor Abbott [Thu, 8 Jul 2021 10:39:22 +0000 (12:39 +0200)]
ir3/nir: Lower indirect references of compact variables
Fixes Sascha Willems "tessellation" demo on Turnip (it contains
indirect dereference of tessellation levels).
Fixes: 643f2cb ("ir3, tu: Cleanup indirect i/o lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11781>
Andrii Simiklit [Thu, 8 Jul 2021 09:11:32 +0000 (12:11 +0300)]
Remove redundant assignment
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4957
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11780>
Samuel Pitoiset [Mon, 5 Jul 2021 06:27:04 +0000 (08:27 +0200)]
radv: fix applying radv_disable_dcc for DOOM and Wolfenstein II
Mismatch between executable and application names.
Fixes: 28e1b02a6f1 ("radv: disable DCC for DOOM 2016 and Wolfenstein II")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5024
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11708>
Yiwei Zhang [Thu, 8 Jul 2021 19:39:58 +0000 (19:39 +0000)]
egl/android: only apply front rendering usage in shared buffer mode
When EGL_KHR_mutable_render_buffer extension is enabled, advertised
configs unconditionally include EGL_MUTABLE_RENDER_BUFFER_BIT_KHR bit.
However, f61337b5 starts requesting front rendering usage bit when
EGL_MUTABLE_RENDER_BUFFER_BIT_KHR is seen on the SurfaceType, which
essentially forces linear usage on all winsys BOs for gallium dri and
i965 drivers on Android when cros gralloc is in use.
This patch dynamically appends or strips the front rendering usage bit
depending on whether EGL_RENDER_BUFFER is EGL_SINGLE_BUFFER or
EGL_BACK_BUFFER. The next dequeueBuffer call will switch the buffer
sharing mode while re-allocating winsys BOs given the updated gralloc
usage bits if necessary.
v2: handle ANativeWindow_setUsage on error
Fixes: f61337b5 ("egl/android: check front rendering support for cros gralloc")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Rob Clark <robdclark@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11787>
Chia-I Wu [Thu, 8 Jul 2021 21:08:55 +0000 (14:08 -0700)]
venus: fix empty submits with BOs
Empty submits with BOs (!batch_count && bo_count) were incorrectly
skipped.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11791>
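The condition described in the venus commit above can be sketched as a predicate. The struct and function names are hypothetical, not venus code; the point is only that a submit is skippable when it has neither batches nor BOs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a queue submission. */
struct sketch_submit {
   uint32_t batch_count;
   uint32_t bo_count;
};

/* A submit may be skipped only when there is nothing at all to send. */
static bool sketch_submit_is_noop(const struct sketch_submit *s)
{
   assert(s);
   return s->batch_count == 0 && s->bo_count == 0;
}
```

Checking only batch_count would also skip the (!batch_count && bo_count) case, which is the behaviour the commit message calls incorrect.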
Alyssa Rosenzweig [Wed, 7 Jul 2021 22:15:48 +0000 (18:15 -0400)]
docs: Update relnotes for panfrost/asahi
Big changes of the branch point.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11773>
Jason Ekstrand [Mon, 7 Jun 2021 16:40:11 +0000 (11:40 -0500)]
android: Drop the Android.mk build system
Android.mk files haven't really been supported by Mesa devs for a long
time. Most of us have been willing to update Makefile.sources if we
remember and sometimes we try to blind code some Android.mk for a new
generator. However, the reality is that it breaks regularly and ends up
being maintained by the Android community. To address this problem
another approach was implemented in !10183 utilizing the maintained
meson build system. The old Android.mk files are no longer required.
This commit was created with the following commands:
git rm **/Android.mk
git rm **/Android.*.mk
git rm **/Makefile.sources
git rm CleanSpec.mk
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4487
Acked-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9728>
Marek Olšák [Thu, 8 Jul 2021 03:36:32 +0000 (23:36 -0400)]
radeonsi: enable uniform inlining by default
I think there is no reason to keep this disabled because it improves
viewperf and it might improve other things.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Wed, 7 Jul 2021 20:37:30 +0000 (16:37 -0400)]
ac,radeonsi: move late alloc computation into common code and shader states
This also fixes a rare deadlock when a scratch buffer is used.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Wed, 7 Jul 2021 19:25:09 +0000 (15:25 -0400)]
radeonsi: move an incorrectly placed comment about late alloc
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Wed, 7 Jul 2021 03:06:43 +0000 (23:06 -0400)]
radeonsi,radv: fix a late alloc deadlock with <= 6 CUs per SA
We should always prevent 1 CU from executing VS and GS waves
to prevent a deadlock.
Fixes: c377f45c1833052 "radeonsi/gfx10: rewrite late alloc computation"
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Tue, 6 Jul 2021 17:15:12 +0000 (13:15 -0400)]
ac/surface/tests: fix the ARM build
Fixes: 8771d45a "ac/surface/tests: fix a random segfault in the modifier test"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4655
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Mon, 28 Jun 2021 17:42:26 +0000 (13:42 -0400)]
radeonsi: rewrite a confusing comment in si_upload_and_prefetch_VB_descriptors
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Wed, 7 Jul 2021 03:01:01 +0000 (23:01 -0400)]
ac/llvm: rework how negative W affects culling to not call accept_func twice
Always execute the bbox code regardless of negative W, and then simply
use || to discard the result if any W is negative. This is expected to be
rare. (it only happens when a primitive intersects the near plane)
This allows us to eliminate the else statement, which is no longer
executed for accepted primitives with negative W, which are the only
primitives that needed the else branch.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
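The control-flow change in the culling commit above can be sketched in scalar C. This is an assumption-laden illustration: the real code emits LLVM IR, and the sketch_* names and the simple view-rectangle test are hypothetical stand-ins for the bbox code.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_vec4 { float x, y, z, w; };

static float sketch_min3(float a, float b, float c)
{
   float m = a < b ? a : b;
   return m < c ? m : c;
}

static float sketch_max3(float a, float b, float c)
{
   float m = a > b ? a : b;
   return m > c ? m : c;
}

/* Bounding-box test after the perspective divide: true if the box overlaps
 * the [-1, 1] x [-1, 1] view rectangle. The result is only meaningful when
 * every W is positive. */
static bool sketch_bbox_visible(const struct sketch_vec4 v[3])
{
   float x[3], y[3];

   for (int i = 0; i < 3; i++) {
      x[i] = v[i].x / v[i].w;
      y[i] = v[i].y / v[i].w;
   }
   return sketch_max3(x[0], x[1], x[2]) >= -1.0f &&
          sketch_min3(x[0], x[1], x[2]) <= 1.0f &&
          sketch_max3(y[0], y[1], y[2]) >= -1.0f &&
          sketch_min3(y[0], y[1], y[2]) <= 1.0f;
}

/* Always run the bbox test, then use || so that a negative W overrides its
 * result; there is no separate else branch for the negative-W case. */
static bool sketch_accept_triangle(const struct sketch_vec4 v[3])
{
   bool any_w_negative = v[0].w < 0.0f || v[1].w < 0.0f || v[2].w < 0.0f;
   bool bbox_visible = sketch_bbox_visible(v);

   return any_w_negative || bbox_visible;
}
```

The bbox work is wasted when a W is negative, but the commit message expects that case to be rare.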
Marek Olšák [Wed, 7 Jul 2021 02:30:32 +0000 (22:30 -0400)]
ac/llvm: don't return a status from ac_cull_triangle because it's unused
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Marek Olšák [Tue, 6 Jul 2021 17:18:00 +0000 (13:18 -0400)]
radeonsi: drop smoothing quality to 4xAA for better performance
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
Jason Ekstrand [Thu, 11 Jun 2020 23:23:07 +0000 (18:23 -0500)]
nir: Drop nir_ssa_def::name and nir_register::name
We say that they're for debug only but we don't really have a good
policy around when to set them and when not to. In particular,
nir_lower_system_values and nir_lower_vars_to_ssa which are the chief
producers of SSA values which might reasonably have a name do not bother
to set one. We have some names set from things like BLORP and RADV's
meta shaders but AFAICT, they're setting a name more because it's there
than because they actually care.
Also, most things other than nir_clone and nir_serialize don't bother to
try and preserve them. You can see in the diffstat of this commit
exactly what passes attempt to preserve names. Notably missing from the
list is opt_algebraic which is the single largest source of SSA def
churn and it happily throws names away.
These observations lead me to question whether or not names are actually
useful at all or if they're just taking up space (8B per instruction)
and wasting CPU cycles (to ralloc_strdup on the off chance we do have
one). I don't think I can think of a single time in recent history
where I've been debugging a shader issue and a SSA value name has been
there and been useful. If anything, the few times they are there, they
just throw me off because they mess up the indentation in nir_print.
iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5)
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>
Chia-I Wu [Wed, 7 Jul 2021 22:01:13 +0000 (15:01 -0700)]
vulkan/wsi: fix select_memory_type when all MTs are local
The intention is to pick the system memory for the prime blit dst, but
that is not possible when all memory types are advertised to be local.
This fixes venus over vtest (i.e., unix socket) because the driver
provides no PCI bus info and wsi_device_matches_drm_fd returns false. A
driver might also use can_present_on_device to force prime blit.
Fixes: 469875596a6 ("vulkan/wsi: Fix prime blits to use system memory for the destination")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11774>
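One plausible shape for a memory-type selection that still works when every type is advertised as local is sketched below. It is an assumption, not the actual wsi code: the flag value, struct and function names are hypothetical, and the fallback pass is a guess at how the "all MTs are local" case could be handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MEMORY_DEVICE_LOCAL 0x1u /* hypothetical property flag */

struct sketch_mem_type {
   uint32_t flags;
};

/*
 * Pick a memory type allowed by type_bits. Prefer one whose locality matches
 * want_local; if none matches (e.g. every type is local but system memory
 * was wanted), fall back to the first allowed type instead of failing.
 * Returns the type index, or -1 if type_bits allows nothing.
 */
static int sketch_select_memory_type(const struct sketch_mem_type *types,
                                     uint32_t count, uint32_t type_bits,
                                     bool want_local)
{
   assert(types && count <= 32);

   for (uint32_t i = 0; i < count; i++) {
      bool local = (types[i].flags & SKETCH_MEMORY_DEVICE_LOCAL) != 0;
      if ((type_bits & (1u << i)) && local == want_local)
         return (int)i;
   }
   for (uint32_t i = 0; i < count; i++) {
      if (type_bits & (1u << i))
         return (int)i;
   }
   return -1;
}
```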
Connor Abbott [Mon, 14 Sep 2020 10:30:12 +0000 (12:30 +0200)]
tu: Update subgroup properties
Everything should be in place for this to actually work. Support a size
of 128, unlike the blob. I've also plumbed through ballot support, so
enable that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Mon, 28 Jun 2021 11:22:59 +0000 (13:22 +0200)]
ir3/legalize: Fix loop convergence behavior
This prevents the previous commit from being undone by the jump
optimizations in legalize, and fixes another potential case where
instead of a continue we have an if/else at the end of a loop.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Mon, 28 Jun 2021 10:56:15 +0000 (12:56 +0200)]
ir3: Fix convergence behavior for loops with continues
When loops have continue statements, it's expected that when we execute
a divergent continue (i.e. a continue where not all of the threads
active at the start take it) we keep going with the rest of the loop
body and then reconverge at the start of the next iteration. However the
Adreno ISA seems to always take a branch that jumps backwards, assuming
it's the bottom of a loop, so we get a different, undesired convergence
behavior. There's no way I know of to control this behavior in the
instruction set, so we have to instead insert a "continue block" at the
end of the loop where continue statements reconverge which then jumps
back to the top of the loop. Since this doesn't correspond 1:1 with any
NIR block we have to make control flow handling in NIR->ir3 a bit more
complicated, unfortunately.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Mon, 31 May 2021 10:58:26 +0000 (12:58 +0200)]
ir3: Implement nir subgroup intrinsics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Mon, 31 May 2021 10:21:29 +0000 (12:21 +0200)]
ir3: Handle shared registers in lower_parallelcopy
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Mon, 31 May 2021 10:09:42 +0000 (12:09 +0200)]
ir3: Add subgroup pseudoinstructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Fri, 11 Sep 2020 11:17:40 +0000 (13:17 +0200)]
ir3: Support any/all/getone branches
This plumbs through the support in the IR.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Tue, 1 Sep 2020 13:22:14 +0000 (15:22 +0200)]
ir3: Cleanup ir3_legalize jump optimization
Do the optimization parts in their own loop, and be more robust when
detecting the useless jumps.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Fri, 28 May 2021 15:31:48 +0000 (17:31 +0200)]
ir3/sched: Handle branch condition in split_pred()
Before this, if there was a block with multiple things writing p0.x,
it was a tossup whether the right one would be used as the branch
condition. Found by inspection.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Mon, 28 Jun 2021 16:41:41 +0000 (18:41 +0200)]
ir3: Fix infinite loop in scheduler when splitting
When we go to split e.g. a p0.x producer, the only other instructions
ready to schedule are often only p0.x producers. It could happen that
they all have a lower priority than the split instruction. Then we would
immediately schedule the split instruction again, then again try to
schedule one of the other producers, be blocked, and split it, around
and around again, leading to an infinite loop. The following commit
triggered this with
dEQP-GLES3.functional.shaders.discard.dynamic_loop_always on a3xx.
Fixes: d2f4d33 ("freedreno/ir3: new pre-RA scheduler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
Connor Abbott [Fri, 28 May 2021 14:03:16 +0000 (16:03 +0200)]
ir3: Make MOVMSK use repeat
MOVMSK is a bit of a special case, because it takes multiple cycles (and
therefore reduces the nops needed if it's between some other assigner
and consumer) however weird things happen if you try to start reading
the first component while it isn't finished yet. On balance making it
use repeat seems to result in fewer special cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>