platform/upstream/mesa.git
Icecream95 [Fri, 31 Dec 2021 13:15:21 +0000 (02:15 +1300)]
panfrost: Fix ubo_mask calculation

BITSET_MASK returns ~0 when given an input of zero, when we need it to
return 0 instead.

Fixes shaders with only sysvals but no UBOs when push constants are
disabled.

This breaks when 31 or 32 UBOs are used, but PAN_MAX_CONST_BUFFERS is
currently set to 16.
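
A minimal sketch of the pitfall, not the driver's actual code. `wrapping_mask` is a hypothetical stand-in that mimics the behaviour described above (all ones for an input of zero), and `count_mask` is one assumed way to get an empty mask for a count of zero. The plain shift form has its own limit near the word size, which is harmless while the buffer limit stays at 16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in mimicking the behaviour described above: the
 * bit count wraps modulo the word size, so an input of 0 yields ~0. */
static inline uint32_t wrapping_mask(unsigned n)
{
    return (n % 32 == 0) ? ~0u : (1u << (n % 32)) - 1u;
}

/* Assumed replacement: the low n bits set, and 0 for n == 0.
 * Only valid for n < 32, because shifting a 32-bit value by 32 is
 * undefined in C. */
static inline uint32_t count_mask(unsigned n)
{
    assert(n < 32);
    return (1u << n) - 1u;
}
```

With no UBOs bound, `count_mask(0)` gives an empty mask, whereas `wrapping_mask(0)` would mark every slot as in use.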

Fixes: c246af0dd80 ("panfrost: Only upload UBOs when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Sun, 20 Feb 2022 09:45:02 +0000 (22:45 +1300)]
panfrost: Improve comment for emit_fragment_job

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Wed, 23 Feb 2022 10:16:35 +0000 (23:16 +1300)]
pan/bi: Add documentation for bifrost_nir_lower_store_component

Taken from the commit that introduced the function,
95458c40330 ("pan/bi: Lower stores with component != 0").

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Thu, 27 Jan 2022 04:46:54 +0000 (17:46 +1300)]
pan/bi: Make disassembler build reproducibly

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Tue, 15 Feb 2022 07:34:57 +0000 (20:34 +1300)]
panfrost: Re-emit descriptors after resource shadowing

This could be made slightly more efficient by only setting the dirty
state that is needed, but eventually you reach a point where it's
cheaper to re-emit everything than work out what can or can't be kept.

Fixes rendering issues in Duckstation.

Fixes: cd2c1ef9da6 ("panfrost: Dirty track textures/samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Tue, 15 Feb 2022 07:24:01 +0000 (20:24 +1300)]
panfrost: Set dirty state in set_shader_buffers

Otherwise the pointer (which is uploaded as a sysval) won't be updated
when a new SSBO is bound.

Fixes: c34b760b9f9 ("panfrost: Dirty track constant buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Mon, 14 Feb 2022 03:26:59 +0000 (16:26 +1300)]
pan/bi: Check dependencies of both destinations of instructions

TEXC can have two destinations, and the value of neither can be used in
the same bundle, so extend the existing check to iterate over both
destinations.
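
A rough sketch of the idea, assuming a simplified instruction layout. The `struct instr` type and `reads_any_dest` helper are hypothetical and do not correspond to the compiler's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DESTS 2
#define MAX_SRCS  4

/* Hypothetical, simplified instruction: register indices, -1 if unused. */
struct instr {
    int dest[MAX_DESTS];
    int src[MAX_SRCS];
};

/* True if `user` reads any destination written by `producer`.
 * Checking only dest[0] would miss the second destination. */
static inline bool reads_any_dest(const struct instr *producer,
                                  const struct instr *user)
{
    assert(producer && user);
    for (size_t d = 0; d < MAX_DESTS; ++d) {
        if (producer->dest[d] < 0)
            continue;
        for (size_t s = 0; s < MAX_SRCS; ++s) {
            if (user->src[s] == producer->dest[d])
                return true;
        }
    }
    return false;
}
```

An instruction for which this returns true would then be kept out of the producer's bundle.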

Fixes artefacts in the game "LIMBO".

Fixes: a303076c1ab ("pan/bi: Add bi_instr_schedulable predicate")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Mon, 14 Feb 2022 03:20:47 +0000 (16:20 +1300)]
pan/bi: Add interference between destinations

Trying to write to overlapping register ranges from a single
instruction is undefined behaviour, so add interference between the
nodes to avoid this.
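
As an illustration only, assuming an interference graph stored as a symmetric matrix (the `struct interference` type and both helpers are hypothetical names), marking every pair of destinations of one instruction as interfering could look like this:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 64

/* Hypothetical interference graph stored as a symmetric matrix. */
struct interference {
    bool edge[MAX_NODES][MAX_NODES];
};

static inline void add_interference(struct interference *g,
                                    unsigned a, unsigned b)
{
    assert(a < MAX_NODES && b < MAX_NODES);
    if (a == b)
        return; /* a node never interferes with itself */
    g->edge[a][b] = true;
    g->edge[b][a] = true;
}

/* Make all destinations of a single instruction interfere pairwise,
 * so the allocator cannot give them overlapping registers. */
static inline void interfere_dests(struct interference *g,
                                   const unsigned *dests, unsigned count)
{
    for (unsigned i = 0; i < count; ++i)
        for (unsigned j = i + 1; j < count; ++j)
            add_interference(g, dests[i], dests[j]);
}
```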

Hit in a dual-texture instruction in LIMBO.

Fixes: 9146bafbb42 ("pan/bi: Add dual texture fusing pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Tue, 18 Jan 2022 03:02:12 +0000 (16:02 +1300)]
panfrost: Disable point size upper limit clamping

The hardware already clamps this, so there is no need to do it in the
shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Tue, 18 Jan 2022 02:48:17 +0000 (15:48 +1300)]
panfrost: Update point size limits to match hardware behaviour

Found while reverse-engineering the tiler heap format.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Tue, 18 Jan 2022 02:16:49 +0000 (15:16 +1300)]
panfrost: Set PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

Fixes arb-provoking-vertex-render Piglit test.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Icecream95 [Sat, 19 Jun 2021 03:01:15 +0000 (15:01 +1200)]
pan/mdg: Use util_logbase2 instead of C99 log2

log2 operates on doubles; we only need the integer util/ function.
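
For illustration, an integer-only base-2 logarithm can be written without any floating point. The helpers below are hypothetical stand-ins and are not the util/ implementation.

```c
#include <assert.h>

/* Hypothetical integer helper: floor(log2(n)) for n > 0, and 0 for n == 0. */
static inline unsigned ilog2_floor(unsigned n)
{
    unsigned result = 0;
    while (n >>= 1)
        ++result;
    return result;
}

/* For power-of-two inputs the result is exact; assert that assumption. */
static inline unsigned ilog2_pot(unsigned n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return ilog2_floor(n);
}
```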

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

Ilia Mirkin [Sun, 14 Nov 2021 18:01:57 +0000 (13:01 -0500)]
a4xx: add emission of compute state, and compute dispatch

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

Ilia Mirkin [Sun, 14 Nov 2021 17:59:40 +0000 (12:59 -0500)]
a4xx: add logic to emit image/ssbo state

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

Ilia Mirkin [Sun, 14 Nov 2021 18:05:07 +0000 (13:05 -0500)]
freedreno/ir3: support a4xx compute differences

Mainly the workgroup id comes injected via consts by the hardware (or
CP), and we must make room for it, otherwise the driver won't know where
to put it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

Ilia Mirkin [Sun, 14 Nov 2021 18:04:26 +0000 (13:04 -0500)]
freedreno/ir3: support a4xx in load/store buffer/image emission

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

Rob Clark [Thu, 3 Mar 2022 23:38:22 +0000 (15:38 -0800)]
freedreno/perfetto+fdperf: Set SYSPROF param

No need to check error return and deal with older kernels.  Older
kernels won't have this param but their default behavior allows for
systemwide perfcntr collection.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236>

Rob Clark [Thu, 3 Mar 2022 23:34:36 +0000 (15:34 -0800)]
freedreno/drm: Add SYSPROF param

Add new param for putting kernel in system-profiling mode and add
corresponding fd_pipe_set_param() mechanism.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236>

Rob Clark [Thu, 3 Mar 2022 23:17:13 +0000 (15:17 -0800)]
freedreno: Update uapi header

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236>

Rob Clark [Fri, 25 Feb 2022 17:08:34 +0000 (09:08 -0800)]
egl+libsync: Add helper to complain about invalid fence fd's

Debugging fd lifetime issues can be hard.  Add a helper for debug builds
to print out an error if an fd is not a fence fd, and sprinkle it around.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>

Rob Clark [Sat, 19 Feb 2022 16:38:26 +0000 (08:38 -0800)]
android: Push in-fence-fd down to driver

Rather than immediately stall on the CPU in SwapBuffers() if the
in-fence for the dequeued buffer is not yet signaled, push it down
to the driver.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6048
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>

Rob Clark [Sat, 19 Feb 2022 16:36:43 +0000 (08:36 -0800)]
gallium/dri: Extend image extension to support in-fence

Extend dri so that an in-fence-fd can be plumbed through to driver.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>

Samuel Pitoiset [Thu, 3 Mar 2022 19:26:54 +0000 (20:26 +0100)]
radv/ci: update list of expected failures

Add dEQP-VK.glsl.builtin.precision_double.determinant.compute.mat3
which fails on all generations.

It looks like CTS should relax tolerance slightly.

Co-authored-by: Charlie Turner <cturner@igalia.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>

Samuel Pitoiset [Fri, 4 Mar 2022 15:20:25 +0000 (16:20 +0100)]
radv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask

They randomly hang on Navi10 and randomly fail on Sienna Cichlid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>

Juan A. Suarez Romero [Thu, 3 Mar 2022 11:09:32 +0000 (12:09 +0100)]
v3d: rebind sampler view if resource changed the BO

When discarding the whole resource to create a new one, if this resource
is used by a sampler view, a rebind must be done to use the new
resource.

But this must be done when setting the sampler views, because we don't
have access to those samplers before.

v2:
 - Pack shader state on setting sampler views (Iago)
 - Use a serial ID to know when to rebind sampler views (Juan)

v3:
 - Move check to caller (Iago)
 - Keep rebind sampler view on BO change (Iago)

v4:
 - Rename "serial_bo" to "serial_id" (Iago)
 - Add comments (Iago)
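
The serial ID scheme mentioned in the revision notes can be sketched roughly as follows. All types and function names here are hypothetical simplifications, not the driver's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified resource: the serial is bumped whenever the
 * backing BO is replaced. */
struct resource {
    uint32_t serial_id;
    int bo;
};

/* Hypothetical sampler view caching the BO and the serial it was built from. */
struct sampler_view {
    const struct resource *res;
    uint32_t serial_id;
    int bo;
};

static inline void resource_replace_bo(struct resource *r, int new_bo)
{
    r->bo = new_bo;
    r->serial_id++;
}

/* Called when sampler views are set: rebinds if the resource changed
 * underneath the view. Returns true if a rebind happened. */
static inline bool sampler_view_validate(struct sampler_view *v)
{
    assert(v && v->res);
    if (v->serial_id == v->res->serial_id)
        return false;
    v->bo = v->res->bo;
    v->serial_id = v->res->serial_id;
    return true;
}
```

Comparing a counter keeps the check cheap and avoids tracking which views reference a resource at the time it is discarded.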

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6027
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15171>

Alyssa Rosenzweig [Thu, 3 Mar 2022 22:39:02 +0000 (17:39 -0500)]
panfrost: Push twice as many uniforms

The limit for Bifrost is twice as high as previously thought -- the limit is 64
*slots* of FAU, not 64 words. Each slot is 2 words. We can push twice as much,
saving a considerable number of cycles in some cases.
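
The arithmetic behind the new limit, as a small sketch. The constant and function names are hypothetical; only the numbers 64 and 2 come from the description above.

```c
#include <assert.h>

/* Numbers taken from the description above; names are illustrative. */
enum { FAU_SLOTS = 64, WORDS_PER_SLOT = 2 };

/* Number of words that fit in a given number of FAU slots. */
static inline unsigned fau_words(unsigned slots)
{
    assert(slots <= FAU_SLOTS);
    return slots * WORDS_PER_SLOT;
}

/* Clamp a requested push size to what the FAU can hold. */
static inline unsigned pushable_words(unsigned requested)
{
    unsigned limit = fau_words(FAU_SLOTS);
    return requested < limit ? requested : limit;
}
```

Under these assumptions the limit is 128 words rather than 64.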

total instructions in shared programs: 2454260 -> 2431502 (-0.93%)
instructions in affected programs: 845176 -> 822418 (-2.69%)
helped: 3376
HURT: 304
helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6
helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11%
HURT stats (abs)   min: 1.0 max: 60.0 x̄: 13.06 x̃: 8
HURT stats (rel)   min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52%
95% mean confidence interval for instructions value: -6.50 -5.87
95% mean confidence interval for instructions %-change: -3.75% -3.43%
Instructions are helped.

total tuples in shared programs: 1963383 -> 1951560 (-0.60%)
tuples in affected programs: 638622 -> 626799 (-1.85%)
helped: 2959
HURT: 573
helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4
helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12%
HURT stats (abs)   min: 1.0 max: 50.0 x̄: 8.35 x̃: 6
HURT stats (rel)   min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92%
95% mean confidence interval for tuples value: -3.61 -3.08
95% mean confidence interval for tuples %-change: -2.18% -1.85%
Tuples are helped.

total clauses in shared programs: 387817 -> 365111 (-5.85%)
clauses in affected programs: 135527 -> 112821 (-16.75%)
helped: 3489
HURT: 25
helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5
helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.56 x̃: 1
HURT stats (rel)   min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67%
95% mean confidence interval for clauses value: -6.67 -6.26
95% mean confidence interval for clauses %-change: -17.65% -16.96%
Clauses are helped.

total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%)
cycles in affected programs: 84035.50 -> 50947.33 (-39.37%)
helped: 3547
HURT: 136
helped stats (abs) min: 0.041665999999999315 max: 54.0 x̄: 9.33 x̃: 8
helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84%
HURT stats (abs)   min: 0.041665999999999315 max: 1.0 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61%
95% mean confidence interval for cycles value: -9.26 -8.71
95% mean confidence interval for cycles %-change: -35.34% -34.11%
Cycles are helped.

total arith in shared programs: 74918.46 -> 75022.62 (0.14%)
arith in affected programs: 22471.04 -> 22575.21 (0.46%)
helped: 1571
HURT: 1492
helped stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.17 x̃: 0
helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96%
HURT stats (abs)   min: 0.041665999999999315 max: 2.375 x̄: 0.25 x̃: 0
HURT stats (rel)   min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37%
95% mean confidence interval for arith value: 0.02 0.05
95% mean confidence interval for arith %-change: 1.08% 1.56%
Arith are HURT.

total ldst in shared programs: 174812 -> 137889 (-21.12%)
ldst in affected programs: 81319 -> 44396 (-45.41%)
helped: 3722
HURT: 0
helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8
helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75%
95% mean confidence interval for ldst value: -10.20 -9.64
95% mean confidence interval for ldst %-change: -47.97% -46.39%
Ldst are helped.

total quadwords in shared programs: 1757124 -> 1714130 (-2.45%)
quadwords in affected programs: 584065 -> 541071 (-7.36%)
helped: 3474
HURT: 173
helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9
helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33%
HURT stats (abs)   min: 1.0 max: 26.0 x̄: 5.76 x̃: 4
HURT stats (rel)   min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63%
95% mean confidence interval for quadwords value: -12.21 -11.37
95% mean confidence interval for quadwords %-change: -8.36% -7.95%
Quadwords are helped.

total threads in shared programs: 52898 -> 53142 (0.46%)
threads in affected programs: 262 -> 506 (93.13%)
helped: 250
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.92 0.99
95% mean confidence interval for threads %-change: 93.69% 99.28%
Threads are helped.

total spills in shared programs: 161 -> 107 (-33.54%)
spills in affected programs: 54 -> 0
helped: 27
HURT: 0

total fills in shared programs: 1386 -> 796 (-42.57%)
fills in affected programs: 590 -> 0
helped: 27
HURT: 0

Fixes: d4dccea0ba3 ("panfrost: Add UBO push data structure")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>

Alyssa Rosenzweig [Fri, 4 Mar 2022 02:04:28 +0000 (21:04 -0500)]
pan/bi: Run CSE after lowering FAU

Lowering FAU can add moves from uniforms. If a uniform is moved out to a
register multiple times in a basic block, these moves can be CSE'd, saving
instructions at the cost of register pressure.
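
A toy version of the transformation, assuming a basic block reduced to a flat list of moves from uniforms. `struct umove` and `cse_uniform_moves` are hypothetical and far simpler than a real CSE pass.

```c
#include <assert.h>

#define MAX_UNIFORMS 128

/* Hypothetical move-from-uniform: dest register = uniform[index]. */
struct umove {
    unsigned dest;
    unsigned uniform;
};

/* Keeps the first move of each uniform and drops later duplicates.
 * remap[old_dest] receives the register that now holds the value, so
 * users of a dropped move can be rewritten. The caller must size
 * remap to cover every dest value. Returns the number of moves kept. */
static inline unsigned cse_uniform_moves(struct umove *moves, unsigned count,
                                         unsigned *remap)
{
    int first_reg[MAX_UNIFORMS];
    for (unsigned i = 0; i < MAX_UNIFORMS; ++i)
        first_reg[i] = -1;

    unsigned kept = 0;
    for (unsigned i = 0; i < count; ++i) {
        struct umove m = moves[i];
        assert(m.uniform < MAX_UNIFORMS);
        if (first_reg[m.uniform] >= 0) {
            /* Duplicate: reuse the register from the first move. */
            remap[m.dest] = (unsigned)first_reg[m.uniform];
        } else {
            first_reg[m.uniform] = (int)m.dest;
            remap[m.dest] = m.dest;
            moves[kept++] = m;
        }
    }
    return kept;
}
```

The surviving register stays live for longer, which is where the extra register pressure comes from.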

854 shaders in my shader-db are helped on cycle count (average 2.94% reduction
in cycles). Only 9 shaders have hurt thread count, and there is no change in
spills or fills. Overall, this seems to be a win.

Prevents instruction count regressions from the next commit.

total instructions in shared programs: 2454423 -> 2444690 (-0.40%)
instructions in affected programs: 386274 -> 376541 (-2.52%)
helped: 2105
HURT: 0
helped stats (abs) min: 1.0 max: 116.0 x̄: 4.62 x̃: 2
helped stats (rel) min: 0.04% max: 27.27% x̄: 3.64% x̃: 1.92%
95% mean confidence interval for instructions value: -4.91 -4.33
95% mean confidence interval for instructions %-change: -3.83% -3.45%
Instructions are helped.

total tuples in shared programs: 1963534 -> 1957106 (-0.33%)
tuples in affected programs: 233562 -> 227134 (-2.75%)
helped: 1491
HURT: 117
helped stats (abs) min: 1.0 max: 63.0 x̄: 4.44 x̃: 2
helped stats (rel) min: 0.04% max: 24.53% x̄: 4.39% x̃: 2.59%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.61 x̃: 1
HURT stats (rel)   min: 0.18% max: 8.33% x̄: 1.44% x̃: 1.05%
95% mean confidence interval for tuples value: -4.28 -3.71
95% mean confidence interval for tuples %-change: -4.20% -3.73%
Tuples are helped.

total clauses in shared programs: 387848 -> 387079 (-0.20%)
clauses in affected programs: 13718 -> 12949 (-5.61%)
helped: 583
HURT: 60
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.42 x̃: 1
helped stats (rel) min: 1.11% max: 25.00% x̄: 8.28% x̃: 6.67%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 20.00% x̄: 4.58% x̃: 4.00%
95% mean confidence interval for clauses value: -1.29 -1.10
95% mean confidence interval for clauses %-change: -7.57% -6.58%
Clauses are helped.

total cycles in shared programs: 201866.21 -> 201682.92 (-0.09%)
cycles in affected programs: 6241.79 -> 6058.50 (-2.94%)
helped: 952
HURT: 98
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.12% max: 26.00% x̄: 4.05% x̃: 2.38%
HURT stats (abs)   min: 0.041665999999999315 max: 0.16666700000000034 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.18% max: 8.70% x̄: 1.60% x̃: 1.43%
95% mean confidence interval for cycles value: -0.19 -0.16
95% mean confidence interval for cycles %-change: -3.80% -3.24%
Cycles are helped.

total arith in shared programs: 74924.00 -> 74660.12 (-0.35%)
arith in affected programs: 9303.67 -> 9039.79 (-2.84%)
helped: 1513
HURT: 118
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.07% max: 33.33% x̄: 4.68% x̃: 2.67%
HURT stats (abs)   min: 0.041665999999999315 max: 0.16666800000000137 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.18% max: 8.70% x̄: 1.55% x̃: 1.37%
95% mean confidence interval for arith value: -0.17 -0.15
95% mean confidence interval for arith %-change: -4.48% -3.98%
Arith are helped.

total quadwords in shared programs: 1757254 -> 1751978 (-0.30%)
quadwords in affected programs: 197399 -> 192123 (-2.67%)
helped: 1464
HURT: 110
helped stats (abs) min: 1.0 max: 51.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.04% max: 21.95% x̄: 4.16% x̃: 2.52%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 1.71 x̃: 1
HURT stats (rel)   min: 0.21% max: 13.04% x̄: 1.65% x̃: 0.93%
95% mean confidence interval for quadwords value: -3.58 -3.13
95% mean confidence interval for quadwords %-change: -3.97% -3.53%
Quadwords are helped.

total threads in shared programs: 52899 -> 52890 (-0.02%)
threads in affected programs: 18 -> 9 (-50.00%)
helped: 0
HURT: 9
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>

Henry Goffin [Wed, 29 Dec 2021 09:20:30 +0000 (09:20 +0000)]
frontends/va: ignore incoming frame_num from VA picture parameters

The Gallium pipe video "frame_num" variable is internally used as a
counter of elapsed reference frames since the last IDR. The incoming
frame_num field from VA picture parameters is not equivalent; the VA
value may wrap to zero prematurely, as it is a 16-bit struct field with
a documented max value of 2^(log2_max_frame_num_minus4 + 4)-1.
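
To make the wrap-around concrete, here is a small sketch of how a wrapped value relates to an ever-increasing internal counter. The function names are hypothetical; the 0 to 12 range of log2_max_frame_num_minus4 is the one allowed by the H.264 specification.

```c
#include <assert.h>
#include <stdint.h>

/* MaxFrameNum = 2^(log2_max_frame_num_minus4 + 4). */
static inline uint32_t max_frame_num(unsigned log2_max_frame_num_minus4)
{
    assert(log2_max_frame_num_minus4 <= 12); /* range allowed by H.264 */
    return 1u << (log2_max_frame_num_minus4 + 4);
}

/* Wrapped value corresponding to an internal count of reference frames
 * since the last IDR. The result always fits in 16 bits. */
static inline uint16_t wrapped_frame_num(uint32_t internal_count,
                                         unsigned log2_max_frame_num_minus4)
{
    return (uint16_t)(internal_count % max_frame_num(log2_max_frame_num_minus4));
}
```

With the smallest setting the wrapped value returns to zero after only 16 reference frames, while the internal counter keeps increasing.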

This change improves "infinite GOP" single-client live streaming, where
it is reasonable for the server to desire an endless series of P-frames
without IDR. Without this change, it is difficult/impossible for an
application to encode a P- or B-frame after the VA frame_num field wraps
around to zero, depending on the backend encoder implementation.

This change has no effect on existing applications that always signal an
IDR frame and reset the VA frame_num to zero before it wraps around. For
example, the FFmpeg vaapi encoder ignores the VA documentation and sends
an un-wrapped VA frame_num, which results in identical computation of
the internal frame_num (as long as each GOP is less than 65536 frames).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5768

Reviewed-by: Thong Thai <thong.thai@amd.com>
patch revision 3: correctly avoid incrementing frame_num when the encoded
frame is not a reference, per h264 spec and ffmpeg behavior

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14332>

Rhys Perry [Wed, 2 Mar 2022 13:32:42 +0000 (13:32 +0000)]
aco: rework removal of jumps over branches

Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.

Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.

fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f030b75b7d2 ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214>

Samuel Pitoiset [Thu, 3 Mar 2022 12:08:17 +0000 (13:08 +0100)]
ac/nir: implement nir_op_pack_{uint,sint}_2x16

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>

Samuel Pitoiset [Thu, 3 Mar 2022 07:53:06 +0000 (08:53 +0100)]
aco: implement nir_op_pack_{uint,sint}_2x16

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>

Samuel Pitoiset [Thu, 3 Mar 2022 07:37:29 +0000 (08:37 +0100)]
nir: introduce nir_pack_{sint,uint}_2x16 instructions

These instructions have AMD hardware equivalents and will be used
to lower fragment shader outputs in NIR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>

Xiaohui Gu [Mon, 25 Oct 2021 05:58:03 +0000 (13:58 +0800)]
iris: Mark a dirty update when vs_needs_sgvs_element value changed

Add a vs_needs_sgvs_element value check when updating the vertex
element dirty state in iris_update_compiled_vs to fix a
rendering error in the Android game "Genshin Impact".

Signed-off-by: Xiaohui Gu <xiaohui.gu@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15142>

Yiwei Zhang [Tue, 1 Mar 2022 19:11:29 +0000 (19:11 +0000)]
venus: add VK_EXT_image_robustness support

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>

Yiwei Zhang [Tue, 1 Mar 2022 18:50:40 +0000 (18:50 +0000)]
venus: add VK_EXT_provoking_vertex support

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>

Yiwei Zhang [Tue, 1 Mar 2022 18:40:31 +0000 (18:40 +0000)]
venus: add VK_EXT_line_rasterization support

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>

Yiwei Zhang [Tue, 1 Mar 2022 18:30:36 +0000 (18:30 +0000)]
venus: update to latest venus protocol

Added support for the following extensions:
- VK_EXT_line_rasterization
- VK_EXT_provoking_vertex

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>

Yiwei Zhang [Thu, 3 Mar 2022 22:28:03 +0000 (22:28 +0000)]
venus: group extensions promoted to 1.3

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>

Yiwei Zhang [Thu, 3 Mar 2022 22:23:23 +0000 (22:23 +0000)]
venus: clean up physical device features and properties

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>

Daniel Schürmann [Thu, 19 Aug 2021 08:46:09 +0000 (10:46 +0200)]
nir/opt_shrink_vectors: update docstring

in order to reflect the various recent improvements.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>

Daniel Schürmann [Tue, 17 Aug 2021 12:01:36 +0000 (14:01 +0200)]
nir/opt_shrink_vectors: remove duplicate components from vecN

vecN instructions which are only used by other ALU instructions
will now get duplicate channels removed.
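
A simplified sketch of the deduplication step, assuming each channel is identified by an integer source ID. `dedup_channels` is a hypothetical helper, not the pass itself.

```c
#include <assert.h>

#define MAX_COMPS 16

/* src[i] identifies the scalar value feeding channel i (hypothetical IDs).
 * Writes the unique sources to out_src in first-seen order, and for each
 * original channel stores the index of its new channel in swizzle, so
 * users can be rewritten. Returns the new component count. */
static inline unsigned dedup_channels(const int *src, unsigned count,
                                      int *out_src, unsigned *swizzle)
{
    assert(count <= MAX_COMPS);
    unsigned n = 0;
    for (unsigned i = 0; i < count; ++i) {
        unsigned j;
        for (j = 0; j < n; ++j) {
            if (out_src[j] == src[i])
                break;
        }
        if (j == n)
            out_src[n++] = src[i]; /* first time this source is seen */
        swizzle[i] = j;
    }
    return n;
}
```

For example, a vec4 built from sources (a, b, a, b) shrinks to a vec2 (a, b), and users read it through the swizzle 0, 1, 0, 1. Rewriting users this way is only possible when they can apply a swizzle, which fits the restriction to ALU users stated above.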

i915g:
total instructions in shared programs: 396309 -> 396294 (<.01%)
instructions in affected programs: 186 -> 171 (-8.06%)

r300:
total instructions in shared programs: 1165059 -> 1164354 (-0.06%)
instructions in affected programs: 35884 -> 35179 (-1.96%)
total temps in shared programs: 165497 -> 165326 (-0.10%)
temps in affected programs: 2990 -> 2819 (-5.72%)

softpipe:
total instructions in shared programs: 2860028 -> 2859084 (-0.03%)
instructions in affected programs: 55539 -> 54595 (-1.70%)
total temps in shared programs: 516939 -> 516546 (-0.08%)
temps in affected programs: 6623 -> 6230 (-5.93%)

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>

Daniel Schürmann [Tue, 17 Aug 2021 11:23:22 +0000 (13:23 +0200)]
nir/opt_shrink_vectors: shrink load_const properly

This patch enables removal of arbitrary channels in
load_const instructions, if they are either unused or
duplicates of other channels and only used by ALU instructions.

Totals from 692 (0.51% of 134913) affected shaders: (GFX10.3)
VGPRs: 21832 -> 21544 (-1.32%)
CodeSize: 1322016 -> 1313080 (-0.68%); split: -0.68%, +0.01%
Instrs: 243635 -> 242231 (-0.58%); split: -0.58%, +0.00%
Latency: 1856138 -> 1857237 (+0.06%); split: -0.09%, +0.15%
InvThroughput: 424298 -> 421671 (-0.62%); split: -0.62%, +0.01%
VClause: 4580 -> 4583 (+0.07%); split: -0.02%, +0.09%
SClause: 14336 -> 14354 (+0.13%); split: -0.04%, +0.17%
Copies: 8897 -> 8859 (-0.43%); split: -0.45%, +0.02%
PreSGPRs: 20439 -> 20437 (-0.01%)
PreVGPRs: 16011 -> 15907 (-0.65%); split: -0.97%, +0.32%

i915g:
total instructions in shared programs: 396471 -> 396309 (-0.04%)
instructions in affected programs: 6408 -> 6246 (-2.53%)
total const in shared programs: 56458 -> 56422 (-0.06%)
const in affected programs: 407 -> 371 (-8.85%)
LOST:   shaders/closed/steam/trine-2/fp-3.shader_test FS

r300:
total instructions in shared programs: 1164421 -> 1165059 (0.05%)
instructions in affected programs: 143981 -> 144619 (0.44%)
total temps in shared programs: 165488 -> 165497 (<.01%)
temps in affected programs: 318 -> 327 (2.83%)
total consts in shared programs: 922140 -> 921952 (-0.02%)
consts in affected programs: 12438 -> 12250 (-1.51%)

softpipe:
total instructions in shared programs: 2859978 -> 2860028 (<.01%)
instructions in affected programs: 183355 -> 183405 (0.03%)
total temps in shared programs: 517071 -> 516939 (-0.03%)
temps in affected programs: 1416 -> 1284 (-9.32%)
total imm in shared programs: 103601 -> 102767 (-0.81%)
imm in affected programs: 3928 -> 3094 (-21.23%)

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>

Dave Airlie [Thu, 3 Mar 2022 05:37:25 +0000 (15:37 +1000)]
crocus: change the line width workaround for gfx4/5

This fixes piglit line-flat-clip-color and the hud fps counter.

Fixes: 6b7a68b7c21e ("crocus: add missing line smooth bits.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15229>

Chia-I Wu [Mon, 28 Feb 2022 22:28:36 +0000 (14:28 -0800)]
venus: abort when stuck

This gives

  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 4096
  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 8192
  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 12288
  MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 16384
  MESA-VIRTIO: debug: aborting
  Aborted

which should be more friendly than printing the messages forever.

On my i7-7820HQ, this aborts after roughly 4+8+16+32=60 seconds.
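
A sketch of the warn-then-abort policy implied by the log above. The interval of 4096 iterations and the limit of four messages are read off that output; the function names and the doubling of the wait per interval are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values read off the log output above; names are illustrative. */
enum { WARN_INTERVAL = 4096, MAX_WARNINGS = 4 };

/* True when iteration `iter` should print a "stuck" message. */
static inline bool should_warn(unsigned iter)
{
    return iter != 0 && iter % WARN_INTERVAL == 0;
}

/* True when the message at `iter` is the last one before aborting. */
static inline bool should_abort(unsigned iter)
{
    return should_warn(iter) && iter / WARN_INTERVAL >= MAX_WARNINGS;
}

/* Total time waited if each successive interval takes twice as long
 * as the previous one (assumed behaviour). */
static inline unsigned total_wait_seconds(unsigned first_seconds,
                                          unsigned warnings)
{
    assert(warnings <= 16);
    unsigned total = 0;
    for (unsigned i = 0; i < warnings; ++i)
        total += first_seconds << i;
    return total;
}
```

With a first interval of 4 seconds and four messages, the total matches the 60 seconds quoted above.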

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15200>

Daniel Schürmann [Wed, 2 Mar 2022 23:52:06 +0000 (00:52 +0100)]
aco/ra: don't immediately assign a register for p_branch

These now get assigned after handling phis.

Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3)
CodeSize: 5519744 -> 5515308 (-0.08%)
Instrs: 1063045 -> 1061936 (-0.10%)
Latency: 11880452 -> 11875904 (-0.04%)
InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00%
Copies: 86908 -> 85799 (-1.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

Rhys Perry [Tue, 27 Jul 2021 10:24:53 +0000 (11:24 +0100)]
aco/tests: add test for branch definition RA

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

Rhys Perry [Fri, 10 Sep 2021 17:20:03 +0000 (18:20 +0100)]
aco: fix branch definition validation

Just as they have to be register allocated differently, branch
definitions at merge block predecessors need to be validated differently.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

Rhys Perry [Wed, 20 Oct 2021 08:11:51 +0000 (09:11 +0100)]
aco: add validate_instr_defs()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

2 years agoaco/ra: fix register allocation of branch definitions
Rhys Perry [Wed, 28 Jul 2021 15:30:57 +0000 (16:30 +0100)]
aco/ra: fix register allocation of branch definitions

fossil-db (Sienna Cichlid):
Totals from 704 (0.52% of 134913) affected shaders:
CodeSize: 7177288 -> 7182072 (+0.07%); split: -0.00%, +0.07%
Instrs: 1371781 -> 1372977 (+0.09%); split: -0.00%, +0.09%
Latency: 17993572 -> 18001344 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 4198996 -> 4199569 (+0.01%); split: -0.00%, +0.01%
Copies: 122456 -> 123516 (+0.87%); split: -0.01%, +0.88%
Branches: 43815 -> 43818 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

2 years agoaco/ra: add get_reg_phi() helper
Rhys Perry [Wed, 28 Jul 2021 15:20:07 +0000 (16:20 +0100)]
aco/ra: add get_reg_phi() helper

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

2 years agoaco: remove vcc hint from branch definitions
Rhys Perry [Wed, 28 Jul 2021 14:11:03 +0000 (15:11 +0100)]
aco: remove vcc hint from branch definitions

This doesn't seem to have much benefit anymore.

fossil-db (Sienna Cichlid):
Totals from 198 (0.15% of 134913) affected shaders:
CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02%
Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03%
Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00%
SClause: 14760 -> 14722 (-0.26%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>

2 years agor300: schedule TEX instructions before OUT instructions
Pavel Ondračka [Thu, 3 Mar 2022 11:59:00 +0000 (12:59 +0100)]
r300: schedule TEX instructions before OUT instructions

NIR-to-TGSI produces partial output writes, unlike the old paths
that always wrote the full outputs. Therefore, if a partial output
write is ready to be scheduled and nothing else besides a tex is
ready, we would schedule the output write first. This was not a
problem before, as usually at least some component of the full output
write depended on the tex result.

This is not optimal from the performance point of view and resulted in
~20% slowdown in the Unigine demos. The docs say:

The first OUTPUT instruction will reserve space in the output register
fifo. This space is limited, therefore issuing an OUTPUT earlier than
necessary may cause threads to stall earlier than necessary. You
should not set an ALU instruction as type OUTPUT unless it is actually
writing to an output register, or it is the last instruction of
the program.

Fix it by explicitly preferring a TEX before OUT and restore the
performance: 9.66 -> 12.12 fps (as compared to 11.83 with the old
glsl-to-TGSI path) in Unigine Sanctuary. No change in Lightsmark or
GLmark.

This is also a win from the instructions point of view as we are usually
able to schedule the partial output writes in a single pair at the end.
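
As a rough illustration of the idea (not the r300 compiler's actual code), the preference can be pictured as a ready-list pick that defers output writes while anything else is ready. `Inst`, `is_output_write`, and `pick_next` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Inst:
    name: str
    is_output_write: bool  # hypothetical flag: writes an output register


def pick_next(ready: List[Inst]) -> Inst:
    """Return the first ready instruction that is not an output write.

    An output write is chosen only when nothing else is ready, so the
    output FIFO space is not reserved earlier than necessary.
    """
    for inst in ready:
        if not inst.is_output_write:
            return inst
    return ready[0]


ready = [Inst("out.x", True), Inst("tex0", False)]
print(pick_next(ready).name)
```

With a partial output write and a TEX both ready, the sketch picks the TEX; the output write is returned only once it is the sole candidate.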

total instructions in shared programs: 106009 -> 105891 (-0.11%)
instructions in affected programs: 10153 -> 10035 (-1.16%)
helped: 118
HURT: 0

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5840
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>

2 years agor300: remove some dead logic in tex pair scheduling
Pavel Ondračka [Thu, 3 Mar 2022 11:24:37 +0000 (12:24 +0100)]
r300: remove some dead logic in tex pair scheduling

The max_score == -1 condition is already checked earlier, so this
will never trigger. It's unclear what the intention was anyway. Now we
emit in one of these cases:
- if we have accumulated enough tex instructions for a full block
- if we have nothing else to emit
- or if we can emit all remaining tex instructions already.
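
The three conditions above can be sketched as a single predicate. This is an assumption-laden illustration: the function name, its parameters, and the block-size parameter are all hypothetical and are not taken from the r300 scheduler.

```python
def should_emit_tex(pending_tex: int, remaining_tex: int,
                    other_ready: int, block_size: int) -> bool:
    """Decide whether the accumulated tex instructions should be emitted.

    pending_tex   -- tex instructions currently ready to emit
    remaining_tex -- tex instructions not yet emitted in the whole program
    other_ready   -- non-tex instructions currently ready
    block_size    -- number of tex instructions in a full block (assumed)
    """
    full_block = pending_tex >= block_size
    nothing_else = other_ready == 0
    all_remaining = pending_tex == remaining_tex
    return full_block or nothing_else or all_remaining
```

Any one of the three cases is sufficient, which is why the sketch combines them with `or`.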

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>

2 years agoVenus: Add `vn_physical_device_{features, properties}` for better organization
Igor Torrente [Mon, 28 Feb 2022 11:59:41 +0000 (08:59 -0300)]
Venus: Add `vn_physical_device_{features, properties}` for better organization

New extension properties/features are being put directly in `vn_physical_device`,
which is not ideal from an organization point of view.

Here `vn_physical_device_{features,properties}` are two new structs to
help the `vn_physical_device` organization.

Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15170>

2 years agofreedreno/a4xx: fix integer tg4
Ilia Mirkin [Mon, 29 Nov 2021 06:44:32 +0000 (01:44 -0500)]
freedreno/a4xx: fix integer tg4

Something is slightly off in the integer values returned. It passes many
tests without the fixup, but the dEQP-GLES31 tests complain. The blob
ends up doing 3x gathers, and selects between them based on getinfo
results. Since we already have a per-sampler key with some spare bits,
just stick the bit-size info in there. And we can derive signedness from
the associated type info.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>

2 years agofreedreno/a4xx: add swizzles to shader keys for tg4 workaround
Ilia Mirkin [Sun, 14 Nov 2021 09:38:04 +0000 (04:38 -0500)]
freedreno/a4xx: add swizzles to shader keys for tg4 workaround

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>

2 years agofreedreno/a4xx: move tex_type to header
Ilia Mirkin [Sun, 23 Jan 2022 17:14:43 +0000 (12:14 -0500)]
freedreno/a4xx: move tex_type to header

This will be used in several places. Factor it out for common use.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>

2 years agonir: remove bogus logic to allow cube + offset to work
Ilia Mirkin [Sun, 14 Nov 2021 06:01:47 +0000 (01:01 -0500)]
nir: remove bogus logic to allow cube + offset to work

This was done for an a4xx hack which is now removed. No API allows cube
texturing to have offsets.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>

2 years agofreedreno/ir3: remove bogus tg4 -> tex lowering pass
Ilia Mirkin [Sun, 14 Nov 2021 05:58:30 +0000 (00:58 -0500)]
freedreno/ir3: remove bogus tg4 -> tex lowering pass

It can't be done. This just provides bad results. The blob had a
comparable approach where they fixed up coordinates, but that also can't
work with a separate texture definition with nearest filtering. By then,
we might as well provide an unswizzled variant instead, using native
functionality.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>

2 years agor300/compiler/tests: print regoff_t as size_t
Alex Xu (Hello71) [Wed, 24 Nov 2021 23:01:29 +0000 (18:01 -0500)]
r300/compiler/tests: print regoff_t as size_t

fixes compilation on musl

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13949>

2 years agoradv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16
Samuel Pitoiset [Wed, 2 Mar 2022 14:47:03 +0000 (15:47 +0100)]
radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16

v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on
all generations.
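
For reference, the value these ops compute can be sketched from the GLSL definitions of packUnorm2x16 and packSnorm2x16 (clamp, scale, round, first component in the low 16 bits). This models the operation's result only; it makes no claim about how the hardware instruction rounds ties, and NaN inputs are not handled.

```python
import math


def pack_unorm_2x16(x: float, y: float) -> int:
    """Pack two floats as unsigned normalized 16-bit values (x in low bits)."""
    def conv(c: float) -> int:
        c = min(max(c, 0.0), 1.0)
        return int(math.floor(c * 65535.0 + 0.5))
    return conv(x) | (conv(y) << 16)


def pack_snorm_2x16(x: float, y: float) -> int:
    """Pack two floats as signed normalized 16-bit values (x in low bits)."""
    def conv(c: float) -> int:
        c = min(max(c, -1.0), 1.0)
        return int(math.floor(c * 32767.0 + 0.5)) & 0xFFFF
    return conv(x) | (conv(y) << 16)


print(hex(pack_unorm_2x16(1.0, 0.0)))
print(hex(pack_snorm_2x16(-1.0, 1.0)))
```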

No fossil-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215>

2 years agovulkan/wsi: drop unused wsi_create_win32_image
Michel Zou [Fri, 18 Feb 2022 20:03:33 +0000 (21:03 +0100)]
vulkan/wsi: drop unused wsi_create_win32_image

fixes: ed391d2a

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15088>

2 years agoglsl: add member's location layout qualifier rules for `arrayed` in/out blocks
Andrii Simiklit [Fri, 31 Jul 2020 11:53:25 +0000 (14:53 +0300)]
glsl: add member's location layout qualifier rules for `arrayed` in/out blocks

From Section 4.4.1 (Input Layout Qualifiers) of the GLSL 4.50 spec:

     "For some blocks declared as arrays, the location can only be applied
     at the block level: When a block is declared as an array where
     additional locations are needed for each member for each block array
     element, it is a compile-time error to specify locations on the block
     members. That is, when locations would be under specified by applying
     them on block members, they are not allowed on block members. For
     arrayed interfaces (those generally having an extra level of
     arrayness due to interface expansion), the outer array is stripped
     before applying this rule"

From Section 1.2.1 (Changes from Revision 6 of GLSL Version) of the GLSL 4.50 spec:

     "Private Bug 15678: Don’t allow location = on block members where
      the block needs an array of locations"

From Section 4.4.1 (Input Layout Qualifiers) of the GLSL ES 3.20 spec:

     "If an input is declared as an array of blocks, excluding per-vertex-arrays
      as required for tessellation, it is an error to declare a member of
      the block with a location qualifier"

From Section 1.1.3 (Changes from GLSL ES 3.2 revision 3) of the GLSL ES 3.20 spec:

     "Arrayed blocks cannot have layout location qualifiers on members"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11522>
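
One way to read the quoted spec text as a check is sketched below. It is an interpretation of the quotes only, not the GLSL compiler's implementation; `member_location_allowed` and its parameters are hypothetical names.

```python
def member_location_allowed(array_dims: int, arrayed_interface: bool) -> bool:
    """Return True if block members may carry a location qualifier.

    array_dims        -- number of array dimensions on the block declaration
    arrayed_interface -- True when the interface adds an extra outer array
                         level (e.g. per-vertex arrays for tessellation)
    """
    # For arrayed interfaces, the outer array is stripped before the rule applies.
    if arrayed_interface and array_dims > 0:
        array_dims -= 1
    # Any remaining arrayness means each member would need an array of
    # locations, so locations on members are rejected.
    return array_dims == 0
```

Under this reading, a plain block or a per-vertex array of blocks may use member locations, while a genuine array of blocks may not.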

2 years agozink: ci updates
Mike Blumenkrantz [Thu, 3 Mar 2022 05:07:45 +0000 (00:07 -0500)]
zink: ci updates

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>

2 years agozink: split primitives generated queries if xfb/gs states change
Mike Blumenkrantz [Wed, 2 Mar 2022 21:20:49 +0000 (16:20 -0500)]
zink: split primitives generated queries if xfb/gs states change

if one of these states changes, it affects which result needs to be
used for that query, so split it up over multiple query ids to make sure
the correct result is obtained

fixes (lavapipe):
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_pause_resume
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>

2 years agozink: split out query suspending into util function
Mike Blumenkrantz [Wed, 2 Mar 2022 21:20:30 +0000 (16:20 -0500)]
zink: split out query suspending into util function

no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>

2 years agozink: update query states before starting renderpass during draw
Mike Blumenkrantz [Wed, 2 Mar 2022 21:19:46 +0000 (16:19 -0500)]
zink: update query states before starting renderpass during draw

this gives some leeway for doing transfer ops without crashing the renderpass

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>

2 years agonvc0: disable EXT_texture_sRGB_RG8
Ilia Mirkin [Wed, 2 Mar 2022 04:38:31 +0000 (23:38 -0500)]
nvc0: disable EXT_texture_sRGB_RG8

Looks like the green component doesn't get srgb-decoding, and no obvious
way to force it. It works fine on nv50 though.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>

2 years agomesa: enable GL_EXT_texture_sRGB_RG8 on desktop
Ilia Mirkin [Wed, 2 Mar 2022 04:26:55 +0000 (23:26 -0500)]
mesa: enable GL_EXT_texture_sRGB_RG8 on desktop

Looks like an extension number was assigned in late 2020. This makes it
possible to hook up this format to teximage-colors without teaching it
about ES.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>

2 years agozink: remove loop from generated tcs
Mike Blumenkrantz [Wed, 9 Feb 2022 12:59:46 +0000 (07:59 -0500)]
zink: remove loop from generated tcs

this is already using per-vertex io, no need to add conditionals to verify

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15225>

2 years agofreedreno/registers: Add a couple regs we need for kernel
Rob Clark [Thu, 3 Mar 2022 01:11:10 +0000 (17:11 -0800)]
freedreno/registers: Add a couple regs we need for kernel

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>

2 years agogallivm/llvmpipe: add support for NIR to the linear/aos paths.
Dave Airlie [Fri, 25 Feb 2022 00:00:25 +0000 (10:00 +1000)]
gallivm/llvmpipe: add support for NIR to the linear/aos paths.

When the AOS/linear code was added it only worked with TGSI which
meant nothing in mesa upstream was really using it.

This adds support to analyse NIR shaders, and adds aos support
to the backend.

AOS support is limited to mov,vec,fmul,tex sampling in order to
accelerate mostly compositing operations. I've verified that weston uses
the fast path. gnome-shell can't use it yet as we can't optimise
the depth test paths.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>

2 years agogallivm/nir: split load_const out into backend helper.
Dave Airlie [Thu, 24 Feb 2022 23:05:11 +0000 (09:05 +1000)]
gallivm/nir: split load_const out into backend helper.

This just makes adding aos support easier.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>

2 years agollvmpipe/linear: fix disk caching.
Dave Airlie [Tue, 22 Feb 2022 23:26:05 +0000 (09:26 +1000)]
llvmpipe/linear: fix disk caching.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>

2 years agozink: switch to u_foreach_bit for ntv image access decorations
Mike Blumenkrantz [Wed, 2 Mar 2022 18:36:00 +0000 (13:36 -0500)]
zink: switch to u_foreach_bit for ntv image access decorations

no functional changes

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15217>

2 years agozink: emit Aliased decorations for any image that isn't explicitly marked restrict
Mike Blumenkrantz [Wed, 2 Mar 2022 18:36:21 +0000 (13:36 -0500)]
zink: emit Aliased decorations for any image that isn't explicitly marked restrict

these might be aliased

fixes:
arb_shader_image_load_store-restrict

fixes #6090

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15217>

2 years agozink: remove a bunch of flakes
Mike Blumenkrantz [Thu, 3 Mar 2022 00:55:07 +0000 (19:55 -0500)]
zink: remove a bunch of flakes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>

2 years agolavapipe: always set read/write on ssbo/images.
Dave Airlie [Thu, 3 Mar 2022 00:14:48 +0000 (10:14 +1000)]
lavapipe: always set read/write on ssbo/images.

This fixes a regression with overlap in llvmpipe. It is pessimistic;
we should write code to make it work properly.

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>

2 years agopan/bi: Add arithmetic flag to RSHIFT ops
Alyssa Rosenzweig [Fri, 25 Feb 2022 19:53:39 +0000 (14:53 -0500)]
pan/bi: Add arithmetic flag to RSHIFT ops

Models ops like ARSHIFT_OR.i32 on Valhall without adding piles of new
instructions to the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Extend LD_TILE with a register format
Alyssa Rosenzweig [Fri, 28 Jan 2022 23:15:47 +0000 (18:15 -0500)]
pan/bi: Extend LD_TILE with a register format

Required for Valhall. NIR has the information anyway, pass it along.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Add BRANCHZI instruction
Alyssa Rosenzweig [Tue, 1 Mar 2022 20:14:13 +0000 (15:14 -0500)]
pan/bi: Add BRANCHZI instruction

Technically this is just JUMP on Valhall, but the semantic is an indirect branch
based on comparing with zero. It can also be used as a conservative branch (like
BRANCHC), but this isn't modeled.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Model LD_BUFFER instructions
Alyssa Rosenzweig [Fri, 21 Jan 2022 19:03:35 +0000 (14:03 -0500)]
pan/bi: Model LD_BUFFER instructions

We'll use these to read from UBOs on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Model offset for LOAD/STORE
Alyssa Rosenzweig [Sun, 12 Dec 2021 21:17:14 +0000 (16:17 -0500)]
pan/bi: Model offset for LOAD/STORE

Needed to model the immediate offset on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Model pos/vary segments in STORE instructions
Alyssa Rosenzweig [Wed, 8 Dec 2021 00:07:43 +0000 (19:07 -0500)]
pan/bi: Model pos/vary segments in STORE instructions

For Bifrost, we model load/store segments, for example for thread local storage.
We need something similar on Valhall -- access modifiers. There are four access
modifiers on Valhall, controlling memory subsystem optimizations for the access:

none: Nothing may be assumed. Corresponds to "global".

istream: Internally streaming within the GPU. Corresponds to "pos", as it's
used for position stores.

estream: Externally streaming outside the GPU. Corresponds to "vary", as it's
used for varying stores.

force: Force access in discarded threads. Corresponds to "tl", as it's required
for correct behaviour of helper invocations that use the stack.

If these access modifiers end up being useful outside these fixed purposes, we
may need to rework this part of the IR. For now, this should suffice.
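
The correspondence described above amounts to a fixed one-to-one table. A minimal sketch, using plain strings rather than the compiler's real enums (which are not shown here):

```python
# Bifrost-style segment -> Valhall memory access modifier, as listed above.
SEGMENT_TO_ACCESS = {
    "global": "none",     # nothing may be assumed
    "pos": "istream",     # internally streaming, used for position stores
    "vary": "estream",    # externally streaming, used for varying stores
    "tl": "force",        # force access in discarded threads (stack use)
}

# The mapping is one-to-one, so it can be inverted directly.
ACCESS_TO_SEGMENT = {access: seg for seg, access in SEGMENT_TO_ACCESS.items()}

print(SEGMENT_TO_ACCESS["pos"], ACCESS_TO_SEGMENT["force"])
```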

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Model LEA_BUF_IMM in the IR
Alyssa Rosenzweig [Wed, 8 Dec 2021 00:04:25 +0000 (19:04 -0500)]
pan/bi: Model LEA_BUF_IMM in the IR

Required for varying stores in malloced IDVS jobs on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Add LD_VAR_BUF_IMM.f16/f32 instructions
Alyssa Rosenzweig [Tue, 7 Dec 2021 23:55:57 +0000 (18:55 -0500)]
pan/bi: Add LD_VAR_BUF_IMM.f16/f32 instructions

For use on Valhall with memory-allocated IDVS jobs.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Generalize I->table for Valhall
Alyssa Rosenzweig [Mon, 6 Dec 2021 22:55:10 +0000 (17:55 -0500)]
pan/bi: Generalize I->table for Valhall

Can be reused for resource tables in a natural way.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Extend BLEND to take a register format
Alyssa Rosenzweig [Thu, 29 Jul 2021 20:58:13 +0000 (16:58 -0400)]
pan/bi: Extend BLEND to take a register format

Needed on Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/bi: Model Valhall texture instructions
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:03:54 +0000 (10:03 -0500)]
pan/bi: Model Valhall texture instructions

These act like a TEXC+immediate.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Add memory access modifier to LOADs
Alyssa Rosenzweig [Thu, 3 Mar 2022 00:00:43 +0000 (19:00 -0500)]
pan/va: Add memory access modifier to LOADs

Might be required for correct spilling in some circumstances.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Remap "store segment" to "memory access"
Alyssa Rosenzweig [Wed, 2 Mar 2022 23:58:39 +0000 (18:58 -0500)]
pan/va: Remap "store segment" to "memory access"

For now, the difference does not matter. However it's better to model the actual
hardware behaviour, rather than isomorphic driver behaviour, when we can do so.
So fix the names.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Fix LEA_BUF_IMM definition
Alyssa Rosenzweig [Wed, 2 Mar 2022 16:19:41 +0000 (11:19 -0500)]
pan/va: Fix LEA_BUF_IMM definition

Technically the table is folded, too; the 0xD refers to table 61. But this
instruction is more general than previously thought.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Fix definitions of LD_VAR_BUF_IMM
Alyssa Rosenzweig [Wed, 2 Mar 2022 16:06:55 +0000 (11:06 -0500)]
pan/va: Fix definitions of LD_VAR_BUF_IMM

So close! However, LD_VAR_IMM is something else -- Bifrost-style varying
interpolation, without a hardware buffer. For ES3, we'll need to support both.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Add TEX_GATHER instruction
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:36:57 +0000 (10:36 -0500)]
pan/va: Add TEX_GATHER instruction

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Add TEX_DUAL instruction
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:29:28 +0000 (10:29 -0500)]
pan/va: Add TEX_DUAL instruction

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Add modifiers required for gathers
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:36:43 +0000 (10:36 -0500)]
pan/va: Add modifiers required for gathers

Mostly isomorphic to Bifrost-style gathers.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agopan/va: Handle force_enum differing from name
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:42:47 +0000 (10:42 -0500)]
pan/va: Handle force_enum differing from name

Needed for secondary register width, for dual texturing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>

2 years agoi915g: Emit better code for SEQ(x, 0) and SNE(x, 0)
Ian Romanick [Wed, 2 Mar 2022 00:30:31 +0000 (16:30 -0800)]
i915g: Emit better code for SEQ(x, 0) and SNE(x, 0)

total instructions in shared programs: 789000 -> 788481 (-0.07%)
instructions in affected programs: 16179 -> 15660 (-3.21%)
helped: 157
HURT: 0
helped stats (abs) min: 3 max: 12 x̄: 3.31 x̃: 3
helped stats (rel) min: 1.56% max: 14.29% x̄: 4.24% x̃: 2.56%
95% mean confidence interval for instructions value: -3.51 -3.10
95% mean confidence interval for instructions %-change: -4.70% -3.78%
Instructions are helped.

LOST:   0
GAINED: 3

v2: Drop setting src1 to zero.  Suggested by Emma.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>

2 years agoi915g: Handle constants composed exclusively of 0 or ±1 specially
Ian Romanick [Tue, 1 Mar 2022 23:48:51 +0000 (15:48 -0800)]
i915g: Handle constants composed exclusively of 0 or ±1 specially

This can avoid some cases where a constant has to be loaded into a
temporary register.
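
The message does not spell out the mechanism. The sketch below assumes that a source operand can select a literal zero or one per channel, optionally negated, so that such constants need no register load. `trivial_selectors` and the selector names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def trivial_selectors(vec: Sequence[float]) -> Optional[List[Tuple[str, bool]]]:
    """Map each channel to a (selector, negate) pair, or None if any
    channel is something other than 0, 1 or -1."""
    out = []
    for c in vec:
        if c == 0.0:
            out.append(("ZERO", False))
        elif c == 1.0:
            out.append(("ONE", False))
        elif c == -1.0:
            out.append(("ONE", True))
        else:
            return None  # needs a real constant load
    return out


print(trivial_selectors([0.0, 1.0, -1.0, 0.0]))
print(trivial_selectors([0.5, 0.0, 0.0, 1.0]))
```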

v2: Update i915-g33-fails.txt.

total instructions in shared programs: 788625 -> 782376 (-0.79%)
instructions in affected programs: 166269 -> 160020 (-3.76%)
helped: 1578
HURT: 0
helped stats (abs) min: 3 max: 21 x̄: 3.96 x̃: 3
helped stats (rel) min: 1.56% max: 33.33% x̄: 4.82% x̃: 3.45%
95% mean confidence interval for instructions value: -4.06 -3.86
95% mean confidence interval for instructions %-change: -5.00% -4.64%
Instructions are helped.

LOST:   0
GAINED: 35

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>

2 years agonir/algebraic: Optimize some cases of (sXX(a, b) != 0.0)
Ian Romanick [Tue, 1 Mar 2022 23:21:02 +0000 (15:21 -0800)]
nir/algebraic: Optimize some cases of (sXX(a, b) != 0.0)

I noticed the SGE case while looking at the output of
shaders/closed/steam/trine-2/fp-3.shader_test on i915g.  These are
especially bad on i915 that needs two instructions to implement SNE.

An alternative would be to duplicate the sne(sXX(a, b), 0.0) rules in an
algebraic pass that occurs after bool_to_float.  Doing the work earlier
seems preferable.
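
The reasoning behind this kind of rule can be checked with a small model: `sge` yields 1.0 or 0.0, so testing its result against 0.0 recovers the original comparison. The Python functions below only model the float semantics for illustration; they are not the rules added to the algebraic pass.

```python
import math


def sge(a: float, b: float) -> float:
    """Set-on-greater-or-equal: 1.0 if a >= b, else 0.0."""
    return 1.0 if a >= b else 0.0


def same_result(a: float, b: float) -> bool:
    """Check that (sge(a, b) != 0.0) agrees with the plain comparison a >= b."""
    return (sge(a, b) != 0.0) == (a >= b)


samples = [-1.0, 0.0, 0.5, 1.0, math.inf, math.nan]
print(all(same_result(a, b) for a in samples for b in samples))
```

Because the two sides agree for every input pair, including NaN, the intermediate 0.0/1.0 value and the extra compare can be dropped.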

i915
total instructions in shared programs: 788274 -> 788223 (<.01%)
instructions in affected programs: 666 -> 615 (-7.66%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 12 x̄: 10.20 x̃: 9
helped stats (rel) min: 5.00% max: 11.11% x̄: 8.12% x̃: 8.16%
95% mean confidence interval for instructions value: -12.24 -8.16
95% mean confidence interval for instructions %-change: -10.81% -5.43%
Instructions are helped.

LOST:   0
GAINED: 2

The two gained shaders are assembly fragment programs in Euro Truck
Simulator 2.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>