Rob Clark [Sat, 19 Feb 2022 16:38:26 +0000 (08:38 -0800)]
android: Push in-fence-fd down to driver
Rather than immediately stall on the CPU in SwapBuffers() if the
in-fence for the dequeued buffer is not yet signaled, push it down
to the driver.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6048
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>
Rob Clark [Sat, 19 Feb 2022 16:36:43 +0000 (08:36 -0800)]
gallium/dri: Extend image extension to support in-fence
Extend dri so that an in-fence-fd can be plumbed through to driver.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>
Samuel Pitoiset [Thu, 3 Mar 2022 19:26:54 +0000 (20:26 +0100)]
radv/ci: update list of expected failures
Add dEQP-VK.glsl.builtin.precision_double.determinant.compute.mat3
which fails on all generations.
It looks like CTS should relax tolerance slightly.
Co-authored-by: Charlie Turner <cturner@igalia.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>
Samuel Pitoiset [Fri, 4 Mar 2022 15:20:25 +0000 (16:20 +0100)]
radv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask
They randomly hang on Navi10 and randomly fail on Sienna Cichlid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>
Juan A. Suarez Romero [Thu, 3 Mar 2022 11:09:32 +0000 (12:09 +0100)]
v3d: rebind sampler view if resource changed the BO
When discarding the whole resource to create a new one, if this resource
is used by a sampler view, a rebind must be done to use the new
resource.
But this must be done when setting the sampler views, because we don't
have access to those samplers before.
v2:
- Pack shader state on setting sampler views (Iago)
- Use a serial ID to know when to rebind sampler views (Juan)
v3:
- Move check to caller (Iago)
- Keep rebind sampler view on BO change (Iago)
v4:
- Rename "serial_bo" to "serial_id" (Iago)
- Add comments (Iago)
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6027
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15171>
Alyssa Rosenzweig [Thu, 3 Mar 2022 22:39:02 +0000 (17:39 -0500)]
panfrost: Push twice as many uniforms
The limit for Bifrost is twice as high as previously thought -- the limit is 64
*slots* of FAU, not 64 words. Each slot is 2 words. We can push twice as much,
saving a considerable number of cycles in some cases.
total instructions in shared programs: 2454260 -> 2431502 (-0.93%)
instructions in affected programs: 845176 -> 822418 (-2.69%)
helped: 3376
HURT: 304
helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6
helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11%
HURT stats (abs) min: 1.0 max: 60.0 x̄: 13.06 x̃: 8
HURT stats (rel) min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52%
95% mean confidence interval for instructions value: -6.50 -5.87
95% mean confidence interval for instructions %-change: -3.75% -3.43%
Instructions are helped.
total tuples in shared programs: 1963383 -> 1951560 (-0.60%)
tuples in affected programs: 638622 -> 626799 (-1.85%)
helped: 2959
HURT: 573
helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4
helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12%
HURT stats (abs) min: 1.0 max: 50.0 x̄: 8.35 x̃: 6
HURT stats (rel) min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92%
95% mean confidence interval for tuples value: -3.61 -3.08
95% mean confidence interval for tuples %-change: -2.18% -1.85%
Tuples are helped.
total clauses in shared programs: 387817 -> 365111 (-5.85%)
clauses in affected programs: 135527 -> 112821 (-16.75%)
helped: 3489
HURT: 25
helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5
helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.56 x̃: 1
HURT stats (rel) min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67%
95% mean confidence interval for clauses value: -6.67 -6.26
95% mean confidence interval for clauses %-change: -17.65% -16.96%
Clauses are helped.
total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%)
cycles in affected programs: 84035.50 -> 50947.33 (-39.37%)
helped: 3547
HURT: 136
helped stats (abs) min: 0.
041665999999999315 max: 54.0 x̄: 9.33 x̃: 8
helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84%
HURT stats (abs) min: 0.
041665999999999315 max: 1.0 x̄: 0.12 x̃: 0
HURT stats (rel) min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61%
95% mean confidence interval for cycles value: -9.26 -8.71
95% mean confidence interval for cycles %-change: -35.34% -34.11%
Cycles are helped.
total arith in shared programs: 74918.46 -> 75022.62 (0.14%)
arith in affected programs: 22471.04 -> 22575.21 (0.46%)
helped: 1571
HURT: 1492
helped stats (abs) min: 0.
041665999999999315 max: 1.125 x̄: 0.17 x̃: 0
helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96%
HURT stats (abs) min: 0.
041665999999999315 max: 2.375 x̄: 0.25 x̃: 0
HURT stats (rel) min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37%
95% mean confidence interval for arith value: 0.02 0.05
95% mean confidence interval for arith %-change: 1.08% 1.56%
Arith are HURT.
total ldst in shared programs: 174812 -> 137889 (-21.12%)
ldst in affected programs: 81319 -> 44396 (-45.41%)
helped: 3722
HURT: 0
helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8
helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75%
95% mean confidence interval for ldst value: -10.20 -9.64
95% mean confidence interval for ldst %-change: -47.97% -46.39%
Ldst are helped.
total quadwords in shared programs: 1757124 -> 1714130 (-2.45%)
quadwords in affected programs: 584065 -> 541071 (-7.36%)
helped: 3474
HURT: 173
helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9
helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33%
HURT stats (abs) min: 1.0 max: 26.0 x̄: 5.76 x̃: 4
HURT stats (rel) min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63%
95% mean confidence interval for quadwords value: -12.21 -11.37
95% mean confidence interval for quadwords %-change: -8.36% -7.95%
Quadwords are helped.
total threads in shared programs: 52898 -> 53142 (0.46%)
threads in affected programs: 262 -> 506 (93.13%)
helped: 250
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.92 0.99
95% mean confidence interval for threads %-change: 93.69% 99.28%
Threads are helped.
total spills in shared programs: 161 -> 107 (-33.54%)
spills in affected programs: 54 -> 0
helped: 27
HURT: 0
total fills in shared programs: 1386 -> 796 (-42.57%)
fills in affected programs: 590 -> 0
helped: 27
HURT: 0
Fixes:
d4dccea0ba3 ("panfrost: Add UBO push data structure")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>
Alyssa Rosenzweig [Fri, 4 Mar 2022 02:04:28 +0000 (21:04 -0500)]
pan/bi: Run CSE after lowering FAU
Lowering FAU can add moves from uniforms. If a uniform is moved out to a
register mulitple times in a basic block, these moves can be CSE'd, saving
instructions at the cost of register pressure.
854 shaders in my shader-db are helped on cycle count (average 2.94% reduction
in cycles). Only 9 shaders have hurt thread count, and there is no change in
spills or fills. Overall, this seems to be a win.
Prevents instruction count regressions from the next commit.
total instructions in shared programs: 2454423 -> 2444690 (-0.40%)
instructions in affected programs: 386274 -> 376541 (-2.52%)
helped: 2105
HURT: 0
helped stats (abs) min: 1.0 max: 116.0 x̄: 4.62 x̃: 2
helped stats (rel) min: 0.04% max: 27.27% x̄: 3.64% x̃: 1.92%
95% mean confidence interval for instructions value: -4.91 -4.33
95% mean confidence interval for instructions %-change: -3.83% -3.45%
Instructions are helped.
total tuples in shared programs: 1963534 -> 1957106 (-0.33%)
tuples in affected programs: 233562 -> 227134 (-2.75%)
helped: 1491
HURT: 117
helped stats (abs) min: 1.0 max: 63.0 x̄: 4.44 x̃: 2
helped stats (rel) min: 0.04% max: 24.53% x̄: 4.39% x̃: 2.59%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.61 x̃: 1
HURT stats (rel) min: 0.18% max: 8.33% x̄: 1.44% x̃: 1.05%
95% mean confidence interval for tuples value: -4.28 -3.71
95% mean confidence interval for tuples %-change: -4.20% -3.73%
Tuples are helped.
total clauses in shared programs: 387848 -> 387079 (-0.20%)
clauses in affected programs: 13718 -> 12949 (-5.61%)
helped: 583
HURT: 60
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.42 x̃: 1
helped stats (rel) min: 1.11% max: 25.00% x̄: 8.28% x̃: 6.67%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.86% max: 20.00% x̄: 4.58% x̃: 4.00%
95% mean confidence interval for clauses value: -1.29 -1.10
95% mean confidence interval for clauses %-change: -7.57% -6.58%
Clauses are helped.
total cycles in shared programs: 201866.21 -> 201682.92 (-0.09%)
cycles in affected programs: 6241.79 -> 6058.50 (-2.94%)
helped: 952
HURT: 98
helped stats (abs) min: 0.
04166399999999726 max: 2.625 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.12% max: 26.00% x̄: 4.05% x̃: 2.38%
HURT stats (abs) min: 0.
041665999999999315 max: 0.
16666700000000034 x̄: 0.07 x̃: 0
HURT stats (rel) min: 0.18% max: 8.70% x̄: 1.60% x̃: 1.43%
95% mean confidence interval for cycles value: -0.19 -0.16
95% mean confidence interval for cycles %-change: -3.80% -3.24%
Cycles are helped.
total arith in shared programs: 74924.00 -> 74660.12 (-0.35%)
arith in affected programs: 9303.67 -> 9039.79 (-2.84%)
helped: 1513
HURT: 118
helped stats (abs) min: 0.
04166399999999726 max: 2.625 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.07% max: 33.33% x̄: 4.68% x̃: 2.67%
HURT stats (abs) min: 0.
041665999999999315 max: 0.
16666800000000137 x̄: 0.07 x̃: 0
HURT stats (rel) min: 0.18% max: 8.70% x̄: 1.55% x̃: 1.37%
95% mean confidence interval for arith value: -0.17 -0.15
95% mean confidence interval for arith %-change: -4.48% -3.98%
Arith are helped.
total quadwords in shared programs: 1757254 -> 1751978 (-0.30%)
quadwords in affected programs: 197399 -> 192123 (-2.67%)
helped: 1464
HURT: 110
helped stats (abs) min: 1.0 max: 51.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.04% max: 21.95% x̄: 4.16% x̃: 2.52%
HURT stats (abs) min: 1.0 max: 7.0 x̄: 1.71 x̃: 1
HURT stats (rel) min: 0.21% max: 13.04% x̄: 1.65% x̃: 0.93%
95% mean confidence interval for quadwords value: -3.58 -3.13
95% mean confidence interval for quadwords %-change: -3.97% -3.53%
Quadwords are helped.
total threads in shared programs: 52899 -> 52890 (-0.02%)
threads in affected programs: 18 -> 9 (-50.00%)
helped: 0
HURT: 9
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>
Henry Goffin [Wed, 29 Dec 2021 09:20:30 +0000 (09:20 +0000)]
frontends/va: ignore incoming frame_num from VA picture parameters
The Gallium pipe video "frame_num" variable is internally used as a
counter of elapsed reference frames since the last IDR. The incoming
frame_num field from VA picture parameters is not equivalent; the VA
value may wrap to zero prematurely, as it is a 16-bit struct field with
a documented max value of 2^(log2_max_frame_num_minus4 + 4)-1.
This change improves "infinite GOP" single-client live streaming, where
it is reasonable for the server to desire an endless series of P-frames
without IDR. Without this change, it is difficult/impossible for an
application to encode a P- or B-frame after the VA frame_num field wraps
around to zero, depending on the backend encoder implementation.
This change has no effect on existing applications that always signal an
IDR frame and reset the VA frame_num to zero before it wraps around. For
example, the FFmpeg vaapi encoder ignores the VA documentation and sends
an un-wrapped VA frame_num, which results in identical computation of
the internal frame_num (as long as each GOP is less than 65536 frames).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5768
Reviewed-by: Thong Thai <thong.thai@amd.com>
patch revision 3: correctly avoid incrementing frame_num when the encoded
frame is not a reference, per h264 spec and ffmpeg behavior
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14332>
Rhys Perry [Wed, 2 Mar 2022 13:32:42 +0000 (13:32 +0000)]
aco: rework removal of jumps over branches
Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.
Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.
fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes:
f030b75b7d2 ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214>
Samuel Pitoiset [Thu, 3 Mar 2022 12:08:17 +0000 (13:08 +0100)]
ac/nir: implement nir_op_pack_{uint,sint}_2x16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
Samuel Pitoiset [Thu, 3 Mar 2022 07:53:06 +0000 (08:53 +0100)]
aco: implement nir_op_pack_{uint,sint}_2x16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
Samuel Pitoiset [Thu, 3 Mar 2022 07:37:29 +0000 (08:37 +0100)]
nir: introduce nir_pack_{sint,uint}_2x16 instructions
These instructions have AMD hardware equivalent and they will be used
to lower fragment shader outputs in NIR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
Xiaohui Gu [Mon, 25 Oct 2021 05:58:03 +0000 (13:58 +0800)]
iris: Mark a dirty update when vs_needs_sgvs_element value changed
Add vs_needs_sgvs_element value check when updating vertex
element dirty state in iris_update_compiled_vs to solve
render error of Android game "Genshin Impact".
Signed-off-by: Xiaohui Gu <xiaohui.gu@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15142>
Yiwei Zhang [Tue, 1 Mar 2022 19:11:29 +0000 (19:11 +0000)]
venus: add VK_EXT_image_robustness support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
Yiwei Zhang [Tue, 1 Mar 2022 18:50:40 +0000 (18:50 +0000)]
venus: add VK_EXT_provoking_vertex support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
Yiwei Zhang [Tue, 1 Mar 2022 18:40:31 +0000 (18:40 +0000)]
venus: add VK_EXT_line_rasterization support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
Yiwei Zhang [Tue, 1 Mar 2022 18:30:36 +0000 (18:30 +0000)]
venus: update to latest venus protocol
Added the below extension support:
- VK_EXT_line_rasterization
- VK_EXT_provoking_vertex
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
Yiwei Zhang [Thu, 3 Mar 2022 22:28:03 +0000 (22:28 +0000)]
venus: group extensions promoted to 1.3
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
Yiwei Zhang [Thu, 3 Mar 2022 22:23:23 +0000 (22:23 +0000)]
venus: clean up physical device features and properties
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>
Daniel Schürmann [Thu, 19 Aug 2021 08:46:09 +0000 (10:46 +0200)]
nir/opt_shrink_vectors: update docstring
in order to reflect the various recent improvements.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
Daniel Schürmann [Tue, 17 Aug 2021 12:01:36 +0000 (14:01 +0200)]
nir/opt_shrink_vectors: remove duplicate components from vecN
vecN instructions which are only used by other ALU
will now get duplicate channels removed.
i915g:
total instructions in shared programs: 396309 -> 396294 (<.01%)
instructions in affected programs: 186 -> 171 (-8.06%)
r300:
total instructions in shared programs: 1165059 -> 1164354 (-0.06%)
instructions in affected programs: 35884 -> 35179 (-1.96%)
total temps in shared programs: 165497 -> 165326 (-0.10%)
temps in affected programs: 2990 -> 2819 (-5.72%)
softpipe:
total instructions in shared programs: 2860028 -> 2859084 (-0.03%)
instructions in affected programs: 55539 -> 54595 (-1.70%)
total temps in shared programs: 516939 -> 516546 (-0.08%)
temps in affected programs: 6623 -> 6230 (-5.93%)
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
Daniel Schürmann [Tue, 17 Aug 2021 11:23:22 +0000 (13:23 +0200)]
nir/opt_shrink_vectors: shrink load_const properly
This patch enables removal of arbitrary channels in
load_const instructions, if they are either unused or
duplicates of other channels and only used by ALU.
Totals from 692 (0.51% of 134913) affected shaders: (GFX10.3)
VGPRs: 21832 -> 21544 (-1.32%)
CodeSize: 1322016 -> 1313080 (-0.68%); split: -0.68%, +0.01%
Instrs: 243635 -> 242231 (-0.58%); split: -0.58%, +0.00%
Latency: 1856138 -> 1857237 (+0.06%); split: -0.09%, +0.15%
InvThroughput: 424298 -> 421671 (-0.62%); split: -0.62%, +0.01%
VClause: 4580 -> 4583 (+0.07%); split: -0.02%, +0.09%
SClause: 14336 -> 14354 (+0.13%); split: -0.04%, +0.17%
Copies: 8897 -> 8859 (-0.43%); split: -0.45%, +0.02%
PreSGPRs: 20439 -> 20437 (-0.01%)
PreVGPRs: 16011 -> 15907 (-0.65%); split: -0.97%, +0.32%
i915g:
total instructions in shared programs: 396471 -> 396309 (-0.04%)
instructions in affected programs: 6408 -> 6246 (-2.53%)
total const in shared programs: 56458 -> 56422 (-0.06%)
const in affected programs: 407 -> 371 (-8.85%)
LOST: shaders/closed/steam/trine-2/fp-3.shader_test FS
r300:
total instructions in shared programs: 1164421 -> 1165059 (0.05%)
instructions in affected programs: 143981 -> 144619 (0.44%)
total temps in shared programs: 165488 -> 165497 (<.01%)
temps in affected programs: 318 -> 327 (2.83%)
total consts in shared programs: 922140 -> 921952 (-0.02%)
consts in affected programs: 12438 -> 12250 (-1.51%)
softpipe:
total instructions in shared programs: 2859978 -> 2860028 (<.01%)
instructions in affected programs: 183355 -> 183405 (0.03%)
total temps in shared programs: 517071 -> 516939 (-0.03%)
temps in affected programs: 1416 -> 1284 (-9.32%)
total imm in shared programs: 103601 -> 102767 (-0.81%)
imm in affected programs: 3928 -> 3094 (-21.23%)
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>
Dave Airlie [Thu, 3 Mar 2022 05:37:25 +0000 (15:37 +1000)]
crocus: change the line width workaround for gfx4/5
This fixes piglit line-flat-clip-color and the hud fps counter.
Fixes:
6b7a68b7c21e ("crocus: add missing line smooth bits.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15229>
Chia-I Wu [Mon, 28 Feb 2022 22:28:36 +0000 (14:28 -0800)]
venus: abort when stuck
This gives
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 4096
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 8192
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 12288
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 16384
MESA-VIRTIO: debug: aborting
Aborted
which should be more friendly than printing the messages forever.
On my i7-7820HQ, this aborts after roughly 4+8+16+32=60 seconds
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15200>
Daniel Schürmann [Wed, 2 Mar 2022 23:52:06 +0000 (00:52 +0100)]
aco/ra: don't immediately assign a register for p_branch
These get now assigned after handling phis.
Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3)
CodeSize: 5519744 -> 5515308 (-0.08%)
Instrs: 1063045 -> 1061936 (-0.10%)
Latency:
11880452 ->
11875904 (-0.04%)
InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00%
Copies: 86908 -> 85799 (-1.28%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Rhys Perry [Tue, 27 Jul 2021 10:24:53 +0000 (11:24 +0100)]
aco/tests: add test for branch definition RA
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Rhys Perry [Fri, 10 Sep 2021 17:20:03 +0000 (18:20 +0100)]
aco: fix branch definition validation
Like how they have to be register allocated differently, branch
definitions at merge block predecessors need to be validated differently.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Rhys Perry [Wed, 20 Oct 2021 08:11:51 +0000 (09:11 +0100)]
aco: add validate_instr_defs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Rhys Perry [Wed, 28 Jul 2021 15:30:57 +0000 (16:30 +0100)]
aco/ra: fix register allocation of branch definitions
fossil-db (Sienna Cichlid):
Totals from 704 (0.52% of 134913) affected shaders:
CodeSize: 7177288 -> 7182072 (+0.07%); split: -0.00%, +0.07%
Instrs: 1371781 -> 1372977 (+0.09%); split: -0.00%, +0.09%
Latency:
17993572 ->
18001344 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 4198996 -> 4199569 (+0.01%); split: -0.00%, +0.01%
Copies: 122456 -> 123516 (+0.87%); split: -0.01%, +0.88%
Branches: 43815 -> 43818 (+0.01%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Rhys Perry [Wed, 28 Jul 2021 15:20:07 +0000 (16:20 +0100)]
aco/ra: add get_reg_phi() helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Rhys Perry [Wed, 28 Jul 2021 14:11:03 +0000 (15:11 +0100)]
aco: remove vcc hint from branch definitions
This doesn't seem to have much benefit anymore.
fossil-db (Sienna Cichlid):
Totals from 198 (0.15% of 134913) affected shaders:
CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02%
Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03%
Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00%
SClause: 14760 -> 14722 (-0.26%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
Pavel Ondračka [Thu, 3 Mar 2022 11:59:00 +0000 (12:59 +0100)]
r300: schedule TEX instructions before OUT instructions
NIR-to-TGSI produces partial output writes contrary to the old paths
that always wrote the full outputs. Therefore if there is now a partial
output write ready to be scheduled and nothing else besides a tex
is ready, we would schedule the output write first. This was not a
problem before as usually at last some component of the full output write
depended on the tex result.
This is not optimal from the performance point of view and resulted in
~20% slowdown in the Unigine demos. The docs say:
The first OUTPUT instruction will reserve space in the output register
fifo. This space is limited, therefore issuing an OUTPUT earlier than
necessary may cause threads to stall earlier than necessary. You
should not set an ALU instruction as type OUTPUT unless it is actually
writing to an output register, or it is the last instruction of
the program.
Fix it by explicitly prefering a TEX before OUT and restore the
performance: 9.66 -> 12.12 fps (as compared to 11.83 with the old
glsl-to-TGSI path) in Unigine Sanctuary. No change in Lightsmark or
GLmark.
This is also a win from the intructions point of view as we are usually
able to schedule the partial output writes in a single pair at the end.
total instructions in shared programs: 106009 -> 105891 (-0.11%)
instructions in affected programs: 10153 -> 10035 (-1.16%)
helped: 118
HURT: 0
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5840
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>
Pavel Ondračka [Thu, 3 Mar 2022 11:24:37 +0000 (12:24 +0100)]
r300: remove some dead logic in tex pair scheduling
The max_score == -1 condition is already before so this
will never trigger. Its unclear what was the intention anyway. Now we
emit either:
- if we have accumulated enough tex intructions for a full block
- if we have nothing else to emit
- or if we can emit all remaining tex instructions already.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>
Igor Torrente [Mon, 28 Feb 2022 11:59:41 +0000 (08:59 -0300)]
Venus: Add `vn_physical_device_{features, properties}` for better organization
New extensions properties/feature are being put in the `vn_physical_device`
which is not ideal from an organization point of view.
Here the `vn_physical_device_{features,properties}` are two new struct to
help the `vn_physical_device` organzation.
Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15170>
Ilia Mirkin [Mon, 29 Nov 2021 06:44:32 +0000 (01:44 -0500)]
freedreno/a4xx: fix integer tg4
Something is slightly off in the integer values returned. It passes many
tests without the fixup, but the dEQP-GLES31 tests complain. The blob
ends up doing 3x gathers, and selects between them based on getinfo
results. Since we already have a per-sampler key with some spare bits,
just stick the bit-size info in there. And we can derive signedness from
the associated type info.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
Ilia Mirkin [Sun, 14 Nov 2021 09:38:04 +0000 (04:38 -0500)]
freedreno/a4xx: add swizzles to shader keys for tg4 workaround
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
Ilia Mirkin [Sun, 23 Jan 2022 17:14:43 +0000 (12:14 -0500)]
freedreno/a4xx: move tex_type to header
This will be used in several places. Factor it out for common use.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
Ilia Mirkin [Sun, 14 Nov 2021 06:01:47 +0000 (01:01 -0500)]
nir: remove bogus logic to allow cube + offset to work
This was done for an a4xx hack which is now removed. No API allows cube
texturing to have offsets.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
Ilia Mirkin [Sun, 14 Nov 2021 05:58:30 +0000 (00:58 -0500)]
freedreno/ir3: remove bogus tg4 -> tex lowering pass
It can't be done. This just provides bad results. The blob had a
comparable approach where they fixed up coordinates, but that also can't
work with a separate texture definition with nearest filtering. By then,
might as well provide a unswizzled variant instead, and using native
functionality.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>
Alex Xu (Hello71) [Wed, 24 Nov 2021 23:01:29 +0000 (18:01 -0500)]
r300/compiler/tests: print regoff_t as size_t
fixes compilation on musl
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13949>
Samuel Pitoiset [Wed, 2 Mar 2022 14:47:03 +0000 (15:47 +0100)]
radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16
v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on
all generations.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215>
Michel Zou [Fri, 18 Feb 2022 20:03:33 +0000 (21:03 +0100)]
vulkan/wsi: drop unused wsi_create_win32_image
fixes:
ed391d2a
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15088>
Andrii Simiklit [Fri, 31 Jul 2020 11:53:25 +0000 (14:53 +0300)]
glsl: add member's location layout qualifier rules for `arrayed` in/out blocks
From Section 4.4.1 (Input Layout Qualifiers) of the GLSL 4.50 spec:
"For some blocks declared as arrays, the location can only be applied
at the block level: When a block is declared as an array where
additional locations are needed for each member for each block array
element, it is a compile-time error to specify locations on the block
members. That is, when locations would be under specified by applying
them on block members, they are not allowed on block members. For
arrayed interfaces (those generally having an extra level of
arrayness due to interface expansion), the outer array is stripped
before applying this rule"
From Section 1.2.1 (Changes from Revision 6 of GLSL Version) of the GLSL 4.50 spec:
"Private Bug 15678: Don’t allow location = on block members where
the block needs an array of locations"
From Section 4.4.1 (Input Layout Qualifiers) of the GLSL ES 3.20 spec
"If an input is declared as an array of blocks, excluding per-vertex-arrays
as required for tessellation, it is an error to declare a member of
the block with a location qualifier"
From Section 1.1.3 (Changes from GLSL ES 3.2 revision 3) of the GLSL ES 3.20 spec:
"Arrayed blocks cannot have layout location qualifiers on members"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11522>
Mike Blumenkrantz [Thu, 3 Mar 2022 05:07:45 +0000 (00:07 -0500)]
zink: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
Mike Blumenkrantz [Wed, 2 Mar 2022 21:20:49 +0000 (16:20 -0500)]
zink: split primitives generated queries if xfb/gs states change
if one of these states change then it affects which result needs to be
used for that query, so split it up over multiple query ids to make sure
the correct result is obtained
fixes (lavapipe):
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_pause_resume
GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
Mike Blumenkrantz [Wed, 2 Mar 2022 21:20:30 +0000 (16:20 -0500)]
zink: split out query suspending into util function
no functional changes
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
Mike Blumenkrantz [Wed, 2 Mar 2022 21:19:46 +0000 (16:19 -0500)]
zink: update query states before starting renderpass during draw
this gives some leeway for doing transfer ops without crashing the renderpass
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>
Ilia Mirkin [Wed, 2 Mar 2022 04:38:31 +0000 (23:38 -0500)]
nvc0: disable EXT_texture_sRGB_RG8
Looks like the green component doesn't get srgb-decoding, and no obvious
way to force it. It works fine on nv50 though.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>
Ilia Mirkin [Wed, 2 Mar 2022 04:26:55 +0000 (23:26 -0500)]
mesa: enable GL_EXT_texture_sRGB_RG8 on desktop
Looks like an extension number was assigned in late 2020. This makes it
possible to hook up this format to teximage-colors without teaching it
about ES.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>
Mike Blumenkrantz [Wed, 9 Feb 2022 12:59:46 +0000 (07:59 -0500)]
zink: remove loop from generated tcs
this is already using per-vertex io, no need to add conditionals to verify
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15225>
Rob Clark [Thu, 3 Mar 2022 01:11:10 +0000 (17:11 -0800)]
freedreno/registers: Add a couple regs we need for kernel
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>
Dave Airlie [Fri, 25 Feb 2022 00:00:25 +0000 (10:00 +1000)]
gallivm/llvmpipe: add support for NIR to the linear/aos paths.
When the AOS/linear code was added it only worked with TGSI which
meant nothing in mesa upstream was really using it.
This adds support to analyse NIR shaders, and adds aos support
to the backend.
AOS support is limited to mov,vec,fmul,tex sampling in order to
accelerate mostly compositing operations. I've tested weston uses
the fast path. gnome-shell can't use it yet as we can't optimise
the depth test paths.
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>
Dave Airlie [Thu, 24 Feb 2022 23:05:11 +0000 (09:05 +1000)]
gallivm/nir: split load_const out into backend helper.
This just makes adding aos support easier.
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>
Dave Airlie [Tue, 22 Feb 2022 23:26:05 +0000 (09:26 +1000)]
llvmpipe/linear: fix disk caching.
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>
Mike Blumenkrantz [Wed, 2 Mar 2022 18:36:00 +0000 (13:36 -0500)]
zink: switch to u_foreach_bit for ntv image access decorations
no functional changes
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15217>
Mike Blumenkrantz [Wed, 2 Mar 2022 18:36:21 +0000 (13:36 -0500)]
zink: emit Aliased decorations for any image that isn't explicitly marked restrict
these might be aliased
fixes:
arb_shader_image_load_store-restrict
fixes #6090
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15217>
Mike Blumenkrantz [Thu, 3 Mar 2022 00:55:07 +0000 (19:55 -0500)]
zink: remove a bunch of flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>
Dave Airlie [Thu, 3 Mar 2022 00:14:48 +0000 (10:14 +1000)]
lavapipe: always set read/write on ssbo/images.
This fixes a regressions with overlap in llvmpipe, this is pessimistic
we should write code to make it work properly.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15219>
Alyssa Rosenzweig [Fri, 25 Feb 2022 19:53:39 +0000 (14:53 -0500)]
pan/bi: Add arithmetic flag to RSHIFT ops
Models ops like ARSHIFT_OR.i32 on Valhall without adding piles of new
instructions to the IR.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Fri, 28 Jan 2022 23:15:47 +0000 (18:15 -0500)]
pan/bi: Extend LD_TILE with a register format
Required for Valhall. NIR has the information anyway, pass it along.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Tue, 1 Mar 2022 20:14:13 +0000 (15:14 -0500)]
pan/bi: Add BRANCHZI instruction
Technically this is just JUMP on Valhall, but the semantic is an indirect branch
based on comparing with zero. It can also be used as a conservative branch (like
BRANCHC), but this isn't modeled.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Fri, 21 Jan 2022 19:03:35 +0000 (14:03 -0500)]
pan/bi: Model LD_BUFFER instructions
We'll use these to read from UBOs on Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Sun, 12 Dec 2021 21:17:14 +0000 (16:17 -0500)]
pan/bi: Model offset for LOAD/STORE
Needed to model the immediate offset on Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 8 Dec 2021 00:07:43 +0000 (19:07 -0500)]
pan/bi: Model pos/vary segments in STORE instructions
For Bifrost, we model load/store segments, for example for thread local storage.
We need something similar on Valhall -- access modifiers. There are four access
modifiers on Valhall, controlling memory subsystem optimizations for the access:
none: Nothing may be assumed. Corresponds to "global".
istream: Internally streaming within the GPU. Corresponds to "pos", as it's
used for position stores.
estream: Externally streaming outside the GPU. Corresponds to "vary", as it's
used for varying stores.
force: Force access in discarded threads. Corresponds to "tl", as it's required
for correct behaviour of helper invocations that use the stack.
If these access modifiers end up being useful outside these fixed purposes, we
may need to rework this part of the IR. For now, this should suffice.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 8 Dec 2021 00:04:25 +0000 (19:04 -0500)]
pan/bi: Model LEA_BUF_IMM in the IR
Required for varying stores in malloced IDVS jobs on Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Tue, 7 Dec 2021 23:55:57 +0000 (18:55 -0500)]
pan/bi: Add LD_VAR_BUF_IMM.f16/f32 instructions
For use on Valhall with memory-allocated IDVS jobs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Mon, 6 Dec 2021 22:55:10 +0000 (17:55 -0500)]
pan/bi: Generalize I->table for Valhall
Can be reused for resource tables in a natural way.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Thu, 29 Jul 2021 20:58:13 +0000 (16:58 -0400)]
pan/bi: Extend BLEND to take a register format
Needed on Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:03:54 +0000 (10:03 -0500)]
pan/bi: Model Valhall texture instructions
These act like a TEXC+immediate.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Thu, 3 Mar 2022 00:00:43 +0000 (19:00 -0500)]
pan/va: Add memory access modifier to LOADs
Might be required for correct spilling in some circumstances.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 23:58:39 +0000 (18:58 -0500)]
pan/va: Remap "store segment" to "memory access"
For now, the difference does not matter. However it's better to model the actual
hardware behaviour, rather than isomorphic driver behaviour, when we can do so.
So fix the names.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 16:19:41 +0000 (11:19 -0500)]
pan/va: Fix LEA_BUF_IMM definition
Technically the table is folded, too; the 0xD refers to table 61. But this
instruction is more general than previously thought.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 16:06:55 +0000 (11:06 -0500)]
pan/va: Fix definitions of LD_VAR_BUF_IMM
So close! However, LD_VAR_IMM is something else -- Bifrost-style varying
interpolation, without a hardware buffer. For ES3, we'll need to support both.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:36:57 +0000 (10:36 -0500)]
pan/va: Add TEX_GATHER instruction
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:29:28 +0000 (10:29 -0500)]
pan/va: Add TEX_DUAL instruction
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:36:43 +0000 (10:36 -0500)]
pan/va: Add modifiers required for gathers
Mostly isomorphic to Bifrost-style gathers.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Alyssa Rosenzweig [Wed, 2 Mar 2022 15:42:47 +0000 (10:42 -0500)]
pan/va: Handle force_enum differing from name
Needed for secondary register width, for dual texturing.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15216>
Ian Romanick [Wed, 2 Mar 2022 00:30:31 +0000 (16:30 -0800)]
i915g: Emit better code for SEQ(x, 0) and SNE(x, 0)
total instructions in shared programs: 789000 -> 788481 (-0.07%)
instructions in affected programs: 16179 -> 15660 (-3.21%)
helped: 157
HURT: 0
helped stats (abs) min: 3 max: 12 x̄: 3.31 x̃: 3
helped stats (rel) min: 1.56% max: 14.29% x̄: 4.24% x̃: 2.56%
95% mean confidence interval for instructions value: -3.51 -3.10
95% mean confidence interval for instructions %-change: -4.70% -3.78%
Instructions are helped.
LOST: 0
GAINED: 3
v2: Drop setting src1 to zero. Suggested by Emma.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
Ian Romanick [Tue, 1 Mar 2022 23:48:51 +0000 (15:48 -0800)]
i915g: Handle constants composed exclusively of 0 or ±1 specially
This can avoid some cases where a constant has to be loaded into a
temporary register.
v2: Update i915-g33-fails.txt.
total instructions in shared programs: 788625 -> 782376 (-0.79%)
instructions in affected programs: 166269 -> 160020 (-3.76%)
helped: 1578
HURT: 0
helped stats (abs) min: 3 max: 21 x̄: 3.96 x̃: 3
helped stats (rel) min: 1.56% max: 33.33% x̄: 4.82% x̃: 3.45%
95% mean confidence interval for instructions value: -4.06 -3.86
95% mean confidence interval for instructions %-change: -5.00% -4.64%
Instructions are helped.
LOST: 0
GAINED: 35
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
Ian Romanick [Tue, 1 Mar 2022 23:21:02 +0000 (15:21 -0800)]
nir/algebraic: Optimize some cases of (sXX(a, b) != 0.0)
I noticed the SGE case while looking at the output of
shaders/closed/steam/trine-2/fp-3.shader_test on i915g. These are
especially bad on i915 that needs two instructions to implement SNE.
An alternative would be to duplicate the sne(sXX(a, b), 0.0) rules in an
algebraic pass that occurs after bool_to_float. Doing the work earlier
seems preferable.
i915
total instructions in shared programs: 788274 -> 788223 (<.01%)
instructions in affected programs: 666 -> 615 (-7.66%)
helped: 5
HURT: 0
helped stats (abs) min: 9 max: 12 x̄: 10.20 x̃: 9
helped stats (rel) min: 5.00% max: 11.11% x̄: 8.12% x̃: 8.16%
95% mean confidence interval for instructions value: -12.24 -8.16
95% mean confidence interval for instructions %-change: -10.81% -5.43%
Instructions are helped.
LOST: 0
GAINED: 2
The two gained shaders are assembly fragment programs in Euro Truck
Simulator 2.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
Ian Romanick [Wed, 2 Mar 2022 01:23:36 +0000 (17:23 -0800)]
i915g/ci: update piglit fails
I believe these were fixed by !14573.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15210>
Emma Anholt [Thu, 3 Feb 2022 20:23:34 +0000 (12:23 -0800)]
nir: Switch to using nir_vec_scalars() for things that used nir_channel().
This should reduce follow-on optimization work to copy-propagate and
dead-code away the movs generated in construction of vectors.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
Emma Anholt [Thu, 3 Feb 2022 19:15:59 +0000 (11:15 -0800)]
nir: Add a helper for setting up a nir_ssa_scalar struct.
Trivial, but will help users avoid some struct constructions that can be
awkward in C.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
Emma Anholt [Thu, 3 Feb 2022 19:11:00 +0000 (11:11 -0800)]
nir: Introduce a nir_vec_scalars() helper using nir_ssa_scalar.
Many users of nir_vec() do so by nir_channel()-ing a new ssa defs as movs
from other vectors to put the new vector together, which then just have to
get copy-propagated into the ALU srcs and DCEed away the temporary movs.
If they instead take nir_ssa_scalar, we can avoid that extra work.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14865>
Dave Airlie [Mon, 21 Feb 2022 05:43:39 +0000 (15:43 +1000)]
vulkan/wsi: handle queue families properly for non-concurrent sharing mode.
"queueFamilyIndexCount is the number of queue families having access to the image(s) of the
swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT.
pQueueFamilyIndices is a pointer to an array of queue family indices having access to the
images(s) of the swapchain when imageSharingMode is VK_SHARING_MODE_CONCURRENT."
If the type isn't concurrent, don't attempt to access the arrays.
dEQP-VK.wsi.xlib.swapchain.create.exclusive_nonzero_queues on lavapipe.
Fixes:
5b13d74583513 ("vulkan/wsi/drm: Break create_native_image in pieces")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15101>
Emma Anholt [Mon, 28 Feb 2022 22:04:45 +0000 (14:04 -0800)]
ci/freedreno: Consolidate some information about an a630 flake.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15197>
Emma Anholt [Mon, 28 Feb 2022 21:21:16 +0000 (13:21 -0800)]
ir3: Don't assert on not finding the VS output for an FS input.
It should return undefined data, not terminate the program. Fixes some
piglit tests poking at the UB handling.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15197>
Rhys Perry [Fri, 25 Feb 2022 14:54:22 +0000 (14:54 +0000)]
radv: include disable_aniso_single_level and adjust_frag_coord_z in key
Fixes potential pipeline caching bug.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15175>
Emma Anholt [Mon, 28 Feb 2022 22:45:19 +0000 (14:45 -0800)]
freedreno: Improve robustness behavior for VBs with offset > size.
We were just emitting the bad reloc (either an assert fail on a debug
build or for a release build likely a GPU hang from the resulting fault).
Given that the GLES 3.2 spec's robust context requirement says we should
return undefined data but not terminate for element indices outside of the
VB, ignoring the offset in this case seems like a better behavior to have
in all cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
Emma Anholt [Tue, 1 Mar 2022 22:57:29 +0000 (14:57 -0800)]
freedreno: Fix start_slot handling in set_vertex_buffers.
mesa/st only ever provides a 0 for this value, but let's be ready if it
ever does something else.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
Emma Anholt [Mon, 28 Feb 2022 22:39:59 +0000 (14:39 -0800)]
freedreno: Use the resource size rather than BO size for VFD_FETCH[].SIZE.
We should be using the API size to clamp, rather than what we allocated
the BO into.
Fixes: #13
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15198>
Erik Faye-Lund [Wed, 2 Mar 2022 12:04:40 +0000 (13:04 +0100)]
docs: match build-flags markup with meson docs
In meson.rst, we document build-flags with double backticks, which puts
it inside a code-tag in the rendered HTML instead of a cite-tag like we
currently do.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15213>
Erik Faye-Lund [Wed, 2 Mar 2022 09:53:54 +0000 (10:53 +0100)]
docs: fix a broken link
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15213>
Lionel Landwerlin [Mon, 28 Feb 2022 13:13:07 +0000 (15:13 +0200)]
intel/fs: fix total_scratch computation
We only have a single prog_data::total_scratch for all shader variants
(SIMD 8, 16, 32). Therefore we should always max the total_scratch on
top of existing variant.
We probably haven't run into that issue before because we compile by
increasing SIMD size and higher SIMD size is more likely to spill. But
for bindless shaders with return shaders, if the last return part
doesn't spill, we completely ignore the previous parts' scratch
computation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>
Juan A. Suarez Romero [Fri, 25 Feb 2022 16:47:44 +0000 (17:47 +0100)]
v3d: enable texture filtering anisotropic
Seems we already had implemented this feature (see commit
521e1d0275e
"broadcom/vc5: Add support for anisotropic filtering"), but we didn't
enable the proper capability.
Also update the maximum level of anistropy supported.
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4201
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15180>
Caio Oliveira [Tue, 1 Mar 2022 19:29:41 +0000 (11:29 -0800)]
intel/compiler: Use pass helper in brw_nir_adjust_offset_for_arrayed_indices
Also change the code to preserve certain metadata: control flow is not changed
so both block indices and dominance information is preserved.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15206>
Iago Toral Quiroga [Fri, 25 Feb 2022 11:07:05 +0000 (12:07 +0100)]
broadcom/compiler: simplify node/temp translation during register allocation
Now that we don't sort our nodes we can arrange them so we can
easily translate between nodes and temps without a mapping table,
just applying an offset.
To do this we have a single array of nodes where twe put first the nodes
for accumulators and then the nodes for temps. With this setup we can
ensure that for any given temp T, its node is always T + ACC_COUNT.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
Iago Toral Quiroga [Fri, 25 Feb 2022 08:18:34 +0000 (09:18 +0100)]
broadcom/compiler: don't sort nodes for register allocation
Nodes are allocated in order to registers so initially sorting
was used to ensure that nodes with smaller life ranges would
be assigned first and therefore be more likely to get
accumulators.
However, since
d81a6e5f1d now we don't rely on order to make
decisions about accumulators and instead we make policy decisions
based on actual liveness, so sorting is no longer strictly
relevant to this decision.
Furthermore, we are not re-sorting nodes after each spill either,
since that would probably require that we rebuild the interference
graph after each spill (the graph identifies nodes by their index).
Shader-db results show a significant improvement in instruction
counts, due to more optimal accumulator assignments. The reason for
this is that we use a round-robin policy for choosing the next
accumulator to assign. The idea behind this is preventing nearby
temps to be assigned to the same accumulator so that QPU scheduling
is more flexible, but if we sort our nodes, we are basically not
assigning temps in program order any more and the round-robin policy
becomes less effective:
total instructions in shared programs:
13000420 ->
12663189 (-2.59%)
instructions in affected programs:
11791267 ->
11454036 (-2.86%)
helped: 62890
HURT: 19987
total threads in shared programs: 415874 -> 415870 (<.01%)
threads in affected programs: 20 -> 16 (-20.00%)
helped: 2
HURT: 4
total uniforms in shared programs: 3711652 -> 3711624 (<.01%)
uniforms in affected programs: 43430 -> 43402 (-0.06%)
helped: 134
HURT: 173
total max-temps in shared programs: 2144876 -> 2138822 (-0.28%)
max-temps in affected programs: 123334 -> 117280 (-4.91%)
helped: 4112
HURT: 1195
total spills in shared programs: 3870 -> 3860 (-0.26%)
spills in affected programs: 1013 -> 1003 (-0.99%)
helped: 14
HURT: 12
total fills in shared programs: 5560 -> 5573 (0.23%)
fills in affected programs: 1765 -> 1778 (0.74%)
helped: 14
HURT: 17
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
Iago Toral Quiroga [Wed, 23 Feb 2022 08:58:33 +0000 (09:58 +0100)]
broadcom/compiler: sink uniform loads
total instructions in shared programs:
13014428 ->
13000420 (-0.11%)
instructions in affected programs: 743624 -> 729616 (-1.88%)
helped: 1392
HURT: 611
total threads in shared programs: 415858 -> 415874 (<.01%)
threads in affected programs: 16 -> 32 (100.00%)
helped: 8
HURT: 0
total uniforms in shared programs: 3720410 -> 3711652 (-0.24%)
uniforms in affected programs: 113442 -> 104684 (-7.72%)
helped: 635
HURT: 29
total max-temps in shared programs: 2154268 -> 2144876 (-0.44%)
max-temps in affected programs: 61279 -> 51887 (-15.33%)
helped: 1124
HURT: 187
total spills in shared programs: 4002 -> 3870 (-3.30%)
spills in affected programs: 265 -> 133 (-49.81%)
helped: 6
HURT: 0
total fills in shared programs: 5788 -> 5560 (-3.94%)
fills in affected programs: 603 -> 375 (-37.81%)
helped: 6
HURT: 0
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
Iago Toral Quiroga [Mon, 21 Feb 2022 12:03:54 +0000 (13:03 +0100)]
broadcom/compiler: move constants before their first user
For us they are basically uniforms too so we want to make their
lifespans short to facilitate allocating them to accumulators.
total instructions in shared programs:
13043585 ->
13015385 (-0.22%)
instructions in affected programs: 8326040 -> 8297840 (-0.34%)
helped: 24939
HURT: 19894
total threads in shared programs: 415860 -> 415858 (<.01%)
threads in affected programs: 4 -> 2 (-50.00%)
helped: 0
HURT: 1
total uniforms in shared programs: 3721953 -> 3720451 (-0.04%)
uniforms in affected programs: 96134 -> 94632 (-1.56%)
helped: 744
HURT: 435
total max-temps in shared programs: 2173431 -> 2154260 (-0.88%)
max-temps in affected programs: 264598 -> 245427 (-7.25%)
helped: 10858
HURT: 841
total spills in shared programs: 4005 -> 4010 (0.12%)
spills in affected programs: 700 -> 705 (0.71%)
helped: 5
HURT: 10
total fills in shared programs: 5801 -> 5817 (0.28%)
fills in affected programs: 1346 -> 1362 (1.19%)
helped: 6
HURT: 11
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>