Qiang Yu [Sat, 2 Apr 2022 09:12:20 +0000 (17:12 +0800)]
mesa/st: implement hardware accelerated GL_SELECT
Use an internal geometry shader to handle input primitives. Do full
accurate culling and clipping in the shader and output hit result and
min/max depth to a SSBO for final being written to select buffer.
With multiple result slots in SSBO we can left multiple draws on the
fly and wait them done when buffer is full or exit GL_SELECT mode.
This provides quicker selection response compared to software based
solution. Tested on Discovery Studio 2020: some complex model needs
1~2s selection response time originally, now it's almost selected
immidiately.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Thu, 17 Mar 2022 03:23:22 +0000 (11:23 +0800)]
mesa: pass select result buffer offset as attribute/varying
Will be used by geometry shader to store hit result.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Fri, 13 May 2022 09:34:14 +0000 (17:34 +0800)]
mesa: add HWSelectModeBeginEnd dispatch table
Used when in glBegin/End section and HW GL_RENDER mode.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Sat, 14 May 2022 03:34:41 +0000 (11:34 +0800)]
mesa: set CurrentServerDispatch too when glBegin/End
When glthread not enabled, CurrentClientDispatch and CurrentServerDispatch
should be same. This does not cause problems before because OutsideBeginEnd
and BeginEnd have same BeginEnd entries, so when
CurrentServerDispatch==OutsideBeginEnd
CurrentClientDispatch==BeginEnd
will call into same BeginEnd _mesa_* functions.
But we'll add another dispatch table to replace BeginEnd when HW GL_SELECT
mode, so this needs to be fixed. Otherwise some function like _mesa_Rectf
which always call with CurrentServerDispatch will go into wrong entries.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Fri, 13 May 2022 13:11:47 +0000 (21:11 +0800)]
mapi: add api setup header for hw select mode
Used by GL_SELECT mode dispatch table setup.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Fri, 13 May 2022 11:50:04 +0000 (19:50 +0800)]
mesa/vbo: enclose none-vertex functions with HW_SELECT_MODE
For constructing dispatch table used in GL_SELECT mode. Every vertex
inserted need to also insert a name stack offset attribute.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Wed, 9 Mar 2022 07:03:01 +0000 (15:03 +0800)]
mesa: add hw select name stack code path
HW code path will not flush vertex whenever name stack change.
It will save the current name stack and write to select buffer
only when no space left or exit select mode.
This let us submit multi draws from different name stack at
once instead of submit draws for a single name stack then
wait it finish before submit next one.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Wed, 9 Mar 2022 06:33:22 +0000 (14:33 +0800)]
mesa: refine name stack code to prepare for hw select
No functional change, just pack existing software based implementation into
the HardwareAcceleratedSelect switch, will add hardware implementation in
next commit.
ctx->Select.NameStackDepth is sure to be <=MAX_NAME_STACK_DEPTH, so removed
the overflow check in _mesa_LoadName.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Sgined-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Wed, 9 Mar 2022 03:34:57 +0000 (11:34 +0800)]
mesa: add _mesa_bufferobj_get_subdata
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Mon, 7 Mar 2022 07:37:03 +0000 (15:37 +0800)]
mesa: add hardware accelerated select constant
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Sat, 19 Mar 2022 13:05:33 +0000 (21:05 +0800)]
nir/builder: add load/store array variable helper functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Mon, 14 Mar 2022 07:11:34 +0000 (15:11 +0800)]
mesa/vbo: remove unused vbo_context->binding
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Qiang Yu [Thu, 24 Mar 2022 03:15:58 +0000 (11:15 +0800)]
mesa/program: fix nir output reg overflow
outputs_written is uint64_t, should count max reg number
by util_last_bit64(). Otherwise the following access will
overflow the allocated array with a smaller size.
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765>
Alyssa Rosenzweig [Thu, 2 Jun 2022 18:50:39 +0000 (14:50 -0400)]
pan/va: Unit test constant lowering pass
Like other optimizations, breaking this pass may not affect functional
correctness. It's also dead simple to unit test the pass, so we have no excuse
not to. Add unit tests for the functionality we currently support, since we just
extended it and want to make sure everything still works.
This includes tests for use of modifiers to get more small constants. There are
lots of subtle gotchas there, so let's add lots of unit tests to make sure we
got it right.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
Alyssa Rosenzweig [Thu, 2 Jun 2022 23:07:49 +0000 (19:07 -0400)]
pan/va: Try widening small constants
Many small integers are availabled as small constants, but the table of small
constants is tightly packed. Zero and sign extensions are usually required to
access small integers. When packing constants, try zero/sign extension for
unsigned/signed integer instructions respectively.
total instructions in shared programs: 2716912 -> 2707795 (-0.34%)
instructions in affected programs: 1045609 -> 1036492 (-0.87%)
helped: 4460
HURT: 125
helped stats (abs) min: 1.0 max: 58.0 x̄: 2.14 x̃: 1
helped stats (rel) min: 0.14% max: 23.85% x̄: 1.35% x̃: 0.88%
HURT stats (abs) min: 1.0 max: 68.0 x̄: 3.41 x̃: 1
HURT stats (rel) min: 0.34% max: 3.88% x̄: 0.93% x̃: 0.70%
95% mean confidence interval for instructions value: -2.09 -1.89
95% mean confidence interval for instructions %-change: -1.33% -1.25%
Instructions are helped.
total cycles in shared programs: 141984.06 -> 141932.42 (-0.04%)
cycles in affected programs: 552.08 -> 500.44 (-9.35%)
helped: 18
HURT: 0
helped stats (abs) min: 0.015625 max: 11.0 x̄: 2.87 x̃: 0
helped stats (rel) min: 0.50% max: 19.64% x̄: 5.36% x̃: 1.53%
95% mean confidence interval for cycles value: -5.17 -0.56
95% mean confidence interval for cycles %-change: -9.28% -1.44%
Cycles are helped.
total cvt in shared programs: 13805.05 -> 13663.39 (-1.03%)
cvt in affected programs: 6127.45 -> 5985.80 (-2.31%)
helped: 4460
HURT: 125
helped stats (abs) min: 0.015625 max: 0.90625 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.35% max: 50.00% x̄: 5.19% x̃: 4.00%
HURT stats (abs) min: 0.015625 max: 1.0625 x̄: 0.05 x̃: 0
HURT stats (rel) min: 0.77% max: 9.30% x̄: 3.40% x̃: 2.78%
95% mean confidence interval for cvt value: -0.03 -0.03
95% mean confidence interval for cvt %-change: -5.10% -4.81%
Cvt are helped.
total ls in shared programs: 129545 -> 129494 (-0.04%)
ls in affected programs: 495 -> 444 (-10.30%)
helped: 6
HURT: 0
helped stats (abs) min: 2.0 max: 11.0 x̄: 8.50 x̃: 11
helped stats (rel) min: 1.49% max: 19.64% x̄: 13.95% x̃: 19.64%
95% mean confidence interval for ls value: -12.68 -4.32
95% mean confidence interval for ls %-change: -23.23% -4.67%
Ls are helped.
total quadwords in shared programs: 1476416 -> 1469824 (-0.45%)
quadwords in affected programs: 121208 -> 114616 (-5.44%)
helped: 820
HURT: 16
helped stats (abs) min: 8.0 max: 32.0 x̄: 8.28 x̃: 8
helped stats (rel) min: 1.39% max: 50.00% x̄: 11.00% x̃: 10.00%
HURT stats (abs) min: 8.0 max: 32.0 x̄: 12.50 x̃: 8
HURT stats (rel) min: 1.38% max: 10.00% x̄: 6.19% x̃: 7.14%
95% mean confidence interval for quadwords value: -8.14 -7.63
95% mean confidence interval for quadwords %-change: -11.20% -10.15%
Quadwords are helped.
total threads in shared programs: 53633 -> 53663 (0.06%)
threads in affected programs: 39 -> 69 (76.92%)
helped: 33
HURT: 3
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.64 1.02
95% mean confidence interval for threads %-change: 73.27% 101.73%
Threads are helped.
total spills in shared programs: 154 -> 103 (-33.12%)
spills in affected programs: 75 -> 24 (-68.00%)
helped: 6
HURT: 0
total fills in shared programs: 656 -> 656 (0.00%)
fills in affected programs: 148 -> 148 (0.00%)
helped: 2
HURT: 4
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
Alyssa Rosenzweig [Thu, 2 Jun 2022 19:10:09 +0000 (15:10 -0400)]
pan/va: Try negating small constants when lowering
If a constant is used with a floating point instruction with a floating-point
negate modifier, we can use the modifier to negate constants in the table for
free. Each floating point in the table is positive, so this is required for
negative small constants.
total instructions in shared programs: 2728438 -> 2716912 (-0.42%)
instructions in affected programs: 1418220 -> 1406694 (-0.81%)
helped: 6053
HURT: 94
helped stats (abs) min: 1.0 max: 43.0 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.06% max: 18.18% x̄: 1.34% x̃: 0.84%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 2.34 x̃: 2
HURT stats (rel) min: 0.09% max: 21.43% x̄: 1.87% x̃: 0.91%
95% mean confidence interval for instructions value: -1.93 -1.82
95% mean confidence interval for instructions %-change: -1.34% -1.25%
Instructions are helped.
total cycles in shared programs: 142103 -> 141984.06 (-0.08%)
cycles in affected programs: 766.70 -> 647.77 (-15.51%)
helped: 97
HURT: 0
helped stats (abs) min: 0.015625 max: 40.0 x̄: 1.23 x̃: 0
helped stats (rel) min: 0.27% max: 41.24% x̄: 3.63% x̃: 2.08%
95% mean confidence interval for cycles value: -2.41 -0.04
95% mean confidence interval for cycles %-change: -4.68% -2.57%
Cycles are helped.
total cvt in shared programs: 13983.34 -> 13805.05 (-1.28%)
cvt in affected programs: 7952.45 -> 7774.16 (-2.24%)
helped: 6049
HURT: 98
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.25% max: 100.00% x̄: 4.74% x̃: 2.52%
HURT stats (abs) min: 0.015625 max: 0.078125 x̄: 0.04 x̃: 0
HURT stats (rel) min: 0.17% max: 100.00% x̄: 5.48% x̃: 2.54%
95% mean confidence interval for cvt value: -0.03 -0.03
95% mean confidence interval for cvt %-change: -4.83% -4.32%
Cvt are helped.
total ls in shared programs: 129660 -> 129545 (-0.09%)
ls in affected programs: 601 -> 486 (-19.13%)
helped: 7
HURT: 0
helped stats (abs) min: 3.0 max: 40.0 x̄: 16.43 x̃: 8
helped stats (rel) min: 2.88% max: 41.24% x̄: 17.41% x̃: 12.50%
95% mean confidence interval for ls value: -31.42 -1.44
95% mean confidence interval for ls %-change: -29.25% -5.58%
Ls are helped.
total quadwords in shared programs: 1482728 -> 1476416 (-0.43%)
quadwords in affected programs: 131200 -> 124888 (-4.81%)
helped: 798
HURT: 15
helped stats (abs) min: 8.0 max: 24.0 x̄: 8.06 x̃: 8
helped stats (rel) min: 0.34% max: 50.00% x̄: 10.15% x̃: 6.67%
HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel) min: 1.49% max: 100.00% x̄: 11.25% x̃: 2.78%
95% mean confidence interval for quadwords value: -7.92 -7.60
95% mean confidence interval for quadwords %-change: -10.52% -8.99%
Quadwords are helped.
total threads in shared programs: 53585 -> 53633 (0.09%)
threads in affected programs: 51 -> 99 (94.12%)
helped: 49
HURT: 1
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.88 1.04
95% mean confidence interval for threads %-change: 90.97% 103.03%
Threads are helped.
total spills in shared programs: 125 -> 154 (23.20%)
spills in affected programs: 75 -> 104 (38.67%)
helped: 3
HURT: 4
total fills in shared programs: 800 -> 656 (-18.00%)
fills in affected programs: 476 -> 332 (-30.25%)
helped: 7
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
Alyssa Rosenzweig [Thu, 2 Jun 2022 23:06:07 +0000 (19:06 -0400)]
pan/va: Record which instructions are signed
We need to distinguish signed integer instructions from unsigned integer
instructions, to distinguish sign-extension and zero-extension of sources.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862>
Rhys Perry [Mon, 30 May 2022 11:46:20 +0000 (12:46 +0100)]
aco: fix SMEM load_global with VGPR address and non-zero offset
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes:
3e9517c7577 ("aco: implement _amd global access intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16775>
Rhys Perry [Mon, 30 May 2022 11:45:10 +0000 (12:45 +0100)]
aco: fix SMEM load_global_amd with non-zero offset
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes:
3e9517c7577 ("aco: implement _amd global access intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16775>
Juan A. Suarez Romero [Thu, 2 Jun 2022 15:06:20 +0000 (17:06 +0200)]
v3d: save only required states in blitter
Some blitter operations, like clear, doesn't require to save all the
states.
This is particular important because, besides saving time, the blitter
operation restores the state required for the operation, and if we saved
more states than those, these ones won't be restored and will be leak.
So this also fixes some leaks when running CTS tests.
CC: mesa-stable
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16837>
Juan A. Suarez Romero [Thu, 2 Jun 2022 15:05:07 +0000 (17:05 +0200)]
v3d: use function to initialize refcount
Call proper pipe reference function to initialize the reference
counting.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16837>
Alyssa Rosenzweig [Fri, 3 Jun 2022 00:13:14 +0000 (20:13 -0400)]
pan/bi: Implement b2i with MUX
The result_type modifier propagation looks for MUX instructions, so using this
canonical b2i implementation allows the sequence b2i(cmp) to be fused.
It's also faster on its own: on Valhall, MUX may be implemented as CSEL on the
CVT unit, while AND may only be implemented on the SFU unit. So in case this
doesn't get fused, we expect 4x better throughput for b2i with this
implementation. Similarly, on Bifrost, MUX may be scheduled to either unit (as
CSEL on FMA or MUX on ADD), whereas AND may only be scheduled to FMA.
Results on Mali-G52:
total instructions in shared programs: 2419171 -> 2414814 (-0.18%)
instructions in affected programs: 272203 -> 267846 (-1.60%)
helped: 767
HURT: 0
helped stats (abs) min: 1.0 max: 138.0 x̄: 5.68 x̃: 2
helped stats (rel) min: 0.12% max: 15.57% x̄: 2.09% x̃: 0.68%
95% mean confidence interval for instructions value: -6.68 -4.68
95% mean confidence interval for instructions %-change: -2.37% -1.82%
Instructions are helped.
total tuples in shared programs: 1932822 -> 1929234 (-0.19%)
tuples in affected programs: 76485 -> 72897 (-4.69%)
helped: 380
HURT: 3
helped stats (abs) min: 1.0 max: 138.0 x̄: 9.46 x̃: 1
helped stats (rel) min: 0.14% max: 15.96% x̄: 3.81% x̃: 0.92%
HURT stats (abs) min: 1.0 max: 6.0 x̄: 2.67 x̃: 1
HURT stats (rel) min: 0.38% max: 8.57% x̄: 3.80% x̃: 2.44%
95% mean confidence interval for tuples value: -11.30 -7.44
95% mean confidence interval for tuples %-change: -4.27% -3.22%
Tuples are helped.
total clauses in shared programs: 356094 -> 355992 (-0.03%)
clauses in affected programs: 3264 -> 3162 (-3.12%)
helped: 80
HURT: 0
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.27 x̃: 1
helped stats (rel) min: 0.81% max: 50.00% x̄: 4.83% x̃: 3.39%
95% mean confidence interval for clauses value: -1.49 -1.06
95% mean confidence interval for clauses %-change: -6.23% -3.43%
Clauses are helped.
total cycles in shared programs: 167337.10 -> 167329.19 (<.01%)
cycles in affected programs: 510.08 -> 502.17 (-1.55%)
helped: 80
HURT: 2
helped stats (abs) min: 0.
041665999999999315 max: 0.
7916659999999993 x̄: 0.10 x̃: 0
helped stats (rel) min: 0.51% max: 13.64% x̄: 2.12% x̃: 1.34%
HURT stats (abs) min: 0.
041665999999999315 max: 0.
0416669999999999 x̄: 0.04 x̃: 0
HURT stats (rel) min: 0.39% max: 2.78% x̄: 1.58% x̃: 1.58%
95% mean confidence interval for cycles value: -0.12 -0.07
95% mean confidence interval for cycles %-change: -2.59% -1.48%
Cycles are helped.
total arith in shared programs: 73819.54 -> 73669.25 (-0.20%)
arith in affected programs: 2840.54 -> 2690.25 (-5.29%)
helped: 383
HURT: 3
helped stats (abs) min: 0.
041665999999999315 max: 5.75 x̄: 0.39 x̃: 0
helped stats (rel) min: 0.33% max: 18.81% x̄: 4.39% x̃: 0.98%
HURT stats (abs) min: 0.
041665999999999315 max: 0.25 x̄: 0.11 x̃: 0
HURT stats (rel) min: 0.39% max: 8.96% x̄: 4.04% x̃: 2.78%
95% mean confidence interval for arith value: -0.47 -0.31
95% mean confidence interval for arith %-change: -4.93% -3.71%
Arith are helped.
total quadwords in shared programs: 1679798 -> 1676259 (-0.21%)
quadwords in affected programs: 72826 -> 69287 (-4.86%)
helped: 381
HURT: 15
helped stats (abs) min: 1.0 max: 142.0 x̄: 9.35 x̃: 1
helped stats (rel) min: 0.25% max: 18.87% x̄: 4.33% x̃: 1.13%
HURT stats (abs) min: 1.0 max: 6.0 x̄: 1.47 x̃: 1
HURT stats (rel) min: 0.30% max: 6.25% x̄: 0.77% x̃: 0.35%
95% mean confidence interval for quadwords value: -10.76 -7.11
95% mean confidence interval for quadwords %-change: -4.71% -3.56%
Quadwords are helped.
Results on Mali-G57:
total instructions in shared programs: 2704193 -> 2699317 (-0.18%)
instructions in affected programs: 293366 -> 288490 (-1.66%)
helped: 758
HURT: 5
helped stats (abs) min: 1.0 max: 151.0 x̄: 6.45 x̃: 2
helped stats (rel) min: 0.11% max: 22.22% x̄: 2.05% x̃: 0.64%
HURT stats (abs) min: 1.0 max: 7.0 x̄: 2.20 x̃: 1
HURT stats (rel) min: 0.22% max: 1.69% x̄: 0.87% x̃: 1.08%
95% mean confidence interval for instructions value: -7.42 -5.36
95% mean confidence interval for instructions %-change: -2.27% -1.79%
Instructions are helped.
total cycles in shared programs: 141711.73 -> 141711.84 (<.01%)
cycles in affected programs: 214.36 -> 214.47 (0.05%)
helped: 4
HURT: 42
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.20 x̃: 0
helped stats (rel) min: 1.85% max: 12.78% x̄: 9.12% x̃: 10.93%
HURT stats (abs) min: 0.015625 max: 0.09375 x̄: 0.02 x̃: 0
HURT stats (rel) min: 0.17% max: 17.65% x̄: 0.84% x̃: 0.34%
95% mean confidence interval for cycles value: -0.02 0.03
95% mean confidence interval for cycles %-change: -1.23% 1.17%
Inconclusive result (value mean confidence interval includes 0).
total cvt in shared programs: 14479.14 -> 14474.19 (-0.03%)
cvt in affected programs: 2877.05 -> 2872.09 (-0.17%)
helped: 508
HURT: 209
helped stats (abs) min: 0.015625 max: 0.453125 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.25% max: 16.67% x̄: 1.23% x̃: 0.37%
HURT stats (abs) min: 0.015625 max: 0.296875 x̄: 0.03 x̃: 0
HURT stats (rel) min: 0.15% max: 18.18% x̄: 1.70% x̃: 0.34%
95% mean confidence interval for cvt value: -0.01 -0.00
95% mean confidence interval for cvt %-change: -0.57% -0.18%
Cvt are helped.
total sfu in shared programs: 7875.69 -> 7590.75 (-3.62%)
sfu in affected programs: 1567.38 -> 1282.44 (-18.18%)
helped: 906
HURT: 0
helped stats (abs) min: 0.0625 max: 8.625 x̄: 0.31 x̃: 0
helped stats (rel) min: 2.38% max: 100.00% x̄: 16.80% x̃: 5.63%
95% mean confidence interval for sfu value: -0.37 -0.26
95% mean confidence interval for sfu %-change: -18.43% -15.17%
Sfu are helped.
total quadwords in shared programs: 1468152 -> 1465800 (-0.16%)
quadwords in affected programs: 37104 -> 34752 (-6.34%)
helped: 161
HURT: 2
helped stats (abs) min: 8.0 max: 80.0 x̄: 14.71 x̃: 8
helped stats (rel) min: 1.67% max: 20.00% x̄: 8.05% x̃: 7.69%
HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel) min: 3.57% max: 3.85% x̄: 3.71% x̃: 3.71%
95% mean confidence interval for quadwords value: -16.29 -12.57
95% mean confidence interval for quadwords %-change: -8.58% -7.22%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
Alyssa Rosenzweig [Fri, 6 May 2022 21:02:57 +0000 (17:02 -0400)]
pan/va: Add MUX lowering tests
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
Alyssa Rosenzweig [Fri, 6 May 2022 21:10:56 +0000 (17:10 -0400)]
pan/va: Lower MUX to CSEL where possible
CSEL executes on the conversion unit (CVT), while MUX executes on the special
function unit (SFU). Throughput on CVT is 4x higher than SFU, so this is
(almost) always an optimization.
The "real" MUX is still used for unusual cases, like 8-bit and bitselect.
Note that it's easier for us to use MUX everywhere for the IR. This is an easy
fixup to get better codegen on Valhall without touching the core Bifrost code.
shader-db is a bit of a toss up: register pressure and instruction count are
hurt in some cases due to restrictions on FAU access. In particular, a shader
that muxes between two uniforms needs an extra move due to extra constant
(zero). However, in terms of throughput this is still a win: 2 CVT instructions
(MOV + CSEL) have 2x throughput to 1 SFU instruction (MUX). The MOV has
opportunities for CSE, but that can hurt pressure in turn. Overall, cycles are
helped substantially.
total instructions in shared programs: 2728438 -> 2731597 (0.12%)
instructions in affected programs: 414391 -> 417550 (0.76%)
helped: 87
HURT: 1063
helped stats (abs) min: 1.0 max: 6.0 x̄: 5.17 x̃: 6
helped stats (rel) min: 0.19% max: 15.79% x̄: 4.12% x̃: 4.11%
HURT stats (abs) min: 1.0 max: 56.0 x̄: 3.40 x̃: 2
HURT stats (rel) min: 0.11% max: 23.43% x̄: 1.15% x̃: 0.63%
95% mean confidence interval for instructions value: 2.47 3.03
95% mean confidence interval for instructions %-change: 0.61% 0.90%
Instructions are HURT.
total cycles in shared programs: 142103 -> 142015.75 (-0.06%)
cycles in affected programs: 1263.45 -> 1176.20 (-6.91%)
helped: 281
HURT: 176
helped stats (abs) min: 0.015625 max: 2.234375 x̄: 0.50 x̃: 0
helped stats (rel) min: 0.71% max: 54.17% x̄: 16.93% x̃: 15.31%
HURT stats (abs) min: 0.015625 max: 30.0 x̄: 0.30 x̃: 0
HURT stats (rel) min: 0.84% max: 120.00% x̄: 7.16% x̃: 5.00%
95% mean confidence interval for cycles value: -0.33 -0.05
95% mean confidence interval for cycles %-change: -9.08% -6.22%
Cycles are helped.
total cvt in shared programs: 13983.34 -> 14891.70 (6.50%)
cvt in affected programs: 7498.36 -> 8406.72 (12.11%)
helped: 71
HURT: 4711
helped stats (abs) min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
helped stats (rel) min: 5.41% max: 40.00% x̄: 10.23% x̃: 9.30%
HURT stats (abs) min: 0.015625 max: 2.640625 x̄: 0.19 x̃: 0
HURT stats (rel) min: 0.18% max: 141.18% x̄: 16.21% x̃: 9.52%
95% mean confidence interval for cvt value: 0.18 0.20
95% mean confidence interval for cvt %-change: 15.21% 16.42%
Cvt are HURT.
total sfu in shared programs: 11320.44 -> 7882.56 (-30.37%)
sfu in affected programs: 7618.50 -> 4180.62 (-45.13%)
helped: 4782
HURT: 0
helped stats (abs) min: 0.0625 max: 10.5625 x̄: 0.72 x̃: 0
helped stats (rel) min: 1.34% max: 100.00% x̄: 41.91% x̃: 37.50%
95% mean confidence interval for sfu value: -0.75 -0.68
95% mean confidence interval for sfu %-change: -42.68% -41.14%
Sfu are helped.
total ls in shared programs: 129660 -> 129690 (0.02%)
ls in affected programs: 25 -> 55 (120.00%)
helped: 0
HURT: 1
total quadwords in shared programs: 1482728 -> 1484128 (0.09%)
quadwords in affected programs: 58624 -> 60024 (2.39%)
helped: 24
HURT: 195
helped stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
helped stats (rel) min: 3.70% max: 20.00% x̄: 10.34% x̃: 10.00%
HURT stats (abs) min: 8.0 max: 24.0 x̄: 8.16 x̃: 8
HURT stats (rel) min: 1.41% max: 50.00% x̄: 4.84% x̃: 2.56%
95% mean confidence interval for quadwords value: 5.70 7.09
95% mean confidence interval for quadwords %-change: 2.22% 4.14%
Quadwords are HURT.
total spills in shared programs: 125 -> 127 (1.60%)
spills in affected programs: 0 -> 2
helped: 0
HURT: 1
total fills in shared programs: 800 -> 828 (3.50%)
fills in affected programs: 0 -> 28
helped: 0
HURT: 1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
Alyssa Rosenzweig [Fri, 6 May 2022 21:23:10 +0000 (17:23 -0400)]
pan/va: Implement more lanes
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
Alyssa Rosenzweig [Fri, 6 May 2022 21:09:56 +0000 (17:09 -0400)]
pan/bi: Extract MUX to CSEL optimization
It's portable, and useful to both Bifrost and Valhall, in the clause scheduler
and in an instruction selection respectively. Move it from the Bifrost clause
scheduler to common code so we can share the benefits.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16857>
Frank Binns [Mon, 30 May 2022 17:16:46 +0000 (18:16 +0100)]
pvr: shorten error to err in label names
This is for consistency with the rest of the driver.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16882>
Juan A. Suarez Romero [Tue, 31 May 2022 09:28:14 +0000 (11:28 +0200)]
v3d/ci: Add traces
Add a job to run and test traces from Tracies DB.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16809>
Alyssa Rosenzweig [Sat, 4 Jun 2022 13:58:40 +0000 (09:58 -0400)]
panfrost: Don't calculate min/max indices on v9
On Valhall, we always* use memory-allocated IDVS, which does not require min/max
indices. As such, we do not want to calculate min/max indices, as this is quite
slow. Skip this step.
* except for blit shaders, which don't use an index buffer anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16867>
Alyssa Rosenzweig [Sat, 4 Jun 2022 13:57:29 +0000 (09:57 -0400)]
panfrost: Extract panfrost_get_index_buffer helper
Memory-allocated IDVS does not require min/max indices to be calculated, but it
of course requires an index buffer. Extract a helper to upload the index buffer
without calculating bounds.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16867>
Alyssa Rosenzweig [Sat, 4 Jun 2022 13:32:30 +0000 (09:32 -0400)]
pan/va: Do not insert NOPs into empty shaders
It's unnecessary and breaks the empty shader optimizations. Noticed while
inspecting a trace from dEQP-GLES3.functional.color_clear.masked_scissored_rgb,
which does not produce any varyings other than gl_Position in its vertex shader
and hence should omit the varying shader.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16868>
Konstantin Seurer [Sat, 4 Jun 2022 18:50:49 +0000 (20:50 +0200)]
radv: Require an alignment of 64 for accel structs
Top level acceleration structures need the bottom
6 bits to store the root ids of instances. If we
don't require that alignment, more "advanced"
allocators like VMA may sub allocate a buffer
which can lead to the 6 getting lost.
Fixes the Khronos ray tracing Vulkan samples.
Closes: #6598
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16870>
David Heidelberg [Mon, 6 Jun 2022 13:20:43 +0000 (15:20 +0200)]
ci/virgl: traces: temporarily disable nheko trace
Disable nheko trace until apitrace gets fixed.
apitrace currently fails with this trace, when more than 1 run is
requested.
Upstream issue: https://github.com/apitrace/apitrace/issues/800
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16887>
Mike Blumenkrantz [Fri, 3 Jun 2022 18:00:11 +0000 (14:00 -0400)]
zink: remove buffer valid range tracking from blit
I copy/pasted too hard. this code could never be reached
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Fri, 3 Jun 2022 17:59:11 +0000 (13:59 -0400)]
zink: invalidate blit dsts if fully covered
tiling perf++ since there's no need to load
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Fri, 3 Jun 2022 17:57:30 +0000 (13:57 -0400)]
zink: hook up surface invalidation to LOAD_OP_DONT_CARE
this should improve perf for tilers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Fri, 3 Jun 2022 17:55:44 +0000 (13:55 -0400)]
zink: split out a dynamic render ternary
this is going to get bigger
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Fri, 3 Jun 2022 17:52:45 +0000 (13:52 -0400)]
zink: rename renderpass attrib value
this never really meant "swapchain", it just meant that load isn't needed
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Sun, 5 Jun 2022 13:14:30 +0000 (09:14 -0400)]
zink: flag renderpass for change if image resource changes valid state
the next renderpass instance will need to use different load ops,
so flag it here to ensure that gets picked up
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Fri, 3 Jun 2022 17:50:55 +0000 (13:50 -0400)]
zink: track invalidation for image resources
an image only has valid data if:
* it's imported
* it's written to
* it's mapped for write
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16877>
Mike Blumenkrantz [Fri, 3 Jun 2022 20:40:32 +0000 (16:40 -0400)]
zink: disable EXT_primitives_generated_query on turnip
this is broken
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16861>
Mike Blumenkrantz [Fri, 3 Jun 2022 20:39:07 +0000 (16:39 -0400)]
zink: remove ANV depth clip control workaround
this was fixed a while ago and I forgot
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16861>
Mike Blumenkrantz [Fri, 27 May 2022 17:34:09 +0000 (13:34 -0400)]
mesa: handle atomic counter lowering for drivers with big ssbo offset aligns
according to the spec, atomic counters can be bound at any offset divisible by 4,
which means that any driver that uses the ssbo lowering pass and doesn't have
a min offset align of 4 is potentially broken
to handle this, use a statevar to inject the misaligned remainder of the offset
into the shader as a uniform. for well-aligned counter binds, the uniform offset
will be 0
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16749>
Mike Blumenkrantz [Thu, 2 Jun 2022 21:44:34 +0000 (17:44 -0400)]
st/glsl_to_nir: call st_set_prog_affected_state_flags() as late as possible
this function should be called late to allow for other passes potentially
making changes which affect the states in use by shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16749>
Mike Blumenkrantz [Fri, 27 May 2022 17:33:14 +0000 (13:33 -0400)]
mesa: conditionally set constants dirty for atomic counter binds
this is necessary for updating the offset uniforms
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16749>
Mike Blumenkrantz [Fri, 27 May 2022 17:30:11 +0000 (13:30 -0400)]
mesa: add statevar for atomic counter offsets
some hardware can't do a ssbo offset=4, as required by the atomic->ssbo
lowering pass, so for these cases an offset can be passed for the counter
as a uniform, and the shaders can be adjusted accordingly
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16749>
Pavel Ondračka [Fri, 20 May 2022 09:11:07 +0000 (11:11 +0200)]
r300: merge simple movs with constant swizzles together
This pass will merge instructions like these
MOV output[0].x, temp[5].x___;
MOV output[0].yzw, none._001;
into
MOV output[0].xyzw, temp[5].x001;
It is currently very careful with control flow and dependency
tracking, so there is still room for improvements.
Shader-db stats with RV530:
total instructions in shared programs: 132486 -> 132256 (-0.17%)
instructions in affected programs: 6186 -> 5956 (-3.72%)
helped: 65
HURT: 0
total temps in shared programs: 18035 -> 18014 (-0.12%)
temps in affected programs: 295 -> 274 (-7.12%)
helped: 22
HURT: 1
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16657>
Filip Gawin [Sat, 12 Feb 2022 23:28:36 +0000 (00:28 +0100)]
r300: don't check for unitialized reads when rewriting register
This fixes the "Rewrite of inst X failed Can't allocate source
for Inst X src_type=X new_index=X new_mask=X" errors.
The compiler is quite strict when rewriting registers during
the pair allocation and checks that all of the reads of it are
initialized. However the spec doesn't enfore that, and
specifically with control flow depending on user input we can't
really know...
In the following example temp[4].x is written only in one branch,
that might or might not be taken, but this is enough to keep the
compiler happy:
IF aluresult.x___;
MAD temp[4].x, src0.1__, src0.111, src0.000
ENDIF;
src0.xyz = temp[4], src0.w = temp[4]
MAD color[0].xyz, src0.xyz, src0.111, src0.000
MAD color[0].w, src0.w, src0.1, src0.0
After switch to ntt, more IFs are converted to CMP, and the color
write looks like this. Please note that the CMP here is not TGSI
opcode but rather our US_OP_RGB_CMP: src2 >= 0 ? src0 : src1
src0.xyz = temp[4], src0.w = temp[4], src1.xyz = temp[3], src1.w = temp[12], src2.xyz = temp[2]
CMP color[0].xyz, src0.xyz, src1.xyz, -src2.xxx
CMP color[0].w, src0.w, src1.w, -src2.x
At this point temp[4].x is undefined. Now when compiler tries to
allocate register for temp[4] at some previous instruction, it will
find out that it is used as a source in the final CMP and bail out.
Instead of increasing the complexitty even more trying to account for
this, just get rid of the check completelly.
Fixes:
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec2_dynamic_subscript_write_component_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec2_dynamic_subscript_write_direct_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec2_dynamic_subscript_write_dynamic_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec2_dynamic_subscript_write_static_loop_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec2_dynamic_subscript_write_static_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec3_dynamic_subscript_write_component_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec3_dynamic_subscript_write_direct_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec3_dynamic_subscript_write_dynamic_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec3_dynamic_subscript_write_static_loop_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec3_dynamic_subscript_write_static_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec4_dynamic_subscript_write_component_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec4_dynamic_subscript_write_direct_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec4_dynamic_subscript_write_dynamic_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec4_dynamic_subscript_write_static_loop_subscript_read_fragment,Fail
dEQP-GLES2.functional.shaders.indexing.vector_subscript.vec4_dynamic_subscript_write_static_subscript_read_fragment,Fail
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16657>
Pavel Ondračka [Thu, 19 May 2022 08:32:56 +0000 (10:32 +0200)]
r300: Update list of RV515 dEQP failures and add some flakes
The fixes are mostly from
23dfae4c810e5e31cea647b7803700b0fcd4eb96
dEQP-GLES2.functional.fragment_ops.depth_stencil tests show random
flakes. The ones in failures are showing unexpected pass, however other
random test failures from the same group keep showing so just mark it
all as flakes.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16657>
Pavel Ondračka [Thu, 19 May 2022 10:38:01 +0000 (12:38 +0200)]
r300: don't try to use inline constants instead of constant swizzles
It doesn't make sense and was not working anyway. This was spotted
by Filip Gawin in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13978
however the fix there was IMO just papering over the problem.
I don't believe that this could manifest as a real issues, because
when all of the swizzles were constant the file would be set to
RC_FILE_NONE already. So in theory this could lead to an issue only
in the close to impossible circumstance that the out of bounds memory
read by constant->u.Immediate[swz] would end with the same exact value
as another inlineable constant in different channel. However in some
circumstances it would lead to following valgrind warnings:
Conditional jump or move depends on uninitialised value(s)
at 0x5D4E690: ieee_754_to_r300_float (radeon_inline_literals.c:61)
by 0x5D4E690: rc_inline_literals (radeon_inline_literals.c:133)
by 0x5D3877A: rc_run_compiler_passes (radeon_compiler.c:436)
by 0x5D38821: rc_run_compiler (radeon_compiler.c:458)
by 0x5D4AF63: r3xx_compile_fragment_program (r3xx_fragprog.c:139)
by 0x5D48377: r300_translate_fragment_shader (r300_fs.c:499)
by 0x5D491B0: r300_pick_fragment_shader (r300_fs.c:601)
by 0x5D2BFEE: r300_create_fs_state (r300_state.c:1072)
by 0x57DDC36: st_create_nir_shader (st_program.c:538)
by 0x57DF10E: st_create_fp_variant (st_program.c:1056)
by 0x57E057C: st_get_fp_variant (st_program.c:1102)
by 0x57E0AB1: st_precompile_shader_variant (st_program.c:1287)
by 0x57E0AB1: st_finalize_program (st_program.c:1333)
by 0x57CB6F3: st_link_nir (st_glsl_to_nir.cpp:958)
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16657>
Pavel Ondračka [Fri, 13 May 2022 07:11:27 +0000 (09:11 +0200)]
r300: be less agresive with copy propagate in loops
When there are multiple MOVs with the same destination in loop
in different branches and some readers after the loop, we would
now errorneously copy propagate the last MOV, like in the following
snippet:
BGNLOOP;
...
IF temp[3].x___;
MOV temp[2], const[1].yxxy;
BRK;
ENDIF;
IF temp[4].x___;
MOV temp[2], const[1].xyxy;
BRK;
ENDIF;
...
MOV temp[2], const[1].xyxy;
ENDLOOP;
ADD_SAT temp[0], temp[2], temp[1];
into:
BGNLOOP;
...
IF temp[3].x___;
MOV temp[2], const[1].yxxy;
BRK;
ENDIF;
IF temp[3].y___;
MOV temp[2], const[1].xyxy;
BRK;
ENDIF;
...
ENDLOOP;
ADD_SAT temp[0], const[1].xyxy, temp[1];
We need the copy propagate just for simple cleanups after ttn,
anything more complex should have been handled already in NIR.
So just bail out if any of the readers is after the loop.
No changes in shader-db.
Fixes few piglit tests when loop unrolling is disabled:
spec@glsl-1.10@execution@vs-loop-complex-unroll
spec@glsl-1.10@execution@vs-loop-complex-unroll-nested-break
spec@glsl-1.10@execution@vs-loop-complex-unroll-with-else-break
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6467
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16657>
Pavel Ondračka [Thu, 19 May 2022 07:49:05 +0000 (09:49 +0200)]
r300: deduplicate common NIR options
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16657>
Mike Blumenkrantz [Fri, 3 Jun 2022 13:52:29 +0000 (09:52 -0400)]
mesa/st: bump param reservation to 28
now d3d12 is hitting it, so here we go
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16872>
Mike Blumenkrantz [Sun, 5 Jun 2022 13:04:39 +0000 (09:04 -0400)]
virgl: add some ci flakes
issue #6614
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16876>
Vinson Lee [Wed, 1 Jun 2022 04:24:15 +0000 (21:24 -0700)]
clc: Fix build with llvm-15.
opencl_c_h is defined only for llvm < 15.
Fixes:
bcc2df48905 ("clc: speed up compilation by not relying on opencl-c.h")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16808>
Mike Blumenkrantz [Sat, 4 Jun 2022 13:12:15 +0000 (09:12 -0400)]
d3d12: skip time-elapsed piglit tests in ci
flaky
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16866>
Timothy Arceri [Fri, 6 May 2022 02:28:33 +0000 (12:28 +1000)]
glsl: remove the now unused GLSL IR loop unrolling code
This code was slow, buggy and hard to understand. All drivers
have now switched to using the NIR unrolling code \o/
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 02:13:44 +0000 (12:13 +1000)]
gallium: remove PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT CAP
This is used for the old, buggy and slow GLSL IR loop unrolling
code. All drivers have now switched to the NIR unrolling code so
here we remove the CAP.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 02:09:09 +0000 (12:09 +1000)]
svga: disable GLSL IR loop unrolling
NIR loop unrolling is already enabled so just let it do its job.
Here we also fix up the force unroll settings.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 01:52:31 +0000 (11:52 +1000)]
nouveau/nvc0: disable GLSL IR loop unrolling
NIR loop unrolling is already enabled so just let it do its job.
Shader-db results (nv120):
total gpr in shared programs: 893490 -> 893898 (0.05%)
gpr in affected programs: 15338 -> 15746 (2.66%)
total instructions in shared programs: 6243205 -> 6237068 (-0.10%)
instructions in affected programs: 71160 -> 65023 (-8.62%)
total bytes in shared programs:
66729616 ->
66664760 (-0.10%)
bytes in affected programs: 759328 -> 694472 (-8.54%)
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 01:50:42 +0000 (11:50 +1000)]
nouveau/nv50: disable GLSL IR loop unrolling
NIR loop unrolling is already enabled so just let it do its job.
Shader-db results (nv92):
total gpr in shared programs: 734638 -> 735037 (0.05%)
gpr in affected programs: 11058 -> 11457 (3.61%)
total instructions in shared programs: 6073415 -> 6073398 (<.01%)
instructions in affected programs: 10079 -> 10062 (-0.17%)
total bytes in shared programs:
41837432 ->
41838872 (<.01%)
bytes in affected programs: 252504 -> 253944 (0.57%)
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 01:47:11 +0000 (11:47 +1000)]
nouveau/nv30: disable GLSL IR loop unrolling
NIR loop unrolling is already enabled so just let it do its job.
Shader-db results (nv40):
total instructions in shared programs:
17446532 ->
17446068 (<.01%)
instructions in affected programs: 15532 -> 15068 (-2.99%)
total gpr in shared programs: 82658 -> 82801 (0.17%)
gpr in affected programs: 1680 -> 1823 (8.51%)
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 01:44:31 +0000 (11:44 +1000)]
lima: switch to NIR loop unrolling
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Wed, 18 May 2022 06:33:37 +0000 (16:33 +1000)]
lima: fixup nir indirect unroll options to match gallium CAP
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Wed, 18 May 2022 05:32:09 +0000 (15:32 +1000)]
lima: lower all undefs to zero in vs
Otherwise we will later hit:
gpir_error("nir_ssa_undef_instr is not supported\n");
Unfortunatly this causes a piglit failure due to increased register
pressure in an unrealistic shader but since not doing this can
result in hitting the not supported error in more relistic shaders
this seems the right thing to do for now.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 01:38:09 +0000 (11:38 +1000)]
freedreno: switch to NIR loop unrolling
Force unroll setting based on GLSL IR settings:
case PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR:
case PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR:
case PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR:
case PIPE_SHADER_CAP_INDIRECT_CONST_ADDR:
/* a2xx compiler doesn't handle indirect: */
return is_ir3(screen) ? 1 : 0;
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Wed, 18 May 2022 06:00:42 +0000 (16:00 +1000)]
freedreno/ir3: tidy up duplication of common nir options
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Timothy Arceri [Fri, 6 May 2022 01:17:38 +0000 (11:17 +1000)]
gallivm: disable GLSL IR loop unrolling in LLVMPIPE
The NIR unroller is already enabled so just allow it to do its job.
We add a new failure here because llvmpipe fails to handle a
shader that is no longer unrolled.
Previously GLSL IR could unroll the loop because it only had a
single break. However once lower_returns passes over the shader
it ends up with more than 2 breaks making it no longer possible
to unroll. This is a disadvantage of doing the unrolling in NIR
however in practice we don't see shaders in the wild with multiple
returns inside loops.
Being unable to handle this loop is an existing bug with llvmpipe
exposed by the loop no longer being unrolled.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16366>
Alyssa Rosenzweig [Fri, 1 Apr 2022 21:23:09 +0000 (17:23 -0400)]
panfrost: Launch transform feedback shaders
We now have infrastructure in place to generate variants of vertex shaders
specialized for transform feedback. All that's left is launching these
compute-like kernels before the IDVS job, implementing both the
transform feedback and the regular rasterization pipeline. This implements
transform feedback on Valhall, passing the relevant GLES3.1 tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Thu, 2 Jun 2022 14:51:16 +0000 (10:51 -0400)]
panfrost: Create transform feedback shaders
Valhall has no architectural support for transform feedback. So if a vertex
shader uses transform feedback, we need to split the shader into two: a pure
vertex stage and a compute-like transform feedback stage. This splitting
resembles the splitting we do for IDVS.
When compiling a vertex shader that uses transform feedback on Bifrost, also
compile the transform feedback variant. That variant (marked by internal=true)
will get its stores lowered by the NIR pass introduced earlier in this series.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Fri, 1 Apr 2022 21:22:05 +0000 (17:22 -0400)]
panfrost: Wire up transfrom feedback sysvals
Wire the Gallium interface for transform feedback up to the system values that
will be fed into our lowering code. This is based on our existing transform
feedback implementation for Midgard.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Mon, 4 Apr 2022 19:58:10 +0000 (15:58 -0400)]
panfrost: Don't allow vertex shaders to have side effects
In both GL and VK, the driver may choose not to support vertex shaders with side
effects (SSBOs, atomics, images). Supporting this opens a can of worms for IDVS.
Neither freedreno nor the (Vulkan?) DDK advertise support, for this reason.
Apps should not be using this anti-feature anyway.
Stop advertising support.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Fri, 1 Apr 2022 21:24:21 +0000 (17:24 -0400)]
pan/bi: Handle transform feedback intrinsics
Translate the intrinsics we introduced to lower away transform feedback into
Panfrost system values which the GL driver can handle.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Thu, 2 Jun 2022 14:50:54 +0000 (10:50 -0400)]
pan/bi: Add transform feedback lowering pass
Add a simple NIR-based implementation of transform feedback, appropriate for
OpenGL ES 3.1 class hardware (compute but no geometry or tessellation shaders).
Stores to varyings that will be captured are replaced by stores to transform
feedback buffers and some addressing math. This allows implementing the semantic
of transform feedback in a compute-like stage.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Mon, 30 May 2022 15:43:03 +0000 (11:43 -0400)]
nir: Export nir_io_add_intrinsic_xfb_info
This is useful for drivers which wish to consume XFB information. These
hopefully-uncontroversial hunks are extracted from the much more controversial
"st,nir,radeons: Move nir_lower_io_passes to si_nir_lower_io" by Jason.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Alyssa Rosenzweig [Fri, 1 Apr 2022 21:20:09 +0000 (17:20 -0400)]
nir: Add transform feedback system values
These will be used to facilitate transform feedback lowering for Panfrost,
although other backends could use the sysvals in the future.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>
Mike Blumenkrantz [Fri, 3 Jun 2022 20:32:39 +0000 (16:32 -0400)]
format_utils: properly parenthesize macro params
this otherwise breaks evaluation of the parameters on arm64
cc: mesa-stable
fixes #6496
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16860>
Alyssa Rosenzweig [Fri, 3 Jun 2022 18:33:58 +0000 (14:33 -0400)]
panfrost: Use C11 static_assert for enums
Rather than asserting everything in an unused function, just do it in global
context with C11 static_asserts. This is a bit neater now that we depend on C11
projectwide.
Obvious follow-on from !16670.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16856>
Mike Blumenkrantz [Fri, 3 Jun 2022 13:52:29 +0000 (09:52 -0400)]
mesa/st: bump param reservation to 20
I was hitting the realloc assert, so increase this again
fixes (zink+tu):
KHR-GL46.geometry_shader.api.max_atomic_counter_buffers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16851>
Mike Blumenkrantz [Fri, 3 Jun 2022 13:36:46 +0000 (09:36 -0400)]
mesa: improve relocation problem message
make it easier to immediately know what the problem is
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16851>
Timothy Arceri [Fri, 6 May 2022 12:45:24 +0000 (22:45 +1000)]
glsl: remove now unused lower_const_arrays_to_uniforms()
We now use a NIR version instead.
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Timothy Arceri [Mon, 11 Oct 2021 11:31:25 +0000 (22:31 +1100)]
glsl: switch to NIR based implementation of lower_const_arrays_to_uniforms()
Shader-db results iris (BDW):
total instructions in shared programs:
17523543 ->
17513909 (-0.05%)
instructions in affected programs: 218091 -> 208457 (-4.42%)
helped: 69
HURT: 327
helped stats (abs) min: 2 max: 2919 x̄: 160.84 x̃: 12
helped stats (rel) min: 0.21% max: 96.88% x̄: 14.87% x̃: 6.40%
HURT stats (abs) min: 1 max: 47 x̄: 4.48 x̃: 1
HURT stats (rel) min: 0.10% max: 22.02% x̄: 3.33% x̃: 0.18%
95% mean confidence interval for instructions value: -45.02 -3.63
95% mean confidence interval for instructions %-change: -1.16% 1.47%
Inconclusive result (%-change mean confidence interval includes 0).
total loops in shared programs: 4875 -> 4868 (-0.14%)
loops in affected programs: 7 -> 0
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.
total cycles in shared programs:
858032406 ->
857984712 (<.01%)
cycles in affected programs:
22940290 ->
22892596 (-0.21%)
helped: 155
HURT: 312
helped stats (abs) min: 1 max: 49696 x̄: 1697.70 x̃: 62
helped stats (rel) min: <.01% max: 70.84% x̄: 5.60% x̃: 0.82%
HURT stats (abs) min: 1 max: 19640 x̄: 690.54 x̃: 100
HURT stats (rel) min: <.01% max: 217.23% x̄: 33.57% x̃: 0.92%
95% mean confidence interval for cycles value: -436.09 231.84
95% mean confidence interval for cycles %-change: 15.39% 25.75%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 16289 -> 15205 (-6.65%)
spills in affected programs: 2753 -> 1669 (-39.38%)
helped: 9
HURT: 1
total fills in shared programs: 20347 -> 20324 (-0.11%)
fills in affected programs: 1642 -> 1619 (-1.40%)
helped: 9
HURT: 1
total sends in shared programs: 972151 -> 971960 (-0.02%)
sends in affected programs: 1910 -> 1719 (-10.00%)
helped: 25
HURT: 20
helped stats (abs) min: 1 max: 50 x̄: 9.00 x̃: 2
helped stats (rel) min: 0.87% max: 53.76% x̄: 13.89% x̃: 6.25%
HURT stats (abs) min: 1 max: 8 x̄: 1.70 x̃: 1
HURT stats (rel) min: 8.33% max: 200.00% x̄: 52.36% x̃: 33.33%
95% mean confidence interval for sends value: -8.19 -0.29
95% mean confidence interval for sends %-change: -1.07% 32.18%
Inconclusive result (%-change mean confidence interval includes 0).
LOST: 3
GAINED: 27
Note a small number of tests fail on lima and r300 after this patch.
However since we are doing the correct thing here and they only
fail due to a slight increase in instruction count pushing them
over their instruction count limit, we are defering that issue
to a different bug report for further discussion.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6540
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Timothy Arceri [Mon, 30 May 2022 23:55:02 +0000 (09:55 +1000)]
glsl: move gl_nir_link_opts() call out of the st code
Calling this directly in the linker code allows us to place it between
the varying linker and uniform linker calls which allows for better
optimisation/removal of uniforms.
Also in a later patch it allows us to insert a new nir based
lower_const_arrays_to_uniforms() call after the gl_nir_link_opts()
call. This is important because it allows the linking opts to
move constant arrays to later stages if possible before
lower_const_arrays_to_uniforms() turns them into uniforms.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6541
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Timothy Arceri [Mon, 30 May 2022 23:44:21 +0000 (09:44 +1000)]
glsl: move common link time optimisation calls to linker code
In the following patch we will move the users of this function to
this file too and make it static again.
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Timothy Arceri [Mon, 11 Oct 2021 04:48:28 +0000 (15:48 +1100)]
glsl/nir: allow the nir linker to remove dead uniforms we created
Some backends lower constant arrays to uniforms in GLSL IR. These
create so called hidden uniforms. Since we know these are added
per stage it is safe to remove them if we detect they are dead.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Timothy Arceri [Tue, 12 Oct 2021 02:38:46 +0000 (13:38 +1100)]
glsl/nir: skip adding hidden uniforms to the remap tables
The remap tables are used with the GL API so there is no need to
add hidden uniforms to them. Also when we switch to lowering some
constant arrays to uniforms in NIR in a following patch there
will no longer be enough room in the tables as we assign their
size in the GLSL IR linker not the NIR linker currently.
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Timothy Arceri [Sun, 10 Oct 2021 09:33:15 +0000 (20:33 +1100)]
nir: add nir based version of the lower_const_arrays_to_uniforms pass
Doing this in NIR should give better results, but also allows us to
stop calling more GLSL IR optimisations passes.
v2: Skip 8bit and 16bit type that would require further processing
I believe this is an existing bug in the GLSL IR pass also.
v3: rebuild constant initialisers as we want to call this pass
after nir has already lowered them and performed optimisations.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16770>
Georg Lehmann [Fri, 3 Jun 2022 14:26:11 +0000 (16:26 +0200)]
zink: Use VK_USE_64_BIT_PTR_DEFINES to check for 64bit platforms.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6605
Cc: mesa-stable
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16853>
Mike Blumenkrantz [Wed, 1 Jun 2022 15:31:28 +0000 (11:31 -0400)]
zink: add back kms handling
removing this broke the ability to create system compositors
rework it a bit though so that kms handles are stored and destroyed
when the bo is freed
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16815>
Mike Blumenkrantz [Wed, 1 Jun 2022 15:30:32 +0000 (11:30 -0400)]
Revert "zink: remove drm_fd"
This reverts commit
c5960f64b139605dbefa34c2cc2a089ba00ae1e2.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16815>
Mike Blumenkrantz [Wed, 1 Jun 2022 19:46:27 +0000 (15:46 -0400)]
zink: handle aux plane imports
basically do nothing here and it magically works
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16822>
Mike Blumenkrantz [Wed, 1 Jun 2022 19:20:30 +0000 (15:20 -0400)]
zink: rename a variable
no functional changes
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16822>
Mike Blumenkrantz [Wed, 1 Jun 2022 18:26:15 +0000 (14:26 -0400)]
zink: represent plane offsets using offset from plane 0 vs size of plane
this is a bit easier to keep track of
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16822>
Mike Blumenkrantz [Wed, 1 Jun 2022 18:12:16 +0000 (14:12 -0400)]
zink: fix dmabuf plane layout struct scoping
this struct needs to exist for all the scopes it's used in
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16822>
Adam Jackson [Fri, 3 Jun 2022 15:33:39 +0000 (11:33 -0400)]
zink: Print the VkResult if vkCreateInstance fails
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16854>
Mike Blumenkrantz [Fri, 3 Jun 2022 22:47:19 +0000 (18:47 -0400)]
ci: disable unit tests
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16863>
Alyssa Rosenzweig [Fri, 3 Jun 2022 18:23:04 +0000 (14:23 -0400)]
panfrost/ci: Mark draw_buffers_indexed.* as flakes
These keep flaking. Icecream95 observes the issue relates to AFBC in the
discussion of the flake in issue 6604. Until the root cause can be identified
and fixed, mark the tests as known flakes for CI.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16855>
Mike Blumenkrantz [Wed, 1 Jun 2022 15:31:49 +0000 (11:31 -0400)]
kopper: use get_drawable_info path for non-x11 drawables
wayland surfaces need to take this path to get resizing right
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16814>
Mike Blumenkrantz [Wed, 1 Jun 2022 20:34:32 +0000 (16:34 -0400)]
egl/wayland: skip buffer creation on zink
this happens through wsi, so don't create resources that aren't used
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16814>
Mike Blumenkrantz [Wed, 1 Jun 2022 14:47:27 +0000 (10:47 -0400)]
egl/wayland: manually swap backbuffer when using zink
this would usually occur through dri2_wl_swrast_commit_backbuffer(),
but zink triggers this functionality using vulkan wsi, which fails to
perform these updates as expected
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16814>