platform/upstream/mesa.git
Rob Clark [Fri, 1 Jul 2022 21:29:00 +0000 (14:29 -0700)]
freedreno/a6xx: assert valid vertex_flags reg

If this somehow gets optimized out, the GS will run forever.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17341>

Ian Romanick [Tue, 28 Jun 2022 15:19:59 +0000 (08:19 -0700)]
intel/fs: Remove non-_LOGICAL URB messages

The _LOGICAL versions are lowered directly to SEND, so nothing can ever
generate these messages.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>

Ian Romanick [Mon, 27 Jun 2022 22:34:01 +0000 (15:34 -0700)]
intel/fs: Lower URB messages to SEND

Before rebasing on top of Ken's split-SEND optimization (see !17018),
this commit just caused some scheduling changes in various tessellation
and geometry shaders.  These changes were caused by the addition of real
latency information for the URB messages.

With the addition of the split-SEND optimization, the changes
are... staggering.  All of the shaders helped for spills and fills are
vertex shaders from Batman Arkham Origins.  What surprises me is that
these shaders account for such a high percentage of the spills and fills
in fossil-db.  85%?!?

v2: Use FIXED_GRF instead of BRW_GENERAL_REGISTER_FILE in an assertion.
Suggested by Ken.

Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20013625 -> 19954020 (-0.30%)
instructions in affected programs: 4007157 -> 3947552 (-1.49%)
helped: 31161
HURT: 0
helped stats (abs) min: 1 max: 400 x̄: 1.91 x̃: 2
helped stats (rel) min: 0.08% max: 59.70% x̄: 2.20% x̃: 1.83%
95% mean confidence interval for instructions value: -1.97 -1.86
95% mean confidence interval for instructions %-change: -2.22% -2.18%
Instructions are helped.

total cycles in shared programs: 859337569 -> 858636788 (-0.08%)
cycles in affected programs: 74168298 -> 73467517 (-0.94%)
helped: 13812
HURT: 16846
helped stats (abs) min: 1 max: 291078 x̄: 82.83 x̃: 4
helped stats (rel) min: <.01% max: 37.09% x̄: 3.47% x̃: 2.02%
HURT stats (abs)   min: 1 max: 1543 x̄: 26.31 x̃: 14
HURT stats (rel)   min: <.01% max: 77.97% x̄: 4.11% x̃: 2.58%
95% mean confidence interval for cycles value: -55.10 9.39
95% mean confidence interval for cycles %-change: 0.62% 0.77%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total cycles in shared programs: 904844939 -> 904832320 (<.01%)
cycles in affected programs: 525360 -> 512741 (-2.40%)
helped: 215
HURT: 4
helped stats (abs) min: 4 max: 1018 x̄: 60.16 x̃: 39
helped stats (rel) min: 0.14% max: 15.85% x̄: 2.16% x̃: 2.04%
HURT stats (abs)   min: 79 max: 79 x̄: 79.00 x̃: 79
HURT stats (rel)   min: 1.31% max: 1.57% x̄: 1.43% x̃: 1.43%
95% mean confidence interval for cycles value: -75.02 -40.22
95% mean confidence interval for cycles %-change: -2.37% -1.81%
Cycles are helped.
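
The summary lines in these reports can be read as a rule on the 95% mean
confidence interval of the absolute change. The following is a minimal Python
sketch of that reading. The `classify` function is hypothetical and is not
taken from the shader-db report script; the rule it encodes is an assumption
based on the "Inconclusive result" wording above.

```python
def classify(ci_low: float, ci_high: float) -> str:
    """Classify a change from the confidence interval of its mean value."""
    # Assumed rule: an interval that contains 0 cannot tell help from hurt.
    if ci_low <= 0.0 <= ci_high:
        return "inconclusive"
    # Negative values mean fewer instructions or cycles, i.e. an improvement.
    return "helped" if ci_high < 0.0 else "hurt"


print(classify(-1.97, -1.86))    # Ice Lake instructions
print(classify(-55.10, 9.39))    # Ice Lake cycles
print(classify(-75.02, -40.22))  # Broadwell cycles
```

Under this assumed rule, the three intervals quoted above come out as helped,
inconclusive, and helped, which matches the summary lines in the report.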

No shader-db changes on any older Intel platforms.

Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 142622800 -> 141461114 (-0.8%)
Instructions helped: 197186

Cycles in all programs: 9101223846 -> 9099440025 (-0.0%)
Cycles helped: 37963
Cycles hurt: 151233

Spills in all programs: 98829 -> 13695 (-86.1%)
Spills helped: 2159

Fills in all programs: 128142 -> 18400 (-85.6%)
Fills helped: 2159
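
The percentages in the totals above are plain relative changes. As a quick
check, this Python sketch recomputes them from the before/after totals; the
`pct_change` helper is hypothetical and only illustrates the arithmetic.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Before/after totals copied from the Ice Lake fossil-db numbers above.
totals = [
    ("Instructions", 142622800, 141461114),
    ("Cycles", 9101223846, 9099440025),
    ("Spills", 98829, 13695),
    ("Fills", 128142, 18400),
]

for name, before, after in totals:
    print(f"{name}: {before} -> {after} ({pct_change(before, after):+.1f}%)")
```

Spills and fills drop by about 86%, which is where the "85%?!?" remark in the
message comes from.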

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>

Ian Romanick [Mon, 27 Jun 2022 22:22:03 +0000 (15:22 -0700)]
intel/fs: Add _LOGICAL versions of URB messages

The lowering is currently fake.  It just changes the opcode from the
_LOGICAL version to the non-_LOGICAL version.
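
As an illustration of what "fake" lowering means here, the sketch below maps
an opcode name to its non-_LOGICAL counterpart and does nothing else. It is
written in Python for brevity, not in the compiler's own language, and the
opcode names are hypothetical placeholders rather than real enum values.

```python
LOGICAL_SUFFIX = "_LOGICAL"


def fake_lower(opcode: str) -> str:
    """Swap a _LOGICAL opcode for its non-_LOGICAL twin; no other rewriting."""
    if opcode.endswith(LOGICAL_SUFFIX):
        return opcode[: -len(LOGICAL_SUFFIX)]
    # Anything else passes through untouched.
    return opcode


# Hypothetical opcode names, for illustration only.
print(fake_lower("EXAMPLE_URB_WRITE_LOGICAL"))
print(fake_lower("EXAMPLE_URB_READ"))
```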

v2: Remove some rebase cruft.  's/gfx8_//;s/simd8_//' in
brw_instruction_name.  Both suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>

Ian Romanick [Mon, 27 Jun 2022 19:24:58 +0000 (12:24 -0700)]
intel/compiler: Move logical-send lowering to a separate file

brw_fs.cpp was 10kloc.  Now it's only 7.5kloc.  Ugh.

v2: Rebase on 9680e0e4a2d.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>

Ian Romanick [Tue, 28 Jun 2022 19:26:04 +0000 (12:26 -0700)]
intel/eu: Validate some aspects of URB messages

If these checks had been in place previously, some bugs
that... eh-hem... practically took down the Intel CI would have been
caught earlier. *blush*

v2: Update to account for split sends.

v3: Add some more Gfx version checks.  Remove the redundant "src0 is a
GRF" check.  Both suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>

Ian Romanick [Tue, 5 Jul 2022 17:03:07 +0000 (10:03 -0700)]
intel/compiler: Rename vec4 state URB opcodes to have VEC4_ prefix

An argument could be made that all stage-specific opcodes for vec4
stages should be prefixed with VEC4_ like the stage-agnostic opcodes.
I'll leave those additional sed jobs for another day.

    egrep -lr '(VS|GS|TCS)_OPCODE_URB_WRITE' src |\
    while read f; do
        sed --in-place 's/\(VS\|GS\|TCS\)_OPCODE_URB_WRITE/VEC4_\1_OPCODE_URB_WRITE/g' $f
    done

    egrep -lr 'T.S_OPCODE[_A-Z]*URB_OFFSETS' src |\
    while read f; do
        sed --in-place 's/\(T.S_OPCODE[_A-Z]*URB_OFFSETS\)/VEC4_\1/g' $f
    done
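
To see what the two substitutions above do to a single identifier, the same
patterns can be tried in Python. This is only a sketch: the
`rename_vec4_urb_opcodes` helper is hypothetical, and the sample identifiers
are illustrative strings chosen to match the patterns.

```python
import re


def rename_vec4_urb_opcodes(text: str) -> str:
    """Apply the two sed substitutions from the commit message to a string."""
    # sed: s/\(VS\|GS\|TCS\)_OPCODE_URB_WRITE/VEC4_\1_OPCODE_URB_WRITE/g
    text = re.sub(r"(VS|GS|TCS)_OPCODE_URB_WRITE", r"VEC4_\1_OPCODE_URB_WRITE", text)
    # sed: s/\(T.S_OPCODE[_A-Z]*URB_OFFSETS\)/VEC4_\1/g
    text = re.sub(r"(T.S_OPCODE[_A-Z]*URB_OFFSETS)", r"VEC4_\1", text)
    return text


print(rename_vec4_urb_opcodes("case GS_OPCODE_URB_WRITE:"))
print(rename_vec4_urb_opcodes("TCS_OPCODE_SET_INPUT_URB_OFFSETS"))
```

Neither pattern is anchored, so running the rename a second time would add a
second VEC4_ prefix. It is a one-shot job.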

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>

Jesse Natalie [Wed, 6 Jul 2022 16:22:08 +0000 (09:22 -0700)]
dzn: Add for condition to break nested loop

Fixes: d132ec92 ("dzn: Support native image copies when formats are compatible")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17377>

pal1000 [Fri, 8 Jul 2022 09:40:55 +0000 (12:40 +0300)]
dzn: Fix incompatible pointer type error affecting MSYS2 MINGW32

Suggested-by: Yonggang Luo <luoyonggang@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6807

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17414>

David Heidelberg [Thu, 7 Jul 2022 13:13:46 +0000 (15:13 +0200)]
ci/traces: piglit, be more verbose

Print more information about traces testing progress.

Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17416>

Samuel Pitoiset [Wed, 11 May 2022 08:29:12 +0000 (10:29 +0200)]
radv/ci: enable fossils testing for GFX1100

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16447>

Rhys Perry [Thu, 19 May 2022 15:09:13 +0000 (16:09 +0100)]
aco: use scratch_* for VGPR spill/reload on GFX9+

fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 122808 -> 122782 (-0.02%); split: -0.11%, +0.09%
CodeSize: 711248 -> 710788 (-0.06%); split: -0.16%, +0.10%
SpillSGPRs: 928 -> 831 (-10.45%)
SpillVGPRs: 1626 -> 1624 (-0.12%)
Latency: 4960285 -> 4932547 (-0.56%)
InvThroughput: 2574083 -> 2559953 (-0.55%)
VClause: 3404 -> 3402 (-0.06%)
Copies: 36992 -> 37181 (+0.51%); split: -0.05%, +0.56%
Branches: 3582 -> 3585 (+0.08%)
PreVGPRs: 3055 -> 3057 (+0.07%)

fossil-db (vega10):
Totals from 12 (0.01% of 161355) affected shaders:
Instrs: 124817 -> 124383 (-0.35%); split: -0.46%, +0.12%
CodeSize: 705116 -> 703664 (-0.21%); split: -0.44%, +0.23%
SpillSGPRs: 1012 -> 898 (-11.26%)
SpillVGPRs: 1632 -> 1624 (-0.49%)
Scratch: 201728 -> 200704 (-0.51%)
Latency: 6160115 -> 6266025 (+1.72%); split: -0.34%, +2.06%
InvThroughput: 6440203 -> 6544595 (+1.62%); split: -0.35%, +1.97%
VClause: 3409 -> 3423 (+0.41%)
Copies: 37929 -> 37748 (-0.48%); split: -1.16%, +0.69%
Branches: 3851 -> 3855 (+0.10%); split: -0.13%, +0.23%
PreVGPRs: 3053 -> 3055 (+0.07%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 13:12:08 +0000 (14:12 +0100)]
aco: use scratch_* for scratch load/store on GFX9+

fossil-db (navi21):
Totals from 52 (0.03% of 162293) affected shaders:
Instrs: 83190 -> 82145 (-1.26%)
CodeSize: 454892 -> 447260 (-1.68%); split: -1.68%, +0.00%
VGPRs: 4768 -> 4672 (-2.01%)
Latency: 1490887 -> 1487170 (-0.25%); split: -0.68%, +0.43%
InvThroughput: 935500 -> 933060 (-0.26%); split: -0.72%, +0.46%
VClause: 2715 -> 2632 (-3.06%); split: -4.53%, +1.47%
SClause: 1902 -> 1883 (-1.00%)
Copies: 8839 -> 8496 (-3.88%)
PreSGPRs: 2012 -> 1807 (-10.19%)
PreVGPRs: 3282 -> 3192 (-2.74%)

fossil-db (vega10):
Totals from 41 (0.03% of 161355) affected shaders:
Instrs: 35772 -> 35699 (-0.20%)
CodeSize: 187040 -> 186584 (-0.24%)
VGPRs: 4044 -> 4072 (+0.69%)
Latency: 243088 -> 242379 (-0.29%)
InvThroughput: 180301 -> 179783 (-0.29%)
VClause: 1204 -> 1216 (+1.00%)
SClause: 653 -> 637 (-2.45%)
Copies: 3736 -> 3704 (-0.86%); split: -0.88%, +0.03%
PreSGPRs: 1331 -> 1207 (-9.32%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 13:12:08 +0000 (14:12 +0100)]
aco: initialize scratch base registers on GFX9-GFX10.3

fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)

fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 17:21:34 +0000 (18:21 +0100)]
aco: treat flat-like as vmem in some scheduling heuristics

fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 48754 -> 48762 (+0.02%)
CodeSize: 267092 -> 267124 (+0.01%)
Latency: 1293798 -> 1292303 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 854599 -> 853578 (-0.12%)
VClause: 1623 -> 1619 (-0.25%)
SClause: 1187 -> 1188 (+0.08%); split: -0.08%, +0.17%

fossil-db (vega10):
Totals from 1 (0.00% of 161355) affected shaders:
Latency: 18720 -> 18848 (+0.68%)
InvThroughput: 5775 -> 5776 (+0.02%)
SClause: 12 -> 11 (-8.33%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Wed, 25 May 2022 16:21:50 +0000 (17:21 +0100)]
aco: include scratch/global in VMEM WAW optimization

fossil-db (navi21):
Totals from 2 (0.00% of 162293) affected shaders:
Instrs: 4788 -> 4785 (-0.06%)
CodeSize: 25884 -> 25872 (-0.05%)
Latency: 255008 -> 252950 (-0.81%)
InvThroughput: 170005 -> 168633 (-0.81%)
VClause: 206 -> 205 (-0.49%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Wed, 25 May 2022 16:21:10 +0000 (17:21 +0100)]
aco: avoid WAW hazard with BVH MIMG and other VMEM

According to LLVM, image_bvh64_intersect_ray does not write results in
order with other VMEM instructions.

fossil-db (navi21):
Totals from 7 (0.00% of 162293) affected shaders:
Instrs: 39978 -> 39985 (+0.02%)
CodeSize: 219356 -> 219384 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 15:09:13 +0000 (16:09 +0100)]
aco: refactor VGPR spill/reload lowering

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 14:34:04 +0000 (15:34 +0100)]
aco: handle subtractions in parse_base_offset

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 14:19:12 +0000 (15:19 +0100)]
aco: combine additions and constants into scratch load/store

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 14:55:53 +0000 (15:55 +0100)]
aco: improve support for scratch_* instructions

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 14:18:36 +0000 (15:18 +0100)]
aco: make FLAT_instruction::offset signed

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 16:54:38 +0000 (17:54 +0100)]
aco: include flat-like in vmem clause statistics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Rhys Perry [Thu, 19 May 2022 15:56:56 +0000 (16:56 +0100)]
aco: make flat access latency match mtbuf/mubuf/mimg

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>

Corentin Noël [Fri, 8 Jul 2022 10:06:26 +0000 (12:06 +0200)]
virgl: Only propagate the uniform numbers if the numbers are actually right

When the field was first introduced, the numbers reported the number of
vec4s instead of the number of floats. Do not propagate them if they are wrong.

Fixes: d92c1ca01b326d8f0ff210828830d6542f9e67f7

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17415>

Guilherme Gallo [Thu, 7 Jul 2022 02:51:26 +0000 (23:51 -0300)]
ci/lava: Add canceled job status

We should be explicit that we are canceling jobs once the script finds
some log messages that are linked with known issues. That means the
script preemptively retried the job without giving it a chance to recover.

Add magenta color to canceled jobs.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>

Guilherme Gallo [Thu, 7 Jul 2022 02:31:27 +0000 (23:31 -0300)]
ci/lava: Add `slow` pytest marker

Mark test_full_yaml_log with this new marker so that developers can
easily run it.
Make `debian-testing` skip this test with the `not slow` marker hint.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>

Guilherme Gallo [Thu, 7 Jul 2022 02:22:09 +0000 (23:22 -0300)]
ci/lava: Color red for fatal and yellow for warning

Fatal errors now have a red foreground color and retry messages a yellow
one.
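
For reference, foreground colors in a terminal log are usually set with ANSI
SGR escape sequences. The sketch below shows the idea. The `colorize` helper
and the level names are hypothetical and are not the names used in the LAVA
scripts.

```python
# Standard ANSI SGR codes: 31 is a red foreground, 33 is yellow, 0 resets.
RED = "\033[31m"
YELLOW = "\033[33m"
RESET = "\033[0m"

# Hypothetical level names, chosen only for this example.
LEVEL_COLORS = {"fatal": RED, "retry": YELLOW}


def colorize(level: str, message: str) -> str:
    """Wrap message in the color for its level, or return it unchanged."""
    color = LEVEL_COLORS.get(level)
    return f"{color}{message}{RESET}" if color else message


print(colorize("fatal", "job failed"))
print(colorize("retry", "retrying job"))
print(colorize("info", "job running"))
```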

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>

Guilherme Gallo [Thu, 7 Jul 2022 02:19:53 +0000 (23:19 -0300)]
ci/lava: Make hung job status yellow

It will help to know what happened to a non-successful job.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>

Guilherme Gallo [Thu, 7 Jul 2022 01:59:11 +0000 (22:59 -0300)]
ci/lava: Detect R8152 issues preemptively and retry

Implement a log-based retry hint for R8152 issue described in #6681,
which is based on detecting these two consecutive lines:

```
r8152 <USB> eth0: Tx status -71
nfs: server <IP> not responding, still trying
```

Where <IP> and <USB> could be any IP and USB addresses, respectively.
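
A minimal Python sketch of such a check on consecutive log lines is shown
below. The `should_retry` function, the regular expressions, and the sample
addresses are assumptions made for illustration; they are not the code added
by this commit.

```python
import re

# <USB> and <IP> are matched loosely as any run of non-space characters.
R8152_TX_ERROR = re.compile(r"r8152 \S+ eth0: Tx status -71")
NFS_STALLED = re.compile(r"nfs: server \S+ not responding, still trying")


def should_retry(log_lines: list) -> bool:
    """Return True if the two known-bad lines appear back to back."""
    return any(
        R8152_TX_ERROR.search(first) and NFS_STALLED.search(second)
        for first, second in zip(log_lines, log_lines[1:])
    )


# Sample log with made-up USB and IP addresses.
sample = [
    "booting...",
    "r8152 2-1:1.0 eth0: Tx status -71",
    "nfs: server 192.168.0.1 not responding, still trying",
]
print(should_retry(sample))
```

Pairing each line with its successor through `zip` keeps the check strict
about adjacency: the same two messages separated by any other line do not
trigger a retry.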

This commit is a temporary fix, since a proper one requires a section-aware
log follower, implemented in !16323. Once the cited MR is merged, a proper
fix will be made on top of it.

Closes: #6681

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>

Guilherme Gallo [Thu, 7 Jul 2022 01:52:23 +0000 (22:52 -0300)]
ci/lava: Split lava_log into modules

This script is getting too big; it has been hard to extend.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17389>

Mike Blumenkrantz [Tue, 5 Jul 2022 14:52:42 +0000 (10:52 -0400)]
zink: flush pending clears for fb texture barriers

if a texture barrier occurs while clears are pending, these clears should
show up if the fb attachments are read in shaders, so trigger a renderpass
to flush out the clears

cc: mesa-stable

fixes #6766

fixes (radv):
dEQP-GLES3.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_advanced_blend_eq
dEQP-GLES3.functional.draw_buffers_indexed.overwrite_common.common_blend_eq_buffer_advanced_blend_eq
dEQP-GLES3.functional.draw_buffers_indexed.overwrite_common.common_separate_blend_eq_buffer_advanced_blend_eq
dEQP-GLES3.functional.draw_buffers_indexed.overwrite_indexed.common_advanced_blend_eq_buffer_advanced_blend_eq
dEQP-GLES3.functional.draw_buffers_indexed.overwrite_indexed.common_advanced_blend_eq_buffer_blend_eq
dEQP-GLES3.functional.draw_buffers_indexed.overwrite_indexed.common_advanced_blend_eq_buffer_separate_blend_eq

Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17363>

Samuel Pitoiset [Fri, 8 Jul 2022 06:31:01 +0000 (08:31 +0200)]
radv: fix dumping VS prologs assembly

This got removed by mistake and broke
RADV_DEBUG=shaders,nocache,prologs.

Fixes: 9fe2b6b7480 ("aco/radv: provide a vs prolog callback from aco to radv.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17413>

Tatsuyuki Ishi [Tue, 10 May 2022 04:44:05 +0000 (13:44 +0900)]
radv: Fix vkCmdCopyQueryResults -> vkCmdResetPool hazard.

The Vulkan specification states:

> Query commands, for the same query and submitted to the same queue,
> execute in their entirety in submission order, relative to each other. In
> effect there is an implicit execution dependency from each such query
> command to all query commands previously submitted to the same queue.

Fixes dEQP-VK.query_pool.statistics_query.reset_after_copy.*

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17400>

Georg Lehmann [Thu, 7 Jul 2022 20:10:09 +0000 (22:10 +0200)]
aco/assembler: Fix s_bitreplicate_b64_b32 on GFX9.

This seems to be a relic from before aco added per-generation opcodes.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17405>

Georg Lehmann [Thu, 7 Jul 2022 21:54:39 +0000 (23:54 +0200)]
aco: Fix swapping sources in SOPC -> SOPK optimization.

Fixes: 2d6b0a4177b ("aco/optimizer: Optimize SOPC with literal to SOPK.")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17407>

Georg Lehmann [Fri, 8 Jul 2022 06:43:59 +0000 (08:43 +0200)]
r600/sfn: Add missing std::array include.

Fixes: 79ca456b483 ("r600/sfn: rewrite NIR backend")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6824

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17412>

Pierre-Eric Pelloux-Prayer [Tue, 5 Jul 2022 11:44:24 +0000 (13:44 +0200)]
radeonsi: use LLVMBuildLoad2 for inter-stage outputs loads

The PS case was covered by the previous commit, so we can use f32
everywhere.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Pierre-Eric Pelloux-Prayer [Tue, 5 Jul 2022 11:42:23 +0000 (13:42 +0200)]
radeonsi: use LLVMBuildLoad2 in llvm PS

PS is the only shader type where unpacked 16-bit outputs can be used,
so use ac_shader_abi::is_16bit to pass the proper type to LLVMBuildLoad2.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Pierre-Eric Pelloux-Prayer [Tue, 5 Jul 2022 11:37:59 +0000 (13:37 +0200)]
ac/llvm: use LLVMBuildLoad2 in visit_load

Only FS can have f16 outputs, so always use f32 here.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Pierre-Eric Pelloux-Prayer [Tue, 5 Jul 2022 11:36:50 +0000 (13:36 +0200)]
ac/llvm: handle opaque pointers in visit_store_output

Outputs are always f32 or f16.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Pierre-Eric Pelloux-Prayer [Tue, 5 Jul 2022 10:26:00 +0000 (12:26 +0200)]
ac: add per output is_16bit flag to ac_shader_abi

Outputs are always f32 except for FS that may use unpacked f16.
Store this information here to make it available to later processing.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Pierre-Eric Pelloux-Prayer [Mon, 20 Jun 2022 08:14:51 +0000 (10:14 +0200)]
radeonsi: use LLVMBuildLoad2 where possible

This commit replaces LLVMBuildLoad usage with LLVMBuildLoad2
where possible (= where the pointee type is known).

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Pierre-Eric Pelloux-Prayer [Mon, 4 Jul 2022 14:14:21 +0000 (16:14 +0200)]
ac: use LLVMContextSetOpaquePointers if available

Disabling opaque pointers in LLVM doesn't fix all the issues but
it makes pointers non-opaque by default (e.g. LLVMPointerType()
returns a typed pointer).

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>

Danylo Piliaiev [Thu, 7 Jul 2022 14:58:27 +0000 (17:58 +0300)]
zink: re-enable EXT_primitives_generated_query for Turnip

https://gitlab.freedesktop.org/mesa/mesa/-/issues/6602 is resolved,
so the extension can be re-enabled.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17163>

Danylo Piliaiev [Mon, 27 Jun 2022 16:01:08 +0000 (19:01 +0300)]
tu: Fix prim gen query and pipeline stats query interaction

Fixed:
- VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT was able to stop prim counter
  when pipeline stats query is running.
  - This may have happened when prim gen query was in secondary cmdbuf.
- VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT counting geometry in each tile.
- VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT counting geometry in each tile
  when pipeline stats query is started inside prim gen query and inside
  a renderpass.

The matter of VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT and pipeline stats
interaction is solved by tracking whether pipeline stats query is
running both on CPU (for non secondary cmdbuf case) and on GPU (for
secondary cmdbuf).

Note, prim gen query is not allowed with secondary command buffers, so
only pipeline stats query is tracked on gpu.
See https://gitlab.khronos.org/vulkan/vulkan/-/issues/3142

Counting geometry in each tile is solved by:
- Conditionally executing START/STOP_PRIMITIVE_CTRS so that they do not run
  in the tiling pass. Solves the case when prim gen query is inside a
  renderpass.
- Stopping prim counters before executing `draw_cs` and restarting them
  afterwards. Solves prim gen query being outside a renderpass.

Fixes GL CTS tests with Zink + `TU_DEBUG=gmem`:
 GTF-GL46.gtf30.GL3Tests.transform_feedback.transform_feedback_max_separate
 GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_basic
 GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_framebuffer
 GTF-GL46.gtf40.GL3Tests.transform_feedback3.transform_feedback3_streams_overflow
 GTF-GL46.gtf40.GL3Tests.transform_feedback3.transform_feedback3_streams_queried
 GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6602

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17163>

Danylo Piliaiev [Tue, 21 Jun 2022 11:37:00 +0000 (14:37 +0300)]
tu,freedreno: Refactored START/STOP events for pipeline stats

For a5xx+ renamed:
- RST_PIX_CNT -> START_FRAGMENT_CTRS
- RST_VTX_CNT -> STOP_FRAGMENT_CTRS
- TILE_FLUSH  -> START_COMPUTE_CTRS
- STAT_EVENT  -> STOP_COMPUTE_CTRS
I'm not sure about a5xx itself but I'll take a chance of it being
similar to a6xx in this regard.

Knowing this, emit_begin_stat_query/emit_end_stat_query can now emit
only the events that are needed for the pool's flags.

Also, the primitives generated query clearly doesn't need fragment and
compute counters.

Passes tests:
 dEQP-VK.query_pool.statistics_query.*
 dEQP-VK.transform_feedback.primitives_generated_query.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17163>

Samuel Pitoiset [Thu, 7 Jul 2022 10:55:05 +0000 (12:55 +0200)]
aco: fix load_barycentric_at_sample without MSAA

It's legal to use this instruction in a fragment shader, even if the
graphics pipeline doesn't use MSAA.

Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.sample_n_*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17398>

Iago Toral Quiroga [Thu, 7 Jul 2022 10:44:10 +0000 (12:44 +0200)]
nir/serialize: fix missing divergence info after deserialization

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17397>

Igor Torrente [Thu, 30 Jun 2022 12:50:36 +0000 (09:50 -0300)]
venus: Use maintenance4 to get max_size_buffer

This should help speed up the device initialization process.

Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17399>

Igor Torrente [Wed, 29 Jun 2022 20:38:43 +0000 (17:38 -0300)]
venus: Add support for the VK_KHR_maintenance4 extension

Signed-off-by: Igor Torrente <igor.torrente@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17399>

Jesse Natalie [Thu, 7 Jul 2022 20:35:08 +0000 (13:35 -0700)]
dzn: Only support high/normal queue priorities

D3D uses an int, which seems like it would support values between 0 and 100,
but in reality it only accepts values of exactly 0 or 100. The space
is left in case future values were to be added, so that comparisons
would work (e.g. MEDIUM_HIGH < HIGH).

Treat anything higher than 0.5 as HIGH, and anything less as NORMAL.
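
The mapping can be sketched as follows, in Python for brevity rather than in
the driver's C. The function name and the constants are hypothetical
stand-ins; only the 0 and 100 values and the 0.5 threshold come from the
message above. Treating a priority of exactly 0.5 as NORMAL is an assumption.

```python
# Stand-ins for the two D3D priority values the message says are accepted.
PRIORITY_NORMAL = 0
PRIORITY_HIGH = 100


def to_d3d_priority(vk_priority: float) -> int:
    """Map a Vulkan queue priority in [0.0, 1.0] to one of two D3D values."""
    # Assumption: exactly 0.5 falls on the NORMAL side of the threshold.
    return PRIORITY_HIGH if vk_priority > 0.5 else PRIORITY_NORMAL


for p in (0.0, 0.5, 0.75, 1.0):
    print(p, "->", to_d3d_priority(p))
```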

Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17406>

Alyssa Rosenzweig [Fri, 17 Jun 2022 20:40:32 +0000 (16:40 -0400)]
panfrost: Clear with a quad to avoid flushing

Flushing the batch midframe (splitting a renderpass) is expensive on a tiler, as
it requires the GPU to flush the framebuffer contents to main memory and read
them back. Clearing the framebuffer should not trigger a flush. Apps expect
clears to be (almost) free; flushing for a clear is at the very least unexpected
behaviour.

The only reason we previously flushed is to ensure we could always use a "fast"
clear. But a slow clear is a heck of a lot faster than a flush ;-) Instead of
flushing, we should clear with a draw (via u_blitter) in case a fast clear isn't
possible.

This fixes pathological performance for applications that rely on partial clears
within a frame. This issue was identified with Inochi2D, which repeatedly clears
the stencil buffer midframe, in order to implement masking efficiently with the
stencil buffer. In total, the all-important workload of rendering Asahi Lina is
improved from 17fps to 29fps on a panfrost device.

Fixes: c138ca80d23 ("panfrost: Make sure a clear does not re-use a pre-existing batch")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17112>

2 years agopanfrost/ci: Disable 0ad trace on T860
Alyssa Rosenzweig [Thu, 7 Jul 2022 22:28:44 +0000 (18:28 -0400)]
panfrost/ci: Disable 0ad trace on T860

The last few frames of the trace are expensive (in terms of GPU time) and are
close to hitting the timeout. With the next commit, they do hit the timeout due
to using a larger batch. Nevertheless the next commit should be an overall perf
improvement on average, so remove the trace to unblock CI.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17112>

2 years agopan/decode: Change indent when decoding resources
Icecream95 [Thu, 7 Jul 2022 00:27:50 +0000 (12:27 +1200)]
pan/decode: Change indent when decoding resources

Make the separation between entries in the resource table more
obvious.

Increase the indent by two levels to keep descriptors distinct from
the resource entry itself.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17371>

2 years agopan/decode: Use tag bits for resource entry count
Icecream95 [Thu, 7 Jul 2022 00:26:03 +0000 (12:26 +1200)]
pan/decode: Use tag bits for resource entry count

Fixes crashes when decoding the blob, which sometimes uses fewer than
9 entries.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17371>

2 years agopan/decode: fflush buffers after dumping and before aborts
Icecream95 [Wed, 6 Jul 2022 23:51:49 +0000 (11:51 +1200)]
pan/decode: fflush buffers after dumping and before aborts

Otherwise trace files or other files being written (dEQP TestResults?)
might be truncated.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17371>

2 years agopan/va: Use the _safe iterator when adding blend shader calls
Icecream95 [Wed, 6 Jul 2022 09:24:40 +0000 (21:24 +1200)]
pan/va: Use the _safe iterator when adding blend shader calls

Otherwise, changing the list's 'next' pointer during iteration will cause
the assertion in list_for_each_entry to be hit.

This was not hit before because list_assert is defined for debug
builds but not debugoptimized.
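
A minimal sketch of why a "safe" iterator tolerates insertion, using a hypothetical singly linked list rather than Mesa's list.h. The safe iterator reads the next pointer before the loop body runs, so inserting after the current node does not disturb the traversal. All names below are illustrative.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def build(values):
    """Build a singly linked list from a Python list and return its head."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next = head
        head = node
    return head


def to_list(head):
    """Collect the values of the linked list in order."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def for_each_safe(head):
    """Yield each node, saving 'next' before the caller's loop body runs."""
    node = head
    while node is not None:
        saved_next = node.next  # captured before the body can modify the list
        yield node
        node = saved_next


def add_calls(head):
    """Insert a hypothetical 'call' node after every 'blend' node."""
    visited = []
    for node in for_each_safe(head):
        visited.append(node.value)
        if node.value == "blend":
            call = Node("call")
            call.next = node.next
            node.next = call
    return visited
```

In this sketch the newly inserted node is not visited by the loop that inserted it, because the traversal continues from the pointer saved before the insertion.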

Fixes: 5067a26f443 ("pan/bi: Use flow control lowering on Valhall")
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17371>

2 years agopanfrost: Enable QUAD_STRIP and POLYGON on v6
Icecream95 [Thu, 19 May 2022 11:01:56 +0000 (23:01 +1200)]
panfrost: Enable QUAD_STRIP and POLYGON on v6

I wrote fiction about dreaming that these were supported but after
waking finding that they were not. Somehow I came to consider that
fiction as fact and I never thought to test if they did work.

While reverse engineering the polygon list format, I found that these
were supported after all.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17371>

2 years agopanfrost: Fix PIPE_COMPUTE_CAP_SUBGROUP_SIZE
Alyssa Rosenzweig [Fri, 24 Jun 2022 21:33:15 +0000 (17:33 -0400)]
panfrost: Fix PIPE_COMPUTE_CAP_SUBGROUP_SIZE

Use the new helper to implement the CAP, correctly handling Midgard and Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17265>

2 years agopanfrost: Fix PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS
Alyssa Rosenzweig [Fri, 24 Jun 2022 21:37:30 +0000 (17:37 -0400)]
panfrost: Fix PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS

This maps to CL_DEVICE_MAX_COMPUTE_UNITS, which is defined as:

   The number of parallel compute cores on the OpenCL device.

Since all supported Malis are unified shader cores, the number of parallel
compute cores is simply the number of shader cores.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17265>

2 years agopanfrost: Move bifrost_lanes_per_warp to common
Alyssa Rosenzweig [Mon, 4 Jul 2022 20:43:59 +0000 (16:43 -0400)]
panfrost: Move bifrost_lanes_per_warp to common

Whereas the compiler needs to know the warp size for lowering divergent
indirects, the driver needs to know it to report the subgroup size. Move the
Bifrost-specific helper to common and add the trivial implementation for
Midgard.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17265>

2 years agopanfrost: Separate core ID range from core count
Alyssa Rosenzweig [Fri, 24 Jun 2022 21:43:09 +0000 (17:43 -0400)]
panfrost: Separate core ID range from core count

To query the core count, the hardware has a SHADERS_PRESENT register containing
a mask of shader cores connected. The core count equals the number of 1-bits,
regardless of placement. This value is useful for public consumption (like
in clinfo).

However, internally we are interested in the range of core IDs.
We usually query core count to determine how many cores to allocate various
per-core buffers for (performance counters, occlusion queries, and the stack).
In each case, the hardware writes at the index of its core ID, so we have to
allocate enough for the entire range of core IDs. If the core mask is
discontiguous, this necessarily overallocates.

Rename the existing core_count to core_id_range, better reflecting its
definition and purpose, and repurpose core_count for the actual core count.
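
The difference between the two values can be sketched from a SHADERS_PRESENT-style bitmask. The function names and example masks are illustrative assumptions, not the driver's code.

```python
def core_count(shaders_present: int) -> int:
    """Number of cores: the number of 1-bits in the mask."""
    return bin(shaders_present).count("1")


def core_id_range(shaders_present: int) -> int:
    """Number of core IDs to allocate for: highest set bit index plus one."""
    return shaders_present.bit_length()


# Contiguous mask: cores 0, 1, 2 present, so both values agree.
contiguous = 0b0111
# Discontiguous mask: cores 0, 1, 3 present, so the ID range is larger.
discontiguous = 0b1011
```

For the discontiguous mask the core count is 3 while the core ID range is 4, which is the overallocation the message describes: per-core buffers are indexed by core ID, so the slot for the missing core 2 is allocated but unused.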

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17265>

2 years agopan/bi: Implement f2f16{_rtz, _rtne}
Alyssa Rosenzweig [Mon, 27 Jun 2022 19:46:15 +0000 (15:46 -0400)]
pan/bi: Implement f2f16{_rtz, _rtne}

Float conversions with explicit rounding modes are required for OpenCL,
as well as for Vulkan with the VK_KHR_16bit_storage extension (mandatory
in Vulkan 1.1). Since the hardware conversion instructions allow
configuring the round mode, this is easy to support :-)

Fixes test_half.vstore_half_rtz.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17262>

2 years agopan/va: Add missing <roundmode/> to V2F32_TO_V2F16
Alyssa Rosenzweig [Mon, 27 Jun 2022 19:48:31 +0000 (15:48 -0400)]
pan/va: Add missing <roundmode/> to V2F32_TO_V2F16

So we can implement f2f16_rtz.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17262>

2 years agointel/fs: ray query fix for global address
Lionel Landwerlin [Thu, 23 Jun 2022 11:15:51 +0000 (14:15 +0300)]
intel/fs: ray query fix for global address

With stages dispatching with a mask, we can run into situations where
we don't have the global address in all lanes. The existing code
always assumed we had the address in at least lane0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bb40e999d114 ("intel/nir: use a single intel intrinsic to deal with ray traversal")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17330>

2 years agopan/bi: Fix unpack_32_2x16 definition
Alyssa Rosenzweig [Thu, 23 Jun 2022 13:58:59 +0000 (09:58 -0400)]
pan/bi: Fix unpack_32_2x16 definition

This got messed up when scalarizing the IR. Fix the definition of the opcode to
return (instead of break, asserting out) and to respect the swizzle (instead of
failing validation). Noticed when bringing up OpenCL on Valhall.

Fixes: 5febeae58e0 ("pan/bi: Emit collect and split")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17222>

2 years agodzn: Fix winsys reporting
Jesse Natalie [Sat, 2 Jul 2022 18:08:46 +0000 (11:08 -0700)]
dzn: Fix winsys reporting

For Windows we don't support using the DISPLAY winsys,
and for WSL we should add the ones that support software

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17346>

2 years agoradv,aco,ac/llvm: use nir_op_f{sin,cos}_amd
Rhys Perry [Mon, 3 May 2021 10:10:06 +0000 (11:10 +0100)]
radv,aco,ac/llvm: use nir_op_f{sin,cos}_amd

This lets NIR optimize the multiplication, particularly sin/cos(a * #b).
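
A rough numerical sketch of the multiplication being optimized, assuming that fsin_amd(x) computes sin(2*pi*x), so that a plain sin(x) is lowered to fsin_amd(x * 1/(2*pi)). That definition is an assumption here, and the helper names are hypothetical. When b is a constant, the scale factor can be folded into it ahead of time, leaving a single runtime multiply.

```python
import math

INV_TWO_PI = 1.0 / (2.0 * math.pi)


def fsin_amd(x: float) -> float:
    """Assumed semantics of the opcode: input measured in revolutions."""
    return math.sin(2.0 * math.pi * x)


def sin_two_multiplies(a: float, b: float) -> float:
    """sin(a * b) lowered naively: (a * b) and then the 1/(2*pi) scale."""
    return fsin_amd((a * b) * INV_TWO_PI)


def sin_folded(a: float, b_prescaled: float) -> float:
    """sin(a * b) with the constant already folded: b_prescaled = b/(2*pi)."""
    return fsin_amd(a * b_prescaled)


# Constant folding happens once, at compile time in the real compiler.
B = 2.5
B_PRESCALED = B * INV_TWO_PI
```

Both forms agree with math.sin(a * B) up to floating-point rounding; the folded form simply needs one fewer multiply per evaluation.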

fossil-db (Sienna Cichlid):
Totals from 12306 (7.58% of 162293) affected shaders:
MaxWaves: 224814 -> 224834 (+0.01%)
Instrs: 17365273 -> 17338758 (-0.15%); split: -0.16%, +0.00%
CodeSize: 93478488 -> 93354912 (-0.13%); split: -0.14%, +0.01%
VGPRs: 752080 -> 752072 (-0.00%); split: -0.00%, +0.00%
SpillSGPRs: 8440 -> 8410 (-0.36%)
Latency: 200402154 -> 200279405 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 37588077 -> 37545545 (-0.11%); split: -0.11%, +0.00%
VClause: 293863 -> 293874 (+0.00%); split: -0.03%, +0.03%
SClause: 619539 -> 619064 (-0.08%); split: -0.09%, +0.01%
Copies: 1151591 -> 1151641 (+0.00%); split: -0.04%, +0.05%
Branches: 506434 -> 506437 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 877609 -> 877517 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 711938 -> 711940 (+0.00%); split: -0.00%, +0.00%

fossil-db (LLVM, Sienna Cichlid):
Totals from 4377 (3.59% of 121873) affected shaders:
SGPRs: 358960 -> 359176 (+0.06%); split: -0.18%, +0.25%
VGPRs: 319832 -> 319720 (-0.04%); split: -0.18%, +0.15%
SpillSGPRs: 46983 -> 47007 (+0.05%); split: -0.99%, +1.04%
CodeSize: 30872812 -> 30764512 (-0.35%); split: -0.39%, +0.04%
MaxWaves: 73814 -> 73904 (+0.12%); split: +0.25%, -0.13%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>

2 years agonir: allow 16-bit fsin_amd/fcos_amd
Rhys Perry [Thu, 13 May 2021 14:31:56 +0000 (15:31 +0100)]
nir: allow 16-bit fsin_amd/fcos_amd

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>

2 years agonir/algebraic: optimize bcsel(c, fsin/cos_amd(a), fsin/cos_amd(b))
Rhys Perry [Mon, 3 May 2021 09:55:39 +0000 (10:55 +0100)]
nir/algebraic: optimize bcsel(c, fsin/cos_amd(a), fsin/cos_amd(b))

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>

2 years agonir: rename fsin_r600/fcos_r600 to fsin_amd/fcos_amd
Rhys Perry [Fri, 1 Jul 2022 13:13:25 +0000 (14:13 +0100)]
nir: rename fsin_r600/fcos_r600 to fsin_amd/fcos_amd

GCN has better range, but constant folding is the same.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>

2 years agovulkan/wsi: define pWaitDstStageMask in the blit submission
Pierre-Eric Pelloux-Prayer [Thu, 30 Jun 2022 07:37:31 +0000 (09:37 +0200)]
vulkan/wsi: define pWaitDstStageMask in the blit submission

Otherwise we get a crash in vk_common_QueueSubmit when doing:
   .stageMask   = pSubmits[s].pWaitDstStageMask[i],

cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6712
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17310>

2 years agozink: support PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_FREEDRENO
Mike Blumenkrantz [Wed, 22 Jun 2022 13:16:17 +0000 (09:16 -0400)]
zink: support PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_FREEDRENO

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>

2 years agozink: don't warn for missing customBorderColorWithoutFormat on turnip
Mike Blumenkrantz [Tue, 21 Jun 2022 17:00:31 +0000 (13:00 -0400)]
zink: don't warn for missing customBorderColorWithoutFormat on turnip

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>

2 years agozink: disable customBorderColorWithoutFormat on turnip
Mike Blumenkrantz [Tue, 21 Jun 2022 16:59:51 +0000 (12:59 -0400)]
zink: disable customBorderColorWithoutFormat on turnip

this should "just work" now

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>

2 years agozink: init driver workarounds earlier in screen creation
Mike Blumenkrantz [Tue, 21 Jun 2022 16:59:29 +0000 (12:59 -0400)]
zink: init driver workarounds earlier in screen creation

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>

2 years agomesa/st: add PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_FREEDRENO
Mike Blumenkrantz [Wed, 22 Jun 2022 13:14:51 +0000 (09:14 -0400)]
mesa/st: add PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_FREEDRENO

this is for drivers (like freedreno) which need the format in the sampler
state in order to accurately handle border colors

when set, drivers MAY receive a format in the sampler state if the frontend
supports it (e.g., nine does not), and the cso sampler cache will include
the format member of the struct

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>

2 years agomesa/st: make get_sampler_view_format() public
Mike Blumenkrantz [Wed, 22 Jun 2022 13:14:11 +0000 (09:14 -0400)]
mesa/st: make get_sampler_view_format() public

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17189>

2 years agointel/compiler: Avoid copy propagating large registers into EOT messages
Kenneth Graunke [Thu, 7 Jul 2022 03:29:02 +0000 (20:29 -0700)]
intel/compiler: Avoid copy propagating large registers into EOT messages

EOT messages need to use g112-g127 for their sources.  With the new
opt_split_sends pass, we may be constructing an EOT message from two
different registers, and be able to copy propagate the original values
into those SENDs.

This can cause problems if we copy propagate from a large register
(say an RGBA value which is 4 GRFs in SIMD8 or 8 GRFs in SIMD16), in a
situation where the SEND only read a subset of that (say the alpha value
out of an RGBA texturing result).  g112-127 can only hold 16 registers
worth of data, and sometimes we can only use g112-126.  So, we can't
propagate if the GRFs in question are larger than 15 GRFs.
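
The size check can be illustrated with a small sketch. The 32-byte GRF size and 4-byte channels are assumptions chosen to match the "4 GRFs in SIMD8 or 8 GRFs in SIMD16" figures above, and the function names are hypothetical.

```python
GRF_BYTES = 32            # assumed register size
MAX_EOT_SOURCE_GRFS = 15  # g112-g126, the conservative limit from the message


def grfs_for_value(components: int, simd_width: int,
                   bytes_per_channel: int = 4) -> int:
    """GRFs occupied by a value with the given component count and width."""
    total_bytes = components * simd_width * bytes_per_channel
    return -(-total_bytes // GRF_BYTES)  # ceiling division


def can_propagate_into_eot(source_grfs: int) -> bool:
    """Only allow copy propagation if the whole source fits in the window."""
    return source_grfs <= MAX_EOT_SOURCE_GRFS
```

Under these assumptions an RGBA value takes 4 GRFs in SIMD8, 8 in SIMD16, and 16 in SIMD32, so only the SIMD32 case exceeds the 15-GRF limit.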

Fixes a shader validation failure in Alan Wake.  Thanks to Ian Romanick
for catching this!

shader-db on Icelake shows that only SIMD32 programs are affected, and
the results are pretty negligible:

   total instructions in shared programs: 19615228 -> 19615269 (<.01%)
   instructions in affected programs: 10702 -> 10743 (0.38%)
   helped: 1 / HURT: 43 / largest change: +/- 2 instructions

   total cycles in shared programs: 852001706 -> 852001566 (<.01%)
   cycles in affected programs: 767098 -> 766958 (-0.02%)
   helped: 68 / HURT: 64 / largest change: +/- 774 cycles

   GAINED: 2 / LOST: 0

Fixes: 589b03d02f0 ("intel/fs: Opportunistically split SEND message payloads")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6803
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17390>

2 years agor600/sfn: support nir_op_mulz and legazy math rules
Gert Wollny [Thu, 16 Jun 2022 07:37:56 +0000 (09:37 +0200)]
r600/sfn: support nir_op_mulz and legazy math rules

v2: Handle nir_op_ffmaz as well (Georg Lehmann)

Closes: #6390
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agor600: Cleanup nir options
Gert Wollny [Wed, 18 May 2022 20:35:48 +0000 (22:35 +0200)]
r600: Cleanup nir options

A general cleanup of the nir compiler options including separating
the handling for FS and all the other shaders.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agor600: enable sb by default also for NIR
Gert Wollny [Wed, 18 May 2022 20:29:06 +0000 (22:29 +0200)]
r600: enable sb by default also for NIR

Currently, the NIR code path doesn't use clause local registers,
but these seem to help a lot with some work loads.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agor600/sfn: rewrite NIR backend
Gert Wollny [Sat, 19 Jun 2021 11:03:32 +0000 (13:03 +0200)]
r600/sfn: rewrite NIR backend

This is a rewrite of the NIR backend. It adds some optimization
and a scheduler.

v2: - replace some magic numbers by constants
    - make sure constructor is always used with new
    - use default initialization in more places
      (changes suggested by Filip Gawin)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agor600: Update nir options
Gert Wollny [Wed, 18 May 2022 20:35:48 +0000 (22:35 +0200)]
r600: Update nir options

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agor600: Make sure that LDS instructions only use bank swizzle 012
Gert Wollny [Wed, 18 May 2022 20:27:39 +0000 (22:27 +0200)]
r600: Make sure that LDS instructions only use bank swizzle 012

Not sure whether this is really needed. With the TGSI code path no
other bank swizzle is emitted for LDS ops, so make sure it's the same here.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agor600: Add number of ALU groups to statistics
Gert Wollny [Wed, 18 May 2022 20:41:54 +0000 (22:41 +0200)]
r600: Add number of ALU groups to statistics

The number of ALU groups is important for good scheduling, so
let's add it to the stats.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17076>

2 years agoanv/utrace: use a bo pool for utrace buffers
Lionel Landwerlin [Thu, 19 May 2022 17:53:30 +0000 (20:53 +0300)]
anv/utrace: use a bo pool for utrace buffers

When utrace/perfetto is active, we allocate/free utrace buffers at the
same rate as command buffers. It's useful to have a pool that avoids
GEM_CREATE/GEM_CLOSE ioctls.

v2: Use the pool more

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16613>

2 years agodozen: Stop asking for semaphore/fence signaling
Jason Ekstrand [Thu, 7 Jul 2022 00:59:14 +0000 (19:59 -0500)]
dozen: Stop asking for semaphore/fence signaling

Dozen is currently a SW driver as far as WSI is concerned so it's going
to wait on a fence anyway.  Also, I highly doubt these are actually hooked
up properly.  It's probably just a copy+paste from ANV.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agodozen: Increase optimalBufferCopy*Alignment
Jason Ekstrand [Thu, 7 Jul 2022 14:47:54 +0000 (09:47 -0500)]
dozen: Increase optimalBufferCopy*Alignment

D3D12 requires the offset to be 512B-aligned and row pitch to be
256B-aligned for copy commands.  There will need to be a fallback
written eventually because Vulkan has no such requirements but these
will remain the optimal limits as they allow using the D3D12 copy
commands directly.
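
As a sketch of what these alignments mean for a copy, the helper below rounds a buffer offset and a row pitch up to the 512B and 256B limits quoted above. The helper and constant names are illustrative, and the 4-bytes-per-pixel image is an assumed example.

```python
OFFSET_ALIGNMENT = 512     # required offset alignment, per the message
ROW_PITCH_ALIGNMENT = 256  # required row pitch alignment, per the message


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


def copy_layout(offset: int, width_px: int, bytes_per_pixel: int = 4):
    """Return (aligned_offset, aligned_row_pitch) for a buffer copy."""
    row_bytes = width_px * bytes_per_pixel
    return (align_up(offset, OFFSET_ALIGNMENT),
            align_up(row_bytes, ROW_PITCH_ALIGNMENT))
```

For a 100-pixel-wide row at 4 bytes per pixel starting at offset 1000, the tightly packed pitch of 400 bytes becomes 512 and the offset becomes 1024.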

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi/win32: Use the new helpers and persistent map
Jesse Natalie [Thu, 7 Jul 2022 02:32:49 +0000 (19:32 -0700)]
vulkan/wsi/win32: Use the new helpers and persistent map

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi/wayland: Use host pointer import when available
Jason Ekstrand [Wed, 6 Jul 2022 22:27:42 +0000 (17:27 -0500)]
vulkan/wsi/wayland: Use host pointer import when available

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi/x11: Only use MIT_SHM if the device supports EXT_external_memory_host
Jason Ekstrand [Thu, 7 Jul 2022 00:31:38 +0000 (19:31 -0500)]
vulkan/wsi/x11: Only use MIT_SHM if the device supports EXT_external_memory_host

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi/x11: Don't leak shm_reply if we don't have dri3 or present
Jason Ekstrand [Thu, 7 Jul 2022 00:30:11 +0000 (19:30 -0500)]
vulkan/wsi/x11: Don't leak shm_reply if we don't have dri3 or present

Fixes: b5c390c113d3 ("vulkan/wsi: add support for detecting mit-shm pixmaps.")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi: Support tiled CPU images
Jason Ekstrand [Wed, 6 Jul 2022 23:04:58 +0000 (18:04 -0500)]
vulkan/wsi: Support tiled CPU images

Some drivers such as lavapipe are 100% fine with using linear for WSI
images.  Most HW drivers, however, would rather render tiled and eat a
blit.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi: Align buffer image strides to optimalBufferCopyRowPitchAlignment
Jason Ekstrand [Thu, 7 Jul 2022 14:54:19 +0000 (09:54 -0500)]
vulkan/wsi: Align buffer image strides to optimalBufferCopyRowPitchAlignment

This isn't a big deal for the current buffer paths because the required
alignment for PRIME is already higher than any driver advertises.
However, the SW path we're about to add won't have the PRIME requirement.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi: Compute stride and size in configure_buffer_image
Jason Ekstrand [Thu, 7 Jul 2022 15:08:30 +0000 (10:08 -0500)]
vulkan/wsi: Compute stride and size in configure_buffer_image

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi: Persistently map CPU images
Jason Ekstrand [Wed, 6 Jul 2022 23:52:32 +0000 (18:52 -0500)]
vulkan/wsi: Persistently map CPU images

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi: Delete SW support from configure_native_image
Jason Ekstrand [Wed, 6 Jul 2022 21:50:48 +0000 (16:50 -0500)]
vulkan/wsi: Delete SW support from configure_native_image

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>

2 years agovulkan/wsi/wayland: Use wsi_configure_cpu_image
Jason Ekstrand [Wed, 6 Jul 2022 21:48:53 +0000 (16:48 -0500)]
vulkan/wsi/wayland: Use wsi_configure_cpu_image

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17388>