freedreno/a6xx: Fix SP_HS_UNKNOWN_A831 value and document it
authorDanylo Piliaiev <dpiliaiev@igalia.com>
Mon, 14 Dec 2020 16:42:59 +0000 (18:42 +0200)
committerDanylo Piliaiev <dpiliaiev@igalia.com>
Mon, 21 Dec 2020 14:25:34 +0000 (16:25 +0200)
commite5499ca2bfe8b8d9520e2ef72ebc0da24425992c
tree2c67c83fd269190b6cf1dfadc9d65fe80e03e790
parent22180137e9709937116f622185d027a108410236
freedreno/a6xx: Fix SP_HS_UNKNOWN_A831 value and document it

It appears that storage for varyings in a wave has an upper
limit of wavesize * max_a831 where max_a831 is 64.
Exceeding the limit seam to force gpu to reduce primitives
processed per wave, at least calculations make sense with
such interpretation.

With blob SP_HS_UNKNOWN_A831 never exceeds 64 and setting
it to 65 in freedreno leads to a hang.

On A630 tests (patch_size=3 + gl_Position + array of vec4)
have shown such relation:

| Num of vec4 | A831 | PC_HS_INPUT_SIZE |
|-------------|------|------------------|
| 1           | 0x10 | 0xc              |
| 2           | 0x14 | 0xf              |
| 3           | 0x18 | 0x12             |
| 4           | 0x1c | 0x15             |
| 5           | 0x20 | 0x18             |
| 6           | 0x24 | 0x1b             |
| 7           | 0x28 | 0x1e             |
| 8           | 0x2c | 0x21             |
| 9           | 0x30 | 0x24             |
| 10          | 0x34 | 0x27             |
| 11          | 0x38 | 0x2a             |
| 12          | 0x3c | 0x2d             |
| 13          | 0x3f | 0x30             |
| 14          | 0x40 | 0x33             |
| 15          | 0x3d | 0x36             |
| 16          | 0x3d | 0x39             |
| 17          | 0x40 | 0x3c             |
| 18          | 0x3f | 0x3f             |
| 19          | 0x3e | 0x42             |
| 20          | 0x3d | 0x45             |
| 21          | 0x3f | 0x48             |
| 22          | 0x3d | 0x4b             |
| 23          | 0x40 | 0x4e             |
| 24          | 0x3d | 0x51             |
| 25          | 0x3f | 0x54             |
| 26          | 0x3c | 0x57             |
| 27          | 0x3e | 0x5a             |
| 28          | 0x40 | 0x5d             |
| 29          | 0x3c | 0x60             |
| 30          | 0x3e | 0x63             |
| 31          | 0x40 | 0x66             |
|-------------|------|------------------|

Brief tests with high patch sizes also confirm that formula
matches blob behaviour.

A831 is not a limit for storage available for one thread, so
naming it as SP_HS_WAVE_INPUT_SIZE would make more sense.

Fixes: 47e2c195 "freedreno/a6xx: Program state for tessellation stages"

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7917>
src/freedreno/.gitlab-ci/reference/crash.log
src/freedreno/.gitlab-ci/reference/dEQP-VK.draw.indirect_draw.indexed.indirect_draw_count.triangle_list.log
src/freedreno/.gitlab-ci/reference/fd-clouds.log
src/freedreno/registers/adreno/a6xx.xml
src/freedreno/vulkan/tu_pipeline.c
src/gallium/drivers/freedreno/a6xx/fd6_program.c