ac/nir: fix ngg culling on gfx11
The subtraction num_live_vertices_in_workgroup - subgroup_id*wave_size
can underflow. If subgroup_id*wave_size is larger than
num_live_vertices_in_workgroup, num_es_threads_var should be zero.
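For illustration only, a minimal C sketch of the failure mode and of a
saturating fix. The function names are hypothetical and this is not the
NIR code touched by the commit; the final clamp to wave_size is also an
assumption about how the result is consumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the old behaviour: the unsigned subtraction wraps
 * around when subgroup_id * wave_size exceeds num_live. */
uint32_t es_threads_wrapping(uint32_t num_live, uint32_t subgroup_id,
                             uint32_t wave_size)
{
   uint32_t remaining = num_live - subgroup_id * wave_size; /* may wrap */
   return remaining < wave_size ? remaining : wave_size;    /* assumed clamp */
}

/* Hypothetical model of the fixed behaviour: saturate the subtraction at
 * zero before clamping. */
uint32_t es_threads_saturating(uint32_t num_live, uint32_t subgroup_id,
                               uint32_t wave_size)
{
   uint32_t first = subgroup_id * wave_size;
   uint32_t remaining = num_live > first ? num_live - first : 0;
   return remaining < wave_size ? remaining : wave_size;
}

/* With 100 live vertices and wave_size 32, subgroup 4 starts at vertex 128,
 * which is past the end of the live range. */
void es_threads_self_check(void)
{
   assert(es_threads_wrapping(100, 4, 32) == 32);  /* wrapped: a full wave */
   assert(es_threads_saturating(100, 4, 32) == 0); /* expected: no threads */
}
```

Under these assumptions, the wrapped value is a huge unsigned number, so
the clamp selects a full wave instead of zero threads.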
fossil-db (gfx1100, nggc):
Totals from 41388 (30.75% of 134574) affected shaders:
Instrs: 25700772 -> 25783544 (+0.32%)
CodeSize: 126950072 -> 127281160 (+0.26%)
Latency: 92809233 -> 92849566 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 9526675 -> 9542194 (+0.16%)
Copies: 2031078 -> 2031074 (-0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20321>