ir3: Calcuate max_waves and threadsize
max_waves is just for shader-db stats for now, but threadsize will
replace the various mechanisms used to determine threadsize across the
different gen's. Calculating these correctly entails adding a bunch of
details about the sizes of various things to ir3. In the future we will
use the guts of the max_waves calculation to inform RA decisions as
well, which is why the max_waves calculation is broken up into register
dependent/independent pieces.
Something should be said about the units of reg_size_vec4. These units
were chosen for two reasons:
1. As said in the comment, it makes some calculations easier.
2. For a4xx/a5xx, where we don't know as much because we haven't done
the same sorts of experiments to probe for the HW configuration, it
corresponds more directly to things that are known. The existing code
switches to the smaller threadsize when r24.x or higher is used,
which translates directly to a reg_size_vec4 of 48. If we chose
different units (e.g. multiplying by wave_granularity and/or
threadsize_base), then to match the same behavior we'd have to set
reg_size_vec4 based on some other parameters that aren't 100% known.
If someone comes along and updates them, they might inadvertantly
break it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>