platform/upstream/mesa.git
9 months agov3d: add v71 hw generation
Alejandro Piñeiro [Tue, 23 May 2023 21:32:37 +0000 (23:32 +0200)]
v3d: add v71 hw generation

Starting point for v71 version inclusion:
 * Adds v71 as one of the versions to be compiled by meson
 * Updates the v3d_X and v3dX macros to include version 71
 * Updates the code enough to get it building when using v71

Any real v71 support will be implemented in follow-up commits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: fix depth clipping when Z scale is too small in V3D 7.x
Iago Toral Quiroga [Tue, 14 Feb 2023 09:09:53 +0000 (10:09 +0100)]
v3dv: fix depth clipping when Z scale is too small in V3D 7.x

When the Z scale is too small guardband clipping may not clip
correctly, so disable it, which is a new option in V3D 7.x.

This fixes this test in V3D 7.x without needing any workarounds:
dEQP-VK.draw.renderpass.inverted_depth_ranges.nodepthclamp_deltazero

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: make v3dv_viewport_compute_xform depend on the V3D version
Iago Toral Quiroga [Wed, 20 Oct 2021 09:22:11 +0000 (11:22 +0200)]
v3dv: make v3dv_viewport_compute_xform depend on the V3D version

For 4.x we have a workaround for too small Z scale values that is not
required for V3D 7.x.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
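
The version split described in the two commits above can be sketched roughly as follows. This is only an illustration: the threshold value, the struct, the function name, and the assumption that the 4.x workaround clamps the magnitude of the Z scale are all hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold; the value used by the driver may differ. */
#define MIN_ABS_Z_SCALE 0.00001f

struct z_xform {
   float scale_z;
   bool disable_guardband; /* only ever set for 7.x in this sketch */
};

static struct z_xform
compute_z_xform(int ver, float min_depth, float max_depth)
{
   assert(ver == 42 || ver == 71);

   struct z_xform x = { .scale_z = max_depth - min_depth,
                        .disable_guardband = false };
   float abs_scale = x.scale_z < 0.0f ? -x.scale_z : x.scale_z;

   if (abs_scale < MIN_ABS_Z_SCALE) {
      if (ver == 42) {
         /* Assumed 4.x workaround: clamp the magnitude, keep the sign. */
         x.scale_z = x.scale_z < 0.0f ? -MIN_ABS_Z_SCALE : MIN_ABS_Z_SCALE;
      } else {
         /* 7.x: keep the scale and turn guardband clipping off instead. */
         x.disable_guardband = true;
      }
   }
   return x;
}
```

With a zero depth range, the 4.x path in this sketch returns a clamped scale, while the 7.x path returns the original scale with guardband clipping disabled.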

9 months agov3dv: add support for TFU jobs in v71
Alejandro Piñeiro [Wed, 17 Nov 2021 10:33:59 +0000 (11:33 +0100)]
v3dv: add support for TFU jobs in v71

This includes updating the simulator.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: handle Z clipping in v71
Iago Toral Quiroga [Fri, 15 Oct 2021 11:06:31 +0000 (13:06 +0200)]
v3dv: handle Z clipping in v71

Fixes the following tests:

dEQP-VK.clipping.clip_volume.*
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_* (except deltazero)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: don't convert floating point border colors in v71
Iago Toral Quiroga [Thu, 7 Oct 2021 10:43:49 +0000 (12:43 +0200)]
v3dv: don't convert floating point border colors in v71

The TMU does this for us now.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: no specific separate_segments flag for V3D 7.1
Alejandro Piñeiro [Fri, 1 Oct 2021 13:18:38 +0000 (15:18 +0200)]
v3dv: no specific separate_segments flag for V3D 7.1

On V3D 7.1 there is no flag in the Shader State Record to specify
whether we are using shared or separate segments. This is done by
setting the VPM input size to 0 (so we need to ensure that the output
size is the maximum needed for input/output).

We were already doing the latter in prog_data_vs, so we just need
to use those values instead of assigning default values.

While we are here, we also add some comments to the compiler part.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: handle RTs with no color targets in v71
Iago Toral Quiroga [Wed, 29 Sep 2021 07:07:28 +0000 (09:07 +0200)]
v3dv: handle RTs with no color targets in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: handle early Z/S clears for v71
Iago Toral Quiroga [Wed, 29 Sep 2021 06:22:59 +0000 (08:22 +0200)]
v3dv: handle early Z/S clears for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update thread end restrictions validation for v71
Iago Toral Quiroga [Tue, 28 Sep 2021 06:59:08 +0000 (08:59 +0200)]
broadcom/compiler: update thread end restrictions validation for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: GFX-1461 does not affect V3D 7.x
Iago Toral Quiroga [Tue, 28 Sep 2021 06:31:04 +0000 (08:31 +0200)]
v3dv: GFX-1461 does not affect V3D 7.x

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: handle render pass global clear for v71
Iago Toral Quiroga [Tue, 28 Sep 2021 06:23:48 +0000 (08:23 +0200)]
v3dv: handle render pass global clear for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: implement noop job for v71
Iago Toral Quiroga [Tue, 28 Sep 2021 06:14:11 +0000 (08:14 +0200)]
v3dv: implement noop job for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: handle new texture state transfer functions in v71
Iago Toral Quiroga [Sun, 24 Oct 2021 23:38:31 +0000 (01:38 +0200)]
v3dv: handle new texture state transfer functions in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: fix up texture shader state for v71
Iago Toral Quiroga [Sun, 24 Oct 2021 23:37:12 +0000 (01:37 +0200)]
v3dv: fix up texture shader state for v71

There are some new fields for YCbCr with pointers for the various
planes in multi-planar formats. These need to match the base address
pointer in the texture state, or the hardware will assume this is a
multi-planar texture.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: setup TLB clear color for meta operations in v71
Iago Toral Quiroga [Wed, 22 Sep 2021 10:04:21 +0000 (12:04 +0200)]
v3dv: setup TLB clear color for meta operations in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: setup render pass color clears for any format bpp in v71
Iago Toral Quiroga [Wed, 22 Sep 2021 10:03:58 +0000 (12:03 +0200)]
v3dv: setup render pass color clears for any format bpp in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/pipeline: handle GL_SHADER_STATE_RECORD changed size on v71
Alejandro Piñeiro [Wed, 28 Jul 2021 11:45:52 +0000 (13:45 +0200)]
v3dv/pipeline: handle GL_SHADER_STATE_RECORD changed size on v71

It is likely that we would need more changes, as this packet changed,
but this is enough to get basic tests running. Any additional support
will be handled with new commits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/pipeline: default vertex attributes values are not needed for v71
Alejandro Piñeiro [Wed, 28 Jul 2021 10:05:26 +0000 (12:05 +0200)]
v3dv/pipeline: default vertex attributes values are not needed for v71

They are not part of the shader state record.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: default vertex attribute values are gen dependant
Alejandro Piñeiro [Wed, 28 Jul 2021 10:01:38 +0000 (12:01 +0200)]
v3dv: default vertex attribute values are gen dependant

Content, structure and size depend on the generation, as does whether
it is needed at all.

So let's move it to the v3dvx files.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/cmd_buffer: just don't fill up early-z fields for CFG_BITS for v71
Alejandro Piñeiro [Tue, 27 Jul 2021 12:02:30 +0000 (14:02 +0200)]
v3dv/cmd_buffer: just don't fill up early-z fields for CFG_BITS for v71

For v71 early_z_enable/early_z_updates_enable is configured with
packet 121.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/uniforms: update VIEWPORT_X/Y_SCALE uniforms for v71
Alejandro Piñeiro [Tue, 14 Sep 2021 08:08:19 +0000 (10:08 +0200)]
v3dv/uniforms: update VIEWPORT_X/Y_SCALE uniforms for v71

As with the CLIPPER_XY_SCALING packet, this needs to be computed in
1/64ths of a pixel instead of 1/256ths of a pixel.

As these are values that we usually get from macros, we manually add
a v42 and a v71 macro, and define a new helper (V3DV_X) to get the
value for the current hw version.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
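
The per-version scaling described in the commit above can be sketched as follows. The macro and function names here are hypothetical stand-ins for the v42/v71 macros and the V3DV_X helper mentioned in the message; only the 1/256 versus 1/64 granularity comes from the commit text.

```c
#include <assert.h>

/* Hypothetical names: sub-pixel units per pixel for each generation. */
#define V42_CLIPPER_XY_GRANULARITY 256.0f
#define V71_CLIPPER_XY_GRANULARITY 64.0f

/* Stand-in for a helper that picks the value for the current hw version. */
static float
clipper_xy_granularity(int ver)
{
   assert(ver == 42 || ver == 71);
   return ver == 71 ? V71_CLIPPER_XY_GRANULARITY
                    : V42_CLIPPER_XY_GRANULARITY;
}

/* Converts a viewport scale in pixels into the units the shader
 * uniform is assumed to expect on the given generation. */
static float
viewport_scale_uniform(int ver, float viewport_scale)
{
   return viewport_scale * clipper_xy_granularity(ver);
}
```

For a viewport scale of 0.5 pixels this gives 128 units on v42 and 32 units on v71.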

9 months agov3dv/cmd_buffer: emit CLIPPER_XY_SCALING for v71
Alejandro Piñeiro [Sun, 19 Sep 2021 21:37:32 +0000 (23:37 +0200)]
v3dv/cmd_buffer: emit CLIPPER_XY_SCALING for v71

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dvx/cmd_buffer: emit CLEAR_RENDER_TARGETS for v71
Alejandro Piñeiro [Mon, 26 Jul 2021 13:08:11 +0000 (15:08 +0200)]
v3dvx/cmd_buffer: emit CLEAR_RENDER_TARGETS for v71

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/cmd_buffer: emit TILE_RENDERING_MODE_CFG_RENDER_TARGET_PART1 for v71
Alejandro Piñeiro [Thu, 22 Jul 2021 12:26:13 +0000 (14:26 +0200)]
v3dv/cmd_buffer: emit TILE_RENDERING_MODE_CFG_RENDER_TARGET_PART1 for v71

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: emit TILE_BINNING_MODE_CFG and TILE_RENDERING_MODE_CFG_COMMON for v71
Alejandro Piñeiro [Tue, 20 Jul 2021 12:00:44 +0000 (14:00 +0200)]
v3dv: emit TILE_BINNING_MODE_CFG and TILE_RENDERING_MODE_CFG_COMMON for v71

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/device: handle new rpi5 device (bcm2712)
Iago Toral Quiroga [Wed, 10 Nov 2021 06:54:35 +0000 (07:54 +0100)]
v3dv/device: handle new rpi5 device (bcm2712)

This includes both master and primary devices.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv: expose V3D revision number in device name
Iago Toral Quiroga [Wed, 10 Nov 2021 09:06:50 +0000 (10:06 +0100)]
v3dv: expose V3D revision number in device name

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agov3dv/meson: add v71 hw generation
Alejandro Piñeiro [Tue, 29 Jun 2021 09:59:53 +0000 (11:59 +0200)]
v3dv/meson: add v71 hw generation

Starting point for v71 version inclusion.

This just adds it as one of the versions to be compiled (on meson),
updates the v3dX/v3dv_X macros, and updates the code enough to get it
compiling when building with the two versions. For any packet not
available on v71 we just provide a generic placeholder that asserts
that the generation is not supported.

Any real v71 support will be implemented in follow-up commits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: only assign rf0 as last resort in V3D 7.x
Iago Toral Quiroga [Mon, 15 May 2023 08:02:10 +0000 (10:02 +0200)]
broadcom/compiler: only assign rf0 as last resort in V3D 7.x

So we can use it for ldunif(a) and avoid generating ldunif(a)rf which
can't be paired with conditional instructions.

shader-db (pi5):

total instructions in shared programs: 11357802 -> 11338883 (-0.17%)
instructions in affected programs: 7117889 -> 7098970 (-0.27%)
helped: 24264
HURT: 17574
Instructions are helped.

total uniforms in shared programs: 3857808 -> 3857815 (<.01%)
uniforms in affected programs: 92 -> 99 (7.61%)
helped: 0
HURT: 1

total max-temps in shared programs: 2230904 -> 2230199 (-0.03%)
max-temps in affected programs: 52309 -> 51604 (-1.35%)
helped: 1219
HURT: 725
Max-temps are helped.

total sfu-stalls in shared programs: 15021 -> 15236 (1.43%)
sfu-stalls in affected programs: 6848 -> 7063 (3.14%)
helped: 1866
HURT: 1704
Inconclusive result

total inst-and-stalls in shared programs: 11372823 -> 11354119 (-0.16%)
inst-and-stalls in affected programs: 7149177 -> 7130473 (-0.26%)
helped: 24315
HURT: 17561
Inst-and-stalls are helped.

total nops in shared programs: 273624 -> 273711 (0.03%)
nops in affected programs: 31562 -> 31649 (0.28%)
helped: 1619
HURT: 1854
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: don't assign registers to unused nodes/temps
Iago Toral Quiroga [Tue, 2 May 2023 08:17:47 +0000 (10:17 +0200)]
broadcom/compiler: don't assign registers to unused nodes/temps

In programs with a lot of unused temps, if we don't do this, we may
end up recycling previously used rfs more often, which can be
detrimental to instruction pairing.

total instructions in shared programs: 11464335 -> 11444136 (-0.18%)
instructions in affected programs: 8976743 -> 8956544 (-0.23%)
helped: 33196
HURT: 33778
Inconclusive result

total max-temps in shared programs: 2230150 -> 2229445 (-0.03%)
max-temps in affected programs: 86413 -> 85708 (-0.82%)
helped: 2217
HURT: 1523
Max-temps are helped.

total sfu-stalls in shared programs: 18077 -> 17104 (-5.38%)
sfu-stalls in affected programs: 8669 -> 7696 (-11.22%)
helped: 2657
HURT: 2182
Sfu-stalls are helped.

total inst-and-stalls in shared programs: 11482412 -> 11461240 (-0.18%)
inst-and-stalls in affected programs: 8995697 -> 8974525 (-0.24%)
helped: 33319
HURT: 33708
Inconclusive result

total nops in shared programs: 298140 -> 296185 (-0.66%)
nops in affected programs: 52805 -> 50850 (-3.70%)
helped: 3797
HURT: 2662
Inconclusive result

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: improve allocation for final program instructions
Iago Toral Quiroga [Tue, 2 May 2023 08:12:37 +0000 (10:12 +0200)]
broadcom/compiler: improve allocation for final program instructions

The last 3 instructions can't use specific registers so flag all the
nodes for temps used in the last program instructions and try to
avoid assigning any of these. This may help us avoid injecting nops
for the last thread switch instruction.

Because register allocation needs to happen before QPU scheduling
and instruction merging we can't tell exactly what the last 3
instructions will be, so we do this for a few more instructions than
just 3.

We only do this for fragment shaders because other shader stages
always end with VPM store instructions that take a small immediate
and therefore will never allow us to merge the final thread switch
earlier, so limiting allocation for these shaders will never improve
anything and might instead be detrimental.

total instructions in shared programs: 11471389 -> 11464335 (-0.06%)
instructions in affected programs: 582908 -> 575854 (-1.21%)
helped: 4669
HURT: 578
Instructions are helped.

total max-temps in shared programs: 2230497 -> 2230150 (-0.02%)
max-temps in affected programs: 5662 -> 5315 (-6.13%)
helped: 344
HURT: 44
Max-temps are helped.

total sfu-stalls in shared programs: 18068 -> 18077 (0.05%)
sfu-stalls in affected programs: 264 -> 273 (3.41%)
helped: 37
HURT: 48
Inconclusive result (value mean confidence interval includes 0).

total inst-and-stalls in shared programs: 11489457 -> 11482412 (-0.06%)
inst-and-stalls in affected programs: 585180 -> 578135 (-1.20%)
helped: 4659
HURT: 588
Inst-and-stalls are helped.

total nops in shared programs: 301738 -> 298140 (-1.19%)
nops in affected programs: 14680 -> 11082 (-24.51%)
helped: 3252
HURT: 108
Nops are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: don't allocate spill base to rf0 in V3D 7.x
Iago Toral Quiroga [Tue, 18 Apr 2023 06:50:13 +0000 (08:50 +0200)]
broadcom/compiler: don't allocate spill base to rf0 in V3D 7.x

Otherwise it can be stomped by instructions doing implicit rf0 writes.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: new packing/conversion v71 instructions
Alejandro Piñeiro [Fri, 26 Nov 2021 00:24:12 +0000 (01:24 +0100)]
broadcom/qpu: new packing/conversion v71 instructions

This commit adds the qpu definitions for several new v71
instructions.

Packing:
  * vpack: packs 2x32-bit integers into 2x16-bit integers
  * v8pack: packs 2 x 2x16-bit integers into 4x8 bits
  * v10pack: packs parts of 2 x 2x16-bit integers into r10g10b10a2
  * v11fpack: packs parts of 2 x 2x16-bit floats into r11g11b10,
    rounding to nearest

Conversion to unorm/snorm:
  * vftounorm8/vftosnorm8: converts 2x16-bit floating point to
    2x8-bit unorm/snorm
  * ftounorm16/ftosnorm16: converts floating point to 16-bit
    unorm/snorm
  * vftounorm10lo: converts 2x16-bit floating point to 2x10-bit unorm
  * vftounorm10hi: converts 2x16-bit floating point to one 2-bit and
    one 10-bit unorm

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: fix up copy propagation for v71
Iago Toral Quiroga [Tue, 9 Nov 2021 10:34:59 +0000 (11:34 +0100)]
broadcom/compiler: fix up copy propagation for v71

Update rules for unsafe copy propagations to match v7.x.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: lift restriction on vpmwt in last instruction for V3D 7.x
Iago Toral Quiroga [Mon, 29 Nov 2021 12:23:11 +0000 (13:23 +0100)]
broadcom/compiler: lift restriction on vpmwt in last instruction for V3D 7.x

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: validate restrictions after TLB Z write
Iago Toral Quiroga [Thu, 25 Nov 2021 12:00:34 +0000 (13:00 +0100)]
broadcom/compiler: validate restrictions after TLB Z write

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: start allocating from RF 4 in V7.x
Iago Toral Quiroga [Fri, 26 Nov 2021 09:37:05 +0000 (10:37 +0100)]
broadcom/compiler: start allocating from RF 4 in V7.x

In V3D 4.x we start at RF3 so that we allocate RF0-2 only if there
aren't any other RFs available. This is useful with small shaders to
ensure that our TLB writes don't use these registers because these are
the last instructions we emit in fragment shaders and the last
instructions in a program can't write to these registers, so if we do,
we need to emit NOPs.

In V3D 7.x the registers affected by this restriction are RF2-3, so we
choose to start at RF4.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
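
The allocation order described in the commit above can be sketched as a wrapping search that starts at a version-dependent register. The function names are hypothetical, and the register file size of 64 is an assumption of this sketch.

```c
#include <assert.h>

#define NUM_RF 64 /* assumed size of the register file */

/* First register to try: RF3 on 4.x, RF4 on 7.x, per the commit text. */
static unsigned
first_rf_for_allocation(int ver)
{
   assert(ver == 42 || ver == 71);
   return ver == 71 ? 4 : 3;
}

/* n-th candidate in allocation order. The search wraps around, so the
 * registers below the starting point are only tried last. */
static unsigned
nth_rf_candidate(int ver, unsigned n)
{
   assert(n < NUM_RF);
   return (first_rf_for_allocation(ver) + n) % NUM_RF;
}
```

With this ordering a small shader never reaches the low registers, so its final TLB writes are unlikely to land on the registers that the last instructions cannot write.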

9 months agobroadcom/compiler: lift restriction for branch + msfign after setmsf for v7.x
Iago Toral Quiroga [Thu, 25 Nov 2021 07:31:02 +0000 (08:31 +0100)]
broadcom/compiler: lift restriction for branch + msfign after setmsf for v7.x

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update ldvary thread switch delay slot restriction for v7.x
Iago Toral Quiroga [Tue, 23 Nov 2021 09:04:49 +0000 (10:04 +0100)]
broadcom/compiler: update ldvary thread switch delay slot restriction for v7.x

In V3D 7.x we don't have accumulators which would not survive a thread
switch, so the only restriction is that ldvary can't be placed in the
second delay slot of a thread switch.

shader-db results for UnrealEngine4 shaders:

total instructions in shared programs: 446458 -> 446401 (-0.01%)
instructions in affected programs: 13492 -> 13435 (-0.42%)
helped: 58
HURT: 3
Instructions are helped.

total nops in shared programs: 19571 -> 19541 (-0.15%)
nops in affected programs: 161 -> 131 (-18.63%)
helped: 30
HURT: 0
Nops are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update thread end restrictions for v7.x
Iago Toral Quiroga [Mon, 22 Nov 2021 11:56:03 +0000 (12:56 +0100)]
broadcom/compiler: update thread end restrictions for v7.x

In 4.x it is not allowed to write to the register file in the last 3
instructions, but in 7.x we only have this restriction in the thread
end instruction itself, and only if the write comes from the ALU
ports.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
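
A minimal sketch of the rule as worded in the commit above. The enum, the function name and the slot numbering (0 for the thread end instruction, 1 and 2 for the two instructions after it) are assumptions for illustration. Restrictions on specific registers mentioned in other commits are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum rf_write_src {
   RF_WRITE_NONE,   /* instruction does not write the register file */
   RF_WRITE_ALU,    /* write comes from the ADD or MUL ALU */
   RF_WRITE_SIGNAL, /* write comes from a signal, not from the ALU ports */
};

/* slot 0 is the thread end instruction; 1 and 2 follow it. */
static bool
rf_write_ok_at_thread_end(int ver, unsigned slot, enum rf_write_src src)
{
   assert(ver == 42 || ver == 71);
   assert(slot < 3);

   if (src == RF_WRITE_NONE)
      return true;

   /* 4.x: no register file writes in any of the last 3 instructions. */
   if (ver == 42)
      return false;

   /* 7.x: only ALU writes in the thread end instruction are rejected. */
   return !(slot == 0 && src == RF_WRITE_ALU);
}
```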

9 months agobroadcom/compiler: implement small immediates for v71
Iago Toral Quiroga [Wed, 3 Nov 2021 09:34:19 +0000 (10:34 +0100)]
broadcom/compiler: implement small immediates for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: convert mul to add when needed to allow merge
Iago Toral Quiroga [Mon, 25 Oct 2021 07:38:57 +0000 (09:38 +0200)]
broadcom/compiler: convert mul to add when needed to allow merge

V3D 7.x added 'mov' opcodes to the ADD alu, so now it is possible to
move these to the ADD alu to facilitate merging them with other MUL
instructions.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: don't assign rf0 to temps that conflict with ldvary
Iago Toral Quiroga [Fri, 29 Oct 2021 11:00:56 +0000 (13:00 +0200)]
broadcom/compiler: don't assign rf0 to temps that conflict with ldvary

ldvary writes to rf0 implicitly, so we don't want to allocate rf0 to
any temps that are live across ldvary's rf0 live ranges.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: try to use ldunif(a) instead of ldunif(a)rf in v71
Iago Toral Quiroga [Thu, 28 Oct 2021 12:13:29 +0000 (14:13 +0200)]
broadcom/compiler: try to use ldunif(a) instead of ldunif(a)rf in v71

The rf variants need to encode the destination in the cond bits, which
prevents them from being merged with any other instruction that needs
those bits.

In 4.x, ldunif(a) writes to r5, which is a special register that only
ldunif(a) and ldvary can write, so we have a special register class for
it and only allow it for them. Then, when we need to choose a register
for a node, we always use this register if it is available.

In 7.x these instructions write to rf0, which can be used by any
instruction, so instead of restricting rf0, we track the temps that
are used as ldunif(a) destinations and use that information to favor
rf0 for them.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
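
The 7.x selection policy described in the commit above can be sketched as follows. The function name and the bitmask representation are hypothetical. Treating rf0 as a last resort for other temps follows the "only assign rf0 as last resort" commit listed earlier in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* available: bit i is set when rf<i> is free for this temp.
 * Returns the chosen register index, or -1 when nothing is free. */
static int
pick_rf(uint64_t available, unsigned num_rf, bool is_ldunif_dst)
{
   assert(num_rf >= 1 && num_rf <= 64);

   /* Temps written by ldunif(a) favor rf0, so the plain ldunif(a)
    * form can be used instead of the rf variant. */
   if (is_ldunif_dst && (available & 1))
      return 0;

   /* Everything else prefers any register other than rf0. */
   for (unsigned i = 1; i < num_rf; i++) {
      if (available & (UINT64_C(1) << i))
         return (int)i;
   }

   /* rf0 only as a last resort. */
   return (available & 1) ? 0 : -1;
}
```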

9 months agobroadcom/compiler: enable ldvary pipelining on v71
Iago Toral Quiroga [Wed, 27 Oct 2021 09:35:12 +0000 (11:35 +0200)]
broadcom/compiler: enable ldvary pipelining on v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: handle rf0 flops storage restriction in v71
Iago Toral Quiroga [Wed, 6 Oct 2021 11:58:27 +0000 (13:58 +0200)]
broadcom/compiler: handle rf0 flops storage restriction in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add packing for fmov on ADD alu
Iago Toral Quiroga [Tue, 26 Oct 2021 06:37:54 +0000 (08:37 +0200)]
broadcom/qpu: add packing for fmov on ADD alu

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update peripheral access restrictions for v71
Iago Toral Quiroga [Tue, 26 Oct 2021 09:43:02 +0000 (11:43 +0200)]
broadcom/compiler: update peripheral access restrictions for v71

In V3D 4.x only a couple of simultaneous accesses were allowed, but
V3D 7.x is a bit more flexible, so rather than trying to check for all
the allowed combinations it is easier to check whether we hit one of
the disallowed ones.

Shader-db (pi5):

total instructions in shared programs: 11338883 -> 11307386 (-0.28%)
instructions in affected programs: 2727201 -> 2695704 (-1.15%)
helped: 12555
HURT: 289
Instructions are helped.

total max-temps in shared programs: 2230199 -> 2229260 (-0.04%)
max-temps in affected programs: 20508 -> 19569 (-4.58%)
helped: 608
HURT: 4
Max-temps are helped.

total sfu-stalls in shared programs: 15236 -> 15293 (0.37%)
sfu-stalls in affected programs: 148 -> 205 (38.51%)
helped: 38
HURT: 64
Inconclusive result (%-change mean confidence interval includes 0).

total inst-and-stalls in shared programs: 11354119 -> 11322679 (-0.28%)
inst-and-stalls in affected programs: 2732262 -> 2700822 (-1.15%)
helped: 12550
HURT: 304
Inst-and-stalls are helped.

total nops in shared programs: 273711 -> 274095 (0.14%)
nops in affected programs: 9626 -> 10010 (3.99%)
helped: 186
HURT: 397
Nops are HURT.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update payload registers handling when computing live intervals
Alejandro Piñeiro [Tue, 19 Oct 2021 21:52:30 +0000 (23:52 +0200)]
broadcom/compiler: update payload registers handling when computing live intervals

For v71 the payload registers are not the same. Specifically, rf3 is
now used as a payload register, so this is needed to prevent the
register allocator from selecting rf3 as an instruction dst and
overwriting a payload value that could still be used.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update ldunif/ldvary comment for v71
Alejandro Piñeiro [Tue, 19 Oct 2021 09:51:32 +0000 (11:51 +0200)]
broadcom/compiler: update ldunif/ldvary comment for v71

For v42 and below, ldunif and ldvary both write to r5, but with a
different delay, so we need to take that into account when scheduling
both.

For v71 the register used is rf0, but the behaviour is the same. So
the scheduling code can stay the same, but the comment needs updating.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update one TMUWT restriction for v71
Alejandro Piñeiro [Tue, 19 Oct 2021 09:16:43 +0000 (11:16 +0200)]
broadcom/compiler: update one TMUWT restriction for v71

The restriction that TMUWT is not allowed in the final instruction
doesn't apply to v71.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: v71 isn't affected by double-rounding of viewport X,Y coords
Iago Toral Quiroga [Thu, 14 Oct 2021 12:16:40 +0000 (14:16 +0200)]
broadcom/compiler: v71 isn't affected by double-rounding of viewport X,Y coords

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: generalize check for shaders using pixel center W
Iago Toral Quiroga [Fri, 8 Oct 2021 13:10:24 +0000 (15:10 +0200)]
broadcom/compiler: generalize check for shaders using pixel center W

V3D 4.x has pixel center W in rf0 and V3D 7.x has it in rf3. We already
account for this when we set up c->payload_w, so use that.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: fail packing on unhandled mul pack/unpack
Iago Toral Quiroga [Wed, 6 Oct 2021 10:01:10 +0000 (12:01 +0200)]
broadcom/qpu: fail packing on unhandled mul pack/unpack

We are doing this for the ADD alu already and it may be helpful to
identify cases where we have QPU code with pack/unpack modifiers on
MUL opcodes that we then are not packing into the actual QPU
instructions.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add MOV integer packing/unpacking variants
Iago Toral Quiroga [Wed, 6 Oct 2021 07:27:43 +0000 (09:27 +0200)]
broadcom/qpu: add MOV integer packing/unpacking variants

These are new in v71 and cover MOV on both the ADD and the MUL alus.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: allow instruction merges in v71
Iago Toral Quiroga [Thu, 30 Sep 2021 11:22:48 +0000 (13:22 +0200)]
broadcom/compiler: allow instruction merges in v71

In v3d 4.x there were restrictions based on the number of raddrs used
by the combined instructions, but we don't have these restrictions in
v3d 7.x.

It should be noted that while there are no restrictions on the number
of raddrs addressed, a QPU instruction can only address a single small
immediate, so we should be careful about that when we add support for
small immediates.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: don't schedule rf0 writes right after ldvary
Iago Toral Quiroga [Fri, 22 Oct 2021 11:39:48 +0000 (13:39 +0200)]
broadcom/compiler: don't schedule rf0 writes right after ldvary

ldvary writes rf0 implicitly on the next cycle so they would clash.
This case is not handled correctly by our normal dependency tracking,
which doesn't know anything about delayed writes from instructions
and thinks the rf0 write happens on the same cycle ldvary is emitted.

Fixes (v71):
dEQP-VK.glsl.conversions.matrix_to_matrix.mat2x3_to_mat4x2_fragment

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
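
The extra scheduling check described in the commit above can be sketched as a test against the previously scheduled instruction. The struct and function names are hypothetical; the sketch only models the v71 rule that an rf0 write must not directly follow an ldvary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a scheduled QPU instruction. */
struct sched_inst {
   bool sig_ldvary; /* carries the ldvary signal */
   bool writes_rf0; /* explicitly writes rf0 */
};

/* prev may be NULL when nothing has been scheduled yet. */
static bool
can_schedule_after(const struct sched_inst *prev,
                   const struct sched_inst *next)
{
   assert(next != NULL);

   /* ldvary writes rf0 one cycle late, so an rf0 write in the very
    * next instruction would clash with it. */
   if (prev != NULL && prev->sig_ldvary && next->writes_rf0)
      return false;

   return true;
}
```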

9 months agobroadcom/compiler: CS payload registers have changed in v71
Iago Toral Quiroga [Tue, 28 Sep 2021 11:37:28 +0000 (13:37 +0200)]
broadcom/compiler: CS payload registers have changed in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: don't assign rf0 to temps across implicit rf0 writes
Iago Toral Quiroga [Wed, 29 Sep 2021 10:14:04 +0000 (12:14 +0200)]
broadcom/compiler: don't assign rf0 to temps across implicit rf0 writes

In platforms that don't have accumulators and have implicit writes to
the register file we need to be careful and avoid assigning a physical
register to a temp that lives across an implicit write to that same
physical register.

For now, we have the case of implicit writes to rf0 from various
signals, but it should be easy to extend this to include additional
registers if needed.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
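
As a rough illustration of the rule above, a register allocator could refuse rf0 for any temp whose live range contains an implicit rf0 write. The names below are hypothetical, and treating only writes strictly inside the live range as conflicts is an assumption of this sketch, not necessarily what the Mesa code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical live range of a temp, in instruction indices. */
struct sketch_live_range {
        int start;  /* instruction that defines the temp */
        int end;    /* last instruction that reads the temp */
};

/* Returns true if the temp can be assigned rf0, given the instruction
 * indices at which rf0 is written implicitly (for example by signals).
 */
static bool
sketch_temp_can_use_rf0(struct sketch_live_range lr,
                        const int *implicit_rf0_write_ips, size_t count)
{
        assert(lr.start <= lr.end);

        for (size_t i = 0; i < count; i++) {
                int ip = implicit_rf0_write_ips[i];

                /* Assumption: only writes strictly inside the live
                 * range clobber the value held by the temp.
                 */
                if (ip > lr.start && ip < lr.end)
                        return false;
        }

        return true;
}
```

Extending this to other registers would mostly mean keeping one list of implicit-write positions per physical register.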

9 months agobroadcom/compiler: only handle accumulator classes if present
Iago Toral Quiroga [Wed, 29 Sep 2021 10:10:31 +0000 (12:10 +0200)]
broadcom/compiler: only handle accumulator classes if present

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: rename vir_writes_rX to vir_writes_rX_implicitly
Iago Toral Quiroga [Wed, 29 Sep 2021 10:03:50 +0000 (12:03 +0200)]
broadcom/compiler: rename vir_writes_rX to vir_writes_rX_implicitly

Since that more accurately represents what they check.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: make vir_write_rX return false on platforms without accums
Iago Toral Quiroga [Wed, 29 Sep 2021 09:54:18 +0000 (11:54 +0200)]
broadcom/compiler: make vir_write_rX return false on platforms without accums

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: implement switch rules for fmin/fmax fadd/faddnf for v71
Alejandro Piñeiro [Mon, 27 Sep 2021 23:17:08 +0000 (01:17 +0200)]
broadcom/qpu: implement switch rules for fmin/fmax fadd/faddnf for v71

They use the same opcodes, and switch between one and the other based
on raddr.

Note that the rule also depends on whether small_imm_a/b are used. That
is still not in place, so that part is hardcoded. It will be updated
later when small immediates support for v71 gets implemented.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: fix packing/unpacking of fmov variants for v71
Iago Toral Quiroga [Mon, 4 Oct 2021 11:07:35 +0000 (13:07 +0200)]
broadcom/qpu: fix packing/unpacking of fmov variants for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add new ADD opcodes for FMOV/MOV in v71
Iago Toral Quiroga [Mon, 27 Sep 2021 11:26:04 +0000 (13:26 +0200)]
broadcom/qpu: add new ADD opcodes for FMOV/MOV in v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: prevent rf2-3 usage in thread end delay slots for v71
Iago Toral Quiroga [Mon, 27 Sep 2021 09:49:24 +0000 (11:49 +0200)]
broadcom/compiler: prevent rf2-3 usage in thread end delay slots for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: add a v3d71_qpu_writes_waddr_explicitly helper
Iago Toral Quiroga [Wed, 6 Oct 2021 11:58:00 +0000 (13:58 +0200)]
broadcom/compiler: add a v3d71_qpu_writes_waddr_explicitly helper

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: implement read stall check for v71
Iago Toral Quiroga [Thu, 23 Sep 2021 09:44:59 +0000 (11:44 +0200)]
broadcom/compiler: implement read stall check for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: implement "reads/writes too soon" checks for v71
Iago Toral Quiroga [Thu, 23 Sep 2021 09:19:58 +0000 (11:19 +0200)]
broadcom/compiler: implement "reads/writes too soon" checks for v71

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update register classes to not include accumulators on v71
Alejandro Piñeiro [Wed, 15 Sep 2021 22:49:25 +0000 (00:49 +0200)]
broadcom/compiler: update register classes to not include accumulators on v71

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu_schedule: update write deps for v71
Alejandro Piñeiro [Wed, 15 Sep 2021 09:12:59 +0000 (11:12 +0200)]
broadcom/qpu_schedule: update write deps for v71

We just need to add a write dep if rf0 is written implicitly.

Note that we don't need to check if we have accumulators when checking
for r3/r4/r5, as v3d_qpu_writes_rX returns false for hw versions
that don't have accumulators.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: payload_w is loaded on rf3 for v71
Alejandro Piñeiro [Tue, 14 Sep 2021 23:14:15 +0000 (01:14 +0200)]
broadcom/compiler: payload_w is loaded on rf3 for v71

And in general rf0 is now used for other needs.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: add support for varyings on nir to vir generation for v71
Alejandro Piñeiro [Tue, 14 Sep 2021 08:42:55 +0000 (10:42 +0200)]
broadcom/compiler: add support for varyings on nir to vir generation for v71

This needs an update as v71 doesn't have accumulators anymore, and
ldvary now uses rf0 to return the value.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: return false on qpu_writes_accumulatorXX helpers for v71
Alejandro Piñeiro [Wed, 15 Sep 2021 08:55:49 +0000 (10:55 +0200)]
broadcom/qpu: return false on qpu_writes_accumulatorXX helpers for v71

As v71 doesn't have accumulators (devinfo->has_accumulators is set to
false), those methods always return false.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: update disasm_raddr for v71
Alejandro Piñeiro [Thu, 9 Sep 2021 23:20:44 +0000 (01:20 +0200)]
broadcom/qpu: update disasm_raddr for v71

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu_schedule: add process_raddr_deps
Alejandro Piñeiro [Thu, 9 Sep 2021 21:59:28 +0000 (23:59 +0200)]
broadcom/qpu_schedule: add process_raddr_deps

On v71 we don't have muxes, but we have more raddrs. Add an equivalent
add-deps function.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update vir_to_qpu::set_src for v71
Alejandro Piñeiro [Wed, 8 Sep 2021 23:18:54 +0000 (01:18 +0200)]
broadcom/compiler: update vir_to_qpu::set_src for v71

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/vir: implement is_no_op_mov for v71
Alejandro Piñeiro [Wed, 8 Sep 2021 22:28:53 +0000 (00:28 +0200)]
broadcom/vir: implement is_no_op_mov for v71

Did some refactoring/splitting.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: don't favor/select accum registers for hw not supporting it
Alejandro Piñeiro [Thu, 16 Sep 2021 23:07:06 +0000 (01:07 +0200)]
broadcom/compiler: don't favor/select accum registers for hw not supporting it

Note that what we do is just return false from the favor/select accum
methods. We could just avoid calling them, but as the select is called
more than once, it is just easier this way.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: phys index depends on hw version
Alejandro Piñeiro [Mon, 23 Aug 2021 00:18:43 +0000 (02:18 +0200)]
broadcom/compiler: phys index depends on hw version

For 7.1 there are no accumulators, so we replace the macro with a
function call.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: update node/temp translation for v71
Iago Toral Quiroga [Sat, 28 Jan 2023 23:27:11 +0000 (00:27 +0100)]
broadcom/compiler: update node/temp translation for v71

As the offset applied needs to take into account if we have
accumulators or not.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add pack/unpack support for v71
Alejandro Piñeiro [Sat, 7 Aug 2021 00:20:39 +0000 (02:20 +0200)]
broadcom/qpu: add pack/unpack support for v71

Note that we provide new v71 alu pack/unpack methods. As a lot of the
logic is equivalent, we initially tried to use the existing methods as
a template and add version checks to them. At some early point that
became really unreadable, so it was better to just provide new
methods, even if the v42 and v71 methods have a really similar
structure.

Note that we have split the op tables and created two (add/mul) for
v71. As the description struct includes versioning info, we could
have just used one table. But, especially with the add table, there
are a lot of differences in v71, so it is slightly tidier this
way. Also, taking into account that we do a linear search on the
tables, this can even be justified by performance.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
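
The table split described above can be sketched as follows. Everything here is hypothetical: the struct layout, the table names, and the opcode ranges are placeholders made up for illustration and do not reflect the real V3D encodings or Mesa's actual opcode_desc tables. The point is only that each generation gets a shorter table, which the linear search walks faster than one combined table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_op { SKETCH_OP_A, SKETCH_OP_B, SKETCH_OP_C };

/* Hypothetical descriptor: an opcode range plus versioning info. */
struct sketch_opcode_desc {
        uint8_t opcode_first;
        uint8_t opcode_last;
        uint8_t first_ver;  /* 0 means no lower bound */
        uint8_t last_ver;   /* 0 means no upper bound */
        enum sketch_op op;
};

/* Placeholder entries, not real encodings. */
static const struct sketch_opcode_desc sketch_add_ops_v4x[] = {
        { 0, 3, 0, 0, SKETCH_OP_A },
        { 4, 4, 0, 0, SKETCH_OP_B },
};

static const struct sketch_opcode_desc sketch_add_ops_v71[] = {
        { 0, 3, 71, 0, SKETCH_OP_A },
        { 4, 4, 71, 0, SKETCH_OP_C },
};

/* Linear search: first entry matching both opcode and version wins. */
static const struct sketch_opcode_desc *
sketch_lookup(const struct sketch_opcode_desc *table, size_t count,
              uint8_t opcode, uint8_t ver)
{
        assert(table != NULL || count == 0);

        for (size_t i = 0; i < count; i++) {
                const struct sketch_opcode_desc *desc = &table[i];

                if (opcode < desc->opcode_first ||
                    opcode > desc->opcode_last)
                        continue;
                if (desc->first_ver != 0 && ver < desc->first_ver)
                        continue;
                if (desc->last_ver != 0 && ver > desc->last_ver)
                        continue;

                return desc;
        }

        return NULL;
}

/* Pick the per-generation table first, then search only that table. */
static const struct sketch_opcode_desc *
sketch_lookup_add_op(uint8_t opcode, uint8_t ver)
{
        if (ver >= 71) {
                return sketch_lookup(sketch_add_ops_v71,
                                     sizeof(sketch_add_ops_v71) /
                                     sizeof(sketch_add_ops_v71[0]),
                                     opcode, ver);
        }

        return sketch_lookup(sketch_add_ops_v4x,
                             sizeof(sketch_add_ops_v4x) /
                             sizeof(sketch_add_ops_v4x[0]),
                             opcode, ver);
}
```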

9 months agobroadcom/qpu: add qpu_writes_rf0_implicitly helper
Alejandro Piñeiro [Wed, 15 Sep 2021 08:56:43 +0000 (10:56 +0200)]
broadcom/qpu: add qpu_writes_rf0_implicitly helper

On v71 rf0 replaces r5 as the register that gets updated implicitly
with uniform loads, and gets the C coefficient with ldvary. This
helper returns whether rf0 gets implicitly updated.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/common: add has_accumulators field on v3d_device_info
Alejandro Piñeiro [Thu, 16 Sep 2021 23:04:31 +0000 (01:04 +0200)]
broadcom/common: add has_accumulators field on v3d_device_info

Even if we could just check for the version in the code, checking for
this field makes several places more readable. For example, in the
register allocation code we don't assign an accumulator because we
don't have accumulators on that hw, rather than because the hw version
is a given one.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: defining shift/mask for raddr_c/d
Alejandro Piñeiro [Thu, 12 Aug 2021 00:24:02 +0000 (02:24 +0200)]
broadcom/qpu: defining shift/mask for raddr_c/d

On V3D 7.x they replace mul_a/b and add_a/b.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add raddr on v3d_qpu_input
Alejandro Piñeiro [Thu, 5 Aug 2021 23:33:32 +0000 (01:33 +0200)]
broadcom/qpu: add raddr on v3d_qpu_input

On V3D 7.x muxes are not used, and raddr_a/b/c/d are used instead.

This is not perfect, as for v71 the raddr_a/b defined in qpu_instr
become superfluous. But the alternative would be to define two
different structs, or even to have them defined based on version
ifdefs, so this is a reasonable compromise.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: define v3d_qpu_input, use on v3d_qpu_alu_instr
Alejandro Piñeiro [Thu, 5 Aug 2021 23:22:31 +0000 (01:22 +0200)]
broadcom/qpu: define v3d_qpu_input, use on v3d_qpu_alu_instr

At this point it just tidies up the alu_instr structure a little.

But it also serves to prepare the structure for new changes, as 7.x
uses raddr instead of mux, and it is just easier to add the raddr to
the new input structure.

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add v71 signal map
Alejandro Piñeiro [Tue, 3 Aug 2021 23:11:16 +0000 (01:11 +0200)]
broadcom/qpu: add v71 signal map

Compared with v41, the differences are:
   * 14, 15, 29 and 30 are now about immediate a, b, c, d respectively
   * 23 is now reserved. On v42 this was for rotate signals, which are
     gone on v71.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: add small_imm a/c/d on v3d_qpu_sig
Alejandro Piñeiro [Wed, 4 Aug 2021 22:50:12 +0000 (00:50 +0200)]
broadcom/compiler: add small_imm a/c/d on v3d_qpu_sig

small_imm_a, small_imm_c and small_imm_d are added on top of the already
existing small_imm_b, as V3D 7.1 defines 4 small immediates, tied to
the 4 raddrs. Note that this is only the definition, plus an
instruction validation rule to check that they are not used before
v71. Any real use is still pending.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/compiler: rename small_imm to small_imm_b
Alejandro Piñeiro [Sun, 19 Sep 2021 01:20:18 +0000 (03:20 +0200)]
broadcom/compiler: rename small_imm to small_imm_b

Current small_imm is associated with the "B" read address.

We do this change in advance for v71 support, where we will have 4
different small_imm (a/b/c/d), so we start with a renaming.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: set V3D 7.x names for some waddr aliasing
Alejandro Piñeiro [Wed, 4 Aug 2021 23:00:47 +0000 (01:00 +0200)]
broadcom/qpu: set V3D 7.x names for some waddr aliasing

V3D 7.x got rid of the accumulators, but still uses the values for
WADDR_R5 and WADDR_R5REP, so let's return a proper name and add some
aliases.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/qpu: add comments on waddr not used on V3D 7.x
Alejandro Piñeiro [Wed, 4 Aug 2021 23:03:11 +0000 (01:03 +0200)]
broadcom/qpu: add comments on waddr not used on V3D 7.x

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/common: add some common v71 helpers
Alejandro Piñeiro [Wed, 17 Nov 2021 13:40:47 +0000 (14:40 +0100)]
broadcom/common: add some common v71 helpers

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/common: retrieve V3D revision number
Iago Toral Quiroga [Tue, 9 Nov 2021 07:50:51 +0000 (08:50 +0100)]
broadcom/common: retrieve V3D revision number

The subrev field from the hub ident3 register is bumped with every
hardware revision doing backwards incompatible changes so we want to
keep track of this.

Instead of modifying the 'ver' field info to accommodate subrev info,
which would require a lot of changes, simply add a new 'rev' field in
devinfo that we can use when we need to make changes based on the
revision number of a hardware release.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom/cle: update the packet definitions for new generation v71
Alejandro Piñeiro [Tue, 29 Jun 2021 10:03:24 +0000 (12:03 +0200)]
broadcom/cle: update the packet definitions for new generation v71

Using as reference the spec for 7.1.5. This includes totally new
packets, and redefines some that already existed on v42.

Full list:
 * Add Depth Bounds Test Limits
 * Redefine Tile Binning Mode Cfg
 * Redefine Cfg Bits. There are some changes on the fields:
   * Line Rasterization is now 1 bit size
   * Depth Bounds Enable (that takes one of the bits of Line Rasterization)
   * Early-Z/Early-Z updates enable bits (16-17) now appear as reserved.
   * New Z-Clipping mode field
 * Redefine Tile Rendering Mode Cfg (Common). Changes with respect to v42:
   * New log2 tile height/width fields starting at bit 52/55
   * Due to those two new fields, the end pad is smaller
   * sub-id has now a size of 3. Bit 4 is reserved.
   * Number of render targets: this field max value is now 7 (not
     reflected on the xml).
   * Maximum BPP is removed on v71 (now bits 40-41 are reserved)
   * Depth Buffer disable: on bit 44
 * Update Store Tile Buffer General
 * Adding Cfg Render Target Part1/2/3 packets: they replace v4X "Tile
   Rendering Mode Cfg (Color)" (real name "Rendering Configuration
   (Render Targets Config)"), "Tile Rendering Mode Cfg (Clear Colors
   Part1)", "Tile Rendering Mode Cfg (Clear Colors Part2)", and "Tile
   Rendering Mode Cfg (Clear Colors Part3)". On those old versions,
   the first packet is used to configure 4 render targets. Now that 8
   are supported, individual per-render-target packets are used.
 * Update ZS clear values packet.
 * Add new v71 output formats
 * Define Clear Render Targets (Replaces Clear Tile Buffers from v42)
 * Redefine GL Shader State Record. Changes compared with v42:
   * Fields removed:
     * "Coordinate shader has separate input and output VPM blocks"
       (reserved bit now)
     * "Vertex shader has separate input and output VPM blocks"
       (reserved bit now)
     * "Address of table of default attribute Values." (we needed to
       change the start position for all the following fields)
   * New field:
     * "Never defer FEP depth writes to fragment shader auto Z writes
        on scoreboard conflict"
 * Redefine clipper xy scaling: Now it uses 1/64ths of pixels, instead
   of 1/256ths
 * Update texture shader state.
   * Notice we don't use an address type for these fields in the XML
     description. This is because the addresses are 64-byte aligned
     (even though the PRM doesn't say it) which means the 6 LSB bits
     are implicitly 0, but the fields are encoded before the 6th bit
     of their starting byte, so we can't use the usual trick we do
     with address types where the first 6 bits in the byte are
     implicitly overwritten by other fields and we have to encode this
     manually as a uint field. This would mean that if we had an
     actual BO we would also need to add it manually to the job's
     list, but since we don't have one, we don't have to do anything
     about it.
   * Add new RB_Swap field for texture shader state
   * Document Cb/Cr addresses as uint fields in texture shader state
 * Fixup Blend Config description: we now support 8 RTs.
 * TMU config parameter 2 has new fields
 * Add new clipper Z without guardband packet in v71
 * Add enums for the Z clip modes accepted in v71
 * Fix texture state array stride packing for V3D 7.1.5

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
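
The clipper xy scaling change in the list above (1/64ths of a pixel on v71 instead of 1/256ths) amounts to a different multiplier when packing the value. A minimal sketch, with a hypothetical function name and assuming the input is a size in pixels:

```c
#include <assert.h>

/* Converts a size in pixels to the fixed-point units the clipper xy
 * scaling packet expects: 1/64ths of a pixel on v71, 1/256ths before.
 * 'ver' follows the usual convention (42 for V3D 4.2, 71 for V3D 7.1).
 */
static float
sketch_clipper_xy_units(float size_in_pixels, int ver)
{
        assert(ver > 0);

        const float units_per_pixel = ver >= 71 ? 64.0f : 256.0f;

        return size_in_pixels * units_per_pixel;
}
```

For example, 400 pixels becomes 25600 units on v71 and 102400 units on v42.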

9 months agobroadcom/simulator: reset CFG7 for compute dispatch in v71
Iago Toral Quiroga [Tue, 28 Sep 2021 11:16:49 +0000 (13:16 +0200)]
broadcom/simulator: reset CFG7 for compute dispatch in v71

This register is new in 7.x, it doesn't seem that we need to
do anything specific for now, but let's make sure it is reset
every time.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agobroadcom(cle,clif,common,simulator): add 7.1 version on the list of versions to build
Alejandro Piñeiro [Sun, 25 Apr 2021 22:02:21 +0000 (00:02 +0200)]
broadcom(cle,clif,common,simulator): add 7.1 version on the list of versions to build

This adds 7.1 to the list of available V3D_VERSION, and makes the first
changes to the simulator needed to get it working.

Note that we needed to touch all 4 of those codebases because that is
needed if we want to use V3D_DEBUG=clif with the simulator, which is
the easiest way to see which packets a Vulkan program is using.

About the simulator, this commit only handles the rename of some
registers. Any additional changes needed to get proper support for
v71 will be handled in following commits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>

9 months agoblorp: Use the correct miptail start LOD for surfaces
Sagar Ghuge [Thu, 12 Oct 2023 17:53:07 +0000 (10:53 -0700)]
blorp: Use the correct miptail start LOD for surfaces

Use the correct miptail start LOD for the surfaces involved in the
XY_BLOCK_COPY_BLT/XY_FAST_COLOR_BLT instructions.

Thanks to Lionel for pointing out the issue.

Fixes: 46f45d62d1 ("intel/isl: Start using miptails")

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25688>

9 months agorusticl/memory: fix potential use-after-free in clEnqueueSVMFree
LingMan [Fri, 13 Oct 2023 16:51:22 +0000 (18:51 +0200)]
rusticl/memory: fix potential use-after-free in clEnqueueSVMFree

Fixes: bfee3a8563d ("rusticl: add support for fine-grained system SVM")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25719>