Iago Toral Quiroga [Tue, 9 Nov 2021 10:34:59 +0000 (11:34 +0100)]
broadcom/compiler: fix up copy propagation for v71
Update rules for unsafe copy propagations to match v7.x.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Mon, 29 Nov 2021 12:23:11 +0000 (13:23 +0100)]
broadcom/compiler: lift restriction on vpmwt in last instruction for V3D 7.x
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 25 Nov 2021 12:00:34 +0000 (13:00 +0100)]
broadcom/compiler: validate restrictions after TLB Z write
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Fri, 26 Nov 2021 09:37:05 +0000 (10:37 +0100)]
broadcom/compiler: start allocating from RF 4 in V7.x
In V3D 4.x we start at RF3 so that we allocate RF0-2 only if there
aren't any other RFs available. This is useful with small shaders to
ensure that our TLB writes don't use these registers because these are
the last instructions we emit in fragment shaders and the last
instructions in a program can't write to these registers, so if we do,
we need to emit NOPs.
In V3D 7.x the registers affected by this restriction are RF2-3, so we
choose to start at RF4.
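A minimal sketch of the idea described above, assuming versions are encoded as major * 10 + minor (42, 71). The helper names are hypothetical and are not the actual Mesa functions:

```c
#include <stdint.h>

/* Hypothetical helper: first register-file index the allocator tries,
 * so that the registers restricted at thread end are used last.
 * 4.x: RF0-2 are restricted, start at RF3.
 * 7.x: RF2-3 are restricted, start at RF4. */
static inline uint32_t
sketch_first_preferred_rf(uint32_t ver)
{
        return ver >= 71 ? 4 : 3;
}

/* Walk the register file starting at the preferred index and wrap
 * around, so the low registers are only picked as a last resort. */
static inline uint32_t
sketch_nth_rf_candidate(uint32_t ver, uint32_t n, uint32_t num_rfs)
{
        return (sketch_first_preferred_rf(ver) + n) % num_rfs;
}
```

With 64 registers, the 7.x walk visits RF4..RF63 first and only then RF0..RF3.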
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 25 Nov 2021 07:31:02 +0000 (08:31 +0100)]
broadcom/compiler: lift restriction for branch + msfign after setmsf for v7.x
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Tue, 23 Nov 2021 09:04:49 +0000 (10:04 +0100)]
broadcom/compiler: update ldvary thread switch delay slot restriction for v7.x
In V3D 7.x we don't have accumulators which would not survive a thread
switch, so the only restriction is that ldvary can't be placed in the
second delay slot of a thread switch.
shader-db results for UnrealEngine4 shaders:
total instructions in shared programs: 446458 -> 446401 (-0.01%)
instructions in affected programs: 13492 -> 13435 (-0.42%)
helped: 58
HURT: 3
Instructions are helped.
total nops in shared programs: 19571 -> 19541 (-0.15%)
nops in affected programs: 161 -> 131 (-18.63%)
helped: 30
HURT: 0
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Mon, 22 Nov 2021 11:56:03 +0000 (12:56 +0100)]
broadcom/compiler: update thread end restrictions for v7.x
In 4.x it is not allowed to write to the register file in the last 3
instructions, but in 7.x we only have this restriction in the thread
end instruction itself, and only if the write comes from the ALU
ports.
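The rule above could be sketched roughly as follows. The struct and function are hypothetical and only model the restriction described in this message, not any other thread end rule:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one instruction near thread end. */
struct sketch_tail_inst {
        bool writes_rf;      /* writes the register file at all */
        bool write_from_alu; /* the write comes from an ALU port */
};

/* slot 0 is the thread end instruction, slots 1 and 2 are the two
 * instructions after it (the last three instructions in total). */
static bool
sketch_rf_write_allowed(uint32_t ver, const struct sketch_tail_inst *inst,
                        uint32_t slot)
{
        if (!inst->writes_rf)
                return true;

        /* 4.x: no register file writes in the last three instructions. */
        if (ver < 71)
                return slot > 2;

        /* 7.x: only the thread end instruction is restricted, and only
         * for writes coming from the ALU ports. */
        return !(slot == 0 && inst->write_from_alu);
}
```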
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 3 Nov 2021 09:34:19 +0000 (10:34 +0100)]
broadcom/compiler: implement small immediates for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Mon, 25 Oct 2021 07:38:57 +0000 (09:38 +0200)]
broadcom/compiler: convert mul to add when needed to allow merge
V3D 7.x added 'mov' opcodes to the ADD alu, so now it is possible to
move these to the ADD alu to facilitate merging them with other MUL
instructions.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Fri, 29 Oct 2021 11:00:56 +0000 (13:00 +0200)]
broadcom/compiler: don't assign rf0 to temps that conflict with ldvary
ldvary writes to rf0 implicitly, so we don't want to allocate rf0 to
any temps that are live across ldvary's rf0 live ranges.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 28 Oct 2021 12:13:29 +0000 (14:13 +0200)]
broadcom/compiler: try to use ldunif(a) instead of ldunif(a)rf in v71
The rf variants need to encode the destination in the cond bits, which
prevents them from being merged with any other instruction that needs them.
In 4.x, ldunif(a) writes to r5, which is a special register that only
ldunif(a) and ldvary can write so we have a special register class for
it and only allow it for them. Then when we need to choose a register
for a node, if this register is available we always use it.
In 7.x these instructions write to rf0, which can be used by any
instruction, so instead of restricting rf0, we track the temps that
are used as ldunif(a) destinations and use that information to favor
rf0 for them.
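A rough sketch of the 7.x selection policy described above, assuming a 64-entry register file and a per-temp flag that records whether the temp is an ldunif(a) destination. All names are hypothetical:

```c
#include <stdbool.h>

#define SKETCH_NUM_RFS 64

/* Returns the chosen RF index, or -1 if nothing is available. */
static int
sketch_select_rf(bool is_ldunif_dst, const bool available[SKETCH_NUM_RFS])
{
        /* Favor rf0 for ldunif(a) destinations so the plain signal can be
         * used instead of the rf variant. */
        if (is_ldunif_dst && available[0])
                return 0;

        /* Everything else tries the other registers first, leaving rf0
         * free for temps that benefit from it. */
        for (int i = 1; i < SKETCH_NUM_RFS; i++) {
                if (available[i])
                        return i;
        }

        return available[0] ? 0 : -1;
}
```

Unlike the 4.x register class for r5, rf0 stays usable by any temp here; it is only deprioritized.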
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 27 Oct 2021 09:35:12 +0000 (11:35 +0200)]
broadcom/compiler: enable ldvary pipelining on v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 6 Oct 2021 11:58:27 +0000 (13:58 +0200)]
broadcom/compiler: handle rf0 flops storage restriction in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Tue, 26 Oct 2021 06:37:54 +0000 (08:37 +0200)]
broadcom/qpu: add packing for fmov on ADD alu
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Tue, 26 Oct 2021 09:43:02 +0000 (11:43 +0200)]
broadcom/compiler: update peripheral access restrictions for v71
In V3D 4.x only a couple of simultaneous accesses were allowed, but
V3D 7.x is a bit more flexible, so rather than trying to check for all
the allowed combinations it is easier to check whether we are in one of
the disallowed cases.
Shader-db (pi5):
total instructions in shared programs: 11338883 -> 11307386 (-0.28%)
instructions in affected programs: 2727201 -> 2695704 (-1.15%)
helped: 12555
HURT: 289
Instructions are helped.
total max-temps in shared programs: 2230199 -> 2229260 (-0.04%)
max-temps in affected programs: 20508 -> 19569 (-4.58%)
helped: 608
HURT: 4
Max-temps are helped.
total sfu-stalls in shared programs: 15236 -> 15293 (0.37%)
sfu-stalls in affected programs: 148 -> 205 (38.51%)
helped: 38
HURT: 64
Inconclusive result (%-change mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11354119 -> 11322679 (-0.28%)
inst-and-stalls in affected programs: 2732262 -> 2700822 (-1.15%)
helped: 12550
HURT: 304
Inst-and-stalls are helped.
total nops in shared programs: 273711 -> 274095 (0.14%)
nops in affected programs: 9626 -> 10010 (3.99%)
helped: 186
HURT: 397
Nops are HURT.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 19 Oct 2021 21:52:30 +0000 (23:52 +0200)]
broadcom/compiler: update payload registers handling when computing live intervals
For v71 the payload registers are not the same. Specifically, rf3 is
now used as a payload register, so this is needed to avoid rf3 being
selected as an instruction dst by the register allocator, overwriting
the payload value that could still be in use.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 19 Oct 2021 09:51:32 +0000 (11:51 +0200)]
broadcom/compiler: update ldunif/ldvary comment for v71
For v42 and below, ldunif and ldvary both write to r5, but with a
different delay, so we need to take that into account when scheduling
both. For v71 the register used is rf0, but the behaviour is the same,
so the scheduling code can stay the same, but the comment needs an update.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 19 Oct 2021 09:16:43 +0000 (11:16 +0200)]
broadcom/compiler: update one TMUWT restriction for v71
The "TMUWT not allowed in the final instruction" restriction doesn't
apply to v71.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 14 Oct 2021 12:16:40 +0000 (14:16 +0200)]
broadcom/compiler: v71 isn't affected by double-rounding of viewport X,Y coords
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Fri, 8 Oct 2021 13:10:24 +0000 (15:10 +0200)]
broadcom/compiler: generalize check for shaders using pixel center W
V3D 4.x has pixel center W in rf0 and V3D 7.x has it in rf3. We already
account for this when we setup the c->payload_w, so use that.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 6 Oct 2021 10:01:10 +0000 (12:01 +0200)]
broadcom/qpu: fail packing on unhandled mul pack/unpack
We are doing this for the ADD alu already and it may be helpful to
identify cases where we have QPU code with pack/unpack modifiers on
MUL opcodes that we then are not packing into the actual QPU
instructions.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 6 Oct 2021 07:27:43 +0000 (09:27 +0200)]
broadcom/qpu: add MOV integer packing/unpacking variants
These are new in v71 and cover MOV on both the ADD and the MUL alus.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 30 Sep 2021 11:22:48 +0000 (13:22 +0200)]
broadcom/compiler: allow instruction merges in v71
In v3d 4.x there were restrictions based on the number of raddrs used
by the combined instructions, but we don't have these restrictions in
v3d 7.x.
It should be noted that while there are no restrictions on the number
of raddrs addressed, a QPU instruction can only address a single small
immediate, so we should be careful about that when we add support for
small immediates.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Fri, 22 Oct 2021 11:39:48 +0000 (13:39 +0200)]
broadcom/compiler: don't schedule rf0 writes right after ldvary
ldvary writes rf0 implicitly on the next cycle so they would clash.
This case is not handled correctly by our normal dependency tracking,
which doesn't know anything about delayed writes from instructions
and thinks the rf0 write happens on the same cycle ldvary is emitted.
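A minimal sketch of the extra check, assuming the scheduler tracks the instruction index ("tick") of the last ldvary it emitted. The state struct and function name are hypothetical:

```c
#include <stdbool.h>

#define SKETCH_NO_LDVARY (-1)

struct sketch_sched_state {
        /* Instruction index of the last ldvary, or SKETCH_NO_LDVARY. */
        int last_ldvary_tick;
};

/* ldvary writes rf0 one instruction after it is emitted, so an explicit
 * rf0 write scheduled in that slot would clash with it. Same-tick
 * conflicts are assumed to be caught by the normal dependency tracking
 * and are not modelled here. */
static bool
sketch_can_write_rf0_at(const struct sketch_sched_state *state, int tick)
{
        if (state->last_ldvary_tick == SKETCH_NO_LDVARY)
                return true;

        return tick != state->last_ldvary_tick + 1;
}
```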
Fixes (v71):
dEQP-VK.glsl.conversions.matrix_to_matrix.mat2x3_to_mat4x2_fragment
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Tue, 28 Sep 2021 11:37:28 +0000 (13:37 +0200)]
broadcom/compiler: CS payload registers have changed in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 29 Sep 2021 10:14:04 +0000 (12:14 +0200)]
broadcom/compiler: don't assign rf0 to temps across implicit rf0 writes
In platforms that don't have accumulators and have implicit writes to
the register file we need to be careful and avoid assigning a physical
register to a temp that lives across an implicit write to that same
physical register.
For now, we have the case of implicit writes to rf0 from various
signals, but it should be easy to extend this to include additional
registers if needed.
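As an illustration only, the check could look like the sketch below. It assumes a temp's live range is given as [start, end] instruction indices and that the positions of the implicit rf0 writes are known; the function is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if any implicit rf0 write falls strictly inside the
 * temp's live range, in which case rf0 must not be assigned to it.
 * A write at the start of the range is the definition of the temp
 * itself and a write at the end happens after the last read, so
 * neither counts as a conflict in this simplified model. */
static bool
sketch_temp_lives_across_rf0_write(uint32_t temp_start, uint32_t temp_end,
                                   const uint32_t *rf0_write_ips,
                                   size_t count)
{
        for (size_t i = 0; i < count; i++) {
                if (rf0_write_ips[i] > temp_start &&
                    rf0_write_ips[i] < temp_end)
                        return true;
        }

        return false;
}
```

Extending this to other registers would mean keeping one such list of write positions per implicitly written register.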
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 29 Sep 2021 10:10:31 +0000 (12:10 +0200)]
broadcom/compiler: only handle accumulator classes if present
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 29 Sep 2021 10:03:50 +0000 (12:03 +0200)]
broadcom/compiler: rename vir_writes_rX to vir_writes_rX_implicitly
Since that represents more accurately what they check.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 29 Sep 2021 09:54:18 +0000 (11:54 +0200)]
broadcom/compiler: make vir_write_rX return false on platforms without accums
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Mon, 27 Sep 2021 23:17:08 +0000 (01:17 +0200)]
broadcom/qpu: implement switch rules for fmin/fmax fadd/faddnf for v71
They use the same opcodes, and switch between one and the other based
on raddr.
Note that the rule also takes into account whether small_imm_a/b are
used. That is still not in place, so that part is hardcoded. It will be
updated later when small immediates support for v71 gets implemented.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Mon, 4 Oct 2021 11:07:35 +0000 (13:07 +0200)]
broadcom/qpu: fix packing/unpacking of fmov variants for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Mon, 27 Sep 2021 11:26:04 +0000 (13:26 +0200)]
broadcom/qpu: add new ADD opcodes for FMOV/MOV in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Mon, 27 Sep 2021 09:49:24 +0000 (11:49 +0200)]
broadcom/compiler: prevent rf2-3 usage in thread end delay slots for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Wed, 6 Oct 2021 11:58:00 +0000 (13:58 +0200)]
broadcom/compiler: add a v3d71_qpu_writes_waddr_explicitly helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 23 Sep 2021 09:44:59 +0000 (11:44 +0200)]
broadcom/compiler: implement read stall check for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Thu, 23 Sep 2021 09:19:58 +0000 (11:19 +0200)]
broadcom/compiler: implement "reads/writes too soon" checks for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 15 Sep 2021 22:49:25 +0000 (00:49 +0200)]
broadcom/compiler: update register classes to not include accumulators on v71
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 15 Sep 2021 09:12:59 +0000 (11:12 +0200)]
broadcom/qpu_schedule: update write deps for v71
We just need to add a write dep if rf0 is written implicitly.
Note that we don't need to check if we have accumulators when checking
for r3/r4/r5, as v3d_qpu_writes_rX would return false for hw version
that doesn't have accumulators.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 14 Sep 2021 23:14:15 +0000 (01:14 +0200)]
broadcom/compiler: payload_w is loaded on rf3 for v71
And in general rf0 is now used for other needs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 14 Sep 2021 08:42:55 +0000 (10:42 +0200)]
broadcom/compiler: add support for varyings on nir to vir generation for v71
This needs an update as v71 doesn't have accumulators anymore, and ldvary
now uses rf0 to return the value.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 15 Sep 2021 08:55:49 +0000 (10:55 +0200)]
broadcom/qpu: return false on qpu_writes_accumulatorXX helpers for v71
As v71 doesn't have accumulators (devinfo->has_accumulators set to
false), those methods always return false.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 9 Sep 2021 23:20:44 +0000 (01:20 +0200)]
broadcom/qpu: update disasm_raddr for v71
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 9 Sep 2021 21:59:28 +0000 (23:59 +0200)]
broadcom/qpu_schedule: add process_raddr_deps
On v71 we don't have muxes, but we have more raddrs. Add an equivalent
function to add deps.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 8 Sep 2021 23:18:54 +0000 (01:18 +0200)]
broadcom/compiler: update vir_to_qpu::set_src for v71
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 8 Sep 2021 22:28:53 +0000 (00:28 +0200)]
broadcom/vir: implement is_no_op_mov for v71
Did some refactoring/splitting.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 16 Sep 2021 23:07:06 +0000 (01:07 +0200)]
broadcom/compiler: don't favor/select accum registers for hw not supporting it
Note that what we do is just return false in the favor/select accum
methods. We could just avoid calling them, but as the select is called
more than once, it is just easier this way.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Mon, 23 Aug 2021 00:18:43 +0000 (02:18 +0200)]
broadcom/compiler: phys index depends on hw version
For 7.1 there are no accumulators, so we replace the macro with a
function call.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Sat, 28 Jan 2023 23:27:11 +0000 (00:27 +0100)]
broadcom/compiler: update node/temp translation for v71
The offset applied needs to take into account whether we have
accumulators or not.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Sat, 7 Aug 2021 00:20:39 +0000 (02:20 +0200)]
broadcom/qpu: add pack/unpack support for v71
Note that we provide new v71 alu pack/unpack methods. As a lot of the
code is equivalent, initially we tried to use the existing methods as a
template and add version checks to them. Early on that became really
unreadable, so it was better to just provide new methods, even if the
v42 and v71 methods have a really similar structure.
Note that we have split the op tables and created two (add/mul) for
v71. As the description struct includes versioning info, we could have
just used one table. But, especially with the add table, there are a
lot of differences with v71, so it is slightly tidier this way. Also,
taking into account that we do a linear search on the tables, this can
even be justified by performance.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 15 Sep 2021 08:56:43 +0000 (10:56 +0200)]
broadcom/qpu: add qpu_writes_rf0_implicitly helper
On v71 rf0 replaces r5 as the register that gets updated implicitly
with uniform loads, and gets the C coefficient with ldvary. This
helper returns whether rf0 gets implicitly updated.
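A hedged sketch of how such a helper can sit next to its r5 counterpart. The structs below are hypothetical stand-ins, not the real v3d_device_info or signal definitions, and only the ldunif, ldunifa and ldvary signals are modelled:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device info. */
struct sketch_devinfo {
        uint8_t ver;           /* 42 for V3D 4.2, 71 for V3D 7.1 */
        bool has_accumulators; /* false on 7.x */
};

/* Hypothetical stand-in for the signal bits of one instruction. */
struct sketch_load_sig {
        bool ldunif;
        bool ldunifa;
        bool ldvary;
};

static bool
sketch_has_implicit_load(const struct sketch_load_sig *sig)
{
        return sig->ldunif || sig->ldunifa || sig->ldvary;
}

/* r5 only exists on hardware with accumulators. */
static bool
sketch_writes_r5_implicitly(const struct sketch_devinfo *devinfo,
                            const struct sketch_load_sig *sig)
{
        return devinfo->has_accumulators && sketch_has_implicit_load(sig);
}

/* Without accumulators the same signals land in rf0 instead. */
static bool
sketch_writes_rf0_implicitly(const struct sketch_devinfo *devinfo,
                             const struct sketch_load_sig *sig)
{
        return !devinfo->has_accumulators && sketch_has_implicit_load(sig);
}
```

Because each helper returns false on hardware where its register is not the implicit destination, callers can check both without a version test of their own.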
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 16 Sep 2021 23:04:31 +0000 (01:04 +0200)]
broadcom/common: add has_accumulators field on v3d_device_info
Even if we can just check for the version in the code, checking for
this field makes several places more readable. So for example, in the
register allocation code we don't assign an accumulator because we
don't have accumulators on that hw, instead of because the hw version
is a given one.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 12 Aug 2021 00:24:02 +0000 (02:24 +0200)]
broadcom/qpu: defining shift/mask for raddr_c/d
On V3D 7.x they replace mul_a/b and add_a/b.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 5 Aug 2021 23:33:32 +0000 (01:33 +0200)]
broadcom/qpu: add raddr on v3d_qpu_input
On V3D 7.x muxes are not used, and raddr_a/b/c/d are used instead.
This is not perfect, as for v71 the raddr_a/b defined at qpu_instr
become superfluous. But the alternative would be to define two
different structs, or even having them defined based on version
ifdefs, so this is a reasonable compromise.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Thu, 5 Aug 2021 23:22:31 +0000 (01:22 +0200)]
broadcom/qpu: define v3d_qpu_input, use on v3d_qpu_alu_instr
At this point it just tidies up the alu_instr structure a little.
But also serves to prepare the structure for new changes, as 7.x uses
raddr instead of mux, and it is just easier to add the raddr to the
new input structure.
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 3 Aug 2021 23:11:16 +0000 (01:11 +0200)]
broadcom/qpu: add v71 signal map
Compared with v41, the differences are:
* 14, 15, 29 and 30 are now about immediate a, b, c, d respectively
* 23 is now reserved. On v42 this was for rotate signals, which are
gone on v71.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 4 Aug 2021 22:50:12 +0000 (00:50 +0200)]
broadcom/compiler: add small_imm a/c/d on v3d_qpu_sig
small_imm_a, small_imm_c and small_imm_d are added on top of the already
existing small_imm_b, as V3D 7.1 defines 4 small immediates, tied to
the 4 raddr. Note that this is only the definition, plus an instruction
validation rule to check that they are not used before v71. Any real use
is still pending.
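A sketch of what such a validation rule might check. It also folds in the constraint, noted elsewhere in this series, that a QPU instruction can only address a single small immediate. The struct and function are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the small immediate signal bits. */
struct sketch_small_imm_sig {
        bool small_imm_a;
        bool small_imm_b;
        bool small_imm_c;
        bool small_imm_d;
};

static bool
sketch_small_imm_valid(uint32_t ver, const struct sketch_small_imm_sig *sig)
{
        /* Only small_imm_b exists before v71. */
        if (ver < 71 &&
            (sig->small_imm_a || sig->small_imm_c || sig->small_imm_d))
                return false;

        /* At most one small immediate per instruction. */
        unsigned count = sig->small_imm_a + sig->small_imm_b +
                         sig->small_imm_c + sig->small_imm_d;
        return count <= 1;
}
```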
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Sun, 19 Sep 2021 01:20:18 +0000 (03:20 +0200)]
broadcom/compiler: rename small_imm to small_imm_b
Current small_imm is associated with the "B" read address.
We do this change in advance for v71 support, where we will have 4
different small_imm (a/b/c/d), so we start with a renaming.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 4 Aug 2021 23:00:47 +0000 (01:00 +0200)]
broadcom/qpu: set V3D 7.x names for some waddr aliasing
V3D 7.x got rid of the accumulator, but still uses the values for
WADDR_R5 and WADDR_R5REP, so let's return a proper name and add some
aliases.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 4 Aug 2021 23:03:11 +0000 (01:03 +0200)]
broadcom/qpu: add comments on waddr not used on V3D 7.x
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Wed, 17 Nov 2021 13:40:47 +0000 (14:40 +0100)]
broadcom/common: add some common v71 helpers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Tue, 9 Nov 2021 07:50:51 +0000 (08:50 +0100)]
broadcom/common: retrieve V3D revision number
The subrev field from the hub ident3 register is bumped with every
hardware revision doing backwards incompatible changes so we want to
keep track of this.
Instead of modifying the 'ver' field info to accommodate subrev info,
which would require a lot of changes, simply add a new 'rev' field in
devinfo that we can use when we need to make changes based on the
revision number of a hardware release.
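One way the extra field could be consumed is sketched below. The struct and the comparison helper are hypothetical, and the 'ver' encoding (major * 10 + minor) is an assumption of this sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device info. */
struct sketch_devinfo {
        uint8_t ver; /* e.g. 42 or 71 */
        uint8_t rev; /* hardware revision within that version */
};

/* True if the device is at least version 'ver', revision 'rev'. */
static bool
sketch_version_at_least(const struct sketch_devinfo *devinfo,
                        uint8_t ver, uint8_t rev)
{
        if (devinfo->ver != ver)
                return devinfo->ver > ver;

        return devinfo->rev >= rev;
}
```

Keeping 'rev' separate means existing comparisons against 'ver' keep working unchanged.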
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Tue, 29 Jun 2021 10:03:24 +0000 (12:03 +0200)]
broadcom/cle: update the packet definitions for new generation v71
Using the spec for 7.1.5 as reference. This includes totally new
packets, and redefines some that already existed on v42.
Full list:
* Add Depth Bounds Test Limits
* Redefine Tile Binning Mode Cfg
* Redefine Cfg Bits. There are some changes to the fields:
  * Line Rasterization is now 1 bit in size
  * Depth Bounds Enable (takes one of the bits of Line Rasterization)
  * Early-Z/Early-Z updates enable bits (16-17) now appear as reserved.
  * New Z-Clipping mode field
* Redefine Tile Rendering Mode Cfg (Common). Changes with respect to v42:
  * New log2 tile height/width fields starting at bit 52/55
  * Due to those two new fields, the end pad is smaller
  * sub-id now has a size of 3. Bit 4 is reserved.
  * Number of render targets: this field's max value is now 7 (not
    reflected in the xml).
  * Maximum BPP is removed on v71 (now bits 40-41 are reserved)
  * Depth Buffer disable: on bit 44
* Update Store Tile Buffer General
* Adding Cfg Render Target Part1/2/3 packets: they replace v4X "Tile
Rendering Mode Cfg (Color)" (real name "Rendering Configuration
(Render Targets Config)"), "Tile Rendering Mode Cfg (Clear Colors
Part1)", "Tile Rendering Mode Cfg (Clear Colors Part2)", and "Tile
Rendering Mode Cfg (Clear Colors Part3)". On those old versions,
the first packet is used to configure 4 render targets. Now that 8
are supported, individual per-render-target packets are used.
* Update ZS clear values packet.
* Add new v71 output formats
* Define Clear Render Targets (Replaces Clear Tile Buffers from v42)
* Redefine GL Shader State Record. Changes compared with v42:
  * Fields removed:
    * "Coordinate shader has separate input and output VPM blocks"
      (reserved bit now)
    * "Vertex shader has separate input and output VPM blocks"
      (reserved bit now)
    * "Address of table of default attribute Values." (we needed to
      change the start position for all the following fields)
  * New field:
    * "Never defer FEP depth writes to fragment shader auto Z writes
      on scoreboard conflict"
* Redefine clipper xy scaling: Now it uses 1/64ths of pixels, instead
of 1/256ths
* Update texture shader state.
* Notice we don't use an address type for these fields in the XML
description. This is because the addresses are 64-bit aligned
(even though the PRM doesn't say it) which means the 6 LSB bits
are implicitly 0, but the fields are encoded before the 6th bit
of their starting byte, so we can't use the usual trick we do
with address types where the first 6 bits in the byte are
implicitly overwritten by other fields and we have to encode this
manually as a uint field. This would mean that if we had an
actual BO we would also need to add it manually to the job's
list, but since we don't have one, we don't have to do anything
about it.
* Add new RB_Swap field for texture shader state
* Document Cb/Cr addresses as uint fields in texture shader state
* Fixup Blend Config description: we now support 8 RTs.
* TMU config parameter 2 has new fields
* Add new clipper Z without guardband packet in v71
* Add enums for the Z clip modes accepted in v71
* Fix texture state array stride packing for V3D 7.1.5
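To illustrate the clipper xy scaling item in the list above: a value expressed in pixels is scaled by 256 on v42 and by 64 on v71 before being packed. The helper names are hypothetical, and treating the input as a viewport half extent is an assumption of this sketch:

```c
#include <stdint.h>

/* Sub-pixel units per pixel used by the clipper xy scaling packet. */
static float
sketch_clipper_xy_units_per_pixel(uint32_t ver)
{
        return ver >= 71 ? 64.0f : 256.0f;
}

/* Convert a half extent in pixels to the units the packet expects. */
static float
sketch_clipper_half_extent(uint32_t ver, float half_extent_px)
{
        return half_extent_px * sketch_clipper_xy_units_per_pixel(ver);
}
```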
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Iago Toral Quiroga [Tue, 28 Sep 2021 11:16:49 +0000 (13:16 +0200)]
broadcom/simulator: reset CFG7 for compute dispatch in v71
This register is new in 7.x, it doesn't seem that we need to
do anything specific for now, but let's make sure it is reset
every time.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Alejandro Piñeiro [Sun, 25 Apr 2021 22:02:21 +0000 (00:02 +0200)]
broadcom(cle,clif,common,simulator): add 7.1 version on the list of versions to build
This adds 7.1 to the list of available V3D_VERSION, and first changes
on the simulator needed to get it working.
Note that we needed to touch all 4 of those codebases because that is
needed if we want to use V3D_DEBUG=clif with the simulator, which is
the easiest way to see which packets a Vulkan program is using.
About the simulator, this commit only handles the rename of some
registers. Any additional changes needed to get proper support for
v71 will be handled in following commits.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Sagar Ghuge [Thu, 12 Oct 2023 17:53:07 +0000 (10:53 -0700)]
blorp: Use the correct miptail start LOD for surfaces
Use the correct miptail start LOD for the surfaces involved in the
XY_BLOCK_COPY_BLT/XY_FAST_COLOR_BLT instructions.
Thanks to Lionel for pointing out the issue.
Fixes:
46f45d62d1 ("intel/isl: Start using miptails")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25688>
LingMan [Fri, 13 Oct 2023 16:51:22 +0000 (18:51 +0200)]
rusticl/memory: fix potential use-after-free in clEnqueueSVMFree
Fixes:
bfee3a8563d ("rusticl: add support for fine-grained system SVM")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25719>
Christian Gmeiner [Tue, 11 Jul 2023 10:42:59 +0000 (12:42 +0200)]
docs: update etnaviv extensions
Next round of feature updates.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25700>
Eric Engestrom [Fri, 13 Oct 2023 14:55:42 +0000 (15:55 +0100)]
ci_run_n_monitor: dependency jobs must always be started
Fixes:
6b49b477aca7ba572b06 ("ci/ci_run_n_monitor: simplify enable/cancel logic in monitor_pipeline()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25715>
Samuel Pitoiset [Fri, 13 Oct 2023 13:12:10 +0000 (15:12 +0200)]
zink/ci: remove expected failures that are skipped for RADV
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25711>
Erik Faye-Lund [Thu, 12 Oct 2023 13:17:53 +0000 (15:17 +0200)]
meson: add wayland-protocols from meson wrapdb
Sometimes, users don't have a recent enough version of wayland-protocols
to build Mesa. But these days, Meson's WrapDB has a wrap that's more
than new enough for our needs.
If we add the wrap-file to the subprojects folder, we'll download a more
recent wrap if the installed version is too old (or doesn't exist at
all).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25683>
Gert Wollny [Fri, 29 Sep 2023 13:36:13 +0000 (15:36 +0200)]
r600: drop egcm_load_index_reg
This is now handled in SFN.
v2: remove obsolete comments (Vitaliy Kuzmin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25475>
Gert Wollny [Fri, 29 Sep 2023 19:09:58 +0000 (21:09 +0200)]
r600/sfn: don't remove texture sources by using the enum value
We have to query the index first, otherwise we remove the wrong value.
Fixes:
02bb506c54f998cfbc907758282a5748755c67ea ("r600/sfn: Lower tex,txl,txb and txf to backend")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25475>
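The pitfall described in "r600/sfn: don't remove texture sources by using the enum value" can be illustrated with a minimal Python sketch. The names `SrcType`, `sources`, and both helper functions are hypothetical and are not taken from the r600 code; the sketch only assumes that the source list is ordered independently of the enum's numeric values.

```python
from enum import IntEnum

class SrcType(IntEnum):
    # Hypothetical source kinds; the numeric values are arbitrary.
    COORD = 0
    BIAS = 1
    LOD = 2
    OFFSET = 3

def remove_by_enum_value(sources, kind):
    """Buggy variant: treats the enum's numeric value as a list position."""
    result = list(sources)
    del result[int(kind)]
    return result

def remove_by_queried_index(sources, kind):
    """Query where the source of this kind actually sits, then remove it."""
    index = next((i for i, (k, _) in enumerate(sources) if k == kind), -1)
    if index < 0:
        return list(sources)  # no such source, nothing to remove
    return sources[:index] + sources[index + 1:]

# No BIAS source is present, so LOD sits at position 1, not 2.
sources = [
    (SrcType.COORD, "coord"),
    (SrcType.LOD, "lod"),
    (SrcType.OFFSET, "offset"),
]
```

With this list, removing `SrcType.LOD` by its enum value deletes the offset source instead, while the index query removes the intended entry.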
Martin Roukala (né Peres) [Fri, 13 Oct 2023 11:27:23 +0000 (14:27 +0300)]
zink/ci: tighten the zink-radv-vangogh timeouts
The jobs should never take longer than 30 minutes, so let's enforce it!
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25704>
Martin Roukala (né Peres) [Fri, 13 Oct 2023 11:26:37 +0000 (14:26 +0300)]
radv/ci: tighten the vkcts-navi21 timeouts
The jobs should never take longer than 30 minutes, so let's enforce it!
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25704>
Erik Faye-Lund [Fri, 13 Oct 2023 07:30:33 +0000 (09:30 +0200)]
ci/etnaviv: move failure to flake
Turns out, this passes sometimes... So let's mark it as a flake
instead.
Fixes:
7368a897528 ("ci/etnaviv: update ci expectation")
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25699>
Matt Coster [Fri, 11 Aug 2023 10:18:10 +0000 (11:18 +0100)]
pvr: Use common physical device properties
Make use of the common vulkan properties code introduced in [1].
[1]: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24575
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25326>
Matt Coster [Tue, 15 Aug 2023 09:45:27 +0000 (10:45 +0100)]
pvr: Minor refactor of pvr_device.c
Moving a few functions further up here to prepare for the next commit;
should make the diffs a lot nicer. No (intentional) functional changes.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25326>
Matt Coster [Tue, 15 Aug 2023 09:42:12 +0000 (10:42 +0100)]
pvr: Don't pass pvr_physical_device when only device info is needed
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25326>
Erico Nunes [Fri, 13 Oct 2023 09:51:34 +0000 (11:51 +0200)]
Revert "ci/lima: farm is down, disable for now"
This reverts commit
c7806daf43b47c463e0fc4b9d3896ee321cd17ae.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25702>
cheyang [Fri, 13 Oct 2023 03:43:57 +0000 (11:43 +0800)]
isaspec : fix isaspec build error in aosp
The Android 12 build fails with the error "ninja:
'external/mesa/src/compiler/isaspec/README.rst', needed by
'out/target/product/s/obj/MESON_MESA3D_GEN/.timestamp', missing and
no known rule to make it" because commit
d48d8aefdff44e8a6ece030a782dc9d152bb1d5d ("docs: Move isaspec out of
drivers/freedreno") changed the location of isaspec.rst.
Signed-off-by: cheyang <cheyang@bytedance.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25697>
Karol Herbst [Thu, 12 Oct 2023 22:06:51 +0000 (00:06 +0200)]
rusticl/kernel: Fix creation from programs not built for every device
OpenCL does not require that a kernel is created for every device. So we
shouldn't assume there is a build for every device.
API validation around launching kernels already takes this possibility
into account.
I did not verify whether the commit below is actually the culprit or whether
this bug existed before it, but a fix for older code would have to look
different anyway.
Fixes:
323dcbb4b52 ("rusticl: Move NirKernelBuild to ProgramDevBuild")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9968
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25696>
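The idea behind "rusticl/kernel: Fix creation from programs not built for every device" can be sketched as follows. This is a simplified Python model, not rusticl's Rust code: `create_kernel_builds`, the dictionary layout, and the device names are all hypothetical. It only assumes that a program keeps per-device build results and that some devices may have none.

```python
def create_kernel_builds(program_builds, devices, kernel_name):
    """Collect per-device kernel data, skipping devices without a build."""
    kernels = {}
    for dev in devices:
        build = program_builds.get(dev)
        if build is None or kernel_name not in build:
            # The program was not built for this device; do not assume
            # that a build exists.
            continue
        kernels[dev] = build[kernel_name]
    return kernels

# Hypothetical program that was built for "gpu0" only.
program_builds = {"gpu0": {"vec_add": "nir-for-gpu0"}}
kernels = create_kernel_builds(program_builds, ["gpu0", "gpu1"], "vec_add")
```

Here the kernel is created for `gpu0` alone, and `gpu1` is simply absent from the result instead of triggering a failed lookup.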
Danylo Piliaiev [Thu, 12 Oct 2023 09:27:44 +0000 (11:27 +0200)]
freedreno/rddecompiler: Decompile repeated IBs
Otherwise we don't reconstruct the whole cmdstream.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25677>
Danylo Piliaiev [Thu, 12 Oct 2023 09:26:00 +0000 (11:26 +0200)]
freedreno/rddecompiler: Use fd_dev_gen to pass gpu_id to ir3 disasm
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25677>
Eric Engestrom [Thu, 12 Oct 2023 19:02:36 +0000 (20:02 +0100)]
include/dri_interface.h: restore define mistakenly removed in !25587
This file is a public API used by Xserver; removing something from it
means Xserver can't build anymore, and even if we fix Xserver, old
versions are still around and we want to keep them working to allow for
bisecting issues.
Fixes:
7301914755f5843f1095 ("dri: Remove __driDriverExtensions leftovers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9976
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25692>
Lionel Landwerlin [Thu, 5 Oct 2023 12:18:50 +0000 (15:18 +0300)]
Revert "intel/fs: limit register flag interaction of FIND_*LIVE_CHANNEL"
This reverts commit
c9739e8912286a212359f3a5a4f958c6165ce2cc.
We don't have a full understanding of what is going on but reverting
definitely fixes a hang.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes:
c9739e8912 ("intel/fs: limit register flag interaction of FIND_*LIVE_CHANNEL")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9868
Tested-By: Valentin Geyer <trayshar@t-online.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25563>
José Roberto de Souza [Thu, 12 Oct 2023 14:20:18 +0000 (07:20 -0700)]
intel: Prepare implementation of Wa_18019816803 and Wa_16013994831 for future platforms
Those workarounds are temporary for newer platforms, so we can't use
INTEL_NEEDS_WA_*. Luckily, those already had runtime checks.
INTEL_NEEDS_WA_* was only used because the code was accessing instructions
or fields of the instructions that only exist in gfx12 or gfx125.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25685>
Alyssa Rosenzweig [Thu, 5 Oct 2023 21:44:09 +0000 (17:44 -0400)]
nir/opt_algebraic: Optimize LLVM booleans
Helps OpenCL kernels.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25687>
Jordan Justen [Thu, 12 Oct 2023 08:30:49 +0000 (01:30 -0700)]
intel/dev: Add 0x56ba-0x56bd DG2 PCI IDs
Cc: mesa-stable
Ref: https://lists.freedesktop.org/archives/intel-gfx/2023-October/337287.html
Ref: BSpec 44477
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25673>
Jordan Justen [Wed, 11 Oct 2023 19:56:45 +0000 (12:56 -0700)]
anv/batch: Assert that extend_cb is non-NULL if the batch is out of space
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25672>
Jordan Justen [Wed, 11 Oct 2023 19:54:27 +0000 (12:54 -0700)]
anv/batch: Check if batch already has an error in anv_queue_submit_simple_batch()
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25672>
Emma Anholt [Thu, 12 Oct 2023 19:17:50 +0000 (12:17 -0700)]
ci/radeonsi: Drop an xfail for vangogh.
It's passed in the last 3 nightly runs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25693>
Emma Anholt [Thu, 12 Oct 2023 19:15:47 +0000 (12:15 -0700)]
ci/zink: Add a TGL flake that's showed up in nightlies recently.
I don't know how recently, since the nightlies were timing out for a long
time.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25693>
Emma Anholt [Tue, 10 Oct 2023 17:47:59 +0000 (10:47 -0700)]
nir/print: Decode system values in the variable declarations.
decl_var system INTERP_MODE_NONE none vec4 #0
decl_var system INTERP_MODE_FLAT none mediump uint #1
turns into:
decl_var system INTERP_MODE_NONE none vec4 #0 (SYSTEM_VALUE_FRAG_COORD)
decl_var system INTERP_MODE_FLAT none mediump uint #1 (SYSTEM_VALUE_SUBGROUP_INVOCATION)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25647>
Mike Blumenkrantz [Wed, 27 Sep 2023 14:12:03 +0000 (10:12 -0400)]
zink: shrink vectors during optimization
this avoids a number of cases where a shader was reading more components
from an input than an output was providing. functionally there was never
any issue as these read components were subsequently rewritten to use
constant data, but the read itself is a spec violation
shrinking can't be done in finalize, however, as that enables the frontend
to optimize vertex states, which seems like a good thing but ends up being
a bad thing since it may or may not be consistent across frontends and I
don't wanna deal with having to reorder i/o locations in unintuitive ways
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25433>
Alyssa Rosenzweig [Sun, 1 Oct 2023 13:50:41 +0000 (09:50 -0400)]
nir/opt_algebraic: Reduce int64
If we just want the bottom 32 bits, we don't need a full 64-bit operation.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
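The arithmetic fact behind "nir/opt_algebraic: Reduce int64" can be checked with a short Python sketch. It does not reproduce the actual opt_algebraic patterns; the helper names and sample values are illustrative, and the operations shown (add, multiply, xor) are an assumption chosen because their low 32 result bits depend only on the low 32 bits of the inputs.

```python
import operator

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def low32_via_64bit(op, a, b):
    """Do the full 64-bit operation, then keep the bottom 32 bits."""
    return (op(a, b) & MASK64) & MASK32

def low32_via_32bit(op, a, b):
    """Truncate the inputs first, then do a 32-bit operation."""
    return op(a & MASK32, b & MASK32) & MASK32

SAMPLES = [0, 1, 0xFFFFFFFF, 0x1_0000_0000, 0x1234_5678_9ABC_DEF0, MASK64]
OPS = [operator.add, operator.mul, operator.xor]

# True when both ways of computing the low 32 bits agree on every sample.
all_equal = all(
    low32_via_64bit(op, a, b) == low32_via_32bit(op, a, b)
    for op in OPS
    for a in SAMPLES
    for b in SAMPLES
)
```

Operations such as right shifts or division do not have this property, since their low bits depend on the high bits of the inputs.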
Alyssa Rosenzweig [Tue, 5 Sep 2023 22:09:40 +0000 (18:09 -0400)]
nir/lower_io: Use load_global_constant for OpenCL
Map __constant with a 64-bit address format to load_global_constant instead of
load_global. This notably allows nir_opt_preamble to hoist the load.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
Alyssa Rosenzweig [Tue, 5 Sep 2023 21:43:49 +0000 (17:43 -0400)]
nir/print: Handle KERNEL
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
Alyssa Rosenzweig [Tue, 5 Sep 2023 21:27:17 +0000 (17:27 -0400)]
nir/legalize_16bit_sampler_srcs: Use instr_pass
Fixes the pass with multiple functions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
Alyssa Rosenzweig [Tue, 5 Sep 2023 21:19:50 +0000 (17:19 -0400)]
nir/opt_phi_precision: Work with libraries
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>
Alyssa Rosenzweig [Thu, 12 Oct 2023 14:40:09 +0000 (10:40 -0400)]
r600/sfn: Handle load_global_constant
as an alias of load_global, for CL.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>