platform/upstream/mesa.git
16 months agoradv: Enable the null export workaround with POPS
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 19:23:11 +0000 (22:23 +0300)]
radv: Enable the null export workaround with POPS

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoradv: Enable POPS collision wave ID shader argument
Vitaliy Triang3l Kuzmin [Fri, 2 Jun 2023 19:58:47 +0000 (22:58 +0300)]
radv: Enable POPS collision wave ID shader argument

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoradv: Declare POPS collision wave ID shader argument
Vitaliy Triang3l Kuzmin [Fri, 2 Jun 2023 19:55:48 +0000 (22:55 +0300)]
radv: Declare POPS collision wave ID shader argument

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoradv: Ensure 1x1 shading rate on GFX10.3 with interlock execution mode
Vitaliy Triang3l Kuzmin [Fri, 2 Jun 2023 21:29:31 +0000 (00:29 +0300)]
radv: Ensure 1x1 shading rate on GFX10.3 with interlock execution mode

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoradv: Detect the use of Primitive Ordered Pixel Shading
Vitaliy Triang3l Kuzmin [Fri, 2 Jun 2023 21:26:31 +0000 (00:26 +0300)]
radv: Detect the use of Primitive Ordered Pixel Shading

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoradv: Remove unconditional POPS_DRAIN_PS_ON_OVERLAP setting
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 19:15:37 +0000 (22:15 +0300)]
radv: Remove unconditional POPS_DRAIN_PS_ON_OVERLAP setting

This hardware hang workaround (PAL waMiscPopsMissedOverlap) is needed only
on some Vega chips, and only for 8 or more samples per pixel. It has a
significant performance cost (around 1.5x-2x in
nvpro-samples/vk_order_independent_transparency), so it should be precisely
configured when setting up Primitive Ordered Pixel Shading.

It was added in 47b780be21d917eaa6a6a6c9e30ba9fba52d9acd, when POPS was not
used in Mesa, with the change being described as "this may not be needed
yet, but let's set it now".
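
A minimal sketch of the precise configuration argued for above. The names `gpu_info_sketch`, `is_affected_vega` and `pops_needs_drain_on_overlap` are hypothetical and are not Mesa's real identifiers; the exact list of affected Vega chips is not given in this message, so it is modelled as a single flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's chip description (assumption). */
struct gpu_info_sketch {
   bool is_affected_vega; /* set only for the Vega chips that can hang */
};

/* Returns true only when the drain-on-overlap workaround would be needed:
 * POPS is in use, the chip is affected, and there are 8 or more samples. */
static bool
pops_needs_drain_on_overlap(const struct gpu_info_sketch *info, bool uses_pops,
                            unsigned samples_per_pixel)
{
   assert(info != NULL);
   return uses_pops && info->is_affected_vega && samples_per_pixel >= 8;
}
```

Keeping the condition in one predicate would let every other configuration avoid the 1.5x-2x cost mentioned above.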

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoradeonsi: Remove unconditional POPS_DRAIN_PS_ON_OVERLAP setting
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 19:14:39 +0000 (22:14 +0300)]
radeonsi: Remove unconditional POPS_DRAIN_PS_ON_OVERLAP setting

This hardware hang workaround (PAL waMiscPopsMissedOverlap) is needed only
on some Vega chips, and only for 8 or more samples per pixel. It has a
significant performance cost (around 1.5x-2x in
nvpro-samples/vk_order_independent_transparency), so it should be precisely
configured when setting up Primitive Ordered Pixel Shading.

It was added in 47b780be21d917eaa6a6a6c9e30ba9fba52d9acd, when POPS was not
used in Mesa, with the change being described as "this may not be needed
yet, but let's set it now".

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Implement fragment shader interlock intrinsics
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 18:54:38 +0000 (21:54 +0300)]
aco: Implement fragment shader interlock intrinsics

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Add Primitive Ordered Pixel Shading waitcnt rules
Vitaliy Triang3l Kuzmin [Thu, 6 Apr 2023 20:09:35 +0000 (23:09 +0300)]
aco: Add Primitive Ordered Pixel Shading waitcnt rules

When letting the overlapping waves enter their ordered sections, there must
be no memory accesses to resources which need primitive-ordered access that
are still pending, or there would be a race between the current wave and
the overlapping waves.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Send MSG_ORDERED_PS_DONE where necessary
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 18:18:21 +0000 (21:18 +0300)]
aco: Send MSG_ORDERED_PS_DONE where necessary

If the wave has set the Primitive Ordered Pixel Shading packer ID hardware
register, it must send MSG_ORDERED_PS_DONE once before the program ends.
It's also safe to send the message if the packer ID register hasn't been
set yet, so the message may be sent conservatively. For simplicity, and to
ensure that it's sent on all execution paths after setting the packer ID
register, it is always sent from a top-level block. This is required for
GFX9-10.3 POPS.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Add Primitive Ordered Pixel Shading scheduling rules
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 18:27:47 +0000 (21:27 +0300)]
aco: Add Primitive Ordered Pixel Shading scheduling rules

Implement the acquire/release semantics of the fragment shader interlock
ordered section in Vulkan, and prevent reordering of memory accesses
requiring primitive ordering out of the ordered section.

Also, the ordered section should be as short as possible, so don't reorder
the instructions awaiting overlapped waves upwards, or the exit from the
ordered section downwards.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Skip waitcnt insertion in the discard early exit block
Vitaliy Triang3l Kuzmin [Sat, 15 Apr 2023 18:45:12 +0000 (21:45 +0300)]
aco: Skip waitcnt insertion in the discard early exit block

Waits are needed for early exits from inside a Primitive Ordered Pixel
Shading ordered section, but that code doesn't insert them reliably anyway
because it doesn't obtain the counters for the exact locations of the
jumps, which may be anywhere inside the predecessor blocks.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Add Primitive Ordered Pixel Shading pseudo-instructions
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 18:22:02 +0000 (21:22 +0300)]
aco: Add Primitive Ordered Pixel Shading pseudo-instructions

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Add s_wait_event argument bit definitions
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 18:09:34 +0000 (21:09 +0300)]
aco: Add s_wait_event argument bit definitions

A wait for export_ready (if the corresponding bit is not set in the
instruction) is done to enter the Primitive Ordered Pixel Shading ordered
section on GFX11.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoac: Define POPS collision wave ID argument SGPR
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 18:08:15 +0000 (21:08 +0300)]
ac: Define POPS collision wave ID argument SGPR

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoaco: Support pops_exiting_wave_id PhysReg usage
Vitaliy Triang3l Kuzmin [Mon, 3 Apr 2023 17:50:41 +0000 (20:50 +0300)]
aco: Support pops_exiting_wave_id PhysReg usage

pops_exiting_wave_id is a volatile ALU source operand containing the ID of
the latest wave that hasn't exited yet, for comparing with the newest
overlapped wave ID in overlapping waves.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agoac/nir: Support Primitive Ordered Pixel Shading in lower_ps
Vitaliy Triang3l Kuzmin [Wed, 26 Apr 2023 18:09:48 +0000 (21:09 +0300)]
ac/nir: Support Primitive Ordered Pixel Shading in lower_ps

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agodocs/amd: Document Primitive Ordered Pixel Shading
Vitaliy Triang3l Kuzmin [Sun, 23 Apr 2023 20:12:58 +0000 (23:12 +0300)]
docs/amd: Document Primitive Ordered Pixel Shading

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22250>

16 months agogallivm: Use NIR_PASS macros
Alyssa Rosenzweig [Thu, 22 Jun 2023 15:08:19 +0000 (11:08 -0400)]
gallivm: Use NIR_PASS macros

These run nir_validate in debug builds, which will avoid bugs slipping in. It's
not enough that llvmpipe doesn't mind illegal NIR; these passes are well within
their rights to fail spectacularly if the NIR wouldn't validate. So validate so
we catch issues early.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>

16 months agonir/lower_locals_to_regs: Add bool bitsize knob
Alyssa Rosenzweig [Thu, 22 Jun 2023 20:12:40 +0000 (16:12 -0400)]
nir/lower_locals_to_regs: Add bool bitsize knob

GLSL booleans (and hence bool derefs) may be translated either as 1-bit or
32-bit NIR registers, depending on whether the backend uses nir_lower_bool_to_int32
or not. Add a knob for this and choose the right type for different backends.
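
A minimal sketch of what such a knob could look like. The helper `local_reg_bit_size` and its parameters are hypothetical and are not the pass's real interface; only the 1-bit versus 32-bit choice comes from the description above:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (assumption, not the real NIR API): picks the register
 * bit size for one local variable. bool_bitsize is the knob: 1 if the backend
 * keeps 1-bit booleans, 32 if it lowers them with nir_lower_bool_to_int32. */
static unsigned
local_reg_bit_size(bool is_bool, unsigned natural_bit_size, unsigned bool_bitsize)
{
   assert(bool_bitsize == 1 || bool_bitsize == 32);
   return is_bool ? bool_bitsize : natural_bit_size;
}
```

A backend that lowers booleans before lowering derefs would pass 32 here, so that no 1-bit registers appear among 32-bit booleans.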

Fixes nir_validate failure on
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcast_bvec3 run under
lavapipe. That test indexes into a bvec3 array, and gallivm first lowers bools
and then lowers derefs to registers, resulting in random 1-bit booleans mixed in
with 32-bit bools.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>

16 months agonir/lower_bool_to_int32: Fix progress reporting
Alyssa Rosenzweig [Thu, 22 Jun 2023 16:24:28 +0000 (12:24 -0400)]
nir/lower_bool_to_int32: Fix progress reporting

If we only lower parameters, that's still progress. Technically.
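
A minimal sketch of the reporting rule, using a hypothetical `lowering_sketch` struct rather than real NIR types: progress is reported when anything changed, including the case where only parameters were lowered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function being lowered; not a NIR type. */
struct lowering_sketch {
   unsigned bool_params; /* parameters still typed as 1-bit booleans */
   unsigned bool_instrs; /* instructions still producing 1-bit booleans */
};

static bool
lower_bools_sketch(struct lowering_sketch *f)
{
   assert(f != NULL);
   bool progress = false;

   if (f->bool_params > 0) {
      f->bool_params = 0;
      progress = true; /* lowering only parameters still counts */
   }
   if (f->bool_instrs > 0) {
      f->bool_instrs = 0;
      progress = true;
   }
   return progress;
}
```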

Fixes: 6a29cb2654f ("nir/lower_bool_to_int32: add support for lowering functions.")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23804>

16 months agorusticl/api: Wire up CL_DEVICE_PROFILING_TIMER_RESOLUTION
Dr. David Alan Gilbert [Sun, 25 Jun 2023 16:22:29 +0000 (17:22 +0100)]
rusticl/api: Wire up CL_DEVICE_PROFILING_TIMER_RESOLUTION

Wire up the CL_DEVICE_PROFILING_TIMER_RESOLUTION from the PIPE_CAP.
While here, also set CL_PLATFORM_HOST_TIMER_RESOLUTION to 1;
that's bogus since we're using the same value as for device, but
at this point we don't have a device to ask.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23639>

16 months agorusticl/api: Implement get_{device_and_}host_timer
Dr. David Alan Gilbert [Wed, 14 Jun 2023 00:31:29 +0000 (01:31 +0100)]
rusticl/api: Implement get_{device_and_}host_timer

Use the get_timestamp as both the device_timestamp in
get_device_and_host_timer and host_timestamp in that
and get_host_timer.

Having eliminated most other clock sources, discussions
on previous versions have concluded it's best to use the
same timer as the 'host_timestamp' since the main requirements
are that it must be one that's a time seen by the device and
that it's very closely coupled.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23639>

16 months agorusticl/device: Stash timestamp availability
Dr. David Alan Gilbert [Sat, 24 Jun 2023 21:03:02 +0000 (22:03 +0100)]
rusticl/device: Stash timestamp availability

Check if the device claims to have timestamps and a valid resolution
and stash it in the device.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23639>

16 months agorusticl/screen: Wrap get_timestamp
Dr. David Alan Gilbert [Tue, 13 Jun 2023 00:48:34 +0000 (01:48 +0100)]
rusticl/screen: Wrap get_timestamp

Add a wrapper on our screen type to call get_timestamp.

Signed-off-by: Dr. David Alan Gilbert <dave@treblig.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23639>

16 months agodraw: use unsigned instead of uint
Erik Faye-Lund [Fri, 23 Jun 2023 12:21:20 +0000 (14:21 +0200)]
draw: use unsigned instead of uint

uint isn't a standard type, just something we accidentally get from some
other headers.

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agodraw: match type of pipe_draw_start_count_bias::count
Erik Faye-Lund [Fri, 23 Jun 2023 13:35:51 +0000 (15:35 +0200)]
draw: match type of pipe_draw_start_count_bias::count

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agocso: use unsigned instead of uint
Erik Faye-Lund [Fri, 23 Jun 2023 12:09:27 +0000 (14:09 +0200)]
cso: use unsigned instead of uint

uint isn't a standard type, just something we accidentally get from some
other headers.

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agodraw: use stdint.h types
Erik Faye-Lund [Fri, 23 Jun 2023 13:29:54 +0000 (15:29 +0200)]
draw: use stdint.h types

Here, we want explicitly sized types, not just types that happen to be
of the right size.

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agodraw: track vertices and vertex_ptr as byte-pointers
Erik Faye-Lund [Fri, 23 Jun 2023 12:52:13 +0000 (14:52 +0200)]
draw: track vertices and vertex_ptr as byte-pointers

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agodraw: use enum for primitive-type
Erik Faye-Lund [Fri, 23 Jun 2023 12:47:19 +0000 (14:47 +0200)]
draw: use enum for primitive-type

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agodraw: use uint32_t instead of uint
Erik Faye-Lund [Fri, 23 Jun 2023 12:36:29 +0000 (14:36 +0200)]
draw: use uint32_t instead of uint

In these cases we actually want uint32_t, because we're doing 32-bit
things to them.

The hwinfo-bit is only being used by i915, and should probably be
moved to i915 instead. But it should *also* be converted, so let's do
that now.

While we're at it, fixup the bit-setting as well.
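
A minimal sketch of setting a bit in an explicitly 32-bit mask. The helper name `set_bit32` is hypothetical, and whether this matches the bit-setting fix-up made in this commit is an assumption; it only illustrates why an unsigned 32-bit constant is the natural choice when doing 32-bit things:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sets one bit in a 32-bit mask. UINT32_C(1) keeps the
 * shift unsigned, so setting bit 31 is well defined. */
static uint32_t
set_bit32(uint32_t mask, unsigned bit)
{
   assert(bit < 32);
   return mask | (UINT32_C(1) << bit);
}
```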

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agodraw: use enum for tgsi-semantic
Erik Faye-Lund [Fri, 23 Jun 2023 12:17:28 +0000 (14:17 +0200)]
draw: use enum for tgsi-semantic

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agocso: use enum for render-conditions
Erik Faye-Lund [Fri, 23 Jun 2023 12:07:51 +0000 (14:07 +0200)]
cso: use enum for render-conditions

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23833>

16 months agoradv/amdgpu: add a helper to get a new IB
Samuel Pitoiset [Wed, 21 Jun 2023 11:45:01 +0000 (13:45 +0200)]
radv/amdgpu: add a helper to get a new IB

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agoradv/amdgpu: rename old_ib_buffers to ib_buffers
Samuel Pitoiset [Wed, 21 Jun 2023 11:44:37 +0000 (13:44 +0200)]
radv/amdgpu: rename old_ib_buffers to ib_buffers

No need to prefix with 'old' actually because this is just an array
of IB buffers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agoradv/amdgpu: use cs_finalize() when growing a CS
Samuel Pitoiset [Wed, 21 Jun 2023 11:44:09 +0000 (13:44 +0200)]
radv/amdgpu: use cs_finalize() when growing a CS

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agoradv/amdgpu: use the array of IB buffers for the chained IB path
Samuel Pitoiset [Wed, 21 Jun 2023 11:43:44 +0000 (13:43 +0200)]
radv/amdgpu: use the array of IB buffers for the chained IB path

For executing IBs on the compute queue (i.e. where IB2 isn't supported), we
will need to break chaining; this is a first step towards that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agoradv/amdgpu: do not set the IB size when ending a CS with RADV_DEBUG=noibs
Samuel Pitoiset [Wed, 21 Jun 2023 11:43:22 +0000 (13:43 +0200)]
radv/amdgpu: do not set the IB size when ending a CS with RADV_DEBUG=noibs

This was only necessary for preambles/postambles, let's clarify this
by determining the IB info from the first IB in the array instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agoradv/amdgpu: rework growing a CS with the chained IB path slightly
Samuel Pitoiset [Wed, 21 Jun 2023 11:43:01 +0000 (13:43 +0200)]
radv/amdgpu: rework growing a CS with the chained IB path slightly

This should allow us to use cs_finalize().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agoradv/amdgpu: use the correct IB size when growing a CS with RADV_DEBUG=noibs
Samuel Pitoiset [Wed, 21 Jun 2023 11:42:41 +0000 (13:42 +0200)]
radv/amdgpu: use the correct IB size when growing a CS with RADV_DEBUG=noibs

The current IB size is copied when radv_amdgpu_cs_add_old_ib_buffer()
is called, which might not be the real IB size because we might still
pad the CS with NOP packets after.
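
A minimal sketch of the ordering issue, with a hypothetical `cs_sketch` struct and `finish_ib_sketch` helper that are not the winsys code. The alignment value and the one-dword NOP are assumptions; the point is only that the size must be recorded after padding:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command stream model (assumption). */
struct cs_sketch {
   uint32_t cdw;          /* dwords written so far */
   uint32_t recorded_cdw; /* size stored for submission */
};

static void
finish_ib_sketch(struct cs_sketch *cs, uint32_t align_dw)
{
   assert(cs != NULL && align_dw > 0);
   /* Pad first: each NOP is modelled as a single dword. */
   while (cs->cdw % align_dw != 0)
      cs->cdw++;
   /* Record the size only after padding, so it is the real IB size. */
   cs->recorded_cdw = cs->cdw;
}
```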

Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23727>

16 months agopvr: Advance entry pointer in pvr_setup_vertex_buffers()
Matt Coster [Thu, 22 Jun 2023 08:59:48 +0000 (09:59 +0100)]
pvr: Advance entry pointer in pvr_setup_vertex_buffers()

Fixes: dEQP-VK.robustness.robustness1_vertex_access
  .out_of_bounds_stride_0
  .out_of_bounds_stride_16_single_buffer
  .out_of_bounds_stride_30_middle_of_buffer
  .out_of_bounds_stride_8_middle_of_buffer_separate

Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23834>

16 months agocompiler: Allow the explicit_stride of aoa types to be zero
Corentin Noël [Wed, 14 Jun 2023 15:00:22 +0000 (17:00 +0200)]
compiler: Allow the explicit_stride of aoa types to be zero

The explicit stride doesn't have to be defined for arrays of arrays (aoa) and
can therefore be zero in some cases, like in arrays of arrays of uniform blocks.

Resolves crash with spec@arb_gl_spirv@execution@ubo@aoa-2.shader_test piglit test for virgl.

Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23648>

16 months agoanv: fix to set predicted weight tables correctly.
Hyunjun Ko [Fri, 16 Jun 2023 05:54:21 +0000 (14:54 +0900)]
anv: fix to set predicted weight tables correctly.

Fixes: 8d519eb5f ("anv: add initial video decode support for h265")
Closes: mesa/mesa#9214

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23790>

16 months agointel/genxml: changes the type for predicted weight to unsigned.
Hyunjun Ko [Fri, 16 Jun 2023 05:50:24 +0000 (14:50 +0900)]
intel/genxml: changes the type for predicted weight to unsigned.

Turned out to be unsigned here after some experiments.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23790>

16 months agovulkan/video: keep delta weight and offsets of predicted weight tables in h265 slice...
Hyunjun Ko [Fri, 16 Jun 2023 05:40:23 +0000 (14:40 +0900)]
vulkan/video: keep delta weight and offsets of predicted weight tables in h265 slice parsing

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23790>

16 months agovulkan: Update XML and headers to 1.3.255
Caio Oliveira [Fri, 23 Jun 2023 17:59:54 +0000 (10:59 -0700)]
vulkan: Update XML and headers to 1.3.255

Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23837>

16 months agovulkan: Add NV suffix to VK_NV_cooperative_matrix feature names
Caio Oliveira [Wed, 17 May 2023 18:35:55 +0000 (11:35 -0700)]
vulkan: Add NV suffix to VK_NV_cooperative_matrix feature names

In the new Vulkan Headers, VK_KHR_cooperative_matrix gets added and the feature
names are the same.

Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23837>

16 months agorusticl/program: skip linking compiled binaries
Karol Herbst [Sat, 24 Jun 2023 22:41:00 +0000 (00:41 +0200)]
rusticl/program: skip linking compiled binaries

Applications can do their own caching, but are in any case required to
properly "compile" the binaries via clBuildProgram or clCompileProgram +
clLinkProgram.

In any case, there is no point building something if we already have the
result.

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23847>

16 months agorusticl: bump bindgen requirement
Karol Herbst [Fri, 23 Jun 2023 20:11:16 +0000 (22:11 +0200)]
rusticl: bump bindgen requirement

Apparently on some ARM systems any older bindgen version crashes.

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23840>

16 months agonir: Add function nir_function_set_impl
Yonggang Luo [Fri, 23 Jun 2023 03:57:47 +0000 (11:57 +0800)]
nir: Add function nir_function_set_impl

This function is added to create a strong relationship between
nir_function_impl and nir_function, so that
nir_function->impl->function == nir_function is always true when
(nir_function->impl != NULL && nir_function->impl != NIR_SERIALIZE_FUNC_HAS_IMPL).

This invariant is already checked by the functions validate_function and
validate_function_impl of nir_validate.
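
A minimal sketch of a setter that establishes this invariant, using hypothetical stand-in structs (`function_sketch`, `function_impl_sketch`) rather than the real NIR types. The NIR_SERIALIZE_FUNC_HAS_IMPL sentinel case is left out of the sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct function_sketch;

/* Hypothetical stand-ins for nir_function_impl and nir_function. */
struct function_impl_sketch {
   struct function_sketch *function; /* back-pointer to the owner */
};

struct function_sketch {
   struct function_impl_sketch *impl;
};

/* Sets both pointers together so they cannot get out of sync. */
static void
function_set_impl_sketch(struct function_sketch *func,
                         struct function_impl_sketch *impl)
{
   assert(func != NULL && impl != NULL);
   func->impl = impl;
   impl->function = func;
}

/* The invariant: either there is no impl, or it points back at func. */
static bool
function_is_consistent(const struct function_sketch *func)
{
   return func->impl == NULL || func->impl->function == func;
}
```

Routing every assignment through one setter means callers cannot update one pointer and forget the other.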

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23820>

16 months agovtn: Do not assign main_entry_point->impl twice
Yonggang Luo [Fri, 23 Jun 2023 03:52:07 +0000 (11:52 +0800)]
vtn: Do not assign main_entry_point->impl twice

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23820>

16 months agodraw: Update the comment and function name to match the type
Yonggang Luo [Sat, 24 Jun 2023 05:42:10 +0000 (13:42 +0800)]
draw: Update the comment and function name to match the type

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23845>

16 months agodraw: Replace usage of ubyte/ushort/uint with uint8_t/uint16_t/uint32_t in draw_pt_vs...
Yonggang Luo [Sat, 24 Jun 2023 05:38:18 +0000 (13:38 +0800)]
draw: Replace usage of ubyte/ushort/uint with uint8_t/uint16_t/uint32_t in draw_pt_vsplit.c

This cannot be done with tools, so do it manually.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23845>

16 months agodraw: Replace usage of boolean/TRUE/FALSE with bool/true/false in draw_pt_vsplit*
Yonggang Luo [Thu, 22 Jun 2023 10:38:10 +0000 (18:38 +0800)]
draw: Replace usage of boolean/TRUE/FALSE with bool/true/false in draw_pt_vsplit*

These changes cannot be done with tools, so do them manually.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23845>

16 months agorusticl/mesa: create proper build-id hash for the disk cache
Karol Herbst [Tue, 28 Feb 2023 23:39:25 +0000 (00:39 +0100)]
rusticl/mesa: create proper build-id hash for the disk cache

Without generating a proper timestamp for the disk cache, we pull old
binaries out of the disk cache, which are potentially buggy or simply
outdated.

Once meson 1.2 lands we can easily pull in LLVM functions.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>

16 months agorusticl/meson: extract common bindgen rust args
Karol Herbst [Sat, 24 Jun 2023 08:59:16 +0000 (10:59 +0200)]
rusticl/meson: extract common bindgen rust args

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>

16 months agorusticl: generate bindings for build-id stuff
Karol Herbst [Tue, 28 Feb 2023 23:38:50 +0000 (00:38 +0100)]
rusticl: generate bindings for build-id stuff

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>

16 months agorusticl: structurize and reorder mesa binding args
Karol Herbst [Tue, 28 Feb 2023 21:22:18 +0000 (22:22 +0100)]
rusticl: structurize and reorder mesa binding args

It became quite a mess, I had enough 🙃

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21612>

16 months agov3dv: replace boolean and uint with bool and size_t
Eric Engestrom [Thu, 22 Jun 2023 11:12:44 +0000 (12:12 +0100)]
v3dv: replace boolean and uint with bool and size_t

There's no reason to use the gallium `p_compiler.h` types in vulkan code.

Inspired by https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23577,
but using `size_t` for `ulist_data_size` because its two users are
`blob_read_bytes()` and `memcpy()`, both of which expect a `size_t`.

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23795>

16 months agodocs/coding-style: add pre-commit hook fallback for clang-format
Eric Engestrom [Mon, 19 Jun 2023 22:46:26 +0000 (23:46 +0100)]
docs/coding-style: add pre-commit hook fallback for clang-format

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23722>

16 months agodocs/coding-style: add example emacs config for clang-format
Eric Engestrom [Mon, 19 Jun 2023 11:42:26 +0000 (12:42 +0100)]
docs/coding-style: add example emacs config for clang-format

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23722>

16 months agodocs/coding-style: add example vim config for clang-format
Eric Engestrom [Mon, 19 Jun 2023 11:09:41 +0000 (12:09 +0100)]
docs/coding-style: add example vim config for clang-format

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23722>

16 months agor300: properly count maximum used register index
Pavel Ondračka [Fri, 23 Jun 2023 19:14:43 +0000 (21:14 +0200)]
r300: properly count maximum used register index

The problem is when we have a DP2 or DP3 instruction that writes a w
channel, like here:

DP3 temp[148].w, -temp[147].xyz_, temp[57].xyz_;

will get pair-converted to

src0.xyz = temp[147], src1.xyz = temp[57]
DP3, -src0.xyz, src1.xyz
DP3 temp[148].w, -src0._, src0._

where the alpha instruction is basically just a replicate of the
result from the RGB sub-instruction. However, the destination register
index in the RGB slot is also 148. Now we pair-schedule and regalloc

src0.xyz = temp[13], src1.xyz = temp[3]
DP3, -src0.xyz, src1.xyz
DP3 temp[3].w, -src0._, src0._

We properly regalloc the alpha channel, but we obviously skip the rgb,
because the writemask is empty there. However when we emit the shader
later, we actually check the number of used regs based on the maximum
used register index and we don't consider the writemasks, so we would
think we use 149 temps. AFAIK the shader would still be completely OK.
But we would think it hits the HW limits and use a dummy one instead.

Fix this by checking for empty writemasks when marking the registers as
used.
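
A minimal sketch of the fixed counting rule. `dst_sketch` and `count_used_temps_sketch` are hypothetical and are not the r300 compiler's real data structures; only destination registers are modelled, and sources are left out for brevity:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical destination operand: register index plus writemask bits. */
struct dst_sketch {
   unsigned index;
   unsigned writemask; /* 0 means nothing is written */
};

/* Returns the number of temps in use, i.e. the highest written index + 1.
 * Destinations with an empty writemask are skipped. */
static unsigned
count_used_temps_sketch(const struct dst_sketch *dsts, size_t count)
{
   assert(dsts != NULL || count == 0);
   unsigned used = 0;

   for (size_t i = 0; i < count; i++) {
      if (dsts[i].writemask == 0)
         continue; /* empty writemask: the register is not really used */
      if (dsts[i].index + 1 > used)
         used = dsts[i].index + 1;
   }
   return used;
}
```

With an RGB slot at index 148 and an empty writemask, plus an alpha slot at index 3 writing w, this sketch returns 4 rather than 149.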

GAINED: shaders/glmark/1-22.shader_test FS

This is also needed to prevent another lost Trine shader from
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089

Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23838>

16 months agoanv: Only expose video decode bits with KHR_video_decode_queue
Matt Turner [Wed, 21 Jun 2023 16:55:59 +0000 (12:55 -0400)]
anv: Only expose video decode bits with KHR_video_decode_queue

This fixes dEQP-VK.api.info.format_properties.g8_b8r8_2plane_420_unorm
in combination with the CTS fix from
https://gerrit.khronos.org/c/vk-gl-cts/+/12191

Fixes: 93614817806 ("anv: add video format features for the one supported video output format")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8263
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23776>

16 months agoanv: Pipe anv_physical_device to anv_get_image_format_features2
Matt Turner [Wed, 21 Jun 2023 16:54:42 +0000 (12:54 -0400)]
anv: Pipe anv_physical_device to anv_get_image_format_features2

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23776>

16 months agonv50/ir/nir: set numBarriers if we emit an OP_BAR
Karol Herbst [Tue, 20 Jun 2023 15:47:48 +0000 (17:47 +0200)]
nv50/ir/nir: set numBarriers if we emit an OP_BAR

Even though the field is called `numBarriers` we set it to 1 just like
we do with TGSI. It's unknown what the proper behavior here is. But
without this set the GPU will complain to us loudly, so this silences at
least that.

Fixes: a2d7a4f9788 ("nv50/ir: convert to scoped_barrier")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23749>

16 months agonvc0: fix printing shaders
Karol Herbst [Tue, 20 Jun 2023 15:28:29 +0000 (17:28 +0200)]
nvc0: fix printing shaders

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23749>

16 months agorusticl/program: add debugging option to disable SPIR-V validation
Karol Herbst [Fri, 23 Jun 2023 00:05:38 +0000 (02:05 +0200)]
rusticl/program: add debugging option to disable SPIR-V validation

This is useful for running applications known to pass in invalid SPIR-V.

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23818>

16 months agorusticl/program: add debugging for OpenCL C compilation
Karol Herbst [Tue, 30 May 2023 10:52:29 +0000 (12:52 +0200)]
rusticl/program: add debugging for OpenCL C compilation

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23818>

16 months agodocs: document CLC_DEBUG
Karol Herbst [Fri, 23 Jun 2023 00:11:34 +0000 (02:11 +0200)]
docs: document CLC_DEBUG

Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23818>

16 months agointel: Initialize FF_MODE2 on all Gfx12 platforms
Kenneth Graunke [Thu, 22 Jun 2023 22:59:31 +0000 (15:59 -0700)]
intel: Initialize FF_MODE2 on all Gfx12 platforms

On Alchemist, the FF_MODE2 documentation says that we must set the
FF_MODE2 timer values for GS and HS to 224.  The hardware performance
tuning guide also recommends setting the TDS timer to 4.

On Tigerlake, i915 applies workarounds to set the GS timer to 224
(failing to do so can cause HS/DS unit hangs), and the TDS timer to 4
(for performance).  It doesn't currently apply a HS timer there, and
I'm not sure if it's strictly necessary, but given that Alchemist
needed it, and the other two settings matched, let's assume that it
ought to match as well.

Unfortunately, there has been a bug in the i915 workarounds
infrastructure for non-masked context registers where writing one
field of the register zeroes out all the others.  So, I believe the
Tigerlake TDS timer value of 4 isn't being applied correctly there,
though the register is also not readable on that platform which
makes it hard to verify.  So, this may also speed up tessellation.
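
To illustrate the failure mode described above, here is a small sketch.
The field positions and helper names are hypothetical (the real FF_MODE2
layout is not reproduced here), and this is not the i915 or Mesa code.
It only contrasts a write that replaces the whole register with one that
touches just the masked bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define GS_TIMER_SHIFT  24
#define GS_TIMER_MASK   (0xffu << GS_TIMER_SHIFT)
#define TDS_TIMER_SHIFT 16
#define TDS_TIMER_MASK  (0xffu << TDS_TIMER_SHIFT)

/* Pack one field value into its position. */
static uint32_t field(uint32_t mask, unsigned shift, uint32_t value)
{
    assert(((value << shift) & ~mask) == 0); /* value must fit the field */
    return value << shift;
}

/* Buggy pattern: every field write replaces the whole register. */
static uint32_t write_clobbering(uint32_t old, uint32_t mask,
                                 unsigned shift, uint32_t value)
{
    (void)old; /* previous contents are lost */
    return field(mask, shift, value);
}

/* Safer pattern: only the bits covered by the mask change. */
static uint32_t write_preserving(uint32_t old, uint32_t mask,
                                 unsigned shift, uint32_t value)
{
    return (old & ~mask) | field(mask, shift, value);
}
```

Writing TDS = 4 and then GS = 224 with the clobbering variant leaves the
TDS field at zero. Since the register is not readable on every platform,
the `old` value in the preserving variant would have to come from a
software copy, or all fields would have to be composed into one write.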

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9233
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23839>

16 months agointel/gfx12.5: Enable L3 partial write merging for compressible surfaces among other...
Francisco Jerez [Wed, 21 Jun 2023 01:12:12 +0000 (18:12 -0700)]
intel/gfx12.5: Enable L3 partial write merging for compressible surfaces among other cases.

This enables L3 partial write merging for a number of cases that seem
to be getting accidentally disabled by the kernel, which was causing a
serious performance bottleneck on DG2 and MTL platforms.  The
"Compressible Partial Write Merge Enable", "Coherent Partial Write
Merge Enable" and "Cross-Tile Partial Write Merge Enable" bits in
L3SQCREG5 were expected to be enabled by default (and confusingly,
they even read off as enabled if you ran 'intel_reg read 0xb158' on an
idle system), but they are getting clobbered during 3D context
initialization by an i915 workaround.

Enabling L3 partial write merging of compressible surfaces in
particular seems to increase rendering fillrate by over 3x in some
cases (e.g. the
"VulkanFillRate/FillRateGPU/resolution:1[0-3]/format:*/blend:0"
fillrate-bound microbenchmarks).  Significant improvements can also be
reproduced in most real-world workloads we've tested so far,
e.g. Counter Strike GO improves by ~11%, Shadow Of the Tomb Raider
improves by ~5.5%, and AztecRuins-VK improves by ~6.5% on DG2-512 --
Thanks a lot to Caleb Callaway for these figures.  No regressions have
been observed so far.

Even though this patch might strike one as surprisingly simple for such a
large payoff, it's the result of Felix DeGrood and me trying to
root-cause the rendering performance gap of DG2 on Linux vs Windows on
and off during the last year, and some of the OA statistics captured
by Felix early this month were greatly helpful for me to connect the
last few dots, so Felix deserves a big chunk of the credit for this
work.

Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783>

16 months agoci/fastboot: use gzipped Image to avoid compressing on the runner
David Heidelberg [Mon, 12 Jun 2023 02:08:10 +0000 (04:08 +0200)]
ci/fastboot: use gzipped Image to avoid compressing on the runner

Faster download, one less step. Win-win.

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23816>

16 months agofrontends/va: fix some coverity scan reported issues
Thong Thai [Mon, 12 Jun 2023 14:38:25 +0000 (10:38 -0400)]
frontends/va: fix some coverity scan reported issues

Added some checks for NULL pointer dereferencing and loop bounds.
v2: Use ARRAY_SIZE instead of magic numbers (@jenatali)

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23598>

16 months agomeson: Explicitly add "check : false" to a couple instances of run_command
Caio Oliveira [Fri, 23 Jun 2023 04:37:29 +0000 (21:37 -0700)]
meson: Explicitly add "check : false" to a couple instances of run_command

In both cases there's code right after the execution to check the result and
give a proper message.

This gets rid of meson warning

```
WARNING: You should add the boolean check kwarg to the run_command call.
         It currently defaults to false,
         but it will default to true in future releases of meson.
         See also: https://github.com/mesonbuild/meson/issues/9300
```

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23821>

16 months agoamd/drm-shim: use fixed-width types
Rhys Perry [Mon, 19 Jun 2023 12:00:19 +0000 (13:00 +0100)]
amd/drm-shim: use fixed-width types

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9221
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23725>

16 months agoagx: Implement vector live range splitting
Alyssa Rosenzweig [Fri, 23 Sep 2022 21:05:59 +0000 (17:05 -0400)]
agx: Implement vector live range splitting

The SSA killer feature is that, under an "optimal" allocator, the number of
registers used (register demand) is *equal* to the number of registers required
(register pressure, the maximum number of variables simultaneously live at any
point in the program). I put "optimal" in scare quotes, because we don't need to
use the exact minimum number of registers as long as we don't sacrifice thread
count or introduce spilling, and using a few extra registers when possible can
help coalesce moves. Details-shmetails.

The problem is that, prior to this commit, our register allocator was not
well-behaved in certain circumstances, and would require an arbitrarily large
number of registers. In particular, since different variables have different
sizes and require contiguous allocation, in large programs the register file may
become fragmented, causing the RA to use arbitrarily many registers despite
having lots of registers free.

The solution is vector live range splitting. First, we calculate the register
pressure (the minimum number of registers that it is theoretically possible to
allocate successfully), and round up to the maximum number of registers we will
actually use (to give some wiggle room to coalesce moves). Then, we will treat
this maximum as a *bound*, requiring that we don't use more registers than
chosen. In the event that register file fragmentation prevents us from finding a
contiguous sequence of registers to allocate a variable, rather than giving up
or using registers we don't have, we shuffle the register file around
(defragmenting it) to make room for the new variable. That lets us use a
few moves to avoid sacrificing thread count or introducing spilling, which is
usually a great choice.
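
A toy sketch of the fallback path, under loud assumptions: the register
file is a small fixed array, MAX_REGS stands in for the chosen bound,
alignment is ignored, and "defragmenting" is a naive compaction toward
register 0. The real allocator is more selective about what it moves;
none of these names come from the AGX compiler.

```c
#include <assert.h>

#define MAX_REGS 16 /* hypothetical bound derived from register pressure */

/* owner[i] is the id of the variable in register i, or 0 if free. */
struct reg_file {
    int owner[MAX_REGS];
};

/* Find `size` contiguous free registers; return the base or -1. */
static int find_contiguous(const struct reg_file *rf, unsigned size)
{
    unsigned run = 0;

    assert(size >= 1 && size <= MAX_REGS);
    for (unsigned i = 0; i < MAX_REGS; i++) {
        run = (rf->owner[i] == 0) ? run + 1 : 0;
        if (run == size)
            return (int)(i + 1 - size);
    }
    return -1;
}

/* Compact live registers toward index 0, preserving their order so
 * multi-register variables stay contiguous. Returns registers moved. */
static unsigned defragment(struct reg_file *rf)
{
    unsigned dst = 0, moves = 0;

    for (unsigned src = 0; src < MAX_REGS; src++) {
        if (rf->owner[src] == 0)
            continue;
        if (src != dst) {
            rf->owner[dst] = rf->owner[src];
            rf->owner[src] = 0;
            moves++;
        }
        dst++;
    }
    return moves;
}

/* Allocate a vector within the bound, shuffling only when needed. */
static int alloc_vector(struct reg_file *rf, int id, unsigned size)
{
    int base = find_contiguous(rf, size);

    if (base < 0) {
        defragment(rf);
        base = find_contiguous(rf, size);
    }
    if (base < 0)
        return -1; /* demand really exceeds the bound */
    for (unsigned i = 0; i < size; i++)
        rf->owner[base + i] = id;
    return base;
}
```

With every other register occupied, half of the file is free but no two
free registers are adjacent; a 4-wide vector only fits after the
compaction step.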

Android GLES3.1 shader-db results are as expected: some noise / small
regressions for instruction count, but a bunch of shaders with improved thread
count. The massive increase in register demand may seem weird, but this is the
RA doing exactly what it's supposed to: using more registers if and only if they
would not hurt thread count. Notice that no programs whatsoever are hurt for
thread count, which is the salient part.

   total instructions in shared programs: 1781473 -> 1781574 (<.01%)
   instructions in affected programs: 276268 -> 276369 (0.04%)
   helped: 1074
   HURT: 463
   Inconclusive result (value mean confidence interval includes 0).

   total bytes in shared programs: 12196640 -> 12201670 (0.04%)
   bytes in affected programs: 1987322 -> 1992352 (0.25%)
   helped: 1060
   HURT: 513
   Bytes are HURT.

   total halfregs in shared programs: 488755 -> 529651 (8.37%)
   halfregs in affected programs: 295651 -> 336547 (13.83%)
   helped: 358
   HURT: 9737
   Halfregs are HURT.

   total threads in shared programs: 18875008 -> 18885440 (0.06%)
   threads in affected programs: 64576 -> 75008 (16.15%)
   helped: 82
   HURT: 0
   Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx/lower_parallel_copy: Lower 64-bit copies
Alyssa Rosenzweig [Sun, 5 Mar 2023 00:52:32 +0000 (19:52 -0500)]
agx/lower_parallel_copy: Lower 64-bit copies

To 32-bit. This way we don't get into bad situations where we need to e.g. swap
unaligned 64-bit values or something funny like that.
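
Roughly, the lowering can be pictured as below. This is a sketch with a
hypothetical copy struct, and it assumes register indices count 16-bit
units (so a 32-bit half sits 2 units further along); it is not the
agx_lower_parallel_copy code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy descriptor; indices are in 16-bit register units. */
struct copy {
    unsigned dest;
    unsigned src;
    unsigned size_bits; /* 16, 32 or 64 */
};

/* Rewrite every 64-bit copy as two 32-bit copies. `out` must have room
 * for 2 * n entries. Returns the number of copies written. */
static size_t lower_64bit_copies(const struct copy *in, size_t n,
                                 struct copy *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        assert(in[i].size_bits == 16 || in[i].size_bits == 32 ||
               in[i].size_bits == 64);
        if (in[i].size_bits != 64) {
            out[count++] = in[i];
            continue;
        }
        /* Low half, then high half 2 units (32 bits) further along. */
        out[count++] = (struct copy){ in[i].dest, in[i].src, 32 };
        out[count++] = (struct copy){ in[i].dest + 2, in[i].src + 2, 32 };
    }
    return count;
}
```

After this step, whatever sequentializes the parallel copy only ever
sees 16-bit and 32-bit entries.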

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx: Validate predecessor information
Alyssa Rosenzweig [Mon, 15 May 2023 21:17:27 +0000 (17:17 -0400)]
agx: Validate predecessor information

Including the new loop header? flag.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx: Add loop header? flag
Alyssa Rosenzweig [Mon, 15 May 2023 21:17:05 +0000 (17:17 -0400)]
agx: Add loop header? flag

This is useful for deciding whether we need to fix up phis in RA.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx: Recollect stored vectors at their use
Alyssa Rosenzweig [Wed, 15 Feb 2023 04:58:54 +0000 (23:58 -0500)]
agx: Recollect stored vectors at their use

This is Timur's cheesy solution to split-hell.shader_test. Seems to work ok
here.

Before: 94 inst, 588 bytes, 165 halfregs, 1 threads, 0 loops, 0:0 spills:fills
After: 63 inst, 454 bytes, 129 halfregs, 1 threads, 0 loops, 0:0 spills:fills

On Android GLES3.1 shader-db, a few shaders are helped a lot:

   total instructions in shared programs: 1781706 -> 1781473 (-0.01%)
   instructions in affected programs: 4284 -> 4051 (-5.44%)
   helped: 16
   HURT: 2
   Instructions are helped.

   total bytes in shared programs: 12197854 -> 12196640 (<.01%)
   bytes in affected programs: 29526 -> 28312 (-4.11%)
   helped: 20
   HURT: 2
   Bytes are helped.

   total halfregs in shared programs: 489007 -> 488755 (-0.05%)
   halfregs in affected programs: 945 -> 693 (-26.67%)
   helped: 7
   HURT: 0
   Halfregs are helped.

   total threads in shared programs: 18873216 -> 18875008 (<.01%)
   threads in affected programs: 5376 -> 7168 (33.33%)
   helped: 7
   HURT: 0
   Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx: Extract coordinate register size calculation
Alyssa Rosenzweig [Fri, 19 May 2023 17:07:25 +0000 (13:07 -0400)]
agx: Extract coordinate register size calculation

It will be used for image writes too, not just reads.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoasahi: Pass through surface sample count
Asahi Lina [Wed, 14 Jun 2023 17:49:25 +0000 (02:49 +0900)]
asahi: Pass through surface sample count

This makes PIPE_CAP_SURFACE_SAMPLE_COUNT do something, namely, explode
with lots of fireworks. We'll have to figure out what's wrong, but at
least now we aren't just not trying at all. Should not break anything as
long as PIPE_CAP_SURFACE_SAMPLE_COUNT is not flipped on.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoasahi: Disable PIPE_CAP_SURFACE_SAMPLE_COUNT
Asahi Lina [Wed, 14 Jun 2023 17:48:48 +0000 (02:48 +0900)]
asahi: Disable PIPE_CAP_SURFACE_SAMPLE_COUNT

This never worked, disable it.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoasahi: Revert "Advertise ARB_texture_barrier"
Asahi Lina [Wed, 14 Jun 2023 08:09:15 +0000 (17:09 +0900)]
asahi: Revert "Advertise ARB_texture_barrier"

This reverts commit 9e67d3f23780a818b9fc764105f39c6d595c6530.

We do not, in fact, implement texture barriers. Texture barriers are
supposed to allow non-overlapping rendering feedback loops. We cannot
support that at non-tile boundaries when texture compression is enabled
without some kind of downgrade path or other special handling.

Fixes Emacs corruption on X/Glamor.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx: Fix discards
Alyssa Rosenzweig [Wed, 14 Jun 2023 16:34:34 +0000 (12:34 -0400)]
agx: Fix discards

Switch our frontends from generating sample_mask_agx to discard_agx, and
switch from legalizing sample_mask_agx to lowering discard_agx to
sample_mask_agx. This is a much easier problem and is done here in a way that is
simple (and inefficient) but obviously correct.

This should fix corruption in Darwinia.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoagx: Update explanation of sample_mask behaviour
Alyssa Rosenzweig [Wed, 14 Jun 2023 15:46:01 +0000 (11:46 -0400)]
agx: Update explanation of sample_mask behaviour

We discovered today that these (probably) trigger depth/stencil testing, which
has significant implications for the correct/performant use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agonir: Add discard_agx intrinsic
Alyssa Rosenzweig [Wed, 14 Jun 2023 16:32:24 +0000 (12:32 -0400)]
nir: Add discard_agx intrinsic

sample_mask_agx corresponds directly to the hardware's 2-source instruction, but
it's hard to use correctly and even harder to legalize after the fact, since
it's responsible for not only discard but also late depth/stencil testing. For
our various high-level lowering passes, it's easier to use a one-source discard
(where we don't have to worry about sample masks), which the compiler will
internally lower to the two-source instruction. Introduce such an instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>

16 months agoradv: adjust alignment of the preprocess buffer with DGC
Samuel Pitoiset [Wed, 21 Jun 2023 11:07:34 +0000 (13:07 +0200)]
radv: adjust alignment of the preprocess buffer with DGC

The preprocess buffer is the buffer used to generate the cmdbuf. It
was aligned to 256 bytes but the correct alignment is actually
ac_gpu_info::ib_alignment.

Otherwise, if a DGC IB is executed like an IB1, this hits an assertion
in radv_amdgpu_cs_submit() because the alignment is incorrect.
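
As a sketch of why a fixed 256 is not enough: the struct below is a
hypothetical stand-in for the relevant field of ac_gpu_info, the helper
names are made up, and the alignment value used in the note is invented
for illustration. This is not the radv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of ac_gpu_info. */
struct gpu_info_sketch {
    uint32_t ib_alignment; /* required IB alignment in bytes, power of two */
};

/* Round `value` up to the next multiple of a power-of-two alignment. */
static uint64_t align_up(uint64_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~((uint64_t)alignment - 1);
}

static bool is_aligned(uint64_t value, uint32_t alignment)
{
    return align_up(value, alignment) == value;
}

/* Place the preprocess buffer using the device's IB alignment. */
static uint64_t place_preprocess_buffer(const struct gpu_info_sketch *info,
                                        uint64_t offset)
{
    return align_up(offset, info->ib_alignment);
}
```

If ib_alignment were 1024, an offset of 300 rounded up to 256 lands at
512, which would fail a submit-time alignment check, while rounding up
to ib_alignment lands at 1024.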

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23764>

16 months agoradv: only dirty the active push constant stages with DGC
Samuel Pitoiset [Wed, 21 Jun 2023 07:14:44 +0000 (09:14 +0200)]
radv: only dirty the active push constant stages with DGC

It's unnecessary to dirty all stages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23761>

16 months agoradv: only dirty the index type when necessary with DGC
Samuel Pitoiset [Wed, 21 Jun 2023 07:09:01 +0000 (09:09 +0200)]
radv: only dirty the index type when necessary with DGC

This should only be needed for non-indexed draws and it's already
dirty if the DGC binds an index buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23761>

16 months agoradv/amdgpu: dump all cs with RADV_DEBUG=noibs
Samuel Pitoiset [Wed, 14 Jun 2023 12:01:23 +0000 (14:01 +0200)]
radv/amdgpu: dump all cs with RADV_DEBUG=noibs

It was only dumping the oldest.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23646>

16 months agoradv/amdgpu: fix dumping cs with RADV_DEBUG=noibs
Samuel Pitoiset [Wed, 14 Jun 2023 11:29:02 +0000 (13:29 +0200)]
radv/amdgpu: fix dumping cs with RADV_DEBUG=noibs

The ib_buffer is NULL now.

Fixes: 50e6b16855d ("radv/amdgpu: Use fallback submit for queues that can't use IBs.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23646>

16 months agopvr: Correctly read dynamic state setup during blend constant setup
Matt Coster [Fri, 9 Jun 2023 15:53:59 +0000 (16:53 +0100)]
pvr: Correctly read dynamic state setup during blend constant setup

Somewhat counterintuitively, dynamic_state.set contains the bits that
have been loaded from static state, i.e. those that are _not_ dynamic.
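
In other words, a state is dynamic when its bit is clear. A tiny sketch
of that reading, with a hypothetical bit name and helper rather than the
actual pvr definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit for the blend constants state. */
#define SKETCH_STATE_BLEND_CONSTANTS (1u << 0)

/* `set_mask` holds the states loaded from static pipeline state, so a
 * state is dynamic exactly when its bit is NOT set. */
static bool state_is_dynamic(uint32_t set_mask, uint32_t state_bit)
{
    assert(state_bit != 0);
    return (set_mask & state_bit) == 0;
}
```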

Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23590>

16 months agoradeonsi: disable H264HIGH10 profile
Boyuan Zhang [Fri, 23 Jun 2023 05:02:49 +0000 (01:02 -0400)]
radeonsi: disable H264HIGH10 profile

Issue: H.264 high 10 profile is currently not supported, but is shown as
supported in vainfo.

Reason: Kernel-reported capabilities for video encode/decode don't
consider the actual profile (only using reduced profile).

Solution: Use kernel reported capabilities only for basic H.264/HEVC
profiles. Other profiles (e.g. 10 bits) should be checked based on HW.

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9242

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23824>

16 months agoradv: reserve more space in CS for SQTT
Samuel Pitoiset [Fri, 23 Jun 2023 07:01:32 +0000 (09:01 +0200)]
radv: reserve more space in CS for SQTT

Otherwise, it can hit an assertion.

Fixes: 7893040f807 ("radv: Add stricter space checks.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23826>

16 months agoaco: Drop NIR parallel copy handling
Alyssa Rosenzweig [Fri, 16 Jun 2023 11:30:51 +0000 (07:30 -0400)]
aco: Drop NIR parallel copy handling

Backends never see these instructions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23831>

16 months agoaco: Remove unneeded stage related info fields.
Timur Kristóf [Mon, 12 Jun 2023 13:58:19 +0000 (15:58 +0200)]
aco: Remove unneeded stage related info fields.

Cleanup of various fields with redundant information.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>

16 months agoaco: Use aco_shader_info::hw_stage instead of guessing.
Timur Kristóf [Mon, 12 Jun 2023 13:55:40 +0000 (15:55 +0200)]
aco: Use aco_shader_info::hw_stage instead of guessing.

With this change, ACO is going to rely on the caller to set
the HW stage and will no longer guess it from the input shaders.

This will help enable compiling merged shaders separately,
but that will need further changes in instruction selection.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>