review.tizen.org Git - platform/upstream/mesa.git/log

nir: simplify nir_block_cf_tree_{next|prev}

Removes some case distinction by first checking if this is
the first/last block of a cf_node.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

nir/lower_continue_targets: only repair SSA when necessary

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

nir/lower_continue_constructs: special-case Continue Constructs with zero or one predecessors

If a loop has only a single continue, the control flow is already
converged and we can inline the continue construct.
If a loop has no continue statement at all, the Continue Construct
is unreachable and can simply be deleted.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

spirv: use Loop Continue Construct to emit SPIR-V loops and lower after parsing

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

nir: add lowering for Loop Continue Constructs

This pass lowers Loop Continue Constructs to the previous solution
by inserting it at the beginning of the loop:

loop {
   if (i != 0) {
      continue construct
   }
   loop body
}

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

nir: create nir_push_continue() and related helpers

nir_control_flow.h:
void nir_loop_add_continue_construct(nir_loop *loop);
void nir_loop_remove_continue_construct(nir_loop *loop);

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

nir: add assertions that loops don't have a Continue Construct

Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.

Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

nir: add Continue Construct to nir_loop

The added continue_list corresponds to the SPIR-V
Continue Construct and serves as a converged control-flow
construct and is executed after each continue statement
and before the next iteration of the loop body.

Also adds validation rules for loops with Continue Construct

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>

glsl: Account for unsized arrays in NIR linker

Follow the same approach as the pre-NIR linker.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5891
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21401>

zink/kopper: Add extra swapchain images for Venus

Together with the previous patch that corrects the number of
swapchain images on Xwayland this gives Zink/Venus a spead
boost in a number of work loads and close the gap or even
surpass VirGL when the benchmark is not GPU bound.
Some numbers:

zink (Virtio-GPU Venus (Host: RADV RENOIR)) / VirGL

Benchmark                   VirGL    baseline  Zink/Venus +1
                                                and Xwayland +1
    ==================================================================
    OpenArena (FPS)            63.8     60.1     148.5
    Unigine Sancuary (FPS)    129.1    121.4     164.7
    Unigine Tropics (FPS)     107.2     85.7     114.3
    Unigine Heaven (FPS)       48.5     48.0      51.5
    Unigine Valley (FPS)       48.0     45.6      47.4
    Xonotic (FPS)              90.5     59.4      89.2
    GpuTest/Volcano (Points)   2960     2966      3013

  zink (Virtio-GPU Venus (Host: Intel Xe TGL GT2)) / VirGL

Benchmark                   VirGL    baseline  Zink/Venus +1
                                                and Xwayland +1
    ===========================================================
    OpenArena (FPS)          95.1       59.8        78.9
    Unigine Sancuary (FPS)   85.5       76.6        81.8
    Unigine Tropics (FPS)    66.0       59.8        62.7
    Unigine Heaven (FPS)     28.8       28.7        28.0
    Unigine Valley (FPS)     29.0       28.0        27.0
    Xonotic (FPS)            64.2       49.4        51.1
    GpuTest/Volcano (Points) 2855       2718        2747

v2: Fix limiting minImageCount (Mike)

Signed-off-by: Gert Wollny <gert.wollny@collabora.co.uk>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21136>

vulkan/wsi: Take Xwayland into account for x11_min_image_count

For wayland we report a minimum of four swapchain images, so for
Xwayland we should report the same.

v2: Fix typo (Eric)
v3: Make that four images on Xwayland (Daniel)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21136>

asahi: Make shader-db work again

We need a nontrivial blend state otherwise the whole frag shader is optimized
out.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21432>

asahi: Implement color masks with masked stores

Blend states can require masking colour. Currently, this is handled by
nir_lower_blend, which lowers masks to a read-modify-write operation as required
on Mali hardware. However, our "tilebuffer store" instruction supports a write
mask, allowing us to write only a subset of channels to the tilebuffer. It's
more efficient to use that than to emit pointless tilebuffer loads.

Note that even without tilebuffer loads, non-opaque masks don't work with opaque
pass types. Here, we handle this with a translucent pass type, which gets HSR
to do the right thing and is consistent with the pass type used previously.
However, it's a bit heavy handed -- Apple manages to use an opaque pass type
with masking but with some unknown HSR fields twiddled. IMO reverse-engineering
those details shouldn't block this because this gets us closer to optimal (just
not all the way there) and is strictly better than what we had before.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>

agx: Add agx_internal_format_supports_mask helper

Not all formats can be masked, add a query to check which can be.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>

agx: Handle ssa_undef as zero

Masked stores may result in undefs after optimization. Rather than call
lower_undef_to_zero late (but get no benefit), we may as well handle ourselves
to prepare for proper undef support down the line.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>

agx: Add and use agx_nir_ssa_index helper

Common subexpression that we'll repeat once more in the next patch.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21431>

radv: enable SQTT tracing on GFX11

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20338>

radv: disable SPM counters with RGP on GFX11

They are likely different and perfcounters aren't defined on GFX11 yet.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20338>

radv: implement a workaround for SQTT on GFX11

Found in AMDVLK, see the comment below for an explanation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20338>

radv: make sure to wait for the trace buffer also on GFX11

Otherwise, we might get incomplete data.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20338>

radv: only enable SQTT for SE0 on GFX11

For weird reasons, the hardware doesn't return any data for other SEs.
RadeonSI is also affected by the same issue, enable only SE0 for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20338>

radv: configure SQ_THREAD_TRACE_CTRL.REG_AT_HWM on GFX11

AMDVLK sets this to 2 when the always stall mode is enabled, which is
the default in RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20338>

util/u_process: implement util_get_command_line for BSDs

Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21052>

winsys/amdgpu: use amdgpu_device_get_fd

If radv is initialized before radeonsi, doing:

aws->fd = fd;

is incorrect because the device was initialized using the fd
passed by radv.

libdrm has a helper to query the fd used to create the device,
so use it.

We also need to init the kms_handles table in this case
because we're going to share BOs between radeonsi's fd and
the device fd.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3424
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20983>

freedreno: check for conditional rendering in launch_grid

fixes: KHR-GL45.compute_shader.conditional-dispatching

Signed-off-by: Amber Amber <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21428>

agx: Handle group_memory_barrier

A combination of control_barrier + memory_barrier but it's always seen with
those. This would be safer with scoped barriers...

Fixes dEQP-GLES31.functional.synchronization.inter_invocation.ssbo

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Implement b2b32

Shows up with store_shared.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Pack local atomics

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Lower shared memory offsets to 16-bit

Per the hardware requirement. This simplifies instruction selection (it avoids
the need to constant fold u2u16 in the backend).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Translate load/store_shared

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Translate NIR atomics

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Pack local load/store instructions

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Pack global atomics

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Disallow immediate bases to device_load

Lina pointed this out in review.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Model local loads/stores

Aka shared memory or threadgroup memory.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

agx: Model atomic instructions

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21326>

iris: Export num_fences()

This function will be needed by i915 and Xe backends.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21389>

iris: Export update_batch_syncobjs()

This function will be needed by i915 and Xe backends.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21389>

iris: Export batch debug functions

Those function will be called by different backends, so exporting it.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21389>

asahi: Advertise ARB_texture_barrier

We already implement it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21264>

asahi: Advertise ARB_derivative_control

Our native fddx instruction is already fine, so it's fine to use it for both
fddx_coarse and fddx_fine. We handle both of those cases already so the
extension is trivial.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21264>

docs/features: Sync Asahi with reality

A few features were either missed in the original patch or have since been
added, update features.txt to light up more green on the mesa matrix.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21264>

agx: Implement gathers (nir_texop_tg4)

Passes dEQP-GLES31.functional.texture.gather.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21264>

agx: Model and pack gathers

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21264>

agx: Lower offsets in NIR

Rather than the backend. This way we can handle non-constant offsets as well as
constants with a single code path (with the constant offset code subsumed as a
special case via NIR's constant folding). This nets us dynamic offset support.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21264>

ci: revert download of git cache to the wget

At this point of CI there is not curl available.

Fixes: 796686af1b37 ("ci: migrate from wget to curl")

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21414>

pvr: Use descriptor/set/table offsets from driver

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Co-Authored-By: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Split pvr_private.h

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Load descriptors from memory

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Support loading immediate values

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Additional register subarray support

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Add bitwise instruction support

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Add memory load support

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Add ADD64 support

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by Frank Binns <frank.binns@imgtec.comr>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Add PVR_SELECT() helper macro

For pvr_setup_descriptor_mappings_new() there will be quite a few
variables of which the value depend on the stage so rather than
having all that selection in the `switch` at the beginning of the
function the helper macro provides a compact selection in the
desired scope.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Write descriptor set addrs table dev addr into shareds

Previously UBOs and various buffers, as well as the native
descriptor sets were DMAed into the shared registers. This added
complexity in allocating the registers and various other places.
We also ended up being in situations were we wouldn't know the size
of a buffer by the time the shaders were being compiled. It would
be possible to determine the size by inspecting the shader but
that would introduce more complexity in the compiler.
To get things working sooner, avoid extra complexity for
now, a different approach was devised.

The driver will write the addresses of the currently bound
descriptor sets into a device buffer. The device buffer is referred
to as the descriptor set addrs table. The dev addr of the table is
written into a shared register. To access the buffers the shader
will first get the address of the descriptor set from the in memory
table. Then get the primary descriptor from the descriptor set. And
finally access the in memory buffer with the address it read from
the descriptor. Essentially there's three level of indirection and
all the buffers are in memory. The shader will know what offset the
primary descriptor is located based on the descriptor set layout.
The descriptor set address could have been written into the shareds
directly but that would require extra handling on the compiler side
so opted to just write the table address instead.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Change last_DMA to last_dma

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Put old descriptor set approach behind a hardcoding check

This commit sets up the infrastructure to introduce the new
descriptor set approach while keeping the old paths so the
hard coded apps are still operational. The old paths will be
removed once the compiler can compiler shaders for those apps
and the driver-compiler interface is fully flushed out.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

pvr: Store enum pvr_stage_allocation instead of VkShaderStageFlags

This commit changes the pipeline layout, desc. set layout,
and desc. set layout binding to keep track of shader stage usage
with a mask of enum pvr_stage_allocation instead of
VkShaderStageFlags.

This commit also makes renames the relevant fields to
'shader_stage_mask' to make the naming uniform across stucts.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21331>

radv/ci: move CI lists for external GPUs in separate folder

A bunch of CI lists are maintained by ourselves with GPUs outside of
Mesa CI. Move them to a separate folder to avoid confusion.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21417>

radv/ci: disable vkcts-kabini-valve

It's no longer reachable.

Suggested-by: Martin Roukala <martin.roukala@mupuf.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21417>

asahi: Fix rendering into mipmapped framebuffers

batch->key.width will be minified, but then the PBE::level field will
incorrectly minify again.

Fixes dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.framebuffer_texture_level

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21400>

agx: Do more work in agx_preprocess_nir

agx_preprocess_nir runs once per shader, whereas agx_optimize_nir runs once per
variant. That means we want to do as much work as possible in agx_preprocess_nir
to make shader variants as cheap as possible to compiler. So, move our standard
suite of lowering and optimizing to the preprocess loop, leaving just a single
(easy) trip through the optimizer for simple variant processing.

Plus, we can remove variables when preprocessing, since we no longer use
variables anywhere. We remove them to reduce the RAM and disk cache footprint of
shader variants.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21104>

agx: Don't treat clip distances specially

We've been using the clip lowering, but it's been broken upstream because of
this artefact from the (non-lowered implementation) sneaking in from downstream.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21104>

asahi: Only apply FS lowerings to fragment shaders

Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21104>

asahi: Move agx_preprocess_nir to CSO create

Now we preprocess shaders once at link time, rather than every time we spawn a
variant. This should reduce variant pain.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21104>

asahi: Lower clip distances late

This pass works either early or late, so run it late. It creates some
nir_variables as a side effect, which is weird, but it doesn't matter because
the AGX backend doesn't look at variables and the metadata and lowered I/O
intrinsics are all correct.

This is the last step to moving I/O lowering (and hence shader preprocessing) to
CSO create time.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21104>

docs/release-calendar: drop the last 22.2.x, it won't happen

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21415>

zink/ci: set RADV_PERFTEST=gpl for RADV jobs

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21329>

zink/ci: skip KHR-GL46.texture_swizzle.functional with RADV

They usually timeout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21329>

ci: uprev vkd3d-proton

This adds test coverage for VK_EXT_image_sliced_view_of_3d.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21384>

v3d: support r{g,gba}16f formats for vertex buffers

These are supported, and in fact we are exposing them through
Vulkan. Makes SuperTuxKart significantly faster in GL, I've
observed an FPS increase from ~100% to ~500% depending on the
track.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21361>

gallium: create query_memory_info implementation for sw drivers

For ATI_meminfo or NVX_gpu_memory_info on llvmpipe and softpipe.

Signed-off-by: Yusuf Khan <yusisamerican@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21373>

intel: Use common helpers for TCS passthrough shaders

Rob added these new helpers a while back, which freedreno and radeonsi
both share.  We should use them too.  The new helpers use variables and
system value intrinsics, so we can drop the explicit binding table
creation and just use the normal paths.

Because we have to rewrite the system value uploading anyway, we drop
the scrambling of the default tessellation levels on upload, and instead
let the compiler go ahead and remap components like any normal shader.
In theory, this results in more shuffling in the shader.  In practice,
we already do MOVs for message setup.  In the passthrough shaders I
looked at, this resulted in no extra instructions on Icelake (SIMD8
SINGLE_PATCH) and Tigerlake (8_PATCH).  On Haswell, one shader grew by
a single instruction for a pittance of cycles in a stage that isn't a
performance bottleneck anyway.  Avoiding remapping wasn't so much of an
optimization as just the way that I originally wrote it.  Not worth it.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20809>

glsl: isolate object macro replacments

Here we use a leading space to isolate them from
the code they will be inserted into. For example:

#define VALUE -1.0
int a = -VALUE;

Should be evaluated to int a = - -1.0; not int a = --1.0;

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7932

Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21352>

glsl: add _token_list_prepend() helper to the parser

This will be used in the following patch.

Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21352>

aco/optimizer: Change v_cmp with subgroup invocation to constant.

When a shader has a comparison with the subgroup invocation id,
we can use a constant instead, saving a VALU instruction.
When the constant can't be represented as a 64-bit literal,
use the s_bfm_b64 instruction to generate it instead, which
is still a win.

Fossil DB stats on GFX11:
Totals from 300 (0.22% of 134913) affected shaders:
CodeSize: 2223052 -> 2214336 (-0.39%); split: -0.43%, +0.04%
Instrs: 430216 -> 429882 (-0.08%); split: -0.14%, +0.06%
Latency: 5881180 -> 5878181 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 731846 -> 729293 (-0.35%)
Copies: 31662 -> 31847 (+0.58%); split: -0.03%, +0.61%
Branches: 8241 -> 8100 (-1.71%)
PreVGPRs: 15788 -> 15786 (-0.01%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20843>

glthread: don't restore non-VBO vertex arrays after all draws

glthread takes care of all uploads, so it's OK to leave uploaded VBOs
bound. The only thing that will be wrong is the bound vertex buffer
returned by glGet, but the only case when that would be wrong is when
an app that doesn't use VBOs queries the current VBO. That never happens.

However, this adds code to unbind all internal VBOs for the case when
glthread is abruptly disabled (e.g. for GL_DEBUG_OUTPUT_SYNCHRONOUS).

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: don't free glthread for GL_DEBUG_OUTPUT_SYNCHRONOUS, only disable it

and enable it when GL_DEBUG_OUTPUT_SYNCHRONOUS is disabled.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: remove unnecessary debug code

_mesa_glthread_destroy won't be called for GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB,
so the "reason" parameter will be useless.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: convert (Multi)DrawIndirect into direct if user buffers are present

so that user buffers are uploaded without syncing.

Now glthread fully handles non-VBO uploads, so that we can disable user
buffer codepaths in st/mesa.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: add API to allow passing DrawID from glthread to mesa

This will be needed for lowering DrawIndirect in glthread, which is
needed if non-VBO vertex arrays are present.

This only adds the drawid parameter in glthread's draw_arrays and
draw_elements functions, and implements where needed.

New GL API functions are added because we want to use separate
DISPATCH_CMD_* enums for draws with DrawID, so that we don't increase
the memory footprint of draws in glthread batches if drawid == 0.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: handle non-VBO uploads for glMultiModeDraw{Arrays,Elements}IBM

This was unimplemented, and this implementation matches exactly what we do
in main/draw.c.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: don't execute Draw and BufferSubData calls if the context is lost

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: inline draw functions that have only one use

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: remove goto statements and add unlikely() into draw functions

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: add ctx->GLThread.draw_always_async to simplify draw checking

This just precomputes 3 terms of the condition to draw asynchronously.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: reorder draw code a little

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: add a vertex upload path that unrolls indices for glDrawElements

u_vbuf does this too. This is the last big missing piece to stop using
u_vbuf.

If the vertex range to upload is much larger than the draw vertex count and
if all attribs are not in VBOs, convert glDrawElements to glBegin/End.

This is a path that makes the Cogs game go from 1 FPS to ~197 FPS. There is
no change in FPS because u_vbuf does this, but it will be disabled.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: track vertex formats for all attributes

We'll need this for a special vertex upload fallback.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: don't bind/unbind uploaded indexbuf, pass it to glMultiDraw directly

MultiDrawElementsUserBuf is changed to mean the same thing as
glMultiDrawElementsBaseVertex, but "gl_buffer_object *index_buffer" is
passed via a parameter instead of using the bound GL_ELEMENT_ARRAY_BUFFER.

This skips binding and unbinding the index buffer around every draw
where glthread uploads indices.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: don't bind/unbind uploaded indexbuf, pass it to glDraw directly

DrawElementsUserBuf is changed to mean the same thing as
glDrawElementsInstancedBaseVertexBaseInstance, but "gl_buffer_object *
index_buffer" is passed via a parameter instead of using the bound
GL_ELEMENT_ARRAY_BUFFER.

This skips binding and unbinding the index buffer around every draw
where glthread uploads indices.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: move some draw call parameters closer to their use

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: don't pass index bounds to the driver for async calls

They are never used with vertex uploads in glthread.
For example, glDrawRangeElements is converted to glDrawElements.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: don't execute glDraw code if we're inside glBegin/End

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: change glMultiDrawElements to execute draw_count < 0 asynchronously

also clean up the conditions.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

glthread: rewrite glMultiDrawArrays to never fail to upload vertices

The main goal is to never fail to upload non-VBO vertex arrays.
When glthread synchronized, it didn't upload vertices, expecting st/mesa
to do that. This keeps the required sync, and then upload vertices
in glthread.

Also, reorder the code and remove goto statements. This is pretty much
a rewrite.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20624>

Revert "ci/zink: Disable Amnesia trace until the linked issue gets fixed."

This reverts commit 2e807a028aa9366be39a4c9445377dbb11e1dcf5.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21039>

glthread: ignore non-VBO vertex arrays with NULL data pointers

This can happen when an attrib is enabled, but the shader doesn't use it,
so it's ignored by mesa/state_tracker, and should be ignored here as well.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8138

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21039>

glthread: add a heuristic to stop locking global mutexes with multiple contexts

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4516
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8035

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21039>