review.tizen.org Git - platform/upstream/mesa.git/log

panfrost: Assert staging resource allocation was successful

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10266>

Revert "glx: Lift sending the MakeCurrent request to top-level code"

This provokes crashes in Cinnamon for some reason that I haven't
diagnosed yet.

This reverts commit 80b67a3b444f31462890a8e390650fa77c4d2010.

Fixes: 80b67a3b444 glx: Lift sending the MakeCurrent request to top-level code
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4639
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10260>

venus: cap api version to 1.1 for Android

Android hasn't officially adopted 1.2 yet, so we just cap it to avoid
troubles(e.g. vkjson doesn't like 1.2 atm).

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10258>

freedreno: Fix YUV sampler regression.

We have to keep sampler uniforms around for later YUV lowering, and we
only need to remove uniforms that take up storage space. Code comes from
radeonsi.

Closes: #4644.
Fixes: de17b4aab568 ("freedreno: Remove uniform variables after finalizing NIR.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10246>

ci: Move -Werror enabling from job definitions to meson build script

It was enabled in all jobs.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

ci: Enable -Werror for the remaining GCC build jobs

Same principle as for clang, with much fewer exceptions left.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

osmesa: Replace default case FALLTHROUGH annotation by following return

Avoids warning about the annotation with GCC 10:

../src/gallium/frontends/osmesa/osmesa.c: In function ‘osmesa_choose_format’:
../src/util/compiler.h:84:21: warning: attribute ‘fallthrough’ not preceding a case label or default label
   84 | #define FALLTHROUGH __attribute__((fallthrough))
      |                     ^~~~~~~~~~~~~
../src/gallium/frontends/osmesa/osmesa.c:316:7: note: in expansion of macro ‘FALLTHROUGH’
  316 |       FALLTHROUGH;
      |       ^~~~~~~~~~~

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

ci: Enable -Werror in clang jobs

They're not warning-clean yet, but we can enable -Werror in general and
just allow the existing types of warnings as exceptions with
-Wno-error=[...]. This way, new warnings of all other types will be
prevented from entering the code base.

Once all warnings of a certain type have been eliminated in a job, the
exception for that type can be dropped from that job. This provides a
realistic path to a fully warning-clean CI build in the future.

v2:
* Use echo -n (Juan A. Suarez)

Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

Use explicit break instead of fall-through to break-only case

clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

iris: Drop unneeded default switch case

Avoids clang warning about the fall-through annotation.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

llvmpipe: Drop switch with only default case

Replace it with the default case contents.

Avoids clang warning about the fall-through annotation.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

Guard FALLTHROUGH annotations after assert()

clang warns if it can determine that the assert() never returns and
there's a fall-through annotation below.

Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

Convert most remaining free-form fall-through comments to FALLTHROUGH

One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

util: Remove unused Android options_tbl_lock

Avoids warning:

../src/util/os_misc.c:132:21: error: unused variable 'options_tbl_lock' [-Werror,-Wunused-variable]
static simple_mtx_t options_tbl_lock = _SIMPLE_MTX_INITIALIZER_NP;
^

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

lima/ppir: Cast pointer to uintptr_t instead of uint64_t

Avoids warnings on armhf:

./src/gallium/drivers/lima/ir/pp/nir.c: In function 'ppir_get_block':
../src/gallium/drivers/lima/ir/pp/nir.c:554:66: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
    ppir_block *block = _mesa_hash_table_u64_search(comp->blocks, (uint64_t)nblock);
                                                                  ^
../src/gallium/drivers/lima/ir/pp/nir.c: In function 'ppir_compile_nir':
../src/gallium/drivers/lima/ir/pp/nir.c:899:52: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
          _mesa_hash_table_u64_insert(comp->blocks, (uint64_t)nblock, block);
                                                    ^

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>

tu: Expose VK_EXT_robustness2

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

tu: Handle null descriptors

Writing all 0's, including for the format, seems to work. Actually
setting the format seems to break textureSize() (getsize returns 1 for
some reason).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

tu: Handle robust UBO behavior for pushed UBO ranges

If we push a UBO range but then find out at draw-time that part of the
pushed range is out of range of the UBO descriptor, then we have to fill
in the rest of the range with 0's to mimic the bounds-checking that ldc
would've done.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

tu: Correctly preserve old push descriptor contents

We were never setting set->size, so we were always copying 0 bytes. But
as we only copy the contents when the layout and therefore the size is
the same, we don't have to take the old size into account anyway.

This fixes some VK_EXT_robustness2 tests that use push descriptors.

Fixes: 6d4f33e ("turnip: initial implementation of VK_KHR_push_descriptor")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

ir3, tu: Add compiler flag for robust UBO behavior

This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

ir3: Reduce max const file indirect offset base to 9 bits

This fixes
dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_buffer.no_fmt_qual.len_260.samples_1.1d.frag,
which accesses the shader UBO with c<a0.x + 512> due to the constant
data UBO coming before it in the const file. The len_256 variant has a
smaller constant data UBO, so it uses c<a0.x + 256> instead, and that
works, so 512 seems to be the real limit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

ir3: Fix list corruption in legalize_block()

We forgot to remove the instruction under consideration from instr_list
before inserting it into the block's list, which caused instr_list to
become corrupted. This happened to work but caused further corruption in
some rare scenarios.

Fixes: adf1659 ("freedreno/ir3: use standard list implementation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>

gitlab-ci: enable Intel AML-Y as experimental

The LAVA lab has been running well with the rammus chromebook for some
time now. Let's add it to MesaCI as experimental to get more testing,
and later enable it in production.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10238>

traces-iris: fix expectation for Intel GLK

glmark2/buffer-columns=200:interleave=true:update-dispersion=0.9:upd...
was missing the expectation checksum.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10238>

v3dv: don't use a dedicated BO for each occlusion query

Dedicated BOs waste memory and are also a significant cause of CPU
overhead when applications use hundreds of them per frame due to
all the work the kernel has to do to page in all these BOs for a job.
The UE4 Vehicle demo was hitting this causing it to freeze and stutter
under 1fps.

The hardware allows us to setup groups of 16 queries in consecutive
4-byte addresses, requiring only that each group of 16 queries is
aligned to a 1024 byte boundary. With this change, we allocate all
the queries in a pool in a single BO and we assign them different
offsets based on the above restriction. This eliminates the freezes
and stutters in the Vehicle sample.

One caveat of this solution is that we can only wait or test for
completion of a query by testing if the GPU is still using its BO,
which basically means that we can only wait for all active queries
in a pool to complete and not just the ones being requested by the
API. Since the Vulkan recommendation is to use a different query
pool per frame this should not be a big issue though.

If this ever becomes a problem (for example if an application does't
follow the recommendation and instead allocates a single pool and
splits its queries between frames), we could try to group queries
in a pool into a number of BOs to try and find a balance, but for
now this should work fine in most cases.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10253>

docs: update GL_ARB_texture_filter_minmax for zink

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10248>

zink: export PIPE_CAP_SAMPLER_REDUCTION_MINMAX_ARB

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10248>

zink: handle minmax sampler creation for VK_EXT_sampler_filter_minmax

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10248>

zink: support format queries for VK_EXT_sampler_filter_minmax

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10248>

zink: hook up VK_EXT_sampler_filter_minmax

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10248>

radv: fix barrier in radv_decompress_dcc_compute shader

ACO doesn't create a waitcnt for barriers between texture samples and
image stores because texture samples are supposed to use read-only
memory. It could also schedule the barrier to above the texture sample.
We also have use a larger memory scope to avoid an ACO optimization.

Tested on GFX8 with Sachsa Willems deferred sample. With some DCC
decompressions and the compute path forced.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 21.1 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9496>

radv: Allocate buffer list for MUTABLE descriptor types as well.

Fixes: 86644b84b94 ("radv: Implement VK_VALVE_mutable_descriptor_type.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10132>

radv: Take image alignment into account when allocating MUTABLE pool.

Allocating a descriptor set is aligned to 32 bytes, so just like the
other buffer types, bump the descriptor size to 32 bytes when allocating
MUTABLE descriptor types from a pool.

Fixes: 86644b84b94 ("radv: Implement VK_VALVE_mutable_descriptor_type.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10132>

clover/llvm: handle Fixed vs Scalable vectors explicitly starting with llvm-11

This fixes compilation with llvm-13.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4200
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9973>

v3dv: fix array sizes when tracking BOs during uniform setup

The resource indices we get point to descriptor map entries that include
all shader stages, so we need to size the arrays to account for more than
just one stage.

For now we only support up to 2 stages in a pipeline, so we use that.

Fixes: 002304482ce ('v3dv: avoid redundant BO job additions for UBO/SSBO')
Fixes: fa170dab4c5 ('v3dv: avoid redundant BO job additions for textures and samplers')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10252>

v3dv: fix descriptor set limits

There were various issues here:
   - MAX_DYNAMIC_UNIFORM_BUFFERS was larger than MAX_UNIFORM_BUFFERS.
   - In some cases we were exposing more than the minimums required.
     While that is not incorrect, it is not following what we have
     been doing in general.
   - The Vulkan spec states that some of the MaxDescriptorSet limits
     need to be multipled by 6 to include all shader stages, even
     if the implementation doesn't support all shader stages.

Fixes: cbd299b051 ('v3dv/device: do not compute per-pipeline limits multiplying per-stage')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10252>

v3dv/debug: use gl stage when checking debug flag

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10229>

v3dv/debug: print correct stage name

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10229>

ci/freedreno: Merge a630 piglit to a single job.

piglit_gl clocked in at 6:12 end-to-end runtime, and piglit_shader spent
2:53 in deqp-runner, so merging them together should be about 9 minutes.
Removing a boot should save us a minute or two of runner time per
pipeline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10243>

ac/surface: allow non-DCC modifiers for YUV on GFX9+

Accept non-linear tiling for multi-planar formats on GFX9+, as long
as DCC is disabled. DCC support is possible in theory, but untested
for now.

GFX8 is still restricted to linear tiling because it's not yet clear
how modifiers should be handled on these chips for multi-planar
formats. Each plane may need a different modifier.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>

radeonsi: stop special-casing YUV formats in si_query_dmabuf_modifiers

Instead of having a special case for YUV formats in
si_query_dmabuf_modifiers, let ac_get_supported_modifiers handle
them. Keep setting external_only = 1 for YUV formats, since we
can only sample from such formats (we can't use them as render
targets).

This shouldn't change si_query_dmabuf_modifiers's behavior, because
for YUV formats ac_get_supported_modifiers will return a single
LINEAR modifier.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>

ac/surface: use blocksizebits instead of blocksize

util_format_get_blocksize asserts that the blocksize isn't zero.
However the blocksize will be zero if the format's channel encoding
is unspecified. The channel encoding is only meaningful for the
plain u_format layout, so util_format_get_blocksize can't be used
for formats with another layout. For example, YUV formats don't have
the channel encoding specified.

Use util_format_get_blocksizebits, which just returns zero without
an assertion for formats which don't have a channel encoding.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>

util/format: document block depth field

After the pixel block width and height, a third field is used to
store the pixel block depth. Document this field.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>

radeon/vcn: handle tiled buffers when decoding

Set the swizzle mode when decoding.

Add a safe-guard to make sure the provided surface isn't DCC, because
we don't handle this situation.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>

turnip: fix typo in tu_CmdBeginRenderPass2()

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

turnip/lrz: added support for depth bounds test enable

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

turnip: document GRAS_LRZ_CNTL's UNK5 bitfield

It is used by the blob to enable depth bounds test for LRZ.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

turnip/lrz: add support for VK_EXT_extended_dynamic_state

When the depth or stencil state changes dynamically, that might affect
LRZ state and we need to recalculate it and emit it again.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

turnip: refactor how LRZ state is calculated

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

turnip: initialize pipeline->rb_{stencil,depth}_cntl always

This change will simplify further changes on LRZ state management.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

turnip: move pipeline gras_su and rb{stencil,depth}_cntl_mask initialization

Move them up, so they are initialized even when the dynamic state is
not used.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8615>

v3dv: use a bitfield to implement a quick check for job BO tracking

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10210>

v3dv: optimize a few cases of BO job additions

In these cases we know that the BO has not been added to the job
before, so we can skip the usual process for adding the BO where
we check if we had already added it before to avoid duplicates.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10210>

v3dv: avoid redundant BO job additions for spill / shared BOs

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10210>

v3dv: avoid redundant BO job additions for UBO/SSBO

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10210>

v3dv: avoid redundant BO job additions for textures and samplers

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10210>

intel/blorp: remove tile flush from emit surface state

Tile cache flush not required when emitting new surface state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10217>

iris: reduce redundant tile cache flushes

We are flushing tile cache more often than is necessary. In
unified cache mode, tile cache flushing is expensive, evicting all
depth/pixel data from the L3$. This is only need for a handful of
cases, such as: making cpu or gpu changes globally visible
(e.g. map), fast color clears, or slow depth clears. Tile cache
flushing is a gen12+ feature.

Remove blanket flushing of tile cache on all depth/RT flushes.
Replace with selective tile cache flushing.

Improves performance in several workloads:
AztecRuins.ogl-high-offscreen-1440p 1%
UnigineValley.ogl-g2                1%
Dota 2 (replay Jul 2020).ogl-g2     1%
Counter-Strike GO.ogl-g2            1%
Manhattan.ogl-Off-19x10             2%
CarChase.ogl-Off-19x10              1%
Bioshock Infinite.ogl-g2            1%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10217>

iris: only flush the render cache for aux changes, not format changes

Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10217>

iris: Cache VB/IB in L3$ for Gen12

Gen12 enables caching of Vertex and Index Buffers in L3.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10217>

intel: add L3 Bypass Disable to gen xml

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10217>

mesa/st: plumb GL_TEXTURE_REDUCTION_MODE_ARB through QueryInternalFormat

enable per-format querying of texture_filter_minmax support if the ARB extension
is enabled

also now return 0 if neither extension is supported

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10030>

gallium: split PIPE_CAP_SAMPLER_REDUCTION_MINMAX into modes

this enables detection for the EXT vs the ARB extension, which have
different specifications regarding which formats must be supported

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10030>

gallium: add PIPE_BIND_SAMPLER_REDUCTION_MINMAX

for querying format support

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10030>

venus: implement dma_buf fd import and properties query

This change is for supporting VK_ANDROID_native_buffer implementation,
and it does not advertise VK_KHR_external_memory_fd.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10195>

venus: update venus-protocol headers

Add support for external memory fd properties query and import
- vkGetMemoryResourcePropertiesMESA
- VkImportMemoryResourceInfoMESA
- VkMemoryResourcePropertiesMESA

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10195>

freedreno: Add missing foreach macros and update indentation

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10247>

venus: remove vn_renderer_info::has_timeline_sync

We are no longer limited to Vulkan 1.1 in VMs.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: wait on vkQueuePresentKHR

Add vn_renderer_info::has_implicit_fencing. Force vkQueueWaitIdle
during vkQueuePresentKHR when it is false.

This kills the performance, but we have to do this until the kernel does
implicit fencing.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: remove vn_ring_wait_all

It is unused now.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: remove vn_queue::sync_queue_index

It is unused now.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: remove VN_SYNC_TYPE_SYNC

It is unused now.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: remove vn_renderer_sync support from vn_queue_submission

It is unused now.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: stop using vn_renderer_sync in vn_queue

Move away from vn_renderer_sync and toward a userspace-only solution
temporarily until the kernel does what we need.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: stop using vn_renderer_sync in vn_semaphore

Move away from vn_renderer_sync and toward a userspace-only solution
temporarily until the kernel does what we need.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

venus: stop using vn_renderer_sync in vn_fence

Move away from vn_renderer_sync and toward a userspace-only solution
temporarily until the kernel does what we need.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10146>

docs: reset new_features.txt

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10242>

VERSION: bump to 21.2.0-devel

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10242>

anv: Avoid corrupting indirect depth clear values

We don't need to initialize the BO since blorp updates the clear color
BO content with fast clear value i.e ANV_HZ_FC_VAL for depth surface.

With this approach, we can get rid of possibility of corruption since we
are no longer sharing the same clear BO for depth formats.

Closes: #3614

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9941>

anv: Set correct fast clear value for depth during blorp operation

Previously, on the platforms which support the indirect clear color
values, we were just setting the clear color address and not enforcing
any clear color values but some of the blorp operations were using the
wrong fast clear value.

With this patch, we make sure to set the correct fast clear color value
during blorp operations.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9941>

panfrost: Don't advertise AFBC mods when the format is not supported

On Bifrost, AFBC is not supported if the format has a non-identity
swizzle. For internal resources we fix the format at runtime, but this
fixup is not applicable when we export the resource. Don't advertise
AFBC modifiers on such formats.

Fixes: 44217be92134 ("panfrost: Adjust the format for AFBC textures on Bifrost v7")
Cc: mesa-stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10233>

freedreno: Manual fixups

Things I couldn't figure out how to get clang-format to not mess up.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883>

freedreno: Re-indent

clang-format -fallback-style=none --style=file -i src/gallium/drivers/freedreno/*.[ch] src/gallium/drivers/freedreno/*/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883>

freedreno: Some manual reformatting

Take care of a few things that clang-format makes a hash of.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883>

freedreno: Add .clang-format

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883>

meson: Fix winflexbison warnings

Undefine __STDC_VERSION__ for C files to avoid mismatch with C11/C17.

Define __STDC_VERSION__ for C++ files to use <inttypes.h> path.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10196>

aco/ra: remove live-in temporary from live_out_per_block when moving it

Otherwise, handle_loop_phis() might pass it to handle_live_in() and then
we could have two phis for this variable.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 7c64623e948 ("aco/ra: refactor SSA repairing during register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>

aco/ra: use original names when renaming loop carried phi operands

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 7c64623e948 ("aco/ra: refactor SSA repairing during register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>

anv: bump internal descriptor index fields to 32bits

Prior to supporting VK_EXT_descriptor_indexing all of our descriptor
limits where below 64k which fitted a uint16_t. Now all of those can
go up to 2^20 entries so we need 32bits indexes to keep track of them.

This change leaves the dynamic indexes at 16bits. We could arguably
bump them too, up to the reviewer's taste.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6e230d7607f9b3 ("anv: Implement VK_EXT_descriptor_indexing")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4636
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10228>

ac: add missing BUF_DATA_FORMAT_10_11_11 vertex format on GFX10+

This format is supported by the driver.

Fixes vertex explosion in Dirt 5.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4635
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10226>

ir3/sched: Don't schedule too many tex/SFU instructions

Consider a simple loop that does a series of texture instructions and
then reduces the results:

vec4 sum = vec4(0);
for (int i = 0; i < N; i++) {
sum += texture(...);
}

Assume that the loop is unrolled and we schedule the resulting basic
block. Right now, after we schedule the first texture instruction, the
only instructions available to schedule that don't incur a sync are the
instructions to setup the second texture instruction. So we keep picking
the texture instructions, no matter how large N is, resulting in a
pathological schedule for register pressure when N is very large:

sum1 = texture(...);
sum2 = texture(...);
sum3 = texture(...);
...
sum = sum1 + sum2 + sum3 + ...;

In particular this happens with some CTS tests for VK_EXT_robustness2,
where a loop like that with many iterations is marked as [[unroll]],
forcing NIR to unroll it.

This solution is a balance between the current approach and always
scheduling for register pressure (and ignoring sync's). We only allow a
certain number of texture fetches to be in flight before considering
textures to "sync", even though they don't really, both because they
likely *will* sync in reality (overflowing the internal queue of waiting
texture instructions) and because at some point we need the normal
algorithm to kick in and start lowering register pressure.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>

ir3/sched: Don't penalize uses of already-waited tex/SFU

Once we insert a use of a given tex or SFU instruction, then we must
wait for that tex/SFU instruction (as well as all earlier ones) to
complete, so we shouldn't penalize further uses, even if a subsequent
tex/SFU instruction gets scheduled after the first use. This especially
matters after the next commit when we start forcibly breaking up long
sequences of texture instructions, since if we schedule a group of 8
texture instructions then we want to schedule the uses of those
instructions in parallel with the next 8 texture instructions to reduce
register pressure.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>

zink: verify that source-format support linear-filter

Similar to the previous commit, we should also verify that the
source-format support linear-filter if we try to blit with it.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10234>

zink: verify that src/dst support blitting

Some Vulkan-drivers don't support blitting between all formats and
layouts. So let's verify this while blitting, and fall back to the
normal rendering code-path instead.

This fixes a crash on start-up in OpenArena on V3DV.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10234>

radv/winsys: Remove use_local_bos

Now that perftest is stored in the winsys.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>

radv: Use VRAM cmdbuffers in more situations.

In most games I tested we use 32 MiB of cmdbuffers+cmd upload buffers
at most. Especially since we have mutable descriptors it seems
somewhat unlikely anything else will eat it up so be a bit more
aggressive allocating them in VRAM.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>

radv: Refactor cs_domain to be a winsys function.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>

zink: do not dereference NULL pointer

If first_frame_done isn't set, but fence is NULL, we end up dereferncing
that NULL-pointer.

This can happen in the case where the first submitted batch has no work,
and pfence was passed as a NULL-pointer.

While we're at it, simplify the check with the surrounding code, which
also checks for a NULL-pointer here.

Fixes: e93ca92d4ae ("zink: force explicit fence only on first frame flush")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10235>

aco: Add a simple heuristic to decide early or late primitive export.

Late export is theoretically better if used with LATE_ALLOC,
but in practice, the early export has an advantage of
lower register usage, therefore more concurrent waves.

The idea of this commit is that "small" shaders benefit from early
primitive export more, due to being able to launch much more waves.

Let's consider a NIR shader "small" when it has only 1 block.
This yields both better performance, and better stats, than always
using late export.

Fossil DB on Sienna:

Totals from 12807 (8.76% of 146265) affected shaders:
VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83%
SpillSGPRs: 1458 -> 1538 (+5.49%)
CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14%
MaxWaves: 282902 -> 278516 (-1.55%)
Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18%
VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30%
SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16%
Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26%
Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26%
PreSGPRs: 434701 -> 447396 (+2.92%)
PreVGPRs: 527783 -> 540527 (+2.41%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>

aco: Emit fewer branches for NGG VS/TES with late primitive export.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>