platform/upstream/mesa.git
2 years agoaco: don't expand smem/mubuf global loads
Rhys Perry [Thu, 2 Dec 2021 10:57:35 +0000 (10:57 +0000)]
aco: don't expand smem/mubuf global loads

For example, dwordx3->dwordx4 or ubyte3->dwordx2.

Global loads don't have the bounds checking that buffer loads have that
makes this safe.

The alignment checks are added to global_load_callback() in case
byte_align_loads=false, align=1 and bytes_needed=3. Without them, the
callback will create a dword load.

fossil-db (Sienna Cichlid):
Totals from 267 (0.20% of 134621) affected shaders:
CodeSize: 1603352 -> 1606568 (+0.20%)
Instrs: 294946 -> 295482 (+0.18%); split: -0.00%, +0.18%
Latency: 2997003 -> 2997052 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 526645 -> 526659 (+0.00%)
SClause: 9179 -> 9185 (+0.07%); split: -0.02%, +0.09%
Copies: 25363 -> 25375 (+0.05%); split: -0.08%, +0.13%
Branches: 8298 -> 8299 (+0.01%)

fossil-db (Polaris10):
Totals from 267 (0.20% of 135668) affected shaders:
CodeSize: 1636672 -> 1638756 (+0.13%); split: -0.00%, +0.13%
Instrs: 308484 -> 308733 (+0.08%); split: -0.01%, +0.09%
Latency: 3446045 -> 3446904 (+0.02%); split: -0.00%, +0.03%
InvThroughput: 1206722 -> 1206828 (+0.01%); split: -0.00%, +0.01%
SClause: 9308 -> 9311 (+0.03%); split: -0.08%, +0.11%
Copies: 36933 -> 36921 (-0.03%); split: -0.08%, +0.05%

fossil-db (Pitcairn):
Totals from 275 (0.20% of 135668) affected shaders:
SGPRs: 17616 -> 17520 (-0.54%); split: -0.64%, +0.09%
VGPRs: 15428 -> 15540 (+0.73%); split: -0.23%, +0.96%
CodeSize: 1885792 -> 1929120 (+2.30%); split: -0.00%, +2.30%
MaxWaves: 1284 -> 1285 (+0.08%)
Instrs: 368963 -> 376095 (+1.93%); split: -0.00%, +1.94%
Latency: 5122922 -> 5168398 (+0.89%); split: -0.01%, +0.90%
InvThroughput: 2562866 -> 2604279 (+1.62%)
VClause: 9268 -> 9296 (+0.30%); split: -0.13%, +0.43%
SClause: 10702 -> 10705 (+0.03%); split: -0.05%, +0.07%
Copies: 48620 -> 50629 (+4.13%); split: -0.08%, +4.21%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

2 years agoaco: use saddr for global access with sgpr address
Rhys Perry [Tue, 9 Mar 2021 16:09:15 +0000 (16:09 +0000)]
aco: use saddr for global access with sgpr address

fossil-db (Sienna Cichlid):
Totals from 38 (0.03% of 134621) affected shaders:
CodeSize: 237196 -> 237060 (-0.06%); split: -0.09%, +0.03%
Instrs: 43895 -> 43894 (-0.00%); split: -0.02%, +0.01%
Latency: 914633 -> 916263 (+0.18%); split: -0.01%, +0.19%
InvThroughput: 468215 -> 468971 (+0.16%); split: -0.02%, +0.18%
SClause: 1239 -> 1242 (+0.24%)
PreSGPRs: 997 -> 1003 (+0.60%)
PreVGPRs: 936 -> 923 (-1.39%); split: -1.50%, +0.11%

Regression seems to be RA noise, creating a waitcnt.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

2 years agoaco: use vcc for 64-bit vgpr addition
Rhys Perry [Tue, 9 Mar 2021 16:40:23 +0000 (16:40 +0000)]
aco: use vcc for 64-bit vgpr addition

fossil-db (Sienna Cichlid):
Totals from 229 (0.17% of 134621) affected shaders:
CodeSize: 1520192 -> 1517644 (-0.17%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

2 years agoradv: don't require robust vectorization for nir_var_mem_global
Rhys Perry [Thu, 15 Apr 2021 13:22:11 +0000 (14:22 +0100)]
radv: don't require robust vectorization for nir_var_mem_global

Robust vectorization is to prevent vectorization of loads using the near
maximum offset with loads of offset 0. Global loads can't read from offset
0 (NULL) anyways, so this isn't necessary.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124>

2 years agoiris: Don't leak scratch BOs
Jason Ekstrand [Tue, 12 Apr 2022 16:45:41 +0000 (11:45 -0500)]
iris: Don't leak scratch BOs

Fixes: 4d219b0eb3d6 ("iris: implement scratch space!")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15897>

2 years agoradv: Only use TES vertex offset 2 for triangles and quads.
Timur Kristóf [Wed, 13 Apr 2022 12:54:30 +0000 (14:54 +0200)]
radv: Only use TES vertex offset 2 for triangles and quads.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15837>

2 years agoradv: Fix gs_vgpr_comp_cnt for NGG VS without passthrough mode.
Timur Kristóf [Sat, 9 Apr 2022 20:00:10 +0000 (22:00 +0200)]
radv: Fix gs_vgpr_comp_cnt for NGG VS without passthrough mode.

When not in passthrough mode, the NGG shader needs to calculate the
primitive export value from the input primitive's vertex indices.

So, GS vertex offset 2 is needed when NGG has triangles
and isn't in passthrough mode.

Fixes: 7ad69e2f7ee10c0e7afc302b9324e7a320424dcb "radv: stop loading invocation ID for NGG vertex shaders"
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15837>

2 years agonir: Handle out of bounds access in nir_vectorize_tess_levels.
Timur Kristóf [Wed, 6 Apr 2022 16:53:20 +0000 (18:53 +0200)]
nir: Handle out of bounds access in nir_vectorize_tess_levels.

Replace out of bounds loads with undef.
Then, delete instructions with out of bounds access.

Fixes: f5adf27fb926a330a13af716f0a03da1a224656d "nir,radv: add and use nir_vectorize_tess_levels()"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6264
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15775>

2 years agoaco: Fix VOP2 instruction format in visit_tex.
Timur Kristóf [Wed, 13 Apr 2022 12:11:18 +0000 (14:11 +0200)]
aco: Fix VOP2 instruction format in visit_tex.

There was a v_or_b32 that accidentally used SOP2.
It should use VOP2.

Issue found by looking at a gfxreconstruct trace posted by a user
in this bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5838

Cc: mesa-stable
Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 "aco: Initial commit of independent AMD compiler"

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15923>

2 years agoiris: set a default EDSC flag
Rohan Garg [Tue, 12 Apr 2022 19:07:13 +0000 (21:07 +0200)]
iris: set a default EDSC flag

anv sets the default EDSC flag, do the same for iris too

Fixes: 5ae278da18b6 ("iris: use vtbl to avoid multiple symbols, fix state base address")

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15905>

2 years agointel/fs: add a note on possible optimization of root node address
Lionel Landwerlin [Wed, 13 Apr 2022 06:39:31 +0000 (09:39 +0300)]
intel/fs: add a note on possible optimization of root node address

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>

2 years agointel/fs: fix metadata preserve on trace_ray intrinsic
Lionel Landwerlin [Tue, 12 Apr 2022 18:59:58 +0000 (21:59 +0300)]
intel/fs: fix metadata preserve on trace_ray intrinsic

c78be5da300 ("intel/fs: lower ray query intrinsics") introduced a
helper function using nir_(push|pop)_if which invalidated dominance &
block_index for the replacement of nir_intrinsic_rt_trace_ray.

We can still keep dominance/block_index metadata for the lowering of
nir_intrinsic_rt_execute_callable though.

This change uses 2 different lowering function with correct metadata
preservation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c78be5da300 ("intel/fs: lower ray query intrinsics")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>

2 years agozink: avoid creating ssbo variable types with multiple runtime arrays
Mike Blumenkrantz [Tue, 12 Apr 2022 13:47:03 +0000 (09:47 -0400)]
zink: avoid creating ssbo variable types with multiple runtime arrays

this is illegal

affects:
KHR-GL46.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-packed-matC

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15894>

2 years agozink: use the calculated last struct member idx for ssbo size in ntv
Mike Blumenkrantz [Tue, 12 Apr 2022 13:46:32 +0000 (09:46 -0400)]
zink: use the calculated last struct member idx for ssbo size in ntv

this may or may not be 1

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15894>

2 years agovirgl: Fix relocating the re-writing the transformation code
Gert Wollny [Wed, 13 Apr 2022 09:59:20 +0000 (11:59 +0200)]
virgl: Fix relocating the re-writing the transformation code

The transformation must come before the code emission.

Fixes: 6a264e7024a29eb7

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15919>

2 years agoiris: Add VF_CACHE_INVALIDATE to IRIS_DOMAIN_OTHER_WRITE flush bits
Kenneth Graunke [Tue, 8 Mar 2022 07:37:58 +0000 (23:37 -0800)]
iris: Add VF_CACHE_INVALIDATE to IRIS_DOMAIN_OTHER_WRITE flush bits

Suggested by Francisco Jerez.

Although including VF invalidation in the flush bits is strange, we
believe this is the only way to guarantee that stream output has
finished.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Demote DC flush to HDC flush in cache tracker
Kenneth Graunke [Wed, 25 Aug 2021 01:09:53 +0000 (18:09 -0700)]
iris: Demote DC flush to HDC flush in cache tracker

FLUSH_HDC is sufficient to flush things out to L3, so we'd rather
use that where possible.  It's also emulated via DATA_CACHE_FLUSH
on platforms where it isn't supported, so we can use it unconditionally.

We still use DATA_CACHE_FLUSH for invalidating the data cache, and to
flush the DC-tagged cachelines in L3 to be globally-observable.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Emit flushes for push constant source buffers
Kenneth Graunke [Fri, 4 Mar 2022 11:47:06 +0000 (03:47 -0800)]
iris: Emit flushes for push constant source buffers

Push constant loading is not coherent with L3 according to the document
that describes the hardware change for the vertex buffer L3 Bypass
Disable field.

If we've updated a push constant buffer with say, a blorp_buffer_copy,
we may need to flush both the render cache and the tile cache.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Use cache-tracker for draw count flushing
Kenneth Graunke [Tue, 8 Mar 2022 08:07:16 +0000 (00:07 -0800)]
iris: Use cache-tracker for draw count flushing

We should be using the cache tracker for this.  We can consider
this access IRIS_DOMAIN_OTHER_READ now that it's the catch-all
non-L3-coherent read-only access domain.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Add pre-draw flushing for stream output targets
Kenneth Graunke [Tue, 8 Mar 2022 06:01:08 +0000 (22:01 -0800)]
iris: Add pre-draw flushing for stream output targets

When stream output is active, we need to let the cache tracker know
about any SO buffers, which we access via IRIS_DOMAIN_OTHER_WRITE.

In particular, we may have written to those buffers via another
mechanism, such as BLORP buffer copies.  In that case, previous writes
happened via IRIS_DOMAIN_RENDER_WRITE, in which case we'd need to flush
both the render cache and the tile cache to make that data globally-
observable before we begin writing via streamout, which is incoherent
with the earlier mechanism.

Fixes misrendering in Ryujinx.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6085
Fixes: d8cb76211c5 ("iris: Fix MOCS for buffer copies")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Extend the cache tracker to handle L3 flushes and invalidates
Kenneth Graunke [Mon, 2 Aug 2021 21:50:19 +0000 (14:50 -0700)]
iris: Extend the cache tracker to handle L3 flushes and invalidates

Most clients are L3-coherent these days.  However, there are some
notable exceptions, such as push constants, stream output, and command
streamer memory reads and writes.

With the advent of the tile cache, flushing the render or depth caches
alone are no longer sufficient for memory to become globally-observable.
For those, we need to flush the tile cache as well.  However, we'd like
to avoid that for L3-coherent clients, as it shouldn't be necessary,
and is expensive.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Add a separate PIPE_CONTROL_L3_READ_ONLY_CACHE_INVALIDATE bit
Kenneth Graunke [Sat, 9 Apr 2022 09:19:15 +0000 (02:19 -0700)]
iris: Add a separate PIPE_CONTROL_L3_READ_ONLY_CACHE_INVALIDATE bit

This will let us use it without performing a VF cache invalidation,
should we want to do that.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Add an iris_is_domain_l3_coherent helper.
Kenneth Graunke [Mon, 2 Aug 2021 19:47:10 +0000 (12:47 -0700)]
iris: Add an iris_is_domain_l3_coherent helper.

The render, depth, sampler, and data (HDC) caches are all coherent
with L3.  We consider OTHER_READ and OTHER_WRITE to be non-coherent,
as they're kitchen-sink domains which include non-L3-clients.

Starting with Tigerlake, the VF cache is coherent with L3 (because we
set the L3BypassDisable bit in the vertex/index buffer packets).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Fix UBO cache tracking for the !indirect_ubos_use_sampler case
Kenneth Graunke [Fri, 4 Mar 2022 11:09:36 +0000 (03:09 -0800)]
iris: Fix UBO cache tracking for the !indirect_ubos_use_sampler case

On Tigerlake, we use the data cache for reading indirect UBOs instead
of the sampler.  But we still use the constant cache for direct UBO
access, so unfortunately we may access it through two different domains.

To work around this, we add a new domain for pull constants (UBOs),
which will be either constant+texture or constant+data.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Split out an IRIS_DOMAIN_SAMPLER_READ domain from OTHER_READ
Kenneth Graunke [Fri, 4 Mar 2022 11:27:05 +0000 (03:27 -0800)]
iris: Split out an IRIS_DOMAIN_SAMPLER_READ domain from OTHER_READ

The bulk of IRIS_DOMAIN_OTHER_READ domain usage was the 3D sampler, but
there were also a few oddball cases like command streamer reads, blitter
access, and so on.  The sampler is definitely L3 coherent, but some off
the more esoteric reads may not be, so I'd like to separate them, so
that OTHER_READ can become a non-L3-coherent kitchen-sink domain.

The sampler cases only need TEXTURE_CACHE_INVALIDATE, and can skip the
CONSTANT_CACHE_INVALIDATE we had on IRIS_DOMAIN_OTHER_READ.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agoiris: Use IRIS_DOMAIN_DEPTH_WRITE for read only depth/stencil.
Kenneth Graunke [Mon, 4 Apr 2022 18:04:40 +0000 (11:04 -0700)]
iris: Use IRIS_DOMAIN_DEPTH_WRITE for read only depth/stencil.

We were using IRIS_DOMAIN_OTHER_READ for read-only depth/stencil access
in an attempt to avoid unnecessary flushing; IRIS_DOMAIN_DEPTH_WRITE
could indicate read-write access.

However, IRIS_DOMAIN_OTHER_READ is clearly the wrong domain.  Depth and
stencil data is read via the depth cache, while IRIS_DOMAIN_OTHER_READ
currently corresponds to the sampler cache and constant cache together
(although this will change in future patches).

It's unclear whether this hack was useful.  For now, just drop it and
use the correct depth cache domain, even if it's marked as read-write.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15275>

2 years agovirgl: Apply integer op fix only for ALU ops and clear modifiers
Gert Wollny [Tue, 12 Apr 2022 15:53:12 +0000 (17:53 +0200)]
virgl: Apply integer op fix only for ALU ops and clear modifiers

For texture fetches and buffer load the fix is not needed,
and the override creates faulty TGSI.

In addition remove all modifiers from the src in the additional mov
instruction.

Fixes: d1c7a7b1317c518e160cc6d37245de22b2bfa60d
  virgl: Add an extra mov for int outputs from constant and immediate inputs

v2: Move workaround after the use of
    virgl_tgsi_rewrite_src_for_input_temp (Emma)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15896>

2 years agor600: Assign shader type when creating a new CS state
Gert Wollny [Mon, 4 Apr 2022 12:37:42 +0000 (14:37 +0200)]
r600: Assign shader type when creating a new CS state

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15898>

2 years agost/mesa: Transcode ASTC to BC7 (BPTC) where possible
Kenneth Graunke [Tue, 15 Feb 2022 20:47:49 +0000 (12:47 -0800)]
st/mesa: Transcode ASTC to BC7 (BPTC) where possible

This patch adds support for transcoding ASTC to BC7 (BPTC) and prefers
it over BC3 (DXT5) when hardware supports that format.

BC7 is a much newer format (~2009 vs. ~1999) and offers higher quality
than the older BC3 format.  Furthermore, our encoder seems to be faster.

Tapani put together a small benchmark for transcoding a 1024x1024 ASTC
texture, and switching from BC3 to BC7 improves performance of that
microbenchmark by 25% on my Tigerlake NUC (with hardware ASTC disabled
so we can test this path).  Presumably, this isn't fundamental to the
formats, but rather reflects the speed of our in-tree compressors.

So, we should use BC7 where possible.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15875>

2 years agost/mesa: Make transcode_astc also check for non-SRGB format support
Kenneth Graunke [Tue, 15 Feb 2022 20:44:30 +0000 (12:44 -0800)]
st/mesa: Make transcode_astc also check for non-SRGB format support

This is probably unnecessary in that all drivers which support the sRGB
format likely also support the non-sRGB format.  But we may as well
check both the formats we use, for documentation if nothing else.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15875>

2 years agoci: Move most stuff out of root .gitlab-ci.yml
Tomeu Vizoso [Fri, 8 Apr 2022 11:29:04 +0000 (13:29 +0200)]
ci: Move most stuff out of root .gitlab-ci.yml

This file was getting a bit hard to navigate. Split container, build and
test jobs to their own files.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Allow local installations to build additional stuff into the rootfs
Tomeu Vizoso [Tue, 29 Mar 2022 11:48:49 +0000 (13:48 +0200)]
ci: Allow local installations to build additional stuff into the rootfs

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Add env var to add packages to install in debian/arm_build image
Tomeu Vizoso [Tue, 29 Mar 2022 11:47:52 +0000 (13:47 +0200)]
ci: Add env var to add packages to install in debian/arm_build image

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Add env var to add packages to install in rootfs
Tomeu Vizoso [Tue, 29 Mar 2022 11:47:26 +0000 (13:47 +0200)]
ci: Add env var to add packages to install in rootfs

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Allow specifying a different kernel in LAVA jobs
Tomeu Vizoso [Thu, 17 Mar 2022 15:35:30 +0000 (16:35 +0100)]
ci: Allow specifying a different kernel in LAVA jobs

To make it possible to use a kernel different from that built along with
the rootfs.

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agoci: Use CI_PROJECT_NAME instead of hardcoding 'mesa'
Tomeu Vizoso [Thu, 17 Mar 2022 14:09:18 +0000 (15:09 +0100)]
ci: Use CI_PROJECT_NAME instead of hardcoding 'mesa'

This can make it more convenient for other projects to reuse these
scripts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15891>

2 years agonir/lower_shader_calls: name resume shaders
Lionel Landwerlin [Tue, 12 Apr 2022 13:04:52 +0000 (16:04 +0300)]
nir/lower_shader_calls: name resume shaders

Helpful when lost in a sea of NIR :)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15887>

2 years agoci: Disable Google's lab
Tomeu Vizoso [Wed, 13 Apr 2022 06:11:05 +0000 (08:11 +0200)]
ci: Disable Google's lab

The runner is down and pipelines are being stuck.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15909>

2 years agozink: rework choose_pdev to (finally) be competent
Mike Blumenkrantz [Mon, 11 Apr 2022 15:04:45 +0000 (11:04 -0400)]
zink: rework choose_pdev to (finally) be competent

now zink will init using a priority system if multiple devices are available

multiple devices will ONLY be available if:
* the user does not specify VK_ICD_FILENAMES as they should
* the user does not specify LIBGL_ALWAYS_SOFTWARE
* multiple drivers exist

I've prioritized the virtualized gpu here with the assumption that if
such a thing is detected, the environment is most likely virtualized

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agoaux/trace: clean up some zink+lavapipe tracing awfulness
Mike Blumenkrantz [Mon, 11 Apr 2022 14:41:22 +0000 (10:41 -0400)]
aux/trace: clean up some zink+lavapipe tracing awfulness

now that it's easier to determine whether zink is being used (mostly),
this whole thing can be simplified

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agozink: ZINK_USE_LAVAPIPE -> LIBGL_ALWAYS_SOFTWARE
Mike Blumenkrantz [Mon, 11 Apr 2022 14:39:05 +0000 (10:39 -0400)]
zink: ZINK_USE_LAVAPIPE -> LIBGL_ALWAYS_SOFTWARE

this is a documented variable, so reuse it

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agoegl: don't make LIBGL_ALWAYS_SOFTWARE and MESA_LOADER_DRIVER_OVERRIDE=zink exclusive
Mike Blumenkrantz [Mon, 11 Apr 2022 14:31:40 +0000 (10:31 -0400)]
egl: don't make LIBGL_ALWAYS_SOFTWARE and MESA_LOADER_DRIVER_OVERRIDE=zink exclusive

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15857>

2 years agoac/gpu_info: disallow displayable DCC for Navi12 and Navi14
Indrajit Kumar Das [Fri, 8 Apr 2022 05:21:54 +0000 (10:51 +0530)]
ac/gpu_info: disallow displayable DCC for Navi12 and Navi14

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15813>

2 years agointel/nir: Lower 8 and 16-bit bitwise unops
Jason Ekstrand [Fri, 8 Apr 2022 20:17:33 +0000 (15:17 -0500)]
intel/nir: Lower 8 and 16-bit bitwise unops

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>

2 years agointel/fs: Implement 16-bit [ui]mul_high
Jason Ekstrand [Fri, 8 Apr 2022 20:17:12 +0000 (15:17 -0500)]
intel/fs: Implement 16-bit [ui]mul_high

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>

2 years agonir/lower_int64: Fix [iu]mul_high handling
Jason Ekstrand [Fri, 8 Apr 2022 20:06:11 +0000 (15:06 -0500)]
nir/lower_int64: Fix [iu]mul_high handling

e551040c602d, which added a new mechanism for 64-bit imul which is more
efficient on BDW and later Intel hardware also introduced a bug where we
weren't properly walking both X and Y.  No idea how testing didn't find
this.

Fixes: e551040c602d ("nir/glsl: Add another way of doing lower_imul64 for gen8+"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6306
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>

2 years agokopper: print better error message if loader not detected
Mike Blumenkrantz [Mon, 11 Apr 2022 12:34:36 +0000 (08:34 -0400)]
kopper: print better error message if loader not detected

silently failing on release builds is annoying

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15851>

2 years agolima: fix vector const src referenced multiple times
Erico Nunes [Sun, 3 Apr 2022 16:28:19 +0000 (18:28 +0200)]
lima: fix vector const src referenced multiple times

It can happen that a single vector constant is referenced multiple times
by the same node, with different swizzles.
This needs to be taken into account by checking and updating the
swizzles for all the srcs of a target node when inserting the const
node to the same instruction.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15726>

2 years agofeatures: mark off ARB_seamless_cubemap_per_texture for zink
Mike Blumenkrantz [Tue, 12 Apr 2022 18:42:56 +0000 (14:42 -0400)]
features: mark off ARB_seamless_cubemap_per_texture for zink

forgot to do this with the MR

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15902>

2 years agontt: translate nir_intrinsic_shader_clock
Gert Wollny [Tue, 12 Apr 2022 14:03:59 +0000 (16:03 +0200)]
ntt: translate nir_intrinsic_shader_clock

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15889>

2 years agozink: finish up radv piglit baseline updates
Mike Blumenkrantz [Tue, 12 Apr 2022 18:00:35 +0000 (14:00 -0400)]
zink: finish up radv piglit baseline updates

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15900>

2 years agoradv: Refactor ray tracing support checks
Konstantin Seurer [Mon, 11 Apr 2022 16:11:26 +0000 (18:11 +0200)]
radv: Refactor ray tracing support checks

Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15860>

2 years agoradv: Refactor radv_tex_aniso_filter
Konstantin Seurer [Mon, 11 Apr 2022 15:53:57 +0000 (17:53 +0200)]
radv: Refactor radv_tex_aniso_filter

Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15860>

2 years agoradv: set read/write without format flags for supported texel buffers
Mike Blumenkrantz [Fri, 8 Apr 2022 17:34:23 +0000 (13:34 -0400)]
radv: set read/write without format flags for supported texel buffers

if the storage case is supported, this should be supported too

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15826>

2 years agoRevert "radv: Disable NGG for GS with suboptimal output vertex count."
Samuel Pitoiset [Tue, 12 Apr 2022 10:18:03 +0000 (12:18 +0200)]
Revert "radv: Disable NGG for GS with suboptimal output vertex count."

It breaks too many things and shouldn't have been merged. The fix isn't
trivial and it will probably not be backported because it's intrusive.

It will be re-applied later when everything will work.

This reverts commit 94706601fa2f52605d6e488f30fad9a0e2440612.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15882>

2 years agor600: make r600_load_ar available to driver code
Gert Wollny [Fri, 1 Apr 2022 16:23:29 +0000 (18:23 +0200)]
r600: make r600_load_ar available to driver code

This is needed for the new NIR assembler

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>

2 years agor600: Set the last bit if an alu group is split by kcache allocation
Gert Wollny [Tue, 8 Feb 2022 22:14:21 +0000 (23:14 +0100)]
r600: Set the last bit if an alu group is split by kcache allocation

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>

2 years agor600: Force last instruction of group when starting a new CF
Gert Wollny [Fri, 31 Dec 2021 20:34:02 +0000 (21:34 +0100)]
r600: Force last instruction of group when starting a new CF

When emitting the AR forces splitting an ALU group, and at the same time
a new CF instruction is started, then the last instrcution in the finished
CF block might not have the "last" bit set, which results in an invalid
shader that might hang, or crash SB.
So when a new CF is started, force the last bit in the last ALU instruction.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>

2 years agor600: don't reschedule INTERP_LOAD_P0
Gert Wollny [Thu, 10 Feb 2022 17:44:13 +0000 (18:44 +0100)]
r600: don't reschedule INTERP_LOAD_P0

With the NIR code, we have instructions groups that use
INTERP_LOAD_P0 that don't fill all slots. Just make sure
the backend scheduler doesn't fill in INTERP_LOAD_P0
instructions with a different LDS location.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>

2 years agor600: ignore dest sel for non-write targets when counting registers
Gert Wollny [Sun, 6 Feb 2022 16:23:27 +0000 (17:23 +0100)]
r600: ignore dest sel for non-write targets when counting registers

Since the value is not written, there is no need to allocate
a register for it, so don't take it into account.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>

2 years agor600: Don't limit scheduling of PARAM_SRC values
Gert Wollny [Fri, 10 Dec 2021 21:44:50 +0000 (22:44 +0100)]
r600: Don't limit scheduling of PARAM_SRC values

ALU_SRC_PARAM_BASE is an inline constant that defines the
address for pulling data from LDS memory for interpolation
and not a value from the kcache, so there is no need to
take these values into account when allocating kcache
load slots.

v2: Fix the constant range check to not exclude the translated
    ranges for kcache banks 2 and 3.
v3: limit range check to only include kcache values and and
    rename relevant function (Emma).

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15714>

2 years agoradv: increase inline push constant limit if we can inline all constants
Rhys Perry [Thu, 29 Jul 2021 16:29:43 +0000 (17:29 +0100)]
radv: increase inline push constant limit if we can inline all constants

fossil-db (Sienna Cichlid):
Totals from 665 (0.49% of 134627) affected shaders:
CodeSize: 4519620 -> 4491724 (-0.62%); split: -0.62%, +0.01%
Instrs: 842745 -> 837313 (-0.64%); split: -0.66%, +0.01%
Latency: 7289925 -> 7279661 (-0.14%); split: -0.30%, +0.16%
InvThroughput: 1240770 -> 1240639 (-0.01%); split: -0.01%, +0.00%
VClause: 15799 -> 15772 (-0.17%)
SClause: 33773 -> 32604 (-3.46%); split: -3.66%, +0.20%
Copies: 67695 -> 64992 (-3.99%); split: -4.49%, +0.50%
PreSGPRs: 38597 -> 38640 (+0.11%); split: -0.14%, +0.25%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145>

2 years agoradv,aco: implement 64-bit inline push constants
Rhys Perry [Fri, 30 Jul 2021 17:08:16 +0000 (18:08 +0100)]
radv,aco: implement 64-bit inline push constants

fossil-db (Sienna Cichlid):
Totals from 21 (0.02% of 134621) affected shaders:
CodeSize: 1932 -> 1560 (-19.25%)
Instrs: 357 -> 303 (-15.13%)
Latency: 6576 -> 5883 (-10.54%)
InvThroughput: 26304 -> 23532 (-10.54%)
SClause: 42 -> 24 (-42.86%)
Copies: 90 -> 105 (+16.67%); split: -10.00%, +26.67%
PreSGPRs: 144 -> 201 (+39.58%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145>

2 years agoradv: allow holes in inline push constants
Rhys Perry [Thu, 29 Jul 2021 15:47:44 +0000 (16:47 +0100)]
radv: allow holes in inline push constants

Use a dword mask instead of a range to track which push constants to
inline.

fossil-db (Sienna Cichlid):
Totals from 5724 (4.25% of 134621) affected shaders:
CodeSize: 20894044 -> 20815748 (-0.37%); split: -0.39%, +0.02%
Instrs: 4002568 -> 3988385 (-0.35%); split: -0.38%, +0.02%
Latency: 29285060 -> 29224414 (-0.21%); split: -0.22%, +0.01%
InvThroughput: 5529700 -> 5526893 (-0.05%); split: -0.05%, +0.00%
VClause: 78093 -> 78240 (+0.19%); split: -0.23%, +0.41%
SClause: 135495 -> 131027 (-3.30%); split: -3.30%, +0.00%
Copies: 330856 -> 324552 (-1.91%); split: -2.37%, +0.46%
PreSGPRs: 226031 -> 224778 (-0.55%); split: -0.61%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145>

2 years agoradv: allow inline push constants in more situations
Rhys Perry [Fri, 17 Dec 2021 19:09:46 +0000 (19:09 +0000)]
radv: allow inline push constants in more situations

We don't need to disable this path if there are indirect or 8/16/64-bit
push constant loads. We can just use the default path for them.

fossil-db (Sienna Cichlid):
Totals from 21 (0.02% of 134621) affected shaders:
CodeSize: 2028 -> 1884 (-7.10%)
Instrs: 366 -> 363 (-0.82%); split: -2.46%, +1.64%
Latency: 6630 -> 6579 (-0.77%)
InvThroughput: 26520 -> 26316 (-0.77%)
Copies: 84 -> 102 (+21.43%)
PreSGPRs: 141 -> 222 (+57.45%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12145>

2 years agointel/fs: Enable b2f(inot(a)) and b2i(inot(a)) optimization for Gfx12+
Mykhailo Skorokhodov [Thu, 2 Dec 2021 08:33:05 +0000 (10:33 +0200)]
intel/fs: Enable b2f(inot(a)) and b2i(inot(a)) optimization for Gfx12+

The commit enables the optimization for Intel Gfx12+ graphics.

Tigerlake
```
total instructions in shared programs: 1289326 -> 1289015 (-0.02%)
instructions in affected programs: 37841 -> 37530 (-0.82%)
helped: 78
HURT: 9
helped stats (abs) min: 1 max: 26 x̄: 4.69 x̃: 3
helped stats (rel) min: 0.10% max: 12.50% x̄: 2.07% x̃: 1.21%
HURT stats (abs)   min: 1 max: 18 x̄: 6.11 x̃: 4
HURT stats (rel)   min: 0.16% max: 1.95% x̄: 0.94% x̃: 0.61%
95% mean confidence interval for instructions value: -4.95 -2.20
95% mean confidence interval for instructions %-change: -2.34% -1.18%
Instructions are helped.

total cycles in shared programs: 105606388 -> 105606442 (<.01%)
cycles in affected programs: 620119 -> 620173 (<.01%)
helped: 49
HURT: 28
helped stats (abs) min: 2 max: 3618 x̄: 228.63 x̃: 12
helped stats (rel) min: 0.02% max: 23.31% x̄: 4.60% x̃: 1.11%
HURT stats (abs)   min: 1 max: 2142 x̄: 402.04 x̃: 29
HURT stats (rel)   min: 0.01% max: 36.42% x̄: 5.01% x̃: 0.46%
95% mean confidence interval for cycles value: -151.80 153.20
95% mean confidence interval for cycles %-change: -3.00% 0.79%
Inconclusive result (value mean confidence interval includes 0).
```

Related-to: https://gitlab.freedesktop.org/mesa/mesa/-/commit/7725d609387a8165ccb71e2d9e0221d9248b1729
Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14017>

2 years agovirgl: Add an extra mov for int outputs from constant and immediate inputs
Gert Wollny [Tue, 12 Apr 2022 09:47:31 +0000 (11:47 +0200)]
virgl: Add an extra mov for int outputs from constant and immediate inputs

virglrenderer doesn't properly emit the conversion code when the source
is a integer value and the output is also integer.

Fixes on NTT:
  dEQP-GLES31.functional.shaders.sample_variables.sample_mask.inverse_per_*

v2: fix typo (Emma)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15836>

2 years agovirgl: Always make some extra temps available for transformations
Gert Wollny [Sat, 9 Apr 2022 09:54:29 +0000 (11:54 +0200)]
virgl: Always make some extra temps available for transformations

The host driver will optimize unused variables away, and checking thoroughly whether we
may need an extra temp is just uselessly costly.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15836>

2 years agovirgl: Propagate precice flag through moves
Gert Wollny [Sat, 9 Apr 2022 07:49:25 +0000 (09:49 +0200)]
virgl: Propagate precice flag through moves

NIR doesn't propagate precise through moves, and with NTT the
last output is usually preceded by a move, so that we no longer
see that the evaluation of some value is supposed to be exact,
and, hence we can't decorate the outputs accordingly.

Fixes with NTT:
 dEQP-GLES31.functional.tessellation.common_edge.
     triangles_equal_spacing_precise
     triangles_fractional_odd_spacing_precise
     triangles_fractional_even_spacing_precise
     quads_equal_spacing_precise
     quads_fractional_odd_spacing_precise
     quads_fractional_even_spacing_precise

v2: Don't clear the precise flag when we hit a mov, because we may
    hit a if/else construct like below and we don't track branches

    IF X
       TEMP[0] = OP_PRECICE ...
    ELSE
       TEMP[0] = MOV CONST[]
    ENDIF

    Thanks Emma for pointing out the problem.

v2: allocate precise handling flags to transform_prolog (Emma)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15836>

2 years agoci: add Broadcom CI maintainer
Juan A. Suarez Romero [Mon, 11 Apr 2022 14:51:29 +0000 (16:51 +0200)]
ci: add Broadcom CI maintainer

Include in the CODEOWNERS file who to ping in case of issues with the
Broadcom (V3D/V3DV/VC4) CI.

v2:
 - Add Chema (Chema)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15858>

2 years agoCODEOWNERS: add Broadcom maintainers
Juan A. Suarez Romero [Mon, 11 Apr 2022 14:50:29 +0000 (16:50 +0200)]
CODEOWNERS: add Broadcom maintainers

v2:
 - Add more maintainers (Iago)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15858>

2 years agor600: Only emit the NOP group triggered by dest.rel after a full group
Gert Wollny [Mon, 11 Apr 2022 09:50:10 +0000 (11:50 +0200)]
r600: Only emit the NOP group triggered by dest.rel after a full group

In addition really fill all slots, because otherwise the alu-group merger
might move a read from the indirectly written register into the 't' slot.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15848>

2 years agodrm-shim: Implement a shim function for close
Icecream95 [Mon, 6 Sep 2021 07:54:32 +0000 (19:54 +1200)]
drm-shim: Implement a shim function for close

Remove the fd from the fd_map, so that if the fd is later reused for
another file then mmap won't be intercepted.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12203>

2 years agodrm-shim: Explicitly use off64_t for the offset to drm_shim_mmap
Icecream95 [Mon, 6 Sep 2021 07:44:23 +0000 (19:44 +1200)]
drm-shim: Explicitly use off64_t for the offset to drm_shim_mmap

drm_shim.c undefines the _FILE_OFFSET_BITS macro, so plain off_t might
be 32 bits, while it's 64 bits in device.c. To avoid this mismatch,
use off64_t which will always be 64 bits in both source files.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12203>

2 years agodrm-shim: Return fake render nodes in /dev/dri first
Icecream95 [Sun, 8 Aug 2021 01:20:06 +0000 (13:20 +1200)]
drm-shim: Return fake render nodes in /dev/dri first

loader_open_render_node returns the first device in /dev/dri that it
can use. To make sure the drm-shim device always gets chosen, return
the fake entries in readdir before returning the real ones.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12203>

2 years agodrm-shim: Add a function for mmap64 rather than using an alias
Icecream95 [Sun, 1 Aug 2021 11:11:41 +0000 (23:11 +1200)]
drm-shim: Add a function for mmap64 rather than using an alias

Fixes build on 32-bit systems.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12203>

2 years agonir: remove gl_PrimitiveID output from MS when it's not used in FS
Marcin Ślusarz [Fri, 11 Mar 2022 10:01:38 +0000 (11:01 +0100)]
nir: remove gl_PrimitiveID output from MS when it's not used in FS

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15340>

2 years agoanv: initialize 3DMESH_1D.ExtendedParameter0 when ExtendedParameter0Present
Marcin Ślusarz [Fri, 8 Apr 2022 08:58:33 +0000 (10:58 +0200)]
anv: initialize 3DMESH_1D.ExtendedParameter0 when ExtendedParameter0Present

When IndirectParameterEnable==true it's not actually used by the hardware,
but if it's not initialized and INTEL_DEBUG=bat is set, then Valgrind complains.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15850>

2 years agoanv: fix push constant lowering for task/mesh
Marcin Ślusarz [Tue, 5 Apr 2022 14:41:44 +0000 (16:41 +0200)]
anv: fix push constant lowering for task/mesh

Fixes: a6031cd9bd4 ("anv: fix push constant lowering with bindless shaders")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15850>

2 years agoglsl/st: use nir pass to lower indirect rather than GLSL IR
Timothy Arceri [Tue, 12 Apr 2022 01:23:31 +0000 (11:23 +1000)]
glsl/st: use nir pass to lower indirect rather than GLSL IR

Will allow us to drop more GLSL IR code in future once we switch
all drivers to NIR. Also stops the need for all drivers to call
this pass to remove indirect temps that may have been added during
the NIR varying linking lowering/optimisations.

This patch fixes some tests on i915, d3d12, lima and vc4.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15871>

2 years agoradv: add few helpers to deal with pipeline layout
Samuel Pitoiset [Mon, 11 Apr 2022 10:13:13 +0000 (12:13 +0200)]
radv: add few helpers to deal with pipeline layout

With VK_EXT_graphics_pipeline_library, we will have to support
independent sets and also to merge sets from different libraries.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15849>

2 years agoradv: remove unused radv_pipeline_layout::size field
Samuel Pitoiset [Mon, 11 Apr 2022 08:51:49 +0000 (10:51 +0200)]
radv: remove unused radv_pipeline_layout::size field

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15849>

2 years agoradv: drop the remaining uses of shader modules
Samuel Pitoiset [Fri, 8 Apr 2022 14:41:28 +0000 (16:41 +0200)]
radv: drop the remaining uses of shader modules

With VK_EXT_graphics_pipeline_library, shader modules can be NULL and
be passed via the pNext of VkPipelineShaderStageCreateInfo. To prepare
for this, just store everything we need to radv_pipeline_stage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15847>

2 years agoradv: store the shader sha1 to radv_pipeline_stage
Samuel Pitoiset [Fri, 8 Apr 2022 13:55:19 +0000 (15:55 +0200)]
radv: store the shader sha1 to radv_pipeline_stage

To remove use of shader modules completely in the next commit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15847>

2 years agoradv: replace convert_rt_stage() by vk_to_mesa_shader_stage()
Samuel Pitoiset [Fri, 8 Apr 2022 14:28:26 +0000 (16:28 +0200)]
radv: replace convert_rt_stage() by vk_to_mesa_shader_stage()

Mesa shader stages are correctly sorted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15847>

2 years agonine: check hardware support before using vertex texture
Pavel Ondračka [Mon, 11 Apr 2022 19:03:25 +0000 (21:03 +0200)]
nine: check hardware support before using vertex texture

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15864>

2 years agozink: create pipeline layout if only bindless descriptor set is used
Mike Blumenkrantz [Mon, 11 Apr 2022 19:07:50 +0000 (15:07 -0400)]
zink: create pipeline layout if only bindless descriptor set is used

bindless descriptors are descriptors too.

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15853>

2 years agozink: handle 0 ubos and 0 ssbos in pipeline layout
Mike Blumenkrantz [Mon, 11 Apr 2022 19:07:11 +0000 (15:07 -0400)]
zink: handle 0 ubos and 0 ssbos in pipeline layout

this is the number of types needed, and it can be zero

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15853>

2 years agozink: prune unused st-injected pointsize exports
Mike Blumenkrantz [Fri, 8 Apr 2022 13:18:17 +0000 (09:18 -0400)]
zink: prune unused st-injected pointsize exports

only the last vertex stage needs to keep these, so prune any that aren't
being weirdly passed through

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15853>

2 years agozink: try copy region first for non-resolve blits
Mike Blumenkrantz [Tue, 1 Mar 2022 16:59:48 +0000 (11:59 -0500)]
zink: try copy region first for non-resolve blits

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15853>

2 years agozink: refactor copy_region path in zink_blit to util function
Mike Blumenkrantz [Wed, 30 Mar 2022 20:48:22 +0000 (16:48 -0400)]
zink: refactor copy_region path in zink_blit to util function

no functional changes

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15853>

2 years agodraw: handle tess eval shader when getting num outputs
Dave Airlie [Tue, 12 Apr 2022 03:03:44 +0000 (13:03 +1000)]
draw: handle tess eval shader when getting num outputs

This tripped up some pointsize/prim id interactions with zink.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15872>

2 years agoturnip: Move autotune buffers to suballoc.
Emma Anholt [Fri, 18 Mar 2022 17:31:12 +0000 (10:31 -0700)]
turnip: Move autotune buffers to suballoc.

Now the ANGLE trex_200 trace replay does a single BO allocation at startup
for autotune results instead of one per frame (~350 for the whole replay).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agoturnip: Get autotune off of ralloc destructors.
Emma Anholt [Fri, 18 Mar 2022 17:58:38 +0000 (10:58 -0700)]
turnip: Get autotune off of ralloc destructors.

We've wanted to remove destructors from ralloc's API for a long time (it's
an extra storage cost per ralloc for a rarely-used feature), and for the
suballoc change we'd need to spend more storage on storing the tu_device
pointer per result since destructors don't get anything else but the
pointer passed into them.

Fixes use-after-frees:

=================================================================
==2383==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff88fe1940 at pc 0xffff934f427c bp 0xfffff5481e90 sp 0xfffff5481ea8
WRITE of size 8 at 0xffff88fe1940 thread T0
    #0 0xffff934f4278 in list_del ../src/util/list.h:108
    #1 0xffff934f4278 in result_destructor ../src/freedreno/vulkan/tu_autotune.c:237
    #2 0xffff9377793c in unsafe_free ../src/util/ralloc.c:300
    #3 0xffff9377793c in ralloc_free ../src/util/ralloc.c:265
    #4 0xffff934f4368 in history_destructor ../src/freedreno/vulkan/tu_autotune.c:229
    #5 0xffff9377793c in unsafe_free ../src/util/ralloc.c:300
    #6 0xffff9377793c in ralloc_free ../src/util/ralloc.c:265
    #7 0xffff934f5990 in tu_autotune_on_submit ../src/freedreno/vulkan/tu_autotune.c:442
[...]

0xffff88fe1940 is located 80 bytes inside of 112-byte region [0xffff88fe18f0,0xffff88fe1960)
freed by thread T0 here:
    #0 0xffff9c1c90d8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0xffff934f4368 in history_destructor ../src/freedreno/vulkan/tu_autotune.c:229
    #2 0xffff9377793c in unsafe_free ../src/util/ralloc.c:300
    #3 0xffff9377793c in ralloc_free ../src/util/ralloc.c:265
    #4 0xffff934f5990 in tu_autotune_on_submit ../src/freedreno/vulkan/tu_autotune.c:442
    #5 0xffff935cf2ac in tu_queue_submit_locked ../src/freedreno/vulkan/tu_drm.c:997
[...]

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agoturnip: Reduce the pipeline's CS allocation a bit.
Emma Anholt [Wed, 23 Feb 2022 22:45:59 +0000 (14:45 -0800)]
turnip: Reduce the pipeline's CS allocation a bit.

We don't return unused space to the suballocator, so it's a little useful
to limit how much we overallocate to reduce memory footprint.  I took a
look through the tu_cs_emit_array() calls and accounted for a couple of
them in the variant-specific space calculation, then dropped the base
allocation by factors of 2 until we started throwing asserts.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agoturnip: Skip telling the kernel the BO list when we don't need any.
Emma Anholt [Thu, 17 Feb 2022 21:24:20 +0000 (13:24 -0800)]
turnip: Skip telling the kernel the BO list when we don't need any.

In fencing, we sometimes do a dummy submit with no nr_cmds.  If we don't
have commands to execute, we don't need to pin or fence any BOs either.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agoturnip: Sub-allocate pipelines out of a device-global BO pool.
Emma Anholt [Mon, 14 Feb 2022 22:45:01 +0000 (14:45 -0800)]
turnip: Sub-allocate pipelines out of a device-global BO pool.

Allocating a BO for each pipeline meant that for apps with many pipelines
(such as Asphalt9 under ANGLE), we would end up spending too much time in
the kernel tracking the BO references.

Looking at CS:Source on zink, before we had 85 BOs for the pipelines for a
total of 1036 kb, and now we have 7 BOs for a total of 896 kb.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agoturnip: Stop allocating unused pvtmem space in the pipeline CS.
Emma Anholt [Tue, 15 Feb 2022 21:52:15 +0000 (13:52 -0800)]
turnip: Stop allocating unused pvtmem space in the pipeline CS.

The pvtmem was split off to a separate read/write BO.

Fixes: 931ad19a1817 ("turnip: make cmdstream bo's read-only to GPU")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agoturnip: Track refcounts on BOs in kgsl as well.
Emma Anholt [Wed, 6 Apr 2022 20:05:40 +0000 (13:05 -0700)]
turnip: Track refcounts on BOs in kgsl as well.

I'm going to be using the BO refcount for the pipeline and autotune buffer
suballocation.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15038>

2 years agointel/perf: Fix OA report accumulation on Gfx12+.
Francisco Jerez [Mon, 4 Apr 2022 22:45:54 +0000 (15:45 -0700)]
intel/perf: Fix OA report accumulation on Gfx12+.

The intel_perf_query path used for performance queries on GL was
passing a bogus "end" pointer to intel_perf_query_result_accumulate(),
causing it to accumulate garbage values.  This was causing the values
of many performance counters to be corrupted.

The "end" pointer was incorrect because the current code was assuming
that different OA reports were located TOTAL_QUERY_DATA_SIZE bytes
apart, which is a hard-coded preprocessor define.  However recent
(Gfx12+) hardware generations use a variable query size determined by
the query layout.  Use the size derived from it instead, and remove
the stale define.

Fixes: 3c513250255d6a ("intel/perf: switch query code to use query layout")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15783>