platform/upstream/mesa.git
Boris Brezillon [Mon, 12 Oct 2020 09:07:45 +0000 (11:07 +0200)]
pan/bi: Get rid of the regs argument in bi_assign_fau_idx()

Regs are already part of the bundle struct, let's just pass a pointer
to this bundle object instead of passing both the bundle and regs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>

Boris Brezillon [Mon, 12 Oct 2020 08:57:40 +0000 (10:57 +0200)]
pan/bi: Use canonical name for FAU RAM sources

The uniform_constant field and BIFROST_SRC_CONST_{LO,HI} definitions
seem to imply that those only deal with embedded constants. Let's
rename them to reflect the fact that they actually encode accesses to
the Fast-Access-Uniform RAM.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>

Boris Brezillon [Mon, 12 Oct 2020 13:00:02 +0000 (15:00 +0200)]
pan/bi: Copy blend shader info from compile_inputs

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>

Boris Brezillon [Mon, 12 Oct 2020 12:56:45 +0000 (14:56 +0200)]
panfrost: Extend compile_inputs to pass a blend descriptor

This is needed for BLEND instructions used from a blend shader so we can
store the result of the shader-based blending back to the tile buffer.
We let the gallium driver build this blend descriptor for us in order
to keep the compiler cmdstream-agnostic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>

Boris Brezillon [Fri, 9 Oct 2020 12:00:28 +0000 (14:00 +0200)]
panfrost: Fix fixed-function blend on bifrost

The conversion from a 32b float to a 16b fixed-point number was wrong.

Fixes: 8389976b7c09 ("panfrost: XML-ify the blend descriptors")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7151>

Iago Toral Quiroga [Thu, 6 Aug 2020 12:14:17 +0000 (14:14 +0200)]
v3d/compiler: implement load interpolated input intrinsics

We will lower GLSL interpolateAt functions to these.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7155>

Iago Toral Quiroga [Wed, 5 Aug 2020 08:53:59 +0000 (10:53 +0200)]
broadcom/compiler: track partially interpolated fragment inputs

We will need these to implement GLSL's interpolateAt*() functions where
we are required to perform interpolation in the shader at arbitrary
offsets.

Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7155>

Kenneth Graunke [Wed, 14 Oct 2020 21:56:19 +0000 (14:56 -0700)]
iris: Fix doubling of shared local memory (SLM) sizes.

Commit 67ee9c5f5537fe85357556a4322a07253d13a697 added support for
using the `pipe_compute_state::req_local_mem` field, because Clover
can have a run-time specified size that isn't baked into the shaders.

However, it started adding the static size from the shader to the
dynamic state-supplied size.  The Mesa state tracker fills out
req_local_mem to prog->Base.info.cs.shared_size, which is exactly
what we fill out prog_data->total_shared to be.  Effectively, this
meant that we double-counted the same SLM requirements, doubling
our space requirements.
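
As a rough illustration of the double count described above, here is a
sketch only. The struct and function names are hypothetical stand-ins
rather than the real iris types, and taking the larger of the two sizes
is an assumed remedy, not necessarily what this commit does.

```c
#include <stdint.h>

/* Hypothetical stand-in for the state-supplied compute state. */
struct sketch_compute_state {
   uint32_t req_local_mem;   /* size supplied at run time by the frontend */
};

/* Hypothetical stand-in for the compiler's program data. */
struct sketch_prog_data {
   uint32_t total_shared;    /* static size recorded when compiling */
};

/* Behaviour described as buggy: both sizes are added together, so a
 * frontend that reports the same size in both places is counted twice. */
static uint32_t
sketch_slm_size_summed(const struct sketch_compute_state *cs,
                       const struct sketch_prog_data *pd)
{
   return pd->total_shared + cs->req_local_mem;
}

/* Assumed alternative: take the larger value, so equal sizes are counted
 * once while a larger run-time request is still honoured. */
static uint32_t
sketch_slm_size_max(const struct sketch_compute_state *cs,
                    const struct sketch_prog_data *pd)
{
   return pd->total_shared > cs->req_local_mem ? pd->total_shared
                                               : cs->req_local_mem;
}
```

With both sizes set to 1024 bytes, the summed version asks for 2048
bytes while the alternative asks for 1024.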

Fixes a 10% performance regression in Synmark2's OglCSDof test.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>

Kenneth Graunke [Wed, 14 Oct 2020 21:52:37 +0000 (14:52 -0700)]
intel/compiler, anv: Delete cs_prog_data->slm_size

cs_prog_data->slm_size is basically redundant with
prog_data->total_shared, which is the field that we actually use for
controlling the shared local memory size in all drivers.  We were
still using it in one place for VK_EXT_pipeline_executable_properties,
but we should just fix that and delete the field.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7152>

Arcady Goldmints-Orlov [Mon, 28 Sep 2020 06:38:34 +0000 (01:38 -0500)]
broadcom/compiler: use nir io semantics

This allows us to clean up some code.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6721>

Alejandro Piñeiro [Mon, 12 Oct 2020 23:24:29 +0000 (01:24 +0200)]
nir/lower_io_to_scalar: update io semantics on per-component inst

When we replace the original instruction with per-channel operations,
the new instruction should inherit the semantics of the original
instruction.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6721>

Arcady Goldmints-Orlov [Sat, 25 Jul 2020 15:50:01 +0000 (10:50 -0500)]
broadcom/compiler: support varyings with struct types

This adds support for using structs as outputs from vertex shaders and
inputs to fragment shaders.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6721>

Eric Engestrom [Sun, 6 Sep 2020 07:12:43 +0000 (09:12 +0200)]
docs/release-calendar: plan 20.3 release

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6616>

Jason Ekstrand [Wed, 2 Sep 2020 20:24:01 +0000 (15:24 -0500)]
intel/fs: Allow constant-propagation into SAMPLEINFO and IMAGE_SIZE

Without this, we end up with indirect sampler messages all the time
because we don't propagate the texture/image BTI.  This makes debugging
shaders with imageSize or textureSamples in them a pain.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19720612 -> 19720564 (<.01%)
    instructions in affected programs: 4998 -> 4950 (-0.96%)
    helped: 12
    HURT: 0

All affected shaders were compute shaders in Deus Ex: Mankind Divided.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6794>

Eric Engestrom [Wed, 14 Oct 2020 17:52:14 +0000 (19:52 +0200)]
docs: update calendar and link release notes for 20.1.10

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7147>

Eric Engestrom [Wed, 14 Oct 2020 17:35:41 +0000 (19:35 +0200)]
docs: add release notes for 20.1.10

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7147>

Nanley Chery [Fri, 9 Oct 2020 18:25:53 +0000 (11:25 -0700)]
isl: Allow CCS for 8bpp surfaces with 3+ miplevels

I can't find a restriction for enabling CCS on these surfaces in recent
versions of the Bspec. Since I didn't cite my source, I'm not even sure
such a restriction existed in the first place.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7085>

Nanley Chery [Fri, 9 Oct 2020 17:07:51 +0000 (10:07 -0700)]
iris: Add fast-clear restriction for 8bpp surfaces

For 8bpp surfaces on TGL, prevent LOD1+ from being fast-cleared. This
will be relevant once ISL starts allowing CCS for 8bpp surfaces with
more than 2 miplevels. I verified the problem behind this restriction
with a modified version of the fbo-clearmipmap piglit test.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7085>

Dylan Baker [Wed, 14 Oct 2020 17:42:27 +0000 (10:42 -0700)]
docs: update calendar and link release notes for 20.2.1

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7141>

Dylan Baker [Wed, 14 Oct 2020 17:33:42 +0000 (10:33 -0700)]
docs: add SHA256 sums for 20.2.1

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7141>

Dylan Baker [Wed, 14 Oct 2020 16:46:48 +0000 (09:46 -0700)]
docs: add release notes for 20.2.1

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7141>

Samuel Pitoiset [Tue, 13 Oct 2020 14:28:34 +0000 (16:28 +0200)]
radv: fix optimizing needed states if some are marked as dynamic

From the Vulkan spec 1.2.157:

    "VK_DYNAMIC_STATE_STENCIL_TEST_ENABLE_EXT specifies that the
     stencilTestEnable state in VkPipelineDepthStencilStateCreateInfo
     will be ignored and must be set dynamically with
     vkCmdSetStencilTestEnableEXT before any draw call."

So, stencilTestEnable should be ignored if dynamic. While we are
at it, fix depthBoundsTestEnable too.
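
The rule quoted above can be sketched as follows. This is an
illustration under stated assumptions, not the radv code: the flag
names, the reduced struct and both helpers are hypothetical. The idea
is that a statically disabled test may only be used to skip related
state when the enable itself is not dynamic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit flags for states the application marked as dynamic. */
#define SKETCH_DYNAMIC_STENCIL_TEST_ENABLE      (1u << 0)
#define SKETCH_DYNAMIC_DEPTH_BOUNDS_TEST_ENABLE (1u << 1)

/* Hypothetical, reduced stand-in for VkPipelineDepthStencilStateCreateInfo. */
struct sketch_depth_stencil_info {
   bool stencil_test_enable;
   bool depth_bounds_test_enable;
};

/* Returns true if the stencil test could be enabled at draw time. When the
 * enable is dynamic, the static value is ignored and we must assume "yes". */
static bool
sketch_stencil_test_possible(const struct sketch_depth_stencil_info *info,
                             uint32_t dynamic_mask)
{
   if (dynamic_mask & SKETCH_DYNAMIC_STENCIL_TEST_ENABLE)
      return true;
   return info->stencil_test_enable;
}

/* Same reasoning for the depth bounds test. */
static bool
sketch_depth_bounds_test_possible(const struct sketch_depth_stencil_info *info,
                                  uint32_t dynamic_mask)
{
   if (dynamic_mask & SKETCH_DYNAMIC_DEPTH_BOUNDS_TEST_ENABLE)
      return true;
   return info->depth_bounds_test_enable;
}
```

A caller would drop stencil-related states from the needed set only
when sketch_stencil_test_possible() returns false.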

Cc: 20.2
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3633
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7112>

Eric Anholt [Sat, 12 Sep 2020 16:16:59 +0000 (09:16 -0700)]
docs: Document how to replicate a CI build locally.

Who hasn't needed to do this at some point?  Turns out it's not too hard
to do, and was useful for me in iterating on the Android build.

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>

Eric Anholt [Wed, 9 Sep 2020 23:37:54 +0000 (16:37 -0700)]
ci/android: Switch to using the Android NDK.

To support Android drivers, we're going to want to be tracking that Mesa's
build succeeds on a real android toolchain.  This still uses the android
stubs since these libs aren't in the NDK.

Note that I had to drop the Intel and AMD drivers currently: we don't have
LLVM cross-compiled for Android in this container, and I'm honestly hoping
ACO saves us from that.  Intel has dependencies on libexpat, which AOSP
really doesn't want to bring in, and it looks to me like those dependencies
could be optional.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>

Eric Anholt [Tue, 6 Oct 2020 16:17:32 +0000 (09:17 -0700)]
symbols-check: Add __cxa_guard_* to the list of approved symbols.

These are introduced by the compiler during static local initialization in
C++ for thread safety.  This seems to end up being public in the driver
with --static-libc++ on Android.

Reviewed-by: <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>

Eric Anholt [Sat, 12 Sep 2020 15:53:52 +0000 (08:53 -0700)]
glsl/tests: Make the tests skip on Android binary execution failures.

We don't have a suitable exe wrapper for running them, and the missing
linker is throwing return code 255 instead of an ENOEXEC.  Catch it and
return skip from the tests.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>

Eric Anholt [Wed, 9 Sep 2020 23:43:02 +0000 (16:43 -0700)]
meson: Drop adding -Wl,--gc-sections to project c/cpp arguments.

We already have the targets we care about doing this using
ld_args_gc_sections, and by adding it to project arguments we caused
warning spam in the Android clang build about the compile stage not using
the argument.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700>

Tony Wasserka [Tue, 13 Oct 2020 13:59:08 +0000 (15:59 +0200)]
aco/isel: Remove now unused VS-related code from create_null_export

Also replaced a hardcoded constant with the appropriate register macro.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>

Tony Wasserka [Mon, 12 Oct 2020 18:58:21 +0000 (20:58 +0200)]
aco/isel: Remove some dead code

exported_pos was always initialized to true (due to the is_pos argument
of the first export_vs_varying call being true), so none of this code has
any effect.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>

Tony Wasserka [Mon, 12 Oct 2020 17:05:14 +0000 (19:05 +0200)]
aco/isel: Always export position data from VS/NGG

AMD ISA docs explicitly require this for VS, and this likely extends to
NGG too.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3615
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>

Daniel Schürmann [Mon, 21 Sep 2020 17:35:52 +0000 (18:35 +0100)]
aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible

This patch also does a slight rework of export_fs_mrt_color()
to avoid setting of enabled channels which are not used.

Totals from 52404 (38.38% of 136546) affected shaders (NAVI):
SGPRs: 3097443 -> 3097435 (-0.00%)
CodeSize: 189151600 -> 188546200 (-0.32%)
Instrs: 36445061 -> 36445104 (+0.00%); split: -0.00%, +0.00%
Cycles: 1739388020 -> 1739388192 (+0.00%); split: -0.00%, +0.00%
VMEM: 21071501 -> 21071665 (+0.00%); split: +0.00%, -0.00%
SMEM: 3470983 -> 3470982 (-0.00%); split: +0.00%, -0.00%
PreSGPRs: 2058965 -> 2058962 (-0.00%)
PreVGPRs: 1860294 -> 1860295 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 17:02:08 +0000 (18:02 +0100)]
aco: use VOP2 version of v_cvt_pkrtz_f16_f32 on GFX_6_7_10

Totals from 767 (0.56% of 136546) affected shaders (NAVI):
CodeSize: 2862208 -> 2850036 (-0.43%)
Instrs: 561572 -> 561574 (+0.00%)
Cycles: 6455420 -> 6455428 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 16:48:36 +0000 (17:48 +0100)]
radv,aco: lower_pack_half_2x16

This patch also optimizes pack_half_2x16(a, 0.0).

Totals from 1949 (1.43% of 136546) affected shaders (RAVEN):
SGPRs: 83376 -> 83336 (-0.05%)
CodeSize: 3532144 -> 3512352 (-0.56%)
Instrs: 660746 -> 660682 (-0.01%); split: -0.01%, +0.00%
Cycles: 6780716 -> 6780472 (-0.00%); split: -0.00%, +0.00%
VMEM: 990886 -> 990883 (-0.00%); split: +0.00%, -0.00%
SMEM: 150506 -> 150538 (+0.02%); split: +0.05%, -0.03%
SClause: 30595 -> 30594 (-0.00%); split: -0.01%, +0.00%
Copies: 40801 -> 40729 (-0.18%)
PreSGPRs: 52335 -> 52341 (+0.01%); split: -0.03%, +0.04%
PreVGPRs: 45104 -> 45097 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 16:34:37 +0000 (17:34 +0100)]
aco: use v_cvt_pkrtz_f16_f32 for pack_half_2x16

Apparently, we forgot to remove some debug code.
This patch also fixes the round mode check to consider
the destination bit width.

Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
SGPRs: 100848 -> 100280 (-0.56%)
VGPRs: 68536 -> 66044 (-3.64%); split: -3.68%, +0.05%
CodeSize: 4882296 -> 4837220 (-0.92%); split: -0.94%, +0.01%
MaxWaves: 18990 -> 19019 (+0.15%); split: +0.19%, -0.04%
Instrs: 938150 -> 930388 (-0.83%); split: -0.83%, +0.00%
Cycles: 8699824 -> 8667648 (-0.37%); split: -0.38%, +0.01%
VMEM: 1144502 -> 1059680 (-7.41%); split: +0.06%, -7.48%
SMEM: 170076 -> 167999 (-1.22%); split: +0.22%, -1.44%
VClause: 18428 -> 18422 (-0.03%)
SClause: 41375 -> 41353 (-0.05%); split: -0.06%, +0.00%
Copies: 60008 -> 60054 (+0.08%); split: -0.31%, +0.39%
PreVGPRs: 56163 -> 56142 (-0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 14:55:54 +0000 (15:55 +0100)]
aco: add validation rules for p_split_vector

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 12:48:52 +0000 (13:48 +0100)]
aco: use p_split_vector for nir_op_unpack_half_*

This enables the use of SDWA if possible

Totals from 9933 (7.27% of 136546) affected shaders (RAVEN):
VGPRs: 731764 -> 731772 (+0.00%); split: -0.00%, +0.00%
CodeSize: 90944852 -> 90671472 (-0.30%); split: -0.30%, +0.00%
Instrs: 17881885 -> 17867831 (-0.08%); split: -0.08%, +0.00%
Cycles: 1597904072 -> 1597771260 (-0.01%); split: -0.01%, +0.00%
VMEM: 1702328 -> 1697383 (-0.29%); split: +0.13%, -0.42%
SMEM: 659583 -> 659049 (-0.08%); split: +0.01%, -0.09%
VClause: 318024 -> 318025 (+0.00%); split: -0.00%, +0.00%
SClause: 631670 -> 631707 (+0.01%); split: -0.01%, +0.01%
Copies: 1504107 -> 1504626 (+0.03%); split: -0.01%, +0.04%
PreVGPRs: 683153 -> 683180 (+0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 12:50:25 +0000 (13:50 +0100)]
nir/opt_algebraic: optimize unpack_half_2x16_split_x(ushr, a, 16)

Same as extract_u16(a, 1)
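
The equivalence relied on here can be checked on plain integers. The
sketch below models only the integer part (the low 16 bits after the
shift versus the upper 16-bit word); the helper names are hypothetical
and it does not model the half-float conversion or NIR itself.

```c
#include <stdint.h>

/* Hypothetical model of extract_u16(a, index): the index-th 16-bit word of
 * a 32-bit value. Only index 0 or 1 is meaningful here. */
static uint16_t
sketch_extract_u16(uint32_t a, unsigned index)
{
   return (uint16_t)(a >> (16u * index));
}

/* The 16 bits that a "split_x"-style unpack would read after a logical
 * right shift by 16: the low half of (a >> 16). */
static uint16_t
sketch_low_half_after_ushr16(uint32_t a)
{
   return (uint16_t)((a >> 16) & 0xffffu);
}
```

For any 32-bit input both helpers return the same word, which is why
the shift can be folded away.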

Totals from 2021 (1.48% of 136546) affected shaders (RAVEN):
VGPRs: 129516 -> 129524 (+0.01%); split: -0.00%, +0.01%
CodeSize: 12485704 -> 12486600 (+0.01%); split: -0.00%, +0.01%
Instrs: 2435041 -> 2434999 (-0.00%); split: -0.00%, +0.00%
Cycles: 20952552 -> 20952624 (+0.00%); split: -0.00%, +0.00%
VMEM: 374492 -> 374212 (-0.07%); split: +0.01%, -0.08%
SMEM: 123309 -> 123291 (-0.01%); split: +0.00%, -0.02%
VClause: 64156 -> 64164 (+0.01%)
Copies: 191620 -> 191616 (-0.00%); split: -0.03%, +0.03%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Thu, 17 Sep 2020 23:02:21 +0000 (00:02 +0100)]
aco: use p_create_vector for nir_op_pack_half_2x16

This enables the use of SDWA if possible

Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
VGPRs: 68508 -> 68516 (+0.01%)
CodeSize: 4897024 -> 4881068 (-0.33%); split: -0.33%, +0.00%
MaxWaves: 18992 -> 18990 (-0.01%)
Instrs: 946942 -> 939161 (-0.82%); split: -0.82%, +0.00%
Cycles: 8737668 -> 8705704 (-0.37%); split: -0.37%, +0.00%
VMEM: 1155362 -> 1145245 (-0.88%); split: +0.00%, -0.88%
SMEM: 170435 -> 170165 (-0.16%); split: +0.01%, -0.16%
VClause: 18426 -> 18425 (-0.01%)
SClause: 41376 -> 41375 (-0.00%)
Copies: 59813 -> 59787 (-0.04%); split: -0.15%, +0.10%
PreVGPRs: 56126 -> 56136 (+0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Fri, 18 Sep 2020 10:52:35 +0000 (11:52 +0100)]
aco: expand create_vector more carefully w.r.t. subdword operands

No pipelinedb changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Daniel Schürmann [Thu, 17 Sep 2020 23:00:38 +0000 (00:00 +0100)]
aco: propagate SGPRs into VOP1 instructions early.

This helps DCE. We should reconsider our optimization order
or maybe do the dead code analysis twice

Totals from 106 (0.08% of 136546) affected shaders (RAVEN):
SGPRs: 7184 -> 7152 (-0.45%)
CodeSize: 736912 -> 736052 (-0.12%)
Instrs: 145739 -> 145509 (-0.16%)
Cycles: 2085344 -> 2084268 (-0.05%)
VMEM: 14819 -> 14807 (-0.08%)
SMEM: 7109 -> 7100 (-0.13%); split: +0.04%, -0.17%
SClause: 5383 -> 5385 (+0.04%)
Copies: 13290 -> 13189 (-0.76%)
PreSGPRs: 5265 -> 5221 (-0.84%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>

Mike Blumenkrantz [Tue, 7 Jul 2020 18:28:31 +0000 (14:28 -0400)]
zink: unify code for emitting named uint-based variable instructions

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7130>

Samuel Pitoiset [Thu, 8 Oct 2020 11:54:18 +0000 (13:54 +0200)]
aco: adjust an assertion about the wavesize in emit_gfx10_wave64_bpermute()

This gets rid of one more use of radv_shader_info.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>

Samuel Pitoiset [Thu, 8 Oct 2020 11:51:27 +0000 (13:51 +0200)]
aco: compute the CS workgroup size from the shader NIR info

cs.block_size is copied from cs.local_size during the shader info pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>

Samuel Pitoiset [Thu, 8 Oct 2020 11:14:21 +0000 (13:14 +0200)]
radv: move compiler statistics to ACO

They are really specific to ACO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>

Samuel Pitoiset [Thu, 8 Oct 2020 08:18:08 +0000 (10:18 +0200)]
aco: remove unused radv_shader.h includes

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>

Samuel Pitoiset [Thu, 8 Oct 2020 08:12:58 +0000 (10:12 +0200)]
aco: remove useless occurrences of radv_nir_compiler_options

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>

Samuel Pitoiset [Thu, 8 Oct 2020 08:11:48 +0000 (10:11 +0200)]
aco: remove stub lower_wqm() prototype

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>

Mike Blumenkrantz [Tue, 8 Sep 2020 19:08:20 +0000 (15:08 -0400)]
zink: export PIPE_CAP_MAX*_VARYINGS values

this is separate from PIPE_SHADER_CAP_MAX_OUTPUTS

fixes mesa/mesa#3105

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7110>

Erik Faye-Lund [Tue, 13 Oct 2020 16:09:05 +0000 (18:09 +0200)]
zink: add feature-documentation

This adds some documentation for the current feature-set in Zink,
explaining what extensions are currently needed for what functionality.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7116>

Mike Blumenkrantz [Tue, 30 Jun 2020 19:10:12 +0000 (15:10 -0400)]
zink: redo slot mapping again for the last time really I mean it

now that shader compiling is happening all at once, we can store the slot
map on zink_gfx_program directly and reserve it dynamically in order to
use up only the slots that are actually being used across all shader stages

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7100>

Mike Blumenkrantz [Fri, 3 Jul 2020 17:34:34 +0000 (13:34 -0400)]
zink: don't leak sampler view textures

by adding a batch reference for these textures during draw, we can successfully
destroy the resources without crashing

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>

Mike Blumenkrantz [Mon, 29 Jun 2020 18:28:27 +0000 (14:28 -0400)]
zink: explicitly flag fb attachments as being written to in render passes

we need to ensure that we're accurately setting this hint in order to avoid
synchronization issues when determining whether we can read from the buffer

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>

Mike Blumenkrantz [Mon, 29 Jun 2020 18:26:47 +0000 (14:26 -0400)]
zink: add more explicit fencing for transfer maps

we're using our (primitive) buffer r/w tracking here to ensure that our
src buffers are synchronized before we do any kind of read operation on them

this is pretty slow in some cases, but it fixes a bunch of piglit tests

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>

Mike Blumenkrantz [Fri, 26 Jun 2020 19:16:17 +0000 (15:16 -0400)]
zink: optimize transfer_map for resources with pending reads/writes

we don't need to stall here if we know that we're not about to have any io
conflicts in the buffer

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>

Mike Blumenkrantz [Mon, 15 Jun 2020 19:51:05 +0000 (15:51 -0400)]
zink: add a mechanism to track current resource usage in batches

this is really primitive, but it at least gives an idea of whether a
resource has been submitted for writing in a pending batch

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924>

Samuel Pitoiset [Mon, 12 Oct 2020 15:56:02 +0000 (17:56 +0200)]
radv: fix ignoring the vertex attribute stride if set as dynamic

The vertex attribute stride should be ignored, so make sure it's
initialized to zero if dynamic to avoid computing a wrong offset.

The fact that each element of pStrides must be greater than or equal
to the maximum extent of all vertex input attributes fetched saves us
one user SGPR for the dynamic stride.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3627
Cc: 20.2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7101>

James Park [Wed, 14 Oct 2020 04:48:25 +0000 (21:48 -0700)]
ac,amd/llvm,radv: Initialize structs with {0}

Necessary to compile with MSVC.
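
For context, a minimal sketch of the portable form. It assumes the
replaced form was an empty-brace initializer (`= {}`), which is not
stated in this message; empty braces are not valid ISO C before C23 and
are accepted by GCC and Clang only as an extension. The struct and
function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical struct used only for illustration. */
struct sketch_surface_info {
   uint32_t width;
   uint32_t height;
   const void *next;
};

static struct sketch_surface_info
sketch_make_surface_info(void)
{
   /* `{0}` initializes the first member explicitly and every remaining
    * member implicitly to zero, and is standard C. */
   struct sketch_surface_info info = {0};
   return info;
}
```

Every member of the returned struct is zero (or a null pointer), the
same result the empty-brace extension gives.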

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7123>

Samuel Pitoiset [Tue, 13 Oct 2020 12:13:15 +0000 (14:13 +0200)]
radv/aco: disable NGG GS support because it randomly hangs the GPU

Disable ACO NGG GS until the random GPU hangs are fixed
(one CTS run == one GPU hang here). No hangs so far after
5 full CTS runs with this disabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7108>

Rhys Perry [Tue, 13 Oct 2020 18:18:52 +0000 (19:18 +0100)]
nir/opt_uniform_atomics: remove useless returns

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7117>

James Park [Sat, 29 Aug 2020 18:35:29 +0000 (11:35 -0700)]
radv: Only close local_fd when valid

Necessary when drm_device is bypassed.
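
The guard described in the subject line usually looks something like
the sketch below. It assumes that "not valid" is represented by a
negative descriptor, and the helper name is hypothetical; the actual
radv change may be shaped differently.

```c
#include <unistd.h>

/* Closes *fd only if it refers to an open descriptor (assumed here to mean
 * a non-negative value), then marks it invalid so a second call is a no-op.
 * Returns 0 when there was nothing to close, else the result of close(). */
static int
sketch_close_if_valid(int *fd)
{
   if (*fd < 0)
      return 0;

   int ret = close(*fd);
   *fd = -1;
   return ret;
}
```

Calling it on a descriptor that was never opened (left at -1) does
nothing, which is the case that matters when no DRM device was opened.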

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>

James Park [Sat, 8 Aug 2020 02:57:04 +0000 (19:57 -0700)]
util: Hide timespec_passed on Windows

Windows doesn't have clockid_t.
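
A sketch of the kind of guard this implies: a helper whose signature
uses clockid_t is compiled only when not building for Windows. The
function below is a hypothetical stand-in written for illustration, not
the Mesa implementation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict C modes */
#endif

#include <stdbool.h>
#include <time.h>

#ifndef _WIN32
/* clockid_t and clock_gettime() are POSIX, so this helper is hidden on
 * Windows. Returns true once the given clock has reached the deadline. */
static bool
sketch_deadline_passed(clockid_t clock, const struct timespec *deadline)
{
   struct timespec now;
   clock_gettime(clock, &now);

   if (now.tv_sec != deadline->tv_sec)
      return now.tv_sec > deadline->tv_sec;
   return now.tv_nsec >= deadline->tv_nsec;
}
#endif
```

Callers on Windows would need a different time source; this sketch does
not suggest one.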

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>

James Park [Wed, 19 Aug 2020 06:20:57 +0000 (23:20 -0700)]
radv: Increased const usage

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>

James Park [Mon, 3 Aug 2020 19:12:30 +0000 (12:12 -0700)]
amd/addrlib: Fix warning list for msvc

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>

Jason Ekstrand [Thu, 8 Oct 2020 19:41:43 +0000 (14:41 -0500)]
intel/fs: Rework scratch handling on Gen9+

The current scratch mechanism uses an MRF hack where we reserve a few
GRF registers to treat like the MRF and we collect the data into that
MRF region before doing a scratch write.  We also use that region for
the header for scratch reads.

This commit changes things and gets rid of the MRF hack.  Instead, we
reserve a single register (which RA is free to pick) for the scratch
header and use split sends for scratch writes to avoid having to do
the copy.  This should provide RA with more freedom in the presence of
spilling as well as avoid some unnecessary data moves.  In future, the
new GEN9_SCRATCH_HEADER opcode gives us a place where we can do our own
per-thread scratch base address calculations rather than depending on
the scratch base address that gets pushed into g0.  Having an opcode for
this lets us do it once at the top of the shader rather than repeating
it at every read/write.

One other noticeable difference is the use of SHADER_OPCODE_SEND.  We
can get away with this thanks to the fact that we're now using a set to
track which instructions are generated by spills and don't rely on the
opcodes to find spill/fill instructions.  This allows us to avoid adding
more virtual opcodes and let the normal code paths handle things like
scoreboard dependencies between header setup and the SEND.  It also
means that post-RA scheduling may be able to space out the header setup
MOV and the SEND for better latency hiding.

Shader-db results on Skylake:

    total spills in shared programs: 12137 -> 10604 (-12.63%)
    spills in affected programs: 6685 -> 5152 (-22.93%)
    helped: 274
    HURT: 2

    total fills in shared programs: 13065 -> 11515 (-11.86%)
    fills in affected programs: 9007 -> 7457 (-17.21%)
    helped: 275
    HURT: 1

Shader-db results on Ice Lake:

    total spills in shared programs: 12482 -> 10953 (-12.25%)
    spills in affected programs: 6586 -> 5057 (-23.22%)
    helped: 275
    HURT: 0

    total fills in shared programs: 12819 -> 11234 (-12.36%)
    fills in affected programs: 7867 -> 6282 (-20.15%)
    helped: 274
    HURT: 0

Shader-db results on Tigerlake:

    total spills in shared programs: 11689 -> 10233 (-12.46%)
    spills in affected programs: 4740 -> 3284 (-30.72%)
    helped: 259
    HURT: 0

    total fills in shared programs: 10840 -> 9443 (-12.89%)
    fills in affected programs: 6244 -> 4847 (-22.37%)
    helped: 259
    HURT: 0

Fossil-db results on Ice Lake:

    Spills in all programs: 245249 -> 201633 (-17.8%)
    Fills in all programs: 366066 -> 314368 (-14.1%)
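
The percentages above are relative changes between the before and after
counts. As a small worked check (the function name is hypothetical):

```c
#include <assert.h>

/* Relative change in percent; a negative result means fewer spills/fills. */
static double pct_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

For the Skylake spill totals, pct_change(12137, 10604) evaluates to roughly
-12.63, matching the figure reported above.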

More practically, this seems to give about a 0.5-1% perf boost in
Witcher 3 (DXVK) and Shadow of the Tomb Raider (Vulkan native).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs/ra: Use a set to track added spill/fill instructions
Jason Ekstrand [Fri, 9 Oct 2020 09:27:35 +0000 (04:27 -0500)]
intel/fs/ra: Use a set to track added spill/fill instructions

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs/ra: Sanity-check our IP counts
Jason Ekstrand [Mon, 12 Oct 2020 20:07:25 +0000 (15:07 -0500)]
intel/fs/ra: Sanity-check our IP counts

Starting with e99081e76d4a, we don't re-construct liveness information
every time we spill a register.  Instead, we're very careful to track
which instructions are spill instructions and to exclude them from
the IP count so that we can continue to use the old liveness information
even though instructions have been added.  This commit adds an assert
that sanity-checks that we count the same number of instructions as our
liveness information is based on.
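
A rough sketch of the idea, not the actual register allocator code. It uses a
per-instruction flag instead of the set the series introduces, and the struct
and function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an instruction; not the real fs_inst. */
struct inst {
    bool is_spill;  /* true if the spiller added it after liveness was computed */
};

/* Count instruction IPs, skipping anything the spiller added. */
static int count_live_ips(const struct inst *insts, size_t n)
{
    int ip = 0;
    for (size_t i = 0; i < n; i++) {
        if (!insts[i].is_spill)
            ip++;
    }
    return ip;
}

/* Return the IP count after checking it against the number of
 * instructions the (stale) liveness information was computed for. */
static int checked_ip_count(const struct inst *insts, size_t n,
                            int liveness_num_ips)
{
    int ip = count_live_ips(insts, n);
    assert(ip == liveness_num_ips);
    return ip;
}
```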

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs/ra: Store the last non-spill VGRF node
Jason Ekstrand [Thu, 8 Oct 2020 20:51:13 +0000 (15:51 -0500)]
intel/fs/ra: Store the last non-spill VGRF node

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs/ra: Refactor handling of Gen7 scratch reads
Jason Ekstrand [Thu, 8 Oct 2020 19:32:30 +0000 (14:32 -0500)]
intel/fs/ra: Refactor handling of Gen7 scratch reads

The attempt at de-duplication with the gen7_read Boolean wasn't actually
saving us anything.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs/ra: Increment spill_offset as part of the emit_spill loop
Jason Ekstrand [Thu, 8 Oct 2020 19:26:57 +0000 (14:26 -0500)]
intel/fs/ra: Increment spill_offset as part of the emit_spill loop

This makes it consistent with our handling of src.offset and with our
handling of spill_offset in emit_unspill.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs: Add a SCRATCH_HEADER opcode
Jason Ekstrand [Fri, 9 Oct 2020 09:13:20 +0000 (04:13 -0500)]
intel/fs: Add a SCRATCH_HEADER opcode

This opcode is responsible for setting up the buffer base address and
per-thread scratch space fields of a scratch message header.  For the
most part, it's a copy of g0 but some messages need us to zero out g0.2
and the bottom bits of g0.5.
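
As a CPU-side illustration of that header setup (a sketch only; the real
opcode emits GPU instructions). It assumes an 8-dword header and that the
"bottom bits" of g0.5 are bits 9:0, leaving the base pointer in bits 31:10.
Both the mask width and the names are assumptions, not taken from this commit:

```c
#include <stdint.h>
#include <string.h>

#define HEADER_DWORDS 8

/* Assumption: the scratch base pointer occupies bits 31:10 of g0.5. */
#define G0_5_BASE_MASK 0xfffffc00u

/* Build a scratch message header from a copy of g0. */
static void build_scratch_header(uint32_t dst[HEADER_DWORDS],
                                 const uint32_t g0[HEADER_DWORDS])
{
    memcpy(dst, g0, HEADER_DWORDS * sizeof(uint32_t)); /* mostly a copy of g0 */
    dst[2] = 0;                                        /* zero out g0.2 */
    dst[5] &= G0_5_BASE_MASK;                          /* clear bottom bits of g0.5 */
}
```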

This may actually fix a bug when nir_load/store_scratch is used.  The
docs say that the DWORD scattered messages respect the per-thread
scratch size specified in gN.3[3:0] in the message header but we've been
leaving it zero.  This may mean that we've been ignoring any scratch
reads/writes from a load/store_scratch intrinsic above the 1KB mark.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/fs: Copy the PTSS from g0 for scratch reads/writes
Jason Ekstrand [Mon, 12 Oct 2020 21:15:02 +0000 (16:15 -0500)]
intel/fs: Copy the PTSS from g0 for scratch reads/writes

In theory, this fixes a bug where we were dropping the PTSS bound on the
floor.  The hardware docs claim that the A32 DWORD and BYTE scattered
read/write messages do a PTSS bounds check.   However, in practice, it
seems that the hardware ignores the bounds check so this doesn't
actually matter.  I verified this with the following couple of piglit
tests:

    https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/399

In practice, this prevents the next commit from making a subtle
behavioral change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago intel/batch_decoder: Don't claim vec4 vs/gs/tcs shaders on Gen11+
Jason Ekstrand [Mon, 12 Oct 2020 19:53:01 +0000 (14:53 -0500)]
intel/batch_decoder: Don't claim vec4 vs/gs/tcs shaders on Gen11+

Because we hard-coded the default to vec4, any platform that doesn't
have a "Dispatch Mode" field gets vec4 by default.  This includes Gen11+
where vec4 is no longer a thing.  Change the default so it works on
newer hardware.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084>

4 years ago v3dv/device: Support loader interface version 3.
Alejandro Piñeiro [Tue, 13 Oct 2020 21:06:35 +0000 (23:06 +0200)]
v3dv/device: Support loader interface version 3.

Port of 1e41d7f7b0855934744fe578ba4eae9209ee69f7:
"anv: Support loader interface version 3 (patch v2)"

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: fix buffer copies to compressed images on the blit path
Iago Toral Quiroga [Fri, 9 Oct 2020 10:06:06 +0000 (12:06 +0200)]
v3dv: fix buffer copies to compressed images on the blit path

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: drop a couple of obsolete comments
Iago Toral Quiroga [Thu, 8 Oct 2020 07:12:23 +0000 (09:12 +0200)]
v3dv: drop a couple of obsolete comments

We only expose a coherent memory heap, so invalidation and flushing
are always no-ops for us.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: limit blit framebuffer dimensions to max coordinates
Iago Toral Quiroga [Tue, 6 Oct 2020 06:57:47 +0000 (08:57 +0200)]
v3dv: limit blit framebuffer dimensions to max coordinates

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: generate proper UUIDs for device and driver
Iago Toral Quiroga [Mon, 5 Oct 2020 08:44:59 +0000 (10:44 +0200)]
v3dv: generate proper UUIDs for device and driver

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: fix blit path for copies from 3D compressed images
Iago Toral Quiroga [Thu, 1 Oct 2020 08:02:23 +0000 (10:02 +0200)]
v3dv: fix blit path for copies from 3D compressed images

The aliasing we were using was not always correct. Particularly,
for 3D images, the simulator would complain about image strides
not being large enough in some cases.

This patch fixes this by aliasing both src and dst images and
carefully choosing the alias dimensions taking into account the
format chosen for the copy and the ratio of block sizes between
both images.

Playing a bit with the image dimensions used by the relevant CTS
tests we confirmed this works well for all tile layouts (lineartile,
ublinear1/2 and UIF).

This fixes all CTS tests involving 3D image copies from compressed
formats without needing to force UIF layout for all compressed
images (which would actually not work for all image sizes either).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: fixes for barriers in secondary command buffers
Iago Toral Quiroga [Fri, 25 Sep 2020 13:00:41 +0000 (15:00 +0200)]
v3dv: fixes for barriers in secondary command buffers

This patch addresses various issues, mostly from secondary command buffers
that recorded pipeline barriers that are not consumed in the secondary itself,
so they need to be applied to jobs that come right after the execution of the
secondary in a primary command buffer.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: implement workaround for GFXH-1918
Iago Toral Quiroga [Thu, 24 Sep 2020 08:09:00 +0000 (10:09 +0200)]
v3dv: implement workaround for GFXH-1918

Loading depth with odd width/height might cause incorrect loading
of the early-Z buffer.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: implement workaround for GFXH-1461
Iago Toral Quiroga [Thu, 24 Sep 2020 07:23:29 +0000 (09:23 +0200)]
v3dv: implement workaround for GFXH-1461

If a subpass clears one aspect of Depth/Stencil but loads the other
the clear might get lost. Fix this by emitting the clear as a draw
call instead of relying on the TLB clear.

Fixes:
dEQP-VK.renderpass.suballocation.attachment.3.307

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: flag tmu_dirty_rcl in primaries when linking secondaries that have it set
Iago Toral Quiroga [Wed, 23 Sep 2020 10:38:27 +0000 (12:38 +0200)]
v3dv: flag tmu_dirty_rcl in primaries when linking secondaries that have it set

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: only advertise one memory type
Iago Toral Quiroga [Wed, 23 Sep 2020 09:28:41 +0000 (11:28 +0200)]
v3dv: only advertise one memory type

Our current implementation is always coherent.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: always program a reasonable internal depth type for copies/clears
Iago Toral Quiroga [Wed, 23 Sep 2020 08:02:26 +0000 (10:02 +0200)]
v3dv: always program a reasonable internal depth type for copies/clears

This doesn't seem to fix anything, but it is the right thing to do.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv/pipeline_cache: extend pipeline cache envvar
Alejandro Piñeiro [Sun, 20 Sep 2020 20:54:33 +0000 (22:54 +0200)]
v3dv/pipeline_cache: extend pipeline cache envvar

So far V3DV_ENABLE_DEFAULT_PIPELINE_CACHE allowed configuring the
pipeline cache to avoid any caching through a pipeline cache.

With this change we can be more detailed. The envvar is no longer a
boolean. Allowed values:

  * "off": no pipeline cache at all. PipelineCache objects behave as
    no-op objects.

  * "no-default-cache": user PipelineCache caches nir/variants, but we
    don't provide a default cache in case the user doesn't provide a
    PipelineCache object, nor for internal pipelines.

  * "full" (default): we provide a default PipelineCache, used when
    the user doesn't provide one when creating a Pipeline, and for
    internal Pipelines.
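
The three values above could be parsed along these lines. This is a sketch
rather than the driver's code: the enum and function names are hypothetical,
and treating an unrecognized value as "full" is an assumption.

```c
#include <string.h>

/* Hypothetical names for the three modes listed above. */
enum cache_mode {
    CACHE_OFF,          /* "off" */
    CACHE_NO_DEFAULT,   /* "no-default-cache" */
    CACHE_FULL          /* "full" (default) */
};

/* value is the envvar contents, or NULL if the envvar is unset. */
static enum cache_mode parse_cache_mode(const char *value)
{
    if (value == NULL)
        return CACHE_FULL;
    if (strcmp(value, "off") == 0)
        return CACHE_OFF;
    if (strcmp(value, "no-default-cache") == 0)
        return CACHE_NO_DEFAULT;
    /* "full", and (as an assumption) anything unrecognized. */
    return CACHE_FULL;
}
```

A caller would pass in the result of
getenv("V3DV_ENABLE_DEFAULT_PIPELINE_CACHE") from <stdlib.h>.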

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv/pipeline_cache: set a max size for the pipeline cache
Alejandro Piñeiro [Sat, 19 Sep 2020 20:51:58 +0000 (22:51 +0200)]
v3dv/pipeline_cache: set a max size for the pipeline cache

We don't want to let the default pipeline cache grow without limit. We
choose a maximum number of entries that should work for all real world
applications. CTS will exceed that limit, but that is okay, as it will
prevent us from running out of memory.
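
In sketch form, the cap amounts to refusing new entries once a fixed count
is reached. The limit value and all names below are hypothetical and chosen
only for illustration:

```c
#include <stdbool.h>

/* Hypothetical limit; the commit message does not state the actual value. */
#define MAX_CACHE_ENTRIES 4096u

/* Hypothetical cache bookkeeping: only the entry count matters here. */
struct cache {
    unsigned count;
};

/* Account for a new entry unless the cache is already full.
 * Returns true if the entry was accepted. */
static bool cache_try_add(struct cache *c)
{
    if (c->count >= MAX_CACHE_ENTRIES)
        return false;   /* full: skip caching instead of growing */
    c->count++;
    return true;
}
```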

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3d/compiler: allow to batch spills
Iago Toral Quiroga [Thu, 10 Sep 2020 07:51:54 +0000 (09:51 +0200)]
v3d/compiler: allow to batch spills

Some shaders that need to spill hundreds of registers can take very long times
to compile as each allocation attempt spills a single register and restarts
the allocation process. We can significantly cut down these times if we allow
the compiler to spill in batches, which should be possible if we are spilling
uniforms, which is in fact the kind of spills that we do first because they
have lower cost than TMU spills.

Doing this could cause us to slightly over spill in some cases (depending on
the chosen batch size) leading to slightly worse performance, so we only
enable this behavior after we have started to spill over a certain threshold,
at which point we assume that performance won't be good and we want to
favor compilation speed instead.

v2:
  - Keep it simple and just try to spill a fixed number of registers in a
    batch instead of trying to compute this dynamically based on accumulated
    spills and current register pressure. (Eric).

v3:
  - Check if the node is valid before doing anything with it.
  - Drop the environment variable to select batch size and just fix it to 20.
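
The resulting policy can be sketched as follows. The batch size of 20 comes
from the v3 note above; the threshold value and the names are hypothetical:

```c
#include <stdbool.h>

#define SPILL_BATCH_SIZE      20   /* fixed batch size, per the v3 note */
#define SPILL_BATCH_THRESHOLD 100  /* hypothetical; not stated in the commit */

/* How many registers to try to spill in the next allocation attempt. */
static int spill_batch_size(int spills_so_far, bool spilling_uniforms)
{
    /* Batch only uniform spills, and only once we have already spilled
     * enough that compile time matters more than shader performance. */
    if (spilling_uniforms && spills_so_far > SPILL_BATCH_THRESHOLD)
        return SPILL_BATCH_SIZE;
    return 1;
}
```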

With this we can take this CTS test from 35 minutes down to about 3 minutes:
dEQP-VK.ssbo.layout.random.all_shared_buffer.5

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: free noop job if needed when finishing the queue
Iago Toral Quiroga [Fri, 18 Sep 2020 16:02:05 +0000 (18:02 +0200)]
v3dv: free noop job if needed when finishing the queue

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: clean-up after obtaining an XCB connection
Iago Toral Quiroga [Thu, 17 Sep 2020 14:12:31 +0000 (16:12 +0200)]
v3dv: clean-up after obtaining an XCB connection

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: don't leak dumb BO handles allocated for swapchain images
Iago Toral Quiroga [Fri, 11 Sep 2020 10:20:20 +0000 (12:20 +0200)]
v3dv: don't leak dumb BO handles allocated for swapchain images

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv/meta_copy: fix TFU blitting when using 3D images
Alejandro Piñeiro [Thu, 10 Sep 2020 21:59:47 +0000 (23:59 +0200)]
v3dv/meta_copy: fix TFU blitting when using 3D images

We had some code in blit_tfu to handle 3D images but it was wrong. For
example, it executed a copy on the 3D image regardless of the depth
component copy needed. This was not detected until vk-gl-cts 1.2.4
introduced more 1D and 3D blitting tests.

Also add checks to fall back to blit_shader when needed, such as when
mirroring on the depth component.

Fixes the following tests:
  dEQP-VK.api.copy_and_blit.core.blit_image.simple_tests.mirror_z_3d.nearest
  dEQP-VK.api.copy_and_blit.core.blit_image.simple_tests.whole_3d.nearest
  dEQP-VK.api.copy_and_blit.dedicated_allocation.blit_image.simple_tests.mirror_z_3d.nearest
  dEQP-VK.api.copy_and_blit.dedicated_allocation.blit_image.simple_tests.whole_3d.nearest

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: honor VkPipelineDepthStencilStateCreateInfo::depthWriteEnable
Iago Toral Quiroga [Wed, 9 Sep 2020 07:18:43 +0000 (09:18 +0200)]
v3dv: honor VkPipelineDepthStencilStateCreateInfo::depthWriteEnable

Fixes:
dEQP-VK.renderpass.suballocation.subpass_dependencies.separate_channels.d24_unorm_s8_uint

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: fix sampling from stencil aspect of a combined depth/stencil image
Iago Toral Quiroga [Tue, 8 Sep 2020 06:47:23 +0000 (08:47 +0200)]
v3dv: fix sampling from stencil aspect of a combined depth/stencil image

When sampling the stencil aspect we want to reinterpret the D24S8 format
as RGBA8 and read stencil values from the R component.
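
As a CPU-side illustration of that reinterpretation (the driver does this
through the texture format, not with code like this). It assumes a 32-bit
texel where the R component of RGBA8 is the least significant byte; the
function name is hypothetical:

```c
#include <stdint.h>

/* Treat a packed D24S8 texel as RGBA8 and return the R component,
 * assumed here to be the low 8 bits and to hold the stencil value. */
static uint8_t stencil_from_d24s8(uint32_t texel)
{
    return (uint8_t)(texel & 0xffu);
}
```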

Fixes:
dEQP-VK.renderpass.suballocation.formats.d24_unorm_s8_uint.input.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv/formats: properly return unsupported for 1D compressed textures
Alejandro Piñeiro [Mon, 7 Sep 2020 22:57:31 +0000 (00:57 +0200)]
v3dv/formats: properly return unsupported for 1D compressed textures

Gets tests like the following one properly skipped:
   dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.1d.etc2_r8g8b8a8_unorm_block.etc2_r8g8b8a8_unorm_block.optimal_general

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: signal semaphore/fence if needed after acquiring a swapchain image
Iago Toral Quiroga [Mon, 7 Sep 2020 09:09:23 +0000 (11:09 +0200)]
v3dv: signal semaphore/fence if needed after acquiring a swapchain image

Fixes:
dEQP-VK.wsi.*.swapchain.acquire.too_many
dEQP-VK.wsi.*.swapchain.acquire.too_many_timeout

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: do not expose VK_IMAGE_USAGE_SAMPLED_BIT for swapchains
Iago Toral Quiroga [Wed, 2 Sep 2020 05:58:28 +0000 (07:58 +0200)]
v3dv: do not expose VK_IMAGE_USAGE_SAMPLED_BIT for swapchains

The display pipeline on the Rpi4 requires that images are linear and the
3D pipeline cannot sample from linear images.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: fix size computed by vkGetImageSubresourceLayout for 3D images
Iago Toral Quiroga [Tue, 1 Sep 2020 11:02:32 +0000 (13:02 +0200)]
v3dv: fix size computed by vkGetImageSubresourceLayout for 3D images

Fixes:
dEQP-VK.image.subresource_layout.3d.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: fix offset computed by vkGetImageSubresourceLayout for array images
Iago Toral Quiroga [Tue, 1 Sep 2020 11:01:49 +0000 (13:01 +0200)]
v3dv: fix offset computed by vkGetImageSubresourceLayout for array images

Fixes:
dEQP-VK.image.subresource_layout.2d_array.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: expose DRM modifiers based on supported features
Iago Toral Quiroga [Tue, 1 Sep 2020 06:56:47 +0000 (08:56 +0200)]
v3dv: expose DRM modifiers based on supported features

So far we have only been exposing linear for WSI formats and UIF on
everything else, but we should instead expose linear or UIF based
on whether the underlying format supports any features for the given
layout.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>

4 years ago v3dv: handle VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO
Iago Toral Quiroga [Tue, 1 Sep 2020 06:51:37 +0000 (08:51 +0200)]
v3dv: handle VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_DRM_FORMAT_MODIFIER_INFO

When negotiating DRM modifiers, applications may use this to validate the
features that are supported with a particular modifier. The WSI code in
Mesa relies on this to validate its modifiers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>