platform/upstream/mesa.git
2 years agovirgl: Fix texture transfers by using a staging resource
Gert Wollny [Wed, 12 Jan 2022 12:01:30 +0000 (13:01 +0100)]
virgl: Fix texture transfers by using a staging resource

This commit fixes the following flaws in the implementation:

* when a resource was re-allocated, the guest side storage
  was also allocated
* when a source needs a readback before being written to, then
  the call would go through vws->transfer_get, thereby bypassing the
  staging resource, and this would fail on the host, because no
  the allocated IOV was too small (just one byte)
* if the texture write would need neither flush nor readback, the
  old code path would be used expecting that guest side backing stogage
  for the texture.

v2: - actually do a readback to the stageing resource when it is required
    - fix typo (Lepton)

v3: Don't use stageing transfers if the host can't read back the data
    by rendering to an FBO or calling getTexImage, because in this case
    we rely on the IOV to hold the date.

v4: Also don't use staging transfers if the format is no readback
    format. Otherwise we have to deal with the resolve blit, and
    this is currently not working correctly.

v5: add a new flag that indicates whether non-renderable textures can
    be read back (either via glGetTexImage or GBM)

v6: Restrict the use of staging texture transfers to textures that can
    be read back, and on GLES also if the they are bound to scanout and
    the host uses minigbm to allocate such textures.
    For that replace the flag indicating the capability to read back
    non-renderable textures with a cap that indicates whether scanout
    textures can be read back.

v7: update virglrenderer version in the CI

v8: update use of stageing (Chia-I)

v9: remove superflous check and assignment (Chia-I)

v10: disable stageing textures for arrays with stencil format. This is a
     workaround for failures of the CI.

Fixes: cdc480585c9be368ddfdc33e2eb73e3582f25fe7
    virgl/drm: New optimization for uploading textures

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14495>

2 years agollvmpipe: clamp surface clear geometry
Mike Blumenkrantz [Mon, 31 Jan 2022 15:18:10 +0000 (10:18 -0500)]
llvmpipe: clamp surface clear geometry

avoid oob writes to avoid crashing

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14655>

2 years agolavapipe: clamp clear attachments rects
Mike Blumenkrantz [Fri, 21 Jan 2022 19:31:30 +0000 (14:31 -0500)]
lavapipe: clamp clear attachments rects

there is at least one unnamed game which has problems with this, so try
to avoid crashing

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14655>

2 years agollvmpipe: fix debug print iterating in set_framebuffer_state
Mike Blumenkrantz [Fri, 28 Jan 2022 14:46:47 +0000 (09:46 -0500)]
llvmpipe: fix debug print iterating in set_framebuffer_state

this would potentially access garbage memory by checking the existing
state using the incoming state's iterator values

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14857>

2 years agozink: ci updates
Mike Blumenkrantz [Tue, 8 Mar 2022 17:51:50 +0000 (12:51 -0500)]
zink: ci updates

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>

2 years agozink: fix 64bit float shader ops
Mike Blumenkrantz [Mon, 7 Mar 2022 14:25:43 +0000 (09:25 -0500)]
zink: fix 64bit float shader ops

this was being set from back before zink actually supported 64bit
natively and only 32bit was functional, but it breaks 64bit support

cc: mesa-stable

fixes (lavapipe):
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec2
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec3
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec4

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>

2 years agozink: run nir_lower_phis_to_scalar in optimization loop
Mike Blumenkrantz [Tue, 8 Mar 2022 17:12:21 +0000 (12:12 -0500)]
zink: run nir_lower_phis_to_scalar in optimization loop

fixes (lavapipe):
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>

2 years agoradv,aco,llvm: lower post shuffle vertex in NIR
Samuel Pitoiset [Fri, 18 Feb 2022 13:23:42 +0000 (14:23 +0100)]
radv,aco,llvm: lower post shuffle vertex in NIR

fossils-db (Sienna Cichlid):
Totals from 774 (0.57% of 134913) affected shaders:
VGPRs: 26496 -> 26312 (-0.69%)
CodeSize: 1825936 -> 1828812 (+0.16%); split: -0.04%, +0.20%
MaxWaves: 22046 -> 22062 (+0.07%)
Instrs: 347634 -> 347975 (+0.10%); split: -0.05%, +0.15%
Latency: 1363949 -> 1356426 (-0.55%); split: -0.59%, +0.04%
InvThroughput: 221529 -> 221380 (-0.07%); split: -0.10%, +0.04%
VClause: 5682 -> 5676 (-0.11%); split: -1.46%, +1.36%
SClause: 7485 -> 7411 (-0.99%); split: -1.48%, +0.49%
Copies: 30481 -> 30420 (-0.20%); split: -0.51%, +0.31%
PreVGPRs: 19717 -> 19656 (-0.31%)

fossil-db (Polaris10):
Totals from 896 (0.66% of 135960) affected shaders:
SGPRs: 49824 -> 49648 (-0.35%); split: -0.39%, +0.03%
VGPRs: 31040 -> 29948 (-3.52%); split: -3.62%, +0.10%
CodeSize: 875960 -> 875920 (-0.00%); split: -0.06%, +0.05%
MaxWaves: 6380 -> 6429 (+0.77%)
Instrs: 171522 -> 171482 (-0.02%); split: -0.07%, +0.05%
Latency: 1356082 -> 1334386 (-1.60%); split: -1.61%, +0.01%
InvThroughput: 553389 -> 552957 (-0.08%); split: -0.08%, +0.00%
VClause: 4317 -> 4244 (-1.69%); split: -2.41%, +0.72%
SClause: 6157 -> 6139 (-0.29%); split: -0.45%, +0.16%
Copies: 9340 -> 9235 (-1.12%); split: -1.24%, +0.12%
PreVGPRs: 22366 -> 22116 (-1.12%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15113>

2 years agonir: Introduce workgroup_index and ability to lower workgroup_id to it.
Timur Kristóf [Thu, 24 Feb 2022 09:27:30 +0000 (10:27 +0100)]
nir: Introduce workgroup_index and ability to lower workgroup_id to it.

The workgroup_index is intended for situations when a 3 dimensional
workgroup_id is not available on the HW, but a 1 dimensional index is.
In this case, we can use lower the 3D ID to use this.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>

2 years agonir: Extract lower_id_to_index into a separate function.
Timur Kristóf [Thu, 24 Feb 2022 09:17:36 +0000 (10:17 +0100)]
nir: Extract lower_id_to_index into a separate function.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>

2 years agonir: Fix lowering terminology of compute system values: "from"->"to".
Timur Kristóf [Thu, 24 Feb 2022 09:14:08 +0000 (10:14 +0100)]
nir: Fix lowering terminology of compute system values: "from"->"to".

This is to match other NIR terminology.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>

2 years agopanvk: Non-destructively stub GetRenderAreaGranularity
Jason Ekstrand [Fri, 28 Jan 2022 21:04:50 +0000 (15:04 -0600)]
panvk: Non-destructively stub GetRenderAreaGranularity

Don't crash.  Just print a warning and return 1x1.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15285>

2 years agopanvk: Advertise zero sparse format properties
Jason Ekstrand [Fri, 28 Jan 2022 21:04:14 +0000 (15:04 -0600)]
panvk: Advertise zero sparse format properties

This is the correct implementation when you don't support sparse and
fixes piles of CTS crashes.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15285>

2 years agopanvk: Advertise VK_KHR_get_physical_device_properties2
Jason Ekstrand [Fri, 28 Jan 2022 21:22:45 +0000 (15:22 -0600)]
panvk: Advertise VK_KHR_get_physical_device_properties2

All the entrypooints are already implemented and a bunch of Vulkan CTS
tests assume this extension exists.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15285>

2 years agogallium/dri: Add missing in_fence_fd initialization
Rob Clark [Mon, 7 Mar 2022 23:53:59 +0000 (15:53 -0800)]
gallium/dri: Add missing in_fence_fd initialization

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6108
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15272>

2 years agovulkan/device_select: add has_vulkan11 flag with has_pci_bus flag
Yogesh Mohan Marimuthu [Thu, 13 Jan 2022 10:36:48 +0000 (16:06 +0530)]
vulkan/device_select: add has_vulkan11 flag with has_pci_bus flag

In EnumeratePhysicalDevices(), pci bus info is available only in
vulkan version >= 1.1. hence adding has_vulkan11 flag in places
where has_pci_bus is used in EnumeratePhysicalDevices() code flow.

Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14535>

2 years agovulkan/device_select: for vulkan 1.0 use vid/did for boot_vga
Yogesh Mohan Marimuthu [Thu, 13 Jan 2022 10:25:20 +0000 (15:55 +0530)]
vulkan/device_select: for vulkan 1.0 use vid/did for boot_vga

In device select layer EnumeratePhysicalDevices() function pci
bus information is available only in case of vulkan >= 1.1.
Hence use vid/did to match boot_vga device in case of vulkan 1.0.

Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14535>

2 years agonir: Fix handling of NV_mesh_shader PRIMITIVE_INDICES output.
Timur Kristóf [Wed, 23 Feb 2022 14:01:05 +0000 (15:01 +0100)]
nir: Fix handling of NV_mesh_shader PRIMITIVE_INDICES output.

PRIMITIVE_INDICES is a flat array in NV_mesh_shader,
not a proper arrayed output, as opposed to D3D-style
mesh shaders where it's addressed by the primitive index.

Prevent assigning several slots to primitive indices,
to avoid issues.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15160>

2 years agoaco/insert_exec_mask: optimize top-level transition to exact before demote
Rhys Perry [Wed, 23 Feb 2022 17:29:25 +0000 (17:29 +0000)]
aco/insert_exec_mask: optimize top-level transition to exact before demote

fossil-db (Sienna Cichlid):
Totals from 5767 (3.55% of 162293) affected shaders:
Instrs: 3264949 -> 3257527 (-0.23%); split: -0.23%, +0.00%
CodeSize: 17835692 -> 17806004 (-0.17%); split: -0.17%, +0.00%
Latency: 45990060 -> 45987924 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 7643850 -> 7643835 (-0.00%); split: -0.00%, +0.00%
Copies: 193641 -> 186219 (-3.83%); split: -3.84%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>

2 years agoaco/insert_exec_mask: use get_exec_op
Rhys Perry [Wed, 23 Feb 2022 17:35:33 +0000 (17:35 +0000)]
aco/insert_exec_mask: use get_exec_op

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>

2 years agoaco/insert_exec_mask: fix top-level to-exact with non-global exact mask
Rhys Perry [Wed, 23 Feb 2022 17:21:42 +0000 (17:21 +0000)]
aco/insert_exec_mask: fix top-level to-exact with non-global exact mask

After transitioning to exact after a discard, the exec stack might be:
[exact|global, wqm, exact]

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>

2 years agoradeonsi/ci: Mark a bunch of flaky tests on stoney
Cristian Ciocaltea [Mon, 7 Mar 2022 16:24:55 +0000 (18:24 +0200)]
radeonsi/ci: Mark a bunch of flaky tests on stoney

radeonsi-stoney-gl:amd64 job fails due to random crashes of some
'dEQP-GLES3.functional.buffer.map.write.explicit_flush.*' tests.

Fix the pipeline by adding them to 'radeonsi-stoney-flakes.txt'.

Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>

2 years agoci/zink: Report flake test
Cristian Ciocaltea [Mon, 7 Mar 2022 14:34:39 +0000 (16:34 +0200)]
ci/zink: Report flake test

Mark 'KHR-GL46.shader_image_load_store.advanced-sso-subroutine' as
flake since it failed several times while attempting to merge this MR.

Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>

2 years agoci: Improve interrupt signal handling in crosvm-runner.sh
Cristian Ciocaltea [Thu, 3 Mar 2022 23:59:55 +0000 (01:59 +0200)]
ci: Improve interrupt signal handling in crosvm-runner.sh

Run crosvm as a background process in order to allow intercepting
interrupt signals (INT, TERM) and properly release/cleanup any allocated
resources.

This is particularly helpful when one or more crosvm tasks hang, which
will eventually prevent subsequent instances to be started - currently
we can handle up to 128 concurrent crosvm instances per runner.

Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>

2 years agoci: Increase limit of concurrent crosvm instances per runner
Cristian Ciocaltea [Mon, 28 Feb 2022 19:26:31 +0000 (21:26 +0200)]
ci: Increase limit of concurrent crosvm instances per runner

Ensure we can handle up to 128 concurrent crosvm instances per runner
with the current CID generator. This is a safety margin for the new
64-core runners.

Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15238>

2 years agogallivm/nir: extract a valid texture index according to exec_mask.
Dave Airlie [Mon, 7 Mar 2022 07:03:25 +0000 (17:03 +1000)]
gallivm/nir: extract a valid texture index according to exec_mask.

When using indirect textures, some lanes may not be active,
particularly in a loop, so as with some other areas, extracting
the correct lane is needed here. This extracts the last valid one.

KHR-GL45.texture_barrier.* on zink.

Fixes: e168d148d76d ("gallivm/nir: handle non-uniform texture offsets")

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15259>

2 years agoradv/ci: remove unused files
Samuel Pitoiset [Mon, 7 Mar 2022 13:49:00 +0000 (14:49 +0100)]
radv/ci: remove unused files

These files are no longer used.

Fixes: cc327a0fe45 ("amd, ci: Remove unused runners.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Charlie Turner <cturner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15264>

2 years agofreedreno: add a420 deqp-runner files
Ilia Mirkin [Tue, 16 Nov 2021 23:19:37 +0000 (18:19 -0500)]
freedreno: add a420 deqp-runner files

This doesn't actually get run in CI, but this helps track outstanding
issues / expectations. This is from a run on my IFC6540 with A420.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

2 years agofreedreno/a4xx: expose shaders and images, as well as ES 3.1
Ilia Mirkin [Sun, 14 Nov 2021 18:06:49 +0000 (13:06 -0500)]
freedreno/a4xx: expose shaders and images, as well as ES 3.1

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

2 years agofreedreno/ir3: disable conversion folding on a4xx
Ilia Mirkin [Sat, 4 Dec 2021 00:06:12 +0000 (19:06 -0500)]
freedreno/ir3: disable conversion folding on a4xx

Experiments suggest that e.g.

add.u r0.y, hr0.x, hr0.y

will result in the summed value in both the high and low words of r0.y.
This only happens with odd registers, not even ones (r0.x works fine).

Seen in the bit_count lowering (which turns out to be unnecessary, but
this is still a larger problem).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

2 years agofreedreno/ir3: no need to count bits 16b at a time for a4xx
Ilia Mirkin [Fri, 3 Dec 2021 08:04:19 +0000 (03:04 -0500)]
freedreno/ir3: no need to count bits 16b at a time for a4xx

This also works out nicely since a4xx has some sort of problem with the
16b-based lowering.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

2 years agofreedreno/a4xx: improve condition for disabling early z
Ilia Mirkin [Sat, 20 Nov 2021 07:30:37 +0000 (02:30 -0500)]
freedreno/a4xx: improve condition for disabling early z

This helps some subtests in the early-z piglit test, but leaves one
occlusion-based test still failing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

2 years agofreedreno/a4xx: extend astc and tg4 workarounds to compute shaders
Ilia Mirkin [Wed, 17 Nov 2021 00:10:46 +0000 (19:10 -0500)]
freedreno/a4xx: extend astc and tg4 workarounds to compute shaders

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15251>

2 years agoRevert "lavapipe: accurately set image/ssbo access based on shader usage"
Mike Blumenkrantz [Mon, 7 Mar 2022 23:00:42 +0000 (18:00 -0500)]
Revert "lavapipe: accurately set image/ssbo access based on shader usage"

This reverts commit 821a49981ff386559f8a8fdf6bf3526b8deb2415.

still flaky

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15271>

2 years agointel/perf: Destination array calculation into function
Matt Turner [Thu, 3 Mar 2022 23:30:37 +0000 (15:30 -0800)]
intel/perf: Destination array calculation into function

Cuts 119 KiB from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
 917511       0       0  917511   e0007 meson-generated_.._intel_perf_metrics.c.o (before)
 796986       0       0  796986   c293a meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14130948 365708  210004 14706660 e067e4 iris_dri.so (before)
14009332 365708  210004 14585044 de8cd4 iris_dri.so (after)

   text    data     bss     dec     hex filename
8124225  214264   22820 8361309  7f955d libvulkan_intel.so (before)
8002609  214264   22820 8239693  7dba4d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Fix mistake in description string
Matt Turner [Thu, 3 Mar 2022 07:28:18 +0000 (23:28 -0800)]
intel/perf: Fix mistake in description string

Along with fixing the grammar, this allows it to be deduplicated since
the properly worded description exists in later generations' XMLs.

Cuts 96 B from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
 917613       0       0  917613   e006d meson-generated_.._intel_perf_metrics.c.o (before)
 917511       0       0  917511   e0007 meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14131044 365708  210004 14706756 e06844 iris_dri.so (before)
14130948 365708  210004 14706660 e067e4 iris_dri.so (after)

   text    data     bss     dec     hex filename
8124321  214264   22820 8361405  7f95bd libvulkan_intel.so (before)
8124225  214264   22820 8361309  7f955d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Mark intel_perf_counter_* enums as PACKED
Matt Turner [Wed, 2 Mar 2022 02:49:26 +0000 (18:49 -0800)]
intel/perf: Mark intel_perf_counter_* enums as PACKED

Reduces their sizes from 4 bytes to 1. Cuts 6 KiB from iris_dri.so and
libvulkan_intel.so.

   text    data     bss     dec     hex filename
 924401       0       0  924401   e1af1 meson-generated_.._intel_perf_metrics.c.o (before)
 917613       0       0  917613   e006d meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14137732 365708  210004 14713444 e08264 iris_dri.so (before)
14131044 365708  210004 14706756 e06844 iris_dri.so (after)

   text    data     bss     dec     hex filename
8131009  214264   22820 8368093  7fafdd libvulkan_intel.so (before)
8124321  214264   22820 8361405  7f95bd libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Store indices to strings rather than pointers
Matt Turner [Mon, 31 Jan 2022 21:16:26 +0000 (13:16 -0800)]
intel/perf: Store indices to strings rather than pointers

The compiler does a good job of deduplicating strings already, but we
can eliminate the pointers to each string by combining the strings into
a single char array and storing only an index into that array.

The longest of the char arrays is the descriptions array, which is a
little over 45 KiB, so still under MSVC's 64 KiB string literal limit
[0]. Because the string length is under 64 KiB we can use uint16_t as
the index type, which roughly doubles our savings as compared to an int.

This cuts 77 KiB from iris_dri.so (0.5%) and libvulkan_intel.so (0.9%).

   text    data     bss     dec     hex filename
 926811   25920       0  952731   e899b meson-generated_.._intel_perf_metrics.c.o (before)
 924401       0       0  924401   e1af1 meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14190852 391628  210004 14792484 e1b724 iris_dri.so (before)
14137732 365708  210004 14713444 e08264 iris_dri.so (after)

   text    data     bss     dec     hex filename
8184097  240184   22820 8447101  80e47d libvulkan_intel.so (before)
8131009  214264   22820 8368093  7fafdd libvulkan_intel.so (after)

relinfo:
iris_dri.so (before): 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users

libvulkan_intel.so (before): 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) :  8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users

[0] https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp?view=msvc-170&viewFallbackFrom=vs-2019

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Use slimmer intel_perf_query_counter_data struct
Matt Turner [Thu, 3 Mar 2022 20:24:02 +0000 (12:24 -0800)]
intel/perf: Use slimmer intel_perf_query_counter_data struct

intel_perf_query_counter contains fields for things we can't or don't
want to store in our static data (like runtime-determined max values) or
oa_read_counter function pointers which are dependent on the GPU gen and
would make deduplication very ineffective.

Cuts 16 KiB from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
 926811   43200       0  970011   ecd1b meson-generated_.._intel_perf_metrics.c.o (before)
 926811   25920       0  952731   e899b meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14190852 408908  210004 14809764 e1faa4 iris_dri.so (before)
14190852 391628  210004 14792484 e1b724 iris_dri.so (after)

   text    data     bss     dec     hex filename
8184097  257464   22820 8464381  8127fd libvulkan_intel.so (before)
8184097  240184   22820 8447101  80e47d libvulkan_intel.so (after)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Use a function to initialize perf counters
Matt Turner [Thu, 3 Mar 2022 01:53:02 +0000 (17:53 -0800)]
intel/perf: Use a function to initialize perf counters

And specifically mark it with ATTRIBUTE_NOINLINE. Otherwise it will be
inlined and actually slightly increase code size.

Cuts 505 KiB from iris_dri.so and libvulkan_intel.so.

   text    data     bss     dec     hex filename
1538720       0       0 1538720  177aa0 meson-generated_.._intel_perf_metrics.c.o (before)
 926811   43200       0  970011   ecd1b meson-generated_.._intel_perf_metrics.c.o (after)

   text    data     bss     dec     hex filename
14751700 365708  210004 15327412 e9e0b4 iris_dri.so (before)
14190852 408908  210004 14809764 e1faa4 iris_dri.so (after)

   text    data     bss     dec     hex filename
8744913  214264   22820 8981997  890ded libvulkan_intel.so (before)
8184097  257464   22820 8464381  8127fd libvulkan_intel.so (after)

Relocations increase because the counter initializations are moved from
code (in .text) to pointers (in .text) to .rodata, which require
relocations.

relinfo:
iris_dri.so (before): 15605 relocations, 15385 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users
iris_dri.so (after) : 17765 relocations, 17545 relative (98%), 452 PLT entries, 1 for local syms (0%), 0 users

libvulkan_intel.so (before):  8560 relocations, 4829 relative (56%), 355 PLT entries, 1 for local syms (0%), 0 users
libvulkan_intel.so (after) : 10720 relocations, 6989 relative (65%), 355 PLT entries, 1 for local syms (0%), 0 users

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Deduplicate perf counters
Matt Turner [Thu, 3 Mar 2022 01:26:17 +0000 (17:26 -0800)]
intel/perf: Deduplicate perf counters

No changes in resulting code (yes, seriously!). GCC constant propagates
the static const arrays into the code, yielding bit for bit identical
results. This will however enable further cleanups.

Before this patch, we emit 11916 different initializations of
intel_perf_query_counter. With this patch we emit an array of 539 and
initialize the intel_perf_query_counters in terms of those.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Don't print leading space from desc_units()
Matt Turner [Thu, 3 Mar 2022 21:13:38 +0000 (13:13 -0800)]
intel/perf: Don't print leading space from desc_units()

Just an annoyance I noticed when I needed to generate the description
string in two different places.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agointel/perf: Move some static blocks of C code out of the python script.
Emma Anholt [Mon, 7 Feb 2022 19:06:18 +0000 (11:06 -0800)]
intel/perf: Move some static blocks of C code out of the python script.

Now my editor can help me format code as I type.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15237>

2 years agovenus: fix properties of unsupported external fences/semaphores
Chia-I Wu [Sat, 5 Mar 2022 17:39:36 +0000 (09:39 -0800)]
venus: fix properties of unsupported external fences/semaphores

compatibleHandleTypes should be cleared.

Fixed dEQP-VK.api.external.semaphore.sync_fd.info_timeline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15266>

2 years agoiris/ci: Mark amd_performance_monitor tests as flakes.
Ian Romanick [Thu, 3 Mar 2022 00:38:45 +0000 (16:38 -0800)]
iris/ci: Mark amd_performance_monitor tests as flakes.

On one attempt to merge !15210 failed because
spec@amd_performance_monitor@measure,UnexpectedPass.  In that same
report, the other two tests were listed under "some flakes."

Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15220>

2 years agoturnip: Add "rast_order" debug option to force rast order access
Danylo Piliaiev [Mon, 7 Mar 2022 10:13:45 +0000 (12:13 +0200)]
turnip: Add "rast_order" debug option to force rast order access

Enables rasterization order attachment access for all pipelines,
see VK_ARM_rasterization_order_attachment_access for details.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15262>

2 years agogallium/tc: warn if an app is incompatible with cpu_storage
Pierre-Eric Pelloux-Prayer [Mon, 21 Feb 2022 19:14:02 +0000 (20:14 +0100)]
gallium/tc: warn if an app is incompatible with cpu_storage

Instead of silently ignoring unmap calls.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15074>

2 years agoradeonsi: enable tc cpu_storage by default
Pierre-Eric Pelloux-Prayer [Fri, 18 Feb 2022 09:32:05 +0000 (10:32 +0100)]
radeonsi: enable tc cpu_storage by default

Enable for all applications for all small buffers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15074>

2 years agogallium/u_threaded: late alloc cpu_storage
Pierre-Eric Pelloux-Prayer [Fri, 18 Feb 2022 09:28:58 +0000 (10:28 +0100)]
gallium/u_threaded: late alloc cpu_storage

Instead of allocating cpu_storage in threaded_resource_init, defer the
allocation to first use (in tc_buffer_map).
This avoids needless memory allocation if tc_buffer_disable_cpu_storage is
called before tc_buffer_map.

map_buffer_alignment is stored and serves as a "can cpu_storage be used" flag.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15074>

2 years agoradeonsi: use 1 shader compilation thread if NIR_PRINT is used
Pierre-Eric Pelloux-Prayer [Thu, 17 Feb 2022 13:46:06 +0000 (14:46 +0100)]
radeonsi: use 1 shader compilation thread if NIR_PRINT is used

This avoids getting multiple shaders NIR code being printed at the
same time making the log unusable.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15074>

2 years agonir: Fix source type for fragment_fetch_amd.
Georg Lehmann [Fri, 4 Mar 2022 12:32:06 +0000 (13:32 +0100)]
nir: Fix source type for fragment_fetch_amd.

Like txf_ms, these take integers not floats.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15242>

2 years agoradeonsi/tests: always add the --gpu argument
Pierre-Eric Pelloux-Prayer [Fri, 4 Mar 2022 10:49:08 +0000 (11:49 +0100)]
radeonsi/tests: always add the --gpu argument

To get the same argument in multi and single gpu situation.

Fixes: 21b01538331 ("radeonsi/tests: print PCI-id of GPU device under test")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15240>

2 years agoradeonsi: change rounding mode to round to even
Pierre-Eric Pelloux-Prayer [Fri, 4 Mar 2022 10:23:20 +0000 (11:23 +0100)]
radeonsi: change rounding mode to round to even

Use ROUND_TO_EVEN instead of TRUNCATE; this matches what pal and radv do.

This fixes the spec@ext_framebuffer_multisample@turn-on-off tests.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15240>

2 years agoegl/wayland: fix crash in dri2_initialize_wayland_swrast
José Expósito [Thu, 3 Mar 2022 07:20:46 +0000 (08:20 +0100)]
egl/wayland: fix crash in dri2_initialize_wayland_swrast

When "dri2_wl_formats_init" fails in "dri2_initialize_wayland_swrast",
the "dri2_display_destroy" function is called for clean up. However, the
"dri2_egl_display" was not associated with the display in its
"DriverData" field yet.

The following cast in "dri2_display_destroy":

  struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);

Expands to:

  _EGL_DRIVER_TYPECAST(drvname ## _display, _EGLDisplay, obj->DriverData)

Crashing.

Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13972>

2 years agoegl/wayland: fix crash in dri2_initialize_wayland_drm
José Expósito [Sun, 28 Nov 2021 16:57:25 +0000 (17:57 +0100)]
egl/wayland: fix crash in dri2_initialize_wayland_drm

When "dri2_wl_formats_init" fails in "dri2_initialize_wayland_drm", the
"dri2_display_destroy" function is called for clean up. However, the
"dri2_egl_display" was not associated with the display in its
"DriverData" field yet.

The following cast in "dri2_display_destroy":

  struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);

Expands to:

  _EGL_DRIVER_TYPECAST(drvname ## _display, _EGLDisplay, obj->DriverData)

Crashing.

Addresses-Coverity-ID: 1494541 ("Resource leak")
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13972>

2 years agozink: hide descriptor debug behind #ifdef
Mike Blumenkrantz [Fri, 4 Mar 2022 16:06:30 +0000 (11:06 -0500)]
zink: hide descriptor debug behind #ifdef

I've gotten a feel for this, but it's annoying to always see it spamming away

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: invalidate non-punted recycled descriptor sets that are not valid
Mike Blumenkrantz [Thu, 3 Mar 2022 17:25:41 +0000 (12:25 -0500)]
zink: invalidate non-punted recycled descriptor sets that are not valid

these sets may contain refs from the descriptors which need to be removed
to avoid invalid memory access if the ref is leaked

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: stop leaking descriptor sets
Mike Blumenkrantz [Thu, 3 Mar 2022 17:16:33 +0000 (12:16 -0500)]
zink: stop leaking descriptor sets

when migrating a recycled set here, the set was previously invalid and
in the recycled table, meaning it can be reused directly so long as
it's first invalidated

the previous code would instead pop a different set off the allocation array,
leaking this one

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: mark fbfetch push sets as non-cached
Mike Blumenkrantz [Thu, 3 Mar 2022 14:23:03 +0000 (09:23 -0500)]
zink: mark fbfetch push sets as non-cached

these can't be cached, so ensure the value isn't uninitialized

Test case 'KHR-GL46.blend_equation_advanced.blend_all.GL_HARDLIGHT_KHR_all_qualifier'..
==1193311== Conditional jump or move depends on uninitialised value(s)
==1193311==    at 0x634EF05: update_push_ubo_descriptors (zink_descriptors.c:1230)
==1193311==    by 0x634FCC5: zink_descriptors_update (zink_descriptors.c:1412)
==1193311==    by 0x63A5EA1: void zink_draw<(zink_multidraw)1, (zink_dynamic_state)3, true, false>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int, pipe_vertex_state*, unsigned int) (zink_draw.cpp:788)
==1193311==    by 0x635A0AF: void zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)3, true>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int) (zink_draw.cpp:907)
==1193311==    by 0x6174397: tc_call_draw_single (u_threaded_context.c:3150)
==1193311==    by 0x616C403: tc_batch_execute (u_threaded_context.c:211)
==1193311==    by 0x616CA44: _tc_sync (u_threaded_context.c:362)
==1193311==    by 0x61725E9: tc_texture_map (u_threaded_context.c:2274)
==1193311==    by 0x5A9DAE9: pipe_texture_map_3d (u_inlines.h:572)
==1193311==    by 0x5A9EB80: st_ReadPixels (st_cb_readpixels.c:530)
==1193311==    by 0x5A0647A: read_pixels (readpix.c:1178)
==1193311==    by 0x5A0647A: _mesa_ReadnPixelsARB (readpix.c:1195)
==1193311==    by 0x5A06517: _mesa_ReadPixels (readpix.c:1210)

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: fix descriptor cache pointer array allocation
Mike Blumenkrantz [Wed, 2 Mar 2022 20:49:29 +0000 (15:49 -0500)]
zink: fix descriptor cache pointer array allocation

this got mixed up during some refactor and started indexing based on the
number of bindings instead of the number of descriptors, which means
that array descriptor bindings would have overlapping array memory

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: wait on program cache fences before destroying programs
Mike Blumenkrantz [Wed, 2 Mar 2022 13:48:52 +0000 (08:48 -0500)]
zink: wait on program cache fences before destroying programs

if these still have outstanding cache jobs, deleting the object now
will cause a crash

maybe fixes some cts flakiness?

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: use a fence for pipeline cache update jobs
Mike Blumenkrantz [Wed, 2 Mar 2022 20:50:50 +0000 (15:50 -0500)]
zink: use a fence for pipeline cache update jobs

otherwise there's nothing to wait on

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: add function for refcounting zink_program structs
Mike Blumenkrantz [Wed, 2 Mar 2022 13:48:28 +0000 (08:48 -0500)]
zink: add function for refcounting zink_program structs

step one in maybe cleaning up and deduplicating some program code

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agozink: always update shader variants when rebinding a gfx program
Mike Blumenkrantz [Tue, 1 Mar 2022 17:06:13 +0000 (12:06 -0500)]
zink: always update shader variants when rebinding a gfx program

the variant data may have changed in the meanwhile, so do a pass to
ensure that the correct variant is used

cc: mesa-stable

fixes caselist:
dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_two_samples.multisample_texture_4
dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_two_samples.singlesample_rbo

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>

2 years agolavapipe: accurately set image/ssbo access based on shader usage
Mike Blumenkrantz [Thu, 3 Mar 2022 18:37:51 +0000 (13:37 -0500)]
lavapipe: accurately set image/ssbo access based on shader usage

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15233>

2 years agolavapipe: scan shaders for image/ssbo access and generate per-stage masks
Mike Blumenkrantz [Thu, 3 Mar 2022 18:37:13 +0000 (13:37 -0500)]
lavapipe: scan shaders for image/ssbo access and generate per-stage masks

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15233>

2 years agolavapipe: heap-allocate rendering_state struct
Mike Blumenkrantz [Fri, 4 Mar 2022 18:17:06 +0000 (13:17 -0500)]
lavapipe: heap-allocate rendering_state struct

this thing is like 28k now, which is just way too big to have on the stack

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15233>

2 years agogallivm: avoid division by zero when computing cube face
Mike Blumenkrantz [Fri, 4 Mar 2022 15:59:12 +0000 (10:59 -0500)]
gallivm: avoid division by zero when computing cube face

this is illegal and produces NaNs which blow up the sample instr

cc: mesa-stable

fixes (llvmpipe and zink):
KHR-GL45.incomplete_texture_access.sampler
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.basic_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_vertex

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15246>

2 years agogallivm: fix debug prints for halfs
Mike Blumenkrantz [Fri, 4 Mar 2022 15:42:59 +0000 (10:42 -0500)]
gallivm: fix debug prints for halfs

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15246>

2 years agopan/bi: Don't assign slots for the blend second source
Icecream95 [Mon, 21 Feb 2022 03:54:50 +0000 (16:54 +1300)]
pan/bi: Don't assign slots for the blend second source

Another instruction might write to the second source, and then an
INSTR_INVALID_ENC fault will be raised because the tuple will write to
and read from the register at the same time.

Fixes: 795638767d1 ("pan/bi: Use fused dual source blending")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopan/bi: Skip psuedo sources in ISA.xml
Icecream95 [Mon, 21 Feb 2022 03:50:56 +0000 (16:50 +1300)]
pan/bi: Skip psuedo sources in ISA.xml

The second staging register source for the +BLEND instruction should
not be packed nor disassembled, so skip it when include_pseudo is not
set.

Fixes: 795638767d1 ("pan/bi: Use fused dual source blending")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Fix ubo_mask calculation
Icecream95 [Fri, 31 Dec 2021 13:15:21 +0000 (02:15 +1300)]
panfrost: Fix ubo_mask calculation

BITSET_MASK returns ~0 when given an input of zero, when we need it to
return 0 instead.

Fixes shaders with only sysvals but no UBOs when push constants are
disabled.

This breaks when 31 or 32 UBOs are used, but PAN_MAX_CONST_BUFFERS is
currently set to 16.

Fixes: c246af0dd80 ("panfrost: Only upload UBOs when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Improve comment for emit_fragment_job
Icecream95 [Sun, 20 Feb 2022 09:45:02 +0000 (22:45 +1300)]
panfrost: Improve comment for emit_fragment_job

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopan/bi: Add documentation for bifrost_nir_lower_store_component
Icecream95 [Wed, 23 Feb 2022 10:16:35 +0000 (23:16 +1300)]
pan/bi: Add documentation for bifrost_nir_lower_store_component

Taken from the commit that introduced the function,
95458c40330 ("pan/bi: Lower stores with component != 0").

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopan/bi: Make disassembler build reproducibly
Icecream95 [Thu, 27 Jan 2022 04:46:54 +0000 (17:46 +1300)]
pan/bi: Make disassembler build reproducibly

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Re-emit descriptors after resource shadowing
Icecream95 [Tue, 15 Feb 2022 07:34:57 +0000 (20:34 +1300)]
panfrost: Re-emit descriptors after resource shadowing

This could be made slightly more efficient by only setting the dirty
state that is needed, but eventually you reach a point where it's
cheaper to re-emit everything than work out what can or can't be kept.

Fixes rendering issues in Duckstation.

Fixes: cd2c1ef9da6 ("panfrost: Dirty track textures/samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Set dirty state in set_shader_buffers
Icecream95 [Tue, 15 Feb 2022 07:24:01 +0000 (20:24 +1300)]
panfrost: Set dirty state in set_shader_buffers

Otherwise the pointer (which is uploaded as a sysval) won't be updated
when a new SSBO is bound.

Fixes: c34b760b9f9 ("panfrost: Dirty track constant buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopan/bi: Check dependencies of both destinations of instructions
Icecream95 [Mon, 14 Feb 2022 03:26:59 +0000 (16:26 +1300)]
pan/bi: Check dependencies of both destinations of instructions

TEXC can have two destinations; the value for neither of them can be
used in the same bundle, so extend the code to check for this to
iterate over both destinations.

Fixes artefacts in the game "LIMBO".

Fixes: a303076c1ab ("pan/bi: Add bi_instr_schedulable predicate")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopan/bi: Add interference between destinations
Icecream95 [Mon, 14 Feb 2022 03:20:47 +0000 (16:20 +1300)]
pan/bi: Add interference between destinations

Trying to write to overlapping register ranges from a single
instruction is undefined behaviour, so add interference between the
nodes to avoid this.

Hit in a dual-texture instruction in LIMBO.

Fixes: 9146bafbb42 ("pan/bi: Add dual texture fusing pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Disable point size upper limit clamping
Icecream95 [Tue, 18 Jan 2022 03:02:12 +0000 (16:02 +1300)]
panfrost: Disable point size upper limit clamping

The hardware already clamps this, there is no need to do it in the
shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Update point size limits to match hardware behaviour
Icecream95 [Tue, 18 Jan 2022 02:48:17 +0000 (15:48 +1300)]
panfrost: Update point size limits to match hardware behaviour

Found while reverse-engineering the tiler heap format.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopanfrost: Set PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Icecream95 [Tue, 18 Jan 2022 02:16:49 +0000 (15:16 +1300)]
panfrost: Set PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

Fixes arb-provoking-vertex-render Piglit test.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agopan/mdg: Use util_logbase2 instead of C99 log2
Icecream95 [Sat, 19 Jun 2021 03:01:15 +0000 (15:01 +1200)]
pan/mdg: Use util_logbase2 instead of C99 log2

log2 operates on double, we only need the integer util/ function.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>

2 years agoa4xx: add emission of compute state, and compute dispatch
Ilia Mirkin [Sun, 14 Nov 2021 18:01:57 +0000 (13:01 -0500)]
a4xx: add emission of compute state, and compute dispatch

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

2 years agoa4xx: add logic to emit image/ssbo state
Ilia Mirkin [Sun, 14 Nov 2021 17:59:40 +0000 (12:59 -0500)]
a4xx: add logic to emit image/ssbo state

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

2 years agofreedreno/ir3: support a4xx compute differences
Ilia Mirkin [Sun, 14 Nov 2021 18:05:07 +0000 (13:05 -0500)]
freedreno/ir3: support a4xx compute differences

Mainly the workgroup id comes injected via consts by the hardware (or
CP), and we must make room for it, otherwise the driver won't know where
to put it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

2 years agofreedreno/ir3: support a4xx in load/store buffer/image emission
Ilia Mirkin [Sun, 14 Nov 2021 18:04:26 +0000 (13:04 -0500)]
freedreno/ir3: support a4xx in load/store buffer/image emission

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794>

2 years agofreedreno/perfetto+fdperf: Set SYSPROF param
Rob Clark [Thu, 3 Mar 2022 23:38:22 +0000 (15:38 -0800)]
freedreno/perfetto+fdperf: Set SYSPROF param

No need to check error return and deal with older kernels.  Older
kernels won't have this param but their default behavior allows for
systemwide perfcntr collection.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236>

2 years agofreedreno/drm: Add SYSPROF param
Rob Clark [Thu, 3 Mar 2022 23:34:36 +0000 (15:34 -0800)]
freedreno/drm: Add SYSPROF param

Add new param for putting kernel in system-profiling mode and add
corresponding fd_pipe_set_param() mechanism.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236>

2 years agofreedreno: Update uapi header
Rob Clark [Thu, 3 Mar 2022 23:17:13 +0000 (15:17 -0800)]
freedreno: Update uapi header

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236>

2 years agoegl+libsync: Add helper to complain about invalid fence fd's
Rob Clark [Fri, 25 Feb 2022 17:08:34 +0000 (09:08 -0800)]
egl+libsync: Add helper to complain about invalid fence fd's

Debugging fd lifetime issues can be hard.  Add a helper for debug builds
to print out an error if an fd is not a fence fd, and sprinkle it around

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>

2 years agoandroid: Push in-fence-fd down to driver
Rob Clark [Sat, 19 Feb 2022 16:38:26 +0000 (08:38 -0800)]
android: Push in-fence-fd down to driver

Rather than immediately stall on the CPU in SwapBuffers() if the
in-fence for the dequeued buffer is not yet signaled, push it down
to the driver.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6048
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>

2 years agogallium/dri: Extend image extension to support in-fence
Rob Clark [Sat, 19 Feb 2022 16:36:43 +0000 (08:36 -0800)]
gallium/dri: Extend image extension to support in-fence

Extend dri so that an in-fence-fd can be plumbed through to driver.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094>

2 years agoradv/ci: update list of expected failures
Samuel Pitoiset [Thu, 3 Mar 2022 19:26:54 +0000 (20:26 +0100)]
radv/ci: update list of expected failures

Add dEQP-VK.glsl.builtin.precision_double.determinant.compute.mat3
which fails on all generations.

It looks like CTS should relax tolerance slightly.

Co-authored-by: Charlie Turner <cturner@igalia.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>

2 years agoradv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask
Samuel Pitoiset [Fri, 4 Mar 2022 15:20:25 +0000 (16:20 +0100)]
radv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask

They randomly hang on Navi10 and randomly fail on Sienna Cichlid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>

2 years agov3d: rebind sampler view if resource changed the BO
Juan A. Suarez Romero [Thu, 3 Mar 2022 11:09:32 +0000 (12:09 +0100)]
v3d: rebind sampler view if resource changed the BO

When discarding the whole resource to create a new one, if this resource
is used by a sampler view, a rebind must be done to use the new
resource.

But this must be done when setting the sampler views, because we don't
have access to those samplers before.

v2:
 - Pack shader state on setting sampler views (Iago)
 - Use a serial ID to know when to rebind sampler views (Juan)

v3:
 - Move check to caller (Iago)
 - Keep rebind sampler view on BO change (Iago)

v4:
 - Rename "serial_bo" to "serial_id" (Iago)
 - Add comments (Iago)

Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6027
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15171>

2 years agopanfrost: Push twice as many uniforms
Alyssa Rosenzweig [Thu, 3 Mar 2022 22:39:02 +0000 (17:39 -0500)]
panfrost: Push twice as many uniforms

The limit for Bifrost is twice as high as previously thought -- the limit is 64
*slots* of FAU, not 64 words. Each slot is 2 words. We can push twice as much,
saving a considerable number of cycles in some cases.

total instructions in shared programs: 2454260 -> 2431502 (-0.93%)
instructions in affected programs: 845176 -> 822418 (-2.69%)
helped: 3376
HURT: 304
helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6
helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11%
HURT stats (abs)   min: 1.0 max: 60.0 x̄: 13.06 x̃: 8
HURT stats (rel)   min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52%
95% mean confidence interval for instructions value: -6.50 -5.87
95% mean confidence interval for instructions %-change: -3.75% -3.43%
Instructions are helped.

total tuples in shared programs: 1963383 -> 1951560 (-0.60%)
tuples in affected programs: 638622 -> 626799 (-1.85%)
helped: 2959
HURT: 573
helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4
helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12%
HURT stats (abs)   min: 1.0 max: 50.0 x̄: 8.35 x̃: 6
HURT stats (rel)   min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92%
95% mean confidence interval for tuples value: -3.61 -3.08
95% mean confidence interval for tuples %-change: -2.18% -1.85%
Tuples are helped.

total clauses in shared programs: 387817 -> 365111 (-5.85%)
clauses in affected programs: 135527 -> 112821 (-16.75%)
helped: 3489
HURT: 25
helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5
helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.56 x̃: 1
HURT stats (rel)   min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67%
95% mean confidence interval for clauses value: -6.67 -6.26
95% mean confidence interval for clauses %-change: -17.65% -16.96%
Clauses are helped.

total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%)
cycles in affected programs: 84035.50 -> 50947.33 (-39.37%)
helped: 3547
HURT: 136
helped stats (abs) min: 0.041665999999999315 max: 54.0 x̄: 9.33 x̃: 8
helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84%
HURT stats (abs)   min: 0.041665999999999315 max: 1.0 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61%
95% mean confidence interval for cycles value: -9.26 -8.71
95% mean confidence interval for cycles %-change: -35.34% -34.11%
Cycles are helped.

total arith in shared programs: 74918.46 -> 75022.62 (0.14%)
arith in affected programs: 22471.04 -> 22575.21 (0.46%)
helped: 1571
HURT: 1492
helped stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.17 x̃: 0
helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96%
HURT stats (abs)   min: 0.041665999999999315 max: 2.375 x̄: 0.25 x̃: 0
HURT stats (rel)   min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37%
95% mean confidence interval for arith value: 0.02 0.05
95% mean confidence interval for arith %-change: 1.08% 1.56%
Arith are HURT.

total ldst in shared programs: 174812 -> 137889 (-21.12%)
ldst in affected programs: 81319 -> 44396 (-45.41%)
helped: 3722
HURT: 0
helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8
helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75%
95% mean confidence interval for ldst value: -10.20 -9.64
95% mean confidence interval for ldst %-change: -47.97% -46.39%
Ldst are helped.

total quadwords in shared programs: 1757124 -> 1714130 (-2.45%)
quadwords in affected programs: 584065 -> 541071 (-7.36%)
helped: 3474
HURT: 173
helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9
helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33%
HURT stats (abs)   min: 1.0 max: 26.0 x̄: 5.76 x̃: 4
HURT stats (rel)   min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63%
95% mean confidence interval for quadwords value: -12.21 -11.37
95% mean confidence interval for quadwords %-change: -8.36% -7.95%
Quadwords are helped.

total threads in shared programs: 52898 -> 53142 (0.46%)
threads in affected programs: 262 -> 506 (93.13%)
helped: 250
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.92 0.99
95% mean confidence interval for threads %-change: 93.69% 99.28%
Threads are helped.

total spills in shared programs: 161 -> 107 (-33.54%)
spills in affected programs: 54 -> 0
helped: 27
HURT: 0

total fills in shared programs: 1386 -> 796 (-42.57%)
fills in affected programs: 590 -> 0
helped: 27
HURT: 0

Fixes: d4dccea0ba3 ("panfrost: Add UBO push data structure")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>

2 years agopan/bi: Run CSE after lowering FAU
Alyssa Rosenzweig [Fri, 4 Mar 2022 02:04:28 +0000 (21:04 -0500)]
pan/bi: Run CSE after lowering FAU

Lowering FAU can add moves from uniforms. If a uniform is moved out to a
register mulitple times in a basic block, these moves can be CSE'd, saving
instructions at the cost of register pressure.

854 shaders in my shader-db are helped on cycle count (average 2.94% reduction
in cycles). Only 9 shaders have hurt thread count, and there is no change in
spills or fills. Overall, this seems to be a win.

Prevents instruction count regressions from the next commit.

total instructions in shared programs: 2454423 -> 2444690 (-0.40%)
instructions in affected programs: 386274 -> 376541 (-2.52%)
helped: 2105
HURT: 0
helped stats (abs) min: 1.0 max: 116.0 x̄: 4.62 x̃: 2
helped stats (rel) min: 0.04% max: 27.27% x̄: 3.64% x̃: 1.92%
95% mean confidence interval for instructions value: -4.91 -4.33
95% mean confidence interval for instructions %-change: -3.83% -3.45%
Instructions are helped.

total tuples in shared programs: 1963534 -> 1957106 (-0.33%)
tuples in affected programs: 233562 -> 227134 (-2.75%)
helped: 1491
HURT: 117
helped stats (abs) min: 1.0 max: 63.0 x̄: 4.44 x̃: 2
helped stats (rel) min: 0.04% max: 24.53% x̄: 4.39% x̃: 2.59%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.61 x̃: 1
HURT stats (rel)   min: 0.18% max: 8.33% x̄: 1.44% x̃: 1.05%
95% mean confidence interval for tuples value: -4.28 -3.71
95% mean confidence interval for tuples %-change: -4.20% -3.73%
Tuples are helped.

total clauses in shared programs: 387848 -> 387079 (-0.20%)
clauses in affected programs: 13718 -> 12949 (-5.61%)
helped: 583
HURT: 60
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.42 x̃: 1
helped stats (rel) min: 1.11% max: 25.00% x̄: 8.28% x̃: 6.67%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.86% max: 20.00% x̄: 4.58% x̃: 4.00%
95% mean confidence interval for clauses value: -1.29 -1.10
95% mean confidence interval for clauses %-change: -7.57% -6.58%
Clauses are helped.

total cycles in shared programs: 201866.21 -> 201682.92 (-0.09%)
cycles in affected programs: 6241.79 -> 6058.50 (-2.94%)
helped: 952
HURT: 98
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.12% max: 26.00% x̄: 4.05% x̃: 2.38%
HURT stats (abs)   min: 0.041665999999999315 max: 0.16666700000000034 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.18% max: 8.70% x̄: 1.60% x̃: 1.43%
95% mean confidence interval for cycles value: -0.19 -0.16
95% mean confidence interval for cycles %-change: -3.80% -3.24%
Cycles are helped.

total arith in shared programs: 74924.00 -> 74660.12 (-0.35%)
arith in affected programs: 9303.67 -> 9039.79 (-2.84%)
helped: 1513
HURT: 118
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.07% max: 33.33% x̄: 4.68% x̃: 2.67%
HURT stats (abs)   min: 0.041665999999999315 max: 0.16666800000000137 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.18% max: 8.70% x̄: 1.55% x̃: 1.37%
95% mean confidence interval for arith value: -0.17 -0.15
95% mean confidence interval for arith %-change: -4.48% -3.98%
Arith are helped.

total quadwords in shared programs: 1757254 -> 1751978 (-0.30%)
quadwords in affected programs: 197399 -> 192123 (-2.67%)
helped: 1464
HURT: 110
helped stats (abs) min: 1.0 max: 51.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.04% max: 21.95% x̄: 4.16% x̃: 2.52%
HURT stats (abs)   min: 1.0 max: 7.0 x̄: 1.71 x̃: 1
HURT stats (rel)   min: 0.21% max: 13.04% x̄: 1.65% x̃: 0.93%
95% mean confidence interval for quadwords value: -3.58 -3.13
95% mean confidence interval for quadwords %-change: -3.97% -3.53%
Quadwords are helped.

total threads in shared programs: 52899 -> 52890 (-0.02%)
threads in affected programs: 18 -> 9 (-50.00%)
helped: 0
HURT: 9
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>

2 years agofrontends/va: ignore incoming frame_num from VA picture parameters
Henry Goffin [Wed, 29 Dec 2021 09:20:30 +0000 (09:20 +0000)]
frontends/va: ignore incoming frame_num from VA picture parameters

The Gallium pipe video "frame_num" variable is internally used as a
counter of elapsed reference frames since the last IDR. The incoming
frame_num field from VA picture parameters is not equivalent; the VA
value may wrap to zero prematurely, as it is a 16-bit struct field with
a documented max value of 2^(log2_max_frame_num_minus4 + 4)-1.

This change improves "infinite GOP" single-client live streaming, where
it is reasonable for the server to desire an endless series of P-frames
without IDR. Without this change, it is difficult/impossible for an
application to encode a P- or B-frame after the VA frame_num field wraps
around to zero, depending on the backend encoder implementation.

This change has no effect on existing applications that always signal an
IDR frame and reset the VA frame_num to zero before it wraps around. For
example, the FFmpeg vaapi encoder ignores the VA documentation and sends
an un-wrapped VA frame_num, which results in identical computation of
the internal frame_num (as long as each GOP is less than 65536 frames).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5768

Reviewed-by: Thong Thai <thong.thai@amd.com>
patch revision 3: correctly avoid incrementing frame_num when the encoded
frame is not a reference, per h264 spec and ffmpeg behavior

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14332>

2 years agoaco: rework removal of jumps over branches
Rhys Perry [Wed, 2 Mar 2022 13:32:42 +0000 (13:32 +0000)]
aco: rework removal of jumps over branches

Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.

Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.

fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f030b75b7d2 ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214>