platform/upstream/mesa.git
2 years ago llvmpipe: add images to the scene resource tracker.
Dave Airlie [Wed, 9 Feb 2022 05:39:57 +0000 (15:39 +1000)]
llvmpipe: add images to the scene resource tracker.

This adds all the images to the scene resource tracker.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: add ssbo to resources reference by scenes.
Dave Airlie [Wed, 9 Feb 2022 05:37:40 +0000 (15:37 +1000)]
llvmpipe: add ssbo to resources reference by scenes.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: pass ssbo write mask down into setup.
Dave Airlie [Wed, 9 Feb 2022 05:34:06 +0000 (15:34 +1000)]
llvmpipe: pass ssbo write mask down into setup.

This will be used to keep track of ssbo buffer references.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: add writeable resource tracking to the scene.
Dave Airlie [Wed, 9 Feb 2022 05:07:53 +0000 (15:07 +1000)]
llvmpipe: add writeable resource tracking to the scene.

The scene tracks resources, but currently only textures. For
scene overlap to work properly, it also needs to track fb, ssbo
and image resources.
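
A minimal sketch of why a writeable flag matters for overlap, assuming a
generic read/write hazard rule rather than llvmpipe's actual data
structures. The names toy_scene, toy_ref, toy_scene_add_ref and
toy_scenes_conflict are hypothetical. Two scenes may overlap only if no
resource they share is written by either of them.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REFS 8

/* One tracked resource plus whether this scene may write to it. */
struct toy_ref {
    const void *resource;
    bool writeable;
};

/* Hypothetical scene: just a small fixed list of references. */
struct toy_scene {
    struct toy_ref refs[TOY_MAX_REFS];
    size_t num_refs;
};

/* Record a reference; a repeated resource only upgrades the write flag. */
static bool toy_scene_add_ref(struct toy_scene *s, const void *res,
                              bool writeable)
{
    for (size_t i = 0; i < s->num_refs; i++) {
        if (s->refs[i].resource == res) {
            s->refs[i].writeable |= writeable;
            return true;
        }
    }
    if (s->num_refs == TOY_MAX_REFS)
        return false; /* list is full */
    s->refs[s->num_refs++] = (struct toy_ref){ res, writeable };
    return true;
}

/* Two scenes conflict if they share a resource that either one writes. */
static bool toy_scenes_conflict(const struct toy_scene *a,
                                const struct toy_scene *b)
{
    for (size_t i = 0; i < a->num_refs; i++) {
        for (size_t j = 0; j < b->num_refs; j++) {
            if (a->refs[i].resource == b->refs[j].resource &&
                (a->refs[i].writeable || b->refs[j].writeable))
                return true;
        }
    }
    return false;
}
```

With only textures tracked, a framebuffer written by one scene and read
by the next would never show up in such a check, which is the gap the
message describes.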

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: size initial allocation and free scenes
Dave Airlie [Tue, 31 Mar 2020 05:57:19 +0000 (15:57 +1000)]
llvmpipe: size initial allocation and free scenes

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: handle dynamically creating scenes when needed
Dave Airlie [Tue, 31 Mar 2020 05:49:16 +0000 (15:49 +1000)]
llvmpipe: handle dynamically creating scenes when needed

This will create scenes from the slab allocator, up to the max.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: base the scene queue size of the max number of scenes.
Dave Airlie [Tue, 31 Mar 2020 05:48:51 +0000 (15:48 +1000)]
llvmpipe: base the scene queue size of the max number of scenes.

If the max scenes increases then the queue will get resized.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe/scene: move to slab allocated objects for scenes.
Dave Airlie [Tue, 31 Mar 2020 05:26:18 +0000 (15:26 +1000)]
llvmpipe/scene: move to slab allocated objects for scenes.

Currently we only allocate one scene, but I'd like to increase that,
so move it to a slab allocator.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe/flush: always finish whether for cpu/gpu access.
Dave Airlie [Fri, 11 Feb 2022 05:45:34 +0000 (15:45 +1000)]
llvmpipe/flush: always finish whether for cpu/gpu access.

Subsequent GPU access to resources for reading in the vertex
shader may rely on previous fragment shaders being flushed out.

Always finish here.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago llvmpipe: convert texture barrier to a finish.
Dave Airlie [Wed, 9 Feb 2022 01:13:10 +0000 (11:13 +1000)]
llvmpipe: convert texture barrier to a finish.

Need to flush the rasterizer and wait for everything to finish;
with the new overlap, a flush isn't enough.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago lavapipe: handle non-timeline semaphores wait/signal.
Dave Airlie [Mon, 7 Feb 2022 00:08:02 +0000 (10:08 +1000)]
lavapipe: handle non-timeline semaphores wait/signal.

When llvmpipe is allowed to execute fragment shaders overlapping
with other stuff, we have to start using the pipe fences.

With presentation, the acquire path needs to signal a semaphore
that can be waited on by the user, so add support for passing
signal/wait semaphores for non-timeline in, and just put a
fence pointer in the semaphore for that case.

This fixes rendering once we allow overlapping rasterization.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago lavapipe: don't flush on transfer operations.
Dave Airlie [Sun, 13 Feb 2022 23:35:25 +0000 (09:35 +1000)]
lavapipe: don't flush on transfer operations.

The pipeline barrier/wait event code should handle this.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago lavapipe: execute a finish in pipeline barrier and event waiting.
Dave Airlie [Mon, 16 Nov 2020 05:44:06 +0000 (15:44 +1000)]
lavapipe: execute a finish in pipeline barrier and event waiting.

Refactor out the code for finishing a fence and use it in
pipeline barrier and event waiting.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago lavapipe: handle endless fence timeout properly.
Dave Airlie [Mon, 7 Feb 2022 07:41:04 +0000 (17:41 +1000)]
lavapipe: handle endless fence timeout properly.

If the user asks for an infinite timeout, just pass it through
to gallium.

When llvmpipe ends up allowing async fragment shaders, it's important
to get this right for lots of CTS tests.

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago lavapipe: fix pipeline statistic query results with availability.
Dave Airlie [Tue, 8 Feb 2022 07:26:40 +0000 (17:26 +1000)]
lavapipe: fix pipeline statistic query results with availability.

The availability is meant to be the last integer value written,
but for pipeline stats this was being done wrong. Calculate
the availability position properly.

With the old non-overlapping execution model, queries were never
unavailable.
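
A minimal sketch of the position calculation, assuming the usual Vulkan
layout in which a pipeline-statistics query writes one integer per
enabled statistic followed by the availability value. The helper names
below are hypothetical and are not lavapipe's.

```c
#include <stdint.h>

/* Count the set bits in a pipeline-statistics flag mask. */
static unsigned count_enabled_stats(uint32_t stats_mask)
{
    unsigned n = 0;
    while (stats_mask) {
        stats_mask &= stats_mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Index, in result-sized integers, of the availability value for one
 * query: it comes right after the enabled counters. */
static unsigned availability_index(uint32_t stats_mask)
{
    return count_enabled_stats(stats_mask);
}

/* Number of integers one query occupies when availability is requested. */
static unsigned values_per_query(uint32_t stats_mask)
{
    return count_enabled_stats(stats_mask) + 1;
}
```

The point of the sketch is that the availability slot depends on how
many statistics are enabled, so a fixed index is only right for
single-value query types.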

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago drisw: fence drawing to the swap/copy buffers.
Dave Airlie [Tue, 31 Mar 2020 05:30:34 +0000 (15:30 +1000)]
drisw: fence drawing to the swap/copy buffers.

Currently neither llvmpipe nor softpipe ever leaves any drawing in
the pipeline, but I'd like to change that for llvmpipe.

This makes drisw block for completed rendering before sending
data to the X server.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14923>

2 years ago freedreno/ir3: document GETINFO's x/y results
Ilia Mirkin [Mon, 29 Nov 2021 07:20:36 +0000 (02:20 -0500)]
freedreno/ir3: document GETINFO's x/y results

The zw were already known, but throw them in here too. I'm not extremely
happy with the description of "y", feels like there's a simpler
explanation there, but I couldn't find it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14672>

2 years ago radeonsi: fix depth stencil multi sample texture blit
Qiang Yu [Fri, 11 Feb 2022 07:01:25 +0000 (15:01 +0800)]
radeonsi: fix depth stencil multi sample texture blit

This causes the flushed_depth_texture to be allocated without
multi sample, so the blit will cause a VM fault.

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14990>

2 years ago crocus: fix leak on gen4/5 stencil fallback blit path.
Dave Airlie [Sun, 20 Feb 2022 23:35:00 +0000 (09:35 +1000)]
crocus: fix leak on gen4/5 stencil fallback blit path.

Noticed by Ilia.

Fixes: f3630548f1da ("crocus: initial gallium driver for Intel gfx 4-7")

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15100>

2 years ago freedreno/a4xx: make luminance formats renderable, add missing L8A8_SNORM
Ilia Mirkin [Sat, 19 Feb 2022 17:19:58 +0000 (12:19 -0500)]
freedreno/a4xx: make luminance formats renderable, add missing L8A8_SNORM

If the luminance formats aren't renderable, they back out to R*
formats, but those will end up with a 1 in alpha rather than 0 when
textured. So instead make them explicitly renderable, which will cause
the correct texture format swizzle to be applied.

Fixes query-rgba-signed-components and probably others.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15097>

2 years ago freedreno/a4xx: use correct macro for color
Ilia Mirkin [Sat, 19 Feb 2022 17:16:27 +0000 (12:16 -0500)]
freedreno/a4xx: use correct macro for color

Doesn't actually matter since all the colors are encoded the same. But
for consistency...

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15097>

2 years ago turnip: Add a refcount mechanism to BOs
Danylo Piliaiev [Wed, 2 Feb 2022 17:29:34 +0000 (19:29 +0200)]
turnip: Add a refcount mechanism to BOs

Until now we have lived without a refcount mechanism in the driver
because in Vulkan the user is responsible for handling the life
span of memory allocations for all Vulkan objects. However,
imported BOs are tricky because the kernel doesn't refcount,
so user-space needs to make sure that:

1. When importing a BO into the same device used to create it
   (self-importing), it does not double-free the same BO.
2. It frees imported BOs that were not allocated through the same
   device.

Our initial implementation always freed BOs when requested,
so we handled 2) correctly but not 1) on drm, and we would
double-free self-imported BOs because the kernel doesn't return
a unique gem_handle on each import.

Besides this, the submit ioctl checks for duplicates in the
BO list and returns an error if there is one.

This fixes the problem for good by adding refcounts to BOs
so that self-imported BOs have a refcnt > 1 and are only freed
when all references are freed.

KGSL on the other hand does not have the same problems,
at least not with ION buffers which are used for exportable
BOs on pre-5.10 Android kernels.
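
A minimal sketch of the refcounting idea, assuming a toy device that
keeps one BO slot per kernel handle. The names toy_bo, toy_device,
toy_bo_get and toy_bo_put are hypothetical and do not mirror turnip's
structures. Because a self-import yields the same gem_handle, both the
allocation and the import land on the same slot, and the handle is only
closed when the last reference goes away.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLES 16

struct toy_bo {
    uint32_t gem_handle;
    int refcnt; /* 0 means the slot is unused */
};

struct toy_device {
    struct toy_bo bos[MAX_HANDLES]; /* indexed by gem_handle */
    int kernel_frees;               /* times a handle was "closed" */
};

/* Allocation and import both end up here with a kernel handle. A
 * self-import returns a handle we already track, so it just takes
 * another reference on the same BO. */
static struct toy_bo *toy_bo_get(struct toy_device *dev, uint32_t gem_handle)
{
    if (gem_handle >= MAX_HANDLES)
        return NULL;
    struct toy_bo *bo = &dev->bos[gem_handle];
    bo->gem_handle = gem_handle;
    bo->refcnt++;
    return bo;
}

/* Drop one reference; returns true if the handle was really closed. */
static bool toy_bo_put(struct toy_device *dev, struct toy_bo *bo)
{
    if (--bo->refcnt > 0)
        return false;
    dev->kernel_frees++; /* stands in for closing the GEM handle */
    return true;
}
```

In this toy model the same slot also guarantees that a BO appears at
most once in any list built from the table, which matches the
duplicate check mentioned for the submit ioctl.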

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5936
Fixes CTS tests: dEQP-VK.drm_format_modifiers.export_import.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15031>

2 years ago anv/genxml/intel/fs: fix binding shader record entry
Lionel Landwerlin [Mon, 31 Jan 2022 12:43:04 +0000 (12:43 +0000)]
anv/genxml/intel/fs: fix binding shader record entry

Bit is flipped compared to all the other packets.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 705395344d25 ("intel/fs: Add support for compiling bindless shaders with resume shaders")
Fixes: c3ac9afca389 ("anv: Create and return ray-tracing pipeline SBT handles")
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15078>

2 years ago venus: trace vn_ring_wait_space
Chia-I Wu [Wed, 9 Feb 2022 23:48:36 +0000 (15:48 -0800)]
venus: trace vn_ring_wait_space

It is good to know that we run out of ring space and have to wait.  This
happens easily with fossilize-replay because encoding a
vkCreateGraphicsPipeline takes microseconds while executing it can take
milliseconds, >100ms sometimes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14966>

2 years ago venus: cache VkFormatProperties
Chia-I Wu [Wed, 9 Feb 2022 22:38:29 +0000 (14:38 -0800)]
venus: cache VkFormatProperties

This is for fossilize-replay which keeps querying for the same formats.
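
A minimal sketch of such a cache, assuming a small dense format index
and a stand-in for the expensive query. Every name here (toy_cache,
toy_format_props, toy_slow_query, toy_get_props) is hypothetical; this
is not how venus stores its cache.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_FORMAT_COUNT 8

/* Stand-in for the three feature masks of a format-properties query. */
struct toy_format_props {
    uint32_t linear, optimal, buffer;
};

struct toy_cache {
    struct toy_format_props props[TOY_FORMAT_COUNT];
    bool valid[TOY_FORMAT_COUNT];
    unsigned slow_queries; /* how often we had to ask the backend */
};

/* Pretend round trip to the host; the values are arbitrary. */
static struct toy_format_props toy_slow_query(struct toy_cache *c,
                                              unsigned format)
{
    c->slow_queries++;
    struct toy_format_props p = { format, format * 2u, format * 3u };
    return p;
}

/* Serve repeated queries for the same format from the cache. */
static struct toy_format_props toy_get_props(struct toy_cache *c,
                                             unsigned format)
{
    if (format >= TOY_FORMAT_COUNT)
        return toy_slow_query(c, format); /* not cacheable here */
    if (!c->valid[format]) {
        c->props[format] = toy_slow_query(c, format);
        c->valid[format] = true;
    }
    return c->props[format];
}
```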

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14966>

2 years ago pan/bi: Promote MUX to CSEL in the scheduler
Alyssa Rosenzweig [Fri, 18 Feb 2022 17:49:06 +0000 (12:49 -0500)]
pan/bi: Promote MUX to CSEL in the scheduler

Helps scheduling, and makes scheduling more predictable when deciding between
MUX and CSEL.

total tuples in shared programs: 1523328 -> 1516256 (-0.46%)
tuples in affected programs: 509800 -> 502728 (-1.39%)
helped: 1977
HURT: 181
helped stats (abs) min: 1.0 max: 48.0 x̄: 3.71 x̃: 2
helped stats (rel) min: 0.04% max: 14.29% x̄: 1.98% x̃: 1.28%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.43 x̃: 1
HURT stats (rel)   min: 0.14% max: 7.69% x̄: 1.40% x̃: 0.70%
95% mean confidence interval for tuples value: -3.47 -3.08
95% mean confidence interval for tuples %-change: -1.79% -1.60%
Tuples are helped.

total clauses in shared programs: 350552 -> 349906 (-0.18%)
clauses in affected programs: 34839 -> 34193 (-1.85%)
helped: 570
HURT: 49
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.22 x̃: 1
helped stats (rel) min: 0.67% max: 20.00% x̄: 3.26% x̃: 2.22%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.92% max: 16.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for clauses value: -1.13 -0.96
95% mean confidence interval for clauses %-change: -2.95% -2.38%
Clauses are helped.

total cycles in shared programs: 202589.37 -> 202512.25 (-0.04%)
cycles in affected programs: 7644.46 -> 7567.33 (-1.01%)
helped: 771
HURT: 147
helped stats (abs) min: 0.041665999999999315 max: 1.8333360000000027 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.16% max: 14.29% x̄: 2.10% x̃: 1.35%
HURT stats (abs)   min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.24% max: 7.41% x̄: 1.49% x̃: 1.11%
95% mean confidence interval for cycles value: -0.09 -0.07
95% mean confidence interval for cycles %-change: -1.69% -1.36%
Cycles are helped.

total arith in shared programs: 56755.96 -> 56585.50 (-0.30%)
arith in affected programs: 18746.29 -> 18575.83 (-0.91%)
helped: 1605
HURT: 352
helped stats (abs) min: 0.04166399999999726 max: 1.8333360000000027 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.92% x̃: 1.12%
HURT stats (abs)   min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 0.17% max: 33.33% x̄: 2.09% x̃: 1.08%
95% mean confidence interval for arith value: -0.09 -0.08
95% mean confidence interval for arith %-change: -1.34% -1.07%
Arith are helped.

total quadwords in shared programs: 1429737 -> 1424670 (-0.35%)
quadwords in affected programs: 418175 -> 413108 (-1.21%)
helped: 1682
HURT: 198
helped stats (abs) min: 1.0 max: 35.0 x̄: 3.17 x̃: 2
helped stats (rel) min: 0.04% max: 13.33% x̄: 1.72% x̃: 1.29%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.38 x̃: 1
HURT stats (rel)   min: 0.15% max: 7.41% x̄: 1.30% x̃: 0.92%
95% mean confidence interval for quadwords value: -2.86 -2.53
95% mean confidence interval for quadwords %-change: -1.48% -1.32%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Revert "Fix load_const of 1-bit booleans"
Alyssa Rosenzweig [Sat, 15 Jan 2022 19:00:17 +0000 (14:00 -0500)]
pan/bi: Revert "Fix load_const of 1-bit booleans"

This reverts commit 29d319c767394b685e2b421a89a7e8e7103e2688.

Now that we use nir_lower_bool_to_bitsize, we don't see 1-bit booleans
anymore, so the issue this fixed doesn't apply. Actually, that issue was
(in part) why I started looking into boolean handling in the first
place.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Switch to lower_bool_to_bitsize
Alyssa Rosenzweig [Sat, 15 Jan 2022 18:30:39 +0000 (13:30 -0500)]
pan/bi: Switch to lower_bool_to_bitsize

Instead of ingesting 1-bit booleans and trying to force everything to be
16-bit, except when it isn't, and creating a mess in the backend... just
use the NIR pass designed to select bitsize for booleans. Yes, this
means we need to handle more NIR instructions, but the handling is
easier and the conversion is more obvious (except for some edge cases
like 16-bit vectorized b32csel). This generates noticeably better code,
and the generated code will be easier to optimize.

total instructions in shared programs: 90257 -> 88941 (-1.46%)
instructions in affected programs: 49145 -> 47829 (-2.68%)
helped: 201
HURT: 2
helped stats (abs) min: 1.0 max: 40.0 x̄: 6.57 x̃: 3
helped stats (rel) min: 0.29% max: 13.89% x̄: 2.57% x̃: 1.90%
HURT stats (abs)   min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 2.15% max: 2.74% x̄: 2.45% x̃: 2.45%
95% mean confidence interval for instructions value: -7.71 -5.26
95% mean confidence interval for instructions %-change: -2.84% -2.20%
Instructions are helped.

total tuples in shared programs: 73740 -> 72922 (-1.11%)
tuples in affected programs: 36564 -> 35746 (-2.24%)
helped: 184
HURT: 7
helped stats (abs) min: 1.0 max: 74.0 x̄: 4.49 x̃: 2
helped stats (rel) min: 0.30% max: 16.67% x̄: 2.86% x̃: 1.89%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.29 x̃: 1
HURT stats (rel)   min: 0.12% max: 12.50% x̄: 4.26% x̃: 3.33%
95% mean confidence interval for tuples value: -5.29 -3.28
95% mean confidence interval for tuples %-change: -3.06% -2.13%
Tuples are helped.

total clauses in shared programs: 15993 -> 15928 (-0.41%)
clauses in affected programs: 2464 -> 2399 (-2.64%)
helped: 35
HURT: 16
helped stats (abs) min: 1.0 max: 27.0 x̄: 2.31 x̃: 1
helped stats (rel) min: 0.49% max: 18.88% x̄: 7.63% x̃: 5.88%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.79% max: 6.25% x̄: 1.91% x̃: 1.01%
95% mean confidence interval for clauses value: -2.46 -0.09
95% mean confidence interval for clauses %-change: -6.38% -2.90%
Clauses are helped.

total cycles in shared programs: 7622.13 -> 7594.75 (-0.36%)
cycles in affected programs: 1078.67 -> 1051.29 (-2.54%)
helped: 103
HURT: 4
helped stats (abs) min: 0.041665999999999315 max: 3.0833319999999986 x̄: 0.27 x̃: 0
helped stats (rel) min: 0.32% max: 21.05% x̄: 3.62% x̃: 2.44%
HURT stats (abs)   min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0
HURT stats (rel)   min: 0.13% max: 7.14% x̄: 2.94% x̃: 2.25%
95% mean confidence interval for cycles value: -0.33 -0.19
95% mean confidence interval for cycles %-change: -4.14% -2.61%
Cycles are helped.

total arith in shared programs: 2762.46 -> 2728.08 (-1.24%)
arith in affected programs: 1550.12 -> 1515.75 (-2.22%)
helped: 197
HURT: 6
helped stats (abs) min: 0.041665999999999315 max: 3.0833319999999986 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.32% max: 21.05% x̄: 2.93% x̃: 1.61%
HURT stats (abs)   min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 0.13% max: 20.00% x̄: 5.78% x̃: 3.37%
95% mean confidence interval for arith value: -0.21 -0.13
95% mean confidence interval for arith %-change: -3.20% -2.15%
Arith are helped.

total quadwords in shared programs: 68155 -> 67555 (-0.88%)
quadwords in affected programs: 27944 -> 27344 (-2.15%)
helped: 151
HURT: 9
helped stats (abs) min: 1.0 max: 52.0 x̄: 4.09 x̃: 3
helped stats (rel) min: 0.23% max: 12.35% x̄: 2.87% x̃: 2.17%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 1.89 x̃: 1
HURT stats (rel)   min: 0.20% max: 6.76% x̄: 1.91% x̃: 1.13%
95% mean confidence interval for quadwords value: -4.67 -2.83
95% mean confidence interval for quadwords %-change: -2.99% -2.21%
Quadwords are helped.

total threads in shared programs: 2232 -> 2233 (0.04%)
threads in affected programs: 1 -> 2 (100.00%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Handle vectorized u2f16/i2f16
Alyssa Rosenzweig [Thu, 17 Feb 2022 23:40:59 +0000 (18:40 -0500)]
pan/bi: Handle vectorized u2f16/i2f16

Will be useful when we enable int16, I guess...

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Handle trivial i2i32
Alyssa Rosenzweig [Sat, 15 Jan 2022 19:06:06 +0000 (14:06 -0500)]
pan/bi: Handle trivial i2i32

lower_bool_to_bitsize can generate i2i32 from a 32-bit source, which is
trivial but needs to be handled explicitly to avoid going down the 8-bit
conversion path.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Optimize replication
Alyssa Rosenzweig [Sat, 15 Jan 2022 17:26:42 +0000 (12:26 -0500)]
pan/bi: Optimize replication

Bifrost's 16-bit support comes in the form of vectorized instructions,
so when we manipulate scalars, we usually replicate to both bottom and
top halves of 32-bit registers. Add an analysis pass that detects
replication. Then, use that replication pass to optimize out useless
swizzle instructions (by changing them to plain moves, which can be
copypropped).

This optimization is a slight shader-db win on its own, and allows us to
transition to lower_bool_to_bitsize without regressing shader-db.
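
A minimal sketch of the two steps described above, assuming a toy SSA
instruction list rather than the Bifrost IR. The opcode names and the
toy_analyze / toy_lower_swizzles helpers are hypothetical, and the
analysis is deliberately conservative: a value counts as replicated
only when it comes from a splat or from lane-wise operations on
replicated sources.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_op {
    TOY_SPLAT16,   /* writes the same 16-bit value to both halves */
    TOY_MOV,       /* plain 32-bit move */
    TOY_SWZ_V2I16, /* reorders the two 16-bit halves of src0 */
    TOY_ADD_V2I16, /* lane-wise 16-bit add of src0 and src1 */
    TOY_LOAD       /* unknown 32-bit value */
};

struct toy_instr {
    enum toy_op op;
    int dest;       /* SSA index written */
    int src0, src1; /* SSA indices read, -1 if unused */
};

#define TOY_MAX_SSA 32

/* Forward pass: mark which SSA values hold identical halves.
 * Instructions are in SSA order, so sources are analyzed first. */
static void toy_analyze(const struct toy_instr *I, size_t n, bool *replicated)
{
    for (size_t i = 0; i < n; i++) {
        bool r;
        switch (I[i].op) {
        case TOY_SPLAT16:
            r = true;
            break;
        case TOY_MOV:
        case TOY_SWZ_V2I16:
            r = replicated[I[i].src0];
            break;
        case TOY_ADD_V2I16:
            r = replicated[I[i].src0] && replicated[I[i].src1];
            break;
        default:
            r = false;
            break;
        }
        replicated[I[i].dest] = r;
    }
}

/* Swizzling a replicated value changes nothing, so turn it into a move
 * that copy propagation can later remove. Returns the rewrite count. */
static unsigned toy_lower_swizzles(struct toy_instr *I, size_t n,
                                   const bool *replicated)
{
    unsigned changed = 0;
    for (size_t i = 0; i < n; i++) {
        if (I[i].op == TOY_SWZ_V2I16 && replicated[I[i].src0]) {
            I[i].op = TOY_MOV;
            changed++;
        }
    }
    return changed;
}
```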

total instructions in shared programs: 90323 -> 90257 (-0.07%)
instructions in affected programs: 2513 -> 2447 (-2.63%)
helped: 20
HURT: 0
helped stats (abs) min: 1.0 max: 16.0 x̄: 3.30 x̃: 2
helped stats (rel) min: 1.25% max: 11.11% x̄: 4.80% x̃: 4.29%
95% mean confidence interval for instructions value: -5.05 -1.55
95% mean confidence interval for instructions %-change: -6.06% -3.54%
Instructions are helped.

total tuples in shared programs: 73769 -> 73740 (-0.04%)
tuples in affected programs: 1611 -> 1582 (-1.80%)
helped: 17
HURT: 0
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.71 x̃: 1
helped stats (rel) min: 0.58% max: 16.67% x̄: 4.80% x̃: 3.33%
95% mean confidence interval for tuples value: -2.70 -0.71
95% mean confidence interval for tuples %-change: -7.06% -2.54%
Tuples are helped.

total clauses in shared programs: 15997 -> 15993 (-0.03%)
clauses in affected programs: 27 -> 23 (-14.81%)
helped: 4
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 7.69% max: 25.00% x̄: 18.17% x̃: 20.00%
95% mean confidence interval for clauses value: -1.00 -1.00
95% mean confidence interval for clauses %-change: -29.91% -6.44%
Clauses are helped.

total cycles in shared programs: 7623.13 -> 7622.13 (-0.01%)
cycles in affected programs: 64.83 -> 63.83 (-1.54%)
helped: 13
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.375 x̄: 0.08 x̃: 0
helped stats (rel) min: 1.02% max: 5.56% x̄: 2.82% x̃: 2.50%
95% mean confidence interval for cycles value: -0.13 -0.02
95% mean confidence interval for cycles %-change: -3.79% -1.85%
Cycles are helped.

total arith in shared programs: 2763.75 -> 2762.46 (-0.05%)
arith in affected programs: 67.17 -> 65.88 (-1.92%)
helped: 18
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.375 x̄: 0.07 x̃: 0
helped stats (rel) min: 1.02% max: 22.22% x̄: 5.68% x̃: 3.16%
95% mean confidence interval for arith value: -0.11 -0.03
95% mean confidence interval for arith %-change: -8.56% -2.80%
Arith are helped.

total quadwords in shared programs: 68173 -> 68155 (-0.03%)
quadwords in affected programs: 1258 -> 1240 (-1.43%)
helped: 14
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.29 x̃: 1
helped stats (rel) min: 0.42% max: 8.70% x̄: 3.88% x̃: 3.67%
95% mean confidence interval for quadwords value: -1.64 -0.93
95% mean confidence interval for quadwords %-change: -5.27% -2.49%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Constant fold swizzles on constants
Alyssa Rosenzweig [Sat, 15 Jan 2022 17:25:45 +0000 (12:25 -0500)]
pan/bi: Constant fold swizzles on constants

This lets us avoid generating SWZ instructions. Those instructions could
be constant folded but that complicates the replication analysis
introduced in the next commit.

Almost no shader-db changes.

quadwords HURT:   shaders/glmark/1-22.shader_test MESA_SHADER_FRAGMENT: 718 -> 722 (0.56%)

total quadwords in shared programs: 68169 -> 68173 (<.01%)
quadwords in affected programs: 718 -> 722 (0.56%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Lower swizzles on MUX.v2i16
Alyssa Rosenzweig [Wed, 5 Jan 2022 22:39:07 +0000 (17:39 -0500)]
pan/bi: Lower swizzles on MUX.v2i16

We'll generate this in a moment.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago pan/bi: Lower swizzles on CSEL.i32/MUX.i32
Alyssa Rosenzweig [Fri, 23 Jul 2021 20:49:02 +0000 (16:49 -0400)]
pan/bi: Lower swizzles on CSEL.i32/MUX.i32

This is counter-intuitive, but required for correct operation when
CSEL.i32 takes a 1-bit (stored 16-bit) boolean argument. The impedance
mismatch ultimately is between CSEL.b32 (nir's bcsel, nonexistent in the
hardware) and the lowering CSEL.i32. However, a similar problem exists
even with MUX.i32 which lacks a good way of zero/sign-extending
booleans.

Cherry-picked from my Valhall branch though the issue also affects
Bifrost. Fixes piglit shaders@glsl-vs-if-bool on Bifrost.

Unfortunately, shader-db is quite unhappy :-(

The proper fix is to use lower_bool_to_bitsize, but that can't be
backported to mesa-stable.

total instructions in shared programs: 157539 -> 158953 (0.90%)
instructions in affected programs: 55621 -> 57035 (2.54%)
helped: 2
HURT: 259
helped stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.11% max: 2.67% x̄: 2.39% x̃: 2.39%
HURT stats (abs)   min: 1.0 max: 40.0 x̄: 5.47 x̃: 2
HURT stats (rel)   min: 0.36% max: 16.13% x̄: 2.55% x̃: 1.59%
95% mean confidence interval for instructions value: 4.44 6.40
95% mean confidence interval for instructions %-change: 2.21% 2.82%
Instructions are HURT.

total tuples in shared programs: 132322 -> 132907 (0.44%)
tuples in affected programs: 31806 -> 32391 (1.84%)
helped: 5
HURT: 152
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.39% max: 3.03% x̄: 1.70% x̃: 1.61%
HURT stats (abs)   min: 1.0 max: 42.0 x̄: 3.89 x̃: 2
HURT stats (rel)   min: 0.29% max: 18.18% x̄: 2.50% x̃: 1.79%
95% mean confidence interval for tuples value: 2.88 4.58
95% mean confidence interval for tuples %-change: 1.87% 2.85%
Tuples are HURT.

total clauses in shared programs: 28672 -> 28698 (0.09%)
clauses in affected programs: 869 -> 895 (2.99%)
helped: 1
HURT: 24
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 5.88% max: 5.88% x̄: 5.88% x̃: 5.88%
HURT stats (abs)   min: 1.0 max: 2.0 x̄: 1.12 x̃: 1
HURT stats (rel)   min: 0.49% max: 33.33% x̄: 8.46% x̃: 3.59%
95% mean confidence interval for clauses value: 0.82 1.26
95% mean confidence interval for clauses %-change: 3.84% 11.93%
Clauses are HURT.

total cycles in shared programs: 15119.04 -> 15137.88 (0.12%)
cycles in affected programs: 922.87 -> 941.71 (2.04%)
helped: 4
HURT: 79
helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0
helped stats (rel) min: 0.40% max: 3.17% x̄: 1.57% x̃: 1.35%
HURT stats (abs)   min: 0.041665999999999315 max: 1.75 x̄: 0.24 x̃: 0
HURT stats (rel)   min: 0.30% max: 20.00% x̄: 2.83% x̃: 2.12%
95% mean confidence interval for cycles value: 0.17 0.29
95% mean confidence interval for cycles %-change: 1.86% 3.37%
Cycles are HURT.

total arith in shared programs: 4922.71 -> 4947.71 (0.51%)
arith in affected programs: 1423.79 -> 1448.79 (1.76%)
helped: 5
HURT: 177
helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.40% max: 3.17% x̄: 1.82% x̃: 1.67%
HURT stats (abs)   min: 0.041665999999999315 max: 1.75 x̄: 0.14 x̃: 0
HURT stats (rel)   min: 0.30% max: 22.22% x̄: 2.50% x̃: 1.52%
95% mean confidence interval for arith value: 0.11 0.17
95% mean confidence interval for arith %-change: 1.86% 2.90%
Arith are HURT.

total quadwords in shared programs: 120605 -> 120956 (0.29%)
quadwords in affected programs: 26535 -> 26886 (1.32%)
helped: 6
HURT: 143
helped stats (abs) min: 1.0 max: 7.0 x̄: 2.83 x̃: 1
helped stats (rel) min: 0.93% max: 6.33% x̄: 2.29% x̃: 1.71%
HURT stats (abs)   min: 1.0 max: 21.0 x̄: 2.57 x̃: 2
HURT stats (rel)   min: 0.34% max: 13.79% x̄: 2.02% x̃: 1.22%
95% mean confidence interval for quadwords value: 1.86 2.86
95% mean confidence interval for quadwords %-change: 1.45% 2.24%
Quadwords are HURT.

total threads in shared programs: 4670 -> 4669 (-0.02%)
threads in affected programs: 2 -> 1 (-50.00%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576>

2 years ago ci/freedreno: Add a known spilling hangcheck flake.
Emma Anholt [Fri, 18 Feb 2022 19:39:21 +0000 (11:39 -0800)]
ci/freedreno: Add a known spilling hangcheck flake.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085>

2 years ago ci/freedreno: Cut down pre-merge a630 VK coverage.
Emma Anholt [Fri, 18 Feb 2022 19:30:57 +0000 (11:30 -0800)]
ci/freedreno: Cut down pre-merge a630 VK coverage.

We've got lots of VK coverage on 618, so take some of the load off (but
leave a little bit of testing just to make sure we don't totally break
630).  This should help with our Marge times since we've added some other
coverage to 630 that's started overloading the runners.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085>

2 years ago ci/freedreno: Move a 60s timeout test to skips instead of flakes.
Emma Anholt [Fri, 18 Feb 2022 19:30:04 +0000 (11:30 -0800)]
ci/freedreno: Move a 60s timeout test to skips instead of flakes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085>

2 years ago spirv: Rewrite determinant calculation
Connor Abbott [Tue, 18 Jan 2022 10:52:45 +0000 (11:52 +0100)]
spirv: Rewrite determinant calculation

The old calculation for mat3 was clever, but it turns out that a
straightforward application of subdeterminants similar to how mat4 is
handled is more efficient: on a scalar architecture with some sort of
combined multiply+add instruction with a negate modifier (both fairly
common), the new determinant is 9 instructions vs. 15 for the old one,
and without the multiply-add it's 14 instructions vs. 18 for the old
one.  When used as a routine for inverse() the savings are compounded,
because we now use the same method as used to compute the adjugate
matrix and so CSE can combine most of the calculations with the adjugate
matrix ones.

Once mat3 and mat4 use the same method for computing determinants, we
can combine them into a single recursive function. I also pulled up the
mat_subdet() function because it was doing basically what we need, so
it's now shared between determinant and inverse. This shrinks the
implementation significantly, as can be seen from the diffstat.

The real reason I want to change this, though, is that it fixes
dEQP-VK.glsl.builtin.precision_fp16_storage16b.inverse.compute.mat3 with
turnip. Qualcomm uses round-to-zero for 16-bit frcp, which combined with
some inaccuracy in the old method of calculating the determinant led us
to fail. Qualcomm's driver uses something like the new method to
calculate the determinant in the inverse. We could argue that Mesa's
method should be allowed, because round-to-zero for floating-point
division is within spec and there are no precision guarantees given for
determinant() or inverse(). However we might as well use the more
efficient method.
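
As a rough illustration of the subdeterminant approach described above, here
is a minimal sketch in plain C that expands along the first row and recurses.
It works on floats rather than emitting NIR, and the names subdet() and
determinant() are hypothetical; the real mat_subdet() helper may be shaped
differently.

```c
#include <assert.h>

#define MAX_DIM 4

/* Determinant of the n x n submatrix of m picked out by rows[] and cols[].
 * Illustrative only: the compiler builds instructions, not float values. */
static float
subdet(float m[MAX_DIM][MAX_DIM], int n, const int *rows, const int *cols)
{
   if (n == 1)
      return m[rows[0]][cols[0]];

   float det = 0.0f;
   float sign = 1.0f;
   int sub_cols[MAX_DIM];

   /* Cofactor expansion along the first selected row. */
   for (int j = 0; j < n; j++) {
      /* Build the column list with column j dropped. */
      int k = 0;
      for (int c = 0; c < n; c++) {
         if (c != j)
            sub_cols[k++] = cols[c];
      }

      det += sign * m[rows[0]][cols[j]] *
             subdet(m, n - 1, rows + 1, sub_cols);
      sign = -sign;
   }

   return det;
}

/* Determinant of the top-left n x n block; the same code serves mat2..mat4. */
static float
determinant(float m[MAX_DIM][MAX_DIM], int n)
{
   static const int idx[MAX_DIM] = {0, 1, 2, 3};

   assert(n >= 1 && n <= MAX_DIM);
   return subdet(m, n, idx, idx);
}
```

Because the same subdeterminants also appear as entries of the adjugate, an
inverse() built on this helper repeats the same subexpressions, which is what
gives CSE something to combine.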

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14652>

2 years agoutil/blob: Clarify rules on blob::data
Connor Abbott [Tue, 15 Feb 2022 11:29:56 +0000 (12:29 +0100)]
util/blob: Clarify rules on blob::data

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028>

2 years agonir/serialize: Don't access blob->data directly
Connor Abbott [Thu, 3 Feb 2022 16:16:36 +0000 (17:16 +0100)]
nir/serialize: Don't access blob->data directly

It won't work if the blob is fixed-size and we overrun the size, which
will be the case with the Vulkan pipeline cache.

This gets a bit tricky for the repeated-header optimization, because we
can't read the header from the blob. Instead we have to store the header
itself.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028>

2 years agopan/bi: Disambiguate IDVS variants in shader-db
Alyssa Rosenzweig [Sun, 2 Jan 2022 18:13:10 +0000 (13:13 -0500)]
pan/bi: Disambiguate IDVS variants in shader-db

Label IDVS variants as being MESA_SHADER_{POSITION, VARYING} stages;
reserve the MESA_SHADER_VERTEX label for non-IDVS shaders. This reduces
confusion where a single shader compiles to two MESA_SHADER_VERTEX
shaders with different stats.

While we're at it, de-vendor the blend shader stage name; these stats
are internal anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15086>

2 years agoasahi: Wire in pure integer texture formats
Alyssa Rosenzweig [Sun, 6 Feb 2022 21:32:20 +0000 (16:32 -0500)]
asahi: Wire in pure integer texture formats

Passes dEQP-GLES3.functional.texture.format.sized.2d.r*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Support LOD clamps
Alyssa Rosenzweig [Sun, 6 Feb 2022 20:04:51 +0000 (15:04 -0500)]
asahi: Support LOD clamps

Passes:

   dEQP-GLES3.functional.texture.mipmap.2d.min_lod.*
   dEQP-GLES3.functional.texture.mipmap.2d.max_lod.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Identify minimum/maximum LOD fields
Alyssa Rosenzweig [Sun, 6 Feb 2022 20:04:45 +0000 (15:04 -0500)]
asahi: Identify minimum/maximum LOD fields

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Add LOD clamp packing unit tests
Alyssa Rosenzweig [Sun, 6 Feb 2022 20:03:25 +0000 (15:03 -0500)]
asahi: Add LOD clamp packing unit tests

With GTest.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Add LOD type
Alyssa Rosenzweig [Sun, 6 Feb 2022 20:04:08 +0000 (15:04 -0500)]
asahi: Add LOD type

Automatically packs and unpacks float <==> clamped 4:6 fixed point, used
for min/max LOD fields on the Sampler descriptor.
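
A minimal sketch of what such a conversion could look like. It assumes an
unsigned 10-bit field with 4 integer and 6 fractional bits and truncating
conversion; the actual rounding and field layout in the GenXML are not stated
here, and pack_lod()/unpack_lod() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define LOD_FRAC_BITS 6                /* assumed: 6 fractional bits */
#define LOD_MAX_RAW   ((1u << 10) - 1) /* assumed: 4 + 6 = 10 bits total */

/* Float -> clamped 4:6 fixed point. */
static uint32_t
pack_lod(float lod)
{
   float scaled = lod * (float)(1 << LOD_FRAC_BITS);

   /* Written as a negated comparison so that NaN also clamps to 0. */
   if (!(scaled > 0.0f))
      return 0;
   if (scaled >= (float)LOD_MAX_RAW)
      return LOD_MAX_RAW;

   return (uint32_t)scaled; /* truncates toward zero */
}

/* 4:6 fixed point -> float. */
static float
unpack_lod(uint32_t raw)
{
   assert(raw <= LOD_MAX_RAW);
   return (float)raw / (float)(1 << LOD_FRAC_BITS);
}
```

Under these assumptions the representable range is 0 to 1023/64, in steps of
1/64.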

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Allow GenXML to be used in C++
Alyssa Rosenzweig [Sun, 6 Feb 2022 20:02:26 +0000 (15:02 -0500)]
asahi: Allow GenXML to be used in C++

C++ requires explicit casts from integers to enums. Fixes errors like
the following when trying to use Asahi GenXML from a GTest unit test.

src/asahi/lib/agx_pack.h:554:23: error: assigning to 'enum agx_channels' from incompatible type 'uint64_t' (aka 'unsigned long long')
   values->channels = __gen_unpack_uint(cl, 0, 6);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoagx: Round and clamp array indices
Alyssa Rosenzweig [Sun, 6 Feb 2022 16:33:51 +0000 (11:33 -0500)]
agx: Round and clamp array indices

Conforming with the GLSL spec. Fixes:

dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_fixed_fragment

(and probably others)
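
For reference, the layer selection rule can be sketched on the CPU as below.
This is a plain C illustration assuming the floor(layer + 0.5) rounding form;
select_layer() is a hypothetical name and the compiler lowers this to
instructions instead.

```c
#include <assert.h>

/* Round a float array index to the nearest layer and clamp it to the array. */
static int
select_layer(float layer, int array_size)
{
   assert(array_size > 0);

   float biased = layer + 0.5f;

   /* Negative inputs (and NaN) select the first layer. */
   if (!(biased > 0.0f))
      return 0;

   /* Anything past the end selects the last layer. */
   if (biased >= (float)array_size)
      return array_size - 1;

   /* biased is non-negative here, so truncation equals floor. */
   return (int)biased;
}
```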

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoagx: Naturally align uniform pushes
Alyssa Rosenzweig [Sun, 6 Feb 2022 22:38:33 +0000 (17:38 -0500)]
agx: Naturally align uniform pushes

Required to pack correctly, e.g. if we push a 16-bit value then a 64-bit
value.
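
A small sketch of the idea, assuming offsets and sizes are counted in 16-bit
units and sizes are powers of two. The helper names align_pot() and
push_uniform() are hypothetical, not the driver's own.

```c
#include <assert.h>

/* Round value up to a power-of-two alignment. */
static unsigned
align_pot(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Reserve size_16 half-words at a naturally aligned offset, advance the
 * cursor past them, and return the offset that was assigned. */
static unsigned
push_uniform(unsigned *cursor_16, unsigned size_16)
{
   unsigned offset = align_pot(*cursor_16, size_16);

   *cursor_16 = offset + size_16;
   return offset;
}
```

With this scheme a 16-bit push at offset 0 followed by a 64-bit push places
the second value at offset 4 (in 16-bit units) instead of offset 1.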

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoagx: Add agx_size_align_16 helper
Alyssa Rosenzweig [Sun, 6 Feb 2022 22:38:24 +0000 (17:38 -0500)]
agx: Add agx_size_align_16 helper

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoagx: Add typed move helper
Alyssa Rosenzweig [Sun, 6 Feb 2022 22:37:56 +0000 (17:37 -0500)]
agx: Add typed move helper

Useful for u2u16 in lowering code.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Add AGX_PUSH_ARRAY_SIZE_MINUS_1
Alyssa Rosenzweig [Sun, 6 Feb 2022 22:39:08 +0000 (17:39 -0500)]
asahi: Add AGX_PUSH_ARRAY_SIZE_MINUS_1

Required to clamp array indices against the array sizes per the GLSL
spec. Metal also does this, implying it's required by the hardware for
correct operation.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Implement texturing with non-zero start level
Alyssa Rosenzweig [Wed, 19 Jan 2022 00:04:23 +0000 (19:04 -0500)]
asahi: Implement texturing with non-zero start level

Unsure if this comes up anywhere.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Handle reloads of specific cube/mipfaces
Alyssa Rosenzweig [Wed, 19 Jan 2022 00:01:19 +0000 (19:01 -0500)]
asahi: Handle reloads of specific cube/mipfaces

The texture descriptor we construct for reloading needs to respect the
surface's texture/layer selection. Fix exactly the same bug as
b8c31ac06d3 ("lima: fix glCopyTexSubImage2D").

Fixes:

    dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb
    dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgba
    dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgb
    dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgba

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Add agx_map_texture_{cpu,gpu} helpers
Alyssa Rosenzweig [Wed, 19 Jan 2022 00:00:08 +0000 (19:00 -0500)]
asahi: Add agx_map_texture_{cpu,gpu} helpers

Streamline access to particular layer/levels. These patterns show up
across the driver and are easy to screw up, so add a helper.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Support 2D array and 3D textures
Alyssa Rosenzweig [Tue, 18 Jan 2022 19:16:12 +0000 (14:16 -0500)]
asahi: Support 2D array and 3D textures

As far as I can tell, these *must* be tiled. Other than that, the
implementation is completely routine. Passes

dEQP-GLES3.functional.texture.format.unsized.*2d_array*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Track mipmap state explicitly
Alyssa Rosenzweig [Sun, 6 Feb 2022 15:07:40 +0000 (10:07 -0500)]
asahi: Track mipmap state explicitly

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Pass correct tile shift to tiling routines
Alyssa Rosenzweig [Sun, 6 Feb 2022 15:07:12 +0000 (10:07 -0500)]
asahi: Pass correct tile shift to tiling routines

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Handle page alignment of miptrees
Alyssa Rosenzweig [Sun, 6 Feb 2022 15:06:48 +0000 (10:06 -0500)]
asahi: Handle page alignment of miptrees

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Align linear texture's strides to 64 bytes
Alyssa Rosenzweig [Sun, 6 Feb 2022 15:06:00 +0000 (10:06 -0500)]
asahi: Align linear texture's strides to 64 bytes

Required to pack the stride, and should improve cache performance.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Align allocations to effective tile size
Alyssa Rosenzweig [Sun, 6 Feb 2022 15:05:40 +0000 (10:05 -0500)]
asahi: Align allocations to effective tile size

May be smaller than 64x64.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Rename bpp to blocksize
Alyssa Rosenzweig [Sun, 6 Feb 2022 21:22:44 +0000 (16:22 -0500)]
asahi: Rename bpp to blocksize

Will matter for block compressed formats.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Allow tiling of all bpps
Alyssa Rosenzweig [Tue, 18 Jan 2022 23:31:18 +0000 (18:31 -0500)]
asahi: Allow tiling of all bpps

Use the usual macro trick via Panfrost. Fixes textures with formats with
non-32-bit bpp, including:

dEQP-GLES2.functional.texture.specification.basic_teximage2d.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Dynamically configure tile size
Alyssa Rosenzweig [Mon, 17 Jan 2022 01:03:28 +0000 (20:03 -0500)]
asahi: Dynamically configure tile size

We need to shrink the tile size when using small images (including
due to mipmapping) or when using large block sizes.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Add some notes to XML about mipmapping
Alyssa Rosenzweig [Sun, 6 Feb 2022 15:01:27 +0000 (10:01 -0500)]
asahi: Add some notes to XML about mipmapping

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Handle tiling of 2D arrays and 3D
Alyssa Rosenzweig [Sun, 6 Feb 2022 14:57:48 +0000 (09:57 -0500)]
asahi: Handle tiling of 2D arrays and 3D

Nothing special required, just need to respect the Z coordinate.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Add 2D Array and 3D texture dimensions
Alyssa Rosenzweig [Tue, 18 Jan 2022 18:13:48 +0000 (13:13 -0500)]
asahi: Add 2D Array and 3D texture dimensions

Add to XML and translate in the driver.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Respect mip level when rendering
Alyssa Rosenzweig [Sun, 6 Feb 2022 14:30:36 +0000 (09:30 -0500)]
asahi: Respect mip level when rendering

Use hardware mip level field.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Identify Level field of render target descriptor
Alyssa Rosenzweig [Sun, 6 Feb 2022 14:30:26 +0000 (09:30 -0500)]
asahi: Identify Level field of render target descriptor

Hardware support for rendering into nonzero mip levels.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Don't redefine MIN2/MAX2
Alyssa Rosenzweig [Mon, 17 Jan 2022 01:02:59 +0000 (20:02 -0500)]
asahi: Don't redefine MIN2/MAX2

The tiling function was written before the Mesa driver...

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agoasahi: Streamline modifier selection
Alyssa Rosenzweig [Sun, 6 Feb 2022 14:58:02 +0000 (09:58 -0500)]
asahi: Streamline modifier selection

We can only use linear for 2D images, not even 2D arrays. Even for 2D
images, we only want to use linear if:

* We are required to use linear due to window system requirements.
* The texture is streaming.

Otherwise, we want to use tiled textures. (Or better, compressed, but we
don't support that yet.)
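
The policy above can be sketched as a small decision function. The enums and
choose_layout() are hypothetical stand-ins for the driver's real modifier
types, and the sketch ignores the case where a non-2D image is asked to be
linear.

```c
#include <stdbool.h>

enum tex_dim { DIM_2D, DIM_2D_ARRAY, DIM_3D };
enum tex_layout { LAYOUT_LINEAR, LAYOUT_TILED };

static enum tex_layout
choose_layout(enum tex_dim dim, bool winsys_needs_linear, bool streaming)
{
   /* Linear is only possible for plain 2D images. */
   if (dim != DIM_2D)
      return LAYOUT_TILED;

   /* Even for 2D, only pick linear when there is a reason to. */
   if (winsys_needs_linear || streaming)
      return LAYOUT_LINEAR;

   return LAYOUT_TILED;
}
```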

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903>

2 years agonir: Check all sizes in nir_alu_instr_is_comparison
Alyssa Rosenzweig [Fri, 18 Feb 2022 01:18:58 +0000 (20:18 -0500)]
nir: Check all sizes in nir_alu_instr_is_comparison

nir_alu_instr_is_comparison needs to consider all comparison opcodes regardless
of size. Otherwise, they will be missed by nir_opt_move/sink.

Without this change, lowering booleans to integers regresses register
pressure (and spills/fills) significantly in certain shaders on Panfrost,
like android/com.miHoYo.GenshinImpact/1420.shader_test.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15073>

2 years agopan/bi: Test avoiding FADD.v2f16 hazards in scheduler
Alyssa Rosenzweig [Fri, 18 Feb 2022 00:40:03 +0000 (19:40 -0500)]
pan/bi: Test avoiding FADD.v2f16 hazards in scheduler

There are many of them, and integration testing of the scheduler won't hit every
case. Add targeted unit tests for the various scheduling hazards of this funny
instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>

2 years agopan/bi: Test avoiding *FADD.v2f16 hazard in optimizer
Alyssa Rosenzweig [Fri, 18 Feb 2022 00:18:08 +0000 (19:18 -0500)]
pan/bi: Test avoiding *FADD.v2f16 hazard in optimizer

This hazard exists but is obscure enough to be missed by our existing test
coverage (e.g. the conformance tests). Add piles of unit tests for it.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>

2 years agopan/bi: Avoid *FADD.v2f16 hazard in scheduler
Alyssa Rosenzweig [Fri, 18 Feb 2022 00:34:04 +0000 (19:34 -0500)]
pan/bi: Avoid *FADD.v2f16 hazard in scheduler

Obscure encoding restriction. Fixes a crash (assertion failure during
instruction packing) in asphalt9/2659.shader_test on Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>

2 years agopan/bi: Avoid *FADD.v2f16 hazard in optimizer
Alyssa Rosenzweig [Fri, 18 Feb 2022 00:33:29 +0000 (19:33 -0500)]
pan/bi: Avoid *FADD.v2f16 hazard in optimizer

This is a very obscure encoding restriction in the Bifrost ISA. Unknown if any
real apps or tests hit this, but we still need to get it right sadly.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15072>

2 years agopan/va: Identify LEA_TEX_IMM table
Alyssa Rosenzweig [Thu, 17 Feb 2022 19:39:17 +0000 (14:39 -0500)]
pan/va: Identify LEA_TEX_IMM table

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15069>

2 years agopan/va: Fix conservative branch handling
Alyssa Rosenzweig [Thu, 17 Feb 2022 19:31:33 +0000 (14:31 -0500)]
pan/va: Fix conservative branch handling

Mixed up lanes and conservative branch combine. Fix that.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15069>

2 years agopan/va: Make subgroup 4-bits
Alyssa Rosenzweig [Thu, 17 Feb 2022 19:07:34 +0000 (14:07 -0500)]
pan/va: Make subgroup 4-bits

Future proofing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15069>

2 years agopan/va: Fix some units
Alyssa Rosenzweig [Thu, 17 Feb 2022 19:06:16 +0000 (14:06 -0500)]
pan/va: Fix some units

Remove the todos.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15069>

2 years agopan/va: Parse units from the XML
Alyssa Rosenzweig [Mon, 2 Aug 2021 19:34:32 +0000 (15:34 -0400)]
pan/va: Parse units from the XML

We need this information for cycle counting in Valhall.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15069>

2 years agopanvk: Don't use UBOs for meta_clear
Alyssa Rosenzweig [Mon, 7 Feb 2022 19:40:14 +0000 (14:40 -0500)]
panvk: Don't use UBOs for meta_clear

It must always be pushed, so constructing a uniform remap table is
useless.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14913>

2 years agopan/mdg: Remove todo we'll probably never get to
Alyssa Rosenzweig [Fri, 4 Feb 2022 23:26:00 +0000 (18:26 -0500)]
pan/mdg: Remove todo we'll probably never get to

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Assert that we don't see unknown jumps
Alyssa Rosenzweig [Fri, 4 Feb 2022 23:25:21 +0000 (18:25 -0500)]
pan/mdg: Assert that we don't see unknown jumps

I still don't understand why we don't see continues. But in case we do, scream
loudly so it can be fixed.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Delete dedicated fdot2 lowering
Alyssa Rosenzweig [Fri, 4 Feb 2022 23:13:55 +0000 (18:13 -0500)]
pan/mdg: Delete dedicated fdot2 lowering

It's just lower_alu_to_scalar

total instructions in shared programs: 72542 -> 72528 (-0.02%)
instructions in affected programs: 673 -> 659 (-2.08%)
helped: 4
HURT: 1
helped stats (abs) min: 1.0 max: 11.0 x̄: 3.75 x̃: 1
helped stats (rel) min: 0.28% max: 6.79% x̄: 3.07% x̃: 2.60%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.03% max: 3.03% x̄: 3.03% x̃: 3.03%
95% mean confidence interval for instructions value: -8.65 3.05
95% mean confidence interval for instructions %-change: -6.32% 2.62%
Inconclusive result (value mean confidence interval includes 0).

total bundles in shared programs: 32051 -> 32036 (-0.05%)
bundles in affected programs: 207 -> 192 (-7.25%)
helped: 3
HURT: 0
helped stats (abs) min: 1.0 max: 10.0 x̄: 5.00 x̃: 4
helped stats (rel) min: 3.28% max: 13.89% x̄: 8.29% x̃: 7.69%

total quadwords in shared programs: 56496 -> 56487 (-0.02%)
quadwords in affected programs: 422 -> 413 (-2.13%)
helped: 2
HURT: 0

total registers in shared programs: 5106 -> 5104 (-0.04%)
registers in affected programs: 8 -> 6 (-25.00%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Delete stray comment
Alyssa Rosenzweig [Fri, 4 Feb 2022 23:10:57 +0000 (18:10 -0500)]
pan/mdg: Delete stray comment

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Clarify some ISA unknowns
Alyssa Rosenzweig [Fri, 4 Feb 2022 22:58:30 +0000 (17:58 -0500)]
pan/mdg: Clarify some ISA unknowns

Nothing usefully new here, just trying to improve signal:noise ratio on the
disassembly.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Handle 8/16-bit UBO loads
Alyssa Rosenzweig [Fri, 4 Feb 2022 22:36:56 +0000 (17:36 -0500)]
pan/mdg: Handle 8/16-bit UBO loads

These will be seen by the compiler when we enable fp16 constant buffers.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Model zero/sign extension for 8/16-bit loads
Alyssa Rosenzweig [Fri, 4 Feb 2022 22:36:27 +0000 (17:36 -0500)]
pan/mdg: Model zero/sign extension for 8/16-bit loads

The destinations are packed as if 32-bit even for 8/16-bit loads, so the mask
needs to be constructed accordingly.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Print optimized and scheduled shader
Alyssa Rosenzweig [Fri, 4 Feb 2022 22:35:44 +0000 (17:35 -0500)]
pan/mdg: Print optimized and scheduled shader

To help identify problems across the compiler, print more forms of the shader
with MIDGARD_MESA_DEBUG=shaders. Roughly matches the Bifrost compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agopan/mdg: Pull out skip_internal boolean
Alyssa Rosenzweig [Fri, 4 Feb 2022 22:21:04 +0000 (17:21 -0500)]
pan/mdg: Pull out skip_internal boolean

Aligns with Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14888>

2 years agov3dv/v3d: Fix copyright holder to Raspberry Pi Ltd
Jose Maria Casanova Crespo [Thu, 17 Feb 2022 11:38:42 +0000 (12:38 +0100)]
v3dv/v3d: Fix copyright holder to Raspberry Pi Ltd

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15057>

2 years agoanv: Lower bufferImageGranularity to 1 from 64
Kenneth Graunke [Thu, 17 Feb 2022 10:08:33 +0000 (02:08 -0800)]
anv: Lower bufferImageGranularity to 1 from 64

The Vulkan 1.3 spec says:

   "The implementation-dependent limit bufferImageGranularity specifies
    a page-like granularity at which linear and non-linear resources
    must be placed in adjacent memory locations to avoid aliasing.  Two
    resources which do not satisfy this granularity requirement are said
    to alias. bufferImageGranularity is specified in bytes, and must be
    a power of two.  Implementations which do not impose a granularity
    restriction may report a bufferImageGranularity value of one.

    Note: Despite its name, bufferImageGranularity is really a
    granularity between "linear" and "non-linear" resources."

We set this limit to 64 bytes (a cacheline) at the dawn of time, without
any real rationale attached.  There shouldn't be any restrictions here.
Our tile sizes are typically 4K, and tiled resource addresses are
aligned to the tile size, and the extent is also a multiple of the tile
size.  So if a linear resource occurs before a tiled one, there will
naturally be some space due to the alignment of the tiled resource's
starting address.  If a linear resource occurs after a tiled one, the
tiled resource's ending address is already 4K aligned, which is already
guaranteeing that they won't share a cacheline.

So I think it should be fine to reduce this to 1.  The other Vulkan
driver for our hardware seems to advertise 1 here as well.
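
The argument can be checked with a small page test in the style of the one
the Vulkan spec uses to define aliasing. This is only a sketch: share_page()
is a hypothetical helper, and it assumes resource A is placed entirely before
resource B.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the last byte of resource A and the first byte of
 * resource B land on the same granularity-sized page. */
static bool
share_page(uint64_t a_offset, uint64_t a_size, uint64_t b_offset,
           uint64_t granularity)
{
   assert(granularity != 0 && (granularity & (granularity - 1)) == 0);
   assert(a_size > 0 && a_offset + a_size <= b_offset);

   uint64_t a_end_page = (a_offset + a_size - 1) & ~(granularity - 1);
   uint64_t b_start_page = b_offset & ~(granularity - 1);

   return a_end_page >= b_start_page;
}
```

For example, a tiled resource at offset 0 with size 8192 followed by a linear
one at 8192 never shares a 64-byte page, and neither does a 100-byte linear
resource followed by a tiled one at the next 4K boundary. Only two resources
packed at unaligned offsets, such as 0..99 and 100, would share one.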

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15066>

2 years agovc4/ci: make piglit test mandatory
Juan A. Suarez Romero [Wed, 16 Feb 2022 09:17:04 +0000 (10:17 +0100)]
vc4/ci: make piglit test mandatory

Make the piglit test jobs always run, as the piglit testsuite offers more
coverage for the VC4 driver.

On the other hand, make the EGL testing manual, as we don't have
enough devices to execute all the tests fast enough.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15045>

2 years agobroadcom/compiler: document that spill_base is used for spills and scratch
Iago Toral Quiroga [Thu, 17 Feb 2022 07:55:16 +0000 (08:55 +0100)]
broadcom/compiler: document that spill_base is used for spills and scratch

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>

2 years agobroadcom/compiler: drop spill_count and add spilling boolean
Iago Toral Quiroga [Thu, 17 Feb 2022 07:53:54 +0000 (08:53 +0100)]
broadcom/compiler: drop spill_count and add spilling boolean

We added spill_count to handle uniform batch spills, which we no longer do.
What we want now is a way to know if we are spilling registers.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>

2 years agobroadcom/compiler: do not rebuild the interference graph after each spill
Iago Toral Quiroga [Mon, 14 Feb 2022 10:56:05 +0000 (11:56 +0100)]
broadcom/compiler: do not rebuild the interference graph after each spill

Instead, we only recompute liveness and we add new nodes and
interferences to the graph manually (we also need to patch
register classes in some cases).

To assist in this process, we also add an ip counter to our
instructions that we also recompute after each spill, which we use
to identify registers that cross thrsw boundaries introduced with
TMU spills and fills and adjust their register classes accordingly
(removing their capacity to use accumulators).

This significantly reduces the CPU cost of spills. Using
shaders/closed/gputest/piano/7.shader_test as reference:

Compile time up to the first successful compile strategy in main is
~24s and with this change it is ~11s. With this speed up, we can now
try all 2-thread compile strategies (including the fallback scheduler)
in only ~15s.

A full shader-db run results in:
Total CPU time (seconds): 9904.67 -> 9087.98 (-8.25%)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>

2 years agobroadcom/compiler: reset spill/fill counts after lowering thread count.
Iago Toral Quiroga [Mon, 14 Feb 2022 10:46:29 +0000 (11:46 +0100)]
broadcom/compiler: reset spill/fill counts after lowering thread count.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>

2 years agobroadcom/compiler: fix end of TMU sequence check
Iago Toral Quiroga [Mon, 14 Feb 2022 10:42:16 +0000 (11:42 +0100)]
broadcom/compiler: fix end of TMU sequence check

We may be pipelining TMU writes and reads, in which case we can
see both TMUWT and LDTMU at the end of a TMU sequence, so we should
not assume that a TMUWT always terminates a sequence.

Also, we had a bug where we were using inst instead of scan_inst
to check if we find another TMUWT after the current instruction.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>

2 years agobroadcom/compiler: define max number of tmu spills for compile strategies
Iago Toral Quiroga [Fri, 4 Feb 2022 12:40:50 +0000 (13:40 +0100)]
broadcom/compiler: define max number of tmu spills for compile strategies

Instead of whether they are allowed to spill or not. This is more flexible.
Also, while we are not currently enabling spilling on any 4-thread strategies,
should we do that in the future, always prefer a 4-thread compile.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>