Alyssa Rosenzweig [Thu, 20 Jul 2023 14:52:42 +0000 (10:52 -0400)]
asahi: Advertise Z16_UNORM
This works (on the downstream kernel, anyway).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 20:38:48 +0000 (16:38 -0400)]
asahi: Execute preambles for background programs
This will be useful when spilling render targets.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 15 Jun 2023 11:03:32 +0000 (07:03 -0400)]
asahi: Offset clear colour uniform by 4
Frees up u0_u1 for a bindless base address which will make render target
spilling easier to implement.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 21:07:18 +0000 (17:07 -0400)]
asahi: Ignore spilled render targets for background load
Nothing to reload.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 20:45:54 +0000 (16:45 -0400)]
asahi: Permit meta shaders to use preambles
Preambles are occasionally useful with background programs.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 20:26:42 +0000 (16:26 -0400)]
asahi: Lower multisample image stores
These will be used for spilling multisampled render targets.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 18 Jul 2023 01:39:14 +0000 (21:39 -0400)]
asahi: Lower tilebuffer access for spilled RTs
Conceptually similar, we just don't have the tilebuffer available this time.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 14 Jun 2023 21:48:14 +0000 (17:48 -0400)]
asahi: Extract some tilebuffer lowering code
In prep for spilling. No functional change.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 23 Jun 2023 18:23:59 +0000 (14:23 -0400)]
asahi: Ignore spilled render targets with partial renders
Partial renders exist to spill the tilebuffer to memory. There's nothing to do
if a render target is already spilled (doing so would just waste memory
bandwidth and create a feedback loop).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 14 Jun 2023 21:42:01 +0000 (17:42 -0400)]
asahi: Ignore spilled render targets in EOT shaders
Regardless of whether we implement Apple-style eMRT or something simpler, the EOT
shader isn't involved here.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 14 Jun 2023 21:53:38 +0000 (17:53 -0400)]
asahi: Do not support masking with spilled RTs
Extra complexity for this interaction, not worth it until we have an actual use
case IMHO.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 15 Jun 2023 11:05:39 +0000 (07:05 -0400)]
asahi: Add agx_tilebuffer_spills query
We can skip various work in the driver if we're not spilling render targets.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 14 Jun 2023 21:37:04 +0000 (17:37 -0400)]
asahi: Introduce concept of spilled render targets
To accommodate framebuffers which exceed tilebuffer limits, we'll need to spill
render targets to main memory. In effect, we need to emulate an immediate-mode
renderer for some render targets. This decision is made on a per-render target
basis. In our tilebuffer layout calculation, rather than asserting that all
render targets fit, introduce a notion of spilling.
This doesn't actually implement spilling -- it just pushes the assert failure
down to the users. But it's progress.
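
A minimal sketch of the per-render-target spill decision described above. The
byte budget (`TIB_BYTES_PER_PIXEL_MAX`), the struct, and both function names
are hypothetical and chosen for illustration; the real layout calculation also
has to deal with details such as alignment and sample counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical budget: bytes of tilebuffer available per pixel. */
#define TIB_BYTES_PER_PIXEL_MAX 64
#define MAX_RTS 8

struct tib_layout {
   bool spilled[MAX_RTS];      /* true if the RT lives in main memory */
   unsigned offset_B[MAX_RTS]; /* tilebuffer offset, valid if not spilled */
   unsigned used_B;            /* bytes of tilebuffer consumed so far */
};

/* Greedily place each render target; spill the ones that do not fit
 * instead of asserting that everything fits. */
static struct tib_layout
build_layout(const unsigned *rt_size_B, size_t nr_rts)
{
   struct tib_layout l = {0};
   assert(nr_rts <= MAX_RTS);

   for (size_t rt = 0; rt < nr_rts; ++rt) {
      if (l.used_B + rt_size_B[rt] > TIB_BYTES_PER_PIXEL_MAX) {
         l.spilled[rt] = true;
         continue;
      }

      l.offset_B[rt] = l.used_B;
      l.used_B += rt_size_B[rt];
   }

   return l;
}

/* Query so callers can skip spill-related work when nothing spills. */
static bool
layout_spills(const struct tib_layout *l, size_t nr_rts)
{
   for (size_t rt = 0; rt < nr_rts; ++rt) {
      if (l->spilled[rt])
         return true;
   }

   return false;
}
```

The decision is made per render target, so a small target that comes after a
spilled one can still be placed in the tilebuffer.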
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 23 Jun 2023 18:23:30 +0000 (14:23 -0400)]
asahi: Extract sampler_view_for_surface
We'll reuse this logic for the spilled RT case.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 21:19:21 +0000 (17:19 -0400)]
agx: Plumb in coverage mask
This is internally used by the hardware when writing to the tilebuffer. We need
to use it externally to spill multisample render targets.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 15 Jun 2023 12:48:31 +0000 (08:48 -0400)]
agx: Require tag writes with side effects
Otherwise the fragment shader might be skipped entirely. (Possibly this is the
wrong approach to this though...)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 8 Jun 2023 15:11:22 +0000 (11:11 -0400)]
agx: Add simple image fencing pass
Minimum needed to pass CTS.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 8 Jun 2023 14:07:31 +0000 (10:07 -0400)]
agx: Implement fence_*_to_tex_agx intrinsics
We need these fencing intrinsics because our image caches aren't coherent with
memory. Furthermore, we need some sync intrinsics for imageblocks (which are
spicy images). These are a stub of what the final fragment shader interlock
implementation will look like, or what a real Metal-grade imageblock
implementation needs, but this is good enough for handling the sync requirements
with spilled render targets.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 7 Jun 2023 20:53:11 +0000 (16:53 -0400)]
agx: Don't emit silly barriers
Trust in the scoped_barrier.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 7 Jun 2023 18:21:19 +0000 (14:21 -0400)]
agx: Emit global memory barriers for images
This is part of image atomics, since those go through the regular memory path.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 7 Jun 2023 14:57:00 +0000 (10:57 -0400)]
agx: Implement image_load
Texture loads can be reordered freely but image loads can't be (since there
could be writes). Implement image_load natively to avoid subtle problems with
CSE and scheduling.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 8 Jun 2023 16:15:01 +0000 (12:15 -0400)]
agx: Extract texture write mask handling
image_load will share the logic.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 7 Jun 2023 15:04:40 +0000 (11:04 -0400)]
agx: Add image_load opcode
This is equivalent to texture_load but cannot be reordered, since it might be
writeable.
It also sets bit 43. This needs more investigation, but it fixes
KHR-GLES31.core.shader_image_load_store.basic-glsl-misc-fs. Some sort of cache
control bit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 7 Jun 2023 20:14:13 +0000 (16:14 -0400)]
asahi,agx: Fix txf sampler
Bizarrely, the clamps/wrap modes are respected so we need to set them
appropriately for correct out-of-bounds behaviour (returning all zero). That in
turn means we can't use whatever sampler is already there, instead we need to
allocate a dedicated sampler just for txf. Good news is we have an extra sampler
state register available for the purpose.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 6 Jun 2023 23:26:02 +0000 (19:26 -0400)]
agx: Lower buffer images
Similar to buffer reads, we need to implement buffer images as 2D images with
fixed width and some lowering code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 6 Jun 2023 23:25:18 +0000 (19:25 -0400)]
agx: Lower image atomics
Lower image atomics to texel address loads, and lower texel address loads to
arithmetic and descriptor reads. This implements image atomics.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 21:33:39 +0000 (17:33 -0400)]
agx: Extract texture_descriptor_ptr_for_* helpers
For implementing image_texel_address, when there's no point in creating an
internal texture instruction just to lower immediately.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 6 Jun 2023 22:58:40 +0000 (18:58 -0400)]
agx: Extract coords_for_buffer_texture helper
The mapping of 1D -> 2D coordinates for indexing into buffer textures (lowered
to fixed-width 2D images) will be shared between both texture load and image
store code paths, so pull it out.
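
As a rough illustration of the 1D -> 2D mapping, assuming a row width of 1024
texels (`BUFFER_TEXTURE_WIDTH` is a placeholder, not necessarily the driver's
constant) and a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed row width in texels. */
#define BUFFER_TEXTURE_WIDTH 1024u

struct coord2d {
   uint32_t x, y;
};

/* Map a linear buffer element index to coordinates in the fixed-width
 * 2D image that backs the buffer texture. */
static struct coord2d
coords_for_buffer_index(uint32_t index)
{
   struct coord2d c = {
      .x = index % BUFFER_TEXTURE_WIDTH,
      .y = index / BUFFER_TEXTURE_WIDTH,
   };

   assert(c.y * BUFFER_TEXTURE_WIDTH + c.x == index);
   return c;
}
```

With a power-of-two width, the modulo and divide reduce to a mask and a shift.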
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 6 Jun 2023 22:54:44 +0000 (18:54 -0400)]
agx: Add interleave opcode
We'll use it for texture atomics.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 30 May 2023 00:34:50 +0000 (20:34 -0400)]
agx: Handle early_fragment_tests
Simply doing nothing fixes
dEQP-GLES31.functional.image_load_store.early_fragment_tests.*. However, we need
to actually insert the sample_mask instruction to make sure the shader runs at
all (I think), doing that fixes:
KHR-GLES31.core.shader_image_load_store.basic-glsl-earlyFragTests
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Mon, 29 May 2023 23:30:16 +0000 (19:30 -0400)]
agx: Implement image barriers
Or cache flushes or whatever these actually are. Probably could be optimized
once we understand what the 4 individual instructions are actually doing. Fixes
dEQP-GLES31.functional.image_load_store.2d.qualifiers.*.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 30 May 2023 02:51:29 +0000 (22:51 -0400)]
agx: Wait for outstanding stores before barriers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 31 May 2023 00:16:07 +0000 (20:16 -0400)]
agx: Handle frag side effects without render targets
We still need to insert our lowering code, though this case could probably be
optimized somehow. Fixes a massive number of KHR-GLES3 and KHR-GLES31 tests,
including
KHR-GLES31.core.shader_atomic_counters.advanced-usage-many-draw-calls2 and lots
of PBO tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 19 May 2023 17:08:13 +0000 (13:08 -0400)]
agx: Translate image_store from NIR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 20 May 2023 17:44:37 +0000 (13:44 -0400)]
agx: Translate texture bindless handles
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 20 May 2023 17:46:50 +0000 (13:46 -0400)]
agx: Pack bindless textures
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 18:22:40 +0000 (14:22 -0400)]
agx: Handle bindless properly for txs lowering
When I wrote this pass I mostly guessed what our bindless handles would look
like. Now that we know, we can do it right.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 20 May 2023 17:36:10 +0000 (13:36 -0400)]
agx: Model texture bindless base
Extra source we need to implement bindless.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 19 May 2023 17:07:52 +0000 (13:07 -0400)]
agx: Add image write instruction
Model and pack what's in the hardware for this.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 19 May 2023 17:06:41 +0000 (13:06 -0400)]
agx: Generalize texture/PBE packing
For the generic image write instruction we'll want the full forms of these
fields.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 20:51:20 +0000 (16:51 -0400)]
agx: Lower image size to txs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 19:01:22 +0000 (15:01 -0400)]
agx: Legalize image LODs to be 16-bit
Required by the hardware. Do it in NIR so we can optimize the conversion.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 27 Jun 2023 22:33:57 +0000 (18:33 -0400)]
asahi: Use nir_lower_robust_access
This makes images robust as required by the OpenGL ES spec.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 19:43:22 +0000 (15:43 -0400)]
asahi: Extend PBE packing for image support
We need to support arrayed images and sRGB images, which are hardware features. For
atomics, we need to pack the augmented software data structure. Finally, we need
to support buffer images. Like their texture counterparts, these get lowered to
2D images.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 19:39:57 +0000 (15:39 -0400)]
asahi: Augment PBE descriptor for software access
For implementing image atomics (and multisample image writes), we need
information about the image layout in the shader. It's a lot nicer to determine
the image layouts on the CPU (where we have ail) and stash the results in the
PBE descriptor, where we have a convenient hole to do so, rather than trying to
do all the layout calculations on the GPU on the fly. Add a data structure that
the driver will fill out and the image atomic lowering will consider as part of
the hardware.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Mon, 10 Jul 2023 15:45:17 +0000 (00:45 +0900)]
asahi: Add a shared library interface for decode
Add a simple API so that decode can be used as a shared library by the
Python hypervisor. Note that this is not thread-safe. If we ever want to
use this in other contexts with thread safety, it will need a refactor
(along with the core decode code anyway).
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Mon, 10 Jul 2023 15:43:57 +0000 (00:43 +0900)]
asahi: decode: Add a function to construct decode_params from a chip_id
Should be useful on macOS later to properly support detecting the right
GPU, but for now just hardcode T8103/G13G.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Mon, 10 Jul 2023 15:42:52 +0000 (00:42 +0900)]
asahi: decode: Refactor to always copy GPU mem to local buffers
We want to plug this library into the hypervisor, but there we don't
have all GPU memory already mapped in our address space. Refactor the
GPU mem read function to always allocate local buffers and copy in the
data there.
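
A sketch of the "always copy" shape, under the assumption that the embedder
supplies a read callback. All names here (`read_gpu_mem_fn`, `fetch_gpu_mem`,
`fake_mem`, `fake_read`) are hypothetical and not the decode library's API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: copy `size` bytes at `gpu_va` into `out`, returning
 * the number of bytes actually read. */
typedef size_t (*read_gpu_mem_fn)(uint64_t gpu_va, size_t size, void *out,
                                  void *user);

/* Fetch GPU memory into a freshly allocated local buffer. The caller frees
 * the result. Returns NULL if allocation or the read fails. */
static void *
fetch_gpu_mem(read_gpu_mem_fn read, void *user, uint64_t gpu_va, size_t size)
{
   assert(read != NULL);

   void *buf = malloc(size ? size : 1);
   if (!buf)
      return NULL;

   if (read(gpu_va, size, buf, user) != size) {
      free(buf);
      return NULL;
   }

   return buf;
}

/* Example backend: a flat array standing in for GPU memory at VA `base`. */
struct fake_mem {
   uint64_t base;
   const uint8_t *data;
   size_t size;
};

static size_t
fake_read(uint64_t gpu_va, size_t size, void *out, void *user)
{
   const struct fake_mem *m = user;

   /* Reject reads that start or end outside the backing array. */
   if (gpu_va < m->base || gpu_va - m->base > m->size ||
       size > m->size - (gpu_va - m->base))
      return 0;

   memcpy(out, m->data + (gpu_va - m->base), size);
   return size;
}
```

Because the decoder only ever sees the local copy, the same code works whether
the backend is a direct mapping or a hypervisor read.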
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Mon, 10 Jul 2023 17:23:33 +0000 (02:23 +0900)]
asahi: wrap: Handle freeing shmems
Needed for some Metal demos that end up creating multiple queues.
This is still definitely broken/not fully correct, but it at least
gets things working for those.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Fri, 7 Jul 2023 10:45:28 +0000 (19:45 +0900)]
asahi: Add extra CDM header block for G14X
Looks like we finally found our first properly divergent codepath.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Fri, 7 Jul 2023 10:41:48 +0000 (19:41 +0900)]
asahi: decode: Add a params argument to pass through
Sooner or later we were going to need divergent codepaths in decode, and
it looks like now is the time. Add a `params` typedef and pass it
through all the decoder callbacks. This is an alias for
drm_asahi_params_global, but use a typedef so we can change that later
without changing dozens of instances.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sun, 2 Jul 2023 17:30:44 +0000 (13:30 -0400)]
agx: Fix bogus assert
Dolphin uses all the uniforms.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 1 Jul 2023 18:43:53 +0000 (14:43 -0400)]
agx: Reduce un/packs with mem access lowering
Often not needed and makes the NIR harder to read.
shader-db is noise.
total instructions in shared programs: 1752712 -> 1752688 (<.01%)
instructions in affected programs: 8338 -> 8314 (-0.29%)
helped: 21
HURT: 8
Inconclusive result (%-change mean confidence interval includes 0).
total bytes in shared programs: 11943572 -> 11943434 (<.01%)
bytes in affected programs: 56716 -> 56578 (-0.24%)
helped: 21
HURT: 8
Inconclusive result (%-change mean confidence interval includes 0).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sun, 2 Jul 2023 13:57:26 +0000 (09:57 -0400)]
agx: Vectorize 16-bit parallel copies
If we have two 16-bit copies to/from adjacent 16-bit registers, we can instead
use a single 32-bit copy from the 32-bit register pair. Since 32-bit integer
arithmetic is (almost) as efficient as 16-bit on AGX, this (almost) doubles
performance of affected parallel copies.
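
The pairing rule can be sketched as follows. This is an illustration, not the
compiler's code: registers are numbered in 16-bit units, and the requirement
that the low half sits at an even index (so the pair forms one aligned 32-bit
register) is an assumption of the sketch. The type and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum copy_size { COPY_DEAD = 0, COPY_16 = 16, COPY_32 = 32 };

/* One entry of a parallel copy; dst/src are in 16-bit register units. */
struct copy {
   unsigned dst, src;
   enum copy_size size;
};

/* True if `lo` and `hi` are the two halves of an aligned 32-bit copy. */
static bool
can_pair(const struct copy *lo, const struct copy *hi)
{
   return lo->size == COPY_16 && hi->size == COPY_16 &&
          (lo->dst % 2) == 0 && (lo->src % 2) == 0 &&
          hi->dst == lo->dst + 1 && hi->src == lo->src + 1;
}

/* Merge matching 16-bit pairs into 32-bit copies, in place. Returns the new
 * number of copies. Order within a parallel copy is not significant. */
static size_t
vectorize_copies(struct copy *copies, size_t n)
{
   /* Pass 1: fold each high half into its low half. */
   for (size_t i = 0; i < n; ++i) {
      for (size_t j = 0; j < n; ++j) {
         if (j != i && can_pair(&copies[i], &copies[j])) {
            copies[i].size = COPY_32;
            copies[j].size = COPY_DEAD;
            break;
         }
      }
   }

   /* Pass 2: compact away the folded entries. */
   size_t out = 0;
   for (size_t i = 0; i < n; ++i) {
      if (copies[i].size != COPY_DEAD)
         copies[out++] = copies[i];
   }

   assert(out <= n);
   return out;
}
```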
total instructions in shared programs: 1788606 -> 1788301 (-0.02%)
instructions in affected programs: 17057 -> 16752 (-1.79%)
helped: 150
HURT: 0
Instructions are helped.
total bytes in shared programs: 12196492 -> 12194662 (-0.02%)
bytes in affected programs: 122894 -> 121064 (-1.49%)
helped: 150
HURT: 0
Bytes are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sun, 2 Jul 2023 13:46:12 +0000 (09:46 -0400)]
agx: Try to allocate phi sources with loop phis
total instructions in shared programs: 1788666 -> 1788606 (<.01%)
instructions in affected programs: 7953 -> 7893 (-0.75%)
helped: 29
HURT: 0
Instructions are helped.
total bytes in shared programs: 12196852 -> 12196492 (<.01%)
bytes in affected programs: 53908 -> 53548 (-0.67%)
helped: 29
HURT: 0
Bytes are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 1 Jul 2023 17:11:46 +0000 (13:11 -0400)]
agx: Try to allocate phi sources with phis
Not meaningfully using more registers since this is just about how we assign
registers after fixing the maximum # of registers used (note that thread count
is unaffected).
total instructions in shared programs: 1790901 -> 1788666 (-0.12%)
instructions in affected programs: 230680 -> 228445 (-0.97%)
helped: 681
HURT: 2
Instructions are helped.
total bytes in shared programs: 12210266 -> 12196852 (-0.11%)
bytes in affected programs: 1634100 -> 1620686 (-0.82%)
helped: 682
HURT: 2
Bytes are helped.
total halfregs in shared programs: 532130 -> 532218 (0.02%)
halfregs in affected programs: 848 -> 936 (10.38%)
helped: 3
HURT: 13
Halfregs are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 1 Jul 2023 16:56:02 +0000 (12:56 -0400)]
agx: Try to allocate phis compatibly with sources
All shaders affected for thread count are in pubg... by chance the allocation
before used fewer registers than the calculated register demand (I guess because
we're conservative with our vector handling) and so got lucky and got higher
thread count. That shader is also helped massively for instructions.
The halfreg change doesn't matter -- we're not actually increasing register
demand, we're just being more choosy about our registers.
total instructions in shared programs: 1799738 -> 1790901 (-0.49%)
instructions in affected programs: 306081 -> 297244 (-2.89%)
helped: 889
HURT: 14
Instructions are helped.
total bytes in shared programs: 12263290 -> 12210266 (-0.43%)
bytes in affected programs: 2150966 -> 2097942 (-2.47%)
helped: 889
HURT: 14
Bytes are helped.
total halfregs in shared programs: 531981 -> 532130 (0.03%)
halfregs in affected programs: 1925 -> 2074 (7.74%)
helped: 0
HURT: 26
Halfregs are HURT.
total threads in shared programs: 18885184 -> 18884224 (<.01%)
threads in affected programs: 13440 -> 12480 (-7.14%)
helped: 0
HURT: 15
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sun, 2 Jul 2023 13:33:53 +0000 (09:33 -0400)]
agx: Add try_coalesce_with helper
Common logic the next few patches will use to try to assign something to the
same register as something else. "If it's already been assigned a register and
that register is free now, use it, otherwise bail."
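
The quoted rule could look roughly like this. Only the name try_coalesce_with
comes from the patch; the signature, the `ra_state` struct, and the
single-register granularity are assumptions made to keep the sketch small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS   64
#define MAX_VALUES 256
#define NO_REG     (~0u)

/* Hypothetical allocator state. */
struct ra_state {
   uint64_t used;                /* bit r set => register r is occupied */
   unsigned reg_for[MAX_VALUES]; /* NO_REG until a value is assigned */
};

static void
ra_init(struct ra_state *ra)
{
   ra->used = 0;
   for (unsigned i = 0; i < MAX_VALUES; ++i)
      ra->reg_for[i] = NO_REG;
}

/* Try to give `value` the same register as `other`. Succeeds only if `other`
 * already has a register and that register is free right now. */
static bool
try_coalesce_with(struct ra_state *ra, unsigned value, unsigned other)
{
   assert(value < MAX_VALUES && other < MAX_VALUES);

   unsigned reg = ra->reg_for[other];
   if (reg == NO_REG)
      return false; /* nothing to coalesce with yet */

   assert(reg < NUM_REGS);
   if (ra->used & (UINT64_C(1) << reg))
      return false; /* register is busy, bail */

   ra->reg_for[value] = reg;
   ra->used |= UINT64_C(1) << reg;
   return true;
}
```

On failure the caller falls through to its normal register search, so the
helper never makes allocation fail; it only biases the choice.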
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 6 Jun 2023 22:57:47 +0000 (18:57 -0400)]
asahi: Forbid 2D Linear with images
There's no known use case, so forbidding this reduces the combinatorics required
in the texture atomic lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 19:58:19 +0000 (15:58 -0400)]
asahi: Don't restrict sampler views
We now emulate an infinitely large binding table with bindless, so the sky is
the limit for this CAP. Note we still have the limit for samplers, so this
probably doesn't do anything for OpenGL.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 28 Jun 2023 19:41:28 +0000 (15:41 -0400)]
asahi: Make clear the non-sRGBness of EOT images
For sRGB render targets, we encode sRGB when writing pixels into the tilebuffer
(in the fragment shader), not when writing out the image. When we actually write
out the tilebuffer to the image, we don't use the PBE's sRGB conversion, we just
bind it as a UNORM 8 image and blit the pre-transformed pixels.
We're about to add real sRGB support for the PBE, so make this linearization
explicit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 20:21:20 +0000 (16:21 -0400)]
asahi: Upload image descriptors
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 20:02:44 +0000 (16:02 -0400)]
asahi: Upload at most the max texture state registers
The rest are bindless now.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Sat, 20 May 2023 17:24:13 +0000 (13:24 -0400)]
asahi: Add texture/image indexing lowering pass
Both textures and images share a unified indexing scheme in AGX. When binding
tables are used, they can be mapped to texture state registers. Otherwise, there
is bindless access available.
It would be nice to map OpenGL's binding table based textures and images to AGX
texture state registers 1:1. The problem is that OpenGL allows more combined
textures and images than we necessarily have texture state registers. So, we use
as many texture state registers as we can, and then we fall back on an internal
bindless scheme mapping an extended binding table.
Add and use a lowering pass to map all of the API-level texture/image indices to
either texture state registers or bindless handles as required.
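
In outline, the mapping is a simple split on the API index. The register count
and descriptor size below are placeholder values, and the struct and function
names are hypothetical; the sketch also assumes the bindless offset indexes a
table holding every descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define NUM_TEXTURE_STATE_REGS 16u
#define DESCRIPTOR_SIZE_B      24u

struct lowered_index {
   bool bindless;
   uint32_t value; /* state register index, or byte offset into the table */
};

/* Map an API-level texture/image index to a texture state register when one
 * is available, otherwise to an offset in the extended binding table. */
static struct lowered_index
lower_texture_index(uint32_t api_index)
{
   struct lowered_index out;

   /* Keep the byte offset computation from overflowing 32 bits. */
   assert(api_index <= UINT32_MAX / DESCRIPTOR_SIZE_B);

   if (api_index < NUM_TEXTURE_STATE_REGS) {
      out.bindless = false;
      out.value = api_index;
   } else {
      out.bindless = true;
      out.value = api_index * DESCRIPTOR_SIZE_B;
   }

   return out;
}
```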
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 19 May 2023 20:03:15 +0000 (16:03 -0400)]
asahi: Add agx_batch_track_image helper
Adapted from Panfrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 7 Jun 2023 00:28:11 +0000 (20:28 -0400)]
asahi: Reallocate to set the writeable image flag
...If needed, for array images.
But avoid doing so for non-array images.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Tue, 6 Jun 2023 23:44:40 +0000 (19:44 -0400)]
asahi: Mark writeable images as such
ail needs this information to select the appropriate layout.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 19 May 2023 22:12:22 +0000 (18:12 -0400)]
ail: Page-align layers for writable images
This appears to be necessary for PBE writes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 5 Jul 2023 16:19:51 +0000 (12:19 -0400)]
asahi,agx: Set coherency bit for clustered targets
We need to set a particular bit on atomics for them to be coherent across
clusters. Fixes atomics on G13X.
Setting this bit on the single-cluster G13G, on the other hand, wedges the GPU.
So best be careful ;-)
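
The gating amounts to something like the following, where the bit position and
function name are made up for the example (the real encoding is not given
here):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position for the coherency flag. */
#define ATOMIC_COHERENT_BIT (UINT64_C(1) << 0)

/* Set the coherency bit only on multi-cluster parts. Setting it on a
 * single-cluster GPU is reported above to wedge the hardware. */
static uint64_t
pack_atomic_flags(uint64_t encoding, unsigned num_clusters)
{
   assert(num_clusters >= 1);

   if (num_clusters > 1)
      encoding |= ATOMIC_COHERENT_BIT;

   return encoding;
}
```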
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Janne Grunau [Wed, 5 Jul 2023 18:58:48 +0000 (20:58 +0200)]
asahi: toggle more barrier bits after transform feedback
Fixes KHR-GLES31.core.draw_indirect.advanced-twoPass-transformFeedback-arrays
and KHR-GLES31.core.draw_indirect.advanced-twoPass-transformFeedback-elements
on M1 Ultra (G13D). Let's assume that the same bits are required on M1 Pro
and Max.
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 20 Jul 2023 13:58:26 +0000 (09:58 -0400)]
asahi: Identify background/EOT counts
Similar to the counts for VDM/PDM/CDM.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 29 Jun 2023 00:24:40 +0000 (20:24 -0400)]
asahi: Serialize NIR in memory
Deserializing isn't expected to be much more expensive than cloning, and the
serialized NIR is *significantly* smaller. So store the serialized instead of
the deserialized, and deserialize on the fly.
This reduces a lot of noise in valgrind due to random crap alloc'd against the
NIR shader by lowering passes that now get properly freed.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 25 May 2023 18:34:42 +0000 (14:34 -0400)]
asahi: Extract shader_initialize helper
To fill out an agx_uncompiled_shader struct, since the logic was duplicated
between graphics and compute.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Wed, 5 Jul 2023 11:28:46 +0000 (20:28 +0900)]
asahi: Add nomsaa debug flag
This forces off MSAA, which together with smalltile mode helps test more
combinations.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Wed, 5 Jul 2023 11:25:01 +0000 (20:25 +0900)]
asahi: Add smalltile debug option
This lets us force small tiles when they otherwise would not be
necessary, which is useful for decoupling tile size and the logic that
depends on it from things like MSAA and MRT which can trigger small
tiles.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Asahi Lina [Wed, 5 Jul 2023 11:16:19 +0000 (20:16 +0900)]
asahi: Add synctvb debug flag
This requests synchronous TVB growth (instead of split renders). Mostly
for testing at this point.
Only works with newer kernels and the kernel will complain on dmesg for
now.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Fri, 19 May 2023 18:09:17 +0000 (14:09 -0400)]
asahi: Refactor PBE upload routine
In general, PBE descriptors map pipe_image_views for the hardware. That we use a
writeable shader image internally for render targets is an implementation detail
of the end-of-tile program. So, refactor the PBE upload routine to take a
pipe_image_view (not a pipe_surface), and translate the pipe_surface into an
internal pipe_image_view for end-of-tile programs.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 5 Jul 2023 02:26:36 +0000 (22:26 -0400)]
asahi: Remove unused #define
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Wed, 5 Jul 2023 02:24:28 +0000 (22:24 -0400)]
asahi: Use nir_builder_at more
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Alyssa Rosenzweig [Thu, 20 Jul 2023 14:45:18 +0000 (10:45 -0400)]
asahi: Augment fake drm_asahi_params_global
Stub out a bit more UAPI so we can build with the additions in this patch
series.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>
Sergi Blanch Torne [Mon, 17 Jul 2023 06:54:58 +0000 (08:54 +0200)]
Integrate ci-kdl in the building process and launch process.
Modify the image build process to also build ci-kdl, so that it is
available in the Mesa jobs. Also modify init-stage2 to launch, in the
background, the process that collects data and stores a JSON file with the
relative changes in the recorded data.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24177>
Sergi Blanch Torne [Mon, 17 Jul 2023 06:51:00 +0000 (08:51 +0200)]
Introduce ci-kdl builder and launcher.
A tool that collects relative changes in some sysfs registers. It can be used
in the Mesa jobs to record information while the tests are being executed.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24177>
Vignesh Raman [Thu, 20 Jul 2023 03:47:53 +0000 (09:17 +0530)]
ci: add Vignesh Raman into restricted traces access list
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24247>
Eric Engestrom [Mon, 17 Jul 2023 11:22:17 +0000 (12:22 +0100)]
ci: delete install.tar after extracting it to avoid re-uploading it
Leaving it means it gets re-uploaded when sync'ing the artifacts back
from the DUT to GitLab.
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24196>
Pavel Ondračka [Thu, 20 Jul 2023 08:40:09 +0000 (10:40 +0200)]
r300: fix cycles calculation
There might be more than one texture semaphore per begin tex block, so
just do the cycles calculation on the first one.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24250>
Lionel Landwerlin [Thu, 20 Jul 2023 05:42:10 +0000 (08:42 +0300)]
ci/a530: switch a few tests to flakes to unblock CI
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24248>
Felix DeGrood [Tue, 18 Jul 2023 18:48:48 +0000 (18:48 +0000)]
intel/compiler: use shader source hash in shader dump code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Felix DeGrood [Wed, 28 Jun 2023 21:15:45 +0000 (21:15 +0000)]
intel: use shader source hash in INTEL_MEASURE
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Felix DeGrood [Tue, 18 Jul 2023 18:02:36 +0000 (18:02 +0000)]
mesa: propagate shader source sha1 from gl_shader to nir_shader
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Felix DeGrood [Wed, 28 Jun 2023 23:40:46 +0000 (23:40 +0000)]
iris: save shader source sha1 in ish
Save lowest dword of shader source sha1 in pipeline object for use
later as hash for uniquely identifying shader in debug outputs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
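A minimal sketch of the idea in the commit above, not the code from the
merge request: reduce a 20-byte SHA-1 digest to a single 32-bit value that
can tag a shader in debug output. The helper name source_hash_from_sha1 and
the constant SHA1_DIGEST_BYTES are hypothetical, and "lowest dword" is
assumed here to mean the first four bytes of the digest in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A SHA-1 digest is 160 bits, i.e. 20 bytes. */
#define SHA1_DIGEST_BYTES 20

/* Hypothetical helper: copy the first four digest bytes into a 32-bit
 * value. memcpy avoids unaligned or aliasing-unsafe pointer casts. The
 * numeric value depends on host endianness, which is fine for a tag that
 * is only compared against other tags produced on the same machine. */
static inline uint32_t
source_hash_from_sha1(const uint8_t sha1[SHA1_DIGEST_BYTES])
{
   uint32_t hash;

   assert(sha1 != NULL);
   memcpy(&hash, sha1, sizeof(hash));
   return hash;
}
```

Only 32 of the 160 digest bits survive, so two different sources can share
a tag; that is acceptable for labelling debug output but not for anything
that needs a collision-resistant key.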
Felix DeGrood [Thu, 29 Jun 2023 04:03:03 +0000 (04:03 +0000)]
anv: Add Source hash field to VkPipelineExecutableStatisticKHR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Felix DeGrood [Wed, 28 Jun 2023 20:58:33 +0000 (20:58 +0000)]
anv: save a shader source uint32_t hash in gfx/compute pipelines
Save lowest dword of shader source sha1 in pipeline object for use
later as hash for uniquely identifying shader in debug outputs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Lionel Landwerlin [Thu, 13 Jul 2023 23:10:20 +0000 (02:10 +0300)]
intel/compiler: rework input parameters
Use a struct for various common parameters rather than per-stage
structures or arguments to stage-specific entrypoints.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>
Konstantin Seurer [Tue, 18 Jul 2023 13:12:15 +0000 (15:12 +0200)]
radv/meta_buffer: Rename size_minus16 to max_offset
It's just better.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24213>
Konstantin Seurer [Tue, 18 Jul 2023 12:53:24 +0000 (14:53 +0200)]
radv/meta_buffer: Stop setting RADV_META_SAVE_DESCRIPTORS
Everything is done via push constants.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24213>
Konstantin Seurer [Tue, 18 Jul 2023 12:39:27 +0000 (14:39 +0200)]
radv: Stop using the misleading round_up_u* functions
The functions had the same behavior as DIV_ROUND_UP but their names do
not mention a division.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24210>
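To illustrate why the old names were misleading, here is a small sketch
contrasting a ceiling division (the arithmetic that DIV_ROUND_UP performs)
with what "round up" usually suggests, namely rounding to a multiple. Both
function names below are hypothetical and are not the radv helpers
themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Ceiling division: the number of divisor-sized chunks needed to cover
 * value. Assumes value + divisor - 1 does not overflow 32 bits. */
static inline uint32_t
div_round_up_u32(uint32_t value, uint32_t divisor)
{
   assert(divisor != 0);
   return (value + divisor - 1) / divisor;
}

/* Rounding up to a multiple: the smallest multiple of "multiple" that is
 * greater than or equal to value. This is a different result from the
 * ceiling division above. */
static inline uint32_t
round_up_to_multiple_u32(uint32_t value, uint32_t multiple)
{
   return div_round_up_u32(value, multiple) * multiple;
}
```

For a value of 17 and a divisor of 16, the first function returns 2 (two
chunks) while the second returns 32 (the next multiple of 16), which is the
confusion a name without "div" invites.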
Pavel Ondračka [Fri, 14 Jul 2023 08:05:35 +0000 (10:05 +0200)]
r300: cycles estimate for shader-db
To account for:
- macro MAD in vs
- NOPs needed before presubtract
- texture scheduling and a proper texture semaphore usage
The docs don't mention any other references to extra cycles, so otherwise
we assume 1 instruction = 1 cycle.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7573
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24152>
Pavel Ondračka [Fri, 14 Jul 2023 08:05:27 +0000 (10:05 +0200)]
r300: add a helper for checking number of temporary sources
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24152>
Pavel Ondračka [Fri, 14 Jul 2023 06:22:23 +0000 (08:22 +0200)]
r300: normal instruction can't have presubtract op
Only fs have presubtract ops, and by the time we gather the stats,
all normal instructions have already been converted to pair ones.
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24152>
Pavel Ondračka [Fri, 14 Jul 2023 10:22:15 +0000 (12:22 +0200)]
r300: bump the RC_MAX_INDEX_BITS
We skip ntt regalloc for vertex shaders and we have a 1024-instruction
limit for R500 vs, so in theory we could run some shaders with more than
1024 ssa registers (if we can optimize the number of instructions in the
backend). So add one more bit.
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24154>
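The arithmetic behind "add one more bit" can be sketched as follows. The
names are hypothetical, and the old width of 10 bits is an assumption
inferred from the 1024 figure in the message above, not taken from the
r300 source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths: 10 bits before the bump (inferred), 11 bits after. */
#define OLD_INDEX_BITS 10
#define NEW_INDEX_BITS 11

/* Number of distinct register indices representable in "bits" bits. */
static inline uint32_t
max_index_count(unsigned bits)
{
   assert(bits < 32);
   return UINT32_C(1) << bits;
}

/* True if index can be encoded in a field that is "bits" wide. */
static inline int
index_fits(uint32_t index, unsigned bits)
{
   return index < max_index_count(bits);
}
```

Under these assumptions a 10-bit field covers indices 0 to 1023, so a
shader that uses more than 1024 ssa registers needs index 1024, which only
fits once the field is widened to 11 bits (2048 indices).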