Jesse Natalie [Wed, 22 Sep 2021 19:06:58 +0000 (12:06 -0700)]
gallium, windows: Use HANDLE instead of FD for external objects
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
Jesse Natalie [Wed, 21 Jul 2021 18:44:48 +0000 (11:44 -0700)]
microsoft/compiler: Handle GLES external textures
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
Jesse Natalie [Wed, 11 Aug 2021 21:43:52 +0000 (14:43 -0700)]
d3d12: Support RGBX formats mapped to RGBA
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
Jesse Natalie [Sun, 26 Sep 2021 15:22:58 +0000 (08:22 -0700)]
d3d12: Support PIPE_CAP_MIXED_COLOR_DEPTH_BITS
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
Jesse Natalie [Sun, 26 Sep 2021 15:19:21 +0000 (08:19 -0700)]
d3d12: Support BGRA 555 and 565 formats
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13054>
Jesse Natalie [Wed, 6 Oct 2021 19:32:05 +0000 (12:32 -0700)]
android: Allow forcing softpipe
When dealing with swrast, there are two possibilities: If you have LLVM, you get
llvmpipe, which is pretty fast. If you don't, you get softpipe, which is slow,
but does have a couple nice qualities, like being smaller and not needing
executable memory for JIT.
If you're building a driver that requires LLVM like radeonsi then you need the
LLVM stub for the build to find LLVM. But for swrast, since it can mean either
softpipe/llvmpipe, you don't strictly need LLVM. So this just makes the
Android build files flexible like the Meson build files (where you can specify
-Dllvm=disabled even if LLVM is findable).
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13532>
Jesse Natalie [Thu, 5 Aug 2021 00:55:58 +0000 (17:55 -0700)]
android,d3d12: Support using DirectX-Headers dependency from AOSP
Note that the Android build system apparently lowercases stuff,
so add a lowercase "directx-headers" dependency which is searched first,
before falling back to the proper-cased "DirectX-Headers" dependency.
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13532>
Jesse Natalie [Thu, 29 Jul 2021 17:09:32 +0000 (10:09 -0700)]
mesa/main, android: Log errors to logcat
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13532>
Jesse Natalie [Thu, 18 Nov 2021 22:51:05 +0000 (14:51 -0800)]
android: Add a BOARD CFlags option so build can be customized
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13532>
Mike Blumenkrantz [Fri, 19 Nov 2021 18:41:22 +0000 (13:41 -0500)]
zink: be consistent about waiting on context queue on context destroy
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13885>
Mike Blumenkrantz [Fri, 19 Nov 2021 18:42:41 +0000 (13:42 -0500)]
zink: set batch state queue on creation
make this easier to find
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13885>
Emma Anholt [Thu, 18 Nov 2021 03:28:52 +0000 (19:28 -0800)]
freedreno/a5xx: Emit MSAA state for sysmem rendering, too.
This looked obviously wrong, we want to set the sample counts for sysmem
too just like we do on 6xx. Turns out it fixes some piglits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
Emma Anholt [Thu, 18 Nov 2021 00:27:16 +0000 (16:27 -0800)]
freedreno/a5xx: Document the sRGB bit on RB_2D_SRC/DST info.
Noticed while looking through my set of traces for where the average bit
might be. Same spot as on a6xx.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
Emma Anholt [Thu, 18 Nov 2021 00:30:19 +0000 (16:30 -0800)]
freedreno/a5xx: Define a5xx_2d_surf_info like a6xx has.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
Emma Anholt [Thu, 18 Nov 2021 00:04:45 +0000 (16:04 -0800)]
freedreno/a6xx: Disable sample averaging on non-ubwc z24s8 MSAA blits.
The fallback path averages unorm textures, but if we don't have ubwc on
either then we can just cast them to uint which then just takes sample 0.
The proper UBWC format I think ends up averaging, though.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
Emma Anholt [Wed, 17 Nov 2021 23:40:49 +0000 (15:40 -0800)]
freedreno/a6xx: Disable sample averaging on z/s or integer blits.
We can't generally force fd_blitter_blit() to not average in our fallback
blits, but this should at least help some cases.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13867>
Connor Abbott [Tue, 16 Nov 2021 14:20:52 +0000 (15:20 +0100)]
ir3/lower_pcopy: Fix bug with "illegal" copies and swaps
If the source and destination were within the same full register, like
hr90.x and hr90.y (which both map to r45.x), then we'd perform the
swap/copy with the wrong register. This broke
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.35 once BDA is enabled.
Fixes: 0ffcb19b9d9 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
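The aliasing described in the commit above (hr90.x and hr90.y both mapping to r45.x) can be illustrated with a minimal sketch. It assumes a flat numbering where register N component C is N * 4 + C, and that two consecutive half registers share one full register. The helper names are hypothetical and this is not ir3's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flat numbering: register N, component C (x=0 .. w=3). */
static unsigned reg_num(unsigned reg, unsigned comp)
{
   assert(comp < 4);
   return reg * 4 + comp;
}

/* Assumption: half register h lives inside full register h / 2. */
static unsigned half_to_full(unsigned half_num)
{
   return half_num / 2;
}

/* True when two half registers sit inside the same full register,
 * which is the case the copy/swap lowering has to treat specially. */
static bool halves_share_full_reg(unsigned a, unsigned b)
{
   return half_to_full(a) == half_to_full(b);
}
```

Under these assumptions hr90.x and hr90.y land in the same full register, while hr90.y and hr90.z do not.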
Connor Abbott [Tue, 16 Nov 2021 14:32:58 +0000 (15:32 +0100)]
ir3/lower_pcopy: Fix shr.b illegal copy lowering
The immediate shouldn't be half-reg because the other source isn't.
Fixes an assertion failure with
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.35.
Fixes: 0ffcb19b9d9 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
Connor Abbott [Tue, 16 Nov 2021 13:23:03 +0000 (14:23 +0100)]
ir3/spill: Support larger spill slot offset
This is required by
dEQP-VK.ssbo.phys.layout.random.all_shared_buffer.47, where we need to
spill a lot of pointers due to NIR CSE being a little too aggressive and
creating a large register pressure across basic blocks, too large to fit
within the boundaries of ldp/stp offsets.
Note that this will be a lot more difficult with support for "real
functions" because the base register will become unknown at compile
time. However this hack gets things working for the time being.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
Connor Abbott [Mon, 15 Nov 2021 11:53:01 +0000 (12:53 +0100)]
ir3/ra: Add missing asserts to ra_push_interval()
This would've caught the previous issue earlier. We checked that the
physreg made sense when inserting via ra_file_insert() but not
ra_push_interval() which is used for live-range splitting.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
Connor Abbott [Mon, 15 Nov 2021 11:11:07 +0000 (12:11 +0100)]
ir3/ra: Consider reg file size when swapping killed sources
Don't swap a 2-component vector of half-regs with a full reg if that
would result in the half regs going outside of the allowable half-reg
space.
Fixes: d4b5d2a0204 ("ir3/ra: Use killed sources in register eviction")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13818>
Jesse Natalie [Mon, 27 Sep 2021 14:24:16 +0000 (07:24 -0700)]
meson: Allow mismatching RTTI for MSVC
This might be safe to relax to all Windows compilers, but I didn't
test Clang or MinGW, so scoping to MSVC for now. For MSVC, this is
safe to mismatch, because the vftables are emitted into all objects
with "pick largest," and the definition with RTTI is larger than the
one without. This is different than the Itanium ABI, which only emits
one copy of the typeinfo in the object which defines the key method.
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13064>
Jesse Natalie [Mon, 27 Sep 2021 14:23:54 +0000 (07:23 -0700)]
meson: Don't override built-in cpp_rtti option, error if it's invalid
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13064>
Lionel Landwerlin [Thu, 18 Nov 2021 21:53:48 +0000 (23:53 +0200)]
anv: initialize anv_bo_sync base fields
v2: zalloc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: cbb13fae33a8b ("anv: Add a BO sync type")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13875>
Lionel Landwerlin [Fri, 19 Nov 2021 14:00:15 +0000 (16:00 +0200)]
anv: don't try to close fd = -1
CID: 1464334
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13879>
Samuel Pitoiset [Wed, 17 Nov 2021 19:32:44 +0000 (20:32 +0100)]
radv: ignore the descriptor set layout when creating descriptor template
From the Vulkan spec:
"This parameter is ignored if templateType is not
VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET."
This fixes an assertion about the base object type when running Yuzu
with Vulkan validation layers enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13846>
Samuel Pitoiset [Mon, 2 Aug 2021 16:34:08 +0000 (18:34 +0200)]
radv: allow TC-compat CMASK with storage images on GFX10+
Hardware seems to support it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12173>
Mike Blumenkrantz [Thu, 18 Nov 2021 21:27:36 +0000 (16:27 -0500)]
zink: add a compiler pass to scan for shader image use
other frontends and internal shaders won't set this
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13864>
Mike Blumenkrantz [Thu, 18 Nov 2021 14:53:58 +0000 (09:53 -0500)]
zink: explicitly init glsl
need this to be able to use other frontends
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13864>
Alejandro Piñeiro [Thu, 18 Nov 2021 11:16:43 +0000 (12:16 +0100)]
vulkan: move common format helpers to vk_format
v3dv, radv, and turnip are using several C&P format helpers (most of
them wrappers over util_format_description based helpers).
This commit moves the common helpers to the already existing common
vk_format.h. For the case of v3dv we were able to remove the vk_format
header. For turnip and radv, a local vk_format.h header remains, with
methods that are only used for those drivers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13858>
Samuel Pitoiset [Thu, 18 Nov 2021 15:58:42 +0000 (16:58 +0100)]
util/queue: fix a data race detected by TSAN when finishing the queue
Thread sanitizer complains if it detects that the pthread_barrier
is destroyed when a thread might still be blocked on the barrier.
Fix this by destroying the barrier only if pthread_barrier_wait
returns PTHREAD_BARRIER_SERIAL_THREAD, which exactly one thread receives
once every thread has reached the barrier.
In practice this shouldn't fix anything serious given that this code
is only called when the disk cache is destroyed.
Original patch from Timothy Arceri.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4342
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13861>
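The pattern described in the util/queue commit above can be sketched as follows. This is a standalone illustration using plain POSIX barriers, not the util_queue code itself; NUM_THREADS, worker and run_barrier_demo are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4

static pthread_barrier_t barrier;
static int destroy_count; /* written only by the serial thread */

static void *worker(void *arg)
{
   (void)arg;
   /* Exactly one waiter gets PTHREAD_BARRIER_SERIAL_THREAD; the others
    * get 0. Only that one thread destroys the barrier. */
   if (pthread_barrier_wait(&barrier) == PTHREAD_BARRIER_SERIAL_THREAD) {
      pthread_barrier_destroy(&barrier);
      destroy_count++;
   }
   return NULL;
}

/* Spawns the workers and returns how often the barrier was destroyed. */
static int run_barrier_demo(void)
{
   pthread_t threads[NUM_THREADS];
   int ret;

   destroy_count = 0;
   ret = pthread_barrier_init(&barrier, NULL, NUM_THREADS);
   assert(ret == 0);

   for (int i = 0; i < NUM_THREADS; i++) {
      ret = pthread_create(&threads[i], NULL, worker, NULL);
      assert(ret == 0);
   }
   for (int i = 0; i < NUM_THREADS; i++)
      pthread_join(threads[i], NULL);

   (void)ret;
   return destroy_count;
}
```

Because the destroy happens on a thread that has itself returned from the wait, no separate thread tears the barrier down before everyone has arrived.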
Qiang Yu [Fri, 12 Nov 2021 06:42:54 +0000 (14:42 +0800)]
glx/dri3: fix glXQueryContext does not return GLX_RENDER_TYPE value
Cc: mesa-stable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13772>
Emma Anholt [Wed, 17 Nov 2021 23:15:19 +0000 (15:15 -0800)]
freedreno: Stop exposing MSAA image load/store on desktop GL.
GLES doesn't support it, and blob VK doesn't support it. We could
theoretically lower it, but don't bother since it's not required. Fixes
various piglit image load/store tests.
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13852>
Alyssa Rosenzweig [Sat, 13 Nov 2021 19:52:52 +0000 (14:52 -0500)]
asahi: Fix BIND_PIPELINE sizing and alignment
Fix a bug in BIND_PIPELINE XML reported by Dougall, which cleans up
a bit of both decoder and driver.
Instead of...
* 17 bytes BIND_PIPELINE (17)
* An unused 8 byte record (25)
* A set of N 8 byte records (25 + 8 * N)
* Oops, 1 byte too many! One just disappeared (24 + 8 * N)
It seems to instead be
* 24 bytes BIND_PIPELINE (24)
* A set of N 8 byte records (24 + 8 * N)
without the sentinel record. This means the 8 byte records themselves
are shuffled, with the high byte of the pointers split from the low
word, but that's less gross than an off-by-one.
It's still not clear what the last 8 bytes of the BIND_VERTEX_PIPELINE
structure mean, or the last 4 bytes of the BIND_FRAGMENT_PIPELINE
structure which seems to be a bit shorter.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
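The byte counts in the BIND_PIPELINE commit above can be checked with a small sketch. It only restates the arithmetic from the message; the function names are hypothetical and nothing here reflects the real XML definitions.

```c
#include <assert.h>
#include <stddef.h>

#define RECORD_SIZE 8

/* Old reading: 17-byte header, one unused 8-byte record, N records,
 * and one byte that had to "disappear" to make the total fit. */
static size_t bind_pipeline_size_old(size_t n)
{
   return 17 + RECORD_SIZE + RECORD_SIZE * n - 1;
}

/* New reading: 24-byte header followed by N records, no sentinel. */
static size_t bind_pipeline_size_new(size_t n)
{
   return 24 + RECORD_SIZE * n;
}

/* Both readings describe the same total size for any record count. */
static void check_layouts_agree(size_t n)
{
   assert(bind_pipeline_size_old(n) == bind_pipeline_size_new(n));
}
```

The totals match for every N, so the fix changes where the field boundaries fall rather than the overall length.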
Alyssa Rosenzweig [Sat, 13 Nov 2021 19:26:57 +0000 (14:26 -0500)]
asahi: Remove obnoxious workaround
Now that we're not hardcoding any magic BO IDs, there is no minimum
number of allocations needed. Remove the unneeded -- and obnoxious --
workaround of allocating unused BOs on startup.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Alyssa Rosenzweig [Sat, 13 Nov 2021 19:25:47 +0000 (14:25 -0500)]
asahi: Remove silly magic numbers
These are unnecessary now that the structure of agx_map_* is better
understood.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Alyssa Rosenzweig [Sat, 13 Nov 2021 19:24:01 +0000 (14:24 -0500)]
asahi: Fix agx_map_* structures
Dougall Johnson observed these structures make more sense with indices[]
first in the entries and indices[] absent from the header. Then the
sentinel entry disappears, nr_entries makes more sense, and a few magic
numbers pop out. Many thanks to Dougall's astute eyes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Alyssa Rosenzweig [Sat, 13 Nov 2021 18:53:43 +0000 (13:53 -0500)]
asahi: Allocate special scratch buffers
Seem to be used for preemption.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Alyssa Rosenzweig [Sat, 13 Nov 2021 18:53:34 +0000 (13:53 -0500)]
asahi: Deflake addresses
Reported by Dougall.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Alyssa Rosenzweig [Sun, 31 Oct 2021 15:13:58 +0000 (11:13 -0400)]
asahi: Rename PANDECODE->AGXDECODE
Fix remnant of the Panfrost decoder fork.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13784>
Alyssa Rosenzweig [Mon, 8 Nov 2021 00:51:19 +0000 (19:51 -0500)]
pan/bi: Add XML for LD_BUFFER
Encoded like LOAD.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Mon, 15 Nov 2021 23:15:42 +0000 (18:15 -0500)]
pan/bi: Suppress uniform validation for LD_BUFFER
Seems to be ok and used by the DDK...
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Mon, 8 Nov 2021 00:50:51 +0000 (19:50 -0500)]
pan/bi: Confirm IDP unit on Valhall
Based on Anandtech which gives 8-bit dot product throughput on Valhall
under FMA and not consistent with SFU.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Thu, 4 Nov 2021 23:08:36 +0000 (19:08 -0400)]
pan/bi: Forbid unaligned staging registers on Valhall
Would've saved me some debugging with the computerator. I keep
forgetting about this nuance. Enforce it in the assembler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Thu, 4 Nov 2021 00:22:26 +0000 (20:22 -0400)]
pan/bi: Add XML for assembling Valhall image stores
Not complete yet but let's get some tests in early. Document the new
instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Mon, 15 Nov 2021 23:19:57 +0000 (18:19 -0500)]
pan/bi: Add Valhall's special FMA_RSCALE instructions
Like Bifrost, but exposed as separate physical instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Mon, 15 Nov 2021 23:19:08 +0000 (18:19 -0500)]
pan/bi: Add sqrt form of Valhall FREXPM
Like Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Mon, 15 Nov 2021 23:18:55 +0000 (18:18 -0500)]
pan/bi: Add full form of Valhall MUX instruction
Like Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Alyssa Rosenzweig [Mon, 15 Nov 2021 23:18:23 +0000 (18:18 -0500)]
pan/bi: Annotate Valhall instructions with units
Based on analyzing the cycle counts reported by the Mali offline
compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13802>
Mike Blumenkrantz [Thu, 5 Aug 2021 15:32:52 +0000 (11:32 -0400)]
zink: enable PIPE_TEXTURE_TRANSFER_COMPUTE on non-cpu drivers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13859>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:45:20 +0000 (16:45 -0500)]
zink: use pb_slab_alloc_reclaimed(reclaim_all) for BAR heap sometimes
this forces a full slab reclaim any time the device is known to have a
too-small BAR in order to keep memory usage at a minimum when it might otherwise
balloon out and crash us
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13850>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:43:55 +0000 (16:43 -0500)]
aux/pb: add a new slab alloc function for reclaiming all bo objects
sometimes a driver might want to always reclaim all bo objects in the course
of allocating a new bo. this is useful when it's known that a given memory
heap is very small and will likely need to keep its usage minimized
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13850>
Roland Scheidegger [Mon, 15 Nov 2021 17:27:44 +0000 (18:27 +0100)]
llvmpipe: adjust rounding for viewport scissoring
Some apps may try to use a viewport adjusted by 0.5 pixels (among other
things) to emulate d3d9 pixel center, and in this case we would end up
with incorrect "fake scissor" box (shifted by 1 pixel), hence pixels
being incorrectly scissored away when permit_linear_rasterizer is set
(this happens even if the linear rasterizer is not used in the end).
So adjust the offset so that the half-way points get rounded down instead
of up.
(This is all a bit iffy I think since we don't use fractional
boxes (with 8 subpixel bits) anywhere yet, but at least without msaa
it should work out.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13794>
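The rounding change in the llvmpipe commit above can be illustrated with a minimal sketch. It assumes non-negative coordinates in fixed point with 8 subpixel bits (the precision the message mentions) and shows only the difference between rounding half-way points up and down. The names are hypothetical and this is not the llvmpipe setup code.

```c
#include <assert.h>

#define SUBPIXEL_BITS 8
#define SUBPIXEL_ONE  (1 << SUBPIXEL_BITS)
#define SUBPIXEL_HALF (SUBPIXEL_ONE / 2)

/* Round to the nearest pixel; exact half-way points go up. */
static int round_half_up(int fixed)
{
   assert(fixed >= 0);
   return (fixed + SUBPIXEL_HALF) >> SUBPIXEL_BITS;
}

/* Round to the nearest pixel; exact half-way points go down. */
static int round_half_down(int fixed)
{
   assert(fixed >= 0);
   return (fixed + SUBPIXEL_HALF - 1) >> SUBPIXEL_BITS;
}
```

A viewport edge at 10.5 pixels is the only kind of input where the two disagree, which matches the 0.5 pixel d3d9-style offset described in the message.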
Eric Engestrom [Wed, 17 Nov 2021 20:30:43 +0000 (20:30 +0000)]
docs: add 22.0 branchpoint date for perspective
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13848>
Eric Engestrom [Wed, 17 Nov 2021 20:28:17 +0000 (20:28 +0000)]
docs: add 21.3.x release schedule
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13848>
Eric Engestrom [Wed, 17 Nov 2021 20:22:05 +0000 (20:22 +0000)]
docs: update calendar and link releases notes for 21.3.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13848>
Eric Engestrom [Wed, 17 Nov 2021 20:10:19 +0000 (20:10 +0000)]
docs: add release notes for 21.3.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13848>
Samuel Pitoiset [Tue, 16 Nov 2021 10:08:15 +0000 (11:08 +0100)]
radv: disable HTILE for D32S8 format and mipmaps on GFX10
Stencil texturing with HTILE doesn't work with mipmapping on Navi10-14,
it's a hw bug. RadeonSI and PAL have a workaround too.
This fixes 35 piglit failures with Zink on Navi10.
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13814>
Tomeu Vizoso [Fri, 29 Oct 2021 09:00:32 +0000 (11:00 +0200)]
ci: Uprev Crosvm
And use my fork while we upstream some improvements to Crosvm that make
it more suitable for use in CI.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
Tomeu Vizoso [Wed, 8 Sep 2021 05:52:58 +0000 (07:52 +0200)]
virgl/ci: Run each dEQP instance in its own VM
Currently we run deqp-runner inside a single VM, which makes very poor
use of the available CPUs because Virgl has a bottleneck in the VMM that
serializes everything.
With this change, we can run several Crosvm instances in a runner and
make full use of the CPUs, getting the same coverage with 3 runners
instead of 6.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
Tomeu Vizoso [Wed, 17 Nov 2021 07:33:46 +0000 (08:33 +0100)]
ci: Remove syslogd
Crosvm doesn't need it any more.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
Tomeu Vizoso [Wed, 8 Sep 2021 05:42:49 +0000 (07:42 +0200)]
virgl/ci: Set GALLIVM_PERF=nopt,no_quad_lod
nopt will disable some shader optimizations that slow down test runs for
no gain.
no_quad_lod will disable some speed hacks that can cause inaccurate
results.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
Tomeu Vizoso [Wed, 8 Sep 2021 05:40:23 +0000 (07:40 +0200)]
ci: Don't set GALLIVM_PERF in the scripts
Instead, let gitlab-ci.yml files define it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
Tomeu Vizoso [Wed, 8 Sep 2021 05:38:59 +0000 (07:38 +0200)]
ci: Create symlink to /install early
So we can use well-known absolute paths in configuration files.
Otherwise, the install dir is within $CI_PROJECT_DIR, which changes
between jobs.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12828>
Mike Blumenkrantz [Thu, 5 Aug 2021 19:28:52 +0000 (15:28 -0400)]
gallium: implement compute pbo download
this reworks PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER into an
enum as PIPE_CAP_TEXTURE_TRANSFER_MODES, enabling drivers to choose
a (sometimes) faster, compute-based download mechanism based on a new
pipe_screen hook
compute pbo download is implemented using shaders with a prolog to convert
the input format to generic rgb float values, then an epilog to convert
to the output value. the prolog and epilog are determined based on a vec4
of packed ubo data which is dynamically updated based on the API usage
currently, the only known limitations are:
* GL_ARB_texture_cube_map_array is broken somehow (and disabled)
* AMD hardware somehow can't do depth readback?
otherwise it should work for every possible case
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
Mike Blumenkrantz [Wed, 17 Nov 2021 23:16:14 +0000 (18:16 -0500)]
mesa/st: make some pbo functions public
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
Mike Blumenkrantz [Tue, 16 Nov 2021 15:14:37 +0000 (10:14 -0500)]
mesa/st: make sampler_type_for_target public
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
Mike Blumenkrantz [Mon, 8 Nov 2021 17:33:57 +0000 (12:33 -0500)]
gallium: rename PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER
this is now a bitfield enum for more functionality
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
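Turning a boolean cap into a bitfield enum, as the rename commit above describes, might look roughly like the sketch below. The EXAMPLE_ names and values are hypothetical stand-ins chosen for illustration, not the definitions in Gallium's headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bitfield: each bit names one transfer mechanism a
 * driver prefers; 0 means "no preference, use the default path". */
enum example_transfer_mode {
   EXAMPLE_TRANSFER_DEFAULT = 0,
   EXAMPLE_TRANSFER_BLIT    = 1 << 0,
   EXAMPLE_TRANSFER_COMPUTE = 1 << 1,
};

/* Returns true if the driver-reported mask contains the given mode. */
static bool example_supports(unsigned modes, enum example_transfer_mode mode)
{
   assert(mode != EXAMPLE_TRANSFER_DEFAULT); /* must name a mechanism */
   return (modes & (unsigned)mode) != 0;
}
```

A bitfield lets one cap report several mechanisms at once, which a boolean "prefer blit" cap could not express.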
Mike Blumenkrantz [Mon, 8 Nov 2021 17:30:26 +0000 (12:30 -0500)]
gallium: add pipe_screen::is_compute_copy_faster hook
this can be used to query whether a driver expects a given texture
copy to be faster as a compute shader or using cpu/gfx transfers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
Dylan Baker [Thu, 18 Nov 2021 01:03:49 +0000 (17:03 -0800)]
turnip: don't use mesa/macros.h to get utils/rounding.h
For hopefully obvious reasons.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13853>
Pierre-Eric Pelloux-Prayer [Wed, 17 Nov 2021 14:45:19 +0000 (15:45 +0100)]
radeonsi/sqtt: increase the default buffer size to 32MB
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13838>
Pierre-Eric Pelloux-Prayer [Wed, 17 Nov 2021 14:40:31 +0000 (15:40 +0100)]
radeonsi: unreference framebuffer state after use
util_copy_framebuffer_state increases refcounts, so we have
to decrement them afterwards.
Fixes: b1b491cdbba ("radeonsi: add a faster clear path for glClearTexImage")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5631
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13838>
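The leak fixed by the radeonsi commit above follows a common reference-counting pattern, sketched here with toy types. All toy_ names are hypothetical; they stand in for the real framebuffer-state helpers only in the sense that the copy takes references which must later be dropped.

```c
#include <assert.h>
#include <stddef.h>

struct toy_surface  { int refcount; };
struct toy_fb_state { struct toy_surface *cbuf; };

/* Copying state takes a new reference on the source surface and
 * drops the one previously held by the destination. */
static void toy_copy_fb_state(struct toy_fb_state *dst,
                              const struct toy_fb_state *src)
{
   if (src->cbuf)
      src->cbuf->refcount++;
   if (dst->cbuf)
      dst->cbuf->refcount--;
   dst->cbuf = src->cbuf;
}

/* Drops the reference held by a saved copy. */
static void toy_unref_fb_state(struct toy_fb_state *fb)
{
   if (fb->cbuf) {
      assert(fb->cbuf->refcount > 0);
      fb->cbuf->refcount--;
      fb->cbuf = NULL;
   }
}

/* Saves the state, optionally releases the saved copy, and returns
 * the surface refcount that remains afterwards. */
static int toy_save_and_restore(struct toy_surface *surf, int unref_after_use)
{
   struct toy_fb_state current = { surf };
   struct toy_fb_state saved = { NULL };

   toy_copy_fb_state(&saved, &current); /* refcount goes up by one */
   /* ... temporary state change and restore would happen here ... */
   if (unref_after_use)
      toy_unref_fb_state(&saved);
   return surf->refcount;
}
```

Without the final unreference the surface keeps an extra reference forever, which is the kind of leak the commit addresses.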
Iago Toral Quiroga [Wed, 17 Nov 2021 12:55:48 +0000 (13:55 +0100)]
broadcom/compiler: fix early fragment tests setup
When early fragment tests are mandated by the shader, we must use
the Z value produced by the FEP even if there are elements that
would typically require late fragment tests (such as discards,
sample to coverage, etc).
This change means we also need to be a bit more careful when
we promote shaders to use early fragment tests so we don't
promote anything with discards for example.
Fixes:
dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_depth
dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_stencil
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13837>
Joshua Ashton [Tue, 6 Jul 2021 14:01:10 +0000 (15:01 +0100)]
radv: Implement VK_EXT_image_view_min_lod
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13820>
Joshua Ashton [Tue, 16 Nov 2021 15:03:42 +0000 (15:03 +0000)]
vulkan: Update the XML and headers to 1.2.199
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13820>
Joshua Ashton [Tue, 6 Jul 2021 13:19:46 +0000 (14:19 +0100)]
radv: Expose min_lod in *_make_texture_descriptor
We'll need this going forward for VK_EXT_image_view_min_lod.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13820>
Joshua Ashton [Tue, 6 Jul 2021 13:28:20 +0000 (14:28 +0100)]
radv: Refactor S_FIXED to radv_float_to_{s,u}fixed
We'll need to use this in radv_image for VK_EXT_image_view_min_lod.
Additionally, creates signed/unsigned variants to avoid sign-extending where we don't need to.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13820>
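The signed and unsigned conversions mentioned in the S_FIXED refactor above can be sketched as below. This is a minimal illustration assuming a plain scale-and-truncate conversion; the example_ names are hypothetical and the real radv helpers may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Signed variant: negative inputs produce negative fixed-point values. */
static int32_t example_float_to_sfixed(float value, unsigned frac_bits)
{
   assert(frac_bits < 31);
   return (int32_t)(value * (float)(1u << frac_bits));
}

/* Unsigned variant: no sign extension, so the full field width is
 * available for magnitudes. Input must be non-negative. */
static uint32_t example_float_to_ufixed(float value, unsigned frac_bits)
{
   assert(frac_bits < 32 && value >= 0.0f);
   return (uint32_t)(value * (float)(1u << frac_bits));
}
```

Keeping two variants means a field that can never be negative, such as a minimum LOD, does not pick up stray high bits when packed into a descriptor.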
Mike Blumenkrantz [Wed, 17 Nov 2021 19:15:18 +0000 (14:15 -0500)]
zink: clamp to 500 max batch states on nvidia
I've been advised that leaving this unclamped will use up all the fds
allotted to a process
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13844>
Mike Blumenkrantz [Tue, 16 Nov 2021 22:45:58 +0000 (17:45 -0500)]
zink: fail context creation more gracefully
handle some cases where context creation fails earlier than expected
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13844>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:48:28 +0000 (16:48 -0500)]
zink: fix memory availability reporting
this shouldn't report the budgeted available memory, it should return
the total memory, as that's what this api expects
Fixes: ff4ba3d4a77 ("zink: support PIPE_CAP_QUERY_MEMORY_INFO")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:46:49 +0000 (16:46 -0500)]
zink: use IMMUTABLE for dummy xfb buffer
this is never getting read back or anything so don't waste BAR allocation
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:42:50 +0000 (16:42 -0500)]
zink: demote BAR allocations to device-local on oom
ideally this shouldn't happen, but it's better than crashing even if
it may crash later from attempting to map
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:47:16 +0000 (16:47 -0500)]
zink: set zink_resource_object::host_visible based on actual bo placement
the properties determined before allocation may not be the same as what gets
allocated
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Mike Blumenkrantz [Wed, 17 Nov 2021 21:41:34 +0000 (16:41 -0500)]
zink: always use slab allocation placement for domains
this allows the actual bo to have its memory type changed if necessary
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Mike Blumenkrantz [Wed, 17 Nov 2021 18:22:39 +0000 (13:22 -0500)]
zink: add error for bo allocation failure
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13849>
Marek Olšák [Sun, 31 Oct 2021 05:02:02 +0000 (01:02 -0400)]
glx: add a workaround to glXDestroyWindow for Viewperf2020/Sw
This fixes:
X Error of failed request: GLXBadWindow
Major opcode of failed request: 152 (GLX)
Minor opcode of failed request: 32 (X_GLXDestroyWindow)
Serial number of failed request: 9667
Current serial number in output stream: 9674
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13611>
Iván Briano [Wed, 17 Nov 2021 19:02:26 +0000 (11:02 -0800)]
intel/nir: also allow unknown format for getting the size of a storage image
Fixes: fa251cf111df ("intel/nir: allow unknown format in lowering of storage images")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13847>
Ian Romanick [Wed, 17 Nov 2021 01:02:50 +0000 (17:02 -0800)]
glsl/nir: Don't build soft float64 when it cannot be used
Fixes: 82d9a37a59c ("glsl/nir: Add a shared helper for building float64 shaders")
Closes: #5556
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13828>
Mike Blumenkrantz [Thu, 11 Nov 2021 16:57:49 +0000 (11:57 -0500)]
zink: implement multiplanar modifier handling
it turns out this is trivial as long as dri gives usable resource templates
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
Mike Blumenkrantz [Thu, 11 Nov 2021 20:10:45 +0000 (15:10 -0500)]
dri2: set dimensions on dmabuf import planes
this is unusable for some drivers without the plane size attached
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
Mike Blumenkrantz [Thu, 11 Nov 2021 17:23:29 +0000 (12:23 -0500)]
zink: always set matching resource export type for dmabuf creation
both of these need to be set if one is
cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
Mike Blumenkrantz [Thu, 11 Nov 2021 17:11:33 +0000 (12:11 -0500)]
zink: stop using VK_IMAGE_LAYOUT_PREINITIALIZED for dmabuf
this is illegal
cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13799>
Jason Ekstrand [Wed, 17 Nov 2021 15:24:40 +0000 (09:24 -0600)]
vulkan/sync: Rework asserts a bit
ANV currently smashes off the TIMELINE bit depending on whether or not
the i915 interface supports them, triggering assert(!type->get_value).
Instead of requiring ANV to smash off function pointers, let the extra
function pointers through and then assert on the feature bits before the
function pointers get used. This should give us roughly the same amount
of assert protection while side-stepping the feature disabling problem.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13839>
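A rough illustration of the pattern, assuming a simplified sync type with a feature mask and one optional callback. All names below are hypothetical stand-ins and do not reproduce the vk_sync interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_sync_features {
   TOY_SYNC_FEATURE_TIMELINE = 1u << 0,
};

struct toy_sync_type {
   uint32_t features;
   /* May stay non-NULL even when a driver masks off TIMELINE. */
   int (*get_value)(uint64_t *value);
};

/* Example backend callback used only for illustration. */
static int
toy_fixed_get_value(uint64_t *value)
{
   *value = 7;
   return 0;
}

static int
toy_sync_get_value(const struct toy_sync_type *type, uint64_t *value)
{
   /* Check the feature bit at the point of use ... */
   assert(type->features & TOY_SYNC_FEATURE_TIMELINE);
   /* ... rather than asserting up front that a cleared bit implies a
    * NULL function pointer. */
   assert(type->get_value != NULL);
   return type->get_value(value);
}
```

With this arrangement a driver can clear the feature bit without also clearing the function pointer, and a mistaken call is still caught when it happens.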
Filip Gawin [Tue, 16 Nov 2021 23:46:19 +0000 (00:46 +0100)]
glsl: fix trivial strict aliasing warning
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13827>
Omar Akkila [Fri, 12 Nov 2021 17:51:26 +0000 (12:51 -0500)]
llvmpipe: page-align memory allocations
Allows memory allocated by llvmpipe_allocate_memory_fd to be
mapped into guests in virtualized environments like KVM, which
require page-aligned memory.
llvmpipe_allocate_memory is updated similarly for consistency.
Signed-off-by: Omar Akkila <omar.akkila@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13793>
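One way to obtain page-aligned memory on a POSIX system is sketched below. It is not the llvmpipe implementation: the helper names are hypothetical, and the choice of posix_memalign plus rounding the size up to whole pages is an assumption made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Round size up to the next multiple of align (a power of two). */
static size_t
align_up(size_t size, size_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (size + align - 1) & ~(align - 1);
}

/* True if p is a multiple of align (a power of two). */
static bool
is_aligned(const void *p, size_t align)
{
   return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: returns memory that starts on a page boundary and
 * spans a whole number of pages, or NULL on failure. Release with free(). */
static void *
alloc_page_aligned(size_t size)
{
   long page = sysconf(_SC_PAGESIZE);
   if (page <= 0)
      return NULL;

   void *ptr = NULL;
   if (posix_memalign(&ptr, (size_t)page, align_up(size, (size_t)page)) != 0)
      return NULL;
   return ptr;
}
```

Rounding the size as well as the start address means the allocation never shares a page with unrelated data, which matters when whole pages are mapped elsewhere.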
Connor Abbott [Mon, 8 Nov 2021 18:36:36 +0000 (19:36 +0100)]
ir3: Stop inserting nops during scheduling
Not necessary since nothing uses it anymore. This might have a slight
effect on spilling with multiple blocks, but no shader-db difference
because nothing spills.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
Connor Abbott [Mon, 8 Nov 2021 18:13:38 +0000 (19:13 +0100)]
ir3/postsched: Only prefer tex/sfu if they are soft-ready
Otherwise we schedule an SFU depending on a tex as soon as the tex is
scheduled, which is very much not what we want.
Note that sstall is helped more than nops are hurt, and the shaders with
the largest nop regressions also have sstall helped. Moreover, (sy) is
very much helped.
total nops in shared programs: 345482 -> 345986 (0.15%)
nops in affected programs: 5731 -> 6235 (8.79%)
helped: 15
HURT: 81
helped stats (abs) min: 1 max: 9 x̄: 3.27 x̃: 3
helped stats (rel) min: 0.50% max: 16.00% x̄: 8.55% x̃: 10.26%
HURT stats (abs) min: 1 max: 72 x̄: 6.83 x̃: 4
HURT stats (rel) min: 0.57% max: 400.00% x̄: 32.50% x̃: 13.16%
95% mean confidence interval for nops value: 3.34 7.16
95% mean confidence interval for nops %-change: 13.07% 39.10%
Nops are HURT.
total sstall in shared programs: 133804 -> 132381 (-1.06%)
sstall in affected programs: 4743 -> 3320 (-30.00%)
helped: 68
HURT: 24
helped stats (abs) min: 1 max: 153 x̄: 21.88 x̃: 8
helped stats (rel) min: 1.79% max: 100.00% x̄: 33.20% x̃: 28.00%
HURT stats (abs) min: 1 max: 11 x̄: 2.71 x̃: 2
HURT stats (rel) min: 1.02% max: 200.00% x̄: 17.73% x̃: 5.59%
95% mean confidence interval for sstall value: -22.05 -8.89
95% mean confidence interval for sstall %-change: -27.60% -12.22%
Sstall are helped.
total (ss) in shared programs: 35471 -> 35481 (0.03%)
(ss) in affected programs: 462 -> 472 (2.16%)
helped: 9
HURT: 15
helped stats (abs) min: 1 max: 2 x̄: 1.11 x̃: 1
helped stats (rel) min: 4.17% max: 33.33% x̄: 14.00% x̃: 7.69%
HURT stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1
HURT stats (rel) min: 1.19% max: 50.00% x̄: 12.27% x̃: 8.33%
95% mean confidence interval for (ss) value: -0.14 0.97
95% mean confidence interval for (ss) %-change: -5.11% 9.94%
Inconclusive result (value mean confidence interval includes 0).
total (sy) in shared programs: 13522 -> 13288 (-1.73%)
(sy) in affected programs: 422 -> 188 (-55.45%)
helped: 22
HURT: 1
helped stats (abs) min: 1 max: 21 x̄: 10.68 x̃: 10
helped stats (rel) min: 8.00% max: 94.44% x̄: 56.58% x̃: 56.94%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
95% mean confidence interval for (sy) value: -13.18 -7.17
95% mean confidence interval for (sy) %-change: -65.48% -40.59%
(sy) are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
Connor Abbott [Wed, 3 Nov 2021 17:06:17 +0000 (18:06 +0100)]
ir3/postsched: Rewrite delay handling
Analogous to the pre-RA scheduler. Unfortunately this time it's a bit
more involved because we have to correctly handle (rptN), which is
already relevant for swz. This means we need the index of the
destination register that conflicts with the source register, to handle
swz, and we need to expose that part of ir3_delay. But once that's done,
we can delete ir3_delay_calc_postra.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
Connor Abbott [Mon, 8 Nov 2021 16:20:39 +0000 (17:20 +0100)]
ir3/delay: Ignore earlier definitions to the same register
We have a situation in some skia shaders like:
add.f r0.x, ...
(rpt2)nop
mul.f ..., r0.x
sam (xyzw) r0.x, ...
rcp ..., r0.x
Notice that rcp uses the result of the sam instruction, not the add.f,
but we didn't keep track of which instructions kill the sources in
ir3_delay, so we'd add an extra nop, resulting in a disagreement between
ir3_delay and the scheduling graph. Since postsched is correct, fix
ir3_delay. This only results in some very slight shader-db changes but
keeps the next commit from changing things.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
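The idea of only considering the most recent writer of a register can be sketched with a toy instruction list. This is an illustration, not ir3_delay itself: the struct, the integer register numbering, and the function name are hypothetical.

```c
#include <assert.h>

#define TOY_NO_REG (-1)

/* Toy instruction: one destination and one source (TOY_NO_REG if unused). */
struct toy_instr {
   int dst;
   int src;
};

/* Walk backwards from the consumer at index `use` and return the index of
 * the most recent instruction that writes `reg`, or -1 if there is none.
 * Stopping at the first writer found is what ignores earlier definitions
 * that a later write has already killed. */
static int
toy_find_live_def(const struct toy_instr *instrs, int use, int reg)
{
   for (int i = use - 1; i >= 0; i--) {
      if (instrs[i].dst == reg)
         return i;
   }
   return -1;
}
```

Modeling the listing from the commit message as five entries (add.f, nop, mul.f, sam, rcp) with r0.x as register 0, the lookup for rcp returns the sam at index 3 rather than the add.f at index 0, so any delay would be computed against the sam only.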
Connor Abbott [Wed, 3 Nov 2021 17:00:51 +0000 (18:00 +0100)]
ir3/postsched: Handle sync dependencies better
We want to model soft dependencies, but because of how there's only a
single bit to wait on all of them, there may be unnecessary delays
inserted when a (sy)-consumer follows an unrelated (sy)-producer.
Previously there was some code to try to work around this, but we can
just model it directly using the sfu_delay and tex_delay cycle counts
that we have to maintain anyway and delete it.
This also gets rid of the calls to ir3_delay_postra with soft=true which
would be more complicated to handle in the next commit.
There is a functional change here: the idea of preferring fewer nops
over critical path length (max_delay) up to 3 nops is kept (and we
delete the TODO which is already sort-of resolved by it), but delays due
to (ss)/(sy) and nops are now treated equally, rather than always
preferring nops over syncs. So if our estimate indicates that scheduling
an (ss) consumer will result in a wait of one cycle and there's another
instruction that will require one nop, we will treat them as otherwise
equal and choose based on max_delay instead. This results in more
sstall, but the decrease in nops is much greater.
total nops in shared programs: 376613 -> 345482 (-8.27%)
nops in affected programs: 275483 -> 244352 (-11.30%)
helped: 3226
HURT: 110
helped stats (abs) min: 1 max: 78 x̄: 9.73 x̃: 7
helped stats (rel) min: 0.19% max: 100.00% x̄: 19.48% x̃: 13.68%
HURT stats (abs) min: 1 max: 16 x̄: 2.43 x̃: 2
HURT stats (rel) min: 0.00% max: 150.00% x̄: 13.34% x̃: 4.36%
95% mean confidence interval for nops value: -9.61 -9.06
95% mean confidence interval for nops %-change: -19.01% -17.78%
Nops are helped.
total sstall in shared programs: 126195 -> 133806 (6.03%)
sstall in affected programs: 79440 -> 87051 (9.58%)
helped: 300
HURT: 1922
helped stats (abs) min: 1 max: 15 x̄: 4.72 x̃: 4
helped stats (rel) min: 1.05% max: 100.00% x̄: 17.15% x̃: 14.55%
HURT stats (abs) min: 1 max: 29 x̄: 4.70 x̃: 4
HURT stats (rel) min: 0.00% max: 900.00% x̄: 25.38% x̃: 10.53%
95% mean confidence interval for sstall value: 3.22 3.63
95% mean confidence interval for sstall %-change: 17.50% 21.78%
Sstall are HURT.
total (ss) in shared programs: 35190 -> 35472 (0.80%)
(ss) in affected programs: 6433 -> 6715 (4.38%)
helped: 163
HURT: 401
helped stats (abs) min: 1 max: 2 x̄: 1.06 x̃: 1
helped stats (rel) min: 1.92% max: 33.33% x̄: 11.53% x̃: 10.00%
HURT stats (abs) min: 1 max: 3 x̄: 1.13 x̃: 1
HURT stats (rel) min: 1.56% max: 100.00% x̄: 15.33% x̃: 12.50%
95% mean confidence interval for (ss) value: 0.41 0.59
95% mean confidence interval for (ss) %-change: 6.22% 8.93%
(ss) are HURT.
total (sy) in shared programs: 13476 -> 13521 (0.33%)
(sy) in affected programs: 669 -> 714 (6.73%)
helped: 30
HURT: 78
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 4.00% max: 50.00% x̄: 21.22% x̃: 21.11%
HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1
HURT stats (rel) min: 3.45% max: 100.00% x̄: 31.93% x̃: 25.00%
95% mean confidence interval for (sy) value: 0.23 0.60
95% mean confidence interval for (sy) %-change: 11.19% 23.15%
(sy) are HURT.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
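The tie-breaking described in this commit can be approximated as follows. This is a simplified sketch, not the postsched code: the struct and function names are hypothetical, combining the two delays by taking the larger one is an assumption, preferring the larger max_delay on a tie is an assumption, and the 3-nop cutoff is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_candidate {
   unsigned nops;      /* nops needed before the instruction can issue */
   unsigned sync_wait; /* estimated cycles spent waiting on (ss)/(sy) */
   unsigned max_delay; /* critical path length below this node */
};

/* Assumption: nop cycles and sync waits overlap, so the effective delay
 * is the larger of the two. Neither source is weighted above the other. */
static unsigned
toy_delay(const struct toy_candidate *c)
{
   return c->nops > c->sync_wait ? c->nops : c->sync_wait;
}

/* Returns true if `a` should be scheduled in preference to `b`. */
static bool
toy_better(const struct toy_candidate *a, const struct toy_candidate *b)
{
   unsigned da = toy_delay(a), db = toy_delay(b);
   if (da != db)
      return da < db;
   /* Equal delay: fall back to the critical path estimate. */
   return a->max_delay > b->max_delay;
}
```

For the case in the message, a candidate with a one-cycle (ss) wait and a candidate needing one nop compare as equal on delay, so max_delay decides between them.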