Alyssa Rosenzweig [Tue, 17 Aug 2021 16:05:07 +0000 (16:05 +0000)]
panfrost: Cache number of users of a resource
This can be tracked efficiently with atomics, and reduces the places we
use the rsrc->track.users bitmap which has concurrency issues.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
Alyssa Rosenzweig [Tue, 17 Aug 2021 15:59:07 +0000 (15:59 +0000)]
panfrost: Switch resources from an array to a set
This will help us reduce shared state and simplify multithreading, at
the expense of additional CPU overhead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12528>
Mike Blumenkrantz [Wed, 9 Jun 2021 21:11:42 +0000 (17:11 -0400)]
zink: stop referencing framebuffers
this is a waste of cycles now that surfaces are accurately tracked;
no-attachment fbs are still deferred to avoid premature deletion
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12429>
Mike Blumenkrantz [Tue, 24 Aug 2021 15:28:22 +0000 (11:28 -0400)]
zink: defer deletion of no-attachment framebuffers
the ref on these is owned by the context, so defer deletion to avoid
premature destruction if the fb might be in use
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12429>
Alyssa Rosenzweig [Mon, 16 Aug 2021 23:55:50 +0000 (23:55 +0000)]
panfrost: Inline add_fbo_bos
Only used once, it's just complicating the batch cache interface.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Mon, 16 Aug 2021 23:53:07 +0000 (23:53 +0000)]
panfrost: Remove get_fresh_batch
Unused, and of dubious value.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Thu, 12 Aug 2021 23:25:49 +0000 (23:25 +0000)]
panfrost: Move bo->label assignment into the lock
We already took the lock, we just unlocked too early. Since the label is
reset in the BO cache, this is racy. Minimal impact in practice but is
still /wrong/ and caught by helgrind.
Fixes: 3fa1f93dace ("panfrost: Label all BOs in userspace")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Thu, 12 Aug 2021 23:20:44 +0000 (23:20 +0000)]
panfrost: Don't use ralloc for resources
ralloc is not thread safe, so we cannot use a pipe_screen as a ralloc
context unless we lock the screen. The allocation patterns for resources
are trivial, so just use malloc/calloc/free directly instead of ralloc.
This fixes a segfault in:
dEQP-EGL.functional.sharing.gles2.multithread.random.images.copytexsubimage2d.1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Thu, 12 Aug 2021 21:17:29 +0000 (21:17 +0000)]
panfrost: Simplify get_fresh_batch_for_fbo
Makes the code easier to read, too.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Thu, 12 Aug 2021 21:10:55 +0000 (21:10 +0000)]
panfrost: Remove null check in batch_cleanup
Shouldn't happen.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Thu, 12 Aug 2021 20:17:48 +0000 (20:17 +0000)]
panfrost: Protect the variants array with a lock
Without a lock, two threads may bind the same shader CSO simultaneously,
allocate the same variant simultaneously, and then race each other in
the compiler. This manifests in various ways, most commonly failing the
assertion that UBO pushing has only run once. The simple_mtx_t solution
is used in Iris. Fixes the crash in:
dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12525>
Alyssa Rosenzweig [Tue, 24 Aug 2021 00:12:54 +0000 (20:12 -0400)]
panfrost/ci: Don't skip matrix inverse tests
Older versions of these tests were buggy and failed on Bifrost. The test
bug has been resolved upstream, but the skip list was not updated when
dEQP was uprevved with the fix. Run the tests.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12313>
Alyssa Rosenzweig [Tue, 10 Aug 2021 18:22:02 +0000 (14:22 -0400)]
panfrost/ci: Switch to suite support
Use the new deqp-runner suite support to combine our dEQP-GLES2,
dEQP-GLES3, and dEQP-GLES31 jobs into a single job. This simplifies load
balancing, enabling us to expand our test coverage without impacting
wall clock time.
With the new infrastructure in place, we add KHR-GLES* jobs for Mali
G52. This would have caught some recent regressions. Once we hit
conformance it's essential we remain conformant.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12313>
Rhys Perry [Wed, 18 Aug 2021 09:52:20 +0000 (10:52 +0100)]
nir/gcm: pin some instructions which require uniform sources
fossil-db (Sienna Cichlid, GCM enabled):
Totals from 6192 (4.12% of 150170) affected shaders:
VGPRs: 548392 -> 542040 (-1.16%)
SpillSGPRs: 3702 -> 3990 (+7.78%); split: -0.54%, +8.32%
CodeSize:
62418488 ->
62481516 (+0.10%); split: -0.07%, +0.17%
MaxWaves: 70582 -> 71718 (+1.61%)
Instrs:
11768497 ->
11795079 (+0.23%); split: -0.07%, +0.30%
Latency:
445891848 ->
523561297 (+17.42%); split: -0.07%, +17.49%
InvThroughput:
115675481 ->
121494913 (+5.03%); split: -0.09%, +5.12%
VClause: 164914 -> 164934 (+0.01%); split: -0.05%, +0.06%
SClause: 405991 -> 395302 (-2.63%); split: -2.64%, +0.00%
Copies: 907216 -> 926429 (+2.12%); split: -1.11%, +3.23%
Branches: 456373 -> 457478 (+0.24%); split: -0.13%, +0.38%
PreSGPRs: 648030 -> 642953 (-0.78%); split: -0.88%, +0.10%
PreVGPRs: 522425 -> 516355 (-1.16%); split: -1.16%, +0.00%
Seems to affect Detroit: Become Human and Cyberpunk 2077. The Cyberpunk
2077 changes look like a fixed bug. At least some of the Detroit: Become
Human changes could probably be removed with better divergence analysis.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>
Rhys Perry [Wed, 18 Aug 2021 10:02:21 +0000 (11:02 +0100)]
nir: consider push constant loads as always dynamically uniform
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>
Daniel Schürmann [Wed, 24 Jun 2020 16:47:09 +0000 (17:47 +0100)]
radv: call nir_lower_flrp() after the first radv_optimize_nir()
instead of inside the optimization loop
Totals from 2504 (1.67% of 150170) affected shaders: (GFX10.3)
VGPRs: 162592 -> 162416 (-0.11%); split: -0.12%, +0.01%
CodeSize:
18399756 ->
18383552 (-0.09%); split: -0.10%, +0.01%
MaxWaves: 42654 -> 42748 (+0.22%)
Instrs:
3499404 ->
3497075 (-0.07%); split: -0.08%, +0.01%
Latency:
87087238 ->
87064270 (-0.03%); split: -0.06%, +0.03%
InvThroughput:
21159621 ->
21150546 (-0.04%); split: -0.05%, +0.01%
VClause: 56653 -> 56667 (+0.02%); split: -0.00%, +0.03%
Copies: 226332 -> 226423 (+0.04%); split: -0.15%, +0.19%
Branches: 110027 -> 110025 (-0.00%); split: -0.05%, +0.04%
PreSGPRs: 168087 -> 168076 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 160814 -> 160705 (-0.07%)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
Daniel Schürmann [Fri, 23 Jul 2021 10:16:12 +0000 (12:16 +0200)]
nir/opt_algebraic: optimize flrp(fadd, fadd, x) only if fadd are used_once
Totals from 201 (0.13% of 150170) affected shaders: (GFX10.3)
VGPRs: 13880 -> 13856 (-0.17%)
CodeSize:
1517328 ->
1518124 (+0.05%); split: -0.04%, +0.10%
MaxWaves: 3184 -> 3192 (+0.25%)
Instrs: 285487 -> 285569 (+0.03%); split: -0.06%, +0.08%
Latency:
7774066 ->
7780877 (+0.09%); split: -0.10%, +0.19%
InvThroughput:
1936341 ->
1935287 (-0.05%); split: -0.07%, +0.02%
SClause: 11446 -> 11448 (+0.02%); split: -0.01%, +0.03%
Copies: 17500 -> 17506 (+0.03%); split: -0.51%, +0.55%
Branches: 8174 -> 8180 (+0.07%); split: -0.13%, +0.21%
PreVGPRs: 12507 -> 12427 (-0.64%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
Daniel Schürmann [Fri, 23 Jul 2021 10:15:57 +0000 (12:15 +0200)]
nir/loop_analyze: consider instruction cost of nir_op_flrp
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
Chia-I Wu [Fri, 20 Aug 2021 22:20:58 +0000 (15:20 -0700)]
venus: use uint32_t in vn_ring_submit
And in vn_ring_write_buffer as well, to fix the assert in
vn_ring_write_buffer.
The ring code uses 32-bit unsigned integers and relies on that their
overflow/underflow behavior is well-defined. When ring->shared.head is
about to overflow and ring->cur has overflowed, this expression
ring->cur + size - vn_ring_load_head(ring)
gives an incorrect result when size is 64-bit.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12494>
Marek Olšák [Thu, 19 Aug 2021 00:16:38 +0000 (20:16 -0400)]
glthread: implement glGetUniformLocation without syncing
We already have the infrastructure for querying shader program properties
without syncing. This just uses it. _mesa_error_glthread_safe sets a GL
error from the producer thread.
This decreases CPU overhead for viewperf/snx.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12490>
Marek Olšák [Fri, 20 Aug 2021 02:51:11 +0000 (22:51 -0400)]
gallium: change pipe_draw_info::mode to uint8_t on MSVC to make it 1 byte large
needed by u_threaded_context
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12480>
Marek Olšák [Fri, 20 Aug 2021 14:46:54 +0000 (10:46 -0400)]
gallium: use a packed enum to make pipe_prim_mode 1-byte large with __GNUC__
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12480>
Mike Blumenkrantz [Thu, 3 Jun 2021 19:28:44 +0000 (15:28 -0400)]
zink: only force all buffer rebinds if rebinds exist on other contexts
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Mike Blumenkrantz [Thu, 3 Jun 2021 19:20:34 +0000 (15:20 -0400)]
zink: rebind all buffers on replacement
in theory it's possible to trigger cases where rebinds aren't based on the
current context, so ensure that (very unlikely) case is handled
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Mike Blumenkrantz [Thu, 3 Jun 2021 19:31:48 +0000 (15:31 -0400)]
zink: count streamout rebinds when doing buffer rebinds
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Mike Blumenkrantz [Thu, 3 Jun 2021 19:07:58 +0000 (15:07 -0400)]
zink: add bind counts for so bindings
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Mike Blumenkrantz [Thu, 3 Jun 2021 18:55:20 +0000 (14:55 -0400)]
zink: split out buffer rebinds to helper functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Mike Blumenkrantz [Thu, 3 Jun 2021 19:09:24 +0000 (15:09 -0400)]
zink: make descriptor update functions return the updated resource
convenience++
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Mike Blumenkrantz [Fri, 14 May 2021 23:07:27 +0000 (19:07 -0400)]
zink: clear out all ubo rebinds first if they exist
these are the second most common rebind, so iterate them all first
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12425>
Rhys Perry [Thu, 19 Aug 2021 09:41:07 +0000 (10:41 +0100)]
radv: use nir_vector_insert_imm in lower_intrinsics
This creates a single nir_op_vecn instead of a nir_op_vecn and several
copies.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12469>
Rhys Perry [Thu, 19 Aug 2021 09:40:17 +0000 (10:40 +0100)]
nir/lower_io: use nir_vector_insert_imm()
This creates a single nir_op_vecn instead of a nir_op_vecn and several
copies.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12469>
Guilherme Gallo [Mon, 23 Aug 2021 13:30:37 +0000 (10:30 -0300)]
gitlab-ci: Fix octopus device type and tag
* This month, the octopus device was renamed on Collabora's LAVA lab, so
we need to update it to get it working again on Mesa CI.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12507>
Samuel Pitoiset [Tue, 24 Aug 2021 06:04:06 +0000 (08:04 +0200)]
Revert "nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)"
This is wrong for negative values.
This reverts commit
07cd30ca293d1eb6980f69f330f9d182652cf902.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12515>
Erik Faye-Lund [Mon, 9 Aug 2021 06:40:42 +0000 (08:40 +0200)]
lavapipe: fix reported subpixel precision for lines
We have no reason to report a subpixel precision of 4 for lines; in fact
LLVMpipe uses 8 subpixel bits for lines, similar to other primitives.
But let's use the pipe-cap for this instead of hard-coding it.
Fixes: 9fbf6b2abff ("lavapipe: implement VK_EXT_line_rasterization")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12277>
Vinson Lee [Mon, 16 Aug 2021 07:47:26 +0000 (00:47 -0700)]
broadcom/compiler: Fix qpu.flags.muf typo.
Fix defect reported by Coverity Scan.
Same on both sides (CONSTANT_EXPRESSION_RESULT)
pointless_expression: The expression inst->qpu.flags.auf !=
V3D_QPU_UF_NONE || inst->qpu.flags.auf != V3D_QPU_UF_NONE does not
accomplish anything because it evaluates to either of its identical
operands, inst->qpu.flags.auf != V3D_QPU_UF_NONE.
Fixes: 3f2c54a27ff ("broadcom/compiler: rewrite partial update liveness tracking")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12385>
Erik Faye-Lund [Mon, 23 Aug 2021 14:07:47 +0000 (16:07 +0200)]
llvmpipe: improve polygon-offset precision
This performs the polygon offset addition after interpolation, which
prevents floating-point cancellation issues completely.
This does mean that we have to perform a single floating-point addition
more per fragment than before, unless we also want to spend a bit in
the fragment-shader variant key to avoid this.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12442>
Erik Faye-Lund [Mon, 23 Aug 2021 13:52:46 +0000 (15:52 +0200)]
llvmpipe: split coefficient calculation and store
This will be used for some underhanded smuggling of values in the next
commit.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12442>
Erik Faye-Lund [Wed, 18 Aug 2021 09:00:33 +0000 (11:00 +0200)]
llvmpipe: clamp z to 0..1 range when using polygon offset
The OpenGL 4.6 compatibility spec, section 14.6.5 (Depth Offset) says
the following:
> For fixed-point depth buffers, fragment depth values are always
> limited to the range [0,1] by clamping after offset addition is
> performed. Fragment depth values are clamped even when the depth
> buffer uses a floating-point representation.
So we need to properly clamp the result here.
This fixes the following dEQP failures, that the CI has missed:
- dEQP-GLES3.functional.polygon_offset.default_result_depth_clamp
- dEQP-GLES3.functional.polygon_offset.default_factor_1_slope
- dEQP-GLES3.functional.polygon_offset.fixed24_result_depth_clamp
- dEQP-GLES3.functional.polygon_offset.fixed24_factor_1_slope
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12442>
Dave Airlie [Tue, 24 Aug 2021 06:16:59 +0000 (02:16 -0400)]
crocus: copy views before adjusting
The current code overwrote the original view which meant if we
had to reemit a surface the second emit would be wrong.
This fixes cubemaps on gm45 and maybe some issues with 3D textures
elsewhere.
Fixes: f3630548f1da ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12514>
Vinson Lee [Mon, 9 Aug 2021 22:48:25 +0000 (15:48 -0700)]
freedreno: Require C++17.
Commit
3a772be026c ("freedreno: Add perfetto renderpass support")
uses C++17 init-statement feature.
GCC
../src/gallium/drivers/freedreno/freedreno_perfetto.cc: In lambda function:
../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: init-statement in selection statements only available with ‘-std=c++17’ or ‘-std=gnu++17’
148 | if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
| ^~~~
Clang
../src/gallium/drivers/freedreno/freedreno_perfetto.cc:148:11: warning: 'if' initialization statements are a C++17 extension [-Wc++17-extensions]
if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
^
Intel C++ Compiler
../src/gallium/drivers/freedreno/freedreno_perfetto.cc(148): error: expected a ")"
if (auto state = tctx.GetIncrementalState(); state->was_cleared) {
^
Fixes: 3a772be026c ("freedreno: Add perfetto renderpass support")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5193
Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12293>
Jason Ekstrand [Fri, 5 Feb 2021 01:36:16 +0000 (19:36 -0600)]
intel/compiler: Add unified barrier support for CS
Program CS barrier message fields for producers/consumers.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
Jordan Justen [Wed, 2 Sep 2020 22:07:02 +0000 (15:07 -0700)]
intel/compiler: Add unified barrier support for TCS
Program the producers/consumer fields for TCS Barrier messages.
Producer and consumer fields are set to number of TCS threads.
Ref: Bspec 54006 for Barrier Data Payload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
Jordan Justen [Wed, 2 Sep 2020 21:59:29 +0000 (14:59 -0700)]
intel/compiler: Regroup TCS barrier code paths
Rearrange if/else fragments to unify case for Gen11 or later
platforms. This will help the code look cleaner for adding
unified barrier support to TCS.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
Alyssa Rosenzweig [Mon, 23 Aug 2021 16:10:53 +0000 (12:10 -0400)]
panfrost: Rip out primconvert code
This is handled in common Gallium code if we set the appropriate CAP.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12509>
Alyssa Rosenzweig [Tue, 24 Aug 2021 00:18:25 +0000 (20:18 -0400)]
panfrost: Fix NULL dereference in allowlist code
If a user attempts to run Panfrost on an unsupported GPU (e.g. Mali
T604), Panfrost will refuse to load and will destroy the screen
immediately, allowing for a graceful fallback to a software rasterizer.
However, the screen destroy code calls a screen_destroy function in the
GenXML vtbl -- and this function is still NULL when the allowlist is
checked. This manifests as crashes on unsuported GPUs.
Issue tracked down with Icecream95's mad Ghidra skills.
Closes: #5269
Fixes: 88dc4db6be7 ("panfrost: Init/destroy blitter from per-gen file")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Icecream95 <ixn@disroot.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12512>
Nanley Chery [Wed, 21 Jul 2021 23:50:32 +0000 (16:50 -0700)]
intel: Parse INTEL_NO_HW for devinfo construction
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
Nanley Chery [Wed, 21 Jul 2021 21:37:04 +0000 (14:37 -0700)]
intel: Use env_var_as_boolean for INTEL_NO_HW
The prior method of checking the result of getenv() for NULL would cause
the feature to be enabled for INTEL_NO_HW=0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
Alyssa Rosenzweig [Mon, 23 Aug 2021 18:06:41 +0000 (14:06 -0400)]
panfrost: Port v5 blend shader issue to blitter
This is a presumed erratum workaround. Fixes INSTR_INVALID_PC faults on
some draw_buffers_indexed.* cases on Midgard, where a blend shader is
required to pack RT n > 0.
Backport the workaround from the GL driver. The helper is now in common
code for panvk to use as well; it has the same bug.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Mon, 23 Aug 2021 17:42:23 +0000 (13:42 -0400)]
panfrost: Zero initialize blend_shaders
Fixes an invalid read caught by valgrind when there is a hole in the
valid render target mask:
==6749== Conditional jump or move depends on uninitialised value(s)
==6749== at 0x5E88EC0: panfrost_prepare_fs_state (pan_cmdstream.c:417)
==6749== by 0x5E88EC0: panfrost_emit_frag_shader (pan_cmdstream.c:501)
==6749== by 0x5E88EC0: panfrost_emit_frag_shader_meta (pan_cmdstream.c:573)
==6749== by 0x5E88EC0: panfrost_update_state_fs (pan_cmdstream.c:2593)
==6749== by 0x5E8B0BF: panfrost_direct_draw (pan_cmdstream.c:2839)
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Fixes: a124c47b9f9 ("panfrost: Fix NULL derefs in pan_cmdstream.c")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 17:10:35 +0000 (13:10 -0400)]
pan/mdg: Handle swapped 565 and
1010102 unorm
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 16:45:31 +0000 (12:45 -0400)]
pan/lower_framebuffer: Don't open-code pan_unpacked_type_for_format
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 16:40:29 +0000 (12:40 -0400)]
pan/lower_framebuffer: Don't open-code pad_vec4
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 16:38:19 +0000 (12:38 -0400)]
pan/lower_framebuffer: Don't treat UNORM 4 special
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 16:37:37 +0000 (12:37 -0400)]
pan/lower_framebuffer: Unify UNORM handling
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 16:26:49 +0000 (12:26 -0400)]
pan/lower_framebuffer: Use fmul_imm
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 16:18:26 +0000 (12:18 -0400)]
pan/lower_framebuffer: Don't replicate so much
We need to replicate to deal with multisampling, but not otherwise.
Simplify the logic substantially.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Fri, 11 Jun 2021 22:48:09 +0000 (18:48 -0400)]
pan/mdg: Insert moves before writeout when needed
Otherwise we end up accessing overwritten registers. Fixes
dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_enable_buffer_enable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Tue, 15 Jun 2021 17:15:15 +0000 (13:15 -0400)]
panfrost: Delete unpacks for blendable formats
Unnecessary.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Fri, 11 Jun 2021 22:02:11 +0000 (18:02 -0400)]
panfrost: Use blendable check for tib read check
These are the same! Either you're blendable and can use f32/f16
conversion, or you're raw and you can only get raw. It's that simple!
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Fri, 11 Jun 2021 21:27:37 +0000 (17:27 -0400)]
panfrost: Fix UNORM 10 sizes
Fixes: 56047fb64d7 ("panfrost: Fix UNORM 16 rendering")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Alyssa Rosenzweig [Mon, 23 Aug 2021 18:53:00 +0000 (14:53 -0400)]
panfrost: Remove unneeded quirks from T760
Will cause trouble later in the series when we start garbage collecting
unneeded code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Boris Brezillon [Fri, 4 Jun 2021 12:41:06 +0000 (14:41 +0200)]
panfrost: Add explicit padding to pan_blend_shader_key
So the hash function doesn't end up hashing uninitialized values.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reported-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Fixes: bbff09b9521f ("panfrost: Move the blend shader cache at the device level")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Tomeu Vizoso [Tue, 6 Jul 2021 08:31:38 +0000 (10:31 +0200)]
panfrost: Add padding to pan_blit_blend_shader_key
So the hashtable helpers know the correct size of the struct.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11383>
Kenneth Graunke [Sun, 22 Aug 2021 00:36:34 +0000 (17:36 -0700)]
iris: Mark the aux table buffers with EXEC_OBJECT_CAPTURE.
Having these could be useful when tracking down GPU hangs.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12420>
Kenneth Graunke [Mon, 16 Aug 2021 19:54:47 +0000 (12:54 -0700)]
iris: Bypass the BO cache when allocating buffers for aux map tables
When freeing a buffer, we may return a non-idle buffer to the cache,
which means we cannot unmap aux entries at that time. Instead, we
defer unmapping the stale aux entry until we reuse a BO from the cache.
Unfortunately, this can lead to a recursive locking issue:
1. intel_aux_map_add_mapping wants to set up a new aux entry
It takes the intel_aux_map_context::mutex lock, then calls:
add_mapping -> get_aux_entry -> add_sub_table -> add_buffer ->
intel_aux_map_buffer_alloc -> iris_bo_alloc
2. iris_bo_alloc tries to allocate a BO from the cache, doing:
alloc_bo_from_cache -> intel_aux_map_unmap_range ->
intel_aux_unmap_range
...which then tries to take the intel_aux_map_context::mutex lock.
But it is already locked.
One solution would be to rework the aux map handling code to allocate
BOs without holding its lock, but that looks to be painful. Another
is to make the lock recursive, but we try and avoid that. A third
option wuold be to add a BO_ALLOC flag that makes alloc_bo_from_cache
skip any buffers with aux_map_address != 0 so we don't have to unmap,
making the less cache effective but fixing the recursive lock.
A fourth option is to simply bypass the BO cache altogether for the
buffers that hold the aux map itself. Allocating new BOs for the aux
tables should be relatively rare, so there's probably not a lot of
benefit in using the BO cache.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5191
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12420>
Yiwei Zhang [Sat, 21 Aug 2021 23:44:47 +0000 (23:44 +0000)]
venus: scrub ignored fields of pipeline info when rasterization is disable
v2: use vk_alloc instead of vk_zalloc because of full memcpy
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12499>
Yiwei Zhang [Sat, 21 Aug 2021 22:21:17 +0000 (22:21 +0000)]
venus: fix all missing vn_object_base_fini
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12498>
Matt Turner [Sat, 21 Aug 2021 01:48:45 +0000 (18:48 -0700)]
tu: Enable VK_KHR_uniform_buffer_standard_layout
This extension relaxes the alignment requirements to allow the GL std430
layout to be used. freedreno/ir3 already supports this (via
PIPE_CAP_LOAD_CONSTBUF).
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12495>
Samuel Pitoiset [Thu, 1 Jul 2021 06:41:04 +0000 (08:41 +0200)]
nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)
Found with Cyberpunk 2077.
fossils-db (GFX10.3):
Totals from 128 (2.34% of 5465) affected shaders:
CodeSize: 769720 -> 767656 (-0.27%); split: -0.27%, +0.00%
Instrs: 145748 -> 145229 (-0.36%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11604>
Dave Airlie [Mon, 23 Aug 2021 03:00:56 +0000 (13:00 +1000)]
vulkan/wsi/sw: wait for image fence before submitting to queue
With hw devices, when you submit a present, implicit sync will
make sure the work submitted to the gpu on the client will end
up happening before the present work submitted on the server.
However with sw paths there is no real GPU, the lavapipe fake
GPU thread is client side only and presenting is done directly
from the pixmap (or later shared pixmap). In order for this to
make sense the wsi common code should wait for the fence on the
image before queueing the submit to the server so that all
client works has been flushed to the pixmap before the copy or
present operation is submitted.
Fixes: 8004fa9c9501 ("vulkan/wsi: add sw support. (v2)")
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12502>
Rhys Perry [Thu, 5 Aug 2021 10:42:39 +0000 (11:42 +0100)]
aco/scheduler: allow moving down VMEM stores to below VMEM loads
fossil-db (Vega10):
Totals from 93 (0.06% of 150305) affected shaders:
SGPRs: 4832 -> 4768 (-1.32%)
VGPRs: 4084 -> 4144 (+1.47%)
CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47%
MaxWaves: 589 -> 580 (-1.53%)
Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61%
Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10%
InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80%
VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57%
SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16%
Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97%
fossil-db (Sienna Cichlid):
Totals from 88 (0.06% of 150170) affected shaders:
VGPRs: 3840 -> 3872 (+0.83%)
CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23%
Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35%
Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61%
InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35%
VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81%
SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12%
Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211>
Erik Faye-Lund [Wed, 9 Jun 2021 12:15:54 +0000 (14:15 +0200)]
llvmpipe: use preferred attribute interpolation for wide lines
When rasterizing legacy-lines, OpenGL defines the width as being an
extrusion along the minor axis, repeating varyings. While the spec
*does* allow for an alternative method that matches our current results,
the OpenGL ES CTS doesn't allow these results even if OpenGL ES has the
same wording of an alternative method.
This is technically speaking a bug in the OpenGL ES CTS, but it seems
like nobody else is using the alternative formulation, at least not
while passing the OpenGL ES CTS. On top of this, the OpenGL specification
explicitly lists the extrusion results as the preferred method.
So it seems like a good idea for us to do this the way the OpenGL
specification prefers regardless; it's going to give less surprising
results to applications, and it's helping us pass some tests.
This math to set these up would "trivially" be:
dx = (dx * dx + dy * dy) / dx
dy = 0
and:
dy = (dx * dx + dy * dy) / dy
dx = 0
...but since we've already calculated dxdy, we can reformulate this to
save a division.
This fixes the following dEQP test-cases:
- dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide
- dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide
- dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide
- dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide
- dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide
- dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide
- dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.interpolation.lines_wide
- dEQP-GLES3.functional.rasterization.fbo.texture_2d.interpolation.lines_wide
- dEQP-GLES3.functional.rasterization.interpolation.basic.line_loop_wide
- dEQP-GLES3.functional.rasterization.interpolation.basic.line_strip_wide
- dEQP-GLES3.functional.rasterization.interpolation.basic.lines_wide
- dEQP-GLES3.functional.rasterization.interpolation.projected.line_loop_wide
- dEQP-GLES3.functional.rasterization.interpolation.projected.line_strip_wide
- dEQP-GLES3.functional.rasterization.interpolation.projected.lines_wide
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11315>
Rhys Perry [Thu, 29 Jul 2021 15:55:51 +0000 (16:55 +0100)]
aco: remove label_extract if the extract is used by a non-VALU
If an extract is used by a non-VALU instruction, it can't be applied to
all instructions, so it's not beneficial to try to apply it.
This check isn't needed because can_apply_extract()/can_use_SDWA() should
already handle non-VALU instructions.
fossil-db (Sienna Cichlid):
Totals from 1020 (0.68% of 150170) affected shaders:
SpillSGPRs: 1577 -> 1571 (-0.38%)
CodeSize:
7863668 ->
7858336 (-0.07%); split: -0.07%, +0.00%
Instrs:
1431583 ->
1431083 (-0.03%); split: -0.04%, +0.01%
Latency:
25891250 ->
25890916 (-0.00%); split: -0.01%, +0.01%
InvThroughput:
7248683 ->
7248655 (-0.00%); split: -0.01%, +0.01%
SClause: 49072 -> 49071 (-0.00%)
Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06%
Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02%
PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01%
fossil-db (Polaris10):
Totals from 654 (0.43% of 151696) affected shaders:
CodeSize:
5814552 ->
5811568 (-0.05%); split: -0.05%, +0.00%
Instrs:
1105783 ->
1105049 (-0.07%); split: -0.07%, +0.00%
Latency:
20261458 ->
20259744 (-0.01%); split: -0.01%, +0.00%
InvThroughput:
9011785 ->
9011749 (-0.00%); split: -0.00%, +0.00%
Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00%
PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01%
PreVGPRs: 43813 -> 43809 (-0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212>
Samuel Pitoiset [Thu, 19 Aug 2021 07:04:46 +0000 (09:04 +0200)]
radv: allocate shaders to 32-bit address to skip PGM_HI
This reduces the number of emitted registers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466>
Samuel Pitoiset [Thu, 19 Aug 2021 06:40:19 +0000 (08:40 +0200)]
radv: don't use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x
Seems it make the perf worse.
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466>
Daniel Schürmann [Wed, 18 Aug 2021 19:42:15 +0000 (21:42 +0200)]
aco: add more validation rules for SDWA operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Fri, 13 Aug 2021 12:39:29 +0000 (14:39 +0200)]
aco/opcodes: remove definition_size[]
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Fri, 13 Aug 2021 12:38:40 +0000 (14:38 +0200)]
aco/validate: simplify get_subdword_bytes_written()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Thu, 12 Aug 2021 16:04:01 +0000 (18:04 +0200)]
aco/ra: refactor subdword operand stride
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Fri, 13 Aug 2021 10:54:59 +0000 (12:54 +0200)]
aco/ra: refactor subdword definition info
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Wed, 18 Aug 2021 16:56:59 +0000 (18:56 +0200)]
aco: add instr_is_16bit() helper function
to indicate whether some instruction writes partial registers, only.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Wed, 7 Jul 2021 09:37:49 +0000 (11:37 +0200)]
aco: use VOPC_SDWA on GFX9+
Totals from 5138 (3.42% of 150170) affected shaders: (GFX10.3)
VGPRs: 409520 -> 409416 (-0.03%); split: -0.03%, +0.00%
CodeSize:
43056360 ->
43035696 (-0.05%); split: -0.06%, +0.02%
MaxWaves: 69296 -> 69310 (+0.02%)
Instrs:
8161016 ->
8153365 (-0.09%); split: -0.10%, +0.01%
Latency:
109397002 ->
109756208 (+0.33%); split: -0.05%, +0.38%
InvThroughput:
23238920 ->
23310761 (+0.31%); split: -0.11%, +0.42%
VClause: 135141 -> 135100 (-0.03%); split: -0.05%, +0.02%
SClause: 349511 -> 349489 (-0.01%); split: -0.01%, +0.00%
Copies: 388107 -> 387754 (-0.09%); split: -0.48%, +0.38%
Branches: 184629 -> 184503 (-0.07%); split: -0.08%, +0.01%
PreSGPRs: 258807 -> 258839 (+0.01%)
PreVGPRs: 372561 -> 372184 (-0.10%); split: -0.10%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Daniel Schürmann [Fri, 13 Aug 2021 13:07:16 +0000 (15:07 +0200)]
aco/print_ir: fix printing of VOPC_SDWA definitions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364>
Rhys Perry [Fri, 20 Aug 2021 13:03:29 +0000 (14:03 +0100)]
aco: fix vectorized 16-bit load_input/load_interpolated_input
Seems we haven't encountered this before because
nir_lower_io_to_scalar_early usually scalarizes this.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12486>
Samuel Pitoiset [Wed, 11 Aug 2021 11:35:13 +0000 (13:35 +0200)]
radv: remove useless DISABLE_{ZMASK,SMEM}_EXPCLEAR_OPTIMIZATION state
This has no effect without enabling EXPCLEAR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12326>
Samuel Pitoiset [Wed, 11 Aug 2021 11:33:30 +0000 (13:33 +0200)]
radv: remove unused fast depth-stencil gfx clear path with expclear
This has never been used because it requires to know the previous
clear values which is not really possible in Vulkan.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12326>
Michel Zou [Fri, 20 Aug 2021 08:42:06 +0000 (10:42 +0200)]
lavapipe: fix missing VKAPI_CALL attribute
Fixes build on mingw
Fixes: c198adf7
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12484>
Ian Romanick [Mon, 16 Aug 2021 18:20:56 +0000 (11:20 -0700)]
util/xmlconfig: Test values set via the environment
driconf options can also be set via environment variables. This is a
simple touch-test of that feature.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12477>
Ian Romanick [Mon, 16 Aug 2021 18:20:00 +0000 (11:20 -0700)]
util/xmlconfig: Make unit tests more resilient against user env settings
Before this, setting 'vblank_mode=0' in the environment would cause a
unit test to fail.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12477>
Marek Olšák [Sun, 7 Mar 2021 15:28:52 +0000 (10:28 -0500)]
frontend/dri: add environment variable DRI_NO_MSAA for performance comparisons
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12491>
Marek Olšák [Fri, 13 Aug 2021 07:24:38 +0000 (03:24 -0400)]
radeonsi: remove vertices_per_patch parameter from draw-related functions
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
Marek Olšák [Fri, 13 Aug 2021 06:29:56 +0000 (02:29 -0400)]
gallium: remove vertices_per_patch, add pipe_context::set_patch_vertices
We would like draw-only display lists to have immutable draw info and
this is the only GL non-draw state in pipe_draw_info (not counting
view_mask).
It also allows removing some code from draw_vbo for tessellation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
Connor Abbott [Fri, 20 Aug 2021 15:24:45 +0000 (17:24 +0200)]
tu: Remove some stale bypass xfails
These were fixed by
09e0b29bb63f60231b26b4c8f02eadb68e51b623 which was
missed during the suite conversion. For the remaining still-valid fail,
there is a CTS patch in progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12488>
Rob Clark [Fri, 20 Aug 2021 20:46:03 +0000 (13:46 -0700)]
freedreno/crashdec: Quiet spammy print in query mode
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>
Rob Clark [Fri, 20 Aug 2021 18:13:44 +0000 (11:13 -0700)]
freedreno/crashdec: Decode full RB in verbose mode
This is useful to get a better view of previous commands in the
ringbuffer.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>
Rob Clark [Fri, 20 Aug 2021 17:48:57 +0000 (10:48 -0700)]
freedreno/cffdec: Fix gpuaddr comparision
gpuaddrs are 64b, and they can be more than 2^^32 apart.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>
Rob Clark [Fri, 20 Aug 2021 17:48:28 +0000 (10:48 -0700)]
freedreno/cffdec: Fix indentation
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12489>
Icecream95 [Sat, 14 Aug 2021 11:36:27 +0000 (23:36 +1200)]
pan/bi: Extend bi_add_nop_for_atest for tilebuffer loads
Fixes framebuffer_fetch and blend_equation_advanced dEQP tests on v6.
v2: Use clause dependencies rather than comparing the message type
v3: Shift the BIFROST_SLOT_* constants before using them as a mask
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12375>
Matt Turner [Fri, 20 Aug 2021 03:12:57 +0000 (20:12 -0700)]
tu: Free device->bo_idx and device->bo_list on init failure
Two related changes:
- in tu_device.c:tu_CreateDevice we need to free both pointers in the
teardown path after tu_bo_finish(global_bo), which uses the pointers.
They are allocated in the first call to tu_bo_init(), which happens
when global_bo is allocated.
- in tu_drm.c:tu_bo_init we need to free bo_list if the bo_idx
allocation fails. Convert to the goto teardown pattern as well.
Fixes the following dEQP-VK tests:
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail
dEQP-VK.api.object_management.alloc_callback_fail.device
dEQP-VK.api.object_management.alloc_callback_fail.device_group
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12481>
Alyssa Rosenzweig [Thu, 19 Aug 2021 22:09:32 +0000 (22:09 +0000)]
pan/bi: Use CLPER_V6 on Mali G31
Apparently, CLPER_V7 is missing from Mali G31, but CLPER_V6 works. Fixes
INSTR_INVALID_ENC faults and failures in
dEQP-GLES3.functional.shaders.derivate.* on Dvalin.
Technically not an errata but an implementation difference. I suspect
Mali G51 will need this as well, should we ever allowlist it.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12478>