Jason Ekstrand [Wed, 11 Aug 2021 02:05:30 +0000 (21:05 -0500)]
meson/glsl: Only run GLSL tests if can_run_host_binaries()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12308>
Jason Ekstrand [Tue, 10 Aug 2021 17:43:00 +0000 (12:43 -0500)]
meson: Intel drivers don't require expat on Android
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12308>
Jason Ekstrand [Tue, 10 Aug 2021 22:37:31 +0000 (17:37 -0500)]
meson/intel: Don't build genxml tests on Android
They require expat which we don't have on Android.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12308>
Ilia Mirkin [Wed, 11 Aug 2021 01:02:43 +0000 (21:02 -0400)]
st/mesa: fix pbo download store image type
There's generally not too big of a difference between 1D (default) and
buffer, but can't hurt to be accurate.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12319>
Paulo Zanoni [Mon, 2 Aug 2021 20:41:46 +0000 (13:41 -0700)]
iris: use add_bo_to_batch() when adding batch->bo
Again, we don't need all the dependency checking, seqno incrementing
and duplicate tracking for batch->bo. Just use the unchecked version.
This commit is not particularly significant since it really just saves
us a check in the iris_use_pinned_bo() hot path, but since we already
have the helper function, why not?
v2:
- (turns out the answer to "why not?" is because the patch had a bug)
- Call ensure_exec_obj_space() since batch chaining can happen
and doesn't guarantee pre-reserved space (Ken).
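The split between reserving space and doing an unchecked append can be sketched as follows. This is a minimal model only: the toy_* types and functions are hypothetical stand-ins and do not reproduce the iris structures or the real add_bo_to_batch() and ensure_exec_obj_space() signatures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the iris data structures. */
struct toy_bo {
   int id;
};

struct toy_batch {
   struct toy_bo **exec_bos;
   int exec_count;
   int exec_array_size;
};

/* Grow the list if needed so that `count` more appends cannot overflow. */
static void
toy_ensure_exec_obj_space(struct toy_batch *batch, int count)
{
   while (batch->exec_count + count > batch->exec_array_size) {
      int new_size = batch->exec_array_size ? batch->exec_array_size * 2 : 4;
      struct toy_bo **grown =
         realloc(batch->exec_bos, (size_t)new_size * sizeof(*grown));
      assert(grown != NULL);
      batch->exec_bos = grown;
      batch->exec_array_size = new_size;
   }
}

/* Unchecked append: no duplicate search and no dependency tracking.
 * The caller must have reserved space first. */
static void
toy_add_bo_to_batch(struct toy_batch *batch, struct toy_bo *bo)
{
   assert(batch->exec_count < batch->exec_array_size);
   batch->exec_bos[batch->exec_count++] = bo;
}
```

In this model a caller that cannot prove space was pre-reserved calls toy_ensure_exec_obj_space(batch, 1) before toy_add_bo_to_batch().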
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
Paulo Zanoni [Mon, 2 Aug 2021 20:32:20 +0000 (13:32 -0700)]
iris: add the workaround_bo directly to the batch
Don't use iris_use_pinned_bo(), go directly with add_bo_to_batch(),
skipping every check. This allows us to early return from
iris_use_pinned_bo when the workaround bo is used, saving us the call
to find_validation_entry() which ends up doing nothing except
iterating over every bo in the batch. Also don't bother with
ensure_exec_obj_space() since we just reset the batch and this is the
second BO we're adding to it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
Paulo Zanoni [Fri, 11 Jun 2021 23:25:01 +0000 (16:25 -0700)]
iris: extract the code that adds BOs to the batch lists
We want to add a new caller, so extract this first.
v2: kflags can never contain EXEC_OBJECT_WRITE (Ken).
v3: Rebase after s/gtt_offset/address/.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
Paulo Zanoni [Fri, 11 Jun 2021 22:53:19 +0000 (15:53 -0700)]
iris: assign bo->index to the aux map BOs too
I don't see these BOs being searched for in the benchmarks I tested so
I don't think this should improve anything. On the other hand, it
shouldn't hurt either since it's just an extra assignment.
I want to unify both places where we have this code into a single
function and the lack of the bo->index assignment was the only
difference between the two places. So first we make both functions the
same and in the next commit we'll unify things. This should make
bisecting easier in case I'm wrong.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
Paulo Zanoni [Mon, 2 Aug 2021 20:13:13 +0000 (13:13 -0700)]
iris: don't bump the seqno for the workaround_bo
The last_seqnos list is used by iris_emit_buffer_barrier_for() and as
far as I can understand we don't emit barriers for the workaround bo,
so don't even bother doing the atomic operations required to bump the
workaround_bo seqno list.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12194>
Eric Engestrom [Wed, 11 Aug 2021 20:41:10 +0000 (21:41 +0100)]
docs: update calendar and link release notes for 21.1.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12331>
Eric Engestrom [Wed, 11 Aug 2021 19:20:54 +0000 (20:20 +0100)]
docs: add release notes for 21.1.7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12331>
Dave Airlie [Wed, 4 Aug 2021 07:38:12 +0000 (17:38 +1000)]
intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5
This is the equivalent of idr's
intel/fs: sel.cond writes the flags on Gfx4 and Gfx5
except for the vec4 backend.
This fixes buggy rendering seen with crocus on a qt trace.
v2 (idr): Trivial whitespace change. Add unit tests.
v3: Fix typo in comment in unit tests. Noticed by Jason and Priit.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Iron Lake
total instructions in shared programs: 8183077 -> 8184543 (0.02%)
instructions in affected programs: 198990 -> 200456 (0.74%)
helped: 0
HURT: 1355
HURT stats (abs) min: 1 max: 8 x̄: 1.08 x̃: 1
HURT stats (rel) min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.96% 1.03%
Instructions are HURT.
total cycles in shared programs: 238967672 -> 238962784 (<.01%)
cycles in affected programs: 4666014 -> 4661126 (-0.10%)
helped: 406
HURT: 314
helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18
helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65%
HURT stats (abs) min: 2 max: 112 x̄: 13.48 x̃: 12
HURT stats (rel) min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16%
95% mean confidence interval for cycles value: -8.60 -4.98
95% mean confidence interval for cycles %-change: -0.87% -0.49%
Cycles are helped.
GM45
total instructions in shared programs: 4986888 -> 4988354 (0.03%)
instructions in affected programs: 198990 -> 200456 (0.74%)
helped: 0
HURT: 1355
HURT stats (abs) min: 1 max: 8 x̄: 1.08 x̃: 1
HURT stats (rel) min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.96% 1.03%
Instructions are HURT.
total cycles in shared programs: 153577826 -> 153572938 (<.01%)
cycles in affected programs: 4666014 -> 4661126 (-0.10%)
helped: 406
HURT: 314
helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18
helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65%
HURT stats (abs) min: 2 max: 112 x̄: 13.48 x̃: 12
HURT stats (rel) min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16%
95% mean confidence interval for cycles value: -8.60 -4.98
95% mean confidence interval for cycles %-change: -0.87% -0.49%
Cycles are helped.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>
Ian Romanick [Tue, 3 Aug 2021 04:33:17 +0000 (21:33 -0700)]
intel/fs: sel.cond writes the flags on Gfx4 and Gfx5
On Gfx4 and Gfx5, sel.l (for min) and sel.ge (for max) are implemented
using a separate cmpn and sel instruction. This lowering occurs in
fs_visitor::lower_minmax which is called very, very late... a long, long
time after the first calls to opt_cmod_propagation. As a result,
conditional modifiers can be incorrectly propagated across sel.cond on
those platforms.
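The rule described above can be modeled in a few lines. This is a sketch under stated assumptions: the toy_* enums, struct, and function are hypothetical and are not the flags_written() implementation in the Intel compiler. It only encodes the statement that a sel with a conditional modifier ends up writing the flag register on Gfx4 and Gfx5, because it is later lowered to cmpn + sel, while on newer hardware it does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced instruction model. */
enum toy_opcode { TOY_OPCODE_ADD, TOY_OPCODE_CMP, TOY_OPCODE_SEL };
enum toy_cmod { TOY_CMOD_NONE, TOY_CMOD_L, TOY_CMOD_GE };

struct toy_inst {
   enum toy_opcode opcode;
   enum toy_cmod cmod;
};

/* Returns true if the instruction should be treated as writing the flag. */
static bool
toy_writes_flag(const struct toy_inst *inst, int gfx_ver)
{
   assert(inst);

   if (inst->cmod == TOY_CMOD_NONE)
      return false;

   /* sel.l / sel.ge become cmpn + sel on Gfx4 and Gfx5, so the flag is
    * clobbered there even though the lowering happens very late. */
   if (inst->opcode == TOY_OPCODE_SEL)
      return gfx_ver <= 5;

   return true;
}
```

An optimization such as cmod propagation would consult a predicate like this before moving a conditional modifier across the instruction.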
No tests were affected by this change, and I find that quite shocking.
After just changing flags_written(), all of the atan tests started
failing on ILK. That required the change in cmod_propagation (and the
addition of the prop_across_into_sel_gfx5 unit test).
Shader-db results for ILK and GM45 are below. I looked at a couple
before and after shaders... and every case that I looked at had
experienced incorrect cmod propagation. This affected a LOT of apps!
Euro Truck Simulator 2, The Talos Principle, Serious Sam 3, Sanctum 2,
Gang Beasts, and on and on... :(
I discovered this bug while working on a couple new optimization
passes. One of the passes attempts to remove condition modifiers that
are never used. The pass made no progress except on ILK and GM45.
After investigating a couple of the affected shaders, I noticed that
the code in those shaders looked wrong... investigation led to this
cause.
v2: Trivial changes in the unit tests.
v3: Fix typo in comment in unit tests. Noticed by Jason and Priit.
v4: Tweak handling of BRW_OPCODE_SEL special case. Suggested by Jason.
Fixes: df1aec763eb ("i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Dave Airlie <airlied@redhat.com>
Iron Lake
total instructions in shared programs: 8180493 -> 8181781 (0.02%)
instructions in affected programs: 541796 -> 543084 (0.24%)
helped: 28
HURT: 1158
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.86% x̄: 0.53% x̃: 0.50%
HURT stats (abs) min: 1 max: 3 x̄: 1.14 x̃: 1
HURT stats (rel) min: 0.12% max: 4.00% x̄: 0.37% x̃: 0.23%
95% mean confidence interval for instructions value: 1.06 1.11
95% mean confidence interval for instructions %-change: 0.31% 0.38%
Instructions are HURT.
total cycles in shared programs: 239420470 -> 239421690 (<.01%)
cycles in affected programs: 2925992 -> 2927212 (0.04%)
helped: 49
HURT: 157
helped stats (abs) min: 2 max: 284 x̄: 62.69 x̃: 70
helped stats (rel) min: 0.04% max: 6.20% x̄: 1.68% x̃: 1.96%
HURT stats (abs) min: 2 max: 48 x̄: 27.34 x̃: 24
HURT stats (rel) min: 0.02% max: 2.91% x̄: 0.31% x̃: 0.20%
95% mean confidence interval for cycles value: -0.80 12.64
95% mean confidence interval for cycles %-change: -0.31% <.01%
Inconclusive result (value mean confidence interval includes 0).
GM45
total instructions in shared programs: 4985517 -> 4986207 (0.01%)
instructions in affected programs: 306935 -> 307625 (0.22%)
helped: 14
HURT: 625
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 0.82% x̄: 0.52% x̃: 0.49%
HURT stats (abs) min: 1 max: 3 x̄: 1.13 x̃: 1
HURT stats (rel) min: 0.12% max: 3.90% x̄: 0.34% x̃: 0.22%
95% mean confidence interval for instructions value: 1.04 1.12
95% mean confidence interval for instructions %-change: 0.29% 0.36%
Instructions are HURT.
total cycles in shared programs: 153827268 -> 153828052 (<.01%)
cycles in affected programs: 1669290 -> 1670074 (0.05%)
helped: 24
HURT: 84
helped stats (abs) min: 2 max: 232 x̄: 64.33 x̃: 67
helped stats (rel) min: 0.04% max: 4.62% x̄: 1.60% x̃: 1.94%
HURT stats (abs) min: 2 max: 48 x̄: 27.71 x̃: 24
HURT stats (rel) min: 0.02% max: 2.66% x̄: 0.34% x̃: 0.14%
95% mean confidence interval for cycles value: -1.94 16.46
95% mean confidence interval for cycles %-change: -0.29% 0.11%
Inconclusive result (value mean confidence interval includes 0).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>
Dave Airlie [Wed, 11 Aug 2021 17:34:26 +0000 (13:34 -0400)]
crocus: align staging resource pitch on gen4/5 to allow BLT usage.
Aligning the pitch to 4 bytes allows the BLT engine to be used for
transfers to/from these surfaces.
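As an illustration of the alignment step, a pitch can be rounded up to the next multiple of 4 bytes like this. The helper names are hypothetical and are not the crocus functions; the sketch assumes a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to a power-of-two `alignment`. */
static uint32_t
toy_align_u32(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Row pitch in bytes for a staging surface, padded to 4 bytes. */
static uint32_t
toy_staging_pitch(uint32_t width_px, uint32_t bytes_per_px)
{
   return toy_align_u32(width_px * bytes_per_px, 4);
}
```

A 5-pixel-wide surface with 2 bytes per pixel has a natural pitch of 10 bytes, which this helper pads to 12.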
Fixes: f3630548f1da ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12329>
Dave Airlie [Wed, 11 Aug 2021 17:32:41 +0000 (13:32 -0400)]
crocus/blt: add pitch/offset checks to fix blt corruption
I lost these in my conversion from i965 but they are necessary.
This should fix corruption in qt fonts as seen in the minecraft
launcher.
Fixes: f3630548f1da ("crocus: initial gallium driver for Intel gfx 4-7")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12329>
Alyssa Rosenzweig [Wed, 4 Aug 2021 22:21:34 +0000 (18:21 -0400)]
pan/bi: Unit test DISCARD+FCMP fusing
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 21:56:31 +0000 (17:56 -0400)]
pan/bi: Fuse DISCARD with conditions
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Wed, 4 Aug 2021 16:12:31 +0000 (12:12 -0400)]
pan/bi: Add fclamp unit tests
The negative cases here did not pass before this series, showing the bug
in the clamp optimization. By introducing the FCLAMP pseudo op, the bug
is fixed. Let's ensure we don't regress.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Wed, 4 Aug 2021 18:49:30 +0000 (14:49 -0400)]
pan/bi: Use FCLAMP pseudo op for clamp prop
Map nir_op_fsat/etc to FCLAMP pseudo ops, instead of FADD. There are
significantly fewer knobs on FCLAMP, meaning significantly fewer things
to get wrong.
This fixes two(!) classes of bugs:
* Swizzles (failing to lower/compose swizzles on clamps)
* Numerical bugs (incorrectly treating +0.0 as an additive identity)
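The numerical point can be checked directly in IEEE 754 arithmetic: under the default rounding mode, adding +0.0 to -0.0 yields +0.0, so an "x + 0.0" is not a no-op for negative zero. The helper below is a hypothetical illustration and is unrelated to the Bifrost compiler code.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Returns true if x + (+0.0f) has the same sign bit as x. */
static bool
toy_add_zero_preserves_sign(float x)
{
   assert(!isnan(x)); /* the sign of a NaN is not interesting here */

   /* volatile keeps the compiler from folding the addition away */
   volatile float zero = 0.0f;
   volatile float sum = x + zero;

   return (signbit(sum) != 0) == (signbit(x) != 0);
}
```

The function returns true for ordinary values and for +0.0, but false for -0.0, which is the case an FADD-based clamp gets wrong.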
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 22:39:13 +0000 (18:39 -0400)]
pan/bi: Add optimizer unit tests
Writing these tests brought to light the cluster of bugs fixed in the
previous commits. Now that things work, let's ensure they stay working.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 23:16:52 +0000 (19:16 -0400)]
pan/bi: Use FABSNEG pseudo ops for modifier prop
Simplifies pattern matching. This commit by itself fixes multiple
numerical issues -- the previous fabsneg check failed to check the round
mode or the sign of the zero. That will break Vulkan/OpenCL.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 22:38:46 +0000 (18:38 -0400)]
pan/bi: Add shader equality helper for unit tests
Optimizer tests really are global.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 22:09:56 +0000 (18:09 -0400)]
pan/bi: Fuse abs/neg more on Valhall
Some of these Bifrost restrictions may be skipped on Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 17:47:59 +0000 (13:47 -0400)]
pan/bi: Simplify bi_compose_clamp
Realized this trick when reversing Valhall.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 10 Aug 2021 19:49:47 +0000 (15:49 -0400)]
pan/bi: Unit test new constant folding patterns
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Tue, 3 Aug 2021 15:19:11 +0000 (11:19 -0400)]
pan/bi: Constant fold texturing lowerings
This ensures we can constant fold the ALU ops used to lower:
* explicit LOD calculations
* array textures
* texture offsets
* multisample indices
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Wed, 4 Aug 2021 16:21:06 +0000 (12:21 -0400)]
pan/va: Document IEEE 754 conformance of clamps
These rules are not obvious. But they turn out to be exactly what's
required by the spec.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12205>
Alyssa Rosenzweig [Fri, 30 Jul 2021 23:30:05 +0000 (19:30 -0400)]
panfrost: Test src*dst + dst*src blending
Validates the prior commit.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Fri, 30 Jul 2021 23:28:37 +0000 (19:28 -0400)]
panfrost: Leverage Bifrost's 2*src blend factor
Bifrost adds a value for the C factor equaling 2*src. This does not
correspond directly to API blend modes so it is not too useful in
general. However, it's required for src*dest + dest*src blending to be
done in hardware instead of a blend shader. GFXbench uses that blend
mode, so it must be important ;-)
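The algebra behind this is that src*dest + dest*src equals dest * (2*src), so the sum of two products collapses into a single product once a 2*src factor is available. A per-channel sketch, with hypothetical helper names and assuming normalized channel values in [0, 1]:

```c
#include <assert.h>

/* Two-term form: src * dst + dst * src. */
static float
toy_blend_two_terms(float src, float dst)
{
   assert(src >= 0.0f && src <= 1.0f && dst >= 0.0f && dst <= 1.0f);
   return src * dst + dst * src;
}

/* Single-term form using a 2*src factor. */
static float
toy_blend_two_src(float src, float dst)
{
   assert(src >= 0.0f && src <= 1.0f && dst >= 0.0f && dst <= 1.0f);
   return dst * (2.0f * src);
}
```

How the hardware encodes the factor is not modeled here; the sketch only shows that the two forms compute the same value.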
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Fri, 30 Jul 2021 22:43:48 +0000 (18:43 -0400)]
panfrost: Add basic fixed-function blending tests
Add unit tests for the fixed-function blending helpers in pan_blend.c.
Each test consists of a Porter-Duff blend mode and the associated
hardware state. In this commit, we add tests for the most common modes.
For motivation, this code has NOT been properly tested in CI. True,
functional correctness of the blend module as a whole is tested by
dEQP-GLES3.functional.fragment_ops.blend.* among other integration
tests. However, this testing is insufficient to check for regressions.
Crucially, the following broken patch would clear CI:
    bool pan_can_fixed_function(...) {
        return false;
    }
In that case, blend shaders are used 100% of the time, which will
regress performance horribly but still pass dEQP. The only clue
something went wrong would be some traces changing checksum due to the
fixed-function blender producing slightly different output than
equivalent blend shaders. By unit testing the fixed blend path, we
ensure we always use the fixed-function path when we expect it to.
Similarly, using incorrect values for the blend metadata may not affect
functional correctness but will increase power consumption. Let's check
all the data we export to drivers.
Note: due to additive commutativity, there are many pairs of equivalent
Mali blend modes. Unfortunately, the vendor is... inconsistent about how
to resolve ambiguous modes. Our algorithm for computing modes is
correct; the "preferred" values are left in comments since otherwise our
tests fail despite correct code. I want to blame Bifrost for this, but
Midgard was patient zero.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Fri, 30 Jul 2021 22:19:46 +0000 (18:19 -0400)]
panfrost: Simplify blend_factor_constant_mask
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Fri, 30 Jul 2021 22:04:05 +0000 (18:04 -0400)]
panfrost: Fix is_opaque when blend_enable=false
Needed to pass the "replace" unit test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Fri, 30 Jul 2021 21:46:21 +0000 (17:46 -0400)]
panfrost: Add blend helper packing the equation
This is more convenient for the Gallium driver and easier to test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Tue, 10 Aug 2021 16:58:56 +0000 (12:58 -0400)]
panfrost: Use _PU for non-dithered formats
This is required to disable dithering on a per-draw basis when OPAQUE
output is used (bypassing the blender which normally uses the
round_to_framebuffer_precision flag to do the same).
This functionally reverts:
ebc07f4b2f3 ("panfrost: Remove padded unorm blendable formats")
fae90a79404 ("panfrost: Always pick dithered tb formats")
while adding the functionality to make them useful.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12152>
Alyssa Rosenzweig [Wed, 11 Aug 2021 14:58:26 +0000 (10:58 -0400)]
panfrost: Remove unused #defines
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12328>
Alyssa Rosenzweig [Wed, 11 Aug 2021 14:50:44 +0000 (10:50 -0400)]
panfrost: Add LINEAR debug option
Useful to cross off CPU texture tiling as the source of bugs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12328>
Alyssa Rosenzweig [Wed, 11 Aug 2021 14:31:48 +0000 (10:31 -0400)]
pan/bi: Add a noopt debug option
To rule out buggy optimization passes when debugging.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12328>
Alyssa Rosenzweig [Wed, 11 Aug 2021 14:03:13 +0000 (10:03 -0400)]
pan/bi: Make bi_opt_push_ubo optional
It's an optimization pass -- omitting it should not cause MMU faults
(!). Make sure the UBO push mask is set regardless of whether the pass
is called, and just call the pass when required.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12328>
Lionel Landwerlin [Mon, 19 Jul 2021 16:33:12 +0000 (19:33 +0300)]
nir/lower_shader_calls: remove empty phis
This is confusing opt_cse.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dfb240b1f0633 ("nir: Add raytracing shader call lowering pass.")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11953>
Marcin Ślusarz [Tue, 10 Aug 2021 11:53:40 +0000 (13:53 +0200)]
zink: use nir_shader_instructions_pass in nir_lower_dynamic_bo_access
Changes:
- nir_metadata_preserve(..., nir_metadata_dominance)
is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
make progress
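The metadata convention listed above can be modeled as follows. Everything here is a hypothetical stand-in (toy_* names, integer "instructions"); it is not the NIR API and does not reproduce the signatures of nir_shader_instructions_pass or nir_metadata_preserve. The sketch only shows the rule: keep the caller-specified metadata when the pass made progress, keep all metadata when it did not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit flags standing in for analysis metadata. */
enum toy_metadata {
   TOY_METADATA_NONE        = 0,
   TOY_METADATA_BLOCK_INDEX = 1 << 0,
   TOY_METADATA_DOMINANCE   = 1 << 1,
   TOY_METADATA_ALL         = ~0,
};

/* Hypothetical function body: a flat array of integer "instructions". */
struct toy_function {
   int valid_metadata;
   int num_instrs;
   int *instrs;
};

typedef bool (*toy_instr_cb)(int *instr);

/* Run `cb` on every instruction; invalidate metadata only on progress. */
static bool
toy_instructions_pass(struct toy_function *func, toy_instr_cb cb,
                      int preserved)
{
   bool progress = false;

   assert(func && cb);
   for (int i = 0; i < func->num_instrs; i++)
      progress |= cb(&func->instrs[i]);

   func->valid_metadata &= progress ? preserved : TOY_METADATA_ALL;
   return progress;
}

/* Example callback: rewrite negative values, report whether it changed. */
static bool
toy_make_positive(int *instr)
{
   if (*instr >= 0)
      return false;
   *instr = -*instr;
   return true;
}
```

Running the pass a second time on already-rewritten input reports no progress and leaves the valid metadata untouched.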
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 11:49:05 +0000 (13:49 +0200)]
zink: use nir_shader_instructions_pass in lower_discard_if
Changes:
- nir_metadata_preserve(..., nir_metadata_dominance)
is called only when pass makes progress
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
make progress
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 11:36:47 +0000 (13:36 +0200)]
microsoft/compiler: use nir_shader_instructions_pass in dxil_nir_lower_double_math
No functional changes.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 11:30:56 +0000 (13:30 +0200)]
microsoft/compiler: use nir_shader_instructions_pass in dxil_nir_split_clip_cull_distance
No functional changes.
v2: fix build
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 11:15:25 +0000 (13:15 +0200)]
microsoft/compiler: preserve all metadata when upcast_phi doesn't make progress
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 11:03:51 +0000 (13:03 +0200)]
microsoft/clc: use nir_shader_instructions_pass in clc_nir_dedupe_const_samplers
Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
make progress
v2: fix build
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 10:59:29 +0000 (12:59 +0200)]
microsoft/clc: preserve only valid metadata in clc_lower_printf_base
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Tue, 10 Aug 2021 10:53:52 +0000 (12:53 +0200)]
d3d12: use nir_metadata_none instead of its value
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Fri, 6 Aug 2021 11:06:04 +0000 (13:06 +0200)]
intel/compiler: use nir_shader_instructions_pass in brw_nir_apply_attribute_workarounds
Changes:
- removal of attr_wa_state (it's passed directly)
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
make progress
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Marcin Ślusarz [Fri, 6 Aug 2021 10:14:38 +0000 (12:14 +0200)]
nir/builder: invalidate metadata per function
Fixes: a62098fff20 ("nir: Add a helper for general instruction-modifying passes.")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12324>
Danylo Piliaiev [Tue, 10 Aug 2021 09:24:04 +0000 (12:24 +0300)]
freedreno/decode: print estimated crash location without colored output
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12302>
Pierre-Eric Pelloux-Prayer [Fri, 7 May 2021 14:36:47 +0000 (16:36 +0200)]
nir: add a pass to optimize "gl_FragDepth = gl_FragCoord.z" away
gl_FragDepth default value is gl_FragCoord.z so if a shader does:
gl_FragDepth = gl_FragCoord.z
we can drop this assignment.
v2: use nir_ssa_scalar_resolved and don't do this if gl_FragDepth
is written multiple times (Jason)
v3: - move to its own pass (Jason)
- handle var = NULL (Rhys)
v4: refactoring (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10697>
Kenneth Graunke [Thu, 22 Jul 2021 05:44:43 +0000 (22:44 -0700)]
iris: Drop dead drm_ioctl prototype
We now use intel_ioctl instead.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12206>
Kenneth Graunke [Tue, 20 Jul 2021 04:30:21 +0000 (21:30 -0700)]
iris: Improve the memory layout of iris_bo by fixing pahole issues
We had a 4-byte hole and a 4-byte field breaking up a run of bools.
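The kind of layout problem pahole reports can be reproduced with two hypothetical structs. The field names and order below are illustrative only and do not match the real iris_bo; exact sizes depend on the ABI, so the sketch only asserts that the reordered struct is no larger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A 4-byte field splits the bools, so padding appears around them. */
struct toy_bo_holey {
   uint64_t address;
   bool a;
   uint32_t index;
   bool b;
   bool c;
};

/* Same fields, with the bools grouped after the 4-byte field. */
struct toy_bo_packed {
   uint64_t address;
   uint32_t index;
   bool a;
   bool b;
   bool c;
};

static_assert(sizeof(struct toy_bo_packed) <= sizeof(struct toy_bo_holey),
              "grouping the bools must not grow the struct");

/* Bytes saved by the reordering on the current ABI. */
static size_t
toy_bytes_saved(void)
{
   return sizeof(struct toy_bo_holey) - sizeof(struct toy_bo_packed);
}
```

On a typical x86-64 ABI the first struct occupies 24 bytes and the second 16, though other ABIs may differ.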
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12206>
Kenneth Graunke [Tue, 20 Jul 2021 04:23:18 +0000 (21:23 -0700)]
iris: Rename bo->gtt_offset to bo->address
This is the virtual memory address of the buffer object. Calling it the
BO's address is a lot more obvious than calling it an offset in one of
the now many graphics translation tables.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12206>
Iago Toral Quiroga [Tue, 10 Aug 2021 09:32:50 +0000 (11:32 +0200)]
v3d,v3dv: add options to force 32-bit or 16-bit TMU precision
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12303>
Tapani Pälli [Tue, 10 Aug 2021 08:08:35 +0000 (11:08 +0300)]
anv/android: fix build error due to refactoring
Fixes: e08370dc37e ("anv: disable aux for exportable images without modifiers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5208
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12300>
Dave Airlie [Wed, 11 Aug 2021 00:06:22 +0000 (10:06 +1000)]
docs: add llvmpipe host memory extensions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12316>
Dave Airlie [Tue, 10 Aug 2021 19:18:10 +0000 (05:18 +1000)]
lavapipe: add host ptr support.
This actually doesn't need any backend support.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12316>
Dave Airlie [Tue, 10 Aug 2021 06:21:10 +0000 (16:21 +1000)]
llvmpipe: add support for user memory pointers
This is useful for clover, but throw it at CI at least
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12316>
Icecream95 [Sun, 8 Aug 2021 08:57:30 +0000 (20:57 +1200)]
pan/bi: Use the computed scale for fexp NaN propagation
This makes pow(NaN, x) return NaN rather than 1.0.
Fixes: 499397700c1 ("pan/bi: Don't lower fpow")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5189
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12269>
Ian Romanick [Fri, 6 Aug 2021 21:20:25 +0000 (14:20 -0700)]
Revert "nir/algebraic: Convert some f2u to f2i"
Per https://gitlab.freedesktop.org/mesa/mesa/-/issues/5178#note_1019666,
the assumption fundamental to this optimization is false. Section
2.4.1 (Float to Integer) of Ivy Bridge PRMs describes the situation.
The wording of the section is somewhat confusing (because it doesn't
clearly delineate between signed and unsigned integers), but the last
two rows of the table make it clear that F->UD conversion clamps
negative float values to 0.
All other hardware mentioned in that thread seems to behave the same
way.
The real problem is that, with hardware that behaves this way,
converting f2u(2147483648.0) to f2i(2147483648.0) changes the bit pattern
that would be produced from 0x80000000 to 0x7fffffff.
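The difference can be reproduced with a small model of clamping conversions. The toy_* helpers are hypothetical and only mimic the behavior described above (out-of-range inputs clamp to the destination type's limits); mapping NaN to 0 is an assumption made for this sketch, not something stated in the PRM text quoted here.

```c
#include <assert.h>
#include <stdint.h>

/* Float to unsigned 32-bit with clamping; negatives clamp to 0. */
static uint32_t
toy_f2u_clamped(float x)
{
   if (!(x > 0.0f))               /* negatives, zero, and NaN (assumption) */
      return 0;
   if (x >= 4294967296.0f)
      return UINT32_MAX;
   return (uint32_t)x;
}

/* Float to signed 32-bit with clamping to [INT32_MIN, INT32_MAX]. */
static int32_t
toy_f2i_clamped(float x)
{
   if (x != x)                    /* NaN (assumption: result is 0) */
      return 0;
   if (x >= 2147483648.0f)
      return INT32_MAX;
   if (x <= -2147483648.0f)
      return INT32_MIN;
   return (int32_t)x;
}
```

With these helpers, 2147483648.0 converts to 0x80000000 as unsigned but to 0x7fffffff as signed, so replacing one conversion with the other changes the resulting bits.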
This reverts commit ad059202583e8c86bbccf0d65c5ce35bc4ab20f1.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>
Ian Romanick [Fri, 6 Aug 2021 21:16:24 +0000 (14:16 -0700)]
nir/opcodes: Use u_intN_(min|max)
uadd_sat was updated using sed, so I didn't even notice the surrounding
opcodes. Oops.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>
Dave Airlie [Mon, 9 Aug 2021 03:24:34 +0000 (13:24 +1000)]
clover: only return CLC version as 1.2 (even for 3.0)
Fixes CTS compiler opencl_c_versions
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12286>
Dave Airlie [Mon, 9 Aug 2021 01:36:58 +0000 (11:36 +1000)]
clover/nir: don't convert to NIR on library link
If just creating a library, just link the spir-v and store it.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12286>
Dave Airlie [Wed, 11 Nov 2020 01:33:33 +0000 (11:33 +1000)]
clover: fix compilation with clang + llvm 12.
clang in llvm 12 no longer accepts "-cl-denorms-are-zero" as a cc1
option, which is how this code uses it.
For now just pick the correct cc1 equivalent.
This fixes a crash with llvm master and CL conversions tests
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12286>
Eric Engestrom [Mon, 9 Aug 2021 20:44:21 +0000 (21:44 +0100)]
pick-ui: show commit date
With our ff-only merge setup, the commit date ends up being when the
commit actually landed (as opposed to when it was first written).
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12289>
Eric Engestrom [Mon, 9 Aug 2021 20:23:08 +0000 (21:23 +0100)]
pick-ui: show nomination type in the UI
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12289>
Eric Engestrom [Wed, 29 Apr 2020 16:51:30 +0000 (18:51 +0200)]
pick-ui: drop assert that optional argument is passed
Let's just make it not-optional instead.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12289>
Rob Clark [Tue, 10 Aug 2021 18:24:47 +0000 (11:24 -0700)]
freedreno: Use correct key for binning pass shader
We updated the key correctly for whether we wanted to use a
safe_constlen binning pass variant, but then passed the wrong
key to ir3_shader_variant().
Fixes: 1dd24bf27b2 ("freedreno: Share constlen between different stages properly")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12314>
Alyssa Rosenzweig [Tue, 6 Jul 2021 16:55:23 +0000 (12:55 -0400)]
nir/lower_mediump: Fix metadata in all passes
Fixes: fb29cef8dda ("nir: add many passes that lower and optimize 16-bit input/outputs and samplers")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11732>
Alyssa Rosenzweig [Wed, 16 Jun 2021 18:54:46 +0000 (14:54 -0400)]
nir/lower_mediump_io: Don't remap base unless needed
Otherwise drivers that don't use 16-bit slots for varyings will get
confused and have their driver_locations scribbled over. This has caused
multiple problems for both Panfrost and Asahi this week. Given the only
other user of the pass for varyings is radeonsi, which needs both
together, I think this is the least controversial fix.
Fixes: fb29cef8dda ("nir: add many passes that lower and optimize 16-bit input/outputs and samplers")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11732>
Danylo Piliaiev [Mon, 9 Aug 2021 18:19:11 +0000 (21:19 +0300)]
tu: add "flushall" and "syncdraw" debug options
They will be useful to check whether some issue is due to the lack
of flushing or waiting.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12283>
Mike Blumenkrantz [Tue, 10 Aug 2021 18:21:26 +0000 (14:21 -0400)]
nine: init more draw info members
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12284>
Mike Blumenkrantz [Tue, 10 Aug 2021 18:19:45 +0000 (14:19 -0400)]
nine: init take_index_buffer_ownership for draws
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12284>
Jesse Natalie [Mon, 2 Aug 2021 18:45:19 +0000 (11:45 -0700)]
u_driconf: Use a macro to avoid repeating option names
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12158>
Jesse Natalie [Sun, 1 Aug 2021 16:02:12 +0000 (09:02 -0700)]
wgl: Add a driver name for driconf
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12158>
Jesse Natalie [Sun, 1 Aug 2021 15:41:35 +0000 (08:41 -0700)]
wgl: Parse driconf options
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12158>
Jesse Natalie [Sun, 1 Aug 2021 18:09:47 +0000 (11:09 -0700)]
xmlconfig: Use static inline for regex fallback to prevent -O0 issues
A non-static inline function body is only actually emitted by GCC during optimization passes,
so running -O0 ends up never emitting the body, producing linker errors.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12158>
Jesse Natalie [Sun, 1 Aug 2021 15:41:31 +0000 (08:41 -0700)]
gallium/dri: Move driConf -> st option processing to aux/util
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12158>
Connor Abbott [Thu, 5 Aug 2021 12:52:00 +0000 (14:52 +0200)]
tu: Read some input attachments directly
It can happen that the user reads an input attachment as the first use
of that attachment. In that case there are no subpass dependencies
required at all, because there could be a pipeline barrier before the
renderpass instead, and in any case we assume that dependencies with the
first subpass as a destination can be executed only once outside the
renderpass. The result is that we only do a CACHE_INVALIDATE once
before the entire renderpass, but it's actually required after each GMEM
load, because input attachments read GMEM through UCHE and those writes
to GMEM invalidate UCHE.
While we could add the missing CACHE_INVALIDATE "by hand" somehow, it
turns out it's actually just as easy to do an optimization the blob
does, where it simply doesn't patch those input attachments and reads
them directly instead. This means we can skip allocating memory in GMEM
for them entirely in some circumstances.
This fixes e.g.
dEQP-VK.api.copy_and_blit.core.resolve_image.whole_array_image.4_bit with
TU_DEBUG=forcebin.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12213>
Jason Ekstrand [Mon, 9 Aug 2021 22:23:52 +0000 (17:23 -0500)]
intel/eu: Set scope to TILE for TGM flushes
Setting it to GPU can cause an L3$ flush in certain cases. That's not
what we want as we really only care about coherency within the GPU.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12291>
Samuel Pitoiset [Tue, 27 Jul 2021 21:08:19 +0000 (23:08 +0200)]
radv: allow fast clears for concurrent images if comp-to-single is supported
Only GFX10+ is affected because older chips don't support
comp-to-single. For them, we need to implement FCE on compute with DCC
and eventually CMASK.
Fixes the gap between concurrent vs exclusive queue with Scarlet Nexus,
also gives a boost with Doom Eternal.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12088>
Marcin Ślusarz [Fri, 6 Aug 2021 08:49:29 +0000 (10:49 +0200)]
glsl: evaluate switch expression once
v2: initialize test_val in constructor
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5185
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Cc: mesa-stable
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12234>
Iago Toral Quiroga [Mon, 9 Aug 2021 06:42:29 +0000 (08:42 +0200)]
broadcom/compiler: rewrite partial update liveness tracking
The code we had for this was a work in progress and not finished. Also,
it was geared towards partial writes caused by output packing (i.e.
fp16) and was ignoring partial updates caused by conditional writes,
which are far more common in our case.
This change provides an implementation for tracking conditional writes
that works in tandem with the previous spill change to narrow liveness
for their spills.
Fixes register allocation failures in:
dEQP-VK.graphicsfuzz.spv-stable-maze-flatten-copy-composite
We also gain one shader from shader-db:
total instructions in shared programs: 13339969 -> 13338584 (-0.01%)
instructions in affected programs: 185520 -> 184135 (-0.75%)
helped: 375
HURT: 130
Instructions are helped.
total threads in shared programs: 412038 -> 412040 (<.01%)
threads in affected programs: 2 -> 4 (100.00%)
helped: 1
HURT: 0
total uniforms in shared programs: 3746581 -> 3746585 (<.01%)
uniforms in affected programs: 49 -> 53 (8.16%)
helped: 0
HURT: 1
total max-temps in shared programs: 2359960 -> 2359947 (<.01%)
max-temps in affected programs: 289 -> 276 (-4.50%)
helped: 7
HURT: 0
Max-temps are helped.
total sfu-stalls in shared programs: 34351 -> 34359 (0.02%)
sfu-stalls in affected programs: 218 -> 226 (3.67%)
helped: 35
HURT: 37
Inconclusive result (value mean confidence interval includes 0).
total inst-and-stalls in shared programs: 13374320 -> 13372943 (-0.01%)
inst-and-stalls in affected programs: 186653 -> 185276 (-0.74%)
helped: 373
HURT: 132
Inst-and-stalls are helped.
LOST: 0
GAINED: 1
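For readers unfamiliar with the report format, the percentages in
parentheses are relative changes from the "before" value to the "after"
value. A minimal sketch of that arithmetic, using figures from the
report above (the pct_change helper is a hypothetical name, not part of
shader-db):

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the "affected programs" lines above.
affected = {
    "instructions": (185520, 184135),
    "threads": (2, 4),
    "uniforms": (49, 53),
    "max-temps": (289, 276),
    "sfu-stalls": (218, 226),
    "inst-and-stalls": (186653, 185276),
}

for name, (before, after) in affected.items():
    print(f"{name}: {before} -> {after} ({pct_change(before, after):.2f}%)")
```

Negative values mean the counter went down. That is an improvement for
instructions, uniforms, max-temps and stalls, while for threads a higher
count is the improvement.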
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12278>
Iago Toral Quiroga [Mon, 9 Aug 2021 06:29:49 +0000 (08:29 +0200)]
broadcom/compiler: make spills of conditional writes also conditional
A spill of a conditional write generates code like this:
mov.ifa t5000, 0
mov tmud, t5000
nop t5001; ldunif (0x00008100 / 0.000000)
add tmua, t11, t5001
Here, we are spilling t5000, which has a conditional write, and we
produce an unconditional spill for it. This implicitly means that
our spill requires a correct value for all channels of t5000.
If we do a conditional spill, then we emit:
mov.ifa t5000, 0
mov tmud.ifa, t5000
nop t5001; ldunif (0x00008100 / 0.000000)
add tmua.ifa, t11, t5001
Which only uses channels of t5000 that have been written by the
instruction being spilled.
By doing the latter, we can then narrow down the liveness for t5000
more effectively, as we can use this to detect that the block only reads
(in the tmud instruction) the values that have been written previously
in the same block (in the mov instruction). This means that values in
other channels are not used, and therefore, we don't need them to be
alive at the start of the block. This means that if this is the only
write of t5000 in this block, we can consider that the block
completely defines t5000.
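The liveness argument above can be sketched with a toy model. This is
not the V3D compiler code: the tuple encoding, the needs_live_in helper
and the assumption that the flags are unchanged between the write and
the read are all made up for illustration.

```python
def needs_live_in(block, var):
    """Return True if `var` must be live at the start of `block`.

    `block` is a list of (op, name, cond) tuples where op is "write" or
    "read" and cond is None (unconditional) or a condition such as "ifa".
    A read is covered if an earlier write in the block was unconditional
    or used the same condition (flags assumed unchanged in between).
    """
    written_conds = set()
    for op, name, cond in block:
        if name != var:
            continue
        if op == "read":
            covered = None in written_conds or (
                cond is not None and cond in written_conds
            )
            if not covered:
                return True  # reads channels this block never wrote
        elif op == "write":
            written_conds.add(cond)
    return False


# mov.ifa t5000, 0 ; mov tmud, t5000      (unconditional spill)
unconditional_spill = [("write", "t5000", "ifa"), ("read", "t5000", None)]
# mov.ifa t5000, 0 ; mov tmud.ifa, t5000  (conditional spill)
conditional_spill = [("write", "t5000", "ifa"), ("read", "t5000", "ifa")]

print(needs_live_in(unconditional_spill, "t5000"))
print(needs_live_in(conditional_spill, "t5000"))
```

With the unconditional spill the read touches channels the block never
wrote, so t5000 has to be live on entry. With the conditional spill the
read is fully covered by the earlier write in the same block.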
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12278>
Iago Toral Quiroga [Mon, 9 Aug 2021 06:27:59 +0000 (08:27 +0200)]
broadcom/compiler: Flags are per-thread state in V3D 4.2+
This means they survive a thread switch, so we can remove redundant
flag setups across thread switches.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12278>
Iago Toral Quiroga [Mon, 9 Aug 2021 05:59:06 +0000 (07:59 +0200)]
broadcom/compiler: add a vir_get_cond helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12278>
Samuel Pitoiset [Thu, 29 Apr 2021 11:32:13 +0000 (13:32 +0200)]
radv: enable DCC fast-clears with comp-to-single on GFX10+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Samuel Pitoiset [Thu, 29 Apr 2021 11:30:26 +0000 (13:30 +0200)]
radv: skip FCE for images that are fast-cleared using comp-to-single
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Samuel Pitoiset [Wed, 28 Apr 2021 12:03:53 +0000 (14:03 +0200)]
radv: implement DCC fast clears with comp-to-single
When an image supports comp-to-single, DCC is cleared to 0x10 (single)
and the clear color value is written to the beginning of each 256B
block in the image.
This allows us to skip FCE.
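A rough model of the clear described above, as a sketch only: the
function name, the byte-array representation of the image and of the DCC
metadata, and the byte-wise 0x10 fill are assumptions for illustration
and do not reflect the actual radv implementation.

```python
BLOCK_SIZE = 256         # bytes per block, as stated in this message
DCC_CLEAR_SINGLE = 0x10  # "single" clear code, as stated in this message


def fast_clear_comp_to_single(image, dcc, clear_color):
    """Clear `dcc` to the 'single' code and seed each 256B block of `image`."""
    # Simplifying assumptions for this sketch.
    assert len(image) % BLOCK_SIZE == 0
    assert 0 < len(clear_color) <= BLOCK_SIZE

    # Step 1: mark all DCC metadata as "single".
    dcc[:] = bytes([DCC_CLEAR_SINGLE]) * len(dcc)

    # Step 2: store the clear color at the beginning of each 256B block.
    for offset in range(0, len(image), BLOCK_SIZE):
        image[offset:offset + len(clear_color)] = clear_color


image = bytearray(2 * BLOCK_SIZE)
dcc = bytearray(2)
fast_clear_comp_to_single(image, dcc, bytes([0x11, 0x22, 0x33, 0x44]))
```

Only the first few bytes of each block are written, so the cost of the
clear in this model scales with the number of blocks rather than the
number of pixels.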
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Samuel Pitoiset [Wed, 28 Apr 2021 11:53:02 +0000 (13:53 +0200)]
radv: determine if an image supports fast clears using comp-to-single
Only on GFX10+ with DCC enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Samuel Pitoiset [Wed, 28 Apr 2021 11:53:35 +0000 (13:53 +0200)]
radv: add RADV_DCC_CLEAR_SINGLE
When DCC is cleared with that code, the hardware expects the clear
color value to be stored at the beginning of each 256B block in
the image.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Samuel Pitoiset [Wed, 28 Apr 2021 15:01:13 +0000 (17:01 +0200)]
radv: pass an image view to vi_get_fast_clear_parameters()
image_format was unused.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Samuel Pitoiset [Wed, 28 Apr 2021 14:44:09 +0000 (16:44 +0200)]
radv: use more explicit DCC clear codes
No functional changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10518>
Tomeu Vizoso [Fri, 6 Aug 2021 10:05:45 +0000 (12:05 +0200)]
virgl/ci: Set NIR_VALIDATE=0 on the host
We aren't testing LLVMPipe in these jobs, and shader compilation is
currently the bottleneck.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12196>
Tomeu Vizoso [Fri, 6 Aug 2021 08:28:36 +0000 (10:28 +0200)]
virgl/ci: Wait a bit before shutting the VM down
Sometimes, the VM powered off before all the output from the guest got
to the console.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12196>
Tomeu Vizoso [Thu, 5 Aug 2021 06:27:15 +0000 (08:27 +0200)]
virgl/ci: Rebalance concurrency
Crosvm deals with virtio-gpu commands sequentially, so parallelization
in the host doesn't help much.
Also, too much parallelization in the guest causes some tests to time
out.
So reduce the number of dEQP instances being run concurrently, make sure
we don't limit the number of CPUs being used in the host and schedule
more jobs in CI to keep the times below 10 minutes.
Closes: #5172
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12196>
Tomeu Vizoso [Wed, 4 Aug 2021 05:51:31 +0000 (07:51 +0200)]
virgl/ci: Have LLVMPipe use more threads for rendering
dEQP isn't heavy on rendering, but rendering is in the critical path
because all dEQP processes wait for Crosvm to service their requests on
a single thread.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12196>
Samuel Pitoiset [Fri, 6 Aug 2021 15:08:09 +0000 (17:08 +0200)]
radv: fix reported sample counts for VRS 1x1
The Vulkan spec requires ~0 for 1x1.
Fixes dEQP-VK.fragment_shading_rate.misc.shading_rates.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12245>
Samuel Pitoiset [Fri, 6 Aug 2021 15:00:26 +0000 (17:00 +0200)]
radv: bump maxFragmentShadingRateCoverageSamples to 32
Minimum required value is 16 but we support up to 32
(2x2 VRS with MSAA 8x).
Fixes dEQP-VK.fragment_shading_rate.misc.limits.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12245>