platform/upstream/mesa.git
Marek Olšák [Tue, 17 Aug 2021 16:57:03 +0000 (12:57 -0400)]
radv: allow arbitrary swizzle modes for displayable DCC

by adding retile pipeline variants

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>

Marek Olšák [Tue, 17 Aug 2021 16:57:03 +0000 (12:57 -0400)]
radeonsi: allow arbitrary swizzle modes for displayable DCC

by adding retile shader variants

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>

Danylo Piliaiev [Thu, 19 Aug 2021 11:53:23 +0000 (14:53 +0300)]
ir3: prohibit folding of half->full conversion into mul.s24/u24

mul.s24/u24 always return a 32b result regardless of the size of their
sources, hence we cannot guarantee that the high 16b of dst are zero or
sign extended.
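
To illustrate the difference, here is a minimal C sketch (not ir3 code; both
function names are made up for this example, and the 24-bit source truncation
of the real instruction is ignored). It contrasts a 16-bit multiply followed
by a widening conversion with a multiply that produces a 32-bit result
directly:

```c
#include <stdint.h>

/* 16-bit multiply, then zero-extend: the high 16 bits are always zero. */
static uint32_t mul16_then_widen(uint16_t a, uint16_t b)
{
   return (uint16_t)((uint32_t)a * b);
}

/* Conversion folded into a multiply with a 32-bit result: the high
 * 16 bits hold whatever the full product puts there. */
static uint32_t mul_with_32b_result(uint16_t a, uint16_t b)
{
   return (uint32_t)a * (uint32_t)b;
}
```

For a = 0x1000 and b = 0x0010 the first function returns 0, while the second
returns 0x10000, so the two forms are not interchangeable.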

Fixes cts tests on a650:
 dEQP-VK.spirv_assembly.type.scalar.i16.mul_test_high_part_zero_*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12471>

Connor Abbott [Wed, 18 Aug 2021 11:06:37 +0000 (13:06 +0200)]
freedreno/ci: Add spillall tests

Only test shader tests, because the others are unlikely to have
interesting shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 12:06:04 +0000 (14:06 +0200)]
ir3, turnip, freedreno: Report stp/ldp in shader stats

This is important after spilling, so that we get an indication when a
change causes spilling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 11:57:24 +0000 (13:57 +0200)]
ir3: Fix getting stp/ldp components in ir3_info

Noticed by inspection when adding stp_count/ldp_count.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 11:12:30 +0000 (13:12 +0200)]
ir3: Initial support for spilling non-shared registers

Support for spilling shared registers to normal registers is still TODO.
There are also several improvements to be made, like rematerialization.

Note, there is one behavior change to register pressure accounting: we
now include half registers in the current full pressure directly in
mergedregs mode, rather than adding the max half pressure to the max
full pressure afterwards, which might result in lower calculated max
pressure in some cases with half registers. This is needed for spilling,
since we need to make sure the total pressure including half registers
is below the maximum at each instruction. Because the entire pass is
rewritten, including the register pressure calculating parts, it didn't
seem worth it to separate out this change.
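
A small C sketch of the two accounting schemes, assuming hypothetical
per-instruction arrays of live full and half registers that are already
expressed in the same units (none of these names come from ir3):

```c
#include <stddef.h>

/* Old scheme: max full pressure plus max half pressure, each taken
 * over the whole shader independently. */
static unsigned max_pressure_separate(const unsigned *full,
                                      const unsigned *half, size_t n)
{
   unsigned max_full = 0, max_half = 0;
   for (size_t i = 0; i < n; i++) {
      if (full[i] > max_full)
         max_full = full[i];
      if (half[i] > max_half)
         max_half = half[i];
   }
   return max_full + max_half;
}

/* New scheme: add half registers into the pressure at each instruction,
 * then take the maximum of the combined value. */
static unsigned max_pressure_combined(const unsigned *full,
                                      const unsigned *half, size_t n)
{
   unsigned max_total = 0;
   for (size_t i = 0; i < n; i++) {
      unsigned total = full[i] + half[i];
      if (total > max_total)
         max_total = total;
   }
   return max_total;
}
```

With full = {8, 2} and half = {1, 6}, the first function reports 14 while the
second reports 9, since the two peaks never occur at the same instruction.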

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Thu, 19 Aug 2021 16:50:07 +0000 (18:50 +0200)]
ir3: Fix compress_regs_left accounting for half-regs

This was just wrong - we need to check against the entire register file,
and we need to include removed full regs even if the register we're
trying to insert is a half-reg, or else we could run out of space when
reinserting full regs after it. There does need to be an additional
check so that we don't try to insert a half-reg beyond the half-reg
limit, but that has to happen in addition to the normal check.

This fixes KHR-GLES31.core.arrays_of_arrays.InteractionArgumentAliasing6
once spilling is added.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Wed, 18 Aug 2021 12:43:48 +0000 (14:43 +0200)]
ir3: Properly validate pcopy reg sizes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Tue, 17 Aug 2021 15:58:15 +0000 (17:58 +0200)]
ir3: Fix RA debug printing

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 11:08:59 +0000 (13:08 +0200)]
ir3: Add ra_foreach_src_n/ra_foreach_dst_n

I found ra_foreach_src_n useful in one place in the spiller. But this
also aligns RA with the rest of the compiler and stops us from
reinventing the iterators.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 11:05:50 +0000 (13:05 +0200)]
ir3: Add loop depth to ir3_block

And while we're at it, fix adding loop_id for the continue block.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 10:55:39 +0000 (12:55 +0200)]
ir3/ra: Make ir3_reg_interval_remove_all() useful for spilling

RA uses this to pop and then reinsert intervals when shuffling around
registers. For spilling, we want to remove the interval and also mark
all its descendants as removed. Since "remove_all" sounds more like the
latter, rename the old "remove_all" to "remove_temp". "remove_all" was
already exposed in ir3_ra.h, so there's no need to add it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 09:59:34 +0000 (11:59 +0200)]
ir3/ra: Handle huge merge sets

It can happen that we create an enormous merge set, even larger than the
entire register file, in which case find_best_gap() would loop
infinitely. This seems to be triggered more often with
IR3_SHADER_DEBUG=spillall, since it actually happened with a CTS test.
Just bail out in that case.

Fixes: 0ffcb19b9d9 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 09:56:14 +0000 (11:56 +0200)]
ir3/ra: Fix available bitset for live-through collect srcs

When we mark live-through sources that are merged with the destination
as killed, we kept the bitsets in sync, but we forgot to keep them in
sync when unmarking them after allocating the destination. The result
was that "available" wasn't correct for any instruction afterwards. This
resulted in a bad register allocation with IR3_SHADER_DEBUG=spillall for
a dEQP-VK test.

While we're changing this, use ra_foreach_src().

Fixes: 0ffcb19b9d9 ("ir3: Rewrite register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 09:50:46 +0000 (11:50 +0200)]
ir3/ra: Reinitialize interval when inserting

Otherwise when an interval is removed and then re-inserted it could
have an invalid/corrupted parent link and child tree. I think RA
happened to never do this, but spilling will.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 09:47:49 +0000 (11:47 +0200)]
ir3/merge_regs: Set wrmask for pcopy destinations

This was wrong, and with spilling we can now create vector phi's in rare
circumstances.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 09:43:56 +0000 (11:43 +0200)]
ir3/print: Use mesa_stream_log_printf for (kill)

This was missed during the conversion.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Wed, 21 Jul 2021 13:01:32 +0000 (15:01 +0200)]
ir3: Print physical successors/predecessors

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 09:45:10 +0000 (11:45 +0200)]
ir3: Copy-propagate single-source phis

These can be created when removing unreachable control flow, and it
seems easier to remove them than to add special code to handle them when
spilling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Fri, 23 Jul 2021 12:34:39 +0000 (14:34 +0200)]
ir3/ra: Remove logical_unreachable

This reverts 394c597b1b31842b3943e30ab7f21359b0076b13, although I had to
manually do it due to the reformatting.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Connor Abbott [Wed, 21 Jul 2021 13:03:21 +0000 (15:03 +0200)]
ir3: Add pass to remove unreachable blocks

Rather than continue to add special cases for these, just clean them up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12033>

Erik Faye-Lund [Fri, 13 Aug 2021 12:08:58 +0000 (14:08 +0200)]
draw: improve numerical stability in clipper

Floats have much better precision close to zero than close to one, so
let's make sure we compute an interpolation factor that goes in the
direction that discards the fewest bits.

This makes a big difference when interpolating from very small to very
large values for screen-space positions.
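
One way to read this is sketched below in C. It is an illustration under
assumptions, not the draw module's code: the function name and the scalar
attribute are made up. The crossing point is measured from whichever endpoint
is nearer to the clip plane, so the interpolation factor stays in [0, 0.5]:

```c
#include <assert.h>

/* Interpolate attribute values a0/a1 to the point where an edge crosses a
 * clip plane. d0 and d1 are the signed distances of the two endpoints to
 * the plane and must have opposite signs. */
static float clip_interp(float a0, float d0, float a1, float d1)
{
   assert((d0 < 0.0f) != (d1 < 0.0f));

   float abs_d0 = d0 < 0.0f ? -d0 : d0;
   float abs_d1 = d1 < 0.0f ? -d1 : d1;

   if (abs_d0 <= abs_d1) {
      /* Crossing is nearer to endpoint 0: t is in [0, 0.5]. */
      float t = d0 / (d0 - d1);
      return a0 + t * (a1 - a0);
   } else {
      /* Crossing is nearer to endpoint 1: measure from that end instead. */
      float t = d1 / (d1 - d0);
      return a1 + t * (a0 - a1);
   }
}
```

Both orderings of the same edge give the same result, but the factor that is
actually computed is always the smaller of the two possible ones.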

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12355>

Guilherme Gallo [Fri, 20 Aug 2021 00:08:10 +0000 (21:08 -0300)]
gitlab-ci: Fix trace expectations for iris devices

I checked the output images of the failed trace jobs against the
reference ones, looking for artifacts with the naked eye and with image
diffs. No significant change was found, so the traces produced by the
failed jobs can be considered valid.

Updated devices' traces:
* Intel Comet Lake: iris-cml-traces
* Intel Gemini Lake: iris-glk-traces
* Intel Kaby Lake: iris-kbl-traces
* Intel Whiskey Lake: iris-whl-traces

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12394>

Guilherme Gallo [Mon, 16 Aug 2021 15:18:13 +0000 (12:18 -0300)]
gitlab-ci: enable testing on Intel Comet Lake (experimental)

* Integrate sarien Chromebook devices from Collabora lab
* Based on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11162

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12394>

Guilherme Gallo [Fri, 13 Aug 2021 16:54:41 +0000 (13:54 -0300)]
gitlab-ci: enable testing on Intel Whiskey Lake (experimental)

* Integrate sarien Chromebook devices from Collabora lab
* Based on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11162

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12394>

Ilia Mirkin [Thu, 19 Aug 2021 03:14:12 +0000 (23:14 -0400)]
mesa: rgb10_a2 is never color-renderable in gles2

Fixes
dEQP-GLES2.functional.fbo.completeness.renderable.texture.color0.rgb10_a2 on
GLES2 drivers which support RGB10_A2 textures.
GL_OES_required_internalformat does not make it a color-renderable
format.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4972
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12464>

Emma Anholt [Wed, 18 Aug 2021 20:30:57 +0000 (13:30 -0700)]
freedreno/a6xx: Sync TFB BO access against prior TFB writes.

CTS draw_indirect usage of TFB output was flaking due to the TFB writes
possibly not having completed.  Since GL TFB doesn't require any other
barrier between TFB and use of the BO (as seen by the CTS not emitting any
memory barrier), we have to do it ourselves.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12457>

Emma Anholt [Wed, 18 Aug 2021 19:49:10 +0000 (12:49 -0700)]
freedreno/ir3: Align driver param upload size/offset for indirect uploads.

For indirect draws, we have to upload some of the params as indirect
references, which have a more strict size requirement.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12455>

Emma Anholt [Wed, 18 Aug 2021 19:34:01 +0000 (12:34 -0700)]
freedreno/ir3: Apply the a6xx samgq workaround to TES/TCS/GS as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12454>

Jason Ekstrand [Thu, 19 Aug 2021 15:51:17 +0000 (10:51 -0500)]
anv: Set CONTEXT_PARAM_RECOVERABLE to false

We want the kernel to ban our context immediately instead of foolhardily
attempting to recover.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12476>

Rhys Perry [Thu, 15 Jul 2021 16:46:40 +0000 (17:46 +0100)]
aco/tests: add tests for post-RA DPP combining

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Mon, 19 Jul 2021 14:39:34 +0000 (15:39 +0100)]
aco/tests: add tests for pre-RA DPP combining

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Tue, 30 Jun 2020 14:33:18 +0000 (15:33 +0100)]
aco: combine DPP into VALU after RA

Mostly helps a bunch of Cyberpunk 2077 shaders.

fossil-db (Sienna Cichlid):
Totals from 26 (0.02% of 150170) affected shaders:
CodeSize: 83208 -> 81528 (-2.02%)
Instrs: 14728 -> 14308 (-2.85%)
Latency: 48041 -> 47793 (-0.52%)
InvThroughput: 10836 -> 10578 (-2.38%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Tue, 30 Jun 2020 14:33:18 +0000 (15:33 +0100)]
aco: combine DPP into VALU before RA

Mostly helps a bunch of Cyberpunk 2077 shaders. Catches some of the cases
that the post-RA can't optimize because of register assignment.

fossil-db (Sienna Cichlid):
Totals from 25 (0.02% of 150170) affected shaders:
CodeSize: 78808 -> 75764 (-3.86%)
Instrs: 14311 -> 13547 (-5.34%)
Latency: 278697 -> 277885 (-0.29%)
InvThroughput: 63428 -> 62754 (-1.06%)
Copies: 1348 -> 1349 (+0.07%); split: -0.07%, +0.15%
PreVGPRs: 1035 -> 1011 (-2.32%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Mon, 19 Jul 2021 13:26:42 +0000 (14:26 +0100)]
aco: handle DPP in the optimizer

There are a bunch of optimizations that are broken when DPP is involved.

fossil-db (Sienna Cichlid):
Totals from 100 (0.07% of 150170) affected shaders:
CodeSize: 325204 -> 325192 (-0.00%); split: -0.06%, +0.05%
Instrs: 62773 -> 62664 (-0.17%); split: -0.18%, +0.00%
Latency: 295348 -> 295266 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 73990 -> 73946 (-0.06%); split: -0.06%, +0.01%
Copies: 1650 -> 1609 (-2.48%); split: -2.55%, +0.06%
PreSGPRs: 3554 -> 3520 (-0.96%)

Fossil-db changes are probably because v_sub_f32_dpp(v_mul_f32) is no
longer being combined into MAD and then split back into separate
instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Thu, 8 Jul 2021 16:43:37 +0000 (17:43 +0100)]
aco: make optimize_postRA() work across blocks

fossil-db (Sienna Cichlid):
Totals from 46 (0.03% of 150170) affected shaders:
CodeSize: 103672 -> 103488 (-0.18%)
Instrs: 21968 -> 21922 (-0.21%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Wed, 14 Jul 2021 16:22:02 +0000 (17:22 +0100)]
aco: move a bunch of helpers into aco_ir.h/aco_ir.cpp

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Wed, 14 Jul 2021 16:11:44 +0000 (17:11 +0100)]
aco: add can_use_DPP() and convert_to_DPP()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Rhys Perry [Wed, 7 Jul 2021 19:42:27 +0000 (20:42 +0100)]
aco: fix validation of DPP v_cndmask_b32/v_addc_co_u32

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>

Emma Anholt [Wed, 18 Aug 2021 04:20:51 +0000 (21:20 -0700)]
i915g: clang-format fixup.

I really need to get clang-format into CI so I can stop doing fixups.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>

Emma Anholt [Wed, 18 Aug 2021 02:38:08 +0000 (19:38 -0700)]
i915g: Add comments explaining various xfails.

I haven't gone through every test (particularly the ones I think are
loop-unrolling or instruction-count related), but this gives a
better picture of what's going on in this driver.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>

Emma Anholt [Sat, 14 Aug 2021 03:10:11 +0000 (20:10 -0700)]
i915g: Clear some xfails that are now skips.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>

Emma Anholt [Wed, 18 Aug 2021 03:55:37 +0000 (20:55 -0700)]
i915g: Reduce ARB_fp max tex indirections to match i915c.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>

Emma Anholt [Wed, 18 Aug 2021 03:54:20 +0000 (20:54 -0700)]
i915g: Correct PIPE_SHADER_CAP_MAX_TEMPS.

This is the value that i915c reported, too, and is required for ARB_fp.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>

Emma Anholt [Wed, 18 Aug 2021 23:50:39 +0000 (16:50 -0700)]
i915g: Fix polygon offset by telling draw the Z format.

This is what initializes the MRD for draw's polygon offset calculations.

Closes: #4976
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>

Boyuan Zhang [Thu, 19 Aug 2021 02:47:05 +0000 (22:47 -0400)]
frontends/va: add num_temporal_layers check

Fixes: 51935d59

The temporal_id check is valid only if num_temporal_layers is set (>0).
When num_temporal_layers is 0, we shouldn't check temporal_id and return
an error.
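
The rule described here can be sketched in C as follows. This is a minimal
illustration, not the VA frontend's code: the function name is hypothetical,
and the assumption that a valid temporal_id is one below num_temporal_layers
is mine.

```c
#include <stdbool.h>

/* Returns true when temporal_id should be accepted. */
static bool temporal_id_ok(unsigned num_temporal_layers, unsigned temporal_id)
{
   /* Not set (0): there is nothing to validate against, so skip the check. */
   if (num_temporal_layers == 0)
      return true;

   /* Assumed validity rule: ids are 0 .. num_temporal_layers - 1. */
   return temporal_id < num_temporal_layers;
}
```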

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12463>

Boyuan Zhang [Thu, 19 Aug 2021 02:30:02 +0000 (22:30 -0400)]
radeon/vcn: set min value for num_temporal_layers

Fixes: 51935d59

In the case where num_temporal_layers is not set (0), set it to the
minimum value of 1, otherwise the rate control settings will be missing.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12463>

Daniel Schürmann [Sat, 31 Oct 2020 22:25:12 +0000 (23:25 +0100)]
nir: return false for loops in contains_other_jump()

Allows more loops to be unwrapped.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12473>

Simon Ser [Sat, 14 Aug 2021 12:05:43 +0000 (14:05 +0200)]
v3d: implement resource_get_param

Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.

A tiny helper function is introduced to compute the modifier of a
resource.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb22363935 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>

Simon Ser [Sat, 14 Aug 2021 12:07:28 +0000 (14:07 +0200)]
vc4: implement resource_get_param

Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.

A tiny helper function is introduced to compute the modifier of a
resource.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb22363935 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>

Simon Ser [Sat, 14 Aug 2021 12:03:58 +0000 (14:03 +0200)]
panfrost: implement resource_get_param

Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c092947df30 ("panfrost: fail in get_handle(TYPE_KMS) without a scanout resource")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>

Simon Ser [Sat, 14 Aug 2021 11:57:15 +0000 (13:57 +0200)]
etnaviv: add stride, offset and modifier to resource_get_param

Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.

Instead, extend the resource_get_param hook to allow users to fetch
this information without WINSYS_HANDLE_TYPE_KMS.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 9da901d2b2e7 ("etnaviv: fail in get_handle(TYPE_KMS) without a scanout resource")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>

Erik Faye-Lund [Tue, 17 Aug 2021 17:42:21 +0000 (19:42 +0200)]
gallium/nir/tgsi: initialize file_max for inputs

When this was rewritten to support Vulkan, we stopped initializing
file_max to -1 in the case of no inputs. This causes the draw module
to go down a needlessly pessimistic case, printing an error while we're
at it.

Fixes: 42b5cfdbd26 ("gallivm/nir: fix vulkan vertex inputs")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12440>

Erik Faye-Lund [Tue, 17 Aug 2021 17:41:03 +0000 (19:41 +0200)]
gallium/nir/tgsi: fixup indentation

This was using mixed tabs and spaces, let's fix that before we start
modifying the code.

Fixes: 42b5cfdbd26 ("gallivm/nir: fix vulkan vertex inputs")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12440>

Danylo Piliaiev [Tue, 17 Aug 2021 11:59:56 +0000 (14:59 +0300)]
turnip: apply workaround for depth bounds test without depth test

On some GPUs when:
- depth bounds test is enabled
- depth test is disabled
- depth attachment uses UBWC in sysmem mode
the GPU hangs. As a workaround we should enable the z test. That's what the
blob is doing for a630. And since we enable the z test we should make it
always pass.
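
A minimal C sketch of the idea, using a made-up depth state struct and
function name (the real driver works on packed register values, and the
checks for UBWC, sysmem mode and the affected GPU are left out here):

```c
#include <stdbool.h>

enum compare_func { FUNC_NEVER, FUNC_LESS, FUNC_ALWAYS };

/* Hypothetical depth state, for illustration only. */
struct depth_state {
   bool z_test_enable;
   bool z_bounds_enable;
   enum compare_func zfunc;
};

/* If only the bounds test is on, turn the z test on as well and make it
 * always pass, so the z test itself never rejects a fragment. */
static void apply_depth_bounds_workaround(struct depth_state *ds)
{
   if (ds->z_bounds_enable && !ds->z_test_enable) {
      ds->z_test_enable = true;
      ds->zfunc = FUNC_ALWAYS;
   }
}
```

State that already has the z test enabled is left untouched, so a
user-selected compare function is never overwritten.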

Blob doesn't emit this workaround on a650 and a660. Untested on a640.

Fixes:
 dEQP-VK.pipeline.extended_dynamic_state.two_draws_static.depth_bounds_test_disable
 dEQP-VK.pipeline.extended_dynamic_state.two_draws_dynamic.depth_bounds_test_disable
 dEQP-VK.dynamic_state.ds_state.depth_bounds_1

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12407>

Danylo Piliaiev [Tue, 17 Aug 2021 15:19:06 +0000 (18:19 +0300)]
freedreno: rename Z_TEST_ENABLE->Z_READ_ENABLE, Z_ENABLE->Z_TEST_ENABLE

This makes their interaction with Z_BOUNDS_ENABLE more understandable.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12407>

Erik Faye-Lund [Wed, 11 Aug 2021 14:53:11 +0000 (16:53 +0200)]
draw: fix stippling of fractional lines

The OpenGL 4.6 specification, section 14.5.2.1 (Line Stipple) says:

> The masking is achieved using three parameters: the 16-bit line
> stipple p, the line repeat count r, and an integer stipple counter s.

This makes it pretty clear that the stipple counter shouldn't carry
fractional parts. But we also don't really do anything useful with the
fractional part anyway, apart from skewing the third or later line-segments.
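
The OpenGL specification selects the stipple bit for a fragment as bit
(s / r) mod 16 of p, with an integer s. A minimal C sketch of that lookup
(the function name is made up for illustration; this is not the draw
module's code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the fragment at integer stipple counter s is drawn,
 * given the 16-bit pattern p and the repeat count r (r >= 1). */
static bool stipple_bit(uint16_t p, unsigned r, unsigned s)
{
   assert(r >= 1);
   unsigned b = (s / r) % 16; /* integer division: no fractional part */
   return (p >> b) & 1u;
}
```

With p = 0x00FF and r = 1, counters 0 to 7 are drawn, 8 to 15 are masked,
and the pattern starts over at 16.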

Properly carrying over the fractional parts as the Vulkan specification
allows for rectangular lines is trickier than this and would require us
to use a shorter output-line at the start of the following
line-segments.

But let's just do what the OpenGL specification describes, and the
Vulkan specification allows for now.

This, combined with the following patch for the vulkan CTS makes the
last two rasterization-tests pass for me:

https://github.com/KhronosGroup/VK-GL-CTS/pull/279

Fixes the "spec/!opengl 1.1/linestipple/line strip" piglit-test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12327>

Marcin Ślusarz [Tue, 10 Aug 2021 13:02:51 +0000 (15:02 +0200)]
turnip: use nir_shader_instructions_pass in tu_lower_io

No functional changes.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>

Marcin Ślusarz [Tue, 10 Aug 2021 12:40:41 +0000 (14:40 +0200)]
r600: preserve all metadata when passes don't make progress

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>

Marcin Ślusarz [Tue, 10 Aug 2021 12:10:16 +0000 (14:10 +0200)]
r600: use nir_shader_instructions_pass in r600_nir_lower_atomics

Changes:
- nir_metadata_preserve(..., nir_metadata_all) is called when pass doesn't
  make progress

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>

Marcin Ślusarz [Tue, 10 Aug 2021 10:52:46 +0000 (12:52 +0200)]
freedreno/ir3: use nir_metadata_none instead of its value

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12467>

Samuel Pitoiset [Fri, 13 Aug 2021 08:39:38 +0000 (10:39 +0200)]
radv: do not allocate the FCE predicate for images that use comp-to-single

Images that support comp-to-single don't have to be fast-cleared at
all, so the predicate is unnecessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12323>

Samuel Pitoiset [Fri, 13 Aug 2021 08:38:36 +0000 (10:38 +0200)]
radv: remove useless check about the FCE predicate offset

radv_update_fce_metadata() already prevents that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12323>

Samuel Pitoiset [Wed, 11 Aug 2021 10:28:30 +0000 (12:28 +0200)]
radv: determine if an image supports comp-to-single at creation time

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12323>

Juan A. Suarez Romero [Wed, 18 Aug 2021 09:02:03 +0000 (11:02 +0200)]
broadcom/ci: use deqp-runner suites for gles

Glue together all the GLES related jobs using the suites feature.

This allows us to reduce the total number of devices required, moving
some of them to help in other jobs and leaving the remaining ones free
for other pipelines in parallel.

Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12453>

3 years agoglsl: refactor code to avoid static analyzer noise
Marcin Ślusarz [Tue, 17 Aug 2021 09:17:18 +0000 (11:17 +0200)]
glsl: refactor code to avoid static analyzer noise

Clang analyzer thinks struct_base_offset can be used uninitialized
because it doesn't know that glsl_type_is_struct_or_ifc returns
the same value for the same type.

Refactor the code to make it clear what is going on. As a side effect
this should be faster because glsl_get_length and
glsl_type_is_struct_or_ifc will be called only once (they are not
inline functions).

This is an alternative approach to
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12399.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12403>

3 years agonir/inline_uniforms: support loop
Qiang Yu [Mon, 19 Jul 2021 06:25:36 +0000 (14:25 +0800)]
nir/inline_uniforms: support loop

Allow uniforms used in a loop to be inlined so that the loop can be
unrolled. Nested loops/ifs are also supported.

Some examples:

    for (i = 0; i < count; i++)
        ...

The uniform "count" will be inlined. Note, however, that this does not
guarantee that the loop will be unrolled (e.g. count = 1000).

    for (i = 0; i < count; i++)
        for (j = init; j < 10; j++)
            if (type == 2)
                ...

The uniforms "count", "init" and "type" will be inlined.

Uniforms are intentionally not added too aggressively, to avoid
false positives while still supporting the most common usage.
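
As a rough illustration of why inlining the loop bound matters, here is a
minimal sketch in plain C rather than NIR. The function names are
hypothetical and not part of Mesa; "count" stands in for the uniform, and
the second function shows what a compiler could produce once "count" is
known to be 4.

```c
#include <assert.h>

/* Generic loop: "count" plays the role of a uniform whose value is not
 * known at compile time, so the trip count is unknown as well. */
static int sum_generic(const int *data, int count)
{
   int sum = 0;
   for (int i = 0; i < count; i++)
      sum += data[i];
   return sum;
}

/* Hypothetical specialised variant after "count" has been inlined as the
 * constant 4: the trip count is now known, so the loop can be fully
 * unrolled into straight-line code. */
static int sum_count_inlined_4(const int *data)
{
   assert(data != 0);
   return data[0] + data[1] + data[2] + data[3];
}
```

With a large inlined value such as count = 1000 the trip count is still
known, but unrolling may not be worthwhile, which is the caveat noted above.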

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>

3 years agonir/loop_analyze: skip unsupported induction variable early
Qiang Yu [Mon, 26 Jul 2021 09:13:52 +0000 (17:13 +0800)]
nir/loop_analyze: skip unsupported induction variable early

Instead of failing in the trip count calculation, just don't mark this
kind of variable as an induction variable in the first place.

Uniform inlining doesn't need to bother with this kind of variable
either.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>

3 years agonir/loop_analyze: record induction variables for each loop
Qiang Yu [Thu, 15 Jul 2021 09:40:40 +0000 (17:40 +0800)]
nir/loop_analyze: record induction variables for each loop

To be used by the uniform inline lowering pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>

3 years agonir/loop_analyze: move nir_is_supported_terminator_condition() to header
Qiang Yu [Mon, 26 Jul 2021 08:42:14 +0000 (16:42 +0800)]
nir/loop_analyze: move nir_is_supported_terminator_condition() to header

To be shared with uniform inline.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>

3 years agonir/inline_uniforms: support vector uniform
Qiang Yu [Thu, 22 Jul 2021 08:16:58 +0000 (16:16 +0800)]
nir/inline_uniforms: support vector uniform

Collect per-component dependencies for vectors and lower a vector
uniform load to scalar if any component needs to be inlined.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>

3 years agonir/inline_uniforms: add uniforms in condition atomically
Qiang Yu [Mon, 19 Jul 2021 01:54:37 +0000 (09:54 +0800)]
nir/inline_uniforms: add uniforms in condition atomically

Unless all uniforms in the condition can be inlined, we cannot
lower the if/loop. So we roll back the added uniforms when one
of the uniforms in a condition fails to be added.
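
The all-or-nothing behaviour can be sketched as follows. This is a minimal
illustration, not the code in the pass: the struct, the function names and
the MAX_INLINABLE cap are assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INLINABLE 4 /* hypothetical cap on inlinable uniforms */

struct uniform_list {
   unsigned offsets[MAX_INLINABLE];
   unsigned count;
};

/* Add one uniform offset; fails when the list is full. */
static bool add_uniform(struct uniform_list *list, unsigned offset)
{
   for (unsigned i = 0; i < list->count; i++) {
      if (list->offsets[i] == offset)
         return true; /* already recorded */
   }
   if (list->count == MAX_INLINABLE)
      return false;
   list->offsets[list->count++] = offset;
   return true;
}

/* Add every uniform used by one condition, or none of them. */
static bool add_condition_uniforms(struct uniform_list *list,
                                   const unsigned *offsets, unsigned n)
{
   const unsigned saved_count = list->count;

   for (unsigned i = 0; i < n; i++) {
      if (!add_uniform(list, offsets[i])) {
         /* Roll back: a partially inlined condition cannot be folded,
          * so the uniforms added so far would only waste slots. */
         list->count = saved_count;
         return false;
      }
   }
   return true;
}
```

Restoring the saved count is enough here because entries are only ever
appended; a different container would need its own undo step.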

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>

3 years agomesa: don't return errors for gl_* GetFragData* queries
Ilia Mirkin [Thu, 12 Aug 2021 02:26:07 +0000 (22:26 -0400)]
mesa: don't return errors for gl_* GetFragData* queries

There is nothing in the spec about this. BindFragDataLocation* is
supposed to return an error, but not Get.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5221
Fixes: 59012c3133 ("mesa: Implement glGetFragDataLocation")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12333>

3 years agopanfrost: Add unit tests for non-dithered clears
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:17:38 +0000 (22:17 +0000)]
panfrost: Add unit tests for non-dithered clears

Would have exposed the bug fixed in the previous commit. This is gnarly
stuff, let's not regress it.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>

3 years agopanfrost: Handle non-dithered clear colours
Alyssa Rosenzweig [Wed, 18 Aug 2021 22:16:56 +0000 (22:16 +0000)]
panfrost: Handle non-dithered clear colours

In b9c095cc2c6 ("panfrost: Rewrite the clear colour packing code"),
packing of clear colours was corrected to use the tilebuffer's
fractional bits, fixing dithering of the clear colour with formats like
RGB565. Unfortunately, that commit did so unconditionally. If the
framebuffer is dithered, but dithering is disabled at the time of
the clear, we would incorrectly dither the clear.

This is a regression, as the old (broken) code passed the relevant CTS
test. What's the catch? Depending on dither state, there are two
formulas to pack tilebuffer colours. We need to handle both. Fixes
KHR-GLES31.core.draw_buffers_indexed.color_masks.

Fixes: b9c095cc2c6 ("panfrost: Rewrite the clear colour packing code")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>

3 years agopanfrost: Add dither state to the clear colour tests
Alyssa Rosenzweig [Wed, 18 Aug 2021 16:21:55 +0000 (16:21 +0000)]
panfrost: Add dither state to the clear colour tests

There is a dependence on dithering state about which I was previously
unaware. All these test cases were with dithering enabled, so mark that
down.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>

3 years agobroadcom/qpu: use and expand version info at opcode description
Alejandro Piñeiro [Sun, 8 Aug 2021 00:18:18 +0000 (02:18 +0200)]
broadcom/qpu: use and expand version info at opcode description

Right now the opcode_desc struct, used to define data for all the
operations to pack/unpack, includes a version field. In theory that
could be used to check whether we are retrieving an opcode valid for
our hw version, or to get the correct opcode if a given one changed
across hw versions, or just the same one if it didn't change.

In practice that field was not used. So for example, if by mistake we
asked for an opcode defined at version 41, while being on version 33
hardware, we would still get that opcode description.

This commit fixes that, and while we are here we expand the
functionality to allow defining version ranges, in case a given opcode
number and its description are only valid for a given range.
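
A minimal sketch of version-range filtering, assuming a table of opcode
descriptions. The struct layout, the field names and the convention that 0
means "unbounded" are assumptions for illustration, not the actual
definitions in the broadcom/qpu code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode description with an inclusive version range.
 * A bound of 0 means "no limit" on that side. */
struct opcode_desc_sketch {
   uint8_t opcode;
   uint8_t first_ver;
   uint8_t last_ver;
   const char *name;
};

static int valid_in_version(const struct opcode_desc_sketch *desc, uint8_t ver)
{
   return (desc->first_ver == 0 || ver >= desc->first_ver) &&
          (desc->last_ver == 0 || ver <= desc->last_ver);
}

/* Return the description matching both the opcode number and the hardware
 * version, or NULL if the opcode does not exist on that version. */
static const struct opcode_desc_sketch *
lookup_opcode(const struct opcode_desc_sketch *table, size_t n,
              uint8_t opcode, uint8_t ver)
{
   assert(table != NULL);
   for (size_t i = 0; i < n; i++) {
      if (table[i].opcode == opcode && valid_in_version(&table[i], ver))
         return &table[i];
   }
   return NULL;
}
```

With this shape, asking for an opcode that only exists from version 41 while
on version 33 hardware yields NULL instead of a description that is not
valid there.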

v2 (from Iago feedback):
   * Fixed some comment typos
   * Simplified filtering opcode method
   * Rename filtering opcode method

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>

3 years agobroadcom/qpu: add new lookup opcode description helper
Alejandro Piñeiro [Mon, 9 Aug 2021 23:35:14 +0000 (01:35 +0200)]
broadcom/qpu: add new lookup opcode description helper

Right now there is a helper to get the opcode description from a
packed instruction, used on unpack related instructions. This commit
adds a helper that refactors the equivalent that is already in use on
pack related instructions.

Right now the helper is small, but we plan to extend it on following
commits in order to use the opcode description version field.

To avoid any possible confusion we rename the existing lookup helper.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>

3 years agobroadcom/qpu: update/remove comments
Alejandro Piñeiro [Tue, 3 Aug 2021 23:10:00 +0000 (01:10 +0200)]
broadcom/qpu: update/remove comments

   * Remove one about waddr 6 being reserved, when at some point it
     became NOP

   * Fix one comment about reserved signals on v41 map, as 24 and 25
     are in fact defined. This seems to be a C&P issue (see v40 map).

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>

3 years agoci/freedreno: Flake the rest of the pbuffer/window dEQP-EGL tests.
Emma Anholt [Wed, 18 Aug 2021 22:35:09 +0000 (15:35 -0700)]
ci/freedreno: Flake the rest of the pbuffer/window dEQP-EGL tests.

I had at least 3 of these in my logs, I see no reason not to fill out the
rest at this point.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12458>

3 years agoci/freedreno: Mark a new flaky SSBO length test.
Emma Anholt [Wed, 18 Aug 2021 22:22:52 +0000 (15:22 -0700)]
ci/freedreno: Mark a new flaky SSBO length test.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12458>

3 years agointel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
Ian Romanick [Sat, 23 Jan 2021 22:28:07 +0000 (14:28 -0800)]
intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms

This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76.

With the previous optimizations in place, this change seems to improve
the quality of the generated code.  Comparing a couple Vulkan CTS tests
on Skylake had the following results.

dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends

dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agonir: intel/compiler: Add and use nir_op_pack_32_4x8_split
Ian Romanick [Tue, 26 Jan 2021 00:31:17 +0000 (16:31 -0800)]
nir: intel/compiler: Add and use nir_op_pack_32_4x8_split

A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.
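
For reference, the packing itself amounts to the following. This is a plain
C sketch; the function name is hypothetical, and the assumption that the
first component lands in the low byte should be checked against the NIR
opcode definition.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 8-bit components into one 32-bit word, with x in the lowest
 * byte and w in the highest (assumed ordering). */
static uint32_t pack_4x8_sketch(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
   return (uint32_t)x |
          ((uint32_t)y << 8) |
          ((uint32_t)z << 16) |
          ((uint32_t)w << 24);
}
```

Recognising the whole pattern as one operation lets a backend emit a short
packing sequence and a single 32-bit store instead of shifting and moving
each byte separately.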

v2: Rebase on b4369de27fc ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agonir/algebraic: Remove spurious conversions from inside logic ops
Ian Romanick [Tue, 26 Jan 2021 00:31:44 +0000 (16:31 -0800)]
nir/algebraic: Remove spurious conversions from inside logic ops

Not only does this eliminate a bunch of unnecessary type converting
MOVs, but it can also enable some SWAR.  The
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag test does
something about like:

    c = a.x ^ b.x;
    d = a.y ^ b.y;
    e = a.z ^ b.z;

After this change, it looks more like:

    uint t = i8vec3AsUint(a) ^ i8vec3AsUint(b);
    c = extract_u8(t, 0);
    d = extract_u8(t, 1);
    e = extract_u8(t, 2);
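
The equivalence can be checked in plain C. This is a sketch with
hypothetical helper names: pack3() stands in for i8vec3AsUint and the
extract_u8() here is an ordinary shift-and-mask.

```c
#include <assert.h>
#include <stdint.h>

/* View three bytes as the low 24 bits of one 32-bit word. */
static uint32_t pack3(const uint8_t v[3])
{
   return (uint32_t)v[0] | ((uint32_t)v[1] << 8) | ((uint32_t)v[2] << 16);
}

/* Extract byte number "index" (0 = lowest) from a 32-bit word. */
static uint8_t extract_u8(uint32_t word, unsigned index)
{
   assert(index < 4);
   return (uint8_t)((word >> (8 * index)) & 0xff);
}

/* SWAR form: one 32-bit XOR replaces three 8-bit XORs. This works because
 * XOR never carries between bit positions, so each byte lane stays
 * independent of its neighbours. */
static void xor_vec3_swar(const uint8_t a[3], const uint8_t b[3],
                          uint8_t out[3])
{
   const uint32_t t = pack3(a) ^ pack3(b);
   for (unsigned i = 0; i < 3; i++)
      out[i] = extract_u8(t, i);
}
```

The same trick applies to AND and OR, but not directly to operations such
as addition, where carries cross byte boundaries.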

On Ice Lake, this results in:

SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 31 instructions. 1 loops. 2844 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agointel/fs: Emit better code for u2u of extract
Ian Romanick [Wed, 27 Jan 2021 03:52:50 +0000 (19:52 -0800)]
intel/fs: Emit better code for u2u of extract

Emitting the instructions one by one results in two MOV instructions
that won't be propagated.  By handling both instructions at once, a
single MOV is emitted.  For example, on Ice Lake this helps
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends

Without "intel/fs: Allow copy propagation between MOVs of mixed sizes,"
the improvement is still 8 instructions, but there are more instructions
to begin with:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agointel/fs: Allow copy propagation between MOVs of mixed sizes
Ian Romanick [Tue, 13 Apr 2021 21:07:19 +0000 (14:07 -0700)]
intel/fs: Allow copy propagation between MOVs of mixed sizes

This eliminates some spurious, size-converting moves.  For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends

Unfortunately, this doesn't clean everything up.  Here's a subset of the
"before" assembly:

send(8)         g11<1>UW        g2<0,1,0>UD     0x02106e02
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g7<4>UB         g11<8,8,1>UD                    { align1 1Q };
mov(8)          g12<1>UB        g7<32,8,4>UB                    { align1 1Q };
send(8)         g13<1>UW        g2<0,1,0>UD     0x02106e03
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g15<1>UW        g12<8,8,1>UB                    { align1 1Q };
mov(8)          g8<4>UB         g13<8,8,1>UD                    { align1 1Q };
mov(8)          g14<1>UB        g8<32,8,4>UB                    { align1 1Q };
mov(8)          g16<1>UW        g14<8,8,1>UB                    { align1 1Q };
xor(8)          g17<1>UW        g15<8,8,1>UW    g16<8,8,1>UW    { align1 1Q };

And here's the same subset of the "after" assembly:

send(8)         g11<1>UW        g2<0,1,0>UD     0x02106e02
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g7<4>UB         g11<8,8,1>UD                    { align1 1Q };
send(8)         g13<1>UW        g2<0,1,0>UD     0x02106e03
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g15<1>UW        g7<32,8,4>UB                    { align1 1Q };
mov(8)          g8<4>UB         g13<8,8,1>UD                    { align1 1Q };
mov(8)          g16<1>UW        g8<32,8,4>UB                    { align1 1Q };
xor(8)          g17<1>UW        g15<8,8,1>UW    g16<8,8,1>UW    { align1 1Q };

There are a lot of regioning and type restrictions in
fs_visitor::try_copy_propagate, and I'm a little nervous about messing
with them too much.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agonir/algebraic: Optimize some extract forms resulting from 8-bit lowering
Ian Romanick [Wed, 27 Jan 2021 03:51:57 +0000 (19:51 -0800)]
nir/algebraic: Optimize some extract forms resulting from 8-bit lowering

This eliminates some spurious, size-converting moves.  For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 56 instructions. 1 loops. 4444 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends

v2: Condition two of the patterns on !options->lower_extract_byte.
Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agointel/compiler: Document and assert some aspects of 8-bit integer lowering
Ian Romanick [Fri, 22 Jan 2021 22:54:02 +0000 (14:54 -0800)]
intel/compiler: Document and assert some aspects of 8-bit integer lowering

In the vec4 compiler, 8-bit types should never exist.

In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.

Some instructions are handled in non-obvious ways.

Hopefully this will save the next person some time.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>

3 years agoglx: Simplify context API profile computation
Adam Jackson [Mon, 16 Aug 2021 21:34:44 +0000 (17:34 -0400)]
glx: Simplify context API profile computation

GLX_ARB_create_context_profile has some clever language that sets the
default to core profile but silently degrades back to compat for pre-3.2
GLs. We can just do that, rather than track whether the user specified a
profile.
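
The rule can be sketched roughly as below. This is an illustration, not the
libGL code: the function and the SKETCH_ constant names are hypothetical,
with values mirroring the GLX_CONTEXT_*_PROFILE_BIT_ARB bits. The caller is
assumed to start with the core bit and overwrite it only if the application
passes a profile attribute.

```c
#include <assert.h>

/* Local stand-ins mirroring GLX_CONTEXT_CORE_PROFILE_BIT_ARB and
 * GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB. */
#define SKETCH_CORE_PROFILE_BIT   0x00000001u
#define SKETCH_COMPAT_PROFILE_BIT 0x00000002u

/* Profiles only exist from GL 3.2 onwards, so any earlier version
 * degrades to compatibility regardless of what was requested. */
static unsigned effective_profile(unsigned major, unsigned minor,
                                  unsigned requested_profile)
{
   assert(requested_profile == SKETCH_CORE_PROFILE_BIT ||
          requested_profile == SKETCH_COMPAT_PROFILE_BIT);

   if (major < 3 || (major == 3 && minor < 2))
      return SKETCH_COMPAT_PROFILE_BIT;

   return requested_profile;
}
```

Because the version check alone decides the pre-3.2 case, there is no need
for a separate flag recording whether the profile came from the application
or from the default.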

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agoglx/dri: Collect the GLX context attributes in a struct
Adam Jackson [Fri, 6 Aug 2021 20:32:56 +0000 (16:32 -0400)]
glx/dri: Collect the GLX context attributes in a struct

dri2_convert_glx_attribs had way too many arguments, let's fix that.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agoglx/drisw: Remove some misplaced error checks
Adam Jackson [Wed, 11 Aug 2021 14:36:42 +0000 (10:36 -0400)]
glx/drisw: Remove some misplaced error checks

If the driver doesn't like these attributes it can reject them, it's not
libGL's job to verify them here.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agoglx/dri2: Require the driver to support v4 of __DRI_DRI2
Adam Jackson [Wed, 11 Aug 2021 14:18:42 +0000 (10:18 -0400)]
glx/dri2: Require the driver to support v4 of __DRI_DRI2

Mesa has supported this unconditionally since 10.1.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agoglx: Store the context vtable on the glx screen
Adam Jackson [Fri, 6 Aug 2021 21:53:38 +0000 (17:53 -0400)]
glx: Store the context vtable on the glx screen

Again this is rewriting part of driX_create_context_attribs to be
caller-agnostic, so that we can eventually unify it among the DRI
backends.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agoglx: Fix and simplify the share context compatibility check
Adam Jackson [Fri, 6 Aug 2021 21:10:45 +0000 (17:10 -0400)]
glx: Fix and simplify the share context compatibility check

We only end up with one DRI provider per screen, so the only way the
context vtable can differ is if they're not the same directness. Rewrite
the test in those terms to help us unify some of this code away in the
future. Also apply the same logic to the indirect context creation path.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agodri: Reformat DRI context attribute #defines
Adam Jackson [Fri, 6 Aug 2021 20:29:52 +0000 (16:29 -0400)]
dri: Reformat DRI context attribute #defines

These were confusingly sorted before.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12456>

3 years agozink: clear current gfx/compute program upon unbinding its shaders
Mike Blumenkrantz [Tue, 8 Jun 2021 20:48:15 +0000 (16:48 -0400)]
zink: clear current gfx/compute program upon unbinding its shaders

this simplifies a lot of code

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12428>

3 years agozink: do compute shader change on bind
Mike Blumenkrantz [Fri, 14 May 2021 22:47:49 +0000 (18:47 -0400)]
zink: do compute shader change on bind

we can do this update earlier to optimize the actual compute path

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12428>

3 years agozink: flag the gfx pipeline dirty and unset pipeline shader module on shader change
Mike Blumenkrantz [Fri, 14 May 2021 22:31:50 +0000 (18:31 -0400)]
zink: flag the gfx pipeline dirty and unset pipeline shader module on shader change

there's no need to leave this until the module updating when the info
is known much earlier

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12428>

3 years agozink: remove repeated lazy batch dd casts
Mike Blumenkrantz [Fri, 21 May 2021 21:44:02 +0000 (17:44 -0400)]
zink: remove repeated lazy batch dd casts

these all have an ergonomic cost

Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12427>