platform/upstream/mesa.git
Boris Brezillon [Sat, 14 Sep 2019 15:11:03 +0000 (17:11 +0200)]
panfrost: Add flags to reflect the BO imported/exported state

Will be useful to make the ioctl(WAIT_BO) call conditional on BOs that
are not exported/imported (meaning that all GPU accesses are known
by the context).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Sun, 15 Sep 2019 18:17:14 +0000 (20:17 +0200)]
panfrost: Add a panfrost_flush_batches_accessing_bo() helper

This will allow us to only flush batches touching a specific resource,
which is particularly useful when the CPU needs to access a BO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Sun, 15 Sep 2019 17:15:16 +0000 (19:15 +0200)]
panfrost: Add a panfrost_flush_all_batches() helper

And use it in panfrost_flush() to flush all batches, and not only the
one currently bound to the context.

We also replace all internal calls to panfrost_flush() by
panfrost_flush_all_batches() ones.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Sun, 15 Sep 2019 16:23:10 +0000 (18:23 +0200)]
panfrost: Prepare panfrost_fence for batch pipelining

The panfrost_fence logic currently waits on the last submitted batch,
but the batch serialization that was enforced in
panfrost_batch_submit() is about to go away, allowing for several
batches to be pipelined, and the last submitted one is not necessarily
the one that will finish last.

We need to make sure the fence logic waits on all flushed batches, not
only the last one.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
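The point about pipelined batches can be illustrated with a small Python sketch (not driver code). The batch names and completion times are made up for illustration; the only claim is that waiting on the last submitted batch can return before an earlier batch has finished.

```python
# Hypothetical completion times (ms) for pipelined batches, in submission order.
finish_times = {"batch0": 30, "batch1": 12, "batch2": 18}

# Waiting only on the last submitted batch returns too early here.
last_submitted = list(finish_times)[-1]
wait_last_only = finish_times[last_submitted]

# Waiting on every flushed batch returns once the slowest one is done.
wait_all = max(finish_times.values())
```

With these numbers, `wait_last_only` is 18 while `batch0` is still running until 30, which is why the fence has to cover all flushed batches.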
Boris Brezillon [Sun, 15 Sep 2019 11:39:52 +0000 (13:39 +0200)]
panfrost: Start tracking inter-batch dependencies

The idea is to track which BOs are being accessed and the type of access
to determine when a dependency exists. Thanks to that we can build a
dependency graph that will allow us to flush batches in the correct
order.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
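A minimal Python sketch of this kind of tracking, not the driver's actual C implementation. The `Batch`, `Tracker` and `flush` names are hypothetical, and the dependency rules used here (a reader depends on the last writer, a writer depends on the last writer and on every reader since) are an assumption about what "type of access" implies.

```python
class Batch:
    def __init__(self, name):
        self.name = name
        self.deps = []  # batches that must be flushed before this one


class Tracker:
    def __init__(self):
        self.writer = {}   # bo -> last batch that wrote it
        self.readers = {}  # bo -> batches that read it since the last write

    def access(self, batch, bo, write):
        # Any access depends on the previous writer of the BO.
        last_writer = self.writer.get(bo)
        if last_writer is not None and last_writer is not batch:
            self._add_dep(batch, last_writer)
        if write:
            # A write must also wait for earlier readers.
            for reader in self.readers.get(bo, []):
                if reader is not batch:
                    self._add_dep(batch, reader)
            self.writer[bo] = batch
            self.readers[bo] = []
        else:
            self.readers.setdefault(bo, []).append(batch)

    @staticmethod
    def _add_dep(batch, dep):
        if dep not in batch.deps:
            batch.deps.append(dep)


def flush(batch, flushed):
    """Flush dependencies first, then the batch itself."""
    if batch in flushed:
        return
    for dep in batch.deps:
        flush(dep, flushed)
    flushed.append(batch)


tracker = Tracker()
a, b, c = Batch("a"), Batch("b"), Batch("c")
tracker.access(a, "tex", write=True)   # a renders to tex
tracker.access(b, "tex", write=False)  # b samples tex: depends on a
tracker.access(c, "tex", write=True)   # c overwrites tex: depends on a and b

order = []
flush(c, order)
names = [batch.name for batch in order]
```

Flushing `c` pulls in `a` and `b` first. The sketch does not detect dependency cycles, which a real implementation has to avoid.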
Boris Brezillon [Sun, 15 Sep 2019 10:14:22 +0000 (12:14 +0200)]
panfrost: Add a panfrost_freeze_batch() helper

We'll soon need to freeze a batch not only when it's flushed, but also
when another batch depends on us, so let's add a helper to avoid
duplicating the logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Sun, 15 Sep 2019 08:57:26 +0000 (10:57 +0200)]
panfrost: Use the per-batch fences to wait on the last submitted batch

We just replace the per-context out_sync object by a pointer to the
fence of the last submitted batch. Pipelining of batches will
come later.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Sun, 15 Sep 2019 08:27:07 +0000 (10:27 +0200)]
panfrost: Add a batch fence

So we can implement fine-grained dependency tracking between batches.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Sun, 15 Sep 2019 07:27:14 +0000 (09:27 +0200)]
panfrost: Make panfrost_batch->bos a hash table

So we can store the flags as data and keep the BO as a key. This way
we keep track of the type of access done on BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
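The BO-as-key, flags-as-data layout can be sketched in Python with a dict standing in for the hash table. The `Access` flag names and the `add_bo` helper are hypothetical and are not the driver's identifiers.

```python
from enum import IntFlag


class Access(IntFlag):
    # Hypothetical flag names, for illustration only.
    READ = 1
    WRITE = 2


def add_bo(bos, bo, flags):
    """Record that a batch accesses `bo`, OR-ing flags into any earlier entry."""
    bos[bo] = bos.get(bo, Access(0)) | flags
    return bos[bo]


batch_bos = {}
add_bo(batch_bos, "vertex_buffer", Access.READ)
add_bo(batch_bos, "framebuffer", Access.WRITE)
add_bo(batch_bos, "framebuffer", Access.READ)
```

Adding the same BO twice accumulates flags instead of creating a second entry, so `framebuffer` ends up marked as both read and written.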
Boris Brezillon [Sun, 15 Sep 2019 07:21:13 +0000 (09:21 +0200)]
panfrost: Extend the panfrost_batch_add_bo() API to pass access flags

The type of access being done on a BO affects job scheduling
(shared resources being written enforce serialization, while those
that are only read allow for job parallelization) and BO lifetime (the
fragment job might last longer than the vertex/tiler ones; if we can,
it's good to release BOs earlier so that others can re-use them
through the BO re-use cache).

Let's pass extra access flags to panfrost_batch_add_bo() and
panfrost_batch_create_bo() so the batch submission logic can take the
appropriate action when submitting batches. Note that this information is not
used yet, we're just patching callers to pass the correct flags here.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Boris Brezillon [Tue, 1 Oct 2019 18:12:12 +0000 (20:12 +0200)]
panfrost: Add the shader BO to the batch in patch_shader_state()

We know a shader will be used by a batch when
panfrost_patch_shader_state() is called, so let's add the shader BO at
that time.

Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Andres Gomez [Wed, 2 Oct 2019 15:50:38 +0000 (18:50 +0300)]
egl: Remove the 565 pbuffer-only EGL config under X11.

The CTS finally has agreed to drop the requirement for a
565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the
code to satisfy this requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

This reverts commit 82607f8a900796871470ac4f1a04e154392e4898,
commit 6ad31c4ff33d92f6359b196a94ace99682272111 and
commit dacb11a585face5ca179c34cfc588a71a425c1e0.

v2:
  - Reference the VK-GL-CTS issue (Eric E.).

v3:
  - Don't revert
    fc21394bc4d ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers")
    (Kenneth).

References: VK-GL-CTS issue 1601.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Baker [Thu, 26 Sep 2019 21:34:49 +0000 (14:34 -0700)]
bin: delete unused releasing scripts

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Thu, 26 Sep 2019 21:23:47 +0000 (14:23 -0700)]
release: Add an update_release_calendar.py script

This script updates the release calendar after a version bump.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Wed, 25 Sep 2019 21:56:21 +0000 (14:56 -0700)]
scripts: Add a gen_release_notes.py script

This script is responsible for generating an entire page in the
docs/relnotes/ directory. It includes a template for the page, and uses
mako to fill in the necessary bits. It is designed to be purely fire and
forget, calculating previous versions, shortlogs, bug fixes, and dates.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Thu, 26 Sep 2019 18:00:43 +0000 (11:00 -0700)]
docs: add a new_features.text file and remove 19.3.0 release notes

The next patch is going to introduce a tool that creates the entire
release html page for us, without any user intervention. As such we
can't be editing it. To that end the script will read the
new_features.txt file to get a list of new features.

This is a flat text file, one entry per line.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
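Reading a flat, one-entry-per-line file like this could look roughly as follows in Python. This is a sketch, not the code of the release script; `parse_new_features` is a hypothetical helper and the sample entries are illustrative.

```python
def parse_new_features(text):
    """Return one feature per non-empty line of a flat text file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Illustrative file contents; the real new_features.txt may differ.
sample = (
    "GL_ARB_gl_spirv on i965, iris\n"
    "\n"
    "EXT_demote_to_helper_invocation on iris, i965\n"
)
features = parse_new_features(sample)
```

Skipping blank lines is a design choice of the sketch, so that a stray empty line does not produce an empty entry in the generated notes.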
Rafael Antognolli [Mon, 30 Sep 2019 21:08:11 +0000 (14:08 -0700)]
anv/block_pool: Align anv_block_pool state to 64 bits.

On 64-bit platforms, some atomic operations like __sync_fetch_and_add()
have constant time, but on 32-bit platforms they are implemented with a
loop and might take much longer.

Additionally, it seems like if their operands are not aligned to 64
bits, they also require extra memory accesses. From the Intel
Architecture's Developer Manual Vol. 1, 4.1.1:

 "A word or doubleword operand that crosses a 4-byte boundary or a
 quadword operand that crosses an 8-byte boundary is considered
 unaligned and requires two separate memory bus cycles for access."

Forcing the u64 field to be aligned to 64 bits seems to make the unit
tests that are stressing this finish much faster.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
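The boundary rule quoted from the manual can be checked with a few lines of Python arithmetic. The helper names and the offsets are hypothetical; the example assumes a 64-bit field that would land at byte offset 4 without forced alignment, as can happen on ABIs where 64-bit integers only need 4-byte alignment inside a struct.

```python
def crosses_boundary(offset, size, boundary):
    """True if an operand of `size` bytes at `offset` straddles a `boundary`-byte line."""
    return offset // boundary != (offset + size - 1) // boundary


def align_up(offset, alignment):
    """Round `offset` up to the next multiple of `alignment` (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


unaligned_offset = 4                          # hypothetical offset of the u64 field
aligned_offset = align_up(unaligned_offset, 8)

needs_two_cycles = crosses_boundary(unaligned_offset, 8, 8)
fixed = not crosses_boundary(aligned_offset, 8, 8)
```

At offset 4 the quadword spans bytes 4 to 11 and crosses the 8-byte boundary; rounded up to offset 8 it no longer does.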
Erik Faye-Lund [Mon, 25 Mar 2019 08:47:58 +0000 (09:47 +0100)]
loader/dri3: do not blit outside old/new buffers

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Wed, 25 Sep 2019 18:10:15 +0000 (11:10 -0700)]
docs: Add use of Closes: tag for closing gitlab issues

This replaces the old Bugzilla: tag, which no longer makes sense because
we don't use bugzilla anymore.

Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Anuj Phogat [Mon, 29 Oct 2018 12:38:58 +0000 (14:38 +0200)]
intel/isl/icl: Use halign 8 instead of 4 hw workaround

v1 by Topi Pohjolainen
v2,v3 by Anuj Phogat:
- Apply for gen >= 11
- Remove wa_bug_xxx function
- Use helper functions

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Samuel Pitoiset [Thu, 3 Oct 2019 14:21:47 +0000 (16:21 +0200)]
ac/nir: remove unused code for nir_op_{fmod,frem}

RADV and RadeonSI both lower these two NIR instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 3 Oct 2019 14:20:40 +0000 (16:20 +0200)]
radv: enable lower_fmod for the LLVM path

This lowers fmod and frem at NIR level like RadeonSI. fmod is
already lowered directly in NIR->LLVM, and frem will be lowered by
LLVM anyways.

This fixes an LLVM crash with:
dEQP-VK.glsl.builtin.precision_fp16_storage32b.frem.compute.scalar.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
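For readers unfamiliar with the two operations, here is a Python sketch of the arithmetic they are commonly lowered to. It assumes that NIR's fmod is floor-based (GLSL `mod`) and frem is truncation-based (C `fmod`); the function names are hypothetical and this is not the lowering pass itself.

```python
import math


def fmod_floor(x, y):
    """Floor-based modulo: the result takes the sign of y."""
    return x - y * math.floor(x / y)


def frem_trunc(x, y):
    """Truncation-based remainder: the result takes the sign of x."""
    return x - y * math.trunc(x / y)


floor_result = fmod_floor(-5.5, 2.0)
trunc_result = frem_trunc(-5.5, 2.0)
```

The two only differ when the operands have opposite signs: here the floor-based form yields 0.5 and the truncation-based form yields -1.5, matching `math.fmod(-5.5, 2.0)`.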
Adam Jackson [Wed, 2 Oct 2019 20:26:48 +0000 (16:26 -0400)]
egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure

... because it's wrong to do so. The error path out of
dri2_initialize_drm ends with dri2_display_destroy, which calls
functions in the vtable we're trying to set up, so if we dlclose the
driver then those function pointers will point off into space and things
crash.

Noticed this because after !1923 eglinfo would crash when setting up the
GBM platform. This was something of a cascade failure, because my kernel
is too old for DRM_IOCTL_I915_GETPARAM to work without DRM_AUTH, so i965
wouldn't load. platform_drm.c then got very confused when it tries to
load swrast as a dri2 driver.

Reviewed-by: Eric Anholt <eric@anholt.net>
Bas Nieuwenhuizen [Mon, 30 Sep 2019 21:20:05 +0000 (23:20 +0200)]
radv: Fix warning in 32-bit build.

uintptr_t is 32 bits in a 32-bit build, resulting in shifting out
of bounds.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Wed, 2 Oct 2019 19:26:01 +0000 (21:26 +0200)]
radv: Fix condition for skipping the continue CS.

We need the continue CS for referencing the tess/GDS/sample position BOs.

Fixes: 46e52df34d3 "radv: add tessellation ring allocation support. (v2)"
Fixes: e1dc3ab7534 "radv/gfx10: allocate GDS/OA buffer objects for NGG streamout"
Fixes: 1171b304f30 "radv: overhaul fragment shader sample positions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Michel Dänzer [Tue, 1 Oct 2019 14:00:16 +0000 (16:00 +0200)]
gitlab-ci: Use per-job ccache

Use one cache per job instead of a single cache shared between all
jobs, and reduce the maximum cache size to 1.5G (from 5G).

Rationale for smaller cache:

Pulling & pushing a 5G cache could take a long time. Consider
https://gitlab.freedesktop.org/mesa/mesa/-/jobs/684010 (click the "Show
complete raw" button to see timestamps): Pulling the cache took
1569927241-1569927194 = 47 seconds, pushing it 1569927671-1569927519
= 152, for a total of 199 seconds. The actual build took a comparable
1569927518-1569927243 = 275 seconds, despite no cache hits from ccache.
In other words, the cache transfers almost doubled the job duration,
and they would have negated any build time benefits from ccache even
with a high cache hit rate.

Also, the smaller caches avoid blowing up storage requirements for them
too much.

Rationale for per-job caches:

Making a single cache significantly smaller might result in cached
build products from one job getting evicted by another job, reducing
the likelihood of cache hits from previous pipelines.

v2:
* Move up "ccache --max-size=1500M" call (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
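The timing argument in the rationale can be reproduced from the quoted timestamps. The Python below only redoes the arithmetic from the commit message; the variable names are mine.

```python
# Timestamps (seconds since the epoch) quoted in the commit message.
pull = 1569927241 - 1569927194    # time spent pulling the cache
push = 1569927671 - 1569927519    # time spent pushing the cache
build = 1569927518 - 1569927243   # time spent on the actual build

transfer = pull + push
overhead = transfer / build       # cache transfer time relative to build time
```

The transfers add up to 199 seconds against a 275 second build, so about 72% on top of the build itself, which is the "almost doubled" in the message.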
Gurchetan Singh [Wed, 25 Sep 2019 17:44:44 +0000 (10:44 -0700)]
virgl: honor winsys supplied metadata

To truly do this correctly, we'll have to fix the discrepancy between
drm_virtgpu_3d_transfer_to_host and virtio_gpu_transfer_host_3d. However,
this is a good starting point.

Since virtio-gpu only supports self-import and export, this should be fine.
Let's only do WINSYS_HANDLE_TYPE_FD for this currently.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>

Gurchetan Singh [Wed, 25 Sep 2019 17:33:16 +0000 (10:33 -0700)]
virgl: modify internal structures to track winsys-supplied data

The winsys might supply dimensions that are different from
those we calculate. In addition, it may supply virtualized
modifiers.

In practice, a stride != bpp * width and virtualized modifiers don't
happen yet, but the plan is to move in that direction.

Also make virgl_resource_layout static.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>

Gurchetan Singh [Wed, 25 Sep 2019 17:06:23 +0000 (10:06 -0700)]
virgl: modify resource_create_from_handle(..) callback

This commit makes no functional changes, just adds the relevant
plumbing.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>

Gurchetan Singh [Wed, 25 Sep 2019 16:25:46 +0000 (09:25 -0700)]
virgl: remove stride from virgl_hw_res

It's not used anywhere, and stride isn't really an intrinsic
property of a GEM buffer.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>

Lionel Landwerlin [Wed, 2 Oct 2019 14:13:06 +0000 (17:13 +0300)]
intel: fix topology query

i915 will report ENODEV on generations prior to Haswell because there
is no point in reporting values on those. This is prior to any fusing
that could happen on parts with identical PCI ids.

This query call was previously only triggered on generations that
support performance queries, which happen to match the generations for
which i915 reports topology, but the commit referenced below started
using it on all generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1860
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: 96e1c945f2 ("i965: Move device info initialization to common code")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Caio Marcelo de Oliveira Filho [Tue, 1 Oct 2019 03:47:58 +0000 (20:47 -0700)]
docs: Fix GL_EXT_demote_to_helper_invocation name

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Samuel Pitoiset [Wed, 18 Sep 2019 07:58:54 +0000 (09:58 +0200)]
radv/gfx10: fix the ESGS ring size symbol

Random hangs no longer happen, I'm actually not sure if they were
related to this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 2 Oct 2019 18:37:43 +0000 (20:37 +0200)]
radv: fix build

Forgot to amend the commit before updating the MR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Wed, 2 Oct 2019 17:34:52 +0000 (19:34 +0200)]
Revert "radv: disable viewport clamping even if FS doesn't write Z"

This was actually the wrong fix.

This reverts commit 0a313cc285c2939de9cac07f045b0b699bc208ca.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 18 Sep 2019 13:43:13 +0000 (15:43 +0200)]
radv: rework the slow depthstencil clear to write depth from PS

Make sure to export the expected clear values to the depth
stencil attachment.

This fixes dEQP-VK.pipeline.depth_range_unrestricted.* on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Sep 2019 16:52:02 +0000 (18:52 +0200)]
radv/gfx10: fix NGG streamout with triangle strips for VS

The number of vertices has to be adjusted with the output primitive
type.

This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Sep 2019 09:01:01 +0000 (11:01 +0200)]
radv/gfx10: fix storing/loading NGG stream outputs for GS

The GS outputs are stored differently in the LDS storage, they
are indexed by out_idx which is incremented for each stored DWORD.
Thus, we need a different path for exporting the stream outputs.

This fixes a bunch of CTS failures when NGG GS is force enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Sep 2019 08:51:46 +0000 (10:51 +0200)]
radv/gfx10: use the component mask when storing/loading NGG stream outputs

It's unnecessary to store/load more components than needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Sep 2019 08:43:15 +0000 (10:43 +0200)]
radv/gfx10: fix storing/loading NGG stream outputs for VS and TES

The LDS storage allocated for stream outputs is 4 * N, where N
is the number of outputs. So, we have to store/load with N as index
and not with the output location as index.

This doesn't fix anything known but it should fix out-of-bounds
access and it also reduces the number of outputs written to the
LDS storage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
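The out-of-bounds risk described above can be shown with a small Python calculation. This is a sketch under assumptions: the factor of 4 is taken to mean four dwords per output, the output locations are made up, and `lds_dword_offset` is a hypothetical helper.

```python
def lds_dword_offset(index, component):
    """Offset of one component, with four dwords reserved per output."""
    return 4 * index + component


# Hypothetical output locations written by a shader; values are illustrative.
locations = [0, 7, 31]
lds_size = 4 * len(locations)  # storage is sized for N outputs, not for max location

# Indexing by output location can land far outside the allocation.
by_location = [lds_dword_offset(loc, 0) for loc in locations]

# Indexing by the output's position in the list always stays in bounds.
by_index = [lds_dword_offset(i, 0) for i, _ in enumerate(locations)]
```

With three outputs the allocation is 12 dwords; location-based indexing produces offsets 28 and 124, while index-based offsets are 0, 4 and 8.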
Samuel Pitoiset [Mon, 16 Sep 2019 14:16:05 +0000 (16:16 +0200)]
radv/gfx10: add missing counter buffer to the BO list

The buffer isn't necessarily used before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 18 Sep 2019 07:01:38 +0000 (09:01 +0200)]
radv/gfx10: add radv_device::use_ngg

Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Engestrom [Wed, 2 Oct 2019 12:20:13 +0000 (13:20 +0100)]
git: delete .gitattributes

The last of these was deleted in 44a8e5135470fa51ae36 ("d3d1x: Remove.")
over 6 years ago.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Gert Wollny [Wed, 2 Oct 2019 07:28:55 +0000 (09:28 +0200)]
etnaviv: enable triangle strips only when the hardware supports it

Some hardware has a bug with triangle strips and it is signalled by the
flag BUG_FIXED8 whether this bug has been fixed. So only enable triangle
strips when this flag is set.

Thanks: Jonathan Marek and Christian Gmeiner for the pointers

v2: Add TODO to indicate that the handling should be refined
    (Jonathan & Christian)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Dylan Baker [Mon, 30 Sep 2019 18:09:44 +0000 (11:09 -0700)]
meson: remove -DGALLIUM_SOFTPIPE from st/osmesa

It's unused here, and undefined in scons. It is used in targets/osmesa,
but it's properly defined there already.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Lionel Landwerlin [Tue, 1 Oct 2019 08:55:46 +0000 (11:55 +0300)]
mesa: don't forget to clear _Layer field on texture unit

On the Android Antutu benchmark we ran into an assert in ISL where the
(base layer + num layers) > total layers. It turns out the core of
mesa forgot to clear the _Layer variable, potentially leaving an
inconsistent value.

v2: Pull setting u->_Layer out of the conditional blocks (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Robin Murphy [Sat, 21 Sep 2019 17:07:28 +0000 (18:07 +0100)]
egl/gbm: Fix config validation

In converting to shift/size-based validation, we lost a condition from
the ARGB/XRGB equivalence check, which left it working one way round
but not the other, and broke applications like glmark2-es2-drm on some
platforms. Restore the equivalent check that *both* configs actually
have an alpha channel before considering a mismatch.

Fixes: 7b4ed2b513ef ("egl: Convert configs to use shifts and sizes instead of masks")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Ken Mays [Thu, 26 Sep 2019 09:47:06 +0000 (09:47 +0000)]
haiku: fix Mesa build

1. The hgl.c file is a read-only file versus read-write.
Ref: src/gallium/state_trackers/hgl/hgl.c

2.  I've included the Haiku-specific patches I used to get a successful
build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure.
Shows "[764/764] linking target ... libswpipe.so" at build completion.

v2:
Remove autotools files (Eric)

v3:
Update the patch

Reported-by: Ken Mays <kmays2000@gmail.com>
Tested-by: Ken Mays <kmays2000@gmail.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com>
Michel Dänzer [Mon, 30 Sep 2019 08:36:04 +0000 (10:36 +0200)]
gitlab-ci: Set ccache path for cross compilers in meson cross file

Without this, meson didn't pick up ccache for cross builds.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Andres Gomez [Wed, 25 Sep 2019 23:56:29 +0000 (02:56 +0300)]
docs/relnotes: add support for GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6 on i965 and iris

After 41549a18e6c ("i965: Enable OpenGL 4.6 for Gen8+"), i965
implements GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6.

After 15e439071d8 ("iris: Enable ARB_gl_spirv and ARB_spirv_extensions"),
iris implements GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL
4.6.

v2:
  - Make explicit that the support is for i965 and iris.

v3:
  - Add also GL_ARB_spirv_extensions to the release notes (Alejandro).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Kevin Strasser [Thu, 12 Sep 2019 16:38:24 +0000 (09:38 -0700)]
egl: Fix implicit declaration of ffs

Found when building for Android in C99 mode. Include bitscan.h to ensure ffs is
available.

Fixes: 7b4ed2b5 ("egl: Convert configs to use shifts and sizes instead of masks")

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Rafael Antognolli [Mon, 30 Sep 2019 19:34:12 +0000 (12:34 -0700)]
intel/tools: Fix aubinator usage of rb_tree.

The order of comparison has changed, so we need to invert the logic of
"insert_left" when using rb_tree_insert_at().

Fixes: dae33052dbf (util/rb_tree: Reverse the order of comparison
                    functions).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 18:08:44 +0000 (11:08 -0700)]
docs/relnotes: Add EXT_demote_to_helper_invocation support on iris, i965

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 17:56:36 +0000 (10:56 -0700)]
i965: Enable EXT_demote_to_helper_invocation

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 16:20:17 +0000 (09:20 -0700)]
iris: Enable EXT_demote_to_helper_invocation

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 16:21:02 +0000 (09:21 -0700)]
gallium: Add PIPE_CAP_DEMOTE_TO_HELPER_INVOCATION

To enable EXT_demote_to_helper_invocation:

    This extension adds a "demote" keyword that is similar to "discard" but
    only suppresses subsequent writes and outputs to the framebuffer, and
    does not terminate the execution of the invocation. For the remainder
    of the execution, the invocation is "demoted" to act like a helper
    invocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
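The difference between the two keywords, as described in the quoted extension text, can be modelled with a toy Python interpreter. This is only an illustration of the semantics; the operation names and the `run_invocation` helper are hypothetical and have nothing to do with how drivers implement the extension.

```python
def run_invocation(ops):
    """Run a toy fragment invocation; return (steps executed, framebuffer writes)."""
    executed, writes, helper = 0, [], False
    for op, value in ops:
        executed += 1
        if op == "discard":
            break                 # the invocation terminates here
        if op == "demote":
            helper = True         # keep running, but act like a helper invocation
        elif op == "write" and not helper:
            writes.append(value)  # only non-helper invocations reach the framebuffer
    return executed, writes


tail = [("derivative", None), ("write", 1.0)]
discarded = run_invocation([("discard", None)] + tail)
demoted = run_invocation([("demote", None)] + tail)
plain = run_invocation(tail)
```

Both `discard` and `demote` suppress the write, but only the demoted invocation still executes the later steps, which matters when neighbouring invocations need its values for derivatives.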
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 17:50:37 +0000 (10:50 -0700)]
glsl: Add helperInvocationEXT() builtin

From EXT_demote_to_helper_invocation, implemented with the existing
nir_intrinsic_is_helper_invocation.

Such builtin is necessary when using `demote` because we can't
redefine the value of gl_HelperInvocation (since it is an input
variable).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 16:34:19 +0000 (09:34 -0700)]
glsl: Parse `demote` statement

When the EXT_demote_to_helper_invocation extension is enabled,
`demote` is treated as a keyword, and produces an ir_demote.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Fri, 20 Sep 2019 16:27:00 +0000 (09:27 -0700)]
glsl: Add ir_demote

To represent the new `demote` keyword when using
EXT_demote_to_helper_invocation extension.  Most of the changes are to
include it in the visitors.

Demote is not considered a control flow, so also include an empty
visit member function in ir_control_flow_visitor.

Only NIR actually supports `demote`, so assert the translations for
TGSI and Mesa's gl_program -- since the demote is not expected to
appear for those.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Thu, 19 Sep 2019 20:54:18 +0000 (13:54 -0700)]
mesa: Extension boilerplate for EXT_demote_to_helper_invocation

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Tue, 24 Sep 2019 03:37:39 +0000 (20:37 -0700)]
iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.

We can't just check for the BO base address, we need to check for the
full address including any offset we may have applied.  When updating
the address, we need to include the offset again.

Fixes: 5ad0c88dbe3 ("iris: Replace buffer backing storage and rebind to update addresses.")
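A Python sketch of the check described above, not the iris code. The binding layout (a dict with `offset` and `address` fields) and the `rebind` helper are hypothetical; the point is that both the comparison and the update use base plus offset.

```python
def rebind(bindings, old_base, new_base):
    """Point bindings that live in the old BO at the new BO, keeping offsets."""
    for binding in bindings:
        # Compare against the full address, not just the BO base.
        if binding["address"] == old_base + binding["offset"]:
            # Re-apply the offset on top of the new base address.
            binding["address"] = new_base + binding["offset"]
    return bindings


bindings = [
    {"offset": 0x40, "address": 0x1000 + 0x40},  # VBO inside the old BO
    {"offset": 0x00, "address": 0x5000},         # unrelated buffer
]
rebind(bindings, old_base=0x1000, new_base=0x8000)
```

A check of the form `address == old_base` would skip the first binding because of its non-zero offset, leaving it pointing at the stale buffer.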

Eric Engestrom [Mon, 30 Sep 2019 18:11:22 +0000 (19:11 +0100)]
docs/install: drop autotools references

19.3 will be the 3rd release without autotools, people know it's gone by now.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Maya Rashish [Tue, 3 Sep 2019 08:55:34 +0000 (11:55 +0300)]
meson: Test for -Wl,--build-id=sha1

instead of hard-coding OS list. Helps Solaris ld builds.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Maya Rashish <coypu@sdf.org>
Dylan Baker [Mon, 30 Sep 2019 18:02:41 +0000 (11:02 -0700)]
docs: remove stray newline

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Dylan Baker [Mon, 30 Sep 2019 18:02:31 +0000 (11:02 -0700)]
docs: use https for mesonbuild.com

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Dylan Baker [Mon, 30 Sep 2019 18:02:09 +0000 (11:02 -0700)]
docs: update install docs for meson

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Marek Olšák [Mon, 16 Sep 2019 23:39:40 +0000 (19:39 -0400)]
ac/nir: fix GLSL imageSamples()

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Mon, 16 Sep 2019 23:37:04 +0000 (19:37 -0400)]
ac: add ac_build_image_get_sample_count from radeonsi

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Fri, 13 Sep 2019 22:27:46 +0000 (18:27 -0400)]
ac/surface: don't allocate FMASK if there is no graphics

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes
Marek Olšák [Tue, 17 Sep 2019 01:19:44 +0000 (21:19 -0400)]
tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes

radeonsi doesn't use the format and internal shaders don't set it.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
5 years ago meson: gallium media state trackers require libdrm with x11
Dylan Baker [Thu, 9 May 2019 17:32:31 +0000 (10:32 -0700)]
meson: gallium media state trackers require libdrm with x11

v2: - update copyright year in all changed files
    - rebase on master

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years ago iris: Disable CCS_E for 32-bit floating point textures.
Kenneth Graunke [Thu, 29 Aug 2019 07:38:15 +0000 (00:38 -0700)]
iris: Disable CCS_E for 32-bit floating point textures.

A while back, Michael Larabel noticed that Paraview's Wavelet Volume
case runs significantly slower on iris than i965.  It turns out this
is because we enable CCS_E for 32-bit floating point formats, while
i965 disables it, with an oblique comment saying that we benchmarked
it (on what exactly?) and determined that it was a loss.

Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed
large framerate drops when enabling CCS_E for either format.  However,
several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit
floating point formats, with no apparent ill effects.

So, disable compression for 32-bit float formats for now, but leave it
enabled for 16-bit float formats as they seem to be working fine.
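
The resulting policy can be sketched as a small predicate. `sketch_format` and `sketch_allow_ccs_e` are hypothetical names for illustration and do not reflect the real isl or iris format tables:

```c
#include <stdbool.h>

/* Hypothetical format description, for illustration only. */
struct sketch_format {
   bool is_float;
   unsigned bits_per_channel;
};

/* Decide whether CCS_E may be used for a format. hw_supports_ccs_e
 * stands in for whatever hardware and format checks already exist. */
static bool
sketch_allow_ccs_e(const struct sketch_format *fmt, bool hw_supports_ccs_e)
{
   if (!hw_supports_ccs_e)
      return false;

   /* 32-bit float formats (e.g. R32_FLOAT, R32G32B32A32_FLOAT) opt out. */
   if (fmt->is_float && fmt->bits_per_channel == 32)
      return false;

   /* 16-bit float and non-float formats keep compression enabled. */
   return true;
}
```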

Improves performance in Paraview's Wavelet Volume test by 62% on a
Skylake GT4e.

Fixes: 3cfc6a207bd ("iris: Fill out res->aux.possible_usages")

5 years ago ac: reorder and print all radeon_info fields
Marek Olšák [Fri, 20 Sep 2019 04:54:22 +0000 (00:54 -0400)]
ac: reorder and print all radeon_info fields

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: set the number of SDPs same as the number of TCCs
Marek Olšák [Fri, 20 Sep 2019 02:17:30 +0000 (22:17 -0400)]
ac: set the number of SDPs same as the number of TCCs

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: fix num_good_cu_per_sh for harvested chips
Marek Olšák [Fri, 20 Sep 2019 02:16:51 +0000 (22:16 -0400)]
ac: fix num_good_cu_per_sh for harvested chips

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radeonsi/gfx10: fix corruption for chips with harvested TCCs
Marek Olšák [Tue, 24 Sep 2019 20:56:57 +0000 (16:56 -0400)]
radeonsi/gfx10: fix corruption for chips with harvested TCCs

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add radeon_info::tcc_harvested
Marek Olšák [Tue, 24 Sep 2019 20:56:21 +0000 (16:56 -0400)]
ac: add radeon_info::tcc_harvested

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: fix incorrect vram_size reported by the kernel
Marek Olšák [Tue, 24 Sep 2019 20:47:05 +0000 (16:47 -0400)]
ac: fix incorrect vram_size reported by the kernel

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radeonsi/gfx10: fix L2 cache rinse programming
Marek Olšák [Tue, 24 Sep 2019 19:15:00 +0000 (15:15 -0400)]
radeonsi/gfx10: fix L2 cache rinse programming

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago etnaviv: fix bitmask typo
Eric Engestrom [Sun, 29 Sep 2019 21:27:24 +0000 (22:27 +0100)]
etnaviv: fix bitmask typo

Fixes: d92689c46f0d2da05ae6 ("etnaviv: nir: add native integers (HALTI2+)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years ago glx: Log the filename of the drm device if we fail to open it
Adam Jackson [Fri, 27 Sep 2019 16:16:22 +0000 (12:16 -0400)]
glx: Log the filename of the drm device if we fail to open it

Helps point the user to the specific device that's having issues, since
you're increasingly likely to have more than one.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/107
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago scons/windows: Enable compute shaders when possible.
pal1000 [Sun, 29 Sep 2019 15:35:29 +0000 (18:35 +0300)]
scons/windows: Enable compute shaders when possible.

Tests done with llvm-config indicate that only 2 libraries are in
irreader but not in engine: LLVMAsmParser and LLVMIRReader. Both of them
are part of coroutines, so I replaced irreader with coroutines and added
the libraries unique to coroutines.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years ago pan/midgard: Allow scheduling conditions with constants
Alyssa Rosenzweig [Sat, 28 Sep 2019 17:05:12 +0000 (13:05 -0400)]
pan/midgard: Allow scheduling conditions with constants

Now that we have constant adjustment logic abstracted, we can do this
safely. Along with the csel inversion patch, this allows many more
common csel ops to inline their condition in the bundle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Add csel invert optimization
Alyssa Rosenzweig [Sat, 28 Sep 2019 16:39:15 +0000 (12:39 -0400)]
pan/midgard: Add csel invert optimization

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Add mir_flip helper
Alyssa Rosenzweig [Sat, 28 Sep 2019 16:38:51 +0000 (12:38 -0400)]
pan/midgard: Add mir_flip helper

Useful for various operations on both commutative and anticommutative
ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Tightly pack 32-bit constants
Alyssa Rosenzweig [Sat, 28 Sep 2019 16:13:52 +0000 (12:13 -0400)]
pan/midgard: Tightly pack 32-bit constants

If we can reuse constant slots from other instructions, we would like to
do so to include more instructions per bundle.
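
A minimal sketch of slot reuse, assuming a bundle carries four 32-bit constant slots. The slot count and all names here (`sketch_bundle_consts`, `sketch_pack_constants`) are assumptions for illustration, not the compiler's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CONST_SLOTS 4  /* assumed: four 32-bit slots per bundle */

struct sketch_bundle_consts {
   uint32_t value[SKETCH_CONST_SLOTS];
   unsigned count;            /* number of slots in use */
};

/* Try to fit an instruction's n constants into the bundle, reusing any
 * slot that already holds the same value. On success, slot_out[i] is the
 * slot assigned to vals[i]. On failure the bundle is left untouched. */
static bool
sketch_pack_constants(struct sketch_bundle_consts *bundle,
                      const uint32_t *vals, unsigned n,
                      unsigned *slot_out)
{
   struct sketch_bundle_consts tmp = *bundle;

   for (unsigned i = 0; i < n; ++i) {
      unsigned j;

      /* Look for an existing slot with the same 32-bit value. */
      for (j = 0; j < tmp.count; ++j) {
         if (tmp.value[j] == vals[i])
            break;
      }

      /* Not found: claim a fresh slot, or give up if none is left. */
      if (j == tmp.count) {
         if (tmp.count == SKETCH_CONST_SLOTS)
            return false;
         tmp.value[tmp.count++] = vals[i];
      }

      slot_out[i] = j;
   }

   *bundle = tmp;
   return true;
}
```

Working on a copy and committing only on success means a rejected instruction does not leave half of its constants behind in the bundle.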

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Allow writeout to see into the future
Alyssa Rosenzweig [Sat, 28 Sep 2019 14:43:51 +0000 (10:43 -0400)]
pan/midgard: Allow writeout to see into the future

If an instruction could be scheduled to vmul to satisfy the writeout
conditions, let's do that and save an instruction+cycle per fragment
shader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Allow 6 instructions per bundle
Alyssa Rosenzweig [Sat, 28 Sep 2019 14:28:48 +0000 (10:28 -0400)]
pan/midgard: Allow 6 instructions per bundle

We never had a scheduler good enough to hit this case before! :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Only one conditional per bundle allowed
Alyssa Rosenzweig [Sat, 28 Sep 2019 14:22:35 +0000 (10:22 -0400)]
pan/midgard: Only one conditional per bundle allowed

There's no r32 to save ya after you use up r31 :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Schedule to smul/sadd
Alyssa Rosenzweig [Sat, 28 Sep 2019 13:48:53 +0000 (09:48 -0400)]
pan/midgard: Schedule to smul/sadd

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Extend choose_instruction for scalar units
Alyssa Rosenzweig [Sat, 28 Sep 2019 13:48:43 +0000 (09:48 -0400)]
pan/midgard: Extend choose_instruction for scalar units

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Don't double check SCALAR units
Alyssa Rosenzweig [Sat, 28 Sep 2019 13:48:27 +0000 (09:48 -0400)]
pan/midgard: Don't double check SCALAR units

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Use new scheduler
Alyssa Rosenzweig [Mon, 23 Sep 2019 12:00:51 +0000 (08:00 -0400)]
pan/midgard: Use new scheduler

We still emit in-order but we switch to using the bundles created from
the new scheduler, which will allow greater flexibility and room for
out-of-order optimization.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Add distance metric to choose_instruction
Alyssa Rosenzweig [Sat, 28 Sep 2019 00:18:16 +0000 (20:18 -0400)]
pan/midgard: Add distance metric to choose_instruction

We require chosen instructions to be "close", to avoid ballooning
register pressure. This is a kludge that will go away once we have
proper liveness tracking in the scheduler, but for now it prevents a lot
of needless spilling.

v2: Lower threshold to 6 (from 8). Schedule is hurt, but a few shaders
that spilled excessively are fixed.
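
A minimal sketch of a distance filter, assuming candidates are identified by their position in a worklist and compared against a reference position. `sketch_choose`, the `anchor` parameter, and the choice to accept a distance equal to the threshold are all assumptions for illustration:

```c
#include <stdbool.h>

#define SKETCH_MAX_DISTANCE 6  /* threshold from the v2 note above */

/* Scan the worklist bottom-up and return the index of the first ready
 * candidate that lies within SKETCH_MAX_DISTANCE of anchor, or -1 if
 * every ready candidate is too far away. */
static int
sketch_choose(const bool *ready, unsigned count, unsigned anchor)
{
   for (unsigned i = count; i-- > 0;) {
      if (!ready[i])
         continue;

      unsigned distance = (i > anchor) ? i - anchor : anchor - i;

      /* Far-away candidates are skipped to limit register pressure. */
      if (distance > SKETCH_MAX_DISTANCE)
         continue;

      return (int)i;
   }

   return -1;
}
```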

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Derp

5 years ago pan/midgard: Add mir_choose_alu helper
Alyssa Rosenzweig [Fri, 27 Sep 2019 12:18:54 +0000 (08:18 -0400)]
pan/midgard: Add mir_choose_alu helper

Based on a given unit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Implement load/store pairing
Alyssa Rosenzweig [Mon, 23 Sep 2019 19:57:58 +0000 (15:57 -0400)]
pan/midgard: Implement load/store pairing

We can bundle two load/store together. This eliminates the need for
explicit load/store pairing in a prepass, as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Extend csel_swizzle to branches
Alyssa Rosenzweig [Tue, 24 Sep 2019 13:06:37 +0000 (09:06 -0400)]
pan/midgard: Extend csel_swizzle to branches

Conditions for branches don't have a swizzle explicitly in the emitted
binary, but they do implicitly get swizzled in whatever instruction
wrote r31, so we need to handle that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Add helpers for scheduling conditionals
Alyssa Rosenzweig [Mon, 23 Sep 2019 19:37:53 +0000 (15:37 -0400)]
pan/midgard: Add helpers for scheduling conditionals

Conditional instructions (csel and conditional branches) require their
condition to be written to a special condition pipeline register (r31.w
for scalar, r31.xyzw for vector). However, pipeline registers are live
only for the duration of a single bundle. As such, the logic to schedule
conditionals correctly is surprisingly complex. Essentially, we see if we
could stuff the conditional within the same bundle as the csel/branch
without breaking anything; if we can, we do that. If we can't, we add a
dummy move to make room.
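
The decision can be sketched as follows. The three conditions in `sketch_cond_query` are assumptions about what "without breaking anything" might involve, and every name is hypothetical:

```c
#include <stdbool.h>

enum sketch_cond_plan {
   SKETCH_COND_INLINE,     /* condition writer joins the bundle and writes r31 */
   SKETCH_COND_DUMMY_MOVE  /* a separate move copies the condition into r31 */
};

/* Hypothetical facts a scheduler would have to establish first. */
struct sketch_cond_query {
   bool writer_unscheduled;    /* the instruction computing the condition is still pending */
   bool unit_free;             /* a suitable unit is free ahead of the csel/branch */
   bool value_used_elsewhere;  /* other instructions also read the condition value */
};

/* Inline the condition only when all checks pass; otherwise fall back
 * to the dummy move, which is always legal but costs an instruction. */
static enum sketch_cond_plan
sketch_plan_condition(const struct sketch_cond_query *q)
{
   if (q->writer_unscheduled && q->unit_free && !q->value_used_elsewhere)
      return SKETCH_COND_INLINE;

   return SKETCH_COND_DUMMY_MOVE;
}
```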

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Implement predicate->unit
Alyssa Rosenzweig [Fri, 27 Sep 2019 12:19:51 +0000 (08:19 -0400)]
pan/midgard: Implement predicate->unit

This allows ALUs to select for each unit of the bundle separately.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years ago pan/midgard: Add predicate->exclude
Alyssa Rosenzweig [Fri, 27 Sep 2019 19:43:18 +0000 (15:43 -0400)]
pan/midgard: Add predicate->exclude

A bit of a kludge but allows setting an implicit dependency of synthetic
conditional moves on the actual condition, fixing code generated like:

   vmul.feq r0, ..
   sadd.imov r31, .., r0
   vadd.fcsel [...]

The imov runs simultaneously with feq so it gets garbage results, but it's
too late to add an actual dependency practically speaking, since the new
synthetic imov doesn't have a node associated.
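
A minimal sketch of how an exclude field could filter candidates, so that the instruction writing the condition is never picked for the same bundle as the synthetic move reading it. The structures and `sketch_choose_instr` are hypothetical simplifications:

```c
#include <stdbool.h>

#define SKETCH_NO_EXCLUDE (~0u)  /* sentinel: nothing is excluded */

struct sketch_instr {
   unsigned dest;   /* node index written by the instruction */
   bool ready;      /* all real dependencies are satisfied */
};

struct sketch_predicate {
   unsigned exclude;  /* node whose writer must not be chosen */
};

/* Return the index of the last ready instruction that does not write
 * the excluded node, or -1 if there is none. */
static int
sketch_choose_instr(const struct sketch_instr *ins, unsigned count,
                    const struct sketch_predicate *p)
{
   for (unsigned i = count; i-- > 0;) {
      if (!ins[i].ready)
         continue;

      /* Implicit dependency: skip the writer of the excluded node. */
      if (p->exclude != SKETCH_NO_EXCLUDE && ins[i].dest == p->exclude)
         continue;

      return (int)i;
   }

   return -1;
}
```

In the example above, excluding `r0` while filling the bundle that holds the `imov` would keep the `feq` out of that bundle, so it runs in a different bundle than the move.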

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>