platform/upstream/mesa.git
5 years agoglsl/nir_opt_access: Update uniforms correctly when only vars change
Caio Marcelo de Oliveira Filho [Wed, 19 Jun 2019 18:39:24 +0000 (11:39 -0700)]
glsl/nir_opt_access: Update uniforms correctly when only vars change

Even if only variables' access flags are changed, the existing NIR
infrastructure expects metadata to be explicitly preserved, so do
that.  Don't bother avoiding a second call to preserve, since the
cost is negligible.

This scenario can be triggered by dead variables, and also by other
intrinsics that read the variables -- but do not cause progress to be
made when processing the intrinsics.

Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl/nir: Fix getting the sampler dim when arrays are involved
Caio Marcelo de Oliveira Filho [Wed, 19 Jun 2019 17:00:39 +0000 (10:00 -0700)]
glsl/nir: Fix getting the sampler dim when arrays are involved

Unwrap any array in the variable type so we can get the sampler dim.

This fixes piglit test
spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test.

Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agomeson: Search for execinfo.h
Jory Pratt [Wed, 8 May 2019 02:47:40 +0000 (21:47 -0500)]
meson: Search for execinfo.h

Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for
execinfo.h presence, just check directly. This allows the build to work
on musl.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
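
The direct header check described in the execinfo.h commit above might be consumed on the C side roughly as follows. This is a minimal sketch, not code from the commit: HAVE_EXECINFO_H is an assumed name for the macro the build system would define when the check succeeds, and capture_backtrace() is a hypothetical helper.

```c
/* HAVE_EXECINFO_H is an assumed macro name, defined by the build system
 * only when its check for execinfo.h succeeds. */
#ifdef HAVE_EXECINFO_H
#include <execinfo.h>
#endif

/* Hypothetical helper: stores up to `max` return addresses in `frames`
 * and returns how many were captured, or 0 when execinfo.h is missing. */
static int
capture_backtrace(void **frames, int max)
{
#ifdef HAVE_EXECINFO_H
   return backtrace(frames, max);
#else
   (void)frames;
   (void)max;
   return 0;
#endif
}
```

Keying the code on the header itself, rather than on a libc identification macro, lets any libc that ships execinfo.h take the first branch and any libc without it fall back cleanly.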
5 years agoutil: Heap-allocate 256K zlib buffer
Jory Pratt [Mon, 10 Jun 2019 18:48:02 +0000 (11:48 -0700)]
util: Heap-allocate 256K zlib buffer

The disk cache code tries to allocate a 256 Kbyte buffer on the stack.
Since musl only gives 80 Kbyte of stack space per thread, this causes a
trap.

See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size

(In musl-1.1.21 the default stack size has increased to 128K)

[mattst88]: Original author unknown, but I think this is small enough
            that it is not copyrightable.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
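
The stack-to-heap change described in the zlib buffer commit above follows a common pattern, sketched here under assumptions: sum_bytes_via_scratch() and SCRATCH_SIZE are hypothetical names, and the byte-summing loop only stands in for whatever work the real disk cache code does with its buffer.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Same size as the buffer named in the commit title. */
#define SCRATCH_SIZE ((size_t)256 * 1024)

/* Hypothetical helper: sums the bytes of `data` by staging them through a
 * large scratch buffer, the way a compression loop stages its output. */
static bool
sum_bytes_via_scratch(const unsigned char *data, size_t size,
                      unsigned long *sum)
{
   /* A local `unsigned char scratch[SCRATCH_SIZE];` would need 256 KiB of
    * stack, more than a small default thread stack provides. */
   unsigned char *scratch = malloc(SCRATCH_SIZE);
   if (scratch == NULL)
      return false;

   *sum = 0;
   while (size > 0) {
      size_t chunk = size < SCRATCH_SIZE ? size : SCRATCH_SIZE;
      memcpy(scratch, data, chunk);
      for (size_t i = 0; i < chunk; i++)
         *sum += scratch[i];
      data += chunk;
      size -= chunk;
   }

   free(scratch);
   return true;
}
```

The cost is one allocation per call and an extra failure path, which is usually acceptable for a buffer this large.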
5 years agoanv: Fix wrong printf formatter
Kenneth Graunke [Wed, 19 Jun 2019 16:57:01 +0000 (11:57 -0500)]
anv: Fix wrong printf formatter

%lu is for unsigned long, %zu is for size_t.  Just cast the data.
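
The two ways of printing a size_t mentioned above can be sketched as follows. The helper names are hypothetical; the second function shows the cast approach the commit message describes.

```c
#include <stddef.h>
#include <stdio.h>

/* %zu is the conversion that matches size_t directly. */
static int
format_size(char *dst, size_t dst_len, size_t value)
{
   return snprintf(dst, dst_len, "%zu", value);
}

/* Casting the argument makes %lu correct as well. Note that the cast can
 * truncate on platforms where unsigned long is narrower than size_t. */
static int
format_size_cast(char *dst, size_t dst_len, size_t value)
{
   return snprintf(dst, dst_len, "%lu", (unsigned long)value);
}
```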

5 years agoiris: Bail on queries for INTEL_NO_HW=1.
Kenneth Graunke [Wed, 19 Jun 2019 04:47:12 +0000 (23:47 -0500)]
iris: Bail on queries for INTEL_NO_HW=1.

We don't execute any of the commands to record snapshots, so we can't
actually produce a real result.  We do however need to avoid waiting
on a syncpt which will never be signalled.  So, just return 0.

5 years agovirgl: Support VIRGL_BIND_SHARED
David Riley [Thu, 13 Jun 2019 00:16:35 +0000 (17:16 -0700)]
virgl: Support VIRGL_BIND_SHARED

Support a new virgl bind type for shared buffers.

Signed-off-by: David Riley <davidriley@chormium.org>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
5 years agoanv: write spirv-nir logs back to the application
Lionel Landwerlin [Thu, 2 May 2019 16:43:03 +0000 (17:43 +0100)]
anv: write spirv-nir logs back to the application

Using the existing VK_EXT_debug_report extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoac/nir: Set speculatable for buffer loads where allowed
Connor Abbott [Tue, 4 Jun 2019 12:42:54 +0000 (14:42 +0200)]
ac/nir: Set speculatable for buffer loads where allowed

This brings the nir path in line with the TGSI path.

Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 2792 -> 2652 (-5.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 247380 -> 248072 (0.28 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 121 -> 132 (9.09 %)
Wait states: 0 -> 0 (0.00 %)

Most of the change came from DiRT: Showdown, and came from sinking SSBO
loads.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: Use reorderable access flag
Connor Abbott [Tue, 4 Jun 2019 12:12:34 +0000 (14:12 +0200)]
nir: Use reorderable access flag

No changes with radeonsi shader-db.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: Add a helper to determine if an intrinsic can be reordered
Connor Abbott [Tue, 4 Jun 2019 11:02:31 +0000 (13:02 +0200)]
nir: Add a helper to determine if an intrinsic can be reordered

This is simple now, but we're going to be adding a few more conditions
to this later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agost/nir: Use gl_nir_opt_access
Connor Abbott [Tue, 4 Jun 2019 12:18:54 +0000 (14:18 +0200)]
st/nir: Use gl_nir_opt_access

Nothing uses its results yet, that will come with the following commits.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoglsl/nir: Add optimization pass for access flags
Connor Abbott [Tue, 4 Jun 2019 12:13:13 +0000 (14:13 +0200)]
glsl/nir: Add optimization pass for access flags

Right now, this just deduces when we can arbitrarily reorder SSBO and
image loads, matching the existing logic in radeonsi's TGSI->LLVM pass.
This approach can't handle some things that nir_opt_copy_prop_vars can,
but it can handle images, and with GCM it lets us hoist reads outside of
loops. We can also pass this information to LLVM which lets it do its
own optimizations on it.

This is GLSL only as I haven't tested it on Vulkan yet, and it would
probably need a few changes to work there.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: Add reorderable memory access enum
Connor Abbott [Fri, 31 May 2019 17:03:48 +0000 (19:03 +0200)]
nir: Add reorderable memory access enum

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/copy_prop_vars: Ignore volatile accesses
Connor Abbott [Wed, 5 Jun 2019 08:23:00 +0000 (10:23 +0200)]
nir/copy_prop_vars: Ignore volatile accesses

The spec explicitly says that volatile writes can't be removed and
volatile reads do not guarantee that the same value will still be around
after the read, as if there were a barrier after each read/write. Just
ignore them.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoglsl/nir: Propagate access qualifiers
Connor Abbott [Tue, 4 Jun 2019 09:41:25 +0000 (11:41 +0200)]
glsl/nir: Propagate access qualifiers

We were completely ignoring these before, except for putting them on
variables. While we're here, don't set access qualifiers when converting
to bindless since glsl_to_nir will already have set a more accurate
qualifier that includes any qualifiers on struct members that are
dereferenced.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: Allow qualifiers on copy_deref and image instructions
Connor Abbott [Tue, 4 Jun 2019 09:40:14 +0000 (11:40 +0200)]
nir: Allow qualifiers on copy_deref and image instructions

In the next commit, we'll properly handle access qualifiers on struct
members by propagating them to load/store instructions, but these
instructions had no way to specify the qualifier.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoac,radeonsi: Always mark buffer stores as inaccessiblememonly
Connor Abbott [Fri, 31 May 2019 17:04:36 +0000 (19:04 +0200)]
ac,radeonsi: Always mark buffer stores as inaccessiblememonly

inaccessiblememonly means that it doesn't modify memory accessible via
normal LLVM pointers. This lets LLVM's dead store elimination, memcpy
forwarding, etc. ignore functions with this attribute. We don't
represent descriptors as pointers, so this property is always true of
buffer and image stores. There are plans to represent descriptors via
pointers, but this just means that now nothing is inaccessiblememonly,
as LLVM will then understand loads/stores via its usual alias analysis.

Radeonsi was mistakenly only setting it if the driver could prove that
there were no reads, and then it was cargo-culted into ac_llvm_build
and ac_llvm_to_nir. Rip it out of everything.

statistics with nir enabled:

Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

The only difference was a manhattan31 shader.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoegl: add missing #include
Eric Engestrom [Tue, 11 Jun 2019 12:19:35 +0000 (13:19 +0100)]
egl: add missing #include

close() is in <unistd.h>

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoradv: disable viewport clamping even if FS doesn't write Z
Samuel Pitoiset [Tue, 18 Jun 2019 16:58:40 +0000 (18:58 +0200)]
radv: disable viewport clamping even if FS doesn't write Z

This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask
Samuel Pitoiset [Wed, 14 Nov 2018 15:24:02 +0000 (16:24 +0100)]
radv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask

This allows us to disable the FMASK decompress pass when
transitioning from CB writes to shader reads.

This will likely be improved and enabled by default in the future.

No CTS regressions on GFX8 but a small number of multisample CTS
failures on GFX9 (they look related to the small hint).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix FMASK expand with SRGB formats
Samuel Pitoiset [Tue, 18 Jun 2019 14:11:07 +0000 (16:11 +0200)]
radv: fix FMASK expand with SRGB formats

Found while working on DCC for MSAA.

Fixes: 6b976024a87 ("radv: add support for FMASK expand")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agopanfrost: Move to use ralloc for some allocations
Tomeu Vizoso [Tue, 18 Jun 2019 12:24:57 +0000 (14:24 +0200)]
panfrost: Move to use ralloc for some allocations

We have some serious leaks, so plug some and also move to ralloc to
limit the lifetime of some objects to that of their parent.

Lots more such work to do.

For some reason, this fixes:

dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoegl: Don't add hardware device if there is no render node v2.
Mathias Fröhlich [Thu, 6 Jun 2019 08:22:25 +0000 (10:22 +0200)]
egl: Don't add hardware device if there is no render node v2.

Do not offer a hardware drm backed egl device if no render node
is available. The current implementation will fail on this
egl device. On top of that, it issues a warning that is actually misleading.
There are finally more error paths that can fail on the way to a
hardware backed egl device. Fixing all of them would kind of require
opening the drm device and see if there is a usable driver associated
with the device. The taken approach avoids a full probe and fixes at
least this kind of problem on kvm virtualization hosts I observe here.

Fixes: dbb4457d985 ("egl: add EGL_EXT_device_drm support")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agoetnaviv: support GL_ARB_seamless_cubemap_per_texture
Christian Gmeiner [Mon, 3 Jun 2019 05:42:06 +0000 (07:42 +0200)]
etnaviv: support GL_ARB_seamless_cubemap_per_texture

Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
5 years agoetnaviv: update headers from rnndb
Christian Gmeiner [Mon, 3 Jun 2019 05:31:08 +0000 (07:31 +0200)]
etnaviv: update headers from rnndb

Update to etna_viv commit a3bf0da.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoradeonsi: fix undefined shift in macro definition
Dave Airlie [Tue, 18 Jun 2019 21:19:03 +0000 (07:19 +1000)]
radeonsi: fix undefined shift in macro definition

Pointed out by coverity

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
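
The commit above does not show the macro it fixes. One common form of this kind of Coverity finding is a left shift of a signed constant into the sign bit, sketched here with a hypothetical FIELD_BIT macro that is not taken from radeonsi.

```c
#include <stdint.h>

/* Hypothetical macro, not the one from the commit. The commented-out form
 * evaluates (1 << 31) for n == 31, which shifts a signed int into its sign
 * bit and is undefined behavior when int is 32 bits wide:
 *
 *    #define FIELD_BIT(n) (1 << (n))
 */

/* An unsigned constant keeps the shift well defined for n in 0..31. */
#define FIELD_BIT(n) (UINT32_C(1) << (n))

static uint32_t
top_bit(void)
{
   return FIELD_BIT(31);
}
```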
5 years agonouveau: fix frees in unsupported IR error paths.
Dave Airlie [Tue, 18 Jun 2019 21:10:13 +0000 (07:10 +1000)]
nouveau: fix frees in unsupported IR error paths.

This is pointless in that we won't ever hit those paths in real life,
but coverity complains.

Fixes: f014ae3c7cce ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agopanfrost: Move clearing logic into pan_job
Rohan Garg [Wed, 5 Jun 2019 17:04:04 +0000 (19:04 +0200)]
panfrost: Move clearing logic into pan_job

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agovirgl: fix sync issue regarding discard/unsync transfers
Chia-I Wu [Mon, 17 Jun 2019 16:53:48 +0000 (09:53 -0700)]
virgl: fix sync issue regarding discard/unsync transfers

GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as
GL_MAP_INVALIDATE_RANGE_BIT naively.  When we run into

  ptr = glMapBufferRange(buf, 0, size,
          GL_MAP_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT);
  memcpy(ptr, data1, size);
  glUnmapBuffer(buf);
  ptr = glMapBufferRange(buf, size, size,
          GL_MAP_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT);
  memcpy(ptr, data2, size);
  glUnmapBuffer(buf);

we never want data1 to be copy_transfer'ed.  Because that would mean
that data2 might overwrite valid data.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Fixes: a22c5df0794 ("virgl: Use buffer copy transfers to avoid waiting when mapping")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agopanfrost: Enable sRGB
Alyssa Rosenzweig [Mon, 17 Jun 2019 23:23:41 +0000 (16:23 -0700)]
panfrost: Enable sRGB

Now that sRGB formats are supported for both rendering and sampling,
advertise support.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Disable AFBC on sRGB buffers
Alyssa Rosenzweig [Tue, 18 Jun 2019 14:41:26 +0000 (07:41 -0700)]
panfrost: Disable AFBC on sRGB buffers

The performance impact is slightly mitigated by tiling the render
target, but it's undeniably still slow compared to AFBC. Unfortunately,
it doesn't look like AFBC and sRGB play nice...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Enable sRGB fixed-function blending
Alyssa Rosenzweig [Mon, 17 Jun 2019 23:23:23 +0000 (16:23 -0700)]
panfrost: Enable sRGB fixed-function blending

For fixed-function, we have hardware to handle sRGB so we just set a
flag. For blend shaders, it's rather more involved; this is currently
unimplemented. Assert it out for now; we don't need it quite yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Specify sRGB in the render target
Alyssa Rosenzweig [Mon, 17 Jun 2019 23:19:33 +0000 (16:19 -0700)]
panfrost: Specify sRGB in the render target

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Implement sRGB texturing
Alyssa Rosenzweig [Mon, 17 Jun 2019 23:16:20 +0000 (16:16 -0700)]
panfrost: Implement sRGB texturing

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add sRGB render target flag
Alyssa Rosenzweig [Mon, 17 Jun 2019 23:01:24 +0000 (16:01 -0700)]
panfrost: Add sRGB render target flag

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Implement tiled rendering
Alyssa Rosenzweig [Mon, 17 Jun 2019 22:56:48 +0000 (15:56 -0700)]
panfrost: Implement tiled rendering

We already can sample from Mali's linear/tiled encoding (the one from
Utgard -- AFBC is mostly unrelated); let's be able to render to it as
well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Decode rendering block type
Alyssa Rosenzweig [Mon, 17 Jun 2019 22:53:09 +0000 (15:53 -0700)]
panfrost: Decode rendering block type

A mode for rendering tiled/uncompressed was noticed, so we reshuffle the
MFBD render target definitions to explicitly include block type.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Refactor texture targets
Alyssa Rosenzweig [Mon, 17 Jun 2019 21:26:08 +0000 (14:26 -0700)]
panfrost: Refactor texture targets

This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a
single 2-bit texture target selection, noticing it's the same as the
2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we
share this definition and add the missing entry for 1D/buffer textures.

This requires a nontrivial (but functionally similar) refactor of all
parts of the driver to use the new definitions appropriately.
Theoretically, this should add support for buffer textures, but that's
obviously not tested and probably wouldn't work.

While doing so, we notice the sRGB enable bit, which we document and
decode as well here so we don't forget about it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Figure out job requirements in pan_job.c
Rohan Garg [Wed, 5 Jun 2019 15:49:14 +0000 (17:49 +0200)]
panfrost: Figure out job requirements in pan_job.c

Requirements for a job should be figured out in pan_job.c

v2: [Alyssa] Fix early return

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Reset job counters once the job is submitted
Rohan Garg [Wed, 5 Jun 2019 15:23:54 +0000 (17:23 +0200)]
panfrost: Reset job counters once the job is submitted

Move the reset out of frame invalidation into job submission

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Initial implementation of panfrost_job_submit
Rohan Garg [Wed, 5 Jun 2019 14:20:59 +0000 (16:20 +0200)]
panfrost: Initial implementation of panfrost_job_submit

Start fleshing out panfrost_job

v2: [Alyssa: Remove unused variable, warning introduced]

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agovirgl_hw: add YUV support
Gurchetan Singh [Fri, 14 Jun 2019 00:00:36 +0000 (17:00 -0700)]
virgl_hw: add YUV support

Add corresponding entries from p_format.h

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agovirgl: sync to virglrenderer virgl_hw.h
Gurchetan Singh [Thu, 13 Jun 2019 23:59:42 +0000 (16:59 -0700)]
virgl: sync to virglrenderer virgl_hw.h

It's nice to keep these two files in sync, as they define
guest userspace <---> host userspace communication.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoanv: Make border colors the right size and alignment on HSW
Jason Ekstrand [Tue, 18 Jun 2019 15:15:24 +0000 (10:15 -0500)]
anv: Make border colors the right size and alignment on HSW

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoimgui: bump imgui memory editor copy
Lionel Landwerlin [Tue, 18 Jun 2019 07:37:11 +0000 (10:37 +0300)]
imgui: bump imgui memory editor copy

Getting rid of a compiler warning:

In file included from ../src/intel/tools/aubinator_viewer.cpp:225:
../src/imgui/imgui_memory_editor.h: In member function ‘void MemoryEditor::DisplayPreviewData(size_t, const u8*, size_t, MemoryEditor::DataType, MemoryEditor::DataFormat, char*, size_t) const’:
../src/imgui/imgui_memory_editor.h:637:16: warning: enumeration value ‘DataType_COUNT’ not handled in switch [-Wswitch]
         switch (data_type)
                ^

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agopanfrost/midgard: Enable autovectorization
Alyssa Rosenzweig [Mon, 17 Jun 2019 18:12:51 +0000 (11:12 -0700)]
panfrost/midgard: Enable autovectorization

Enable nir_opt_vectorize.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonir: add a vectorization pass
Connor Abbott [Sun, 15 Nov 2015 01:26:47 +0000 (20:26 -0500)]
nir: add a vectorization pass

This effectively does the opposite of nir_lower_alus_to_scalar, trying
to combine per-component ALU operations with the same sources but
different swizzles into one larger ALU operation. It uses a similar
model as CSE, where we do a depth-first approach and keep around a hash
set of instructions to be combined, but there are a few major
differences:

1. For now, we only support entirely per-component ALU operations.
2. Since it's not always guaranteed that we'll be able to combine
equivalent instructions, we keep a stack of equivalent instructions
around, trying to combine new instructions with instructions on the
stack.

The pass isn't comprehensive by far; it can't handle operations where
some of the sources are per-component and others aren't, and it can't
handle phi nodes. But it should handle the more common cases, and it
should be reasonably efficient.

[Alyssa: Rebase on latest master, updating with respect to typeless
moves]

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agopanfrost: Add support for TXS instructions
Boris Brezillon [Mon, 17 Jun 2019 20:13:04 +0000 (22:13 +0200)]
panfrost: Add support for TXS instructions

This patch adds support for nir_texop_txs instructions which are needed
to support the OpenGL textureSize() function. This is also needed to
support RECT texture sampling which is currently lowered to 2D sampling +
a TXS() instruction by the nir_lower_tex() helper.

Changes in v2:
* Split options for the 1st and 2nd tex lowering passes

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Prepare things to support non-native texture ops
Boris Brezillon [Mon, 17 Jun 2019 19:47:46 +0000 (21:47 +0200)]
panfrost: Prepare things to support non-native texture ops

We are about to add support for the TXS (texture size) op which is not
implemented using a midgard texture instruction. Let's rename emit_tex()
into emit_texop_native() and repurpose emit_tex() as a dispatcher.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Move sysval upload logic out of panfrost_emit_for_draw()
Boris Brezillon [Fri, 14 Jun 2019 08:41:17 +0000 (10:41 +0200)]
panfrost: Move sysval upload logic out of panfrost_emit_for_draw()

We're about to add more sysval types, and panfrost_emit_for_draw()
is big enough, so let's move the sysval upload logic into a separate
function.

We also add one sub-function per sysval type to keep
panfrost_upload_sysvals() small/readable.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Make the sysval logic more generic
Boris Brezillon [Fri, 14 Jun 2019 07:59:20 +0000 (09:59 +0200)]
panfrost: Make the sysval logic more generic

We are about to add support for nir_texop_txs which requires adding a
sysval/uniform containing the texture size. Let's change the
emit_sysval_read() prototype to take a nir_instr object instead of
a nir_intrinsic_instr one so we can re-use this function when emitting
a sysval for a txs instruction.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions
Boris Brezillon [Mon, 17 Jun 2019 09:43:13 +0000 (11:43 +0200)]
nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions

The V3D driver has an open-coded solution for this, and we need the
same thing for Panfrost, so let's add a generic way to lower TXS(LOD)
into max(TXS(0) >> LOD, 1).

Changes in v2:
* Use == 0 instead of !
* Rework the minification logic as suggested by Jason
* Assign cursor pos at the beginning of the function
* Patch the LOD just after retrieving the old value

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
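
The formula max(TXS(0) >> LOD, 1) from the commit above can be worked through per dimension in plain C. This is a sketch of the arithmetic only, not the NIR lowering itself, and minify() is a hypothetical helper name.

```c
#include <stdint.h>

/* Size of one texture dimension at mip level `lod`, given its size at
 * level 0: halve once per level, but never drop below one texel. */
static uint32_t
minify(uint32_t base_size, uint32_t lod)
{
   /* Shifting a 32-bit value by 32 or more is undefined in C, so treat
    * such levels as fully minified. */
   uint32_t shifted = lod < 32 ? base_size >> lod : 0;
   return shifted > 1 ? shifted : 1;
}
```

For example, a dimension of 256 texels at level 3 gives 256 >> 3 = 32, while a dimension of 4 texels at level 5 shifts down to 0 and is clamped to 1.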
5 years agonir/lower_tex: Update ->sampler_dim value before calling get_texture_size()
Boris Brezillon [Mon, 17 Jun 2019 09:31:51 +0000 (11:31 +0200)]
nir/lower_tex: Update ->sampler_dim value before calling get_texture_size()

get_texture_size() will create a txs instruction with ->sampler_dim set
to the original tex->sampler_dim. The condition to call lower_rect()
only checks the value of ->sampler_dim and whether lower_rect is
requested or not. This leads to an infinite loop when calling
nir_lower_tex() with the same options until it returns false.

In order to avoid that, let's move the tex->sampler_dim patching before
get_texture_size() is called. This way the txs instruction will have
->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try
to lower it on the subsequent passes.

Changes in v2:
* Add Jason R-b
* Add a comment explaining why we patch ->sampler_dim at the beginning
  of the lower_rect() func

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonir/lower_tex: Actually report when projector lowering happened
Boris Brezillon [Mon, 17 Jun 2019 09:23:33 +0000 (11:23 +0200)]
nir/lower_tex: Actually report when projector lowering happened

The code considers that projector lowering was done even if it's not
really the case. Change the project_src() prototype to return a bool
encoding whether projector lowering happened or not and update the
progress var accordingly in nir_lower_tex_block().

---
Changes in v2:
* Add Jason R-b
* Drop the part suggesting that nir_lower_rect() could be called in
  a do-while(progress) loop.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
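
The progress-reporting change described above matters whenever a pass is rerun until it reports no progress. A toy sketch of the pattern follows; the struct and both functions are hypothetical stand-ins, not the real project_src() or nir_lower_tex_block().

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a texture instruction with an optional projector. */
struct toy_tex {
   bool has_projector;
   float coord;
   float projector;
};

/* Returns true only when the instruction was actually rewritten. */
static bool
toy_project_src(struct toy_tex *tex)
{
   if (!tex->has_projector)
      return false; /* nothing lowered, so no progress to report */

   tex->coord /= tex->projector;
   tex->has_projector = false;
   return true;
}

/* Accumulates progress over a block of instructions. */
static bool
toy_lower_block(struct toy_tex *texs, size_t count)
{
   bool progress = false;
   for (size_t i = 0; i < count; i++)
      progress |= toy_project_src(&texs[i]);
   return progress;
}
```

If toy_project_src() reported progress unconditionally, a caller looping until the pass returns false would never terminate.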
5 years agopanfrost: Adapt to constant name change in UABI
Tomeu Vizoso [Fri, 31 May 2019 07:12:59 +0000 (09:12 +0200)]
panfrost: Adapt to constant name change in UABI

We hadn't updated the kernel header after the driver got into mainline.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: ci: Update results
Tomeu Vizoso [Tue, 18 Jun 2019 13:15:19 +0000 (15:15 +0200)]
panfrost: ci: Update results

Alyssa fixed some failing tests last night.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoradv: adjust the DCC base VA for mipmapped color attachments
Samuel Pitoiset [Tue, 18 Jun 2019 09:51:31 +0000 (11:51 +0200)]
radv: adjust the DCC base VA for mipmapped color attachments

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix color decompressions for FMASK/CMASK
Samuel Pitoiset [Tue, 18 Jun 2019 10:02:12 +0000 (12:02 +0200)]
radv: fix color decompressions for FMASK/CMASK

Only skip levels without DCC when it's a DCC decompression.
Whoops.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: do not decompress levels without DCC with the graphics path
Samuel Pitoiset [Tue, 18 Jun 2019 08:30:45 +0000 (10:30 +0200)]
radv: do not decompress levels without DCC with the graphics path

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: do not decompress levels without DCC with the compute path
Samuel Pitoiset [Tue, 18 Jun 2019 08:30:44 +0000 (10:30 +0200)]
radv: do not decompress levels without DCC with the compute path

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: check if DCC is enabled per mip not for the whole image
Samuel Pitoiset [Tue, 18 Jun 2019 08:30:43 +0000 (10:30 +0200)]
radv: check if DCC is enabled per mip not for the whole image

In other words, make use of radv_dcc_enabled() instead of
radv_image_has_dcc() all over the place.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agov3d: implement simultaneous peripheral access exceptions for V3D 4.1+
Iago Toral Quiroga [Mon, 17 Jun 2019 08:15:54 +0000 (10:15 +0200)]
v3d: implement simultaneous peripheral access exceptions for V3D 4.1+

Shader-db results:

total instructions in shared programs: 9117550 -> 9102719 (-0.16%)
instructions in affected programs: 1752873 -> 1738042 (-0.85%)
helped: 7076
HURT: 478
helped stats (abs) min: 1 max: 22 x̄: 2.19 x̃: 2
helped stats (rel) min: 0.07% max: 13.89% x̄: 1.70% x̃: 1.07%
HURT stats (abs)   min: 1 max: 7 x̄: 1.41 x̃: 1
HURT stats (rel)   min: 0.09% max: 10.17% x̄: 0.86% x̃: 0.54%
95% mean confidence interval for instructions value: -2.00 -1.92
95% mean confidence interval for instructions %-change: -1.58% -1.50%
Instructions are helped.

total max-temps in shared programs: 1327774 -> 1327728 (<.01%)
max-temps in affected programs: 1025 -> 979 (-4.49%)
helped: 47
HURT: 2
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 2.63% max: 20.00% x̄: 7.67% x̃: 5.26%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.17% max: 4.17% x̄: 4.17% x̃: 4.17%
95% mean confidence interval for max-temps value: -1.06 -0.82
95% mean confidence interval for max-temps %-change: -8.89% -5.49%
Max-temps are helped.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: only flush jobs accessing the query BO when reading query results
Iago Toral Quiroga [Mon, 17 Jun 2019 06:21:32 +0000 (08:21 +0200)]
v3d: only flush jobs accessing the query BO when reading query results

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: add a helper function to flush jobs using a BO
Iago Toral Quiroga [Fri, 14 Jun 2019 10:06:25 +0000 (12:06 +0200)]
v3d: add a helper function to flush jobs using a BO

v2: use _mesa_set_search() (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoiris: Support more RGBX pipe formats.
Kenneth Graunke [Sun, 9 Jun 2019 00:17:20 +0000 (17:17 -0700)]
iris: Support more RGBX pipe formats.

Without them, the state tracker falls back to an RGBA format, but it
doesn't always manage to override the swizzle for us.  So we lose the
information that the API expects an X channel, where alpha is garbage
and reads back as 1.  We have no equivalent ISL RGBX format for these,
so we just use RGBA directly and override the swizzle in all cases.

5 years agoglsl: Fix out of bounds read in shader_cache_read_program_metadata
Kenneth Graunke [Sat, 8 Jun 2019 06:00:40 +0000 (23:00 -0700)]
glsl: Fix out of bounds read in shader_cache_read_program_metadata

The VaryingNames array has NumVaryings entries.  But BufferStride is
a small array of MAX_FEEDBACK_BUFFERS (4) entries.  Programs with
more than 4 varyings would read out of bounds.

Also, BufferStride is set based on the shader itself, which means that
it's inherently already included in the hash, and doesn't need to be
included again.  At the point when shader_cache_read_program_metadata
is called, the linker hasn't even set those fields yet.  So, just drop
it entirely.
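
To illustrate the bounds mismatch described above (the commit itself
simply drops BufferStride from the hash rather than fixing the loop),
here is a sketch with a simplified stand-in struct.  The struct and
function names are assumptions for illustration, not the real Mesa
types:

```c
#include <stddef.h>
#include <string.h>

#define MAX_FEEDBACK_BUFFERS 4

/* Simplified stand-in for the transform feedback fields named above. */
struct xfb_info {
   unsigned NumVaryings;
   const char **VaryingNames;                   /* NumVaryings entries */
   unsigned BufferStride[MAX_FEEDBACK_BUFFERS]; /* always 4 entries */
};

/* BufferStride must be walked with its own bound.  Using NumVaryings
 * here would read out of bounds for programs with more than 4
 * varyings. */
unsigned
total_buffer_stride(const struct xfb_info *info)
{
   unsigned total = 0;

   for (unsigned i = 0; i < MAX_FEEDBACK_BUFFERS; i++)
      total += info->BufferStride[i];
   return total;
}

/* VaryingNames, by contrast, is indexed up to NumVaryings. */
size_t
total_name_length(const struct xfb_info *info)
{
   size_t len = 0;

   for (unsigned i = 0; i < info->NumVaryings; i++)
      len += strlen(info->VaryingNames[i]);
   return len;
}
```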

Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.

Fixes: 6d830940f78 glsl/shader_cache: Allow shader cache usage with transform feedback

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoanv: Set STATE_BASE_ADDRESS upper bounds on gen7
Jason Ekstrand [Mon, 17 Jun 2019 22:01:48 +0000 (17:01 -0500)]
anv: Set STATE_BASE_ADDRESS upper bounds on gen7

This should fix floating-point border color on all gen7 HW.  Integer is
still thoroughly busted on gen7 because it doesn't exist on IVB and it's
crazy on HSW.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoradv: Disable linear tiled compressed textures.
Bas Nieuwenhuizen [Mon, 17 Jun 2019 19:46:35 +0000 (21:46 +0200)]
radv: Disable linear tiled compressed textures.

Support got removed in the new addrlib update.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoanv: Use VK_EXT_separate_stencil_usage to avoid stencil shadows on gen7
Jason Ekstrand [Mon, 17 Jun 2019 14:39:08 +0000 (09:39 -0500)]
anv: Use VK_EXT_separate_stencil_usage to avoid stencil shadows on gen7

Whenever stencil texturing is not required (most of the time), we can
use VK_EXT_separate_stencil_usage to only create the shadow image when
VK_IMAGE_USAGE_SAMPLED_BIT is required for stencil.  Of course, this
depends on applications to use the extension but hopefully DXVK and
similar translators are doing so and that covers most of the apps.
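
A sketch of the decision described above, assuming the usual
VK_EXT_separate_stencil_usage rule that the stencil usage defaults to
the image usage when the extension struct is absent.  The function and
its parameters are hypothetical, not anv code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for VK_IMAGE_USAGE_SAMPLED_BIT (same numeric value), defined
 * locally so the sketch builds without the Vulkan headers. */
#define USAGE_SAMPLED_BIT 0x00000004u

/* Hypothetical predicate: does this image need a texture-capable
 * shadow copy of its stencil aspect? */
bool
needs_stencil_shadow(bool is_gen7, bool has_stencil,
                     uint32_t image_usage,
                     bool has_separate_stencil_usage,
                     uint32_t stencil_usage)
{
   if (!is_gen7 || !has_stencil)
      return false;

   /* Without the extension struct, stencil inherits the image usage. */
   uint32_t effective = has_separate_stencil_usage ? stencil_usage
                                                   : image_usage;

   return (effective & USAGE_SAMPLED_BIT) != 0;
}
```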

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Add stencil texturing support for gen7
Jason Ekstrand [Mon, 17 Jun 2019 02:21:16 +0000 (21:21 -0500)]
anv: Add stencil texturing support for gen7

Intel hardware didn't get support for sampling from W-tiled (required
for stencil) images until Broadwell so we can't directly sample from
stencil.  Instead, if we want to support stencil texturing on gen7
hardware, we have to keep a texture-capable shadow copy around and use
BLORP to update when stencil changes.  The one thing this commit does
not implement is self-dependencies with stencil input attachments.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99493
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/blorp: Update shadow images when clearing or uploading
Jason Ekstrand [Mon, 17 Jun 2019 06:53:50 +0000 (01:53 -0500)]
anv/blorp: Update shadow images when clearing or uploading

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/cmd_buffer: Add a stencil transition helper
Jason Ekstrand [Mon, 17 Jun 2019 02:55:25 +0000 (21:55 -0500)]
anv/cmd_buffer: Add a stencil transition helper

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/blorp: Take an aspect in anv_image_copy_to_shadow
Jason Ekstrand [Mon, 17 Jun 2019 02:36:21 +0000 (21:36 -0500)]
anv/blorp: Take an aspect in anv_image_copy_to_shadow

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/formats: Re-arrange the way we set some flag bits
Jason Ekstrand [Mon, 17 Jun 2019 02:20:41 +0000 (21:20 -0500)]
anv/formats: Re-arrange the way we set some flag bits

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: Make resource_copy_region handle packed depth-stencil resources.
Kenneth Graunke [Mon, 17 Jun 2019 21:35:31 +0000 (16:35 -0500)]
iris: Make resource_copy_region handle packed depth-stencil resources.

Also copy along the separate stencil buffer if needed.

Fixes Piglit's arb_copy_image-formats.

5 years agoiris: Order CS stall and TC invalidate for format reinterpretation hacks
Kenneth Graunke [Mon, 17 Jun 2019 11:55:07 +0000 (06:55 -0500)]
iris: Order CS stall and TC invalidate for format reinterpretation hacks

This should ensure the TC invalidate happens after the stall.

Fixes KHR-GL43.copy_image.functional which does a CopyImage (blorp_copy)
from a buffer (using R8G8B8A8_UINT), then GetTexImage to read back the
original image (using R10G10B10A2_UNORM).

5 years agoiris: Be more aggressive at post-format-reinterpret TC invalidate hack
Kenneth Graunke [Mon, 17 Jun 2019 14:39:46 +0000 (09:39 -0500)]
iris: Be more aggressive at post-format-reinterpret TC invalidate hack

When copying/blitting with format reinterpretation, we invalidate the
texture cache before/after.  Before is so the source of the copy works,
and after is to get rid of our new data in the "wrong" format to protect
future attempts to sample.

When I ported these hacks to iris, I tried to be cautious by only
bothering with the hacks if the batch referenced the BO.  This makes
some sense for the before case.  If it isn't referenced, the texture
cache can't really have any data for the BO (since it's also invalidated
between batches).  But we still need to do the after case regardless,
as we've just polluted the cache with hazardous entries.
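
The resulting policy can be sketched as follows.  The flag and function
names are hypothetical and are not the iris driver's real identifiers;
the sketch only encodes the before/after reasoning given above:

```c
#include <stdbool.h>

/* Hypothetical flags for illustration only. */
enum tc_flush {
   TC_FLUSH_NONE        = 0,
   TC_INVALIDATE_BEFORE = 1 << 0,
   TC_INVALIDATE_AFTER  = 1 << 1,
};

/* Which texture cache invalidates a copy/blit needs. */
unsigned
reinterpret_copy_flushes(bool formats_differ, bool batch_references_bo)
{
   unsigned flushes = TC_FLUSH_NONE;

   if (!formats_differ)
      return flushes;

   /* Before: only useful if this batch could have cached the BO. */
   if (batch_references_bo)
      flushes |= TC_INVALIDATE_BEFORE;

   /* After: always, since the copy itself just polluted the cache. */
   flushes |= TC_INVALIDATE_AFTER;

   return flushes;
}
```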

5 years agovirgl: Assume sRGB write control for older guest kernels or virglrenderer hosts
Gert Wollny [Mon, 17 Jun 2019 06:44:14 +0000 (08:44 +0200)]
virgl: Assume sRGB write control for older guest kernels or virglrenderer hosts

When the host virglrenderer is an older version that doesn't check the sRGB write
control feature, or when the guest kernel doesn't support CAPS v2, then the guest
will only report support for GL 2.1 on a GL 3.3 host, even though it was supporting
3.3 with earlier guest mesa versions.

By also checking the host feature check version this regression can be avoided.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110921
Fixes: 2845939d6a72
   virgl: Set sRGB write control CAP based on host capabilities

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
5 years agofreedreno/a6xx: disallow UBWC for x24s8
Rob Clark [Fri, 14 Jun 2019 16:12:46 +0000 (09:12 -0700)]
freedreno/a6xx: disallow UBWC for x24s8

Fixes:
  dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d
  dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d
  dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agofreedreno/a6xx: un-swap X24S8_UINT
Rob Clark [Thu, 13 Jun 2019 18:58:30 +0000 (11:58 -0700)]
freedreno/a6xx: un-swap X24S8_UINT

The stencil is actually in the .w component, but we used to use SWAP to
remap the channels.  This doesn't work when tiled/ubwc.

Fixes:
  dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array
  dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube
  dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array
  dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube
  dEQP-GLES31.functional.stencil_texturing.misc.base_level
  dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot
  dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot
  dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
  dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
  dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agoradv: add mipmaps support for DCC decompression on compute
Samuel Pitoiset [Mon, 17 Jun 2019 08:53:24 +0000 (10:53 +0200)]
radv: add mipmaps support for DCC decompression on compute

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add mipmaps support for color decompressions (DCC/FMASK/CMASK)
Samuel Pitoiset [Fri, 14 Jun 2019 07:21:58 +0000 (09:21 +0200)]
radv: add mipmaps support for color decompressions (DCC/FMASK/CMASK)

And some cleanups.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: set the DCC/FCE predicates from the base level
Samuel Pitoiset [Fri, 14 Jun 2019 13:17:06 +0000 (15:17 +0200)]
radv: set the DCC/FCE predicates from the base level

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: load the fast color clear values from the base level
Samuel Pitoiset [Fri, 14 Jun 2019 08:07:27 +0000 (10:07 +0200)]
radv: load the fast color clear values from the base level

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: store the DCC predicate for each mip
Samuel Pitoiset [Fri, 14 Jun 2019 13:15:09 +0000 (15:15 +0200)]
radv: store the DCC predicate for each mip

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: store the FCE predicate for each mip
Samuel Pitoiset [Fri, 14 Jun 2019 13:07:24 +0000 (15:07 +0200)]
radv: store the FCE predicate for each mip

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: store the fast color clear values for each mip
Samuel Pitoiset [Fri, 14 Jun 2019 08:21:56 +0000 (10:21 +0200)]
radv: store the fast color clear values for each mip

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: allocate DCC metadata for each mip
Samuel Pitoiset [Fri, 14 Jun 2019 12:52:28 +0000 (14:52 +0200)]
radv: allocate DCC metadata for each mip

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agogallium: Remove unused util_ringbuffer
Caio Marcelo de Oliveira Filho [Wed, 12 Jun 2019 23:14:52 +0000 (16:14 -0700)]
gallium: Remove unused util_ringbuffer

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agollvmpipe: Don't use u_ringbuffer for lp_scene_queue
Caio Marcelo de Oliveira Filho [Wed, 12 Jun 2019 22:32:30 +0000 (15:32 -0700)]
llvmpipe: Don't use u_ringbuffer for lp_scene_queue

Inline the ring buffer and signal logic into lp_scene_queue instead of
using a u_ringbuffer.  The code ends up simpler since there's no need
to handle serializing data from / to packets.

This fixes a crash when Mesa is compiled with LTO.  The crash happened
because util_ringbuffer_dequeue() was writing data after the "header
packet", as shown below:

    struct scene_packet {
       struct util_packet header;
       struct lp_scene *scene;
    };

    /* Snippet of old lp_scene_dequeue(). */
    packet.scene = NULL;
    ret = util_ringbuffer_dequeue(queue->ring,
                                  &packet.header,
                                  sizeof packet / 4,
                                  wait);
    return packet.scene;

However, because of the way aliasing analysis works, the compiler
didn't consider "&packet.header" to alias with "packet.scene".  With
the aggressive inlining done by LTO, this would end up always
returning NULL instead of the content read by
util_ringbuffer_dequeue().
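
A minimal sketch of the inlined approach, storing scene pointers
directly so no packet header is involved.  This is not the actual
lp_scene_queue code: the names and capacity are illustrative, and the
mutex and condition variable signaling are left out to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>

#define SCENE_QUEUE_SIZE 4   /* illustrative; must be a power of two */

struct lp_scene;             /* opaque for this sketch */

/* Hypothetical simplified queue: pointers are stored as-is, so there
 * is nothing to serialize and nothing written behind a header. */
struct scene_queue {
   struct lp_scene *scenes[SCENE_QUEUE_SIZE];
   unsigned head;   /* count of items read so far */
   unsigned tail;   /* count of items written so far */
};

/* Returns false if the queue is full. */
bool
scene_queue_enqueue(struct scene_queue *q, struct lp_scene *scene)
{
   if (q->tail - q->head == SCENE_QUEUE_SIZE)
      return false;
   q->scenes[q->tail++ % SCENE_QUEUE_SIZE] = scene;
   return true;
}

/* Returns NULL if the queue is empty. */
struct lp_scene *
scene_queue_dequeue(struct scene_queue *q)
{
   if (q->head == q->tail)
      return NULL;
   return q->scenes[q->head++ % SCENE_QUEUE_SIZE];
}
```

Because the dequeue reads the pointer through its own type, the
compiler has no room to assume the store and the load cannot alias.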

Issue found by Marco Simental and Thiago Macieira.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110884
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agopanfrost/midgard: Simplify 2D array logic
Alyssa Rosenzweig [Mon, 17 Jun 2019 19:41:41 +0000 (12:41 -0700)]
panfrost/midgard: Simplify 2D array logic

It shouldn't matter if we stick a z in for non-arrays, anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Handle non-zero component in store
Alyssa Rosenzweig [Mon, 17 Jun 2019 19:35:57 +0000 (12:35 -0700)]
panfrost/midgard: Handle non-zero component in store

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Apply writemask to LUTs
Alyssa Rosenzweig [Mon, 17 Jun 2019 18:49:44 +0000 (11:49 -0700)]
panfrost/midgard: Apply writemask to LUTs

Fixes LUT instructions with NIR registers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoamd: update addrlib
Marek Olšák [Fri, 14 Jun 2019 21:55:38 +0000 (17:55 -0400)]
amd: update addrlib

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi: reduce MAX_GEOMETRY_OUTPUT_VERTICES
Nicolai Hähnle [Tue, 19 Jun 2018 11:53:01 +0000 (13:53 +0200)]
radeonsi: reduce MAX_GEOMETRY_OUTPUT_VERTICES

This fixes piglit spec@glsl-1.50@gs-max-output on gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agopanfrost: Cleanup default blend mode
Alyssa Rosenzweig [Mon, 17 Jun 2019 17:22:37 +0000 (10:22 -0700)]
panfrost: Cleanup default blend mode

Just encode the Mali magic number for `replace` rather than awkwardly
forcing Gallium structures through.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Don't accidentally include blend shader
Alyssa Rosenzweig [Mon, 17 Jun 2019 17:08:47 +0000 (10:08 -0700)]
panfrost: Don't accidentally include blend shader

Some residual dirty state can leak through across frames; zero this out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Use typeless moves internally
Alyssa Rosenzweig [Mon, 17 Jun 2019 16:40:14 +0000 (09:40 -0700)]
panfrost/midgard: Use typeless moves internally

We switch all fmov to (i)mov, following the NIR switch. This simplifies
some code surrounding blend shaders and should have no functional
changes elsewhere.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agovirgl: better support for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
Chia-I Wu [Fri, 10 May 2019 04:44:33 +0000 (21:44 -0700)]
virgl: better support for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE

When the resource to be mapped is busy and the backing storage can
be discarded, reallocate the backing storage to avoid waiting.

In this new path, we allocate a new buffer, emit a state change,
write, and add the transfer to the queue.  In the
PIPE_TRANSFER_DISCARD_RANGE path, we suballocate a staging buffer,
write, and emit a copy_transfer (which may allocate, memcpy, and
blit internally).  The win might not always be clear.  But the new
path also clears res->valid_buffer_range and does not clear
res->clean_mask, which makes it much more preferable in scenarios
such as

  access = enough_space ? GL_MAP_UNSYNCHRONIZED_BIT :
                          GL_MAP_INVALIDATE_BUFFER_BIT;
  glMapBufferRange(..., GL_MAP_WRITE_BIT | access);
  memcpy(...); // append new data
  glUnmapBuffer(...);
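
On the driver side, the choice between the paths described above can be
sketched roughly as below.  The enum and function are hypothetical and
do not mirror the real virgl transfer code; they only encode the
busy/discard reasoning from this message:

```c
#include <stdbool.h>

/* Hypothetical map strategies, for illustration only. */
enum map_path {
   MAP_DIRECT,           /* resource idle: map it as-is */
   MAP_REALLOC_STORAGE,  /* busy + DISCARD_WHOLE_RESOURCE: new backing */
   MAP_STAGING_COPY,     /* busy + DISCARD_RANGE: staging + copy */
   MAP_WAIT,             /* busy, nothing discardable: wait */
};

enum map_path
choose_map_path(bool busy, bool discard_whole, bool discard_range)
{
   if (!busy)
      return MAP_DIRECT;
   if (discard_whole)
      return MAP_REALLOC_STORAGE;
   if (discard_range)
      return MAP_STAGING_COPY;
   return MAP_WAIT;
}
```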

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>