platform/upstream/mesa.git
6 years ago r600: Emit EOP for more CF instruction types
Gert Wollny [Fri, 17 Nov 2017 11:13:40 +0000 (12:13 +0100)]
r600: Emit EOP for more CF instruction types

So far, on pre-Cayman chipsets, for the CF instructions CF_OP_LOOP_END,
CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS, an extra CF_NOP instruction
was added to carry the EOP flag, even though this is not actually
needed, because all of these instructions support the EOP flag.

This patch removes the fixup code, sets the EOP flag on the
corresponding instructions as well as on others such as CF_OP_TEX and
CF_OP_VTX, and makes the disassembler write out EOP for this type of
instruction.

This also fixes a bug where shaders were created that didn't actually have
the EOP flag set in the last CF instruction, which might have resulted
in GPU lockups.
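
The difference between the two schemes can be sketched with a toy model.
This is only an illustration, not the r600 driver code: the CFInstr
class and both finalize functions are hypothetical names, and only the
opcode names are taken from the message above.

```python
from dataclasses import dataclass

# CF opcodes that, per the commit message, previously got an extra CF_NOP.
NEEDED_NOP_FIXUP = {"CF_OP_LOOP_END", "CF_OP_CALL_FS", "CF_OP_POP", "CF_OP_GDS"}


@dataclass
class CFInstr:
    """Hypothetical stand-in for one control-flow instruction."""
    op: str
    end_of_program: bool = False


def finalize_with_nop(program):
    """Old scheme: append a CF_NOP to carry EOP after certain opcodes."""
    out = [CFInstr(i.op, i.end_of_program) for i in program]
    if out and out[-1].op in NEEDED_NOP_FIXUP:
        out.append(CFInstr("CF_OP_NOP"))
    if out:
        out[-1].end_of_program = True
    return out


def finalize_in_place(program):
    """New scheme: set EOP directly on the last CF instruction."""
    out = [CFInstr(i.op, i.end_of_program) for i in program]
    if out:
        out[-1].end_of_program = True
    return out
```

With a program ending in CF_OP_POP, the old scheme yields one extra
instruction, while the new one keeps the length and flags the
CF_OP_POP itself.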

[airlied: cleaned up a little]
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago meson: replace with_*dri with with_dri_platform
Dylan Baker [Tue, 21 Nov 2017 00:34:28 +0000 (16:34 -0800)]
meson: replace with_*dri with with_dri_platform

This fixes the windows and macos stubs to be consistent with the *nix
path.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: add logic to select apple and windows dri
Dylan Baker [Sat, 28 Oct 2017 00:20:52 +0000 (17:20 -0700)]
meson: add logic to select apple and windows dri

This is still not fully correct (Haiku and BSD are notably probably not
correct), but Linux is not regressed and this should be correct for
macOS and Windows.

v2: - set the dri_platform to windows on Cygwin as well (Jon)
v3: - Add a better todo for Hurd (Eric)
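
A minimal sketch of the selection logic described here, written in
Python rather than meson. The function name, the has_kms_drm parameter
and the returned strings ("apple", "windows", "drm", "none") are
assumptions made for illustration; only the Cygwin-as-windows rule comes
from the v2 note above.

```python
def select_dri_platform(system: str, has_kms_drm: bool) -> str:
    """Pick a DRI platform name from the host system name.

    `system` is assumed to look like meson's host_machine.system().
    """
    if system == "darwin":
        return "apple"
    # v2 of the patch treats Cygwin the same way as native Windows.
    if system in ("windows", "cygwin"):
        return "windows"
    if has_kms_drm:
        return "drm"
    # Systems that fall through (e.g. Haiku, Hurd) are not handled properly.
    return "none"
```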

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Fix LLVM requires for radeonsi
Dylan Baker [Tue, 21 Nov 2017 00:26:06 +0000 (16:26 -0800)]
meson: Fix LLVM requires for radeonsi

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: convert llvm option to tristate
Dylan Baker [Sat, 18 Nov 2017 00:37:50 +0000 (16:37 -0800)]
meson: convert llvm option to tristate

This option has been acting as a strange sort of half-tristate anyway.
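
As a rough illustration of what a tristate option means, here is a
sketch in Python. The option values "auto", "true" and "false", the
function name and the required_by_driver parameter are assumptions for
this example, not a copy of the build logic.

```python
def resolve_llvm(option: str, llvm_found: bool, required_by_driver: bool) -> bool:
    """Decide whether to build with LLVM from a tristate option."""
    if option not in ("auto", "true", "false"):
        raise ValueError(f"unknown option value: {option!r}")
    if option == "false":
        if required_by_driver:
            raise RuntimeError("a selected driver requires LLVM")
        return False
    if not llvm_found:
        # "true" makes LLVM mandatory; "auto" only fails if a driver needs it.
        if option == "true" or required_by_driver:
            raise RuntimeError("LLVM is required but was not found")
        return False
    return True
```

In this sketch "auto" silently falls back to building without LLVM when
nothing needs it, which is the behaviour a plain boolean cannot express.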

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Convert platform to auto
Dylan Baker [Thu, 16 Nov 2017 01:31:32 +0000 (17:31 -0800)]
meson: Convert platform to auto

This is necessary to support operating systems other than the *nix
family (excluding macOS). For Linux nothing has changed; the defaults
are still the same.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Remove duplicate _GNU_SOURCE
Dylan Baker [Thu, 16 Nov 2017 01:30:52 +0000 (17:30 -0800)]
meson: Remove duplicate _GNU_SOURCE

There is one provided unconditionally, and one guarded by platform ==
linux. Remove the unconditional one.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Remove completed or irrelevant TODO comments
Dylan Baker [Thu, 16 Nov 2017 01:09:33 +0000 (17:09 -0800)]
meson: Remove completed or irrelevant TODO comments

These are all either done already, or are autotools specific. The
misspelled gallium G3DVL is the autotools specific bit, meson is
handling that via build_by_default.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Fix TODO for missing dl_iterate_phdr function
Dylan Baker [Thu, 16 Nov 2017 01:07:37 +0000 (17:07 -0800)]
meson: Fix TODO for missing dl_iterate_phdr function

This function is required for both the Intel "Anvil" vulkan driver and
the i965 GL driver. Error out if either of those is enabled but this
function isn't found.
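
The check amounts to the following, sketched in Python. The function
name, its parameters and the driver labels "i965" and "anvil" are
hypothetical names chosen for this example.

```python
def check_dl_iterate_phdr(has_function: bool, enabled_drivers) -> None:
    """Fail configuration if a driver needs dl_iterate_phdr and it is missing."""
    needing = sorted({"i965", "anvil"} & set(enabled_drivers))
    if needing and not has_function:
        raise RuntimeError(
            "dl_iterate_phdr is required by: " + ", ".join(needing)
        )
```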

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: disable x86 asm in fewer cases.
Dylan Baker [Thu, 16 Nov 2017 00:53:40 +0000 (16:53 -0800)]
meson: disable x86 asm in fewer cases.

This patch allows building asm for x86 on x86_64 platforms, when the
operating system is the same. Previously cross compile always turned off
assembly. This allows using a cross file to cross compile x86 binaries
on x86_64 with asm.

This could probably be relaxed further thanks to meson's "exe_wrapper",
which is a way to specify an emulator or compatibility layer (wine) that
can run the foreign binaries on the build system. Since the meson build
at this point only supports building on Linux, I can't test this, and I
don't want to write/enable code that cannot even be build tested.

v4: - set condition to build == x86_64 and host == x86 and
      build.system == host.system
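
The v4 condition can be written out as a small predicate. This is a
sketch in Python, not the meson code: the Machine class and the function
name are hypothetical, and the native (non-cross) branch is an
assumption, since the message only describes the cross case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    cpu_family: str  # e.g. "x86", "x86_64", "aarch64"
    system: str      # e.g. "linux", "windows"


def can_build_x86_asm(build: Machine, host: Machine, is_cross: bool) -> bool:
    """Return True if x86 assembly may be enabled for this configuration."""
    if host.cpu_family not in ("x86", "x86_64"):
        return False
    if not is_cross:
        return True  # assumed: native x86/x86_64 builds keep asm enabled
    # Cross case from v4: 32-bit x86 binaries built on an x86_64 machine
    # running the same operating system.
    return (
        build.cpu_family == "x86_64"
        and host.cpu_family == "x86"
        and build.system == host.system
    )
```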

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Enable SSE4.1 optimizations
Dylan Baker [Thu, 16 Nov 2017 00:09:22 +0000 (16:09 -0800)]
meson: Enable SSE4.1 optimizations

This patch checks for and then enables SSE4.1 optimizations if the
host machine is x86/x86_64.

v2: - Don't compile code, it's unnecessary since we require a compiler
      which always has SSE4.1 (Matt)
v3: - x64 -> x86_64 (Matt)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago broadcom/vc5: Fix BASE_LEVEL handling with txl.
Eric Anholt [Wed, 22 Nov 2017 00:32:33 +0000 (16:32 -0800)]
broadcom/vc5: Fix BASE_LEVEL handling with txl.

The HW doesn't add the base level anywhere (the min/max lod clamping is
what does base level), so we need to add it manually in this case.

Fixes piglit tex-miplevel-selection *Lod 2D.

6 years ago broadcom/vc5: Fix array texture layer count setup.
Eric Anholt [Wed, 22 Nov 2017 00:05:49 +0000 (16:05 -0800)]
broadcom/vc5: Fix array texture layer count setup.

Fixes piglit array-texture.

6 years ago broadcom/vc5: Don't increment primitive queries while they're paused.
Eric Anholt [Tue, 21 Nov 2017 23:27:20 +0000 (15:27 -0800)]
broadcom/vc5: Don't increment primitive queries while they're paused.

Fixes ext_transform_feedback-generatemipmap prims_generated

6 years ago broadcom/vc5: Fix incorrect padding of TF outputs.
Eric Anholt [Tue, 21 Nov 2017 23:20:31 +0000 (15:20 -0800)]
broadcom/vc5: Fix incorrect padding of TF outputs.

After the first output, we were padding by an extra size of the previous
output.  Fixes piglit ext_transform_feedback-output-type mat4x3[2] and
friends.

6 years ago broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.
Eric Anholt [Tue, 21 Nov 2017 23:00:36 +0000 (15:00 -0800)]
broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes.

The HW was computing an implicit height for the surface based on the image
size, but that may be smaller than the surface with ARB_fbo mismatched
sizes.  In that case, we need to tell it about the pad, either with the
little 4-bit field in the RT config, or the extended field in
CLEAR_COLORS_PART3.

Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.

6 years ago etnaviv: Put HALTI level in specs
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:25 +0000 (10:44 +0100)]
etnaviv: Put HALTI level in specs

The HALTI level is an indication of the gross architecture of the GPU.
It determines to a significant extent what feature level the GPU has,
what state (especially frontend state) is present, and where it is
located.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
6 years ago etnaviv: Const-correctness etnaviv_emit.h
Wladimir J. van der Laan [Sat, 18 Nov 2017 09:44:24 +0000 (10:44 +0100)]
etnaviv: Const-correctness etnaviv_emit.h

The relocation structure is never changed by submitting it.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
6 years ago meson: add si_driinfo.h in libgallium_dri
Juan A. Suarez Romero [Tue, 21 Nov 2017 11:38:27 +0000 (12:38 +0100)]
meson: add si_driinfo.h in libgallium_dri

v2: generate target conditionally (Dylan)

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
6 years ago nir/gather_info: recognize load_patch_vertices_in as a system value
Iago Toral Quiroga [Thu, 16 Nov 2017 07:53:07 +0000 (08:53 +0100)]
nir/gather_info: recognize load_patch_vertices_in as a system value

This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is
generated to load gl_PatchVerticesIn in the SPIR-V path for both
Vulkan and OpenGL.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat
Jordan Justen [Wed, 15 Nov 2017 00:27:34 +0000 (16:27 -0800)]
i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat

This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated
samplers & surfaces.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
6 years ago intel/genxml: Add helpers for determining field type
Kristian H. Kristensen [Wed, 30 Nov 2016 05:07:57 +0000 (21:07 -0800)]
intel/genxml: Add helpers for determining field type

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years ago i965/fs: Check ADD/MAD with immediates in satprop unit test
Matt Turner [Mon, 20 Nov 2017 22:21:43 +0000 (14:21 -0800)]
i965/fs: Check ADD/MAD with immediates in satprop unit test

The gen had to be changed from 4 to 6 so that we could test MAD, which
is new on Gen6.

mad_imm_float_neg_mov_sat tests the case fixed by the previous commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago i965/fs: Handle negating immediates on MADs when propagating saturates
Matt Turner [Mon, 20 Nov 2017 22:24:57 +0000 (14:24 -0800)]
i965/fs: Handle negating immediates on MADs when propagating saturates

MADs don't take immediate sources, but we allow them in the IR since it
simplifies a lot of things. I neglected to consider that case.
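
To illustrate the case being handled, here is a toy model in Python. It
is not the compiler code: the Src class and helper names are
hypothetical, and it assumes a MAD computes src0 + src1 * src2, so that
negating the result means negating src0 and src1. The point is that a
register source gets its negate modifier flipped, while an immediate has
no modifier and must have its value negated instead.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Src:
    value: float
    is_imm: bool = False   # immediate sources carry no negate modifier
    negate: bool = False

    def read(self) -> float:
        return -self.value if self.negate else self.value


def negate_src(src: Src) -> Src:
    """Negate one source: flip the value for immediates, the flag otherwise."""
    if src.is_imm:
        return replace(src, value=-src.value)
    return replace(src, negate=not src.negate)


def negate_mad(srcs):
    """Return sources for -(src0 + src1 * src2)."""
    src0, src1, src2 = srcs
    return (negate_src(src0), negate_src(src1), src2)


def eval_mad(srcs) -> float:
    src0, src1, src2 = srcs
    return src0.read() + src1.read() * src2.read()
```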

Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate
                      negations into MADs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616
Reported-and-Tested-by: Ruslan Kabatsayev <b7.10110111@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D
Juan A. Suarez Romero [Wed, 15 Nov 2017 16:49:21 +0000 (16:49 +0000)]
mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D

From section 8.7, page 179 of OpenGL ES 3.2 spec:

  An INVALID_OPERATION error is generated by CompressedTexImage3D
  if internalformat is one of the formats in table 8.17 and target
  is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D.

  An INVALID_OPERATION error is generated by CompressedTexImage3D if
  internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array”
  column of table 8.17 is not checked, or if internalformat is
  TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked.

So far only TEXTURE_2D_ARRAY was considered a valid target. But since the
"Cube Map Array" column is checked in all cases, in practice we can
also accept TEXTURE_CUBE_MAP_ARRAY.
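
The two quoted rules can be sketched as a lookup, shown here in Python.
This is an illustration only: the function name, the string return
value and the single table entry are assumptions, and the table is
passed in by the caller rather than reproduced from the spec.

```python
TARGETS_3D_ENTRYPOINT = {"TEXTURE_2D_ARRAY", "TEXTURE_CUBE_MAP_ARRAY", "TEXTURE_3D"}


def compressed_teximage3d_error(target, internalformat, table):
    """Return an error name, or None if the quoted rules raise no error.

    `table` maps a format name to the set of targets it supports, standing
    in for the per-format columns of table 8.17.
    """
    if internalformat not in table:
        return None  # the quoted rules only cover formats in the table
    if target not in TARGETS_3D_ENTRYPOINT:
        return "INVALID_OPERATION"
    if target not in table[internalformat]:
        return "INVALID_OPERATION"
    return None


# Hypothetical entry: a format with "Cube Map Array" checked, "3D Tex." not.
EXAMPLE_TABLE = {
    "COMPRESSED_RGB8_ETC2": {"TEXTURE_2D_ARRAY", "TEXTURE_CUBE_MAP_ARRAY"},
}
```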

This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years ago intel: fix disasm_info memory leaks
Tapani Pälli [Mon, 20 Nov 2017 08:57:17 +0000 (10:57 +0200)]
intel: fix disasm_info memory leaks

Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
6 years ago st/glsl_to_nir: don't generate nir twice for gs
Timothy Arceri [Thu, 16 Nov 2017 00:16:10 +0000 (11:16 +1100)]
st/glsl_to_nir: don't generate nir twice for gs

This was left out of c980a3aa3133

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago llvmpipe: fix snorm blending
Roland Scheidegger [Sat, 18 Nov 2017 05:23:35 +0000 (06:23 +0100)]
llvmpipe: fix snorm blending

The blend math gets a bit funky because inverse blend factors are in the
range [0,2] rather than [-1,1], which our normalized math can't really
cover.
The src_alpha_saturate blend factor has a similar problem.
(Note that the piglit fbo-blending-formats test is mostly useless for
anything but unorm formats, since not only are all src/dst values
in [0,1], but the tests are crafted so that the results are in [0,1]
too.)
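
The range problem is easy to see numerically. The following Python
sketch assumes 8-bit snorm values converted with the usual
max(v / 127, -1) rule; the helper names are made up for the example and
nothing here is llvmpipe code.

```python
def snorm8_to_float(v: int) -> float:
    """Convert a signed 8-bit normalized value to a float in [-1, 1]."""
    return max(v / 127.0, -1.0)


def inverse_factor(f: float) -> float:
    """An inverse blend factor such as ONE_MINUS_SRC_ALPHA: 1 - f."""
    return 1.0 - f


def fits_snorm_range(x: float) -> bool:
    """True if x is representable by normalized math limited to [-1, 1]."""
    return -1.0 <= x <= 1.0
```

For a source value of -1.0 the inverse factor is 2.0, which lies outside
[-1,1], whereas for unorm inputs in [0,1] the inverse factor always
stays in [0,1].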

v2: some formatting fixes, and fix a fairly obscure (to debug)
issue with alpha-only formats (not related to snorm at all), where
blend optimization would think it could simplify the blend equation
if the blend factors were complementary, but was using the
completely unrelated rgb blend factors instead of the alpha ones...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
6 years ago r600: add cull distance support
Dave Airlie [Fri, 13 May 2016 04:35:33 +0000 (14:35 +1000)]
r600: add cull distance support

This passes all the tests in piglit.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago i965: Optimize bucket index calculation
Aravindan Muthukumar [Thu, 9 Nov 2017 05:45:28 +0000 (11:15 +0530)]
i965: Optimize bucket index calculation

Reduce the bucket index calculation to O(1).

This algorithm calculates the index using a matrix method.  Assuming
PAGE_SIZE is 4096, matrix arrangement is as below:

          1*4096   2*4096    3*4096    4*4096
          5*4096   6*4096    7*4096    8*4096
          10*4096  12*4096   14*4096   16*4096
          20*4096  24*4096   28*4096   32*4096
           ...      ...       ...       ...
           ...      ...       ...       ...
           ...      ...       ...   max_cache_size

From this matrix it is clear that every row follows the pattern below:

          ...       ...       ...        n
        n+(1/4)n  n+(1/2)n  n+(3/4)n    2n

The row is calculated as log2(size/PAGE_SIZE). The column is calculated by
converting the difference between adjacent elements to a power-of-two
size and indexing with it.

Final Index is (row*4)+(col-1)
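
The matrix method can be sketched in Python as follows. This is one way
to realize the row/column formula under the stated PAGE_SIZE assumption,
not a copy of the driver function; the function names are hypothetical,
and int.bit_length() stands in for a count-leading-zeros instruction.

```python
PAGE_SIZE = 4096  # page size assumed in the commit message


def bucket_sizes_in_pages(num_rows):
    """Bucket sizes, in pages, laid out row by row as in the matrix above."""
    sizes = [1, 2, 3, 4]
    for _ in range(num_rows - 1):
        n = sizes[-1]  # last element of the previous row
        sizes += [n + n // 4, n + n // 2, n + 3 * n // 4, 2 * n]
    return sizes


def bucket_index(size):
    """Index of the smallest bucket that can hold `size` bytes, in O(1)."""
    if size <= 0:
        raise ValueError("size must be positive")
    pages = (size + PAGE_SIZE - 1) // PAGE_SIZE  # round up to whole pages
    # Row 0 holds 1..4 pages, row 1 holds 5..8, row 2 holds 9..16, ...
    row = ((pages - 1) | 3).bit_length() - 2
    # Largest size of the previous row (0 for row 0, which has none).
    prev_row_max = ((4 << row) // 2) & ~2
    # log2 of the spacing between neighbouring columns in this row.
    col_shift = max(row - 1, 0)
    # 1-based column: offset into the row, rounded up to the column spacing.
    col = (pages - prev_row_max + (1 << col_shift) - 1) >> col_shift
    return row * 4 + (col - 1)
```

For example, a 9-page request falls in row 2, column 1, giving index 8,
which is the 10-page bucket.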

Tested with Intel Mesa CI.

Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20)

v4: Review comments on style and code comments implemented (Ian).
v3: Review comments implemented (Ian).
v2: Review comments implemented (Jason).

Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar Karanje <kedar.j.karanje@intel.com>
Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago meson: Guard the gallium dri componenet
Dylan Baker [Wed, 15 Nov 2017 01:04:27 +0000 (17:04 -0800)]
meson: Guard the gallium dri componenet

Currently the target has a redundant guard, and the state tracker isn't
properly guarded.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years ago meson: don't build gallium subdir unless we're building gallium
Dylan Baker [Wed, 15 Nov 2017 01:03:39 +0000 (17:03 -0800)]
meson: don't build gallium subdir unless we're building gallium

This will allow us to simplify some guards within the gallium directory.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years ago broadcom/vc5: Align 1D texture miplevels to 64b.
Eric Anholt [Mon, 20 Nov 2017 18:14:38 +0000 (10:14 -0800)]
broadcom/vc5: Align 1D texture miplevels to 64b.

Fixes tex-miplevel-selection GL2:texture() 1D

6 years ago broadcom/vc5: Clamp min lod to the last level.
Eric Anholt [Mon, 20 Nov 2017 18:07:24 +0000 (10:07 -0800)]
broadcom/vc5: Clamp min lod to the last level.

Otherwise, the simulator would complain in tex-miplevel-selection that the
min/max clamp was out of order.  The actual HW seems to have clamped to
the max anyway.

6 years ago broadcom/vc5: Increase simulator memory for tex-miplevel-selection.
Eric Anholt [Mon, 20 Nov 2017 20:26:49 +0000 (12:26 -0800)]
broadcom/vc5: Increase simulator memory for tex-miplevel-selection.

We were overflowing, because of all the little 4k allocations for CLs that
were getting expanded to 128kb in the simulator due to the GMP alignment.

6 years ago swr/rast: Repair simd8 frontend code rot
Tim Rowley [Fri, 10 Nov 2017 22:45:38 +0000 (16:45 -0600)]
swr/rast: Repair simd8 frontend code rot

Keep non-default simd8 frontend code running for comparison purposes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader
Tim Rowley [Thu, 9 Nov 2017 01:17:24 +0000 (19:17 -0600)]
swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader

Disabled for now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Simplify GATHER* jit builder api
Tim Rowley [Wed, 8 Nov 2017 20:07:33 +0000 (14:07 -0600)]
swr/rast: Simplify GATHER* jit builder api

General cleanup, and prep work for possibly moving to llvm masked
gather intrinsic.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Add alignment to transpose targets
Tim Rowley [Tue, 7 Nov 2017 21:24:25 +0000 (15:24 -0600)]
swr/rast: Add alignment to transpose targets

Needed to ensure alignment for avx512.

Fixes address sanitizer crash.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Cache eventmanager
Tim Rowley [Tue, 7 Nov 2017 19:50:11 +0000 (13:50 -0600)]
swr/rast: Cache eventmanager

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Enable AVX-512 targets in the jitter
Tim Rowley [Tue, 31 Oct 2017 21:46:59 +0000 (16:46 -0500)]
swr/rast: Enable AVX-512 targets in the jitter

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Points with clipdistance can't go through simplepoints path
Tim Rowley [Tue, 31 Oct 2017 14:41:02 +0000 (09:41 -0500)]
swr/rast: Points with clipdistance can't go through simplepoints path

Fixes piglit glsl-1.20:vs-clip-vertex-primitives and
glsl-1.30:vs-clip-distance-primitives.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Code style change (NFC)
Tim Rowley [Mon, 23 Oct 2017 20:10:35 +0000 (15:10 -0500)]
swr/rast: Code style change (NFC)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Widen fetch shader to SIMD16
Tim Rowley [Thu, 19 Oct 2017 22:33:37 +0000 (17:33 -0500)]
swr/rast: Widen fetch shader to SIMD16

Widen fetch shader to SIMD16, enable SIMD16 types in the jitter,
and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago swr/rast: Support flexible vertex layout for DS output
Tim Rowley [Wed, 18 Oct 2017 21:51:07 +0000 (16:51 -0500)]
swr/rast: Support flexible vertex layout for DS output

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
6 years ago gallium/u_threaded: avoid syncing in threaded_context_flush
Nicolai Hähnle [Fri, 10 Nov 2017 10:15:44 +0000 (11:15 +0100)]
gallium/u_threaded: avoid syncing in threaded_context_flush

We could always do the flush asynchronously, but if we're going to wait
for a fence anyway and the driver thread is currently idle, the additional
communication overhead isn't worth it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: avoid syncing the driver thread in si_fence_finish
Nicolai Hähnle [Fri, 10 Nov 2017 09:58:10 +0000 (10:58 +0100)]
radeonsi: avoid syncing the driver thread in si_fence_finish

It is really only required when we need to flush for deferred fences.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: recompute the relative timeout after waiting for ready fence
Nicolai Hähnle [Mon, 13 Nov 2017 13:50:17 +0000 (14:50 +0100)]
radeonsi: recompute the relative timeout after waiting for ready fence

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago ddebug: fix the hang detection timeout calculation
Nicolai Hähnle [Fri, 10 Nov 2017 16:13:27 +0000 (17:13 +0100)]
ddebug: fix the hang detection timeout calculation

Fixes: c9fefa062b36 ("ddebug: rewrite to always use a threaded approach")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago ddebug: fix use-after-free of streamout targets
Nicolai Hähnle [Fri, 10 Nov 2017 12:11:53 +0000 (13:11 +0100)]
ddebug: fix use-after-free of streamout targets

Fixes: b47727a83ad6 ("ddebug: implement pipelined hang detection mode")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago gallium/u_threaded: properly initialize fence unflushed tokens
Nicolai Hähnle [Fri, 10 Nov 2017 10:28:28 +0000 (11:28 +0100)]
gallium/u_threaded: properly initialize fence unflushed tokens

This got lost in a rebase but never hurt anything because we happened
to always sync in fence_finish anyway...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago util/u_queue: really use futex-based fences
Nicolai Hähnle [Fri, 10 Nov 2017 11:32:44 +0000 (12:32 +0100)]
util/u_queue: really use futex-based fences

The relevant define changed in the final revision of the simple mutex
patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago util/u_queue: fix timeout handling in util_queue_fence_wait_timeout
Nicolai Hähnle [Mon, 13 Nov 2017 13:35:50 +0000 (14:35 +0100)]
util/u_queue: fix timeout handling in util_queue_fence_wait_timeout

Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago st/mesa: use asynchronous flushes in st_finish
Nicolai Hähnle [Thu, 9 Nov 2017 13:34:20 +0000 (14:34 +0100)]
st/mesa: use asynchronous flushes in st_finish

With threaded gallium, the driver may currently be running in another
thread. In that case, we will execute all remaining commands in that
thread instead of syncing, which should be better for cache locality.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago st/mesa: implement st_server_wait_sync properly
Nicolai Hähnle [Thu, 9 Nov 2017 13:34:19 +0000 (14:34 +0100)]
st/mesa: implement st_server_wait_sync properly

Asynchronous flushes require a proper implementation of
st_server_wait_sync, because we could have the following with
threaded Gallium:

 Context 1 app     Context 1 driver         Context 2
 -------------     ----------------         ---------
 f = glFenceSync
 glFlush
 <-- app sync -->                           <-- app sync -->
                                            glWaitSync(f)
                                            .. draw calls ..
                   pipe_context::flush
                     for glFenceSync
                   pipe_context::flush
                     for glFlush

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago u_threaded_gallium: remove synchronization in fence_server_sync
Nicolai Hähnle [Mon, 6 Nov 2017 10:56:54 +0000 (11:56 +0100)]
u_threaded_gallium: remove synchronization in fence_server_sync

The whole point of fence_server_sync is that it can be used to
avoid waiting in the application thread.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago amd: build addrlib with C++11
Nicolai Hähnle [Wed, 15 Nov 2017 11:51:23 +0000 (12:51 +0100)]
amd: build addrlib with C++11

It is required for LLVM anyway.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103658
Fixes: 7f33e94e43a6 ("amd/addrlib: update to latest version")
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi/gfx9: fix VM fault with fetched instance divisors
Nicolai Hähnle [Wed, 15 Nov 2017 10:22:26 +0000 (11:22 +0100)]
radeonsi/gfx9: fix VM fault with fetched instance divisors

We need to account for SGPR locations in merged shaders.

This case is exercised by KHR-GL45.enhanced_layouts.vertex_attrib_locations

Fixes: 79c2e7388c7f ("radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radv: use a 16 bytes array for the sampled/storage image descriptors
Samuel Pitoiset [Wed, 15 Nov 2017 11:08:29 +0000 (12:08 +0100)]
radv: use a 16 bytes array for the sampled/storage image descriptors

This allows updating them with only one memcpy().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago radv: do not add the query pool BO to the list in vkCmdEndQuery()
Samuel Pitoiset [Wed, 15 Nov 2017 09:55:05 +0000 (10:55 +0100)]
radv: do not add the query pool BO to the list in vkCmdEndQuery()

As per the spec, the query identified by queryPool and query
must currently be active. Applications have to call vkCmdBeginQuery()
before, and thus the query pool BO will already be in the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago radv: only load needed depth clear regs for fast depth clears
Samuel Pitoiset [Wed, 15 Nov 2017 14:44:01 +0000 (15:44 +0100)]
radv: only load needed depth clear regs for fast depth clears

Similar to how the driver sets the depth clear regs after a
fast depth clear. Most of the time, this will copy a 32-bit reg
instead of a 64-bit reg.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: do not add the image BO in radv_set_depth_clear_regs()
Samuel Pitoiset [Wed, 15 Nov 2017 14:44:00 +0000 (15:44 +0100)]
radv: do not add the image BO in radv_set_depth_clear_regs()

For the fast path, radv_fill_buffer() ensures that the BO is
already in the list. For the slow path, the depth surface is
part of the framebuffer which means the BO is added to the list
when the framebuffer is emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: remove useless assertion in emit_depthstencil_clear()
Samuel Pitoiset [Wed, 15 Nov 2017 14:43:59 +0000 (15:43 +0100)]
radv: remove useless assertion in emit_depthstencil_clear()

Already checked in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: remove useless check in radv_set_depth_clear_regs()
Samuel Pitoiset [Wed, 15 Nov 2017 14:43:58 +0000 (15:43 +0100)]
radv: remove useless check in radv_set_depth_clear_regs()

aspects can't be zero and there is an assertion that ensures
it's not in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago docs/features: mark some r600 extensions supported
Dave Airlie [Sun, 19 Nov 2017 23:19:31 +0000 (09:19 +1000)]
docs/features: mark some r600 extensions supported

These just looked to be missed when this file was updated.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago glsl: Catch subscripted calls to undeclared subroutines
George Barrett [Sun, 19 Nov 2017 10:55:10 +0000 (21:55 +1100)]
glsl: Catch subscripted calls to undeclared subroutines

generate_array_index fails to check whether the target of a subroutine
call exists in the AST, potentially passing around null ir_rvalue
pointers, which eventually results in an abort or segfault.

Fixes: fd01840c0bd3 ("glsl: add AoA support to subroutines")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438

6 years ago broadcom/vc5: Fix up integer texture handling.
Eric Anholt [Fri, 13 Oct 2017 20:11:15 +0000 (13:11 -0700)]
broadcom/vc5: Fix up integer texture handling.

The original spec I had didn't expose integer textures and suggested that
you use unfiltered floats.  Now there are proper formats for them.

Fixes 16- and 32-bit texwrap integer tests in piglit, and
dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.

6 years ago broadcom/vc5: Fix simulator assertion failures about color RT clears.
Eric Anholt [Fri, 17 Nov 2017 01:50:55 +0000 (17:50 -0800)]
broadcom/vc5: Fix simulator assertion failures about color RT clears.

When we tried to clear color while storing depth, an assertion failed,
basically because there was not enough information to decide which color
RT to clear.  It turns out that STORE_GENERAL picks the buffer according
to the color buffer being stored, or all of them if NONE.  If you're doing
depth, it doesn't know which to pick.

6 years ago freedreno/ir3: add texture gather support
Rob Clark [Sat, 18 Nov 2017 15:40:49 +0000 (10:40 -0500)]
freedreno/ir3: add texture gather support

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago etnaviv: enable full overwrite when no color buffer is present
Lucas Stach [Wed, 15 Nov 2017 16:33:17 +0000 (17:33 +0100)]
etnaviv: enable full overwrite when no color buffer is present

The OVERWRITE bit disables destination fetches, which is exactly what
we want when there is no valid color buffer bound.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
6 years ago i965: Stop including brw_cfg.h in brw_disasm_info.h
Jason Ekstrand [Sat, 18 Nov 2017 01:27:55 +0000 (17:27 -0800)]
i965: Stop including brw_cfg.h in brw_disasm_info.h

The brw_disasm_info header is included by certain tools in order to get
shader assembly from binaries so it's a semi-external header.  Including
brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit
of our back-end compiler internals.  Instead, make the couple of forward
declarations we need and make the header more stand-alone.  This fixes
the meson build.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4f82b17287194ca7d10816f6cfe4712a3e0a03fc

6 years ago i965: Mark BOs as external when we export their handle
Jason Ekstrand [Sat, 18 Nov 2017 00:52:09 +0000 (16:52 -0800)]
i965: Mark BOs as external when we export their handle

Almost all of our BO export paths already properly marked the BO as
external and added it to the handle table.  Most export use-cases go
through a prime fd or flink where we have a brw_bo export helper that
does the right thing.  The one missing one happens when you call
queryImage and ask for __DRI_IMAGE_ATTRIB_HANDLE.  We just grabbed the
gem handle out of the BO (because it's really easy to do that) and
handed it off to the client; what could go wrong?  As it turns out, this
path is used by basically every compositor that wants to turn around and
call drmModeAddFB2 on it so it can hand it off to display.  The result,
as of 4b1e70cc57d7ff5f465544644b2180dee1490cee, is that we no longer set
MOCS_PTE on those surfaces and the kernel's attempts to disable caching
fail and scanout gets corrupted.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103759
Fixes: 4b1e70cc57d7ff5f465544644b2180dee1490cee
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
6 years ago i965/bufmgr: Add a helper to mark a BO as external
Jason Ekstrand [Sat, 18 Nov 2017 00:49:03 +0000 (16:49 -0800)]
i965/bufmgr: Add a helper to mark a BO as external

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
6 years ago i965: Correct disasm_info usage in eu_validate test
Andres Gomez [Sat, 18 Nov 2017 00:48:45 +0000 (02:48 +0200)]
i965: Correct disasm_info usage in eu_validate test

Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")

Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
6 years agobroadcom/vc5: Set up the padded height at surface creation time.
Eric Anholt [Tue, 14 Nov 2017 23:52:53 +0000 (15:52 -0800)]
broadcom/vc5: Set up the padded height at surface creation time.

This centralizes the calculation in the surface, instead of in each
load/store.

6 years agobroadcom/vc5: Ensure that there is always a TLB write.
Eric Anholt [Wed, 15 Nov 2017 23:05:37 +0000 (15:05 -0800)]
broadcom/vc5: Ensure that there is always a TLB write.

This should fix some GPU hangs in our (currently always single-threaded)
fragment shaders, and definitely fixes assertion failures in simulation.

6 years agobroadcom/vc5: Fix clear color for swap_color_rb render targets.
Eric Anholt [Tue, 7 Nov 2017 23:42:04 +0000 (15:42 -0800)]
broadcom/vc5: Fix clear color for swap_color_rb render targets.

Fixes dEQP-GLES3.functional.depth_stencil_clear.depth.*

6 years agobroadcom/vc5: Fix pasteo in front stencil ref value setup.
Eric Anholt [Tue, 7 Nov 2017 23:37:46 +0000 (15:37 -0800)]
broadcom/vc5: Fix pasteo in front stencil ref value setup.

Fixes piglit masked-clear.

6 years agobroadcom/vc5: Fix colormasking when we need to swap r/b colors.
Eric Anholt [Tue, 7 Nov 2017 23:35:33 +0000 (15:35 -0800)]
broadcom/vc5: Fix colormasking when we need to swap r/b colors.

Fixes part of piglit masked-clear.

6 years agobroadcom/vc5: Enable the Z min/max clipping planes.
Eric Anholt [Tue, 7 Nov 2017 23:21:06 +0000 (15:21 -0800)]
broadcom/vc5: Enable the Z min/max clipping planes.

6 years agobroadcom/vc5: Fix driver for new PIPE_SHADER_CAP_MAX_HW_ATOMIC_*.
Eric Anholt [Wed, 15 Nov 2017 00:01:32 +0000 (16:01 -0800)]
broadcom/vc5: Fix driver for new PIPE_SHADER_CAP_MAX_HW_ATOMIC_*.

6 years agor300: add PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* switch cases
Brian Paul [Fri, 17 Nov 2017 16:38:39 +0000 (09:38 -0700)]
r300: add PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* switch cases

To silence compiler warnings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agotgsi: s/uint/enum pipe_shader_type/
Brian Paul [Fri, 17 Nov 2017 22:03:21 +0000 (15:03 -0700)]
tgsi: s/uint/enum pipe_shader_type/

Roland Scheidegger <sroland@vmware.com>

6 years agotgsi: bump tgsi_opcode_info::output_mode size to 4 bits
Brian Paul [Fri, 17 Nov 2017 16:51:10 +0000 (09:51 -0700)]
tgsi: bump tgsi_opcode_info::output_mode size to 4 bits

To avoid problems with MSVC.  And verify size with ASSERT_BITFIELD_SIZE().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agoi965: Revert Gen8 aspect of VF PIPE_CONTROL workaround.
Kenneth Graunke [Fri, 17 Nov 2017 06:31:27 +0000 (22:31 -0800)]
i965: Revert Gen8 aspect of VF PIPE_CONTROL workaround.

This apparently causes hangs on Broadwell, so let's back it out for now.
I think there are other PIPE_CONTROL workarounds that we're missing.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103787

6 years agoegl: Convert int to attrib in eglGetPlatformDisplay
Adam Jackson [Thu, 16 Nov 2017 18:27:27 +0000 (13:27 -0500)]
egl: Convert int to attrib in eglGetPlatformDisplay

... because converting attrib to int truncates, and that's bad.
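
A rough illustration of the direction of the conversion, assuming a
pointer-sized attrib type and a 32-bit int type as stand-ins for
EGLAttrib and EGLint.  All names here are hypothetical; only the
terminator value matches EGL_NONE.  Widening each int to an attrib
loses nothing, whereas narrowing an attrib to an int can drop the upper
bits on 64-bit platforms.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NONE 0x3038        /* same value as EGL_NONE */

typedef int32_t  sketch_int;      /* stands in for EGLint */
typedef intptr_t sketch_attrib;   /* stands in for EGLAttrib */

/* Widen a NONE-terminated list of key/value ints into a newly
 * allocated attrib list.  Returns NULL for a NULL input or when the
 * allocation fails.  The caller frees the result. */
static sketch_attrib *
sketch_int_list_to_attrib_list(const sketch_int *ints)
{
   size_t pairs = 0;
   size_t len;
   sketch_attrib *attribs;

   if (!ints)
      return NULL;

   /* Keys sit at even indices; stop at the terminating key. */
   while (ints[2 * pairs] != SKETCH_NONE)
      pairs++;
   len = 2 * pairs + 1;           /* key/value pairs plus terminator */

   attribs = malloc(len * sizeof(*attribs));
   if (!attribs)
      return NULL;

   for (size_t i = 0; i < len; i++)
      attribs[i] = ints[i];       /* sign-extending widen, lossless */

   return attribs;
}
```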

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agodocs: update features for freedreno
Rob Clark [Fri, 17 Nov 2017 20:18:14 +0000 (15:18 -0500)]
docs: update features for freedreno

Just comparing glxinfo and features.txt, and it seems features.txt is
fairly out of date.  The a5xx specific features (compute/images/atomics/
etc) are recent.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agoi965: Rename intel_asm_annotation -> brw_disasm_info
Matt Turner [Thu, 16 Nov 2017 19:43:51 +0000 (11:43 -0800)]
i965: Rename intel_asm_annotation -> brw_disasm_info

It was the only file named intel_* in the compiler.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: Rewrite disassembly annotation code
Matt Turner [Thu, 16 Nov 2017 01:08:42 +0000 (17:08 -0800)]
i965: Rewrite disassembly annotation code

The old code used an array to store each "instruction group" (the new,
better name than the old overloaded "annotation"), and required a
memmove() to shift elements over in the array when we needed to split a
group so that we could add an error message. This was confusing and
difficult to get right, not least because the array
has a tail sentinel not included in .ann_count.

Instead use a linked list, a data structure made for efficient
insertion.
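
To illustrate why a list helps, here is a small sketch of splitting a
group by linking a new node after it.  It uses a plain singly linked
list with hypothetical names, not the list type or structs used in the
compiler.  The split is a constant number of pointer updates, with no
elements to shift and no sentinel to account for.

```c
#include <stdlib.h>

/* Hypothetical instruction group: a start offset, an optional error
 * message, and a link to the next group. */
struct sketch_group {
   int offset;
   const char *error;
   struct sketch_group *next;
};

/* Split `group` at `offset` by inserting a new node right after it.
 * Returns the new node, or NULL if the allocation fails. */
static struct sketch_group *
sketch_group_split(struct sketch_group *group, int offset)
{
   struct sketch_group *tail = calloc(1, sizeof(*tail));
   if (!tail)
      return NULL;

   tail->offset = offset;
   tail->next = group->next;   /* new node takes over the old successor */
   group->next = tail;         /* and becomes the direct successor */
   return tail;
}
```

An error message can then be attached to the node that ends at the
offending instruction without touching any other group.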

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: Simplify annotation_insert_error()
Matt Turner [Thu, 16 Nov 2017 21:42:41 +0000 (13:42 -0800)]
i965: Simplify annotation_insert_error()

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: Move common code out of #ifdef
Matt Turner [Thu, 16 Nov 2017 21:35:01 +0000 (13:35 -0800)]
i965: Move common code out of #ifdef

I'm going to change the call in a later patch and with the difference in
indentation level it wasn't immediately obvious that the calls were
identical.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965: Remove DWord length from MI_FLUSH_DW definition
Anuj Phogat [Tue, 14 Nov 2017 22:48:21 +0000 (14:48 -0800)]
i965: Remove DWord length from MI_FLUSH_DW definition

Fixes: 6165fda59b8 ("i965: Program DWord Length in MI_FLUSH_DW")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoanv/cmd_buffer: Take bo_offset into account in fast clear state addresses
Jason Ekstrand [Sat, 11 Nov 2017 19:52:41 +0000 (11:52 -0800)]
anv/cmd_buffer: Take bo_offset into account in fast clear state addresses

Otherwise, if the image is not bound to the start of the buffer, we're
going to be reading and writing its fast clear state in the wrong spot.
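
The arithmetic can be sketched as follows.  The struct, its field
names, and the per-level entry layout are assumptions made for
illustration only; the point is just that the image's offset within the
BO has to be part of the sum.

```c
#include <stdint.h>

/* Hypothetical image description: where the image starts inside its
 * BO, and where the fast clear state starts inside the image. */
struct sketch_image {
   uint64_t bo_offset;
   uint64_t clear_state_offset;
};

/* Offset, from the start of the BO, of the fast clear state entry for
 * `level`, assuming one entry of `entry_size` bytes per level. */
static uint64_t
sketch_fast_clear_state_offset(const struct sketch_image *image,
                               uint32_t level, uint32_t entry_size)
{
   return image->bo_offset +
          image->clear_state_offset +
          (uint64_t)level * entry_size;
}
```

Leaving out bo_offset gives the same result only when the image is
bound at offset zero, which is the case the commit message calls out.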

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
6 years agoanv/cmd_buffer: Advance the address when initializing clear colors
Jason Ekstrand [Sun, 12 Nov 2017 06:03:45 +0000 (22:03 -0800)]
anv/cmd_buffer: Advance the address when initializing clear colors

Found by inspection

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
6 years agoradeon/video: enable encode support for raven
Boyuan Zhang [Tue, 7 Nov 2017 21:25:09 +0000 (16:25 -0500)]
radeon/video: enable encode support for raven

Enable h.264 encode for vcn hardware (raven)

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
6 years agoradeonsi: enable vcn encode
Boyuan Zhang [Tue, 7 Nov 2017 21:24:10 +0000 (16:24 -0500)]
radeonsi: enable vcn encode

Enable vcn encode by creating radeon_encoder for vcn.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
6 years agoradeon/vcn: add create encoder
Boyuan Zhang [Wed, 8 Nov 2017 16:24:09 +0000 (11:24 -0500)]
radeon/vcn: add create encoder

Add implementation for create_encoder interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
6 years agoradeon/vcn: add encode get feedback
Boyuan Zhang [Tue, 7 Nov 2017 21:21:21 +0000 (16:21 -0500)]
radeon/vcn: add encode get feedback

Add implementation for get_feedback interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
6 years agoradeon/vcn: add encode destroy
Boyuan Zhang [Tue, 7 Nov 2017 21:20:53 +0000 (16:20 -0500)]
radeon/vcn: add encode destroy

Add implementation for destroy interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
6 years agoradeon/vcn: add encode end frame
Boyuan Zhang [Tue, 7 Nov 2017 21:20:25 +0000 (16:20 -0500)]
radeon/vcn: add encode end frame

Add implementation for end_frame interface for vcn encode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>