platform/upstream/mesa.git
5 years agonir/lower_clip: add support for geometry shaders
Timothy Arceri [Thu, 27 Jun 2019 04:20:37 +0000 (14:20 +1000)]
nir/lower_clip: add support for geometry shaders

This will be used to enabled compat profile support for geometry
shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add lower_clip_outputs() helper
Timothy Arceri [Fri, 28 Jun 2019 00:50:15 +0000 (10:50 +1000)]
nir/lower_clip: add lower_clip_outputs() helper

This will be reused in the following patch to add support for clip
vertex lowering in geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add create_clipdist_vars() helper
Timothy Arceri [Fri, 28 Jun 2019 00:35:11 +0000 (10:35 +1000)]
nir/lower_clip: add create_clipdist_vars() helper

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add a find_clipvertex_and_position_outputs() helper
Timothy Arceri [Fri, 28 Jun 2019 00:10:28 +0000 (10:10 +1000)]
nir/lower_clip: add a find_clipvertex_and_position_outputs() helper

This will allow code sharing in a following patch that adds support
for lowering in geometry shaders. It also allows us to exit early
if there is no lowering to do which allows a small code tidy up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agopanfrost: Set rt_count
Alyssa Rosenzweig [Thu, 18 Jul 2019 19:43:39 +0000 (12:43 -0700)]
panfrost: Set rt_count

This doesn't quite work yet, but it illustrates how MRT is implemented
in the MFBD: rt_count is set appropriately based on the number of render
targets, while additional render target descriptors are appended on with
an index variable in them (not quite decoded since there's some aspects
we don't understand there, but conceptually this should be right).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Trace invisible BOs
Alyssa Rosenzweig [Thu, 18 Jul 2019 19:42:27 +0000 (12:42 -0700)]
panfrost: Trace invisible BOs

Helps make the decode a little more readable (names instead of
addresses).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Preserve empty tiler heap symmetry
Alyssa Rosenzweig [Thu, 18 Jul 2019 19:28:56 +0000 (12:28 -0700)]
panfrost/decode: Preserve empty tiler heap symmetry

If tiler_heap_end == tiler_heap_start, ensure it's printed the same
rather than one erroring out as hex.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Zero polygon list body size for clears
Alyssa Rosenzweig [Thu, 18 Jul 2019 19:10:39 +0000 (12:10 -0700)]
panfrost: Zero polygon list body size for clears

There's no polygons, so you can't have any size to the polygon list,
although there is a minimal header.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/mfbd: Unify depth-only with masked FBO path
Alyssa Rosenzweig [Thu, 18 Jul 2019 18:09:19 +0000 (11:09 -0700)]
panfrost/mfbd: Unify depth-only with masked FBO path

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Simplify set_framebuffer_state
Alyssa Rosenzweig [Thu, 18 Jul 2019 18:05:01 +0000 (11:05 -0700)]
panfrost: Simplify set_framebuffer_state

Most of the ad hoc logic is already in Gallium.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Check for NULL surface in places
Alyssa Rosenzweig [Thu, 18 Jul 2019 17:59:59 +0000 (10:59 -0700)]
panfrost: Check for NULL surface in places

Fixes a bunch of NULL dereferences, although it does cause GPU faults of
course.

This is caused by color buffers masked out in MRT, which we'll
eventually have to solve the right way... one thing at a time.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Expose 4 render targets
Alyssa Rosenzweig [Thu, 18 Jul 2019 17:48:19 +0000 (10:48 -0700)]
panfrost: Expose 4 render targets

Hidden behind deqp flag as usual.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Shrink tiler heap
Alyssa Rosenzweig [Thu, 18 Jul 2019 20:04:44 +0000 (13:04 -0700)]
panfrost: Shrink tiler heap

128MB is excessive and 16MB is still plenty. Saves 112MB/context on
kernels without growable/heap support.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonir/large_constants: De-duplicate constants
Caio Marcelo de Oliveira Filho [Fri, 7 Jun 2019 16:28:14 +0000 (09:28 -0700)]
nir/large_constants: De-duplicate constants

If a function has a constant and is called more than once, after
inlining we may end up with different variables representing the same
constant.  This commit look into the data and de-duplicate them.

The first pass now will collect the constant data in a per variable
buffer, then de-duplication happens (by sorting then linear walk), and
the second pass will use the data in var->data.location.

One side-effect of the current implementation is that constants will
be reordered.  If this turns out to be a problem is something that can
be fixed.

An alternative strategy considered was to perform this in a
per-function basis and then merge the results, the problem is that we
would have to fix up the offsets during the merge.  Given the data we
have, the current patch is good enough.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir/large_constants: Use ralloc for var_infos
Caio Marcelo de Oliveira Filho [Fri, 7 Jun 2019 16:21:09 +0000 (09:21 -0700)]
nir/large_constants: Use ralloc for var_infos

This will be used later on to allocate constant data for each
variable (and then deduplicate).  Also drop initializing found_read,
as it is already implicitly false in the literal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agofreedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper.
Eric Anholt [Tue, 16 Jul 2019 18:19:28 +0000 (11:19 -0700)]
freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper.

Cuts a bunch of boilerplate.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Convert load_barycentric_at_sample to the NIR lowering helper.
Eric Anholt [Tue, 16 Jul 2019 18:15:15 +0000 (11:15 -0700)]
freedreno: Convert load_barycentric_at_sample to the NIR lowering helper.

Cuts out a ton of boilerplate.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Convert load_barycentric_at_offset to the NIR lowering helper.
Eric Anholt [Tue, 16 Jul 2019 18:09:39 +0000 (11:09 -0700)]
freedreno: Convert load_barycentric_at_offset to the NIR lowering helper.

Cuts out a ton of boilerplate.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agov3d: Use nir_shader_lower_instructions() for txf_ms lowering.
Eric Anholt [Tue, 16 Jul 2019 17:55:56 +0000 (10:55 -0700)]
v3d: Use nir_shader_lower_instructions() for txf_ms lowering.

Cuts out a bunch of boilerplate.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years agonir: Allow internal changes to the instr in nir_shader_lower_instructions().
Eric Anholt [Tue, 16 Jul 2019 17:52:25 +0000 (10:52 -0700)]
nir: Allow internal changes to the instr in nir_shader_lower_instructions().

v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in
NIR, but doesn't generate a new txf_ms instructions as replacement.  It's
pretty easy to allow that in nir_shader_lower_instructions, and it may be
common in lowering passes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovc4: Convert vc4_nir_lower_txf_ms to nir_shader_lower_instructions().
Eric Anholt [Tue, 16 Jul 2019 17:21:13 +0000 (10:21 -0700)]
vc4: Convert vc4_nir_lower_txf_ms to nir_shader_lower_instructions().

Cuts out a bunch of boilerplate.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years agov3d: Fix assertion failures in debug builds.
Eric Anholt [Tue, 16 Jul 2019 18:59:35 +0000 (11:59 -0700)]
v3d: Fix assertion failures in debug builds.

nir_lower_io leaves around deref_var instructions after lowering away
deref intrinsics.  This ends up breaking validation after v3d_nir_lower_io
removes variables not actually being stored by the shader's
store_output()s.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years agopanfrost: Handle Z24 textures
Alyssa Rosenzweig [Thu, 18 Jul 2019 00:39:34 +0000 (17:39 -0700)]
panfrost: Handle Z24 textures

Just use the Z32 code.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/ci: Update expectations
Alyssa Rosenzweig [Thu, 18 Jul 2019 00:17:41 +0000 (17:17 -0700)]
panfrost/ci: Update expectations

We just fixed some stencil tests.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Make scissor test more robust
Alyssa Rosenzweig [Wed, 17 Jul 2019 23:30:09 +0000 (16:30 -0700)]
panfrost: Make scissor test more robust

See v3d implementation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Use correct NO_DITHER field on MFBD
Alyssa Rosenzweig [Wed, 17 Jul 2019 23:19:45 +0000 (16:19 -0700)]
panfrost: Use correct NO_DITHER field on MFBD

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Implement Z32F(_S8) support
Alyssa Rosenzweig [Wed, 17 Jul 2019 22:49:42 +0000 (15:49 -0700)]
panfrost: Implement Z32F(_S8) support

Z32F uses a dediacted float path. Z32F_S8 uses separate stencil planes
in the hardware, lowered via u_transfer_helper.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Don't disassemble NULL shaders
Alyssa Rosenzweig [Wed, 17 Jul 2019 22:43:24 +0000 (15:43 -0700)]
panfrost/decode: Don't disassemble NULL shaders

It is legal to load a shader from a NULL address, particularly when the
TILER job is used strictly for effects on the Z/S buffer with 0x0 color
mask. Don't crash the decoder in this case.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Copy stencil front to back if back disabled
Alyssa Rosenzweig [Wed, 17 Jul 2019 22:42:48 +0000 (15:42 -0700)]
panfrost: Copy stencil front to back if back disabled

When backside stenciling is disabled, backfacing primitives just do the
same thing as frontfacing primitives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoswr/rast: Refactor memory API between rasterizer core and swr
Jan Zielinski [Wed, 17 Jul 2019 15:22:16 +0000 (17:22 +0200)]
swr/rast: Refactor memory API between rasterizer core and swr

This commit cleans up API between the core of the rasterizer and swr.
Some formatting changes are also done.

Reviewed-by: Alok Hota <alok.hota@intel.com>
5 years agolima/ppir: Add gl_PointCoord handling
Andreas Baierl [Fri, 31 May 2019 07:54:27 +0000 (09:54 +0200)]
lima/ppir: Add gl_PointCoord handling

Treat gl_PointCoord as a system value and
add the necessary bits for correct codegen.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogallium: Add PIPE_CAP_TGSI_FS_POINT_IS_SYSVAL
Andreas Baierl [Tue, 4 Jun 2019 11:25:28 +0000 (13:25 +0200)]
gallium: Add PIPE_CAP_TGSI_FS_POINT_IS_SYSVAL

This adds an option to treat gl_PointCoord as a system value.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir/tgsi: Extend tgsi_to_nir.c to support gl_PointCoord as a system value.
Andreas Baierl [Tue, 11 Jun 2019 12:59:11 +0000 (14:59 +0200)]
nir/tgsi: Extend tgsi_to_nir.c to support gl_PointCoord as a system value.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Add gl_PointCoord system value
Andreas Baierl [Tue, 4 Jun 2019 11:24:53 +0000 (13:24 +0200)]
nir: Add gl_PointCoord system value

gl_PointCoord handling needs some special bits set in lima/ppir code
generation. Treating gl_PointCoord as a system value makes it easier
to distinguish from a regular varying.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoglsl: Optionally declare gl_PointCoord as a system value
Andreas Baierl [Tue, 4 Jun 2019 11:23:44 +0000 (13:23 +0200)]
glsl: Optionally declare gl_PointCoord as a system value

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agolima/gp: Fix problem with complex moves
Connor Abbott [Sat, 11 May 2019 16:43:30 +0000 (18:43 +0200)]
lima/gp: Fix problem with complex moves

When writing the scheduler, we forgot that you can't read the complex
unit in certain sources because it gets overwritten to 0 or 1. Fixing
this turned out to be possible without giving up and reducing
GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't
expect. There can be at most 4 next-max nodes that can't have moves
scheduled in the complex slot, so it actually isn't a problem for
getting the number of next-max nodes at 5 or lower. However, it is a
problem for stores. If a given node is a next-max node whose move cannot
go in the complex slot *and* is used by a store that we decide to
schedule, we have to reserve one of the non-complex slots for a move
instead of all the slots, or we can wind up in a situation where only
the complex slot is free and we fail the move. This means that we have
to add another term to the reservation logic, for stores whose children
cannot be in the complex slot.

Acked-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/gpir: Rework the scheduler
Connor Abbott [Thu, 11 Jan 2018 23:35:58 +0000 (18:35 -0500)]
lima/gpir: Rework the scheduler

Now, we do scheduling at the same time as value register allocation. The
ready list now acts similarly to the array of registers in
value_regalloc, keeping us from running out of slots. Before this, the
value register allocator wasn't aware of the scheduling constraints of
the actual machine, which meant that it sometimes chose the wrong false
dependencies to insert. Now, we assign value registers at the same time
as we actually schedule instructions, making its choices reflect reality
much better. It was also conservative in some cases where the new scheme
doesn't have to be. For example, in something like:

1 = ld_att
2 = ld_uni
3 = add 1, 2

It's possible that one of 1 and 2 can't be scheduled in the same
instruction as 3, meaning that a move needs to be inserted, so the value
register allocator needs to assume that this sequence requires two
registers. But when actually scheduling, we could discover that 1, 2,
and 3 can all be scheduled together, so that they only require one
register. The new scheduler speculatively inserts the instruction under
consideration, as well as all of its child load instructions, and then
counts the number of live value registers after all is said and done.
This lets us be more aggressive with scheduling when we're close to the
limit.

With the new scheduler, the kmscube vertex shader is now scheduled in 40
instructions, versus 66 before.

Acked-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/gp: Mark more add-only nodes as maybe-two-slot
Connor Abbott [Mon, 22 Apr 2019 19:54:06 +0000 (21:54 +0200)]
lima/gp: Mark more add-only nodes as maybe-two-slot

Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agolima/gpir: Fix some bugs in instruction handling
Connor Abbott [Tue, 16 Jan 2018 00:38:17 +0000 (19:38 -0500)]
lima/gpir: Fix some bugs in instruction handling

Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agolima: Reintroduce the standalone compiler
Connor Abbott [Fri, 3 Nov 2017 21:34:32 +0000 (17:34 -0400)]
lima: Reintroduce the standalone compiler

I used this to test things without needing to have a device handy.

Acked-by: Qiang Yu <yuq825@gmail.com>
5 years agonir/lower_viewport: Check variable mode first
Connor Abbott [Mon, 10 Jun 2019 19:59:12 +0000 (21:59 +0200)]
nir/lower_viewport: Check variable mode first

The location is unused for shader_temp and function_temp variables, and
due to the way we nir_lower_io_to_temproraries demotes shader_out
variables to shader_temp variables, it happened to equal
VARYING_SLOT_POS for the gl_Position temporary, which made this pass
fail with the offline compiler due to this coming before vars_to_ssa.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agoradv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID
Samuel Pitoiset [Thu, 18 Jul 2019 08:14:09 +0000 (10:14 +0200)]
radv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv/gfx10: move emitting VGT_PRIMITIVEID_EN into the NGG path
Samuel Pitoiset [Thu, 18 Jul 2019 08:14:07 +0000 (10:14 +0200)]
radv/gfx10: move emitting VGT_PRIMITIVEID_EN into the NGG path

And do not emit VGT_GS_MODE which is unnecessary on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv/gfx10: do not always execute a barrier before the second shader
Samuel Pitoiset [Wed, 17 Jul 2019 13:43:53 +0000 (15:43 +0200)]
radv/gfx10: do not always execute a barrier before the second shader

With NGG, empty waves may still be required to export data.

This fixes dEQP-VK.ycbcr.format.*_unorm.geometry_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix VGT_GS_MODE if VS uses the primitive ID
Samuel Pitoiset [Wed, 17 Jul 2019 08:58:17 +0000 (10:58 +0200)]
radv: fix VGT_GS_MODE if VS uses the primitive ID

Found by inspection.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agov3d: emit correct lowering for logic operations with MSAA render targets
Iago Toral Quiroga [Tue, 16 Jul 2019 08:58:55 +0000 (10:58 +0200)]
v3d: emit correct lowering for logic operations with MSAA render targets

v2:
 - Drop the writemask from the per-sample color intrinsic (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: handle nir_intrinsic_store_tlb_sample_color_v3d
Iago Toral Quiroga [Tue, 16 Jul 2019 09:20:11 +0000 (11:20 +0200)]
v3d: handle nir_intrinsic_store_tlb_sample_color_v3d

v2:
 - Move handling of output intrinsics to ntq_emit_intrinsic() (Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: add a V3D-specific intrinsic for per-sample color writes
Iago Toral Quiroga [Tue, 16 Jul 2019 07:44:52 +0000 (09:44 +0200)]
nir: add a V3D-specific intrinsic for per-sample color writes

For per-sample color writes we need the output intrinsic to pack the
sample index, which is not provided with regular store_output intrinsics
unless we figured out a way to encode it into the base or the offset.

v2:
 - Drop the writemask (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: implement per-sample tlb color writes
Iago Toral Quiroga [Tue, 16 Jul 2019 08:48:06 +0000 (10:48 +0200)]
v3d: implement per-sample tlb color writes

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: refactor the tlb color write code
Iago Toral Quiroga [Tue, 16 Jul 2019 08:28:53 +0000 (10:28 +0200)]
v3d: refactor the tlb color write code

We want to split the tlb specifier setup from the color writes, because when
we implement per-sample color writes we want to do the latter for all the
samples, but the former only once.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: move tlb color write emission to a helper function
Iago Toral Quiroga [Tue, 16 Jul 2019 08:14:17 +0000 (10:14 +0200)]
v3d: move tlb color write emission to a helper function

We will soon be adding per-sample color writes which means additional
complexity and more indentation (we will need another loop to emit
the writes for each individual sample), so this will help keeping
things simple and a bit more readable.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: implement per-sample tlb color reads
Iago Toral Quiroga [Tue, 16 Jul 2019 08:58:12 +0000 (10:58 +0200)]
v3d: implement per-sample tlb color reads

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoanv: fix format mapping for depth/stencil formats
Lionel Landwerlin [Wed, 17 Jul 2019 22:00:11 +0000 (01:00 +0300)]
anv: fix format mapping for depth/stencil formats

anv_format is supposed to have a pointer back to the associated
VkFormat, we were missed this for depth/stencil formats.

This doesn't fix anything afaict, but will be needed for future
changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 465de47bad70 ("anv: associate vulkan formats with aspects")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: put back VGT_FLUSH at ring init on gfx10
Dave Airlie [Thu, 18 Jul 2019 06:12:53 +0000 (16:12 +1000)]
radv: put back VGT_FLUSH at ring init on gfx10

I can find no evidence that removing this is a good idea.

Fixes: 9b116173b6a ("radv: do not emit VGT_FLUSH on GFX10")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agosoftpipe: Clamp border colors when needed
Gert Wollny [Tue, 16 Jul 2019 06:05:20 +0000 (08:05 +0200)]
softpipe: Clamp border colors when needed

unorm and snorm require that the border color values are clamped, so when
picking the sampler view copy/clamp the border color from the sampler and
use these adjusted values.

Fixes:

  dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_compressed_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_snorm_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_srgb_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_unorm_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_compressed_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_snorm_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_srgb_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_color
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
  dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth_uint_stencil_sample_depth

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agosoftpipe: set a lower minimum clamp value for texture coordinate border clamp
Gert Wollny [Mon, 15 Jul 2019 10:51:06 +0000 (12:51 +0200)]
softpipe: set a lower minimum clamp value for texture coordinate border clamp

The value of -0.5f is not small enough to produce negative coordinates,
so lower the minimum clamp value to -1.0f. This fixes a number of tests
from
   dEQP-GLES31.functional.texture.border_clamp.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agosoftpipe: Correct repeat-mirror evaluation
Gert Wollny [Mon, 15 Jul 2019 11:46:53 +0000 (13:46 +0200)]
softpipe: Correct repeat-mirror evaluation

when mirroring the texture corrdinates the indices must be mirrored as
well and the half pixel shift must be applied in reverse.

Fixes a number of tests from:
  dEQP-GLES31.functional.texture.gather.offset.*
  dEQP-GLES31.functional.texture.gather.offsets.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agosoftpipe: Also mark textures as dirty when updating the framebuffer state
Gert Wollny [Fri, 24 May 2019 10:08:26 +0000 (12:08 +0200)]
softpipe: Also mark textures as dirty when updating the framebuffer state

At this point all the draw caches are flushed to the old attached textures,
so the read caches of these textures will need to be updated too.

Fixes:
   dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoetnaviv: set DITHER_MODE
Jonathan Marek [Fri, 12 Jul 2019 23:26:32 +0000 (19:26 -0400)]
etnaviv: set DITHER_MODE

This fixes a rendering glitch observed in SDL testscale test, where alpha
blending samples with value (1.0, 1.0, 1.0, 0.0) whitens the target instead
of having no effect.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: update headers from rnndb
Jonathan Marek [Fri, 12 Jul 2019 23:25:33 +0000 (19:25 -0400)]
etnaviv: update headers from rnndb

Update to etna_viv commit a16a418.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: fix blend color on newer GPUs
Jonathan Marek [Wed, 3 Jul 2019 18:06:17 +0000 (14:06 -0400)]
etnaviv: fix blend color on newer GPUs

Newer GPUs use the half float ALPHA_COLOR_EXT register.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: fix alpha blending cases
Jonathan Marek [Wed, 3 Jul 2019 18:04:20 +0000 (14:04 -0400)]
etnaviv: fix alpha blending cases

We need to check rgb_func/alpha_func when determining if blend or separate
alpha is required.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: fix polygon offset
Jonathan Marek [Wed, 3 Jul 2019 18:01:33 +0000 (14:01 -0400)]
etnaviv: fix polygon offset

Dividing the fui result by 65535 is obviously wrong, and from testing, on
GC7000L at least there is no division by 65535.

Fixes dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoradv: dont store disasm string unless keep_shader_info flag set
Timothy Arceri [Wed, 17 Jul 2019 04:20:55 +0000 (14:20 +1000)]
radv: dont store disasm string unless keep_shader_info flag set

This fixes the memory use regression from bug 111107.

Fixes: 726a31df705 ("radv: Add the concept of radv shader binaries.")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111107

5 years agoradv/gfx10: set the pgm rsrc3/4 regs using index sh reg set
Dave Airlie [Tue, 16 Jul 2019 05:21:56 +0000 (15:21 +1000)]
radv/gfx10: set the pgm rsrc3/4 regs using index sh reg set

This is ported from AMDVLK, it's probably not requires unless
we want to use "real time queues", but it might be nice to just have
in place.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: use correct register setter for ngg hw addr
Dave Airlie [Wed, 17 Jul 2019 04:55:52 +0000 (14:55 +1000)]
radv: use correct register setter for ngg hw addr

this shouldn't matter, but it's good to be correct.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agofreedreno/a6xx: Drop the WFI in the program update stateobj.
Eric Anholt [Wed, 17 Jul 2019 19:56:12 +0000 (12:56 -0700)]
freedreno/a6xx: Drop the WFI in the program update stateobj.

Rob Clark thinks this was likely a workaround for our const buffer
update bugs, and now that it's passing tests, we should be able to
drop it.

renderdoc-traces results:

traces/android/clashofclans.rdc:  +6.1% +/-   1.1%
traces/android/candycrush.rdc:    +5.2% +/-   1.6%

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/a6xx: Drop the WFI in constant uploads.
Eric Anholt [Wed, 19 Jun 2019 18:36:29 +0000 (11:36 -0700)]
freedreno/a6xx: Drop the WFI in constant uploads.

Now that the bin vs render constlen is fixed, we can skip these waits.

Improves webgl aquarium performance at 10k fish from 27fps to 33.
Some highlights from renderdoc-traces:

traces/android/minecraft.rdc:             +17.1% +/-   3.4%
traces/glmark2/ideas-speed=duration.rdc:  +11.6% +/-   2.4%
traces/android/candycrush.rdc:            +5.4%  +/-   1.1%
traces/android/clashofclans.rdc:          +4.4%  +/-   1.3%

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: Assert that we don't exceed constlen.
Eric Anholt [Wed, 26 Jun 2019 22:35:44 +0000 (15:35 -0700)]
freedreno: Assert that we don't exceed constlen.

We actually could go up to vs->constlen in the binning shader on a6xx,
but for sanity let's make sure that we're always under constlen.  This
would have caught the bug fixed in 572c76fd8826 ("freedreno: Clamp UBO
uploads to the constlen decided by the shader.")

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: Fix more constlen overflows.
Eric Anholt [Wed, 26 Jun 2019 23:47:10 +0000 (16:47 -0700)]
freedreno: Fix more constlen overflows.

Fixes constlen overflow in
dEQP-GLES31.functional.shaders.builtin_var.compute.num_work_groups and
dEQP-GLES31.functional.image_load_store.buffer.image_size.readonly_32
and probably others.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: Drop stale comment about skipping uploads.
Eric Anholt [Wed, 19 Jun 2019 19:00:57 +0000 (12:00 -0700)]
freedreno: Drop stale comment about skipping uploads.

We already skip the upload if it's unused, due to the constlen >
offset check.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agovirgl: Set meta data for textures from handle.
Lepton Wu [Wed, 17 Jul 2019 17:02:20 +0000 (10:02 -0700)]
virgl: Set meta data for textures from handle.

The set of meta data was removed by commit 8083464. It broke lots of
dEQP tests when running with pbuffer surface type.

Fixes: 80834640137 ("virgl: remove dead code")
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
5 years agoradv: Only save the descriptor set if we have one.
Bas Nieuwenhuizen [Wed, 17 Jul 2019 00:58:59 +0000 (02:58 +0200)]
radv: Only save the descriptor set if we have one.

After reset, if valid does not contain the relevant bit the descriptor
can be != NULL but still not be valid.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoanv: report timestampComputeAndGraphics true
Lionel Landwerlin [Wed, 17 Jul 2019 05:46:53 +0000 (08:46 +0300)]
anv: report timestampComputeAndGraphics true

Spec says :

   "timestampComputeAndGraphics specifies support for timestamps on all
    graphics and compute queues. If this limit is set to VK_TRUE, all
    queues that advertise the VK_QUEUE_GRAPHICS_BIT or
    VK_QUEUE_COMPUTE_BIT in the VkQueueFamilyProperties::queueFlags
    support VkQueueFamilyProperties::timestampValidBits of at least 36."

On gen7+ this should be true (we only have 32bits of timestamp on
gen6 and below).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 802f00219addb3 ("anv/device: Update features and limits")
Reported-by: Timothy Strelchun <timothy.strelchun@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Enable fast clears on other miplevels and layers than 0.
Rafael Antognolli [Tue, 25 Jun 2019 21:14:22 +0000 (14:14 -0700)]
iris: Enable fast clears on other miplevels and layers than 0.

Until now we only supported fast clear colors on the first miplevel and
layer. The main reason for it is that we can't have different fast clear
values at different levels/layers, since the surface state only supports
one clear value.

We can, however, enable it if we make sure we only use the same value
for all levels/layers, and if one of them changes, we resolve all the
others. We already do that for depth fast clears so hopefully it will be
fine for color fast clears too.

v2: Add check for partial clear too (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Allow resolving clear color of CCS_D surfaces.
Rafael Antognolli [Wed, 3 Jul 2019 16:37:47 +0000 (09:37 -0700)]
iris: Allow resolving clear color of CCS_D surfaces.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Make iris_has_color_unresolved non-static
Kenneth Graunke [Wed, 17 Jul 2019 20:43:04 +0000 (13:43 -0700)]
iris: Make iris_has_color_unresolved non-static

We want to use this in the transfer code and possibly for fast clears.

5 years agobroadcom: Move v3d_get_device_info to common
Andreas Bergmeier [Sun, 14 Jul 2019 20:23:15 +0000 (22:23 +0200)]
broadcom: Move v3d_get_device_info to common

In common we can use implementation for Vulkan.

5 years agonir/large_constants: Use dominance information to find more constants
Caio Marcelo de Oliveira Filho [Wed, 5 Jun 2019 20:08:56 +0000 (13:08 -0700)]
nir/large_constants: Use dominance information to find more constants

Relax the restriction that all the writes need to be in the first
block: now accept variables that have all the writes in the same
block, and all the reads are dominated by that block.

This let the pass identify large constants that are local to a helper
function.  The writes will be at the place that the function is
inlined, possibly not in the first block (but still all in the same
block).

Results for vkpipeline-db in SKL:

total instructions in shared programs: 3624891 -> 3623145 (-0.05%)
instructions in affected programs: 79416 -> 77670 (-2.20%)
helped: 16
HURT: 0

total cycles in shared programs: 1458149667 -> 1458147273 (<.01%)
cycles in affected programs: 30154164 -> 30151770 (<.01%)
helped: 14
HURT: 2

total loops in shared programs: 2437 -> 2437 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 8813 -> 8745 (-0.77%)
spills in affected programs: 2894 -> 2826 (-2.35%)
helped: 8
HURT: 0

total fills in shared programs: 23470 -> 23392 (-0.33%)
fills in affected programs: 12248 -> 12170 (-0.64%)
helped: 6
HURT: 2

LOST:   0
GAINED: 0

Results for shader-db in SKL with Iris:

total instructions in shared programs: 15379442 -> 15379392 (<.01%)
instructions in affected programs: 837 -> 787 (-5.97%)
helped: 2
HURT: 2
helped stats (abs) min: 27 max: 27 x̄: 27.00 x̃: 27
helped stats (rel) min: 10.47% max: 10.67% x̄: 10.57% x̃: 10.57%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 1.23% max: 1.23% x̄: 1.23% x̃: 1.23%
95% mean confidence interval for instructions value: -39.14 14.14
95% mean confidence interval for instructions %-change: -15.51% 6.17%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4880 -> 4880 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 370677237 -> 370676567 (<.01%)
cycles in affected programs: 17852 -> 17182 (-3.75%)
helped: 2
HURT: 1
helped stats (abs) min: 338 max: 356 x̄: 347.00 x̃: 347
helped stats (rel) min: 13.98% max: 14.64% x̄: 14.31% x̃: 14.31%
HURT stats (abs)   min: 24 max: 24 x̄: 24.00 x̃: 24
HURT stats (rel)   min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18%

total spills in shared programs: 11772 -> 11772 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 24948 -> 24948 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   0
GAINED: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/fs: Use a strided MOV instead of a conversion for load_* destinations
Jason Ekstrand [Sat, 13 Jul 2019 23:35:20 +0000 (18:35 -0500)]
intel/fs: Use a strided MOV instead of a conversion for load_* destinations

In many cases, the compiler can just copy-prop the strided MOV whereas
the conversion is a bit trickier.  This cuts 5% of the instructions off
of one particular Vulkan CTS test which does lots of load_ssbo.

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir/algebraic: Optimize comparisons and up-casts
Jason Ekstrand [Sat, 13 Jul 2019 04:26:59 +0000 (23:26 -0500)]
nir/algebraic: Optimize comparisons and up-casts

These seem like obvious enough optimizations in the world of multiple
integer bit sizes.  The only known thing which hits these at the moment
is some Vulkan CTS tests for 16-bit SSBO values which like to up-cast
and check for equality.  However, it's something that's bound to come up
as we start seeing more integers in shaders.

The optimizations of comparisons of casted values with constants are
something which we would ideally do with range analysis.  However,
lacking that, we can do it in opt_algebraic as long as one side is a
constant.

In dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13, this commit, along
with the previous commit, reduce the number of instructions emitted on
Skylake from 55328 to 44546, a reduction of 20%.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/algebraic: Optimize comparing unpacked values
Jason Ekstrand [Sat, 13 Jul 2019 04:26:48 +0000 (23:26 -0500)]
nir/algebraic: Optimize comparing unpacked values

We could, in theory, add the same optimization for 64-bit unpack
operations but that's likely to fight with 64-bit integer lowering on
platforms which require it so it will require more infrastructure before
that will be a good idea.

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir/algebraic: Print out the list of transforms in the C file
Jason Ekstrand [Sat, 13 Jul 2019 15:57:24 +0000 (10:57 -0500)]
nir/algebraic: Print out the list of transforms in the C file

This helps greatly when debugging algebraic transform generators because
you can now actually see the output and verify that your transforms are
getting generated.

Acked-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs: Properly stride NULL replacement regs in DCE
Jason Ekstrand [Sat, 13 Jul 2019 17:06:00 +0000 (12:06 -0500)]
intel/fs: Properly stride NULL replacement regs in DCE

This fixes some validation errors generated by certain D->W conversions
but is likely not a full solution.  Calculating an actual register
stride is a far more complex problem in general and should probably be
handled by the brw_fs_generator.

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: Fix nir_lower_alu_to_scalar's instr filtering.
Eric Anholt [Tue, 16 Jul 2019 19:38:10 +0000 (12:38 -0700)]
nir: Fix nir_lower_alu_to_scalar's instr filtering.

It was checking if the dest or src[0] SSA values were vectors, rather than
whether the ALU op was using the source as a vector resulting in a
nir_fdot4 making it through to vc4 and v3d:

vec1 32 ssa_6 = fdot4 ssa_4.xxxx, ssa_5

Fixes: c1cffa4249ca ("nir/alu_to_scalar: Use the new NIR lowering framework")
v2: Use Jason's recommendation to look at input_sizes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agopanfrost: Merge varyings_mem into transient buffers
Alyssa Rosenzweig [Tue, 16 Jul 2019 18:36:11 +0000 (11:36 -0700)]
panfrost: Merge varyings_mem into transient buffers

Theoretically we would like these split since varyings can have
specially optimized flags (no map, coherent local). For now, since
neither of these flags is particularly meaningful right now, merge them
together instead of special casing varyings_mem.

Saves upwards of 64MB of RAM per context.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agovulkan/wsi: update swapchain status on vkQueuePresent
Lionel Landwerlin [Mon, 15 Jul 2019 06:01:20 +0000 (09:01 +0300)]
vulkan/wsi: update swapchain status on vkQueuePresent

With the following chain of events :

   vkQueuePresent()
   <- Surface resize
   vkQueuePresent()

We should be able to report SUBOPTIMAL or OUT_OF_DATE on the second
vkQueuePresent() call. Currently we only look at X11 events in the
vkAcquireNextImage() path so we're not able to report this.

This change checks the queue of events and process any available ones
to update the swapchain status.

v2: Be consistent about reporting the current error state of the
    swapchain (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111097
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: add an option for disabling NGG on GFX10
Samuel Pitoiset [Wed, 17 Jul 2019 12:19:10 +0000 (14:19 +0200)]
radv: add an option for disabling NGG on GFX10

Will be useful for testing the legacy path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agosoftpipe: pass stream-out targets to draw-module early
Erik Faye-Lund [Tue, 7 May 2019 10:13:43 +0000 (12:13 +0200)]
softpipe: pass stream-out targets to draw-module early

This is essensially a port of ed53e61bec9 from LLVMpipe to softpipe,
as it makes things a bit simpler and more performant.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
5 years agospirv_extensions: i965: initialize SPIR-V extensions
Alejandro Piñeiro [Sat, 14 Oct 2017 08:52:51 +0000 (10:52 +0200)]
spirv_extensions: i965: initialize SPIR-V extensions

v2: Rebase update after changes on previous patches.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agospirv_extensions: add spirv_supported_extensions on gl_constants
Alejandro Piñeiro [Fri, 13 Oct 2017 18:43:01 +0000 (20:43 +0200)]
spirv_extensions: add spirv_supported_extensions on gl_constants

We can use it to get real values for ARB_spirv_extensions methods.

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agospirv_extensions: define spirv_extensions_supported
Alejandro Piñeiro [Wed, 19 Jun 2019 18:58:30 +0000 (13:58 -0500)]
spirv_extensions: define spirv_extensions_supported

Add a struct to maintain which SPIR-V extensions are supported, and an
utility method to initialize it based on
nir_spirv_supported_capabilities.

v2:
  * Fixing code style (Ian Romanick)
  * Adding a prefix (spirv) to fill_supported_spirv_extensions (Ian Romanick)

v3: rebase update (nir_spirv_supported_extensions renamed)

v4: include AMD_gcn_shader support

v5: move spirv_fill_supported_spirv_extensions to
    src/mesa/main/spirv_extensions.c

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agospirv_extensions: add list of extensions and to_string method
Alejandro Piñeiro [Fri, 13 Oct 2017 14:17:51 +0000 (16:17 +0200)]
spirv_extensions: add list of extensions and to_string method

Ideally this should be generated somehow. One option would be gather
all the extension dependencies listed on the core grammar, but there
would be the possibility of not including some of the extensions.

Note that spirv-tools is doing it just slightly better, as it has a
hardcoded list of extensions manually took from the registry, that
they parse to get the enum and the to_string method (see
generate_grammar_tables.py).

v2:
  * Use a macro to improve readability. (Tapani Pälli)
  * Add unreachable on the switch, no default (Eric Engestrom)
  * No typedef enum (Ian Romanick)
  * Sort extensions names (Ian Romanick)
  * Don't add extensions unlikely to be supported by Mesa at any point
    (Ian Romanick)

v3: rebase update

v4: Include AMD_gcn_shader

v5: move spirv_extensions_to_string to src/mesa/main/spirv_extensions.c

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agospirv_extensions: add GL_ARB_spirv_extensions boilerplate
Alejandro Piñeiro [Tue, 10 Oct 2017 14:09:25 +0000 (16:09 +0200)]
spirv_extensions: add GL_ARB_spirv_extensions boilerplate

v2:
  * Mention extension gap at gl_API.xml (Emil Velikov)
  * Bail with INVALID_ENUM if extension not available on getStringi (Emil Velikov)
  * Use EXTRA_EXT macro when defining the extension at
    get.c/get_hash_params.py (Emil Velikov)
  * Rename source files (spirvextensions.[ch] -> spirv_extensions.[ch]) (Ian)

v3:
  * Fix GL_PROGRAM_BINARY_FORMATS glGet query, broken by error on a
    previous rebase

v4:
   * Fix rebase conflicts on getstring.c after
     GL_SHADING_LANGUAGE_VERSION query was added

v5:
   * Remove src/mapi/glapi/gen/Makefile.am as it no longer exists in
     master

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoradv/gfx10: implement VK_EXT_post_depth_coverage
Samuel Pitoiset [Tue, 16 Jul 2019 15:11:50 +0000 (17:11 +0200)]
radv/gfx10: implement VK_EXT_post_depth_coverage

I did implement this extension a while ago but it didn't work
on pre GFX10 for some reasons. Now all CTS pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: disable the TC compat zrange workaround
Samuel Pitoiset [Tue, 16 Jul 2019 15:35:00 +0000 (17:35 +0200)]
radv/gfx10: disable the TC compat zrange workaround

Unnecessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: fallback to the legacy path if tess and extreme geometry
Samuel Pitoiset [Tue, 16 Jul 2019 14:39:16 +0000 (16:39 +0200)]
radv/gfx10: fallback to the legacy path if tess and extreme geometry

This is unsupported and hangs.

This fixes GPU hangs with
dEQP-VK.tessellation.geometry_interaction.limits.output_required_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: always build the GS copy shader but uses it on-demand
Samuel Pitoiset [Tue, 16 Jul 2019 14:39:15 +0000 (16:39 +0200)]
radv/gfx10: always build the GS copy shader but uses it on-demand

It should be possible to build it on-demand too but it requires
more work. On GFX10, the GS copy shader is required when tess
is enabled with extreme geometry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agosoftpipe: Remove unused static function
Gert Wollny [Thu, 2 May 2019 13:54:55 +0000 (15:54 +0200)]
softpipe: Remove unused static function

Thanks to Eric Engestrom for pointing out that there was something wrong
with that function.

Fixes: 724a73509e1bc1ce3abf9500e457bb2911b642db
  softpipe: Prepare handling explicit gradients

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agospirv: Bail when we see CounterBuffer decoration
Caio Marcelo de Oliveira Filho [Wed, 10 Jul 2019 16:00:47 +0000 (09:00 -0700)]
spirv: Bail when we see CounterBuffer decoration

This decoration can be ignored, so we can just skip the next steps.
Otherwise we'd have to also handle it in apply_var_decoration.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>