platform/upstream/mesa.git
Rafael Antognolli [Mon, 5 Mar 2018 16:52:35 +0000 (08:52 -0800)]
intel/blorp: Update clear color state buffer during fast clears.

We always want to update the fast clear color during a fast clear on
i965. On anv, we are doing that before a resolve, but by adding support
to blorp, we can do a similar thing and update it during a fast clear
instead.

The goal is to remove some code from anv that does such update, and
centralize everything in blorp, hopefully removing a lot of code
duplication. It also allows us to have a similar behavior on gen < 9 and
gen >= 10.

v5: s/we/we are/ (Jordan)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Wed, 7 Mar 2018 18:49:03 +0000 (10:49 -0800)]
intel/blorp: Only copy clear color when doing a resolve.

We only need to copy the clear color from the state buffer to the
inlined surface state when doing a resolve.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 7 Dec 2017 16:47:38 +0000 (08:47 -0800)]
intel/blorp: Add support for fast clear address.

On gen10+, if surface->clear_color_addr is present, use it directly
instead of copying it to the surface state.

v4: Remove redundant #if clause for GEN <= 10 (Jason)
v5: Move flush after the reloc, and keep lower bits (Topi).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 10 Aug 2017 16:29:51 +0000 (09:29 -0700)]
intel/isl: Add support to emit clear value address.

gen10 can emit the clear color by setting it on a buffer somewhere, and
then adding only the address to the surface state.

This commit adds support for that in isl_surf_fill_state, and if that is
requested, skips setting the clear value itself.

v2: Add assert to make sure we are at least on gen10.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Tue, 27 Mar 2018 22:51:21 +0000 (15:51 -0700)]
intel: Use Clear Color struct size.

The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.

v4:
 - Add struct to gen11 too (Jason, Jordan)
 - Add field for Converted Clear Color to gen11 (Jason)
 - Add clear_color_state_offset to differentiate from
   clear_value_offset.
 - Fix all the places where clear_value_size was used.

v5 (Jason):
 - Split genxml changes to another commit.
 - Remove unnecessary gen checks.
 - Bring back missing offset increment to init_fast_clear_color().

v6 (Jason):
 - On init_fast_clear_color, change:
   addr.offset += 4 => sdi.Address.offset += i * 4
 - Use GEN_GEN instead of GEN_VERSIONx10.

[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Tue, 27 Mar 2018 22:48:44 +0000 (15:48 -0700)]
intel/genxml: Add Clear Color struct to gen10+.

v5: Split genxml changes into its own commit (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Mon, 7 Aug 2017 19:14:04 +0000 (12:14 -0700)]
intel/genxml: Use a single field for clear color address on gen10.

genxml does not support having two address fields with different names
but same position in the state struct. Both "Clear Color Address"
and "Clear Depth Address Low" mean the same thing, only for different
surface types.

To work around this genxml limitation, rename "Clear Color Address"
to "Clear Value Address" and use it for both color and depth. Do the
same for the high bits.

TODO: add support for multiple addresses at the same position in the
xml.

v2: Combine high and low order bits into a single address field.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 18 Jan 2018 00:19:41 +0000 (16:19 -0800)]
genxml: Preserve fields that share dword space with addresses.

Some instructions contain fields that are either an address or a value
of some type based on the content of other fields, such as clear color
values vs address. That works fine if these fields are in the less
significant dword, the lower 32 bits of the address, because they get
OR'ed with the address. But if they are in the higher 32 bits, they get
discarded.

On Gen10 we have fields that share space with the higher 16 bits of the
address too. This commit makes sure those fields don't get discarded.
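
A minimal sketch of the packing problem described above, in C. The helper
name pack_address_qword and the field layout (extra fields in the top 16
bits of a 48-bit address) are illustrative assumptions, not the generated
genxml code:

```c
#include <stdint.h>

/* Hypothetical helper: combine a 48-bit GPU address with fields that
 * share the same 64-bit slot. `lo_fields` occupies low bits of the
 * lower dword; `hi_fields` occupies bits 63:48 of the qword. */
static uint64_t
pack_address_qword(uint64_t address, uint32_t lo_fields, uint16_t hi_fields)
{
   /* Keep only the 48 address bits so stray high bits cannot clobber
    * the shared fields. */
   uint64_t qw = address & 0x0000ffffffffffffull;

   qw |= lo_fields;                    /* OR'ed into the lower dword   */
   qw |= (uint64_t)hi_fields << 48;    /* preserved in the upper dword */
   return qw;
}
```

Dropping the last OR reproduces the behavior the message describes: the
low-dword fields survive while the high-dword fields are lost.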

v5: Remove spurious whitespace (Jason).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Thu, 18 Jan 2018 22:12:08 +0000 (14:12 -0800)]
anv/image: Do not override lower bits of dword.

The lower bits seem to have extra fields in every platform but gen8
(even though we don't use them in gen9). So just go ahead and avoid
using them for the address.

v4: Use Jason's suggestion for comment explaining the change.
v5: Fix aux_address comment in anv_private.h (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Samuel Pitoiset [Wed, 4 Apr 2018 10:12:04 +0000 (12:12 +0200)]
radv: implement a fast prefetch path for the vertex stage

This allows draws to start as soon as possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Samuel Pitoiset [Wed, 4 Apr 2018 10:12:03 +0000 (12:12 +0200)]
radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Samuel Pitoiset [Wed, 4 Apr 2018 10:12:02 +0000 (12:12 +0200)]
radv: use a mask for VBOs and shaders prefetching

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Wed, 4 Apr 2018 20:11:03 +0000 (16:11 -0400)]
gallium/pp: fix MLAA shaders

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549

Marek Olšák [Wed, 4 Apr 2018 20:04:30 +0000 (16:04 -0400)]
gallium/pp: use user constant buffers

This fixes a radeonsi crash.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026

Marek Olšák [Mon, 2 Apr 2018 21:58:30 +0000 (17:58 -0400)]
st/mesa: set stencil border color the same as intensity

This fixes some stencil border color tests on Vega and Raven chips.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jon Turney [Tue, 3 Apr 2018 16:52:56 +0000 (17:52 +0100)]
Fix use of alloca() without #include <c99_alloca.h>

Fix use of alloca() without #include <c99_alloca.h> in 1da345e5

vbo/vbo_context.c: In function '_vbo_draw_indirect':
vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration]
       struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim));
                                  ^~~~~~
vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion]

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Samuel Pitoiset [Wed, 28 Mar 2018 17:03:00 +0000 (19:03 +0200)]
radv: implement out-of-order rasterization when it's safe on VI+

Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.

No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.

Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.

Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 08:54:29 +0000 (10:54 +0200)]
radv: change blend_enable field to use four bits per CB

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 08:49:33 +0000 (10:49 +0200)]
radv: scan which color blend attachments are enabled

With cb_target_enabled_4bit in order to have four bits per CB.
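
One way such a four-bits-per-CB mask could be built, as a sketch in C. The
helper name and the one-bit-per-attachment input are assumptions for
illustration, not the radv code:

```c
#include <stdint.h>

/* Hypothetical helper: expand a one-bit-per-attachment mask into a
 * four-bits-per-attachment mask (up to 8 color buffers). */
static uint32_t
expand_cb_mask_4bit(uint8_t enabled_mask)
{
   uint32_t mask_4bit = 0;

   for (unsigned i = 0; i < 8; i++) {
      if (enabled_mask & (1u << i))
         mask_4bit |= 0xfu << (4 * i);   /* one nibble per CB */
   }
   return mask_4bit;
}
```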

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 08:01:41 +0000 (10:01 +0200)]
radv: put more fields in radv_blend_state

Some will be used for further optimizations (i.e. out-of-order rast).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 12:51:20 +0000 (14:51 +0200)]
radv: do not always disable dual quad mode when chip has RbPlus

For GFX9+ only, RadeonSI does this too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 4 Apr 2018 08:55:43 +0000 (10:55 +0200)]
radv: don't use the SPI barrier management bug workaround

Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 4 Apr 2018 08:55:42 +0000 (10:55 +0200)]
radv: mask out high VM address bits in registers where needed

Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Lionel Landwerlin [Tue, 3 Apr 2018 13:41:18 +0000 (14:41 +0100)]
intel: compiler: silence compiler warning

../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]

Introduced by 8f83eea71e233 ("i965: Add negative_equals methods").

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Iago Toral Quiroga [Mon, 2 Apr 2018 09:39:41 +0000 (11:39 +0200)]
compiler/spirv: set is_shadow for depth comparitor sampling opcodes

From the SPIR-V spec, OpTypeImage:

"Depth is whether or not this image is a depth image. (Note that
 whether or not depth comparisons are actually done is a property of
 the sampling opcode, not of this type declaration.)"

The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the depth property of the
image until now).

v2:
 - Do the same for OpImageDrefGather.
 - Set is_shadow to false if the sampling opcode is not one of these (Jason)
 - Reuse an existing switch statement instead of adding a new one (Jason)

Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Sergii Romantsov [Mon, 2 Apr 2018 06:59:06 +0000 (09:59 +0300)]
i965: Extend the negative 32-bit deltas to 64-bits

Gen8+ uses 48-bit address relocations, so the sign needs to be extended
to the 64-bit return value. Without that, the higher bits are zeroed
and negative values are lost.
Haswell and older use 32-bit deltas so are unaffected by this issue.
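
A minimal sketch of the difference, in C. The function names are
hypothetical; only the int32_t parameter type reflects what the message
describes:

```c
#include <stdint.h>

/* Apply a signed 32-bit delta to a 64-bit target address. Taking the
 * delta as int32_t means it is sign-extended when widened. */
static uint64_t
reloc_address(uint64_t target_address, int32_t delta)
{
   return target_address + (int64_t)delta;
}

/* Buggy variant for comparison: the delta is zero-extended, so a
 * negative delta turns into a large positive offset. */
static uint64_t
reloc_address_zero_extended(uint64_t target_address, uint32_t delta)
{
   return target_address + delta;
}
```

With a target of 0x100000000 and a delta of -16, the first function yields
0xfffffff0 while the zero-extended variant yields 0x1fffffff0.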

v2:
  used int32_t function parameter instead of explicit type conversion.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
Tested-by: Stuart Young <cefiar@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Fri, 23 Mar 2018 18:05:04 +0000 (11:05 -0700)]
nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination

Otherwise we may end up trying to coalesce in a case such as

ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)

and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA.  We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source.  However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.
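
The reordering hazard can be reproduced with a toy model in C. This is a
simplified illustration, not NIR: registers are plain float arrays, and the
vec is assumed to copy ssa_1 component by component:

```c
/* Toy model: a register is four floats. */
typedef struct { float c[4]; } vec4_reg;

/* Original order: the fadd result lives in an SSA temporary, the fneg
 * writes r3.x, then the vec overwrites all of r3. */
static vec4_reg
run_uncoalesced(vec4_reg r1, vec4_reg r2)
{
   vec4_reg ssa_1, r3 = {{0}};

   for (int i = 0; i < 4; i++)
      ssa_1.c[i] = r1.c[i] + r2.c[i];   /* ssa_1 = fadd r1, r2    */
   r3.c[0] = -r2.c[0];                  /* r3.x = fneg(r2)        */
   r3 = ssa_1;                          /* r3 = vec4(ssa_1, ...)  */
   return r3;
}

/* After the bad coalesce: the fadd writes r3 directly, so the fneg
 * that used to come first now clobbers r3.x afterwards. */
static vec4_reg
run_coalesced(vec4_reg r1, vec4_reg r2)
{
   vec4_reg r3 = {{0}};

   for (int i = 0; i < 4; i++)
      r3.c[i] = r1.c[i] + r2.c[i];      /* fadd now targets r3    */
   r3.c[0] = -r2.c[0];                  /* fneg overwrites r3.x   */
   return r3;
}
```

The two functions disagree on r3.x, which is the miscompile the SSA-only
restriction avoids.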

Shader-db results on Haswell:

    total instructions in shared programs: 13657906 -> 13659101 (<.01%)
    instructions in affected programs: 149291 -> 150486 (0.80%)
    helped: 0
    HURT: 592

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kevin Strasser [Tue, 3 Apr 2018 21:21:34 +0000 (14:21 -0700)]
anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL

If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().
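
The ordering issue can be shown with plain POSIX calls. In this sketch
fcntl(F_GETFD) stands in for DRM_IOCTL_PRIME_FD_TO_HANDLE, and /dev/null
stands in for the dma-buf; both are assumptions made so the example runs
without a GPU:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if `fd` is still usable, or the errno value (e.g. EBADF)
 * if the descriptor has already been closed. */
static int
probe_fd(int fd)
{
   return fcntl(fd, F_GETFD) == -1 ? errno : 0;
}

/* Buggy ordering: close first, then try to use the descriptor. */
static int
use_after_close(void)
{
   int fd = open("/dev/null", O_RDONLY);
   if (fd < 0)
      return -1;
   close(fd);
   return probe_fd(fd);
}

/* Fixed ordering: use the descriptor, then close it last. */
static int
use_then_close(void)
{
   int fd = open("/dev/null", O_RDONLY);
   if (fd < 0)
      return -1;
   int result = probe_fd(fd);
   close(fd);
   return result;
}
```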

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Tue, 3 Apr 2018 01:38:13 +0000 (11:38 +1000)]
glsl: always call do_lower_jumps() after loop unrolling

This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but it's not the last instruction in the block.

LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!

Fixes: 96fe8834f539 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"

Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317

James Legg [Fri, 30 Mar 2018 15:45:01 +0000 (16:45 +0100)]
vulkan/wsi/wayland: fix leaks

Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
Juan A. Suarez Romero [Tue, 3 Apr 2018 17:38:36 +0000 (17:38 +0000)]
docs: update calendar, add news and link release notes to 17.3.8

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Tue, 3 Apr 2018 17:33:23 +0000 (17:33 +0000)]
docs: add sha256 checksums for 17.3.8

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ba371c7262a484391cace9d5e17635ed14c58692)

Juan A. Suarez Romero [Tue, 3 Apr 2018 16:39:48 +0000 (16:39 +0000)]
docs: add release notes for 17.3.8

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3bf5c10c5c0e9fac6eb0b2c201bcf44755ecfaec)

Jakob Bornecrantz [Tue, 3 Apr 2018 15:58:10 +0000 (16:58 +0100)]
st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB.

When running virgl on a GLES host the only sRGB formats that support
rendering are RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
Lionel Landwerlin [Tue, 3 Apr 2018 10:45:24 +0000 (11:45 +0100)]
intel: gen-decoder: print all dword a field belongs to

Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 3 Apr 2018 10:21:31 +0000 (11:21 +0100)]
intel: genxml: decode variable length MI_LRI

MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.

Instead of making all the tuples part of a group, we leave out the
first tuple (the one we use in the generated packing structures).

This is particularly useful for looking at error stats generated by
the kernel.
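
A sketch of how the number of (register, value) tuples could be derived
from the command header, in C. It assumes the usual MI command convention
that bits 7:0 hold DWord Length and that the total length is DWord Length
+ 2; the helper names and the sample header values are hypothetical:

```c
#include <stdint.h>

/* Total command length in dwords: one header dword followed by
 * (register, value) pairs. Assumes length = DWord Length + 2. */
static unsigned
mi_lri_total_dwords(uint32_t header)
{
   return (header & 0xff) + 2;
}

/* Number of (register, value) tuples carried by the command. */
static unsigned
mi_lri_tuple_count(uint32_t header)
{
   return (mi_lri_total_dwords(header) - 1) / 2;
}
```

Under that assumption a header with DWord Length 1 carries one tuple and a
header with DWord Length 5 carries three.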

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 3 Apr 2018 10:01:56 +0000 (11:01 +0100)]
intel: gen-decoder: don't decode fields beyond a dword length

For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END

Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :

0xffffe374:  0x7a000002:  PIPE_CONTROL
0xffffe374:  0x7a000002 : Dword 0
    DWord Length: 2
0xffffe378:  0x00800000 : Dword 1
    Depth Cache Flush Enable: false
    Stall At Pixel Scoreboard: false
    State Cache Invalidation Enable: false
    Constant Cache Invalidation Enable: false
    VF Cache Invalidation Enable: false
    DC Flush Enable: false
    Pipe Control Flush Enable: false
    Notify Enable: false
    Indirect State Pointers Disable: false
    Texture Cache Invalidation Enable: false
    Instruction Cache Invalidate Enable: false
    Render Target Cache Flush Enable: false
    Depth Stall Enable: false
    Post Sync Operation: 0 (No Write)
    Generic Media State Clear: false
    TLB Invalidate: false
    Global Snapshot Count Reset: false
    Command Streamer Stall Enable: false
    Store Data Index: 0
    LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
    Destination Address Type: 0 (PPGTT)
    Flush LLC: false
0xffffe37c:  0x00000000 : Dword 2
    Address: 0x00000000
0xffffe380:  0x00000000 : Dword 3
0xffffe384:  0x05000000 : Dword 4
    Immediate Data: 83886080
0xffffe384:  0x05000000:  MI_BATCH_BUFFER_END
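
The bound check can be sketched in C as follows. The helper names are
hypothetical, and the "DWord Length + 2" length rule is an assumption
chosen because it matches the dump above (DWord Length 2, next command
four dwords later):

```c
#include <stdbool.h>
#include <stdint.h>

/* Total command length in dwords, assuming length = DWord Length + 2. */
static unsigned
command_dwords(uint32_t header)
{
   return (header & 0xff) + 2;
}

/* A field starting at dword index `field_dword` should only be
 * decoded when that dword lies inside the command. */
static bool
field_is_decodable(uint32_t header, unsigned field_dword)
{
   return field_dword < command_dwords(header);
}
```

For the 0x7a000002 header this accepts dwords 0 to 3 and rejects dword 4,
the one that was wrongly decoded as "Immediate Data".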

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 27 Mar 2018 17:10:45 +0000 (18:10 +0100)]
intel: error_decode: add an option to decode all buffers

The kernel reports workaround batch buffers, but we're not presenting
them currently. Although they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 27 Mar 2018 16:56:44 +0000 (17:56 +0100)]
intel: genxml: add preemption control instructions

Helpful to debug kernel workaround batchbuffers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Dylan Baker [Mon, 2 Apr 2018 22:29:45 +0000 (15:29 -0700)]
mesa: ensure that variable is initialized

This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.
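
A minimal sketch of the pattern, in C. The struct and function are
hypothetical stand-ins, and the check that rejects programs mixing GLSL and
SPIR-V shaders is omitted:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shader object. */
struct shader { bool is_spirv; };

/* Returns true when the program should take the SPIR-V link path.
 * `spirv` is initialized up front so that a program with no attached
 * shaders deterministically falls back to the GLSL path. */
static bool
use_spirv_path(const struct shader *shaders, size_t count)
{
   bool spirv = false;

   for (size_t i = 0; i < count; i++)
      spirv = shaders[i].is_spirv;
   return spirv;
}
```

Without the initializer, calling this with count == 0 would read an
indeterminate value.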

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7fb5ada308e55b852cd49251e7f3f8b1
       ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Marek Olšák [Mon, 2 Apr 2018 19:06:42 +0000 (15:06 -0400)]
radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Samuel Pitoiset [Mon, 2 Apr 2018 16:17:55 +0000 (18:17 +0200)]
radv: enable VK_EXT_shader_viewport_index_layer

The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rob Clark [Wed, 28 Mar 2018 12:32:10 +0000 (08:32 -0400)]
nir+drivers: add helpers to get # of src/dest components

Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Rob Clark [Mon, 2 Apr 2018 14:47:23 +0000 (10:47 -0400)]
freedreno/ir3: fix fallout of unused false-depth elimination

Since we were using the MARK flag both for preventing loops and for
tracking whether instructions were used, we could end up in an infinite
loop due to bd2ca2bcdd.  Instead, invert the logic: mark all instructions
UNUSED up front and clear the flag as we visit them.

Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <robdclark@gmail.com>
Timothy Arceri [Sat, 31 Mar 2018 23:32:28 +0000 (09:32 +1000)]
gallium/pipebuffer: fix parenthesis location

Without this the return value will never get set to -1. This
was first added in 49866c8f3457 and copied in 2b396eeed983.
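
The message does not quote the affected expression, so the following C
sketch is a made-up illustration of this class of bug, with hypothetical
names, and not the pb_cache code:

```c
#include <stdbool.h>

/* Hypothetical predicate standing in for a "can this buffer be
 * reclaimed?" check. */
static bool
can_reclaim(int busy)
{
   return !busy;
}

/* Misplaced parenthesis: the ternary is evaluated first and its
 * result (1 or -1, both nonzero) is passed as the argument, so the
 * function can never return -1. */
static int
is_compat_buggy(int busy)
{
   return can_reclaim(busy ? 1 : -1);
}

/* Intended form: call the predicate, then map its result to 1 or -1. */
static int
is_compat_fixed(int busy)
{
   return can_reclaim(busy) ? 1 : -1;
}
```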

Fixes: 2b396eeed983 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342

Tapani Pälli [Tue, 3 Apr 2018 05:43:18 +0000 (08:43 +0300)]
Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels"

This reverts commit 41cf30b8bc55fdf36adac3311002dc32b6715949.

Commit caused regressions with KHR-GLES3.packed_pixels.* tests.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
Mike Lothian [Sun, 1 Apr 2018 00:32:22 +0000 (01:32 +0100)]
gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass

Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Mike Lothian [Sun, 1 Apr 2018 00:32:21 +0000 (01:32 +0100)]
radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass

Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Mike Lothian [Sun, 1 Apr 2018 00:32:20 +0000 (01:32 +0100)]
ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass

Include llvm-c/Transforms/Utils.h with the newest LLVM 7

Signed-of-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Daniel Stone [Mon, 2 Apr 2018 12:20:34 +0000 (13:20 +0100)]
st/dri: Initialise modifier to INVALID for DRI2

When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.

This resulted in VC4 DRI2 clients failing with:
  Modifier 0x0 vs. tiling (0x700000000000001) mismatch
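
Why a missed initialisation means "linear" can be seen in a small C
sketch. The two modifier values mirror DRM_FORMAT_MOD_LINEAR and
DRM_FORMAT_MOD_INVALID from drm_fourcc.h and are repeated locally so the
example is self-contained; the struct and function are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the drm_fourcc.h values. */
#define MOD_LINEAR  0x0ull
#define MOD_INVALID 0x00ffffffffffffffull

/* Hypothetical, simplified handle carrying only the modifier. */
struct handle { uint64_t modifier; };

/* True when no modifier was supplied and the backend is free to pick
 * its own layout. */
static bool
backend_may_choose_layout(const struct handle *h)
{
   return h->modifier == MOD_INVALID;
}
```

A zero-initialised handle compares equal to MOD_LINEAR, so the backend is
told to use linear instead of being allowed to choose.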

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Andreas Müller <schnitzeltony@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3f8513172ff6 ("gallium/winsys/drm: introduce modifier field to winsys_handle")

Marek Olšák [Sun, 7 Jan 2018 20:05:52 +0000 (21:05 +0100)]
radeonsi: implement GL_KHR_blend_equation_advanced

MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Fri, 23 Mar 2018 03:40:55 +0000 (23:40 -0400)]
radeonsi: rename unpack_param -> si_unpack_param

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 20 Mar 2018 23:14:57 +0000 (19:14 -0400)]
radeonsi: move FMASK shader logic to shared code

We'll need it for FBFETCH in both TGSI and NIR paths.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 20 Mar 2018 21:32:56 +0000 (17:32 -0400)]
radeonsi: add R600_DEBUG=nofmask to disable MSAA compression

For testing.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 20 Mar 2018 20:45:03 +0000 (16:45 -0400)]
gallium/u_tests: test FBFETCH and shader-based blending with MSAA

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Thu, 22 Mar 2018 00:02:47 +0000 (20:02 -0400)]
ac/gpu_info: print GB_ADDR_CONFIG

Marek Olšák [Mon, 19 Mar 2018 23:12:15 +0000 (19:12 -0400)]
ac/gpu_info: reorder the fields and print them nicely

Marek Olšák [Mon, 19 Mar 2018 22:42:32 +0000 (18:42 -0400)]
ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory

Marek Olšák [Mon, 19 Mar 2018 22:36:35 +0000 (18:36 -0400)]
ac/gpu_info: don't print irrelevant fields

Marek Olšák [Mon, 19 Mar 2018 20:51:21 +0000 (16:51 -0400)]
st/mesa: don't draw if the bound element array buffer is not allocated

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Iago Toral Quiroga [Wed, 28 Feb 2018 08:44:18 +0000 (09:44 +0100)]
anv/cmd_buffer: honor pending clear views for depth/stencil attachments

v2: rebased on top of subpass rework.

v3: rebased

v4:
 - rebased
 - reset pending clear views in one go rather than one bit at a time (Caio)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Iago Toral Quiroga [Wed, 21 Feb 2018 08:47:09 +0000 (09:47 +0100)]
anv/cmd_buffer: consider multiview masks for tracking pending clear aspects

When multiview is active a subpass clear may only clear a subset of the
attachment layers. Other subpasses in the same render pass may clear
as well and we want to honor those clears too. However, we need to
ensure that we only clear a layer once, on the first subpass that uses
a particular layer (view) of a given attachment.

This means that when we check if a subpass attachment needs to be cleared
we need to check if all the layers used by that subpass (as indicated by
its view_mask) have already been cleared in previous subpasses or not, in
which case, we must clear any pending layers used by the subpass, and only
those pending.
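
The bookkeeping described above can be sketched roughly as follows. This is
an illustration only, not the anv code: the struct, its field, and the
function name are hypothetical, and the sketch assumes one bit per view in
both masks.

```c
#include <stdint.h>

/* Hypothetical per-attachment state: one bit per layer (view) that still
 * needs its render-pass clear. */
struct attachment_state {
    uint32_t pending_clear_views;
};

/* Return the views this subpass must clear now and mark them as done.
 * view_mask has one bit set for each view the subpass uses. */
static uint32_t
take_pending_clear_views(struct attachment_state *att, uint32_t view_mask)
{
    uint32_t to_clear = att->pending_clear_views & view_mask;

    /* Reset all of them in one go rather than one bit at a time. */
    att->pending_clear_views &= ~to_clear;
    return to_clear;
}
```

A view that two subpasses share is returned only for the first of them, so
each layer is cleared at most once.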

v2:
  - track pending clear views in the attachment state (Jason)
  - rebased on top of fast-clear rework.

v3:
  - rebased on top of subpass rework.

v4: rebased.

v5 (Caio):
 - Rebased.
 - Initialize pending clear views to only have bits set for layers
   that exist.
 - Reset pending clear views in one go rather than one bit at a time.
 - Put "last subpass for this attachment" condition in a separate
   function to simplify the conditional that resets pending_clear_aspects.

Fixes:
dEQP-VK.multiview.readback_implicit_clear.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradeonsi/nir: fix explicit component packing for geom/tess doubles
Timothy Arceri [Wed, 21 Mar 2018 02:22:52 +0000 (13:22 +1100)]
radeonsi/nir: fix explicit component packing for geom/tess doubles

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi/nir: gather buffers declared more accurately and use const fast path
Timothy Arceri [Mon, 26 Mar 2018 23:39:49 +0000 (10:39 +1100)]
radeonsi/nir: gather buffers declared more accurately and use const fast path

For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip
setting the more accurate masks for builtin uniforms for now as it
causes some piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: create load_const_buffer_desc_fast_path() helper
Timothy Arceri [Mon, 26 Mar 2018 23:26:16 +0000 (10:26 +1100)]
radeonsi: create load_const_buffer_desc_fast_path() helper

This will be shared by the TGSI and NIR backends. For simplicity
we leave the workaround for SI with LLVM 5.0 and lower only in the
TGSI backend.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER
Timothy Arceri [Mon, 26 Feb 2018 09:42:35 +0000 (20:42 +1100)]
radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/glsl_to_nir: gather next_stage in shader_info
Timothy Arceri [Mon, 26 Feb 2018 09:40:38 +0000 (20:40 +1100)]
st/glsl_to_nir: gather next_stage in shader_info

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agofreedreno/a5xx: don't align height for PIPE_BUFFER
Rob Clark [Sun, 1 Apr 2018 15:26:01 +0000 (11:26 -0400)]
freedreno/a5xx: don't align height for PIPE_BUFFER

Buffers can be large, so we probably don't want to make them all 32x
bigger.  But they can't be rendered to (at least in GL) so we don't
need this workaround to prevent page faults on mem<->gmem.
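
A minimal sketch of the idea, assuming the workaround rounds the layout
height up to a multiple of 32. The helper names and the `is_buffer` flag are
hypothetical and are not taken from the freedreno sources.

```c
/* Round v up to the next multiple of a (a must be a power of two). */
static unsigned
align_up(unsigned v, unsigned a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical layout helper: buffers keep their height, everything else
 * gets the 32-row alignment used by the workaround. */
static unsigned
layout_height(unsigned height, int is_buffer)
{
    return is_buffer ? height : align_up(height, 32);
}
```

A buffer has a height of 1, so aligning it would allocate 32 rows, which is
where the "32x bigger" figure comes from.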

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/a5xx: fix page faults on last level
Rob Clark [Sun, 1 Apr 2018 14:32:36 +0000 (10:32 -0400)]
freedreno/a5xx: fix page faults on last level

We could alternatively fall back to using "old style" draws for
mem<->gmem (i.e. what <= a4xx do) when height is not aligned to 32,
but that is somewhat more work (and not really something that could
be applied to stable).

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: fix issue w/ glamor composite shaders
Rob Clark [Sat, 31 Mar 2018 18:36:37 +0000 (14:36 -0400)]
freedreno/ir3: fix issue w/ glamor composite shaders

Fixes an issue that became possible when we started lowering phi webs to
regs (a7ea2b4e), although it was not really seen until we also switched
to using the peephole select pass (ec8bc54a) instead of lowering *all*
if/else to select.

If texture coord (or anything else that uses create_collect() to collect
scalar values in a sequence of scalar registers) was consuming a value
produced on either side of an if/else (ie. a phi lowered to nir reg,
which in ir3 is an "array" of length 1) then register allocation would
happen incorrectly and we'd end up sampling from garbage coordinates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: more half-precision fixes
Rob Clark [Sat, 31 Mar 2018 18:16:26 +0000 (14:16 -0400)]
freedreno/ir3: more half-precision fixes

Some instructions require src/dst to be in full or half precision
register depending on src/dst type.  So do a better job of propagating
register type.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add helper to create immed of specified size
Rob Clark [Sat, 31 Mar 2018 18:14:31 +0000 (14:14 -0400)]
freedreno/ir3: add helper to create immed of specified size

We'll also need to be able to create a half-precision immediate.  So
re-work create_immed().  Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: pass ctx instead of block to create_collect()
Rob Clark [Sat, 31 Mar 2018 18:13:02 +0000 (14:13 -0400)]
freedreno/ir3: pass ctx instead of block to create_collect()

Prep work for following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: eliminate unused false-deps
Rob Clark [Tue, 6 Mar 2018 14:53:45 +0000 (09:53 -0500)]
freedreno/ir3: eliminate unused false-deps

Previously false-dependencies would get flagged as used, even if the
only "use" was a false dep to (for example) prevent a load from being
scheduled after a store.

In addition to being pointless instructions, in some cases they can
cause problems.  For example, ldg (and similar instructions) depend on
an immed arg getting CP'd into the instruction, but this doesn't happen
if an instruction is otherwise unused.  This can lead to undefined
results (overwriting unintended registers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add local_group_size
Rob Clark [Tue, 6 Mar 2018 13:30:41 +0000 (08:30 -0500)]
freedreno/ir3: add local_group_size

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: clear SSA flag when assigning "ARRAY" regs too
Rob Clark [Sat, 31 Mar 2018 17:58:11 +0000 (13:58 -0400)]
freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too

Avoids a misleading "INVALID FLAGS" warning in debug builds.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: print array live ranges
Rob Clark [Sat, 31 Mar 2018 17:56:32 +0000 (13:56 -0400)]
freedreno/ir3: print array live ranges

This is also useful to see if optmsgs are enabled.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Implement DP2 instruction
Wladimir J. van der Laan [Tue, 8 Aug 2017 15:06:21 +0000 (15:06 +0000)]
freedreno: a2xx: Implement DP2 instruction

Use DOT2ADDv instruction with 0.0f constant add.
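
As a scalar model of the approach (a sketch only, assuming DOT2ADDv computes
a two-component dot product plus an addend; the function names are
hypothetical):

```c
/* Assumed model of a DOT2ADD-style operation: a two-component dot product
 * plus an addend. */
static float
dot2add(const float a[2], const float b[2], float c)
{
    return a[0] * b[0] + a[1] * b[1] + c;
}

/* DP2 is then DOT2ADD with the addend fixed to 0.0f. */
static float
dp2(const float a[2], const float b[2])
{
    return dot2add(a, b, 0.0f);
}
```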

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: implement SEQ/SNE instructions
Wladimir J. van der Laan [Tue, 8 Aug 2017 14:16:21 +0000 (14:16 +0000)]
freedreno: a2xx: implement SEQ/SNE instructions

Extend translate_sge_slt to emit these, in analogous fashion
but using CNDEv.
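
A rough scalar model of the idea, assuming that a CNDE-style select returns
its second operand when the first equals zero and its third operand
otherwise. That semantic is an assumption here (the commit message does not
spell it out), and the function names are hypothetical.

```c
/* Assumed model of a CNDE-style select: pick if_zero when x == 0. */
static float
cnde(float x, float if_zero, float otherwise)
{
    return x == 0.0f ? if_zero : otherwise;
}

/* SEQ: 1.0 when a == b, else 0.0, expressed through the difference. */
static float
seq(float a, float b)
{
    return cnde(a - b, 1.0f, 0.0f);
}

/* SNE: the same select with the two results swapped. */
static float
sne(float a, float b)
{
    return cnde(a - b, 0.0f, 1.0f);
}
```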

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Compressed textures support
Wladimir J. van der Laan [Wed, 2 Aug 2017 15:31:51 +0000 (15:31 +0000)]
freedreno: a2xx: Compressed textures support

Add support for:

- PIPE_FORMAT_ETC1_RGB8
- PIPE_FORMAT_DXT1_RGB
- PIPE_FORMAT_DXT1_RGBA
- PIPE_FORMAT_DXT3_RGBA
- PIPE_FORMAT_DXT5_RGBA

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Support TEXTURE_RECT
Wladimir J. van der Laan [Mon, 31 Jul 2017 11:58:29 +0000 (11:58 +0000)]
freedreno: a2xx: Support TEXTURE_RECT

Denormalized texture coordinates are required for text rendering in
GALLIUM_HUD.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Prevent crash in emit_texture if view is not set
Wladimir J. van der Laan [Wed, 23 Aug 2017 12:43:14 +0000 (12:43 +0000)]
freedreno: a2xx: Prevent crash in emit_texture if view is not set

Textures will sometimes be updated when the texture view state is
unset. Without this change, that causes an assertion failure or
segfault.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Fix fd2_tex_swiz
Wladimir J. van der Laan [Fri, 25 Aug 2017 14:25:18 +0000 (14:25 +0000)]
freedreno: a2xx: Fix fd2_tex_swiz

Compose swizzles using util_format_compose_swizzles instead
of the custom code (which somehow had a bug).
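
For illustration, composing two swizzles can be sketched as below. This is a
standalone sketch and not the body of util_format_compose_swizzles; the enum
and function names are hypothetical.

```c
/* Hypothetical swizzle selectors: four channels plus constants 0 and 1. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* Apply `second` on top of `first`, writing the combined swizzle to out. */
static void
compose_swizzles(const unsigned char first[4],
                 const unsigned char second[4],
                 unsigned char out[4])
{
    for (int i = 0; i < 4; i++) {
        /* A channel selector in `second` indexes into `first`;
         * the constants 0 and 1 pass through unchanged. */
        out[i] = second[i] <= SWZ_W ? first[second[i]] : second[i];
    }
}
```

With an alpha-only format swizzle of (0, 0, 0, X) and a view swizzle of
(W, W, W, 1), the composed result is (X, X, X, 1).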

This makes the GL_ALPHA internal format work.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Change use of BLEND_ to BLEND2_
Wladimir J. van der Laan [Thu, 22 Mar 2018 14:47:03 +0000 (14:47 +0000)]
freedreno: a2xx: Change use of BLEND_ to BLEND2_

Change use of BLEND_ to BLEND2_:

    BLEND_* is a3xx_rb_blend_opcode
    BLEND2_* is a2xx_rb_blend_opcode

This makes no effective difference as the used enumerant has the same
value (0), but the other enumerants do not match 1-to-1 so this will
avoid future problems.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: a2xx: Update rnndb header for formats enumeration
Wladimir J. van der Laan [Thu, 22 Mar 2018 14:46:18 +0000 (14:46 +0000)]
freedreno: a2xx: Update rnndb header for formats enumeration

The format enumeration comes from the yamato
register headers that are part of the amd-gpu kernel driver.
(see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43)

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agovbo: Use alloca for _vbo_draw_indirect.
Mathias Fröhlich [Wed, 28 Mar 2018 05:19:02 +0000 (07:19 +0200)]
vbo: Use alloca for _vbo_draw_indirect.

Avoid using malloc in the draw path of mesa.
Since the draw_count is a user api input, fall back to malloc if
the amount of consumed stack space may get too high.
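
The pattern can be sketched as follows. This is a minimal illustration, not
the _vbo_draw_indirect code: the threshold, the struct, and the function name
are hypothetical, and alloca is a non-standard extension (declared in
alloca.h on glibc).

```c
#include <alloca.h>   /* non-standard, available on glibc and similar */
#include <stdlib.h>

#define MAX_STACK_BYTES 1024   /* hypothetical threshold */

struct prim { unsigned start, count; };   /* hypothetical stand-in */

/* Build draw_count primitives and return the sum of their counts,
 * or -1 if the heap allocation fails. */
static long
draw_many(unsigned draw_count)
{
    size_t bytes = (size_t)draw_count * sizeof(struct prim);
    int on_heap = bytes > MAX_STACK_BYTES;
    struct prim *prims;
    long total = 0;

    if (on_heap) {
        /* draw_count comes from the user, so large requests go to malloc. */
        prims = malloc(bytes);
        if (!prims)
            return -1;
    } else {
        /* Small requests stay on the stack; freed when the function returns. */
        prims = alloca(bytes);
    }

    for (unsigned i = 0; i < draw_count; i++) {
        prims[i].start = i;
        prims[i].count = 3;
        total += prims[i].count;
    }

    if (on_heap)
        free(prims);
    return total;
}
```

The allocation has to happen in the function that uses the memory, because
stack space obtained from alloca is released as soon as that function
returns.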

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agovbo: Remove unused includes to vbo_private.h
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
vbo: Remove unused includes to vbo_private.h

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agovbo: Move vbo_split into the tnl module.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
vbo: Move vbo_split into the tnl module.

Move the files, adapt to the naming scheme in tnl, update callers
and build system.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agovbo: Readd the arrays argument to the legacy draw methods.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
vbo: Readd the arrays argument to the legacy draw methods.

The legacy draw paths from before 2012 took a gl_vertex_array array
holding the inputs to be used for the draw. So all draw methods of legacy
drivers and everything that goes through tnl were originally written
for this calling convention. The same goes for tools like t_rebase or
vbo_split*, some of which still have the original calling convention
with a pointer argument that is currently unused.
Back in 2012, patch 50f7e75

mesa: move gl_client_array*[] from vbo_draw_func into gl_context

introduced Array._DrawArrays, which IMO was aiming in a similar
direction to the recently introduced Array._DrawVAO.
Now several tools like t_rebase and vbo_split*, which are mostly used by
tnl based drivers, would need to be converted to use the internal
Array._DrawVAO instead of Array._DrawArrays. The same goes for the driver
backends that use any of these tools.
Alternatively, we can reintroduce the gl_vertex_array array in the call
argument list and finally put these tools into the tnl directory.
So this change reintroduces the gl_vertex_array array for the legacy
draw paths that are still required for the tools t_rebase and vbo_split*.
A followup will also move vbo_split into tnl.

Note that none of the affected drivers use the DriverFlags.NewArray
driver bit. So it should be safe to remove this also for the legacy
draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agovbo: Remove the now unused vbo draw path.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
vbo: Remove the now unused vbo draw path.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agotnl: Push down the gl_vertex_array inputs into tnl drivers.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
tnl: Push down the gl_vertex_array inputs into tnl drivers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agovbo: Remove vbo_indirect_draw_func.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
vbo: Remove vbo_indirect_draw_func.

Remove the vbo_indirect_draw_func vbo callback and make the default
implementation use the driver's main draw callback function directly.
This will be needed with the next changes, when drivers that lack their
own main driver DrawIndirect implementation get moved to the main driver
Draw method.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agoi965: Push down the gl_vertex_array inputs into i965.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
i965: Push down the gl_vertex_array inputs into i965.

Let the i965 backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.
Note that brw_draw_indirect_prims calls brw_draw_prims internally
and gets its update to Array._DrawArray that way.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agogallium: Push down the gl_vertex_array inputs into gallium.
Mathias Fröhlich [Sun, 25 Mar 2018 17:16:54 +0000 (19:16 +0200)]
gallium: Push down the gl_vertex_array inputs into gallium.

Let the gallium backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years agonir/validator: Validate that all used variables exist
Jason Ekstrand [Tue, 20 Mar 2018 23:57:51 +0000 (16:57 -0700)]
nir/validator: Validate that all used variables exist

We were validating this for locals but nothing else.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agointel/vec4: Set channel_sizes for MOV_INDIRECT sources
Jason Ekstrand [Fri, 23 Mar 2018 16:27:55 +0000 (09:27 -0700)]
intel/vec4: Set channel_sizes for MOV_INDIRECT sources

Otherwise, any indirect push constant access results in an assertion
failure when we start digging through the channel_sizes array.  This
fixes dEQP-VK.pipeline.push_constant.graphics_pipeline.dynamic_index_vert
on Haswell.  It should be a harmless no-op for GL since indirect push
constants aren't used there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e69e5c7006d "i965/vec4: load dvec3/4 uniforms first in the..."

6 years agonir/lower_indirect_derefs: Support interp_var_at intrinsics
Jason Ekstrand [Tue, 20 Mar 2018 19:12:12 +0000 (12:12 -0700)]
nir/lower_indirect_derefs: Support interp_var_at intrinsics

This fixes the fs-interpolateAtCentroid-block-array piglit test on i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
6 years agonir/vars_to_ssa: Remove copies from the correct set
Jason Ekstrand [Thu, 15 Mar 2018 23:42:13 +0000 (16:42 -0700)]
nir/vars_to_ssa: Remove copies from the correct set

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
6 years agonir: Return a cursor from nir_instr_remove
Jason Ekstrand [Fri, 16 Mar 2018 16:52:04 +0000 (09:52 -0700)]
nir: Return a cursor from nir_instr_remove

Because nir_instr_remove is an inline wrapper around nir_instr_remove_v,
the compiler should be able to tell that the return value is unused and
not emit the extra code in most cases.
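
The wrapper pattern can be illustrated with a generic doubly linked list.
This is a sketch under stated assumptions, not the NIR implementation: the
node and cursor types and both function names are hypothetical.

```c
#include <stddef.h>

struct node { struct node *prev, *next; };

/* Hypothetical cursor: the position just after `after` (NULL means the
 * start of the list). */
struct cursor { struct node *after; };

/* The plain removal, returning nothing. */
static void
node_remove_v(struct node *n)
{
    if (n->prev)
        n->prev->next = n->next;
    if (n->next)
        n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Inline wrapper that also reports where the removed node used to be.
 * When a caller ignores the result, the compiler can drop the cursor
 * computation after inlining. */
static inline struct cursor
node_remove(struct node *n)
{
    struct cursor c = { n->prev };
    node_remove_v(n);
    return c;
}
```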

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agonir: Add src/dest num_components helpers
Jason Ekstrand [Thu, 15 Mar 2018 04:44:51 +0000 (21:44 -0700)]
nir: Add src/dest num_components helpers

We already have these for bit_size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>