platform/upstream/mesa.git
8 years agonvc0/ir: add emission for OP_SULEA
Samuel Pitoiset [Wed, 27 Apr 2016 17:14:35 +0000 (19:14 +0200)]
nvc0/ir: add emission for OP_SULEA

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv50/ir: fix tex constraints for surface coords on Fermi
Samuel Pitoiset [Wed, 27 Apr 2016 16:27:10 +0000 (18:27 +0200)]
nv50/ir: fix tex constraints for surface coords on Fermi

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv50/ir: use moveSources to condense sources
Ilia Mirkin [Sun, 31 Jan 2016 03:08:06 +0000 (22:08 -0500)]
nv50/ir: use moveSources to condense sources

This makes sure that rIndirectSrc and other things stay updated.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
8 years agonvc0: bind images on fragment and compute shaders for Fermi
Samuel Pitoiset [Wed, 27 Apr 2016 16:25:33 +0000 (18:25 +0200)]
nvc0: bind images on fragment and compute shaders for Fermi

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonvc0/ir: don't check the format for surface stores on Kepler
Samuel Pitoiset [Sat, 21 May 2016 14:28:09 +0000 (16:28 +0200)]
nvc0/ir: don't check the format for surface stores on Kepler

Initially, to make sure the formats don't mismatch and won't produce
out-of-bounds access, we checked that both formats have exactly the same
number of bytes, but this should not be checked for typed stores.

This fixes serious rendering issues in the UE4 demos (tested with
realistic and reflections).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv50/ir: fix a comment in canDualIssue()
Samuel Pitoiset [Sat, 21 May 2016 14:13:48 +0000 (16:13 +0200)]
nv50/ir: fix a comment in canDualIssue()

Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonv50/ir: fix SUSTx constraints on Kepler
Samuel Pitoiset [Thu, 19 May 2016 22:52:26 +0000 (00:52 +0200)]
nv50/ir: fix SUSTx constraints on Kepler

To prevent out-of-bounds access and format mismatch we add a predicate
on sustp, but we have to account for it when the sources are condensed
because a predicate is a source. Using the range 3:6 will only condense
the input data and it's always the case. This also fixes constraints
when an indirect access is used.

This ensures that sources are correctly aligned.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoi965: Just read the existing tally on EndTransformFeedback if paused.
Kenneth Graunke [Mon, 9 May 2016 05:48:02 +0000 (22:48 -0700)]
i965: Just read the existing tally on EndTransformFeedback if paused.

If the transform feedback object is paused when ending, then there are
no new snapshots to add to the tally.  In fact, we haven't written a
starting snapshot, so we'd best not try and compute (end - start).

Just load the existing tally so we can convert it to the number of
vertices written and store it to the final result location.

This is the Haswell+ equivalent of the previous commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoi965: Don't write a counter snapshot on EndTransformFeedback if paused.
Kenneth Graunke [Mon, 9 May 2016 05:48:02 +0000 (22:48 -0700)]
i965: Don't write a counter snapshot on EndTransformFeedback if paused.

If the transform feedback object is paused, then we've already written
an ending counter snapshot.  We don't want to write another one.

This fixes assertions in GL33-CTS.transform_feedback.api_errors_test,
which calls EndTransformFeedback after PauseTransformFeedback.  On the
next BeginTransformFeedback, we tried to tally up the results, and saw
an odd number of snapshots (due to the double-end), and tripped an
assertion.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agomesa: Call TransformFeedback driver hooks before setting flags.
Kenneth Graunke [Mon, 9 May 2016 05:45:01 +0000 (22:45 -0700)]
mesa: Call TransformFeedback driver hooks before setting flags.

This way, the driver's EndTransformFeedback() hook can tell whether the
transform feedback operation was paused.  It's also convenient to have
Paused remain false until the driver's PauseTransformFeedback hook
finishes.
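
The ordering described above, together with the i965 "if paused" fixes, can be sketched as follows. This is a minimal illustration only: `struct xfb_object`, the `driver_*` hooks and the `core_*` wrappers are hypothetical names, not Mesa's actual structures or entry points.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a transform feedback object. */
struct xfb_object {
    bool active;
    bool paused;
    unsigned end_snapshots;   /* ending counter snapshots written so far */
};

/* Hypothetical driver hook: pausing writes an ending snapshot. */
static void driver_pause_xfb(struct xfb_object *obj)
{
    obj->end_snapshots++;
}

/* Hypothetical driver hook: only write an ending snapshot when not paused,
 * because pausing already wrote one. */
static void driver_end_xfb(struct xfb_object *obj)
{
    if (!obj->paused)
        obj->end_snapshots++;
}

/* Core code: call the driver hook first, update the flags afterwards. */
static void core_pause_xfb(struct xfb_object *obj)
{
    driver_pause_xfb(obj);    /* paused is still false inside the hook */
    obj->paused = true;
}

static void core_end_xfb(struct xfb_object *obj)
{
    driver_end_xfb(obj);      /* hook still sees the pre-End paused value */
    obj->active = false;
    obj->paused = false;
}
```

If the flags were cleared before the hook ran, `driver_end_xfb()` could not tell a paused object from a running one and would write a second ending snapshot.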

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agonir: Fix crash in nir_lower_wpos_center().
Kenneth Graunke [Fri, 20 May 2016 23:29:44 +0000 (16:29 -0700)]
nir: Fix crash in nir_lower_wpos_center().

Otherwise we rewrote the fadd to use itself, causing crashes in
validation.  Instead, start after the last use like we should.

A brown paper bag fix.  Fixes crashes in several Vulkan tests.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agonir: remove dead glsl variables before lowering io.
Dave Airlie [Fri, 20 May 2016 20:48:05 +0000 (06:48 +1000)]
nir: remove dead glsl variables before lowering io.

For cull distance, GLSL will let unsized, unused arrays get
into the backend.  We should nuke those straight away to
save caring about them later.

This fixes:
arb_separate_shader_objects/linker/large-number-of-unused-varyings
as a side effect (even without culling changes).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agospirv: Handle the PixelCenterInteger execution mode.
Kenneth Graunke [Wed, 18 May 2016 18:06:08 +0000 (11:06 -0700)]
spirv: Handle the PixelCenterInteger execution mode.

This isn't allowed by Vulkan, but might be useful someday for
SPIR-V in OpenGL (if that ever becomes a thing).  It's easy enough
to hook up, and as precedent, we already do so for OriginLowerLeft.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi965: Delete dead dFdy flipping code.
Kenneth Graunke [Wed, 18 May 2016 17:35:54 +0000 (10:35 -0700)]
i965: Delete dead dFdy flipping code.

Rob's nir_lower_wpos_ytransform() pass flips dFdy in the opposite case
of what I expected, so we always take the negate_value case.  It doesn't
really matter.

v2: Write src0 before src1 in ADD instructions (requested by Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoi965: Delete brw_wm_prog_key::render_to_fbo and drawable_height.
Kenneth Graunke [Tue, 17 May 2016 10:40:11 +0000 (03:40 -0700)]
i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height.

Now that we handle flipping and other gl_FragCoord transformations
via a uniform, these key fields have no users.

This patch actually eliminates the associated recompiles.  The Tomb
Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable
number.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agoi965, anv: Use NIR FragCoord re-center and y-transform passes.
Kenneth Graunke [Tue, 17 May 2016 08:52:16 +0000 (01:52 -0700)]
i965, anv: Use NIR FragCoord re-center and y-transform passes.

This handles gl_FragCoord transformations and other window system vs.
user FBO coordinate system flipping by multiplying/adding uniform
values, rather than recompiles.

This is much better because we have no decent way to guess whether
the application is going to use a shader with the window system FBO
or a user FBO, much less the drawable height.  This led to a lot of
recompiles in many applications.
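
As a rough sketch of the idea (not the actual NIR pass), the Y flip can be expressed as a multiply/add whose factors come from a uniform, so the same compiled shader serves both cases. The struct and function names below are hypothetical, and the real pass's uniform layout is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical uniform contents: y' = y * y_scale + y_bias. */
struct wpos_transform {
    float y_scale;
    float y_bias;
};

/* Chosen at draw time, when the bound framebuffer and its height are known. */
static struct wpos_transform wpos_transform_for(bool flip_y, float drawable_height)
{
    struct wpos_transform t;
    if (flip_y) {
        t.y_scale = -1.0f;
        t.y_bias = drawable_height;
    } else {
        t.y_scale = 1.0f;
        t.y_bias = 0.0f;
    }
    return t;
}

/* What the shader computes: the same instructions whether or not Y is flipped. */
static float transform_frag_y(float y, struct wpos_transform t)
{
    return y * t.y_scale + t.y_bias;
}
```

Only the uniform values change between a window-system FBO and a user FBO, so nothing in the program key depends on the flip or on the drawable height.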

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers.
Kenneth Graunke [Wed, 18 May 2016 18:38:32 +0000 (11:38 -0700)]
nir: Add a simple nir_lower_wpos_center() pass for Vulkan drivers.

nir_lower_wpos_ytransform() is great for OpenGL, which allows
applications to choose whether their coordinate system's origin is
upper left/lower left, and whether the pixel center should be on
integer/half-integer boundaries.

Vulkan, however, has much simpler requirements: the pixel center
is always half-integer, and the origin is always upper left.  No
coordinate transform is needed - we just need to add <0.5, 0.5>.
This means that we can avoid using (and setting up) a uniform.

I thought about adding more options to nir_lower_wpos_ytransform(),
but making a new pass that never even touched uniforms seemed simpler.
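
In plain C terms, the Vulkan-only lowering amounts to the following. This is an illustrative sketch with a hypothetical `struct vec4f`; the real pass emits NIR instructions instead.

```c
/* Hypothetical 4-component float vector standing in for gl_FragCoord. */
struct vec4f {
    float x, y, z, w;
};

/* Move from integer pixel corners to half-integer pixel centers.
 * No uniform is involved: the offset is a compile-time constant. */
static struct vec4f recenter_frag_coord(struct vec4f p)
{
    p.x += 0.5f;
    p.y += 0.5f;
    return p;   /* z and w are left untouched */
}
```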

v2: Use normal iterator rather than _safe variant (noticed by Matt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Rob Clark <robdclark@gmail.com>
8 years agonir: Don't use ffma in nir_lower_wpos_ytransform().
Kenneth Graunke [Wed, 18 May 2016 19:14:02 +0000 (12:14 -0700)]
nir: Don't use ffma in nir_lower_wpos_ytransform().

ffma is an explicitly fused multiply add with higher precision.
The optimizer will take care of promoting mul/add to fma when
it's beneficial to do so.

This fixes failures on Gen4-5 when using this pass, as those platforms
don't actually implement fma().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agonir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform.
Kenneth Graunke [Wed, 18 May 2016 17:32:33 +0000 (10:32 -0700)]
nir: Handle fddy_fine and fddy_coarse in nir_lower_wpos_ytransform.

These also need flipping!

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years agonir: Make lower_wpos_ytransform_block a void function.
Kenneth Graunke [Wed, 18 May 2016 18:19:00 +0000 (11:19 -0700)]
nir: Make lower_wpos_ytransform_block a void function.

The return value was used for the old nir_foreach_block callback system,
but at this point it no longer means anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years agonir: Make nir_lower_wpos_ytransform() match FragCoord by location.
Kenneth Graunke [Wed, 18 May 2016 16:31:49 +0000 (09:31 -0700)]
nir: Make nir_lower_wpos_ytransform() match FragCoord by location.

gl_FragCoord is a shader input with location == VARYING_SLOT_POS.
ARB_fragment_programs have an equivalent input at VARYING_SLOT_POS,
but it isn't called gl_FragCoord.  We do want to transform it.

Matching by location guarantees we catch both.

Fixes several fp tests on a branch which uses this pass on i965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years agonir: Add interp_var_at_offset flipping.
Kenneth Graunke [Tue, 17 May 2016 10:30:58 +0000 (03:30 -0700)]
nir: Add interp_var_at_offset flipping.

The Y-offset needs flipping as well, similar to ddy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years agonir: Fix fddy swizzles in nir_lower_wpos_ytransform().
Kenneth Graunke [Wed, 18 May 2016 06:16:14 +0000 (23:16 -0700)]
nir: Fix fddy swizzles in nir_lower_wpos_ytransform().

The original value might have been swizzled.  That's taken care of in
the fmul source - we don't want to reswizzle it again.

Fixes validation failures in glsl-derivs-varyings on a branch of mine
which uses this pass in i965.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years agonir: Fix wpos_ytransform lowering state_slot swizzle.
Kenneth Graunke [Tue, 17 May 2016 10:05:56 +0000 (03:05 -0700)]
nir: Fix wpos_ytransform lowering state_slot swizzle.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
8 years agoi965: Fix brw_regs_equal() for NaN and positive/negative zero.
Kenneth Graunke [Tue, 17 May 2016 00:28:19 +0000 (17:28 -0700)]
i965: Fix brw_regs_equal() for NaN and positive/negative zero.

We'd like the comparisons to mean "the exact same bits".  Comparing
doubles won't do that for NaN values or positive vs. negative zero.
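
A small sketch of the difference, using hypothetical helper names rather than the code from brw_regs_equal(): comparing doubles with `==` follows IEEE semantics, while `memcmp()` compares the stored bits.

```c
#include <stdbool.h>
#include <string.h>

/* IEEE comparison: NaN != NaN, and +0.0 == -0.0. */
static bool same_value(double a, double b)
{
    return a == b;
}

/* Bitwise comparison: true only when both doubles hold identical bits. */
static bool same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

For a NaN compared with itself, `same_value()` is false while `same_bits()` is true. For `0.0` and `-0.0` the results are reversed.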

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agovirgl: handle cull distance cap.
Dave Airlie [Fri, 20 May 2016 20:19:29 +0000 (06:19 +1000)]
virgl: handle cull distance cap.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agovirgl: Add missing texture transfer_inline_write
Rob Herring [Fri, 20 May 2016 17:51:00 +0000 (12:51 -0500)]
virgl: Add missing texture transfer_inline_write

transfer_inline_write cannot be NULL and the virgl renderer doesn't support
inline writes for textures, so add the default version.

This fixes a crash in st_TexSubImage since commit fb9fe352ea41 ("st/mesa:
use transfer_inline_write for memcpy TexSubImage path").

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoanv: Merge in my TODO list items
Kristian Høgsberg Kristensen [Fri, 20 May 2016 17:35:57 +0000 (10:35 -0700)]
anv: Merge in my TODO list items

Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
8 years agomesa: Replace uses of Shared->Mutex with hash-table mutexes
Matt Turner [Thu, 30 Jul 2015 21:31:04 +0000 (14:31 -0700)]
mesa: Replace uses of Shared->Mutex with hash-table mutexes

We were locking the Shared->Mutex and then calling functions like
_mesa_HashInsert that do additional per-hash-table locking internally.

Instead just lock each hash-table's mutex and use functions like
_mesa_HashInsertLocked and the new _mesa_HashRemoveLocked.

In order to do this, we need to remove the locking from
_mesa_HashFindFreeKeyBlock since it will always be called with the
per-hash-table lock taken.
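
The pattern can be sketched with a toy table. Everything below (`struct toy_hash` and the `toy_*` functions) is hypothetical and only mirrors the naming convention: the `_locked` helpers take no lock themselves and rely on the caller holding the per-table mutex.

```c
#include <pthread.h>
#include <stdbool.h>

#define TABLE_SIZE 16

/* Hypothetical toy table, not Mesa's hash table. Key 0 is reserved. */
struct toy_hash {
    pthread_mutex_t mutex;
    bool used[TABLE_SIZE];
};

/* Caller must hold t->mutex. Returns 0 when the table is full. */
static unsigned toy_find_free_key_locked(const struct toy_hash *t)
{
    for (unsigned k = 1; k < TABLE_SIZE; k++) {
        if (!t->used[k])
            return k;
    }
    return 0;
}

/* Caller must hold t->mutex. */
static void toy_insert_locked(struct toy_hash *t, unsigned key)
{
    t->used[key] = true;
}

/* Find a free key and claim it under a single acquisition of the
 * per-table mutex, with no outer shared lock involved. */
static unsigned toy_gen_name(struct toy_hash *t)
{
    pthread_mutex_lock(&t->mutex);
    unsigned key = toy_find_free_key_locked(t);
    if (key != 0)
        toy_insert_locked(t, key);
    pthread_mutex_unlock(&t->mutex);
    return key;
}
```

Because the lookup and the insert happen under the same lock, two threads cannot be handed the same free key.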

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agohash: Add _mesa_HashRemoveLocked() function.
Matt Turner [Thu, 30 Jul 2015 21:24:07 +0000 (14:24 -0700)]
hash: Add _mesa_HashRemoveLocked() function.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoi965: Pass nir_src/nir_dest by reference.
Matt Turner [Thu, 19 May 2016 21:43:23 +0000 (14:43 -0700)]
i965: Pass nir_src/nir_dest by reference.

Cuts 6K of .text.

   text    data     bss     dec     hex filename
5772372  264648   29320 6066340  5c90a4 lib/i965_dri.so before
5766074  264648   29320 6060042  5c780a lib/i965_dri.so after

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoglsl: Guard against NULL dereference
Mark Janes [Fri, 20 May 2016 15:50:39 +0000 (08:50 -0700)]
glsl: Guard against NULL dereference

This trivially corrects mesa 3ca1c221, which introduced a check that
crashes when a match is not found.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Fixes: piglit.spec.glsl-1_50.compiler.interface-blocks-name-reused-globally-4.vert
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
8 years agoanv: Enable textureCompressionASTC_LDR on Gen9+
Nanley Chery [Wed, 18 May 2016 17:50:48 +0000 (10:50 -0700)]
anv: Enable textureCompressionASTC_LDR on Gen9+

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoanv/format: Reorder ASTC mappings to match ISL enum ordering
Nanley Chery [Wed, 18 May 2016 17:40:39 +0000 (10:40 -0700)]
anv/format: Reorder ASTC mappings to match ISL enum ordering

Keep the lists consistent for ease of use.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agogenxml: Expand SKL's SurfaceFormat field width for ASTC
Nanley Chery [Wed, 18 May 2016 17:45:30 +0000 (10:45 -0700)]
genxml: Expand SKL's SurfaceFormat field width for ASTC

In the expanded field, only ASTC format enums have the MSB set to 1.
Expanding the field width makes the process of handling these formats
identical to the way other formats are handled.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoisl: Handle npot ASTC block dimensions on Gen9+
Nanley Chery [Wed, 18 May 2016 23:19:23 +0000 (16:19 -0700)]
isl: Handle npot ASTC block dimensions on Gen9+

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoisl: Add 2D ASTC format layouts and enums
Nanley Chery [Wed, 18 May 2016 17:43:42 +0000 (10:43 -0700)]
isl: Add 2D ASTC format layouts and enums

Also, make changes needed for successful compilation and registration
as a texture compression mode.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agomesa: Build EGL without X11 headers after interop patchset
Youry Metlitsky [Wed, 27 Apr 2016 22:33:14 +0000 (22:33 +0000)]
mesa: Build EGL without X11 headers after interop patchset

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agonir/validate: assume() that hashtable entry exists
Rob Clark [Wed, 18 May 2016 15:43:15 +0000 (11:43 -0400)]
nir/validate: assume() that hashtable entry exists

At this point, it would require a logic error in nir_validate to not
have already populated this hashtable entry, but coverity doesn't
realize that:

CID 1265547 (#1 of 1): Dereference null return value (NULL_RETURNS)
3. dereference: Dereferencing a null pointer entry.

CID 1271039 (#1 of 1): Dereference null return value (NULL_RETURNS)
3. dereference: Dereferencing a null pointer entry.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonir: coverity uninitialized pointer read
Rob Clark [Wed, 18 May 2016 14:38:40 +0000 (10:38 -0400)]
nir: coverity uninitialized pointer read

Not sure how coverity arrives at the conclusion that we can read comp[j]
uninitialized (around line 204), other than not being aware that ncomp is
greater than 1 so it won't underflow in the 'if (tex->is_array)' case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonir: coverity sign-extension fix
Rob Clark [Wed, 18 May 2016 14:17:02 +0000 (10:17 -0400)]
nir: coverity sign-extension fix

Not 100% sure, but I think being an unsigned literal will help:

CID 1358505 (#1 of 1): Unintended sign extension
(SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension:
load1->def.num_components with type unsigned char (8 bits, unsigned) is
promoted in load1->def.num_components * (load1->def.bit_size / 8) to
type int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If load1->def.num_components * (load1->def.bit_size /
8) is greater than 0x7FFFFFFF, the upper bits of the result will all be
1.
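
The mechanism Coverity describes can be reproduced in isolation. The helpers below are hypothetical and only illustrate the integer promotion rules. With 8-bit operands the real product stays far below 0x7FFFFFFF, so the widening helpers use an explicit negative value to show what sign extension does.

```c
#include <stddef.h>
#include <stdint.h>

/* Widening a signed 32-bit value to 64 bits replicates the sign bit. */
static uint64_t widen_signed(int v)
{
    return (uint64_t)v;
}

/* Widening an unsigned 32-bit value zero-extends instead. */
static uint64_t widen_unsigned(unsigned v)
{
    return (uint64_t)v;
}

/* Both uint8_t operands promote to int, so the product has type int
 * and would be sign-extended when converted to a 64-bit size. */
static size_t size_signed_math(uint8_t num_components, uint8_t bit_size)
{
    return num_components * (bit_size / 8);
}

/* An unsigned literal makes the whole expression unsigned int,
 * so the conversion to size_t zero-extends. */
static size_t size_unsigned_math(uint8_t num_components, uint8_t bit_size)
{
    return num_components * (bit_size / 8u);
}
```

For realistic inputs such as 4 components of 32 bits, both size helpers return 16. They differ only in the type of the intermediate result, which is what the checker reports.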

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agonir/glsl_to_nir: quell some uninit_member coverity errors
Rob Clark [Wed, 18 May 2016 14:58:29 +0000 (10:58 -0400)]
nir/glsl_to_nir: quell some uninit_member coverity errors

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Matt Turner <mattst88@gmail.com>
8 years agofreedreno/ir3: need to lower fmod too
Rob Clark [Tue, 17 May 2016 13:52:24 +0000 (09:52 -0400)]
freedreno/ir3: need to lower fmod too

Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years agoi965: Fix strerror error code sign
Mark Janes [Thu, 19 May 2016 20:42:16 +0000 (13:42 -0700)]
i965: Fix strerror error code sign

This trivial fix to error-handling corrects the sign of drm error
codes before passing them to strerror.
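
A minimal sketch of the convention involved, assuming the failing call reports errors as a negative errno value. `describe_drm_error()` is a hypothetical helper, not a function from the driver.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: 'ret' is a negative errno value such as -ENOENT.
 * strerror() expects the positive errno, so the sign is flipped first. */
static void describe_drm_error(int ret, char *buf, size_t len)
{
    snprintf(buf, len, "%s", strerror(-ret));
}
```

Passing the negative value straight to `strerror()` typically yields an "unknown error" style message instead of the real description.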

Identified by Coverity: CID1358581

8 years agonir/spirv: Handle the NonReadable decoration on struct members
Jason Ekstrand [Fri, 20 May 2016 03:58:32 +0000 (20:58 -0700)]
nir/spirv: Handle the NonReadable decoration on struct members

8 years agoanv/pipeline: Bounds-check resource indices when robust_buffer_access is enabled
Jason Ekstrand [Thu, 19 May 2016 04:41:05 +0000 (21:41 -0700)]
anv/pipeline: Bounds-check resource indices when robust_buffer_access is enabled

8 years agoanv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled
Jason Ekstrand [Sat, 14 May 2016 21:55:39 +0000 (14:55 -0700)]
anv/pipeline: Only do buffer bounds checks if robustBufferAccess is enabled

8 years agoanv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment
Jason Ekstrand [Sat, 14 May 2016 21:53:11 +0000 (14:53 -0700)]
anv/apply_dynamic_offsets: Use rewrite_src instead of a regular assignment

Originally we removed the instruction, changed the source, and then
re-inserted it.  This works, but nir_instr_rewrite_src is a bit more
obviously correct.

8 years agoanv/device: Add a boolean for robust buffer access
Jason Ekstrand [Sat, 14 May 2016 21:52:36 +0000 (14:52 -0700)]
anv/device: Add a boolean for robust buffer access

8 years agoanv: Add a TODO file
Jason Ekstrand [Fri, 20 May 2016 03:08:55 +0000 (20:08 -0700)]
anv: Add a TODO file

8 years agoglsl: handle same struct redeclaration (v2)
Dave Airlie [Tue, 17 May 2016 00:58:53 +0000 (10:58 +1000)]
glsl: handle same struct redeclaration (v2)

This works around a bug in older versions of UE4, where a shader
defines the same structure twice. Although we aren't sure this is correct
GLSL (it most likely isn't) there are enough UE4 based things out there
we should deal with this.

This drops the error to a warning if the struct names and contents match.

v1.1: do better C++ on record_compare declaration (Rob)
v2: restrict this to desktop GL only (Ian)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoi965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz.
Matt Turner [Wed, 4 May 2016 22:37:02 +0000 (15:37 -0700)]
i965/fs: Recognize and emit ld_lz, sample_lz, sample_c_lz.

Instead of a big and complicated optimization pass, Ken suggested
just recognizing the operations here. It's certainly less code and a lot
prettier, but it seems to actually perform worse for currently unknown
reasons.

total instructions in shared programs: 8923452 -> 8904108 (-0.22%)
instructions in affected programs: 814563 -> 795219 (-2.37%)
helped: 3336
HURT: 10

total cycles in shared programs: 66970734 -> 66651476 (-0.48%)
cycles in affected programs: 10582686 -> 10263428 (-3.02%)
helped: 2438
HURT: 691

total spills in shared programs: 1811 -> 1789 (-1.21%)
spills in affected programs: 85 -> 63 (-25.88%)
helped: 4

total fills in shared programs: 3143 -> 3109 (-1.08%)
fills in affected programs: 167 -> 133 (-20.36%)
helped: 4

LOST:   2
GAINED: 36

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoi965: Add infrastructure for sample lod-zero operations.
Matt Turner [Wed, 4 May 2016 22:46:45 +0000 (15:46 -0700)]
i965: Add infrastructure for sample lod-zero operations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoi965/fs: Add and use get_nir_src_imm().
Matt Turner [Wed, 4 May 2016 22:10:25 +0000 (15:10 -0700)]
i965/fs: Add and use get_nir_src_imm().

The next patch wants to inspect the LOD argument and do something
different if it's 0.0f. But at that point we've emitted a MOV for it and
we just have a register to look at.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agonvc0: account for shader-allocated local memory needs
Ilia Mirkin [Thu, 19 May 2016 01:27:33 +0000 (21:27 -0400)]
nvc0: account for shader-allocated local memory needs

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
8 years agonv50/ir: treat addresses as local
Ilia Mirkin [Wed, 18 May 2016 00:44:21 +0000 (20:44 -0400)]
nv50/ir: treat addresses as local

Address registers are always loaded right before use. Don't treat them
as "global", which will cause them to be put into the function's
linkage, and will make the register allocator hold onto that
register until the end of the function.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoswr: [rasterizer] utility functions for shared libs
Tim Rowley [Fri, 13 May 2016 00:12:55 +0000 (18:12 -0600)]
swr: [rasterizer] utility functions for shared libs

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD
Tim Rowley [Thu, 12 May 2016 00:05:23 +0000 (18:05 -0600)]
swr: [rasterizer jitter] fix assert in AVX implementation of MASKLOADD

llvm changed the mask type to vector of ints with 3.8.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions
Tim Rowley [Wed, 11 May 2016 22:51:11 +0000 (16:51 -0600)]
swr: [rasterizer core] apply KNOB_TOSS_DRAW to more functions

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer jitter] add instancing to non-gather fetch path
Tim Rowley [Wed, 11 May 2016 15:57:08 +0000 (09:57 -0600)]
swr: [rasterizer jitter] add instancing to non-gather fetch path

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] move MultisampleTrait static from header to cpp
Tim Rowley [Tue, 10 May 2016 18:55:18 +0000 (12:55 -0600)]
swr: [rasterizer core] move MultisampleTrait static from header to cpp

Move a MultisampleTrait static from header to cpp as clang seemed to get
confused with some specializations in the header vs some in cpp.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] clang override for _mm_undefined*
Tim Rowley [Tue, 10 May 2016 00:00:26 +0000 (18:00 -0600)]
swr: [rasterizer core] clang override for _mm_undefined*

Not supported in older xcode versions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer common] add OSX to unix portability sections
Tim Rowley [Fri, 6 May 2016 20:38:25 +0000 (14:38 -0600)]
swr: [rasterizer common] add OSX to unix portability sections

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer] rename _aligned_malloc to AlignedMalloc
Tim Rowley [Fri, 6 May 2016 18:49:23 +0000 (12:49 -0600)]
swr: [rasterizer] rename _aligned_malloc to AlignedMalloc

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer jitter] rename MEMCPY function to MEMCOPY
Tim Rowley [Fri, 6 May 2016 17:29:07 +0000 (11:29 -0600)]
swr: [rasterizer jitter] rename MEMCPY function to MEMCOPY

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer common] guard definition of __cdecl/__stdcall
Tim Rowley [Thu, 5 May 2016 22:13:21 +0000 (16:13 -0600)]
swr: [rasterizer common] guard definition of __cdecl/__stdcall

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer common] include cstddef for offsetof
Tim Rowley [Thu, 5 May 2016 21:48:32 +0000 (15:48 -0600)]
swr: [rasterizer common] include cstddef for offsetof

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] removed tabs that snuck in
Tim Rowley [Thu, 5 May 2016 19:47:03 +0000 (13:47 -0600)]
swr: [rasterizer core] removed tabs that snuck in

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] code style cleanup
Tim Rowley [Tue, 17 May 2016 22:04:36 +0000 (17:04 -0500)]
swr: [rasterizer core] code style cleanup

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] add dummy code for cygwin build
Tim Rowley [Tue, 17 May 2016 22:03:52 +0000 (17:03 -0500)]
swr: [rasterizer core] add dummy code for cygwin build

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] move variable query outside loop
Tim Rowley [Wed, 4 May 2016 21:04:39 +0000 (15:04 -0600)]
swr: [rasterizer core] move variable query outside loop

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] utility function for getenv
Tim Rowley [Wed, 4 May 2016 16:40:10 +0000 (10:40 -0600)]
swr: [rasterizer core] utility function for getenv

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer common] portable threadviz buckets
Tim Rowley [Wed, 4 May 2016 15:24:55 +0000 (09:24 -0600)]
swr: [rasterizer common] portable threadviz buckets

Output with slashes instead of backslashes for unix/linux.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer common] foreground win32 assert dialog
Tim Rowley [Tue, 3 May 2016 17:45:03 +0000 (11:45 -0600)]
swr: [rasterizer common] foreground win32 assert dialog

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: [rasterizer core] use parens to disambiguate operator precedence
Tim Rowley [Tue, 3 May 2016 17:07:15 +0000 (11:07 -0600)]
swr: [rasterizer core] use parens to disambiguate operator precedence

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agoswr: standardize linkage and check for unresolved symbols
Tim Rowley [Thu, 19 May 2016 16:36:32 +0000 (11:36 -0500)]
swr: standardize linkage and check for unresolved symbols

Acked-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoswr: fix swr linkage so that static llvm works
Tim Rowley [Mon, 16 May 2016 18:31:16 +0000 (13:31 -0500)]
swr: fix swr linkage so that static llvm works

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
8 years agoswr: PIPE_CAP_CULL_DISTANCE cap request response
Tim Rowley [Mon, 16 May 2016 16:03:26 +0000 (11:03 -0500)]
swr: PIPE_CAP_CULL_DISTANCE cap request response

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agodocs: add swr to GL3.txt
Tim Rowley [Mon, 16 May 2016 16:02:27 +0000 (11:02 -0500)]
docs: add swr to GL3.txt

v2: not on gl3.3 list until gl3.2 is complete

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
8 years agost/va: use drm render node for wayland display type
Leo Liu [Tue, 17 May 2016 19:16:09 +0000 (15:16 -0400)]
st/va: use drm render node for wayland display type

With Xwayland, vainfo uses VA_DISPLAY_WAYLAND by default and fails. It
also fails when the display is specified with `vainfo --display wayland`.
In fact, Wayland support in libva uses the drm path to connect to the
device, and should use the drm pipe loader to create the screen.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
8 years agogallium/radeon: small cleanups in r600_texture_transfer_map
Marek Olšák [Thu, 12 May 2016 11:26:24 +0000 (13:26 +0200)]
gallium/radeon: small cleanups in r600_texture_transfer_map

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogallium/radeon: don't set PB_USAGE in winsyses
Marek Olšák [Thu, 12 May 2016 11:13:42 +0000 (13:13 +0200)]
gallium/radeon: don't set PB_USAGE in winsyses

There is no point.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogallium/radeon: handle VRAM_GTT placements as having slow CPU reads
Marek Olšák [Thu, 12 May 2016 11:05:19 +0000 (13:05 +0200)]
gallium/radeon: handle VRAM_GTT placements as having slow CPU reads

Not sure if we should include GTT WC too.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogallium/radeon: ignore PIPE_TRANSFER_MAP_DIRECTLY
Marek Olšák [Thu, 12 May 2016 10:55:41 +0000 (12:55 +0200)]
gallium/radeon: ignore PIPE_TRANSFER_MAP_DIRECTLY

Only st/xa is using this, which is irrelevant to us.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add a workaround for a bug in LLVM <= 3.8
Marek Olšák [Tue, 17 May 2016 21:10:24 +0000 (23:10 +0200)]
radeonsi: add a workaround for a bug in LLVM <= 3.8

This is not directly applicable to stable and needs to be backported.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoi965/fs: Silence warnings related to use of uninitialized values
Eduardo Lima Mitev [Tue, 17 May 2016 10:21:02 +0000 (12:21 +0200)]
i965/fs: Silence warnings related to use of uninitialized values

brw_fs.cpp: In function ‘const unsigned int* brw_compile_fs(const [...]
brw_fs.cpp:6093:64: warning: ‘simd16_grf_start’ may be used uninitialized [...]
       prog_data->base.dispatch_grf_start_reg = simd16_grf_start;

brw_fs.cpp:5996:29: note: ‘simd16_grf_start’ was declared here
    uint8_t simd8_grf_start, simd16_grf_start;

brw_fs.cpp:6094:52: warning: ‘simd16_grf_used’ may be used uninitialized [...]
       prog_data->reg_blocks_0 = brw_register_blocks(simd16_grf_used);

brw_fs.cpp:5997:29: note: ‘simd16_grf_used’ was declared here
    unsigned simd8_grf_used, simd16_grf_used;

(and more)

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
8 years agovc4: Size transfer temporary mappings appropriately for full maps of 3D.
Eric Anholt [Wed, 18 May 2016 19:29:02 +0000 (12:29 -0700)]
vc4: Size transfer temporary mappings appropriately for full maps of 3D.

We don't really support reading/writing of 3D textures since the hardware
doesn't do 3D, but we do need to make sure that a pipe_transfer for them
has enough space to store the image.  This was previously not a problem
because the state tracker only mapped a slice at a time until
fb9fe352ea41c7e3633ba2c483c59b73c529845b.  Fixes glean glsl1 tests, which
all have setup of a 3D texture at the start.

8 years agoanv/device: Fix viewportBoundsRange
Nanley Chery [Tue, 17 May 2016 22:28:01 +0000 (15:28 -0700)]
anv/device: Fix viewportBoundsRange

Align with the spec requirement that the range must be at least
[−2 × maxViewportDimensions, 2 × maxViewportDimensions − 1]. Our
hardware supports this.

Fixes dEQP-VK.api.info.device.properties

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
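
The range quoted in the anv/device message above can be computed from a maximum viewport dimension as in the following minimal sketch. The struct and function names are hypothetical and are not the anv code; the value 16384 in the usage note is only an illustrative input, not a claim about any particular hardware.

```c
#include <stdint.h>

/* Hypothetical container for a [min, max] pair; not a Vulkan or anv type. */
struct bounds_range {
   float min;
   float max;
};

/*
 * Smallest range allowed by the requirement quoted in the commit message:
 * [-2 * maxViewportDimensions, 2 * maxViewportDimensions - 1].
 * max_viewport_dim is assumed to be the larger of the two dimensions.
 */
static struct bounds_range
min_viewport_bounds_range(uint32_t max_viewport_dim)
{
   struct bounds_range r;
   r.min = -2.0f * (float)max_viewport_dim;
   r.max = 2.0f * (float)max_viewport_dim - 1.0f;
   return r;
}
```

For an illustrative maximum dimension of 16384, the sketch yields the range [-32768, 32767]. A driver may advertise a wider range; this is only the lower limit the message refers to.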
8 years agoglsl/linker: attempt to match anonymous structures at link
Dave Airlie [Tue, 17 May 2016 00:31:29 +0000 (10:31 +1000)]
glsl/linker: attempt to match anonymous structures at link

This is my attempt at fixing at least one of the UE4 bugs with GL4.3.

If we are doing intrastage matching and hit anonymous structs, then
we should do a record comparison instead of using the names.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95005
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
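
The idea in the glsl/linker message above, comparing anonymous structures by their contents rather than by name, might be sketched as follows. All types and names here are hypothetical and far simpler than the real GLSL type system; in particular, representing an anonymous struct by a NULL name and a type by an integer id are assumptions made only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified field and record descriptions. */
struct field {
   const char *name;
   int type_id;          /* stand-in for a real type pointer */
};

struct record {
   const char *name;     /* NULL means "anonymous" in this sketch */
   const struct field *fields;
   unsigned length;
};

/* Field-by-field ("record") comparison: same count, names and types. */
static bool
record_fields_equal(const struct record *a, const struct record *b)
{
   if (a->length != b->length)
      return false;
   for (unsigned i = 0; i < a->length; i++) {
      if (strcmp(a->fields[i].name, b->fields[i].name) != 0 ||
          a->fields[i].type_id != b->fields[i].type_id)
         return false;
   }
   return true;
}

/* Named structs match by name; anonymous ones fall back to their contents. */
static bool
records_match(const struct record *a, const struct record *b)
{
   if (a->name != NULL && b->name != NULL)
      return strcmp(a->name, b->name) == 0;
   if (a->name == NULL && b->name == NULL)
      return record_fields_equal(a, b);
   return false; /* simplification: named vs. anonymous never match */
}
```

Two anonymous records with identical fields match under this sketch, while records that differ in any field name or type do not.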
8 years agoanv/batch_chain: free pointers for error cases
Mark Janes [Wed, 18 May 2016 21:28:38 +0000 (14:28 -0700)]
anv/batch_chain: free pointers for error cases

Trivial fix to improperly handled cleanup during
VK_ERROR_OUT_OF_HOST_MEMORY.

Identified by Coverity: CID 1358908 and 1358909
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
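
The kind of cleanup the anv/batch_chain message above describes, releasing earlier allocations when a later one fails, is commonly written with goto labels in C. The sketch below is generic and is not the anv code; every name in it is hypothetical, and the counting allocator exists only so the failure path can be exercised.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object that owns two allocations. */
struct pair {
   int *a;
   int *b;
};

typedef void *(*alloc_fn)(size_t);

/* Demo allocator: succeeds allocs_left times, then reports out-of-memory. */
static int allocs_left;

static void *
limited_alloc(size_t size)
{
   if (allocs_left == 0)
      return NULL;
   allocs_left--;
   return malloc(size);
}

/* Returns 0 on success, -1 on allocation failure with nothing leaked. */
static int
pair_init(struct pair *p, alloc_fn alloc)
{
   p->a = alloc(sizeof(*p->a));
   if (p->a == NULL)
      goto fail_a;

   p->b = alloc(sizeof(*p->b));
   if (p->b == NULL)
      goto fail_b;

   return 0;

fail_b:
   free(p->a);   /* release what was already allocated */
fail_a:
   p->a = NULL;
   p->b = NULL;
   return -1;
}
```

With allocs_left set to 1, the second allocation fails, the first one is freed, and both pointers are reset; a real driver would return its own error code (the message mentions VK_ERROR_OUT_OF_HOST_MEMORY) instead of -1.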
8 years agost/nine: Minor change to support musl libc
Wang He [Tue, 10 May 2016 05:40:30 +0000 (13:40 +0800)]
st/nine: Minor change to support musl libc

A few changes to support musl libc as well.

In particular, fpu_control.h is glibc-specific.
fenv.h doesn't let us do exactly what we want either,
so use assembly directly instead.

Signed-off-by: Wang He <xw897002528@gmail.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Enable D3DPMISCCAPS_PERSTAGECONSTANT
Patrick Rudolph [Fri, 29 Apr 2016 06:50:16 +0000 (08:50 +0200)]
st/nine: Enable D3DPMISCCAPS_PERSTAGECONSTANT

Nine already supports the feature.
There are no failing WINE tests for per stage constants.
Enabling D3DPMISCCAPS_PERSTAGECONSTANT as it fixes
https://github.com/iXit/Mesa-3D/issues/205

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Turn on thread_submit by default when on different device
Axel Davy [Sat, 7 May 2016 09:33:24 +0000 (11:33 +0200)]
st/nine: Turn on thread_submit by default when on different device

The last remaining issues with thread_submit have been resolved,
so turn it on when on a different device (the case where it is
beneficial).

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Fix usage of rasterizer multisample bit.
Axel Davy [Sun, 3 Apr 2016 11:04:39 +0000 (13:04 +0200)]
st/nine: Fix usage of rasterizer multisample bit.

The pipe_rasterizer multisample bit should be enabled only when we really
want multisampling, so disable it when there is no MSAA render target.
This fixes some depth calculation precision issues on radeon.
Also disable it when both depth and stencil tests are disabled, since in that
case multisampling gives the same result as no multisampling.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
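
The rule stated in the multisample message above can be read as a small predicate, sketched below. The function and parameter names are hypothetical and do not come from nine; any application-side multisample render state that the real code may also consult is deliberately left out.

```c
#include <stdbool.h>

/*
 * Decide whether the rasterizer multisample bit should be set.
 * rt_nr_samples: sample count of the bound render target (0 or 1 = no MSAA).
 */
static bool
want_rasterizer_multisample(unsigned rt_nr_samples,
                            bool depth_test_enabled,
                            bool stencil_test_enabled)
{
   /* No MSAA render target: never enable the bit. */
   if (rt_nr_samples <= 1)
      return false;

   /* With both tests off, the message says the result is the same
    * as without multisampling, so leave the bit off. */
   if (!depth_test_enabled && !stencil_test_enabled)
      return false;

   return true;
}
```

Under this sketch, a 4-sample render target with depth testing enabled gets the bit, while a single-sample target or a target with both tests disabled does not.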
8 years agost/nine: ATOC has effect only with ALPHATESTENABLE
Axel Davy [Sun, 3 Apr 2016 08:52:22 +0000 (10:52 +0200)]
st/nine: ATOC has effect only with ALPHATESTENABLE

ATOC extension does something only when alpha test is enabled.
Use a second bit to encode the difference with ATIATOC.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Add debug string for ATOC
Axel Davy [Sat, 7 May 2016 09:20:47 +0000 (11:20 +0200)]
st/nine: Add debug string for ATOC

We were missing a debug string for this format.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Add asserts for output/input packing
Axel Davy [Sat, 19 Mar 2016 18:27:34 +0000 (19:27 +0100)]
st/nine: Add asserts for output/input packing

Nine doesn't support vs output/ps input packing.
We haven't found any application requiring that,
and implementing it properly is complex.

Add asserts for now.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Use correct PIPE_HANDLE_USAGE flag for frontbuffer copy
Axel Davy [Mon, 14 Mar 2016 20:29:53 +0000 (21:29 +0100)]
st/nine: Use correct PIPE_HANDLE_USAGE flag for frontbuffer copy

When taking screenshots we do a copy from the frontbuffer
to an allocated buffer (which we then copy to a ram buffer).

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Fix output shift calculation
Axel Davy [Sat, 12 Mar 2016 11:24:51 +0000 (12:24 +0100)]
st/nine: Fix output shift calculation

We were getting it wrong for negative values.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Fix CheckDeviceFormat advertising for surfaces
Axel Davy [Fri, 11 Mar 2016 22:30:05 +0000 (23:30 +0100)]
st/nine: Fix CheckDeviceFormat advertising for surfaces

Signed-off-by: Axel Davy <axel.davy@ens.fr>