platform/upstream/mesa.git
9 years agost/nine: Force hw cursor for Windowed mode
Axel Davy [Sun, 22 Mar 2015 17:48:07 +0000 (18:48 +0100)]
st/nine: Force hw cursor for Windowed mode

According to the spec, Windowed mode must
have hw cursor

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
9 years agost/nine: Hide hardware cursor when we don't use it
Axel Davy [Sat, 21 Mar 2015 21:21:14 +0000 (22:21 +0100)]
st/nine: Hide hardware cursor when we don't use it

We have either hardware cursor or software cursor.
When we use software cursor, we should hide the hardware
cursor.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
9 years agost/nine: fix D3DRS_DITHERENABLE wrong state group
Axel Davy [Sun, 15 Feb 2015 20:30:44 +0000 (21:30 +0100)]
st/nine: fix D3DRS_DITHERENABLE wrong state group

D3DRS_DITHERENABLE was assigned to the rasterizer state
group, but it was used for the blend group.

Assign it to the blend group.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
9 years agost/nine: Account POINTSIZE_MIN and POINTSIZE_MAX for point size
Patrick Rudolph [Sun, 19 Apr 2015 08:14:30 +0000 (10:14 +0200)]
st/nine: Account POINTSIZE_MIN and POINTSIZE_MAX for point size

When using D3DRS_POINTSIZE make sure the value is at least
D3DRS_POINTSIZE_MIN but not greater than D3DRS_POINTSIZE_MAX.

Fixes some Wine tests.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
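
The clamping described above can be sketched as follows. This is a minimal illustration, not the actual nine code; the helper name and its parameters are hypothetical.

```c
/* Hypothetical helper: clamp the D3DRS_POINTSIZE value into the range
 * [min_size, max_size] given by D3DRS_POINTSIZE_MIN / D3DRS_POINTSIZE_MAX. */
static float clamp_point_size(float size, float min_size, float max_size)
{
    if (size < min_size)
        return min_size;
    if (size > max_size)
        return max_size;
    return size;
}
```

The lower bound is checked first, so the helper returns min_size if a caller passes min_size > max_size and a size below both.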
9 years agost/nine: Align texture memory
Patrick Rudolph [Tue, 12 May 2015 05:27:37 +0000 (07:27 +0200)]
st/nine: Align texture memory

Align texture memory on a 32-byte boundary to allow
SSE/AVX memcpy to work on locked rects.

This fixes some crashes with games using SSE.

Reviewed-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
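
A rough sketch of what a 32-byte-aligned allocation looks like, assuming a C11 libc that provides aligned_alloc(). The helper names are hypothetical and this is not Mesa's own allocation path.

```c
#include <stdint.h>
#include <stdlib.h>

#define TEX_ALIGNMENT 32  /* bytes; the boundary named in the commit message */

/* Round size up to the next multiple of TEX_ALIGNMENT, because C11
 * aligned_alloc() expects the size to be a multiple of the alignment. */
static size_t align_up(size_t size)
{
    return (size + TEX_ALIGNMENT - 1) & ~(size_t)(TEX_ALIGNMENT - 1);
}

/* Hypothetical allocator: the returned address is 32-byte aligned, so
 * aligned SSE/AVX loads and stores on the buffer start are valid.
 * Release the memory with free(). */
static void *alloc_texture_memory(size_t size)
{
    return aligned_alloc(TEX_ALIGNMENT, align_up(size));
}
```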
9 years agost/nine: Always set point_quad_rasterization to 1
Axel Davy [Sat, 16 May 2015 16:41:51 +0000 (18:41 +0200)]
st/nine: Always set point_quad_rasterization to 1

Both Points and Point Sprites are rasterized like quads,
according to d3d9 doc and gallium rasterizer doc.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
9 years agost/nine: Fix Swizzle for ATI2 format
Axel Davy [Sat, 16 May 2015 20:41:26 +0000 (22:41 +0200)]
st/nine: Fix Swizzle for ATI2 format

We had red and green in the wrong channels
for the ATI2 format (RGTC2).

Found thanks to wine tests.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
9 years agotarget/d3dadapter9: Return Windows like card names
Patrick Rudolph [Mon, 25 May 2015 08:36:21 +0000 (10:36 +0200)]
target/d3dadapter9: Return Windows like card names

Add support for multiple cards and fill in a Windows-like
card name, driver name and version info.
Use fallback for unknown vendors and unknown card names.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Patrick Rudolph <siro@das-labor.org>
9 years agost/nine: Require gcc >= 4.6
David Heidelberg [Fri, 10 Apr 2015 22:13:53 +0000 (00:13 +0200)]
st/nine: Require gcc >= 4.6

Nine code uses some C11 features, and this
leads to compile errors on gcc <= 4.5.

Another way would have been to use the
-fms-extensions CFLAG

Signed-off-by: David Heidelberg <david@ixit.cz>
Cc: "10.4 10.5 10.6" <mesa-stable@lists.freedesktop.org>
9 years agoglsl: fix error message when validating tcs output decls
Ilia Mirkin [Fri, 21 Aug 2015 19:08:15 +0000 (15:08 -0400)]
glsl: fix error message when validating tcs output decls

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agorelnote updates
Rob Clark [Mon, 10 Aug 2015 21:27:19 +0000 (17:27 -0400)]
relnote updates

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agost/mesa: pass through 4th opcode argument in bitmap/pixel visitors
Ilia Mirkin [Fri, 21 Aug 2015 00:06:50 +0000 (20:06 -0400)]
st/mesa: pass through 4th opcode argument in bitmap/pixel visitors

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agost/mesa: fix assignments with 4-operand arguments (i.e. BFI)
Ilia Mirkin [Thu, 20 Aug 2015 23:59:04 +0000 (19:59 -0400)]
st/mesa: fix assignments with 4-operand arguments (i.e. BFI)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agoi965: allow image_size on float images
Martin Peres [Fri, 21 Aug 2015 13:25:14 +0000 (16:25 +0300)]
i965: allow image_size on float images

This got missed because the piglit test only tested int images to avoid a
combinatorial explosion of formats, targets, stages and sizes, which
takes more than 5 minutes to test on nvidia's driver.

This patch also drops the IMAGE_FUNCTION_AVAIL_ATOMIC which is not applicable
to the image_size codepath but was not hurting in any way.

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agoclover: fix llvm 3.5 build error
Zoltan Gilian [Wed, 19 Aug 2015 09:56:08 +0000 (11:56 +0200)]
clover: fix llvm 3.5 build error

There is no MDOperand in llvm 3.5.

v2: Check if kernel metadata is present to avoid crash (EdB).
v3: Second attempt to avoid crash: switch off metadata query for llvm < 3.6.

Reviewed-by: Serge Martin (EdB) <edb+mesa@sigluy.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agomesa: update fbo state in glTexStorage
Tapani Pälli [Thu, 20 Aug 2015 07:25:59 +0000 (10:25 +0300)]
mesa: update fbo state in glTexStorage

We have to re-validate FBOs rendering to the texture, as is done
for TexImage and CopyTexImage.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91673
Cc: "10.6" <mesa-stable@lists.freedesktop.org>
9 years agovc4: Add algebraic opt for rcp(1.0).
Eric Anholt [Wed, 19 Aug 2015 05:19:12 +0000 (22:19 -0700)]
vc4: Add algebraic opt for rcp(1.0).

We're generating rcps as part of backend lowering of the packed coordinate
in the CS, and we don't want to lower them in NIR because of the extra
Newton-Raphson steps in the common case.  However, GLB2.7 is moving a
vertex attribute with a 1.0 W component to the position, and that makes us
produce some silly RCPs.

total instructions in shared programs: 97590 -> 97580 (-0.01%)
instructions in affected programs:     74 -> 64 (-13.51%)
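
The idea behind the peephole can be sketched as below. This is a toy model, not the vc4 QIR code; the operand struct and function name are made up for illustration.

```c
#include <stdbool.h>

/* Toy operand: either a known constant or an opaque runtime value. */
struct operand {
    bool is_const;
    float value;   /* only meaningful when is_const is true */
};

/* Returns true when rcp(src) can be replaced by a plain move of src.
 * That holds when src is the constant 1.0, since 1 / 1.0 == 1.0. */
static bool rcp_is_redundant(struct operand src)
{
    return src.is_const && src.value == 1.0f;
}
```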

9 years agovc4: Allow unpack_8[abcd]_f's src to stay in r4.
Eric Anholt [Wed, 19 Aug 2015 05:07:47 +0000 (22:07 -0700)]
vc4: Allow unpack_8[abcd]_f's src to stay in r4.

I had QPU emit code to do it, but forgot to flag the register class.

total instructions in shared programs: 97974 -> 97590 (-0.39%)
instructions in affected programs:     25291 -> 24907 (-1.52%)

9 years agovc4: Pack the unorm-packing bits into a src MUL instruction when possible.
Eric Anholt [Wed, 19 Aug 2015 04:26:05 +0000 (21:26 -0700)]
vc4: Pack the unorm-packing bits into a src MUL instruction when possible.

Now that we do non-SSA QIR instructions, we can take a NIR SSA src that's
only used by the unorm packing and just stuff the pack bits into it.

total instructions in shared programs: 98136 -> 97974 (-0.17%)
instructions in affected programs:     4149 -> 3987 (-3.90%)

9 years agovc4: Add a QIR helper for whether the op is a MUL type.
Eric Anholt [Wed, 19 Aug 2015 04:43:42 +0000 (21:43 -0700)]
vc4: Add a QIR helper for whether the op is a MUL type.

9 years agovc4: Drop an unused algebraic op.
Eric Anholt [Wed, 19 Aug 2015 03:18:51 +0000 (20:18 -0700)]
vc4: Drop an unused algebraic op.

NIR now handles this optimization for us.

9 years agovc4: Switch QPU_PACK_SCALED to be two non-SSA instructions.
Eric Anholt [Thu, 6 Aug 2015 03:54:02 +0000 (20:54 -0700)]
vc4: Switch QPU_PACK_SCALED to be two non-SSA instructions.

total instructions in shared programs: 98159 -> 98136 (-0.02%)
instructions in affected programs:     12279 -> 12256 (-0.19%)

9 years agovc4: Make the pack-to-unorm instructions be non-SSA.
Eric Anholt [Thu, 6 Aug 2015 03:31:21 +0000 (20:31 -0700)]
vc4: Make the pack-to-unorm instructions be non-SSA.

This helps ensure that the register allocator doesn't force the later pack
operations to insert extra MOVs.

total instructions in shared programs: 98170 -> 98159 (-0.01%)
instructions in affected programs:     2134 -> 2123 (-0.52%)

9 years agovc4: Allow QIR registers to be non-SSA.
Eric Anholt [Tue, 4 Aug 2015 02:25:47 +0000 (19:25 -0700)]
vc4: Allow QIR registers to be non-SSA.

Now that we have NIR, most of the optimization we still need to do is
peepholes on instruction selection rather than general dataflow
operations.  This means we want to be able to have QIR be a lot closer to
the actual QPU instructions, just with virtual registers.  Allowing
multiple instructions writing the same register opens up a lot of
possibilities.

9 years agovc4: We can now move TEX_RESULT accesses across other r4 ops.
Eric Anholt [Thu, 6 Aug 2015 03:11:07 +0000 (20:11 -0700)]
vc4: We can now move TEX_RESULT accesses across other r4 ops.

No difference on shader-db.

9 years agoglsl: fix binding validation for interface blocks
Timothy Arceri [Wed, 27 May 2015 10:12:42 +0000 (20:12 +1000)]
glsl: fix binding validation for interface blocks

V2: rebase on SSBO changes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoglsl: interleave constant propagation and folding
Timothy Arceri [Sun, 16 Aug 2015 04:26:23 +0000 (14:26 +1000)]
glsl: interleave constant propagation and folding

The constant folding pass can take a long time to complete
so rather than running through the entire pass each time
a new constant is propagated (and vice versa) interleave them.

This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1
go from around 2 min -> 23 sec.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agonv50/ir: pre-compute BFE arg when both bits and offset are imm
Ilia Mirkin [Fri, 21 Aug 2015 02:13:48 +0000 (22:13 -0400)]
nv50/ir: pre-compute BFE arg when both bits and offset are imm

Due to a quirk in how the nv50 opt passes run, the algebraic
optimization that looks for these BFE's happens before the constant
folding pass. Rearranging these passes isn't a great idea, but this is
easy enough to fix. Allows a following cvt to eliminate the bfe in
certain situations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
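
When both the offset and the bit count are immediates, the combined BFE argument can be computed once at compile time. The sketch below assumes, purely for illustration, that the argument carries the offset in bits 0..7 and the width in bits 8..15; the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed layout (illustration only): offset in bits 0..7,
 * width in bits 8..15 of the combined argument. */
static uint32_t pack_bfe_arg(uint32_t offset, uint32_t bits)
{
    return ((bits & 0xffu) << 8) | (offset & 0xffu);
}
```

With the argument available as a single immediate, later passes see a constant operand rather than a runtime insert instruction.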
9 years agoglsl: expose textureQueryLod in GLSL 4.00+ fragment shaders
Ilia Mirkin [Wed, 19 Aug 2015 22:43:47 +0000 (18:43 -0400)]
glsl: expose textureQueryLod in GLSL 4.00+ fragment shaders

See issue from the ARB_texture_query_lod spec for LOD vs Lod confusion:

    (3) The core specification uses the "Lod" spelling, not "LOD".  Should
        this extension be modified to use "Lod"?

      RESOLVED: The "Lod" spelling is the correct spelling for the core
      specification and the preferred spelling for use. However, use of
      "LOD" also exists, as the extension predated the core specification,
      so this extension won't remove use of "LOD".

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agoRevert "mesa/formats: refactor by collapsing cases in switch statement by type"
Nanley Chery [Fri, 21 Aug 2015 01:00:20 +0000 (18:00 -0700)]
Revert "mesa/formats: refactor by collapsing cases in switch statement by type"

This reverts commit ffe6c6ad5f719dedd1b6b95e8590e3f20b23d340.

_mesa_format_num_components() does not include the padding bits in mesa formats
containing 'X' channels. This could cause mipmap generation for certain
uncompressed formats to underestimate the number of channels in the source
image by 1.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agor600g: Fix handling of TGSI_OPCODE_ARR with SB
Glenn Kennard [Thu, 13 Aug 2015 18:30:07 +0000 (20:30 +0200)]
r600g: Fix handling of TGSI_OPCODE_ARR with SB

FLT_TO_INT goes in the vector pipes on evergreen/NI,
not the trans unit as on earlier chips.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600: Turn 'r600_shader_key' struct into union
Edward O'Callaghan [Wed, 19 Aug 2015 08:58:47 +0000 (18:58 +1000)]
r600: Turn 'r600_shader_key' struct into union

This struct was getting a bit crowded. Following the lead of
radeonsi, mirror the idea of having sub-structures for each
shader type. Turning 'r600_shader_key' into a union saves
some trivial memory and CPU cycles for the shader keys.

[airlied: drop as_ls, and reorder so larger fields at start.]
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
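
The shape of the change can be sketched like this. The field names are hypothetical and the real key has more members; only the struct-versus-union layout matters here.

```c
#include <stddef.h>

/* Before: one flat struct holding the fields of every shader type. */
struct flat_key {
    unsigned ps_color_two_side;
    unsigned ps_alpha_to_one;
    unsigned ps_nr_cbufs;
    unsigned vs_as_es;
};

/* After: one sub-structure per shader type, overlaid in a union.
 * The key is only as large as its largest member. */
union split_key {
    struct {
        unsigned color_two_side;
        unsigned alpha_to_one;
        unsigned nr_cbufs;
    } ps;
    struct {
        unsigned as_es;
    } vs;
};
```

Because only one shader type is relevant for any given key, overlaying the sub-structures loses no information while shrinking the data that has to be filled in and compared.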
9 years agor600: Rewrite r600_shader_selector_key() to use a switch stmt
Edward O'Callaghan [Wed, 19 Aug 2015 08:58:46 +0000 (18:58 +1000)]
r600: Rewrite r600_shader_selector_key() to use a switch stmt

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoi965: Use NIR by default for vertex shaders
Jason Ekstrand [Mon, 3 Aug 2015 15:17:42 +0000 (08:17 -0700)]
i965: Use NIR by default for vertex shaders

Shader-db results for vec4 on i965:

   total instructions in shared programs: 1499894 -> 1502261 (0.16%)
   instructions in affected programs:     1414224 -> 1416591 (0.17%)
   helped:                                2434
   HURT:                                  10543
   GAINED:                                1
   LOST:                                  0

Shader-db results for vec4 on g4x:

   total instructions in shared programs: 1437411 -> 1439779 (0.16%)
   instructions in affected programs:     1362402 -> 1364770 (0.17%)
   helped:                                2434
   HURT:                                  10544
   GAINED:                                0
   LOST:                                  0

Shader-db results for vec4 on Iron Lake:

   total instructions in shared programs: 1437214 -> 1439593 (0.17%)
   instructions in affected programs:     1362205 -> 1364584 (0.17%)
   helped:                                2433
   HURT:                                  10544
   GAINED:                                1
   LOST:                                  0

Shader-db results for vec4 on Sandy Bridge:

   total instructions in shared programs: 2022092 -> 1941570 (-3.98%)
   instructions in affected programs:     1886838 -> 1806316 (-4.27%)
   helped:                                7510
   HURT:                                  10737
   GAINED:                                0
   LOST:                                  0

Shader-db results for vec4 on Ivy Bridge:

   total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
   instructions in affected programs:     1686736 -> 1637947 (-2.89%)
   helped:                                6735
   HURT:                                  11101
   GAINED:                                0
   LOST:                                  0

Shader-db results for vec4 on Haswell:

   total instructions in shared programs: 1853749 -> 1804960 (-2.63%)
   instructions in affected programs:     1686736 -> 1637947 (-2.89%)
   helped:                                6735
   HURT:                                  11101
   GAINED:                                0
   LOST:                                  0

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
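
For reference, the percentages in the shader-db tables above are consistent with the relative change (after - before) / before. A small sketch with a hypothetical helper name:

```c
/* Relative change in percent; negative means fewer instructions. */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the Sandy Bridge totals, percent_change(2022092, 1941570) rounds to the -3.98% shown in the table.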
9 years agoglsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL
Kai Wasserbäch [Fri, 14 Aug 2015 12:49:43 +0000 (14:49 +0200)]
glsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL

Fixes a crash in Piglit's
spec@arb_shader_subroutine@linker@no-mutual-recursion.vert for me.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agonv50/ir: Handle OP_CVT when folding constant expressions
Tobias Klausmann [Sun, 11 Jan 2015 21:40:22 +0000 (22:40 +0100)]
nv50/ir: Handle OP_CVT when folding constant expressions

[imirkin: handle more type combinations, use macro]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: undo more shifts still by allowing a pre-SHL to occur
Ilia Mirkin [Wed, 19 Aug 2015 03:16:32 +0000 (23:16 -0400)]
nvc0/ir: undo more shifts still by allowing a pre-SHL to occur

This happens with unpackSnorm lowering. There's yet another
bitfield-extract behind it, but there's too much variation to be worth
cutting through.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: don't require AND when the high byte is being addressed
Ilia Mirkin [Wed, 19 Aug 2015 02:53:11 +0000 (22:53 -0400)]
nvc0/ir: don't require AND when the high byte is being addressed

unpackUnorm* lowering doesn't AND the high byte/word as it's
unnecessary. Detect that situation as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
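
Why the AND is unnecessary for the high byte can be seen in plain C. This is an illustration of the arithmetic only, not the nvc0 pattern matcher; both function names are hypothetical.

```c
#include <stdint.h>

/* Generic form: shift the wanted byte down, then mask off the rest. */
static uint32_t unpack_byte_masked(uint32_t x, unsigned byte_index)
{
    return (x >> (8 * byte_index)) & 0xffu;
}

/* Top byte: after shifting a 32-bit value right by 24, only 8 bits
 * remain, so the mask would be a no-op. */
static uint32_t unpack_high_byte(uint32_t x)
{
    return x >> 24;
}
```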
9 years agonvc0/ir: detect i2f/i2i which operate on specific bytes/words
Ilia Mirkin [Wed, 19 Aug 2015 01:09:12 +0000 (21:09 -0400)]
nvc0/ir: detect i2f/i2i which operate on specific bytes/words

Some Unigine shaders have been observed to unpack bytes out of 32-bit
integers and convert them to floats. I2F/I2I can handle this sort of
thing directly. Detect the handleable situations.

This misses 16-bit word capabilities in nv50, but I haven't seen shaders
that would actually make use of that.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
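
The shader pattern in question looks roughly like the following when written in C. The function name is hypothetical; the point is only that the shift, mask and conversion together select one byte and convert it.

```c
#include <stdint.h>

/* Unpack byte 'byte_index' (0 = lowest) from a 32-bit integer and
 * convert it to float, as a shader might do by hand. */
static float byte_to_float(uint32_t packed, unsigned byte_index)
{
    return (float)((packed >> (8 * byte_index)) & 0xffu);
}
```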
9 years agonvc0/ir: detect AND/SHR pairs and convert into EXTBF
Ilia Mirkin [Wed, 19 Aug 2015 01:07:33 +0000 (21:07 -0400)]
nvc0/ir: detect AND/SHR pairs and convert into EXTBF

Some shaders appear to extract bits using shift/and combos. Detect
(some) of those and convert to EXTBF instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
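
The equivalence the pass relies on can be checked in plain C. This is a sketch of the arithmetic, not the nvc0 code; both helper names are hypothetical.

```c
#include <stdint.h>

/* Reference bitfield extract: 'width' bits of x starting at bit 'offset'.
 * Assumes 0 < width < 32 and offset + width <= 32. */
static uint32_t extract_bits(uint32_t x, unsigned offset, unsigned width)
{
    return (x >> offset) & ((1u << width) - 1u);
}

/* The shift/and combo a shader might write by hand for bits 8..15. */
static uint32_t shift_and_mask(uint32_t x)
{
    return (x >> 8) & 0xffu;
}
```

Any shift followed by a mask of contiguous low bits is a bitfield extract with the shift amount as the offset and the mask's bit count as the width.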
9 years agonv50/ir: support different unordered_set implementations
Chih-Wei Huang [Fri, 19 Jun 2015 18:00:15 +0000 (02:00 +0800)]
nv50/ir: support different unordered_set implementations

If built with the C++11 standard, use std::unordered_set.

Otherwise, if built on an old Android version with stlport,
use std::tr1::unordered_set with a wrapper class.

Otherwise use std::tr1::unordered_set.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agoi965: Fix "handle nir_intrinsic_image_size"
Martin Peres [Thu, 20 Aug 2015 12:15:56 +0000 (15:15 +0300)]
i965: Fix "handle nir_intrinsic_image_size"

I pushed a half-baked version of "i965: handle nir_intrinsic_image_size" by
accident. Not having the Reviewed-by: tags on the last two commits should
have been a red flag but I somehow missed it after the QA check.

This patch should fix image-size for non-int images. I will add support to
the piglit test for all the other image types.

Sorry for the noise.

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agoi965: enable GL_ARB_shader_image_size
Martin Peres [Wed, 29 Apr 2015 09:42:16 +0000 (12:42 +0300)]
i965: enable GL_ARB_shader_image_size

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
9 years agoi965: handle nir_intrinsic_image_size
Martin Peres [Wed, 29 Apr 2015 09:39:16 +0000 (12:39 +0300)]
i965: handle nir_intrinsic_image_size

v2, Review from Francisco Jerez:
- avoid the camelCase for the booleans
- init the booleans using the sampler type
- force the initialization of all the components of the output register

v3:
- Rename a variable from CubeMapArray to CubeArray to re-use GLSL's name (Ilia)
- Fix some indentation and drop parentheses (Topi)
- Fix a signed/unsigned comparison warning

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
9 years agonir: convert the glsl intrinsic image_size to nir_intrinsic_image_size
Martin Peres [Tue, 11 Aug 2015 14:42:12 +0000 (17:42 +0300)]
nir: convert the glsl intrinsic image_size to nir_intrinsic_image_size

v2, review from Francisco Jerez:
 - make the destination variable as large as what the nir intrinsic
   defines (4) instead of the size of the return variable of glsl. This
   is still safe for the already existing code because all the intrinsics
   affected returned the same number of components as expected by glsl IR.
   In the case of image_size, it is not possible to do so because the
   returned number of components depends on the image type and this case
   is not well handled by nir.

v3:
- Style fix

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agoglsl: add support for the imageSize builtin
Martin Peres [Mon, 27 Apr 2015 16:25:34 +0000 (19:25 +0300)]
glsl: add support for the imageSize builtin

The code is heavily inspired by Francisco Jerez's code supporting the
image_load_store extension.

Backends willing to support this builtin should handle
__intrinsic_image_size.

v2: Based on the review of Ilia Mirkin
- Enable the extension for GLES 3.1
- Fix indentation
- Fix the return type (float to int, number of components for CubeImages)
- Add a warning related to GLES 3.1

v3: Based on the review of Francisco Jerez
- Refactor the code to share both add_image_function and _image with the other
  image-related functions

v4: Based on Topi Pohjolainen's comments
- Do not add parentheses for the return value

v5: based on Francisco Jerez's comments:
- Fix a few indent issues
- Reduce the size of a condition by testing the dimension and array properties
  instead of enumerating all the formats.

Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agomain: add extension GL_ARB_shader_image_size
Martin Peres [Mon, 27 Apr 2015 17:05:14 +0000 (20:05 +0300)]
main: add extension GL_ARB_shader_image_size

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
9 years agodocs: Mark GLES 3.1 image load/store as done on i965.
Francisco Jerez [Thu, 20 Aug 2015 10:46:53 +0000 (13:46 +0300)]
docs: Mark GLES 3.1 image load/store as done on i965.

9 years agomesa: Add ES31 API tag for the extension table.
Francisco Jerez [Wed, 19 Aug 2015 11:42:50 +0000 (14:42 +0300)]
mesa: Add ES31 API tag for the extension table.

I'll mark the OES_shader_image_atomic extension entry with this tag to
make sure that we don't expose it on earlier GLES API versions
accidentally, because according to the extension:

 "OpenGL ES 3.1 and GLSL ES 3.10 are required."

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Parse the allowed image format qualifiers in GLSL ES 3.1.
Francisco Jerez [Sun, 16 Aug 2015 22:47:50 +0000 (01:47 +0300)]
glsl: Parse the allowed image format qualifiers in GLSL ES 3.1.

This includes the minimum required desktop/ES GLSL version in the
format qualifier table in anticipation of new GLSL versions extending
the set of supported image formats.  According to section 4.4.7 of the
GLSL ES 3.1 spec:

"The format layout qualifier identifiers for image variable
 declarations are:
 [...]
 rgba32f
 rgba16f
 r32f
 rgba8
 rgba8_snorm
 [...]
 rgba32i
 rgba16i
 rgba8i
 r32i
 [...]
 rgba32ui
 rgba16ui
 rgba8ui
 r32ui"

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Recognise image memory qualifiers in GLSL ES 3.1.
Francisco Jerez [Mon, 17 Aug 2015 16:12:00 +0000 (19:12 +0300)]
glsl: Recognise image memory qualifiers in GLSL ES 3.1.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Define image-related built-in constants required by GLSL ES 3.1.
Francisco Jerez [Mon, 17 Aug 2015 14:42:30 +0000 (17:42 +0300)]
glsl: Define image-related built-in constants required by GLSL ES 3.1.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Remove duplicate definition of gl_MaxTess*ImageUniforms built-in constants.
Francisco Jerez [Sun, 16 Aug 2015 22:39:38 +0000 (01:39 +0300)]
glsl: Remove duplicate definition of gl_MaxTess*ImageUniforms built-in constants.

These seem to have been re-added at some point during the
ARB_tessellation_shader implementation work.  AFAICT the second
(correct) definition of each constant would have had no effect because
the symbols were already defined.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Accept atomic_uint type in GLSL ES 3.1.
Francisco Jerez [Sun, 16 Aug 2015 22:38:00 +0000 (01:38 +0300)]
glsl: Accept atomic_uint type in GLSL ES 3.1.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Accept supported image types in GLSL ES 3.1.
Francisco Jerez [Sun, 16 Aug 2015 22:37:12 +0000 (01:37 +0300)]
glsl: Accept supported image types in GLSL ES 3.1.

These are a subset of the image types supported by desktop GL,
excluding 1D, 1D array, rectangle, buffer, cube array, 2D MS and 2D
MS array texture targets.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Expose image load and store built-ins in GLSL ES 3.1.
Francisco Jerez [Sun, 16 Aug 2015 22:34:41 +0000 (01:34 +0300)]
glsl: Expose image load and store built-ins in GLSL ES 3.1.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Use a separate availability class for image atomic built-ins.
Francisco Jerez [Sun, 16 Aug 2015 22:34:13 +0000 (01:34 +0300)]
glsl: Use a separate availability class for image atomic built-ins.

These are not part of unextended GLSL ES 3.1.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Allow precision qualifiers on general opaque types.
Francisco Jerez [Sun, 16 Aug 2015 22:28:57 +0000 (01:28 +0300)]
glsl: Allow precision qualifiers on general opaque types.

From the GLSL ES 3.1 spec, section 4.7.3:
 "Any floating point, integer, opaque type declaration can have the
  type preceded by one of these precision qualifiers: [...] highp
  [...], mediump [...], lowp [...]."

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Implement GLSL ES restriction on images being either readonly or writeonly.
Francisco Jerez [Sun, 16 Aug 2015 22:27:43 +0000 (01:27 +0300)]
glsl: Implement GLSL ES restriction on images being either readonly or writeonly.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agoglsl: Require that all image uniforms have a format qualifier in GLSL ES.
Francisco Jerez [Sun, 16 Aug 2015 22:26:40 +0000 (01:26 +0300)]
glsl: Require that all image uniforms have a format qualifier in GLSL ES.

Note that this is slightly more permissive than the spec language
requires: "Any image variable must specify a format layout qualifier."

The GLSL ES spec seems really sketchy regarding format layout
qualifiers on function formal parameters -- On the one hand they are
required, but on the other hand it doesn't provide any syntax to
specify them (see section 6.1.1), they don't participate in parameter
type matching for overload resolution, and are in fact explicitly
forbidden ("Layout qualifiers cannot be used on formal function
parameters").  Of course none of the image built-in functions defined
by the spec specify format layout qualifiers (and they probably
couldn't sensibly), to contradict its own requirement.

This probably qualifies for a spec bug, but in the meantime do the
sensible thing and require layout qualifiers on uniforms *only*.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agoglsl: Add support for image binding qualifiers.
Francisco Jerez [Sun, 16 Aug 2015 22:25:11 +0000 (01:25 +0300)]
glsl: Add support for image binding qualifiers.

Support for binding an image to an image unit explicitly in the shader
source is required by both GLSL 4.2 and GLSL ES 3.1, but not by the
original ARB_shader_image_load_store extension.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agoglsl: Forbid non-constant image array indexing in GLSL ES 3.1.
Francisco Jerez [Sun, 16 Aug 2015 22:21:01 +0000 (01:21 +0300)]
glsl: Forbid non-constant image array indexing in GLSL ES 3.1.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agomesa: Refuse to bind image uniforms using glUniform in GLES.
Francisco Jerez [Sun, 16 Aug 2015 23:05:43 +0000 (02:05 +0300)]
mesa: Refuse to bind image uniforms using glUniform in GLES.

The GLES 3.1 spec removed support for updating the image unit bound to
an image uniform using glUniform1i() calls.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agomesa: Refuse to bind a mutable texture object to an image unit in GLES.
Francisco Jerez [Sun, 16 Aug 2015 23:02:17 +0000 (02:02 +0300)]
mesa: Refuse to bind a mutable texture object to an image unit in GLES.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agomesa: Initialize image unit state to different defaults in GLES.
Francisco Jerez [Sun, 16 Aug 2015 23:01:40 +0000 (02:01 +0300)]
mesa: Initialize image unit state to different defaults in GLES.

There is no GL_R8 image format in GLES, according to the state table
20.32 of the GLES 3.1 spec the default value should be GL_R32UI.  The
ES31-CTS.shader_image_load_store.basic-api-bind Khronos conformance
test checks that this is the case.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
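
A minimal sketch of the API-dependent default, assuming a boolean that says whether the context is GLES. The function name is hypothetical; the enum values are the ones from the Khronos headers.

```c
#include <stdbool.h>

/* Enum values as defined in the Khronos GL headers. */
#define GL_R8    0x8229
#define GL_R32UI 0x8236

/* Hypothetical helper: pick the initial image unit format.
 * GLES has no GL_R8 image format, so it starts out as GL_R32UI. */
static unsigned default_image_format(bool is_gles)
{
    return is_gles ? GL_R32UI : GL_R8;
}
```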
9 years agomesa: Reset image unit state to the default values when a bound image is deleted.
Francisco Jerez [Sun, 16 Aug 2015 23:00:48 +0000 (02:00 +0300)]
mesa: Reset image unit state to the default values when a bound image is deleted.

The ES31-CTS.shader_image_load_store.basic-api-bind conformance test
expects the whole image unit state to be reset when the bound texture
object is deleted.  The ARB_shader_image_load_store extension is
rather vague regarding what should happen with image unit state other
than the texture object in that case, but the GL 4.2 and GLES 3.1
specifications (section "Automatic Unbinding of Deleted Objects")
explicitly require it to be reset to the default values.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agomesa: Reject image formats not supported by GLES.
Francisco Jerez [Sun, 16 Aug 2015 22:58:53 +0000 (01:58 +0300)]
mesa: Reject image formats not supported by GLES.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agomesa: Don't lose track of the shader image layer originally specified by the user.
Francisco Jerez [Sun, 16 Aug 2015 22:53:48 +0000 (01:53 +0300)]
mesa: Don't lose track of the shader image layer originally specified by the user.

The spec requires that all layers of the image starting from the 0-th
are bound to the image unit regardless of the Layer parameter when
Layered is true, so I was setting gl_image_unit::Layer to zero in that
case for the convenience of the driver back-end.  However the
ES31-CTS.shader_image_load_store.basic-api-bind conformance test
checks that the layer value returned by glGetInteger is the same one that
was originally specified, regardless of the value of layered.  Rename
Layer to _Layer as is usual for other derived state and keep track of
the original layer value as gl_image_unit::Layer.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
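
The split between user-specified and derived state can be sketched as follows. The struct is a simplified stand-in for gl_image_unit and the function name is hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-in for gl_image_unit; only the fields discussed above. */
struct image_unit {
    bool layered;
    int layer;    /* value the application passed in, reported by queries */
    int _layer;   /* derived: first layer the driver back-end should bind */
};

/* Keep the original layer for queries and derive the effective one:
 * a layered binding always starts at layer 0. */
static void bind_image_layer(struct image_unit *u, bool layered, int layer)
{
    u->layered = layered;
    u->layer = layer;
    u->_layer = layered ? 0 : layer;
}
```

Queries read the layer field, so they return what the application set, while the back-end reads _layer and does not need to re-derive it.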
9 years agomesa: Rename MaxCombinedImageUnitsAndFragmentOutputs to MaxCombinedShaderOutputResources.
Francisco Jerez [Mon, 17 Aug 2015 16:10:46 +0000 (19:10 +0300)]
mesa: Rename MaxCombinedImageUnitsAndFragmentOutputs to MaxCombinedShaderOutputResources.

The name of both the GLSL built-in variable and the glGetInteger param
with the same value changed in GLSL ES 3.1 and GL 4.5.  Its semantics
also changed slightly, since the limit now also takes into account the
number of SSBs in use.  Switch our internal data structures to the
up-to-date name.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoGL: update glext to svn 31811
Dave Airlie [Sat, 15 Aug 2015 21:37:37 +0000 (07:37 +1000)]
GL: update glext to svn 31811

This brings in the new ARB extensions.

Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agonir: Use nir_builder in nir_lower_io's get_io_offset().
Kenneth Graunke [Wed, 12 Aug 2015 18:26:34 +0000 (11:26 -0700)]
nir: Use nir_builder in nir_lower_io's get_io_offset().

Much more readable.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agonir: Pull nir_lower_io's load_op selection into a helper function.
Kenneth Graunke [Wed, 12 Aug 2015 17:57:31 +0000 (10:57 -0700)]
nir: Pull nir_lower_io's load_op selection into a helper function.

Makes the function a bit smaller.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agomesa/formats: refactor by collapsing cases in switch statement by type
Nanley Chery [Tue, 11 Aug 2015 18:56:35 +0000 (11:56 -0700)]
mesa/formats: refactor by collapsing cases in switch statement by type

Combine the adjacent cases which have the same GL type in the switch statement.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/formats: add more MESA_FORMAT_LAYOUTs
Nanley Chery [Fri, 7 Aug 2015 21:36:23 +0000 (14:36 -0700)]
mesa/formats: add more MESA_FORMAT_LAYOUTs

Add the classes of compressed formats as layouts. This allows the detection
of compressed formats belonging to a certain category of compressed formats.

v2. simplify layout name construction (Ilia).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoglsl: Fix up GL_ARB_compute_shader for GLSL ES 3.1
Marta Lofstedt [Mon, 10 Aug 2015 11:04:42 +0000 (13:04 +0200)]
glsl: Fix up GL_ARB_compute_shader for GLSL ES 3.1

GL_ARB_compute_shader is limited to GLSL version 430.
This enables it for GLSL ES version 310 as well.

V2: Updated error string to also include GLSL 3.10

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agomesa/main: Add GL_IMAGE_FORMAT_COMPATIBILITY_TYPE to glGetTexParameterfv
Marta Lofstedt [Wed, 12 Aug 2015 09:57:39 +0000 (11:57 +0200)]
mesa/main: Add GL_IMAGE_FORMAT_COMPATIBILITY_TYPE to glGetTexParameterfv

According to the OpenGL ES 3.1 specification, section 8.10.2,
GL_IMAGE_FORMAT_COMPATIBILITY_TYPE should be supported by
glGetTexParameterfv.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agoradeonsi: fix a typo as_es -> as_ls in a string
Marek Olšák [Tue, 18 Aug 2015 22:56:33 +0000 (00:56 +0200)]
radeonsi: fix a typo as_es -> as_ls in a string

Trivial.

9 years agowinsys/amdgpu: fix the type of memory usage counters
Marek Olšák [Mon, 17 Aug 2015 17:55:57 +0000 (19:55 +0200)]
winsys/amdgpu: fix the type of memory usage counters

If the 32-bit types overflowed, the driver could submit an IB that uses much
more memory than is available.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
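
The failure mode can be illustrated with a small sketch. The helper names below are hypothetical and this is not the winsys code; it only shows how a 32-bit running total wraps around and makes an oversized submission look as if it fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum buffer sizes with a 32-bit counter: wraps modulo 2^32. */
static uint32_t total_usage_32(const uint64_t *sizes, size_t count)
{
   uint32_t total = 0;

   assert(sizes != NULL || count == 0);
   for (size_t i = 0; i < count; i++)
      total += (uint32_t)sizes[i];
   return total;
}

/* Same sum with a 64-bit counter: no wrap for realistic sizes. */
static uint64_t total_usage_64(const uint64_t *sizes, size_t count)
{
   uint64_t total = 0;

   assert(sizes != NULL || count == 0);
   for (size_t i = 0; i < count; i++)
      total += sizes[i];
   return total;
}

/* Hypothetical admission check made before submitting work. */
static int fits_in_memory(uint64_t used, uint64_t available)
{
   return used <= available;
}
```

For example, buffers of 3 GiB and 2 GiB sum to 1 GiB in the 32-bit counter, so a check against 4 GiB of available memory passes even though 5 GiB is actually required.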
9 years agoradeonsi: fix indirect indexing of MSAA textures
Marek Olšák [Sat, 15 Aug 2015 09:51:48 +0000 (11:51 +0200)]
radeonsi: fix indirect indexing of MSAA textures

FMASK wasn't handled correctly.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agost/mesa: add fake ARB_copy_image support in Gallium
Ilia Mirkin [Mon, 20 Jul 2015 19:19:53 +0000 (15:19 -0400)]
st/mesa: add fake ARB_copy_image support in Gallium

This support should be removed in favor of something that actually works
in all the weird cases. However this is simple and is enough to allow
Bioshock Infinite to render properly on nvc0.

Since the functionality is not implemented correctly, the extension will
not appear in the extension string and mesa will still return
INVALID_OPERATION for any glCopyImageSubData calls. In order to make use
of this functionality, run with
MESA_EXTENSION_OVERRIDE=GL_ARB_copy_image

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agoglsl: enable textureSize and texelFetch on GLSL ES 3.10 with MS samplers
Tapani Pälli [Mon, 17 Aug 2015 09:11:03 +0000 (12:11 +0300)]
glsl: enable textureSize and texelFetch on GLSL ES 3.10 with MS samplers

Patch separates array samplers from the texture_multisample check so that we
can enable only [iu]sampler2DMS; [iu]sampler2DMSArray is not supported.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agomesa: validate size parameters for glTexStorage*Multisample
Tapani Pälli [Mon, 17 Aug 2015 07:14:35 +0000 (10:14 +0300)]
mesa: validate size parameters for glTexStorage*Multisample

v2: code cleanup
v3: check only dimensions, samples is checked separately later

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agomesa: expose dimension check for glTex*Storage functions
Tapani Pälli [Mon, 10 Aug 2015 07:50:06 +0000 (10:50 +0300)]
mesa: expose dimension check for glTex*Storage functions

This is done so that following patch can use it to verify dimensions
for multisample variants of glTex*Storage.

v2: move function to header, use bool instead of GLboolean
v3: small changes, cleanup

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agoutil/ra: (trivial) fix c99 loop variable initialization
Roland Scheidegger [Wed, 19 Aug 2015 02:17:49 +0000 (04:17 +0200)]
util/ra: (trivial) fix c99 loop variable initialization

Fails with old msvc otherwise.

9 years agoutil: (trivial) include c99_math.h in rounding.h
Roland Scheidegger [Wed, 19 Aug 2015 02:17:36 +0000 (04:17 +0200)]
util: (trivial) include c99_math.h in rounding.h

Needed for rint/rintf.

9 years agoi965/bdw: Fix setting the instancing state for the SGVS element
Neil Roberts [Mon, 13 Jul 2015 17:01:13 +0000 (18:01 +0100)]
i965/bdw: Fix setting the instancing state for the SGVS element

When gl_VertexID or gl_InstanceID is used a 3DSTATE_VF_SGVS
instruction is sent to create a sort of element to store the generated
values. The last instruction in this chunk of code looks like it was
trying to set the instancing state for the element using the
3DSTATE_VF_INSTANCING instruction. However it was sending
brw->vb.nr_buffers instead of the element index. This instruction is
supposed to take an element index and that is how it is used further
down in the function so the previous code looks wrong. Perhaps
previously the number of buffers coincidentally matched the number of
enabled elements so the value was generally correct anyway. In a
subsequent patch I want to change a bit how it chooses the SGVS
element index so this needs to be fixed.

v2 [by Ben]
Remove 10.5 stable tag (it's too late now)
Commit update as follows:
The number of vertex buffers emitted is always <= the number of vertex elements.
To maximize reuse (actually, to minimize relocations - according to the code
comments), a vertex buffer is only emitted once, even when we set up multiple
components (3DSTATE_VERTEX_ELEMENT) from that buffer. This meant that the
previous code would use the wrong indexed element for these reuse cases. This
patch by itself prevents hangs on BSW in the linked bug. It doesn't make the
test pass, the remaining patches are needed for that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91610
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
9 years agoutil/ra: Make allocating conflict lists optional
Jason Ekstrand [Sat, 15 Aug 2015 16:58:32 +0000 (09:58 -0700)]
util/ra: Make allocating conflict lists optional

Since i965 is now using make_reg_conflicts_transitive and doesn't need
q-value computations, they are disabled on i965.  They are enabled
everywhere else so that they get the old behavior.  This reduces the time
spent in eglInitialize() on BDW by around 10-15%.

Reviewed-by: Eric Anholt <eric@anholt.net>
9 years agoi965/reg_allocate: Use make_reg_conflicts_transitive
Jason Ekstrand [Sat, 15 Aug 2015 16:50:11 +0000 (09:50 -0700)]
i965/reg_allocate: Use make_reg_conflicts_transitive

Instead of adding transitive conflicts as we go, we now add regular
conflicts and then make them all transitive at the end.  This reduces
screen creation time substantially on BDW.  The time spent in eglInitialize
is reduced from 27.78 ms/call to 9.92 ms/call in debug mode and from 13.15
ms/call to 4.54 ms/call in release mode (about 65% in either case).

Reviewed-by: Eric Anholt <eric@anholt.net>
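
The two-phase approach described above (record plain conflicts first, then make them transitive in one pass) can be sketched as follows. This is an assumption-laden illustration, not the util/ra implementation: `NUM_REGS`, `add_conflict()` and `make_conflicts_transitive()` are hypothetical, and the sketch uses one 32-bit mask per register instead of a real bitset.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8   /* hypothetical register count, one bit per register */

/* conflicts[r] has bit c set when register r conflicts with register c. */
static void add_conflict(uint32_t *conflicts, unsigned a, unsigned b)
{
   assert(a < NUM_REGS && b < NUM_REGS);
   conflicts[a] |= 1u << b;
   conflicts[b] |= 1u << a;
}

/* Every register that conflicts with r inherits all of r's conflicts. */
static void make_conflicts_transitive(uint32_t *conflicts, unsigned r)
{
   assert(r < NUM_REGS);
   for (unsigned c = 0; c < NUM_REGS; c++) {
      if (conflicts[r] & (1u << c))
         conflicts[c] |= conflicts[r];
   }
}
```

If register 0 overlaps registers 1 and 2, adding only the conflicts (0,1) and (0,2) and then calling `make_conflicts_transitive(conflicts, 0)` also marks 1 and 2 as conflicting with each other, without the caller having to add that pair while building the set.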
9 years agoutil/ra: Add a function for making all conflicts on a register transitive
Jason Ekstrand [Sat, 15 Aug 2015 16:43:05 +0000 (09:43 -0700)]
util/ra: Add a function for making all conflicts on a register transitive

Reviewed-by: Eric Anholt <eric@anholt.net>
9 years agoutil/bitset: Add a BITSET_FOREACH_SET macro
Jason Ekstrand [Sat, 15 Aug 2015 16:30:40 +0000 (09:30 -0700)]
util/bitset: Add a BITSET_FOREACH_SET macro

Reviewed-by: Eric Anholt <eric@anholt.net>
9 years agomesa: Move varying slots and FS output names to shader_enums.h
Eric Anholt [Tue, 4 Aug 2015 17:43:58 +0000 (10:43 -0700)]
mesa: Move varying slots and FS output names to shader_enums.h

They're used by glsl_to_nir.cpp, and I want to use them in TGSI-to-NIR as
well (our use of the var->index slot to store slot properties no longer
works since it got truncated).

The *_MAX defines are left in mtypes.h, because they depend on config.h.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agomesa: undo split out of create shader code
Timothy Arceri [Thu, 13 Aug 2015 13:26:01 +0000 (23:26 +1000)]
mesa: undo split out of create shader code

This code was split out into a separate function to be used also
by GL_EXT_separate_shader_objects which has since been removed from
Mesa, so move it back.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agofreedreno: use fd_pipe_wait_timeout()
Rob Clark [Tue, 18 Aug 2015 19:07:02 +0000 (15:07 -0400)]
freedreno: use fd_pipe_wait_timeout()

To properly support the case of waiting on a fence with a 0 timeout, we
still need to call down to the kernel, which requires the use of the
new fd_pipe_wait_timeout() API.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno: fence fix
Rob Clark [Sun, 16 Aug 2015 23:18:22 +0000 (19:18 -0400)]
freedreno: fence fix

Don't take current timestamp/fence from current ring, as we might have
already rolled over to new rb.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agoAdd mesa.icd to the .gitignore
Neil Roberts [Mon, 10 Aug 2015 16:31:02 +0000 (17:31 +0100)]
Add mesa.icd to the .gitignore

Since 4d7e0fa8c731776 this file is generated by the configure script.
Reviewed-by: Tapani Palli <tapani.palli@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agodrirc: Add "Unigine Oil Rush" quirk (allow_glsl_extension_directive_midshader).
Richard Yao [Wed, 12 Aug 2015 16:48:22 +0000 (12:48 -0400)]
drirc: Add "Unigine Oil Rush" quirk (allow_glsl_extension_directive_midshader).

Appears to fix shader compilation. Tested by starting the client and observing
that the screen was correct after the trailers ran, whereas previously it was
blank. Play tested on amd64.

This was suggested by "Kuuchan" on the Steam forums:

https://steamcommunity.com/app/200390/discussions/0/540731690861139279/?insideModal=1#c594820656479479870

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
9 years agonir: Simplify feq(fneg(a), a) -> feq(a, 0.0)
Thomas Helland [Thu, 6 Aug 2015 11:36:05 +0000 (13:36 +0200)]
nir: Simplify feq(fneg(a), a) -> feq(a, 0.0)

A float and its negation can only be equal to each other
if the value is 0.0f or -0.0f.
This is safe for NaN and Inf, as -NaN != NaN and -Inf != Inf.
This gives no changes in my shader-db.

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agonir: Simplify fne(fneg(a), a) -> fne(a, 0.0)
Thomas Helland [Thu, 6 Aug 2015 11:36:04 +0000 (13:36 +0200)]
nir: Simplify fne(fneg(a), a) -> fne(a, 0.0)

-NaN != NaN, and -Inf != Inf, so this should be safe.
Found while working on my VRP pass.

Shader-db results on my IVB:
total instructions in shared programs: 1698267 -> 1698067 (-0.01%)
instructions in affected programs:     15785 -> 15585 (-1.27%)
helped:                                36
HURT:                                  0
GAINED:                                0
LOST:                                  0

Some shaders were found to have the following pattern in NIR:
vec1 ssa_26 = fneg ssa_21
vec1 ssa_27 = fne ssa_21, ssa_26

Make that:
vec1 ssa_27 = fne ssa_21, 0.0f

This is found in Dota2 and Brutal Legend.
One shader is cut by 8%, from 323 -> 296 instructions in SIMD8.

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
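
The safety argument for this rewrite, and for the matching feq rewrite, can be checked directly with ordinary IEEE 754 float comparisons. The sketch below is not NIR code; the function names are made up for illustration, and it only samples a handful of values (both zeros, finite numbers, both infinities and NaN) rather than proving the identity exhaustively.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Original and simplified forms of the two comparisons. */
static bool feq_original(float a)   { return -a == a; }
static bool feq_simplified(float a) { return a == 0.0f; }
static bool fne_original(float a)   { return -a != a; }
static bool fne_simplified(float a) { return a != 0.0f; }

/* Assert that both rewrites agree on zeros, finite values, Inf and NaN. */
static void check_rewrites(void)
{
   const float samples[] = {
      0.0f, -0.0f, 1.0f, -1.0f, 0.5f, INFINITY, -INFINITY, NAN
   };

   for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
      float a = samples[i];

      assert(feq_original(a) == feq_simplified(a));
      assert(fne_original(a) == fne_simplified(a));
   }
}
```

The zeros are the only inputs for which the equality holds; NaN compares unequal to everything, including its own negation, so both forms of fne report true for it.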
9 years agoi965/gen7: Resolve GCC sign-compare warning.
Rhys Kidd [Thu, 6 Aug 2015 06:34:17 +0000 (16:34 +1000)]
i965/gen7: Resolve GCC sign-compare warning.

mesa/src/mesa/drivers/dri/i965/gen7_sol_state.c: In function 'gen7_upload_3dstate_so_decl_list':
mesa/src/mesa/drivers/dri/i965/gen7_sol_state.c:119:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (int i = 0; i < linked_xfb_info->NumOutputs; i++) {
                      ^

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
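
For context, the warning exists because comparing an int with an unsigned value converts the int to unsigned first. The sketch below spells out that conversion and shows one common way to silence the warning, namely giving the loop index the same unsigned type as its bound. Whether the commit used exactly this fix is not stated in the message, and `mixed_compare()` and `sum_first()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* What the compiler effectively evaluates for `i < n` when i is int and
 * n is unsigned: i is converted to unsigned before the comparison. */
static int mixed_compare(int i, unsigned n)
{
   return (unsigned)i < n;
}

/* Loop whose index has the same unsigned type as its bound, so no
 * signed/unsigned comparison takes place. */
static unsigned sum_first(const unsigned *values, unsigned count)
{
   unsigned total = 0;

   assert(values != NULL || count == 0);
   for (unsigned i = 0; i < count; i++)
      total += values[i];
   return total;
}
```

With a negative index the mixed comparison yields false, because -1 converts to the largest unsigned value; this is the kind of surprise the warning is meant to flag.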
9 years agoi965/gen6: Resolve GCC sign-compare warning.
Rhys Kidd [Thu, 6 Aug 2015 06:34:16 +0000 (16:34 +1000)]
i965/gen6: Resolve GCC sign-compare warning.

mesa/src/mesa/drivers/dri/i965/gen6_vs_state.c: In function 'gen6_upload_push_constants':
mesa/src/mesa/drivers/dri/i965/gen6_vs_state.c:85:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       for (i = 0; i < prog_data->nr_params; i++) {
                     ^
mesa/src/mesa/drivers/dri/i965/gen6_vs_state.c:92:17: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (i = 0; i < prog_data->nr_params; i++) {
                 ^

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>