platform/upstream/mesa.git
12 years agoRemove _mesa_inv_sqrtf in favor of 1/SQRTF
Matt Turner [Fri, 20 Jul 2012 17:06:35 +0000 (10:06 -0700)]
Remove _mesa_inv_sqrtf in favor of 1/SQRTF

Except for a couple of explicit uses, _mesa_inv_sqrtf was disabled since
its addition in 2003 (see f9b1e524).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoRemove _mesa_sqrt* in favor of plain sqrt
Matt Turner [Fri, 20 Jul 2012 16:55:47 +0000 (09:55 -0700)]
Remove _mesa_sqrt* in favor of plain sqrt

Temporarily disabled since 2003 (see 386578c5b).

This saves us from calling sqrt() 128 times to generate the sqrttab in
one_time_init().

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoUse INV_SQRT instead of 1/SQRTF
Matt Turner [Fri, 20 Jul 2012 17:03:10 +0000 (10:03 -0700)]
Use INV_SQRT instead of 1/SQRTF

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoautoconf: Only link mcjit component when available.
José Fonseca [Sat, 21 Jul 2012 10:43:06 +0000 (11:43 +0100)]
autoconf: Only link mcjit component when available.

Should fix build failures with older LLVM versions, but only tested on
LLVM 3.1.

12 years agoi830: Fix stack corruption
Chad Versace [Fri, 20 Jul 2012 22:41:27 +0000 (15:41 -0700)]
i830: Fix stack corruption

Found by compiler warning:
    i830_texstate.c:131:28: warning: argument to 'sizeof' in 'memset' call
          is the same expression as the destination; did you mean to
          dereference it?  [-Wsizeof-pointer-memaccess]
       memset(state, 0, sizeof(state));
              ~~~~~            ^~~~~

On 64-bit systems, memset here would write an extra 4 bytes.
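
The warning above flags a common C pitfall: applying sizeof to a pointer
yields the size of the pointer, not of the object it points to. A minimal
sketch of the corrected pattern follows. It uses a hypothetical
struct tex_state and a hypothetical clear_state() helper, not the real i830
state or code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state block, for illustration only. */
struct tex_state {
    uint32_t words[10];
};

static void clear_state(struct tex_state *state)
{
    assert(state != NULL);
    /* Buggy form: memset(state, 0, sizeof(state)) uses the pointer's size.
     * Dereferencing gives the size of the pointed-to object instead. */
    memset(state, 0, sizeof(*state));
}
```

With sizeof(*state) the byte count tracks the object's type, so it stays
correct if the structure changes size.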

Note: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
12 years agomesa: disable MSVC global optimization in pack.c
José Fonseca [Fri, 20 Jul 2012 22:16:11 +0000 (16:16 -0600)]
mesa: disable MSVC global optimization in pack.c

To reduce excessive compilation time in release mode.

NOTE: This is a candidate for the 8.0 branch.

Tested-by: Brian Paul <brianp@vmware.com>
12 years agomesa: whitespace fixes in pbo.c
Brian Paul [Thu, 19 Jul 2012 22:34:24 +0000 (16:34 -0600)]
mesa: whitespace fixes in pbo.c

12 years agomesa: update texstore.c comment
Brian Paul [Thu, 19 Jul 2012 21:07:10 +0000 (15:07 -0600)]
mesa: update texstore.c comment

12 years agollvmpipe: use runtime loop instead of static loop for looping over quads
Roland Scheidegger [Fri, 6 Jul 2012 00:53:44 +0000 (02:53 +0200)]
llvmpipe: use runtime loop instead of static loop for looping over quads

This can potentially cut shader program size by a factor of 4 for 4-wide
execution, or 2 for 8-wide execution. While these ratios aren't quite
reached for more complex shaders, it can be close.
Could not really measure a performance difference so far except for trivial
shaders (glxgears).
There seems to be a fair number of unnecessary moves generated, especially
at the beginning; it might be possible to optimize those away somehow.
Things aren't quite as clean: some additional work is needed to keep
both paths working (though llvm might be able to optimize this away).
glxgears seems to lose about 5-10% of performance. Looking at the generated
shaders, this is actually less than I'd think it would be: both 4 and 8-wide
shaders, despite containing a loop, actually have about 10% more instructions
in total, and will have roughly 50% more executed instructions (though mostly
cheap ones). Need to figure out how to reduce overhead...

v2: keep complex interpolation for 4-wide mode, adapt to interface changes.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
12 years agonv30: Support negative offsets in indirect constant access.
Roy Spliet [Wed, 18 Jul 2012 23:56:35 +0000 (01:56 +0200)]
nv30: Support negative offsets in indirect constant access.

Fixes piglit vp-address-01 amongst several others.

Signed-off-by: Roy Spliet <r.spliet@student.tudelft.nl>
Reviewed-by: Lucas Stach <dev@lynxeye.de>
Tested-by: Lucas Stach <dev@lynxeye.de>
12 years agonv50/ir: set position before i instead of i->next in NV50LoweringPreSSA::visit
Bryan Cain [Wed, 18 Jul 2012 04:46:39 +0000 (23:46 -0500)]
nv50/ir: set position before i instead of i->next in NV50LoweringPreSSA::visit

Fixes rendering glitches in Psychonauts such as Raz's eyes flickering white.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=51962.

12 years agoi965/gen7: Increase the WM threads to hardware limits.
Eric Anholt [Thu, 19 Jul 2012 05:58:15 +0000 (22:58 -0700)]
i965/gen7: Increase the WM threads to hardware limits.

This thread count is only supposed to be enabled when "WIZ Hashing Disable in
GT_MODE register enabled."  I've always been confused whether that means the
bit in the register should be 1 or 0.  For my IVB GT2's register 0x7008 value
of 0x0, this appears to work fine.

Improves l4d2 performance at 640x480 by 0.88 +/- 0.11% (n=88).  Improves
performance with rasterization at 1280x1024 by 1.45% +/- 0.36% (n=6).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoglsl: Assign locations for uniforms in UBOs using the std140 rules.
Eric Anholt [Tue, 1 May 2012 22:10:14 +0000 (15:10 -0700)]
glsl: Assign locations for uniforms in UBOs using the std140 rules.

Fixes piglit layout-std140.
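
For readers unfamiliar with std140, the following is a rough sketch of how
offsets fall out for scalar and vector float members only. It is not the
code from this commit: the enum, the helper names and the restriction to
float/vec2/vec3/vec4 are assumptions made for illustration (arrays,
matrices and structs have further rules that are not shown).

```c
#include <stddef.h>

/* Hypothetical member kinds; only float-based scalars and vectors. */
enum kind { K_FLOAT, K_VEC2, K_VEC3, K_VEC4 };

/* std140 base alignment in bytes: float 4, vec2 8, vec3 and vec4 16. */
static size_t std140_align(enum kind k)
{
    switch (k) {
    case K_FLOAT: return 4;
    case K_VEC2:  return 8;
    case K_VEC3:  return 16;
    case K_VEC4:  return 16;
    }
    return 16;
}

/* Bytes actually occupied by the member. */
static size_t std140_size(enum kind k)
{
    switch (k) {
    case K_FLOAT: return 4;
    case K_VEC2:  return 8;
    case K_VEC3:  return 12;
    case K_VEC4:  return 16;
    }
    return 16;
}

/* Round each member's offset up to its base alignment, in order. */
static void std140_layout(const enum kind *members, size_t n, size_t *offsets)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        size_t a = std140_align(members[i]);
        off = (off + a - 1) / a * a;
        offsets[i] = off;
        off += std140_size(members[i]);
    }
}
```

For a block declared as { float a; vec3 b; float c; vec2 d; } this yields
offsets 0, 16, 28 and 32: the vec3 is pushed to a 16-byte boundary, and the
following float fits in the 4 bytes left after it.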

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Don't resize arrays in uniform blocks.
Eric Anholt [Tue, 1 May 2012 21:43:31 +0000 (14:43 -0700)]
glsl: Don't resize arrays in uniform blocks.

This is a requirement for std140 uniform blocks, and optional for
packed/shared blocks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Don't dead-code eliminate uniforms declared in uniform blocks.
Eric Anholt [Tue, 1 May 2012 21:26:09 +0000 (14:26 -0700)]
glsl: Don't dead-code eliminate uniforms declared in uniform blocks.

This is a requirement for std140 uniform blocks, and optional for
packed/shared blocks.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Implement the UBO-specific pnames of glGetActiveUniformsiv.
Eric Anholt [Tue, 1 May 2012 21:15:14 +0000 (14:15 -0700)]
mesa: Implement the UBO-specific pnames of glGetActiveUniformsiv.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Propagate uniform block information into gl_uniform_storage.
Eric Anholt [Tue, 1 May 2012 20:59:31 +0000 (13:59 -0700)]
glsl: Propagate uniform block information into gl_uniform_storage.

Now we can actually return information on uniforms in uniform blocks
in the new queries.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Add implementation of glGetUniformBlockIndex().
Eric Anholt [Tue, 1 May 2012 21:07:43 +0000 (14:07 -0700)]
mesa: Add implementation of glGetUniformBlockIndex().

Now that we finally have a list of uniform blocks in the linked shader
program, we can tell what their indices are.

Fixes piglit GL_ARB_uniform_buffer_object/getuniformblockindex.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Set the uniform_block index for the linked shader variables.
Eric Anholt [Tue, 1 May 2012 20:34:04 +0000 (13:34 -0700)]
glsl: Set the uniform_block index for the linked shader variables.

At this point in the linking, we've totally lost track of the struct
gl_uniform_buffer that this pointed to in the original unlinked
shader, so we do a nasty n^2 walk to find the new one based on the
variable name.

Note that these point into the shader's list of gl_uniform_buffers,
not the linked program's.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Add support for glGetActiveUniformsiv on non-UBO pnames.
Eric Anholt [Fri, 27 Apr 2012 22:56:44 +0000 (15:56 -0700)]
mesa: Add support for glGetActiveUniformsiv on non-UBO pnames.

We'll need to propagate the UBO fields to the uniform storage records
before we can handle the other pnames.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Add support for glGetUniformIndices().
Eric Anholt [Fri, 27 Apr 2012 22:37:49 +0000 (15:37 -0700)]
mesa: Add support for glGetUniformIndices().

This is a single entrypoint that maps from a series of names to the
indices of those names within the active uniforms list.  Each index is
like glGetUniformLocation()'s return value, except that it doesn't
encode an array offset.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Move the _mesa_uniform_merge_location_offset to glGetUniformLocation().
Eric Anholt [Mon, 25 Jun 2012 17:23:24 +0000 (10:23 -0700)]
mesa: Move the _mesa_uniform_merge_location_offset to glGetUniformLocation().

With the upcoming GL_ARB_uniform_buffer_object changes, the only
other caller that will want the cooked value is state_tracker.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Merge the lists of uniform blocks into the linked shader program.
Eric Anholt [Fri, 27 Apr 2012 20:52:56 +0000 (13:52 -0700)]
glsl: Merge the lists of uniform blocks into the linked shader program.

This attempts error-checking, but the layout isn't done yet.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Translate the AST for uniform blocks into some IR structures.
Eric Anholt [Fri, 27 Apr 2012 01:21:43 +0000 (18:21 -0700)]
glsl: Translate the AST for uniform blocks into some IR structures.

We're going to need this structure to cross-validate the uniform
blocks between shader stages, since unused ir_variables might get
dropped.  It's also the place we store the RowMajor qualifier, which
is not part of the GLSL type (since that would cause a bunch of type
equality checks to fail).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Turn UBO variable declarations into ir_variables and check qualifiers.
Eric Anholt [Fri, 27 Apr 2012 01:19:39 +0000 (18:19 -0700)]
glsl: Turn UBO variable declarations into ir_variables and check qualifiers.

Fixes piglit layout-*-non-uniform and layout-*-within-block.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agost/xorg: fix masked transformations
Lucas Stach [Thu, 19 Jul 2012 19:09:28 +0000 (21:09 +0200)]
st/xorg: fix masked transformations

Someone tried to be clever and "optimized" add_vertex_data2() to just use
two points for the texture coordinates and then reuse individual
components. Sadly this is not how matrix multiplication works.

Fixes rendercheck -t tmcoords

Signed-off-by: Lucas Stach <dev@lynxeye.de>
12 years agoi965/blorp: Use IMS layout when texturing from depth/stencil surfaces.
Paul Berry [Mon, 9 Jul 2012 19:50:31 +0000 (12:50 -0700)]
i965/blorp: Use IMS layout when texturing from depth/stencil surfaces.

Previously, on Gen7, when texturing from a depth or stencil surface,
the blorp engine would configure the 3D pipeline as though the input
surface was non-multisampled, and perform the necessary coordinate
transformations in the fragment shader to account for the IMS layout.
This meant outputting a lot of extra fragment shader code, and it
raised some uncertainty about how to deal with very large surfaces.

This patch modifies blorp to configure the 3D pipeline properly for
IMS layout when reading from depth and stencil surfaces.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965/blorp: Loosen assertions in compute_msaa_layout_for_pipeline.
Paul Berry [Mon, 9 Jul 2012 19:50:31 +0000 (12:50 -0700)]
i965/blorp: Loosen assertions in compute_msaa_layout_for_pipeline.

Previously, on Gen7, compute_msaa_layout_for_pipeline() would verify
that IMS layout is not used.  However, now that we configure
SURFACE_STATE correctly for IMS surfaces, IMS layout is available.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965/blorp: Configure SURFACE_STATE correctly for IMS surfaces.
Paul Berry [Mon, 9 Jul 2012 19:50:31 +0000 (12:50 -0700)]
i965/blorp: Configure SURFACE_STATE correctly for IMS surfaces.

This patch modifies gen7_set_surface_num_multisamples() to set up the
SURFACE_STATE appropriately for texturing from IMS format MSAA
surfaces (which are only used on Gen7 for depth and stencil buffers).
Since the function now sets more than just the number of multisamples,
it's been renamed to gen7_set_surface_msaa().

This will make it possible to remove some kludginess from the blorp
engine.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965/blorp: Optimize manual_blend() for compressed multisampled surfaces.
Paul Berry [Mon, 9 Jul 2012 18:10:52 +0000 (11:10 -0700)]
i965/blorp: Optimize manual_blend() for compressed multisampled surfaces.

When downsampling a compressed multisampled surface, we can take a
shortcut to downsample any pixels that were completely covered by a
single primitive.  In this case, the first color value we fetch is the
correct final color for the downsampled pixel, so we can skip the rest
of the blending operation.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965/blorp: Fix integer downsampling on Gen7.
Paul Berry [Sat, 7 Jul 2012 15:28:46 +0000 (08:28 -0700)]
i965/blorp: Fix integer downsampling on Gen7.

When downsampling an integer-format buffer on Gen7, we need to use the
"avg" instruction rather than the "add" instruction, to ensure that we
don't overflow the range of 32-bit integers.  Also, we need to use the
proper register type (BRW_REGISTER_TYPE_D or BRW_REGISTER_TYPE_UD) for
intermediate color data and for writing to the render target.

Note: this patch causes blorp to use the proper register type for all
operations (downsampling, upsampling, and ordinary blits).  Strictly
speaking, this is only necessary for downsampling, because the other
operations exclusively use MOV instructions on the color data.  But
it's simpler to use the proper register type in all cases.
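
To illustrate why averaging avoids the overflow, here is a small sketch in
plain C. It is not the blorp shader code: avg_ud() is a hypothetical CPU
model of a rounded-up unsigned average, and the two downsample helpers are
made up for this example.

```c
#include <stdint.h>

/* Hypothetical model of a rounded-up unsigned average; the sum is formed
 * in 64 bits so it cannot wrap. */
static uint32_t avg_ud(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a + (uint64_t)b + 1) >> 1);
}

/* Downsample four samples by averaging pairs, then averaging the results. */
static uint32_t downsample_avg(const uint32_t s[4])
{
    return avg_ud(avg_ud(s[0], s[1]), avg_ud(s[2], s[3]));
}

/* Add-then-divide in 32 bits: the sum wraps for large sample values. */
static uint32_t downsample_add(const uint32_t s[4])
{
    return (s[0] + s[1] + s[2] + s[3]) / 4;
}
```

With all four samples at 0xFFFFFFFF, downsample_avg() returns 0xFFFFFFFF,
while downsample_add() wraps and returns 0x3FFFFFFF.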

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965/blorp: Modify manual_blend() to avoid unnecessary loss of precision.
Paul Berry [Sat, 7 Jul 2012 15:02:48 +0000 (08:02 -0700)]
i965/blorp: Modify manual_blend() to avoid unnecessary loss of precision.

When downsampling from an MSAA image to a single-sampled image, it is
inevitable that some loss of numerical precision will occur, since we
have to use 32-bit floating point registers to hold the intermediate
results while blending.  However, it seems reasonable to expect that
when all samples corresponding to a given pixel have the exact same
color value, there will be no loss of precision.

Previously, we averaged samples as follows:

    blend = (((sample[0] + sample[1]) + sample[2]) + sample[3]) / 4

This had the potential to lose numerical precision when all samples
have the same color value, since ((sample[0] + sample[1]) + sample[2])
may not be precisely representable as a 32-bit float, even if the
individual samples are.

This patch changes the formula to:

    blend = ((sample[0] + sample[1]) + (sample[2] + sample[3])) / 4

This avoids any loss of precision in the event that all samples are
the same, by ensuring that each addition operation adds two equal
values.

As a side benefit, this puts the formula in the form we will need in
order to implement correct blending of integer formats.
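
The two formulas can be sketched in plain C as below. This is only an
illustration on the CPU, not the code blorp emits, and the function names
are made up. Note that whether the sequential form actually loses bits
depends on the sample values and on the rounding behaviour of the
arithmetic; the point of the pairwise form is that, for four equal samples,
every addition doubles a value, which is exact in binary floating point
(barring overflow).

```c
/* Old order: each addition mixes a running sum with one more sample. */
static float blend_sequential(const float s[4])
{
    return (((s[0] + s[1]) + s[2]) + s[3]) / 4.0f;
}

/* New order: when all samples are equal, every addition adds two equal
 * values, so no intermediate result needs rounding. */
static float blend_pairwise(const float s[4])
{
    return ((s[0] + s[1]) + (s[2] + s[3])) / 4.0f;
}
```

For four identical samples x, blend_pairwise() computes 2x, then 4x, then
4x / 4, each of which is exactly representable, so the result is x again.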

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965: Add support for AVG instruction.
Paul Berry [Sat, 7 Jul 2012 15:28:46 +0000 (08:28 -0700)]
i965: Add support for AVG instruction.

From the Ivy Bridge PRM, Vol4 Part3 p152:

    "The avg instruction performs component-wise integer average of
    src0 and src1 and stores the results in dst. An integer average
    uses integer upward rounding. It is equivalent to increment one to
    the addition of src0 and src1 and then apply an arithmetic right
    shift to this intermediate value."
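
As a rough CPU-side model of the quoted semantics for signed 32-bit
operands (a sketch, not driver code; avg_d() is a hypothetical name, and an
arithmetic right shift of negative values is assumed, which C leaves
implementation-defined):

```c
#include <stdint.h>

/* (src0 + src1 + 1) >> 1, with the sum formed in 64 bits so that the
 * intermediate value cannot overflow. Assumes >> on a negative int64_t
 * is an arithmetic shift. */
static int32_t avg_d(int32_t src0, int32_t src1)
{
    int64_t sum = (int64_t)src0 + (int64_t)src1 + 1;
    return (int32_t)(sum >> 1);
}
```

Under that assumption avg_d(1, 2) is 2 and avg_d(-3, -2) is -2, that is,
halves are rounded upward in both cases.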

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoi965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill.
Paul Berry [Thu, 19 Jul 2012 06:20:23 +0000 (23:20 -0700)]
i965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill.

The kill_emitted variable was duplicating the functionality of
gl_fragment_program::UsesKill.  There's no need for both.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agomesa: Set gl_fragment_program::UsesKill in do_set_program_inouts.
Paul Berry [Thu, 19 Jul 2012 06:18:14 +0000 (23:18 -0700)]
mesa: Set gl_fragment_program::UsesKill in do_set_program_inouts.

Previously, the code for setting this flag for GLSL programs was
duplicated in three places: brw_link_shader(), glsl_to_tgsi_visitor,
and ir_to_mesa_visitor.  In addition to the unnecessary duplication,
there was a performance problem on i965: brw_link_shader() set the
flag before doing its final round of optimizations, which meant that
if the optimizations managed to eliminate all the discard operations,
the flag would still be set, resulting (at least in theory) in slower
performance.

This patch consolidates all of the code that sets UsesKill for GLSL
programs into do_set_program_inouts(), which already is doing a
similar job for UsesDFdy, and which occurs after i965's final round of
optimizations.

Non-GLSL programs (ARB programs and the state tracker's glBitmap
program) are unaffected.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agogallium-egl: Move wayland query_buffer implementation
Kristian Høgsberg [Thu, 19 Jul 2012 20:07:55 +0000 (16:07 -0400)]
gallium-egl: Move wayland query_buffer implementation

Move it to native_wayland_drm_bufmgr_helper.c which only gets compiled when
wayland is enabled and which already includes the right headers.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
12 years agosoftpipe: Fix segfault with fbo-cubemap.
Olivier Galibert [Thu, 19 Jul 2012 16:55:14 +0000 (18:55 +0200)]
softpipe: Fix segfault with fbo-cubemap.

The cube sampler generates two-dimensional texture coordinates and
hence passes NULL for the array for the third one.  The actual 2D
sampler, lower in the pipe, knew not to use that array since it
didn't need it.  But the samplers have become single-texel and the
coordinate array dereference has been moved up one step, to a level
where the code does not know only two coordinates are used.  Hence the
segfault.

The simplest fix by far is to add a third dummy coordinate array in
the call to the next pipe step, which will be dereferenced to a
harmless 0 which then will be happily ignored by the sampler.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=52250

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agowayland: Support EGL_WIDTH and EGL_HEIGHT queries for wl_buffer
Kristian Høgsberg [Thu, 19 Jul 2012 13:02:25 +0000 (09:02 -0400)]
wayland: Support EGL_WIDTH and EGL_HEIGHT queries for wl_buffer

We're going to make the public wl_buffer struct as small as possible.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
12 years agowayland: Use existing EGL_TEXTURE_FORMAT for querying wl_buffer texture format
Kristian Høgsberg [Thu, 19 Jul 2012 12:54:05 +0000 (08:54 -0400)]
wayland: Use existing EGL_TEXTURE_FORMAT for querying wl_buffer texture format

We also reuse EGL_TEXTURE_RGBA and EGL_TEXTURE_RGB, adding only the new
planar YUV texture formats: EGL_TEXTURE_Y_U_V_WL, EGL_TEXTURE_Y_UV_WL and
EGL_TEXTURE_Y_XUXV_WL.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
12 years agogallium-egl: Implement eglQueryWaylandBufferWL
Kristian Høgsberg [Thu, 19 Jul 2012 12:48:45 +0000 (08:48 -0400)]
gallium-egl: Implement eglQueryWaylandBufferWL

Support this query for gallium EGL too.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
12 years agoglsl: Remove open coded version of ir_variable::interpolation_string().
Kenneth Graunke [Wed, 6 Jun 2012 08:52:47 +0000 (01:52 -0700)]
glsl: Remove open coded version of ir_variable::interpolation_string().

Presumably the function didn't exist when we wrote this code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoi965: Avoid unnecessary recompiles for shaders that don't use dFdy().
Paul Berry [Wed, 20 Jun 2012 20:40:45 +0000 (13:40 -0700)]
i965: Avoid unnecessary recompiles for shaders that don't use dFdy().

The i965 back-end needs to compile dFdy() differently for FBOs and
window system framebuffers, because Y coordinates are flipped between
the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).
This patch avoids unnecessarily recompiling shaders that don't use
dFdy(), by only setting render_to_fbo in the wm program key if the
shader actually uses dFdy().
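
The idea can be sketched as follows. The structures and the make_wm_key()
helper are hypothetical stand-ins, not the real i965 program key code; only
the render_to_fbo name comes from the text above.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-ins for the real structures. */
struct frag_prog {
    bool uses_dfdy;
};

struct wm_key {
    bool render_to_fbo;
};

/* Only let the framebuffer type influence the key when the shader can
 * observe the Y flip, i.e. when it uses dFdy(). */
static struct wm_key make_wm_key(const struct frag_prog *fp, bool drawing_to_fbo)
{
    struct wm_key key = { false };
    key.render_to_fbo = fp->uses_dfdy && drawing_to_fbo;
    return key;
}
```

A shader without dFdy() then produces the same key for FBOs and window
system framebuffers, so a compiled program can be reused for both.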

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoglsl: Set UsesDFdy appropriately for GLSL shaders.
Paul Berry [Wed, 20 Jun 2012 19:49:29 +0000 (12:49 -0700)]
glsl: Set UsesDFdy appropriately for GLSL shaders.

This patch updates the ir_set_program_inouts_visitor so that it also
sets gl_fragment_program::UsesDFdy.

This is a bit of a hack (since dFdy() isn't an input or an output),
but there's no other obvious visitor to squeeze this functionality
into, and it would be silly to create a brand new visitor just for
this purpose.

v2: use local 'fprog' var to avoid repeated casting.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Set UsesDFdy appropriately for assembly programs.
Paul Berry [Wed, 20 Jun 2012 19:33:06 +0000 (12:33 -0700)]
mesa: Set UsesDFdy appropriately for assembly programs.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Add UsesDFdy to struct gl_fragment_program.
Paul Berry [Wed, 20 Jun 2012 19:31:46 +0000 (12:31 -0700)]
mesa: Add UsesDFdy to struct gl_fragment_program.

The i965 back-end needs to compile dFdy() differently for FBOs and
window system framebuffers, because Y coordinates are flipped between
the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).
This boolean will allow it to avoid unnecessarily recompiling shaders
that don't use dFdy().

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agodrirc: Add disable_blend_func_extended workaround for Unigine OilRush.
Kenneth Graunke [Thu, 19 Jul 2012 08:40:24 +0000 (01:40 -0700)]
drirc: Add disable_blend_func_extended workaround for Unigine OilRush.

The previous commit implemented the workaround, cited a bug report
about OilRush, but actually only enabled the workaround for the demos.

Turn it on for OilRush too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50291
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965: Add a driconf option to disable GL_ARB_blend_func_extended.
Kenneth Graunke [Wed, 18 Jul 2012 07:07:17 +0000 (00:07 -0700)]
i965: Add a driconf option to disable GL_ARB_blend_func_extended.

Unigine Heaven (at least) has a bug where it incorrectly uses the
GL_ARB_blend_func_extended extension.

Dual source blending allows two color outputs per render target;
individual shader outputs can be assigned to be either the first or
second blending input by setting the 'index' via one of two methods:

- An API call: glBindFragDataLocationIndexed()
- The GLSL 'layout' qualifier provided by GL_ARB_explicit_attrib_location

Both of these only work on user defined fragment shader outputs; it's an
error to use either on built-in outputs like gl_FragData.

Unigine uses gl_FragData and gl_FragColor exclusively, and doesn't even
attempt to use either method to set index == 1.  However, it does set
the blending function to SRC1 enums, which requires a fragment shader
output with index == 1 or else rendering is undefined.

In other words, enabling ARB_blend_func_extended causes Unigine to
render incorrectly, resulting in an apparent regression, even though our
driver code (as far as I can tell) is perfectly fine.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50291
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agomesa: remove stale comment
Brian Paul [Wed, 18 Jul 2012 21:35:24 +0000 (15:35 -0600)]
mesa: remove stale comment

12 years agomesa: use gl_program cast wrappers
Brian Paul [Wed, 18 Jul 2012 21:33:11 +0000 (15:33 -0600)]
mesa: use gl_program cast wrappers

In a few cases, remove unneeded casts.
And fix a few other const-correctness issues.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agomesa: add some gl_program cast wrappers
Brian Paul [Wed, 18 Jul 2012 21:32:51 +0000 (15:32 -0600)]
mesa: add some gl_program cast wrappers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agor600g: setup streamout before calling last r600_need_cs_space before drawing
Marek Olšák [Wed, 18 Jul 2012 16:33:37 +0000 (18:33 +0200)]
r600g: setup streamout before calling last r600_need_cs_space before drawing

This fixes CS checker errors due to registers not being initialized, because
the flush occurred after dirty state was emitted but before drawing.

12 years agoi965/fs: Make register spill/unspill only do the regs for that instruction.
Eric Anholt [Sat, 7 Jul 2012 00:18:35 +0000 (17:18 -0700)]
i965/fs: Make register spill/unspill only do the regs for that instruction.

Previously, if we were spilling the result of a texture call, we would store
all 4 regs, then for each use of one of those regs as the source of an
instruction, we would unspill all 4 regs even though only one was needed.

In both lightsmark and l4d2 with my current graphics config, the shaders that
produce spilling do so on split GRFs, so this doesn't help them out.  However,
in a capture of the l4d2 shaders with a different snapshot and playing the
game instead of using a demo, it reduced one shader from 2817 instructions to
2179, due to choosing a now-cheaper texture result to spill instead of piles
of texcoords.

v2: Fix comment noted by Ken, and fix the if condition associated with it for
    the current state of what constitutes a partial write of the destination.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
12 years agoi965/fs.h: Refactor tests for instructions modifying a register.
Eric Anholt [Fri, 6 Jul 2012 22:06:59 +0000 (15:06 -0700)]
i965/fs.h: Refactor tests for instructions modifying a register.

There's one instance of a potential behavior change: propagate_constants may
now propagate into a part of a vgrf after a different part of it was
overwritten by a send that returns multiple registers.  I don't think we ever
generate IR that meets that condition, but it's something to note if we bisect
behavior change to this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965/fs: Replace usage of is_tex() with regs_written() checks.
Eric Anholt [Fri, 6 Jul 2012 21:51:44 +0000 (14:51 -0700)]
i965/fs: Replace usage of is_tex() with regs_written() checks.

In these places, we care about any sort of send that hits more than one reg,
not just textures.  We don't yet have anything else returning more than one
reg, so there's no change.

v2: Use mlen instead of is_tex() for the is-it-a-send check.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965/fs: Rename virtual_grf_next to virtual_grf_count.
Eric Anholt [Fri, 6 Jul 2012 20:45:53 +0000 (13:45 -0700)]
i965/fs: Rename virtual_grf_next to virtual_grf_count.

"count" is a more useful name, since most of the time we're using it for
looping over the variables.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965/fs: Move a block out of a loop in live variables setup.
Eric Anholt [Sat, 7 Jul 2012 01:00:40 +0000 (18:00 -0700)]
i965/fs: Move a block out of a loop in live variables setup.

This was accidentally copy-and-pasted inside.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965/msaa: Disable alpha-to-{coverage, one} when drawbuffer zero is in integer format
Anuj Phogat [Wed, 18 Jul 2012 18:41:15 +0000 (11:41 -0700)]
i965/msaa: Disable alpha-to-{coverage, one} when drawbuffer zero is in integer format

OpenGL specification 3.3 (page 196), section 4.1.3 says:
"If drawbuffer zero is not NONE and the buffer it references has an
integer format, the SAMPLE_ALPHA_TO_COVERAGE and SAMPLE_ALPHA_TO_ONE
operations are skipped."
This should work properly even if there are other draw buffers that
are not in integer format.

This patch makes the following piglit tests pass on Mesa:
int-draw-buffers-alpha-to-coverage
int-draw-buffers-alpha-to-one

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agost/xorg: attach EDID to outputs
Lucas Stach [Wed, 18 Jul 2012 14:07:29 +0000 (16:07 +0200)]
st/xorg: attach EDID to outputs

Allows tools like GNOME's monitor configuration to show meaningful names.

v2: fix resource leak

Signed-off-by: Lucas Stach <dev@lynxeye.de>
12 years agost/xorg: remove superfluous memset
Lucas Stach [Wed, 18 Jul 2012 14:07:28 +0000 (16:07 +0200)]
st/xorg: remove superfluous memset

exaDriverAlloc() uses calloc, which already initialises pExa to zero.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
12 years agost/xorg: reorder exa context creation and use screen param queries
Lucas Stach [Wed, 18 Jul 2012 14:07:27 +0000 (16:07 +0200)]
st/xorg: reorder exa context creation and use screen param queries

Gives the x-server a more accurate description of the exa hardware
capabilities.

v2: drop NPOT check

Signed-off-by: Lucas Stach <dev@lynxeye.de>
12 years agosoftpipe: Take all lods into account when texture sampling.
Olivier Galibert [Tue, 19 Jun 2012 19:01:37 +0000 (21:01 +0200)]
softpipe: Take all lods into account when texture sampling.

This patch churns a lot because it needs to change 4-wide filters into
single pixel filters, since each fragment may use a different filter.

The only case not entirely supported is the anisotropic filtering.
Not sure what we want to do there, since a full quad is required by
that filter.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agor600g: implement wait-free buffer transfer for DISCARD_RANGE
Marek Olšák [Sun, 26 Feb 2012 19:37:43 +0000 (20:37 +0100)]
r600g: implement wait-free buffer transfer for DISCARD_RANGE

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
12 years agor600g: accelerate buffer copying
Marek Olšák [Fri, 24 Feb 2012 01:27:38 +0000 (02:27 +0100)]
r600g: accelerate buffer copying

This will be useful for efficient handling of the DISCARD transfer flags.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
12 years agor600g: update R600_MAX_DRAW_CS_DWORDS to take draw-opaque into account
Marek Olšák [Wed, 18 Jul 2012 04:23:28 +0000 (06:23 +0200)]
r600g: update R600_MAX_DRAW_CS_DWORDS to take draw-opaque into account

12 years agor600g: move VGT_STRMOUT_DRAW_OPAQUE_OFFSET initialization into invariant state
Marek Olšák [Wed, 18 Jul 2012 04:18:42 +0000 (06:18 +0200)]
r600g: move VGT_STRMOUT_DRAW_OPAQUE_OFFSET initialization into invariant state

12 years agor600g: only set the index type if drawing is indexed
Marek Olšák [Wed, 18 Jul 2012 04:13:34 +0000 (06:13 +0200)]
r600g: only set the index type if drawing is indexed

12 years agor600g: remove debug code for streamout
Marek Olšák [Wed, 18 Jul 2012 04:12:46 +0000 (06:12 +0200)]
r600g: remove debug code for streamout

12 years agor600g: inline r600_context_draw_opaque_count
Marek Olšák [Wed, 18 Jul 2012 04:06:01 +0000 (06:06 +0200)]
r600g: inline r600_context_draw_opaque_count

12 years agor600g: fix alphatest without a colorbuffer on evergreen
Marek Olšák [Wed, 18 Jul 2012 03:16:40 +0000 (05:16 +0200)]
r600g: fix alphatest without a colorbuffer on evergreen

12 years agor600g: fix alphatest without a colorbuffer on r6xx-r7xx
Marek Olšák [Wed, 18 Jul 2012 02:31:56 +0000 (04:31 +0200)]
r600g: fix alphatest without a colorbuffer on r6xx-r7xx

12 years agor600g: always derive alphatest state from the first colorbuffer
Marek Olšák [Wed, 18 Jul 2012 02:17:11 +0000 (04:17 +0200)]
r600g: always derive alphatest state from the first colorbuffer

12 years agor600g: atomize alphatest state
Marek Olšák [Wed, 18 Jul 2012 01:45:25 +0000 (03:45 +0200)]
r600g: atomize alphatest state

12 years agor600g: try to fix line stippling with lineloops
Marek Olšák [Wed, 18 Jul 2012 00:17:10 +0000 (02:17 +0200)]
r600g: try to fix line stippling with lineloops

The piglit test is failing, but visually it looks almost correct.

12 years agor600g: optimize uploading depth textures
Marek Olšák [Tue, 17 Jul 2012 22:32:50 +0000 (00:32 +0200)]
r600g: optimize uploading depth textures

Make it only copy the portion of a depth texture being uploaded and
not the whole 2D layer.

There is also a little code cleanup.

12 years agor600g: remove needless wrapper r600_texture_depth_flush
Marek Olšák [Tue, 17 Jul 2012 22:17:46 +0000 (00:17 +0200)]
r600g: remove needless wrapper r600_texture_depth_flush

12 years agor600g: init_flushed_depth_texture should be able to report errors
Marek Olšák [Tue, 17 Jul 2012 22:05:14 +0000 (00:05 +0200)]
r600g: init_flushed_depth_texture should be able to report errors

12 years agomsaa: Generate proper error for operations prohibited on MSAA buffers.
Paul Berry [Mon, 16 Jul 2012 18:25:50 +0000 (11:25 -0700)]
msaa: Generate proper error for operations prohibited on MSAA buffers.

From the GL 3.0 spec, section 4.3.3, in the documentation for
CopyPixels():

    "An INVALID_OPERATION error will be generated if the object bound
    to READ_FRAMEBUFFER_BINDING is framebuffer complete and the value
    of SAMPLE_BUFFERS is greater than zero."

The same applies to CopyTexImage...() and CopyTexSubImage...()
functions, since they are defined in terms of CopyPixels().

Previously we were generating an INVALID_FRAMEBUFFER_OPERATION error
in these cases.

Fixes piglit tests
"EXT_framebuffer_multisample/negative-{copypixels,copyteximage}".
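
As a rough illustration of the rule quoted above, the check could be
sketched as follows. This is not the Mesa code: the struct, the function
name and the SKETCH_* macros are hypothetical, and only the numeric error
values are taken from the GL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Numeric values match GL_NO_ERROR, GL_INVALID_OPERATION and
 * GL_INVALID_FRAMEBUFFER_OPERATION; the macro names are made up. */
#define SKETCH_NO_ERROR                      0
#define SKETCH_INVALID_OPERATION             0x0502
#define SKETCH_INVALID_FRAMEBUFFER_OPERATION 0x0506

/* Hypothetical stand-in for the state of the read framebuffer. */
struct sketch_read_fb {
    bool complete;           /* framebuffer completeness */
    unsigned sample_buffers; /* value of SAMPLE_BUFFERS */
};

/* Returns the error a CopyPixels-style call should raise, if any. */
static unsigned sketch_copy_pixels_error(const struct sketch_read_fb *fb)
{
    assert(fb != NULL);

    /* An incomplete framebuffer is a separate error case. */
    if (!fb->complete)
        return SKETCH_INVALID_FRAMEBUFFER_OPERATION;

    /* Complete but multisampled: the case this patch corrects. */
    if (fb->sample_buffers > 0)
        return SKETCH_INVALID_OPERATION;

    return SKETCH_NO_ERROR;
}
```

The same helper would cover the CopyTexImage and CopyTexSubImage paths,
since they are defined in terms of CopyPixels().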

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agogallivm: silence uninitialized variable warnings
Brian Paul [Tue, 17 Jul 2012 20:41:29 +0000 (14:41 -0600)]
gallivm: silence uninitialized variable warnings

12 years agor600g: fix lockups with and enable dual source blending on evergreen
Marek Olšák [Sun, 15 Jul 2012 01:38:42 +0000 (03:38 +0200)]
r600g: fix lockups with and enable dual source blending on evergreen

GL_ARB_blend_func_extended is now enabled on all chipsets.

12 years agor600g: remove unused code after conversion of sampler views
Marek Olšák [Sun, 15 Jul 2012 00:34:02 +0000 (02:34 +0200)]
r600g: remove unused code after conversion of sampler views

12 years agor600g: convert sampler view emission into atoms
Marek Olšák [Sat, 14 Jul 2012 13:26:59 +0000 (15:26 +0200)]
r600g: convert sampler view emission into atoms

Vertex and constant buffers are emitted in the same way.
This is mainly a simplification of the code. The cleanup is in another patch.

12 years agor600g: only make constant buffers dirty if there's something to update
Marek Olšák [Sat, 14 Jul 2012 16:15:29 +0000 (18:15 +0200)]
r600g: only make constant buffers dirty if there's something to update

12 years agor600g: properly track which textures are depth
Marek Olšák [Sat, 14 Jul 2012 15:06:27 +0000 (17:06 +0200)]
r600g: properly track which textures are depth

This fixes the issue with have_depth_texture never being set to false.

12 years agor600g: consolidate and optimize sampler states changes for evergreen
Marek Olšák [Sat, 14 Jul 2012 14:53:26 +0000 (16:53 +0200)]
r600g: consolidate and optimize sampler states changes for evergreen

Only set sampler states which changed.

12 years agor600g: don't invalidate texture caches when setting sampler states
Marek Olšák [Sat, 14 Jul 2012 14:36:51 +0000 (16:36 +0200)]
r600g: don't invalidate texture caches when setting sampler states

Changing sampler states doesn't change resource bindings.

12 years agor600g: consolidate code for setting sampler views and fix bugs in the process
Marek Olšák [Sat, 14 Jul 2012 14:23:42 +0000 (16:23 +0200)]
r600g: consolidate code for setting sampler views and fix bugs in the process

Issues fixed:

- set_vs_sampler_views for evergreen is now properly implemented.

- Added the missing inval_texture_cache call for evergreen.

- have_depth_texture was sometimes incorrectly set to false on evergreen even
  if there were depth textures in other shader stages. To fix this, set it
  to true once and never set it to false again. It's stupid, but it matches
  the r600 code. The proper fix is left to another patch.

- Optimization: The sampler views which aren't changed aren't updated.

12 years agor600g: remove unused flag have_depth_fb
Marek Olšák [Sat, 14 Jul 2012 13:34:29 +0000 (15:34 +0200)]
r600g: remove unused flag have_depth_fb

This is a leftover from:

commit fe1fd675565231b49d3ac53d0b4bec39d8bc6781
Author: Marek Olšák <maraeo@gmail.com>
Date:   Sun Jul 8 03:10:37 2012 +0200

    r600g: don't flush depth textures set as colorbuffers

12 years agor600g: do fine-grained vertex buffer updates
Marek Olšák [Fri, 6 Jul 2012 01:18:06 +0000 (03:18 +0200)]
r600g: do fine-grained vertex buffer updates

If only some buffers are changed, the other ones don't have to be re-emitted.
This uses bitmasks of enabled and dirty buffers just like
emit_constant_buffers does.
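
A minimal sketch of the enabled/dirty bitmask idea, assuming at most 32
slots and using plain integers as stand-ins for buffer handles. All names
here are hypothetical and do not come from the r600g sources.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_BUFFERS 32

/* Hypothetical state: one bit per slot in each mask. */
struct sketch_vb_state {
    uint32_t enabled_mask;            /* slots that hold a buffer */
    uint32_t dirty_mask;              /* slots changed since the last emit */
    int buffers[SKETCH_MAX_BUFFERS];  /* stand-in for buffer handles */
};

/* Bind a buffer; the slot is marked dirty only if the binding changed. */
static void sketch_set_buffer(struct sketch_vb_state *s, unsigned slot,
                              int handle)
{
    assert(slot < SKETCH_MAX_BUFFERS);

    if ((s->enabled_mask & (1u << slot)) && s->buffers[slot] == handle)
        return; /* unchanged, nothing to re-emit */

    s->buffers[slot] = handle;
    s->enabled_mask |= 1u << slot;
    s->dirty_mask |= 1u << slot;
}

/* Record the slots that need emitting, then clear the dirty mask.
 * Returns how many slots were written to emitted_slots. */
static unsigned sketch_emit_dirty(struct sketch_vb_state *s,
                                  unsigned *emitted_slots)
{
    unsigned count = 0;
    uint32_t mask = s->dirty_mask & s->enabled_mask;

    for (unsigned slot = 0; mask != 0; slot++, mask >>= 1) {
        if (mask & 1u)
            emitted_slots[count++] = slot;
    }

    s->dirty_mask = 0;
    return count;
}
```

Rebinding the same handle to a slot leaves the dirty mask untouched, so
the next emit skips that slot entirely.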

12 years agor600g: don't call inval_shader_cache in r600_context_flush twice
Marek Olšák [Sat, 14 Jul 2012 02:18:49 +0000 (04:18 +0200)]
r600g: don't call inval_shader_cache in r600_context_flush twice

It's already called in r600_constant_buffers_dirty.

12 years agogallium/util: add util_bit_last - finds the last bit set in a word
Marek Olšák [Thu, 29 Mar 2012 22:21:11 +0000 (00:21 +0200)]
gallium/util: add util_bit_last - finds the last bit set in a word
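
For illustration only, a portable way to find the last set bit might look
like the sketch below. The function name is hypothetical, and the return
convention (1-based index, 0 when no bit is set) is an assumption rather
than something this commit message states. The real helper may well use a
compiler builtin instead of a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the gallium function: returns the 1-based
 * index of the most significant set bit, or 0 if word is zero. */
static unsigned sketch_last_bit(uint32_t word)
{
    unsigned index = 0;

    /* Shift right until nothing is left, counting the shifts. */
    while (word != 0) {
        word >>= 1;
        index++;
    }

    assert(index <= 32);
    return index;
}
```

With this convention the result doubles as "number of slots to walk" when
the word is a bitmask of enabled slots.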

12 years agor600g: fix all failing depth-stencil tests for evergreen
Marek Olšák [Sat, 14 Jul 2012 22:02:42 +0000 (00:02 +0200)]
r600g: fix all failing depth-stencil tests for evergreen

12 years agoconfigure.ac: Further LLVM fixups.
Michel Dänzer [Tue, 17 Jul 2012 16:30:13 +0000 (18:30 +0200)]
configure.ac: Further LLVM fixups.

* Also add mcjit in the non-OpenCL case.
* Replace hardcoded llvm-config with $LLVM_CONFIG everywhere.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoglsl: Drop obsolete .gitignore entries.
Michel Dänzer [Tue, 17 Jul 2012 09:33:01 +0000 (11:33 +0200)]
glsl: Drop obsolete .gitignore entries.

Helps spot and remove the obsolete generated files, which otherwise break
the build.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agoconfigure.ac: Add libLLVMMCJIT to the LLVM_LDFLAGS
Tom Stellard [Tue, 17 Jul 2012 14:16:58 +0000 (14:16 +0000)]
configure.ac: Add libLLVMMCJIT to the LLVM_LDFLAGS

This is necessary for linking the llvmpipe tests.  It appears this
dependency was introduced by the "wider native register" changes.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
12 years agointel: Add a comment explaining why we early return on matching BO names.
Eric Anholt [Wed, 4 Jul 2012 17:52:36 +0000 (10:52 -0700)]
intel: Add a comment explaining why we early return on matching BO names.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agointel: Drop other checks for old loader version.
Eric Anholt [Wed, 4 Jul 2012 17:52:35 +0000 (10:52 -0700)]
intel: Drop other checks for old loader version.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agointel: Replace the non-getBuffersWithFormat compat path with an error message.
Eric Anholt [Wed, 4 Jul 2012 17:52:34 +0000 (10:52 -0700)]
intel: Replace the non-getBuffersWithFormat compat path with an error message.

It's been broken (using NULL getBuffersWithFormat() instead of
getBuffers()) due to a copy and paste error for a year now.
getBuffersWithFormat() has been around since 2009, so I don't feel any
guilt in dropping support for loaders without it.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agointel: Remove dead intel_framebuffer_has_hiz().
Eric Anholt [Wed, 4 Jul 2012 17:52:33 +0000 (10:52 -0700)]
intel: Remove dead intel_framebuffer_has_hiz().

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agointel: Convert to using private depth/stencil buffers (v2)
Eric Anholt [Wed, 4 Jul 2012 17:52:32 +0000 (10:52 -0700)]
intel: Convert to using private depth/stencil buffers (v2)

This means that GLX buffer sharing of these no longer works.  On the
other hand, just *look* at this code reduction.

v2:
  - [chad] Fix intelCreateBuffer for gen < 6. When the branch for
    !screen->hw_has_separate_stencil was taken,
    intel_create_private_renderbuffer was incorrectly not used.

  - [chad] Remove all code in intel_process_dri2_buffer for processing
    depth, stencil, and hiz buffers. That code is now dead.

CC: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agointel: Add a function for creating a private window system buffer.
Eric Anholt [Wed, 4 Jul 2012 17:52:31 +0000 (10:52 -0700)]
intel: Add a function for creating a private window system buffer.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>