Brad King [Tue, 6 May 2014 15:06:47 +0000 (11:06 -0400)]
automake: Honor GL_LIB for gallium libgl-xlib
Use "@GL_LIB@" in src/gallium/targets/libgl-xlib/Makefile.am to produce
the library name specified by the configure --with-gl-lib-name option.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Tue, 13 May 2014 00:33:48 +0000 (01:33 +0100)]
configure: correctly set LD_NO_UNDEFINED
Commit
11623be934f85 was meant to have this hunk, which
I accidently dropped during git rebase.
Cc: 10.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jonathan Gray <jsg@jsg.id.au>
Roland Scheidegger [Wed, 14 May 2014 13:43:53 +0000 (15:43 +0200)]
gallivm: only fetch pointers to constant buffers once
In
1d35f77228ad540a551a8e09e062b764a6e31f5e support for multiple constant
buffers was introduced. This meant we had another indirection, and we did
resolve the indirection for each constant buffer access. This looks very
reasonable since llvm can figure out if it's the same pointer, however it
turns out that this can cause llvm compilation time to go through the roof
and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more
than 10 seconds (!)), with all the additional time spent in IR optimization
passes (and in the end all of it in DominatorTree::dominate()).
I've been unable to narrow it down a bit more (only some shaders seem affected,
seemingly without much correlation to overall shader complexity or constant
usage) but it is easily avoidable by doing the buffer lookups themeselves just
once (at constant buffer declaration time).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 14 May 2014 01:23:09 +0000 (03:23 +0200)]
gallivm: fix output stream flushing in error case for disassembly.
When there's an error, also need to flush the stream, otherwise an assertion
is hit (meaning you don't actually see the error neither).
Michel Dänzer [Wed, 14 May 2014 07:30:33 +0000 (16:30 +0900)]
radeonsi: Fix anisotropic filtering state setup
Bring it back in line with r600g. I broke this in the original radeonsi
bringup. :(
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Ilia Mirkin [Thu, 8 May 2014 01:15:12 +0000 (21:15 -0400)]
tgsi: support parsing texture offsets from text tgsi shaders
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Thu, 8 May 2014 13:06:36 +0000 (09:06 -0400)]
mesa/st: provide native integers implementation of ir_unop_any
Previously, ir_unop_any was implemented via a dot-product call, which
uses floating point multiplication and addition. The multiplication was
completely pointless, and the addition can just as well be done with an
or. Since we know that the inputs are booleans, they must already be in
canonical 0/~0 format, and the final SNE can also be avoided.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Rob Clark [Tue, 13 May 2014 02:19:03 +0000 (22:19 -0400)]
gallium/docs: clarify when query results are reset
It wasn't completely clear from the docs, so I had to figure out by
looking at piglit results. Hopefully this saves the next driver writer
implementing queries some time.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
José Fonseca [Mon, 12 May 2014 15:12:32 +0000 (16:12 +0100)]
gallivm: Remove lp_func_delete_body.
Not necessary, now that we will free the whole module (hence all
function bodies) immediately after compiling.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Mon, 12 May 2014 14:59:55 +0000 (15:59 +0100)]
gallivm: Remove gallivm_free_function.
Unused. Deprecated by gallivm_free_ir().
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Mon, 12 May 2014 14:59:13 +0000 (15:59 +0100)]
llvmpipe: Delete unneeded LLVM stuff earlier.
Same as Frank's change to draw module but for llvmpipe module.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Frank Henigman [Tue, 1 Oct 2013 19:15:43 +0000 (15:15 -0400)]
draw: Delete unneeded LLVM stuff earlier.
Free up unneeded LLVM stuff immediately after generating vertex shader
code. Saves about 500K per shader.
v2: Don't bother calling gallivm_free_function (Jose)
Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Frank Henigman [Tue, 1 Oct 2013 19:15:42 +0000 (15:15 -0400)]
gallivm: Separate freeing LLVM intermediate data from freeing final code.
Split free_gallivm_state() into two steps. First step is
gallivm_free_ir() which cleans up the LLVM scaffolding used to generate
code while preserving the code itself. Second step is
gallivm_free_code() to free the memory occupied by the code.
v2: s/gallivm_teardown/gallivm_free_ir/ (Jose)
Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Frank Henigman [Tue, 1 Oct 2013 19:15:41 +0000 (15:15 -0400)]
gallivm: One code memory pool with deferred free.
Provide a JITMemoryManager derivative which puts all generated code into
one memory pool instead of creating a new one each time code is generated.
This saves significant memory per shader as the pool size is 512K and
a small shader occupies just several K.
This memory manager also defers freeing generated code until you tell
it to do so, making it possible to destroy the LLVM engine while keeping
the code, thus enabling future memory savings.
v2: Fix compilation errors with LLVM 3.4 (Jose)
Signed-off-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Mon, 12 May 2014 13:29:04 +0000 (14:29 +0100)]
gallivm: Run passes per module, not per function.
This is how it is meant to be done nowadays.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Thu, 8 May 2014 14:21:49 +0000 (15:21 +0100)]
gallivm: Use LLVM global context.
I saw that LLVM internally uses its global context for some things, even
when we use our own. Given ours is also global, might as well use
LLVM's.
However, sepearate contexts can still be enabled with a simple source
code modification, for when the need/benefit arises.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Mon, 12 May 2014 13:03:47 +0000 (14:03 +0100)]
gallivm: Stop using module providers.
Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so
its use just causes unnecessary confusion.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Thu, 8 May 2014 12:25:28 +0000 (13:25 +0100)]
gallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1.
Older versions haven't been tested probably don't work anyway. But more
importantly, code supporting it is hindering further work.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Mon, 12 May 2014 15:30:51 +0000 (16:30 +0100)]
configure: Require LLVM 3.1.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Thu, 8 May 2014 12:32:07 +0000 (13:32 +0100)]
scons: Require LLVM 3.1
Support for prior versions will be removed in the following change.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Matt Turner [Fri, 2 May 2014 21:49:24 +0000 (14:49 -0700)]
i965: Reformat brw_set_src1 so it can be easily found with grep.
Samuel Iglesias Gonsalvez [Thu, 8 May 2014 13:55:08 +0000 (15:55 +0200)]
i965: fix size assert for gen7 in brw_init_compaction_tables()
It should compare with it's own size.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Iago Toral Quiroga [Wed, 7 May 2014 07:58:43 +0000 (09:58 +0200)]
i965: Relax accumulator dependency scheduling on Gen < 6
Many instructions implicitly update the accumulator on Gen < 6. The instruction
scheduling code just calls add_barrier_deps() for each accumulator access on
these platforms, but a large class of operations don't actually update the
accumulator -- mostly move and logical instructions. Teaching the scheduling
code about this would allow more flexibility to schedule instructions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jonathan Gray [Wed, 14 May 2014 05:13:55 +0000 (15:13 +1000)]
glsl: simplify the M_PI*f macros, fixes build on OpenBSD
The M_PI*f macros used a preprocessor paste to append 'f'
to M_PI defines, which works if the values are only numbers
but breaks on OpenBSD where M_PI definitions have casts
and brackets to meet requirements of a future version of POSIX,
http://austingroupbugs.net/view.php?id=801
http://austingroupbugs.net/view.php?id=828
Simplify the M_PI*f macros by using casts directly in the defines
as suggested by Kenneth Graunke.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Carl Worth [Wed, 14 May 2014 00:30:17 +0000 (17:30 -0700)]
docs: Really add the 10.1.3 release nots this time
Commit
a96c3bccf6791359d1159ebe9475e0ed5cf790ed intended to add these, but I
forgot to add the file.
Rob Clark [Sun, 11 May 2014 18:15:32 +0000 (14:15 -0400)]
freedreno/a3xx: occlusion query support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 10 May 2014 17:45:54 +0000 (13:45 -0400)]
freedreno: add support for hw queries
Real GPU queries need some infrastructure to track samples per tile and
accumulate the results. But fortunately this can be shared across GPU
generation.
See:
https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 9 May 2014 21:33:19 +0000 (17:33 -0400)]
freedreno/query: allow multiple query implementations
Split out fd_query into an abstract base class, to allow multiple
implementations. The current sw based queries are moved into
fd_sw_query.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Kenneth Graunke [Mon, 12 May 2014 03:22:48 +0000 (20:22 -0700)]
mesa: Dump ARB_vp/fp source and IR when MESA_GLSL=dump.
As far as I can tell, Mesa hasn't had a convenient way to dump ARB_vp/fp
source until now. Using MESA_GLSL=dump is convenient, since it means
you can use a single environment variable to dump a program's shaders,
no matter which language they're written in.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Mon, 12 May 2014 00:20:08 +0000 (17:20 -0700)]
i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage.
The point of copytexsubimage_using_blit_framebuffer is to use a hardware
accelerated BlitFramebuffer path. If that fails, we shouldn't do a
swrast blit---we should try our CTSI fallback code.
This is especially important for i965 and GLES, where we don't even
create a swrast context.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Jordan Justen [Tue, 13 May 2014 18:06:59 +0000 (18:06 +0000)]
i965/gen8: Set depth extent field
The depth extent field is used to limit the allowed slice range that
can be rendered to.
With the previous setting, only slice 0 could be rendered.
This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen8 depth: Set depth size based on LOD0 for 3D textures
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen7 depth: Set depth size based on LOD0 for 3D textures
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen8 renderbuffer: Set depth size based on LOD0 for 3D textures
Fixes piglit's
'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Jordan Justen [Sat, 10 May 2014 21:48:47 +0000 (14:48 -0700)]
i965/gen7 renderbuffer: Set depth size based on LOD0 for 3D textures
If blorp is disabled for color clears, then piglit's
'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped'
will fail.
Currently, gen8 fails similarly on this test because gen8
does not use blorp.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Rob Clark [Sun, 11 May 2014 15:57:20 +0000 (11:57 -0400)]
freedreno/a3xx: add point-size
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sun, 11 May 2014 15:51:41 +0000 (11:51 -0400)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Bryan Cain [Tue, 6 May 2014 03:23:38 +0000 (22:23 -0500)]
glsl_to_tgsi: remove unnecessary dead code elimination pass
With the more advanced dead code elimination pass already being run,
eliminate_dead_code was making no difference in instruction count, and had
an undesirable O(n^2) runtime. So remove it and rename
eliminate_dead_code_advanced to eliminate_dead_code.
Reviewed-by: Marek Olšák <marek.olsak at amd.com>
José Fonseca [Fri, 9 May 2014 09:55:17 +0000 (10:55 +0100)]
ralloc: Omit detailed license information about talloc.
That information misleads source code auditing tools to think that
ralloc itself is released under LGPL v3.
Instead, simply state talloc is not licensed under a permissive license.
v2: Use wording suggested by Kenneth.
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Iago Toral Quiroga [Thu, 8 May 2014 11:29:20 +0000 (13:29 +0200)]
i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims()
We always call brw_merge_inputs() right before looping over the primitives but
this can be called inside the loop for each primitive too. In the case we do it
for the first primitive the call is redundant and can be skipped.
Reviewed-by: Eric Anholt <eric@anholt.net>
Iago Toral Quiroga [Fri, 9 May 2014 06:50:03 +0000 (08:50 +0200)]
glsl: Do not call lhs->variable_referenced() multiple times
Instead take the result from the first call and use it where needed.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Wed, 7 May 2014 05:42:29 +0000 (08:42 +0300)]
meta: Refactor state save/restore for framebuffer texture blits
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kristian Høgsberg [Mon, 12 May 2014 22:40:11 +0000 (15:40 -0700)]
wayland: Move version 2 request to end of interface specification
We're moving towards requiring interface additions to be appended to the
end of the interface block. No functional change, opcodes are assigned as
before, but version 2 additions are now grouped together, which prevents
a scanner warning.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Timothy Arceri [Sun, 11 May 2014 12:11:21 +0000 (22:11 +1000)]
glsl: the number of samplers is already calculated so use it
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Tue, 6 May 2014 22:19:55 +0000 (15:19 -0700)]
i965: Stop doing remapping of "special" regs.
Now that we aren't using pixel_[xy] in live variables, nothing is looking
at these regs after the visitor stage.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 6 May 2014 20:22:10 +0000 (13:22 -0700)]
i965: Generalize the pixel_x/y workaround for all UW types.
This is the only case where a fs_reg in brw_fs_visitor is used during
optimization/code generation, and it meant that optimizations had to be
careful to not move pixel_x/y's register number without updating it.
Additionally, it turns out we had a couple of other UW values that weren't
getting this treatment (like gl_SampleID), so this more general fix is
probably a good idea (though I wasn't able to replicate problems with
either pixel_[xy]'s values or gl_SampleID, even when telling the register
allocator to reuse registers immediately)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 23 Apr 2014 21:21:21 +0000 (14:21 -0700)]
i965: Move has_hiz from the slice to the level.
The value depends only on the level, so no need to store the bool per slice.
Shrinks intel_mipmap_slice from 24 bytes to 16, while slotting into an
existing hole in intel_mipmap_level.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Topi Pohjolainen [Mon, 5 May 2014 19:49:47 +0000 (22:49 +0300)]
meta: Refactor configuration of renderbuffer sampling
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Mon, 5 May 2014 14:37:40 +0000 (17:37 +0300)]
meta: Refactor binding of renderbuffer as texture image
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Thu, 17 Apr 2014 23:21:13 +0000 (02:21 +0300)]
meta: Merge compiling and linking of blit program
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Tue, 22 Apr 2014 19:11:27 +0000 (22:11 +0300)]
i965/blorp: Expose coordinate scissoring and mirroring
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Wed, 7 May 2014 09:12:02 +0000 (12:12 +0300)]
i965/gen8: Use helper variables for surface parameters
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ilia Mirkin [Sat, 10 May 2014 21:51:21 +0000 (17:51 -0400)]
nv50,nvc0: fix blit 3d path for 1d array textures
Need to adjust coordinates since the shader receives the array index as
depth in z, but the TEX instruction expects it to be the second
coordinate for a 1D array texture. This fixes fbo-generatemipmap-array.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 3 May 2014 07:00:07 +0000 (03:00 -0400)]
nv50,nvc0: leave queries on during blit, turn them on for 2d engine
Fixes the new logic of the conditional rendering piglit test.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 10 May 2014 14:25:29 +0000 (10:25 -0400)]
mesa/st: leave current query enabled during glBlitFramebuffer
Also make sure that pipe_blit_info gets zero'd out so that query isn't
accidentally left enabled.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 10 May 2014 14:22:17 +0000 (10:22 -0400)]
gallium: add bit to pipe_blit_info to leave current query enabled
Previously the implication was that queries should be disabled during
blits. However glBlitFramebuffer() is supposed to obey the current
query, and this new bit will indicate that to the driver.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ilia Mirkin [Sat, 10 May 2014 03:13:38 +0000 (23:13 -0400)]
nv50: fix setting of texture ms info to be per-stage
Different textures may be bound to each slot for each stage. So we need
to be able to upload ms parameters for each one without stages
overwriting each other.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 10 May 2014 19:02:36 +0000 (15:02 -0400)]
nv50/ir: make sure to reverse cond codes on all the OP_SET variants
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
Rob Clark [Sun, 11 May 2014 12:57:38 +0000 (08:57 -0400)]
freedreno/a2xx: fix compiler warning
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Marek Olšák [Tue, 6 May 2014 17:59:53 +0000 (19:59 +0200)]
radeonsi: prepare depth export registers at compile time
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 6 May 2014 17:55:48 +0000 (19:55 +0200)]
radeonsi: simplify depth/stencil export code
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 6 May 2014 11:57:31 +0000 (13:57 +0200)]
radeon/llvm: add support for non-scalar system values
The sample position is one of them.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 6 May 2014 12:10:47 +0000 (14:10 +0200)]
radeonsi: add and use a helper function for loading constants
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 5 May 2014 20:16:45 +0000 (22:16 +0200)]
radeonsi: only count CS space for state atoms if we're going to draw
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 30 Apr 2014 19:21:17 +0000 (21:21 +0200)]
radeonsi: remove unused variable exports_ps in si_pipe_shader_ps
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 29 Apr 2014 23:03:40 +0000 (01:03 +0200)]
radeonsi: use DRAW_PREAMBLE on CIK
It's the same as setting the 3 regs separately, but shorter, and it also
seems to be required on GFX7.2 and later. This doesn't fix Hawaii.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 6 May 2014 11:53:59 +0000 (13:53 +0200)]
r600g: simplify framebuffer state size computation
Take the upper bound. The number doesn't have to absolutely correct, only safe.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Kenneth Graunke [Fri, 9 May 2014 21:45:57 +0000 (14:45 -0700)]
Revert "i965: Fix depth (array slices) computation for 1D_ARRAY render targets."
This reverts commit
e6967270c75a5b669152127bb7a746d55f4407a6.
Chris Forbes pointed out that this is broken for texture views which
restrict the number of slices. He committed a better fix which makes
this unnecessary.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Emil Velikov [Thu, 8 May 2014 15:49:45 +0000 (16:49 +0100)]
egl_dri2: cleanup memory leak in dri2_create_context()
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Emil Velikov [Thu, 8 May 2014 18:09:39 +0000 (19:09 +0100)]
ilo: destroy the mutex, if winsys creation fails
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Emil Velikov [Wed, 7 May 2014 21:30:43 +0000 (22:30 +0100)]
glx/tests: Partially revert commit
51e3569573a7b3f8da0df093836761003fcdc414
C++ does not support designated initializers, thus compilation
is not guaranteed to succeed. Surprisingly gcc 4.6.3 fails to
build the code, while version 4.9.0 compiles it without a hitch.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78403
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Emil Velikov [Mon, 5 May 2014 21:09:22 +0000 (22:09 +0100)]
configure: error out if building GBM without dri
Both backends require --enable-dri, and building an empty libgbm
makes little to no sense. Error out at configure to prevent the
user from shooting themselves in the foot.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78225
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chia-I Wu [Tue, 18 Feb 2014 04:25:06 +0000 (12:25 +0800)]
mesa: propagate FragDepthLayout to gl_program
The information was lost during linking, causing the layout to be treated as
FRAG_DEPTH_LAYOUT_NONE.
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Thu, 8 May 2014 23:28:49 +0000 (11:28 +1200)]
glsl: Rename linker's is_varying_var
Both the ast->IR and linker have functions with this name, but different
behavior.
Rename the linker's version to var_counts_against_varying_limit to be
closer to what it is actually used for.
Suggested by Ian a while back.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Kenneth Graunke [Thu, 8 May 2014 23:44:37 +0000 (16:44 -0700)]
i965: Fix GPU hangs on Broadwell in shaders with some control flow.
According to the documentation, we need to set the source 0 register
type to IMM for flow control instructions that have both JIP and UIP.
Fixes GPU hangs in approximately 10 Piglit tests, 5 es3conform tests,
Unigine Crypt, a WebGL raytracer demo, and several Steam titles.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75478
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75878
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76939
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Tested-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tom Stellard [Fri, 9 May 2014 08:24:42 +0000 (04:24 -0400)]
radeonsi: Enable geometry shaders with LLVM 3.4.1
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Fri, 9 May 2014 08:23:50 +0000 (04:23 -0400)]
configure.ac: Add LLVM_VERSION_PATCH to DEFINES
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Carl Worth [Fri, 9 May 2014 14:52:26 +0000 (07:52 -0700)]
docs: Import 10.1.3 release notes, andd news item.
Thomas Hellstrom [Thu, 8 May 2014 07:08:10 +0000 (09:08 +0200)]
st/xa: Fix performance regression introduced by commit "Cache render target surface"
The mentioned commit has the nasty side-effect of turning off accelerated
copies.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Tom Stellard [Fri, 9 May 2014 01:08:32 +0000 (21:08 -0400)]
clover: Destory pipe_screen when device does not support compute v2
v2:
- Make sure screen was successfully created before destroying it.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Fri, 9 May 2014 01:05:02 +0000 (21:05 -0400)]
pipe-loader: Don't destroy the winsys in the sw loader
The screen takes ownership of the winsys, and is responsible for
destroying it. Users of pipe-loader should make sure they destory
and screens they've created to avoid memory leaks.
This fixes a crash in clover introduced by
ce6c17c0833032e91a2d1b34f9eb80c738a854a2 where the pipe-loader was
destroying the winsys while a screen was still using it.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Chris Forbes [Thu, 8 May 2014 05:06:01 +0000 (17:06 +1200)]
i965/Gen8: Set up layer constraints properly for depth buffers
Same issues as the previous commit fixed for Gen7:
- Bogus physical->logical layer conversion; depth/stencil surfaces
are still IMS layout on Gen8.
- mt_layer ignored in layered rendering case, which breaks handling
of views with MinLayer.
- Render target array extent not set correctly for arrays.
I'm not able to test this one since I can't get a Broadwell yet, but
it's the same set of fixes as for Gen7.
V2: Restore the MAX2() to account for zero depth/layer_count.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Thu, 8 May 2014 04:29:41 +0000 (16:29 +1200)]
i965/Gen7: Set up layer constraints properly for depth buffers
Again, a few problems:
- Layered attachments did not honor MinLayer.
- Non-layered MSAA attachments rendered to the wrong layer due to
dividing by the layer count. All depth buffers use the IMS layout, so
the physical layer count == logical layer count.
- Layered attachments were not limited to irb->layer_count, so we could
render off the end of the texture.
V2: Restore the MAX2() to account for zero depth/layer_count.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Thu, 8 May 2014 05:15:59 +0000 (17:15 +1200)]
i965/Gen8: Set up layer constraints properly for renderbuffers
Fixing the same issues the previous commit does for Gen7.
Note that I can't test this one, since I don't have a Broadwell.
V2: Restore the MAX2() to account for zero depth/layer_count.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Thu, 8 May 2014 04:02:16 +0000 (16:02 +1200)]
i965/Gen7: Set up layer constraints properly for renderbuffers
There were a few problems here, which mostly just broke layered
rendering into a view:
- Render target view extent was always set to be == depth. This is
benign for non-layered-rendering, but allows writes off the end of the
render target for layered rendering, which ends badly.
- Layered rendering did not honor the mt_layer setting, so would not
properly handle MinLayer being set on a view.
V2: Restore the MAX2() to account for zero depth/layer_count.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Thu, 8 May 2014 21:34:34 +0000 (09:34 +1200)]
i965: Fix typo in assert message
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Adam Jackson [Tue, 22 Apr 2014 16:46:08 +0000 (12:46 -0400)]
radeonsi: Don't use anonymous struct trick in atom tracking
I'm somewhat impressed that current gccs will let you do this, but
sufficiently old ones (including 4.4.7 in RHEL6) won't.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Roland Scheidegger [Thu, 8 May 2014 14:25:47 +0000 (16:25 +0200)]
llvmpipe: change LP_MAX_SHADER_INSTRUCTIONS limit definition.
When the limit was changed to be defined in terms of LP_MAX_SHADER_VARIANTS
(
75f1fea14f524ef05e980d825fda3ae226ae2ffe) when it was increased, this
inadvertently lowered the limit in some branches (that have a lower
LP_MAX_SHADER_VARIANTS number) when merged. So, make sure the limit is always
at least the number it once was.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 7 May 2014 17:06:39 +0000 (19:06 +0200)]
draw: do not use draw_get_option_use_llvm() inside draw execution paths
1c73e919a4b4dd79166d0633075990056f27fd28 made it possible to not allocate
the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is
not reliable after draw context creation, since drivers can explicitly
request a non-llvm draw context even if draw_get_option_use_llvm() would
return true (and softpipe does just that) which leads to crashes.
Thus use draw->llvm to determine if we're using llvm or not instead (and
make draw->llvm available even if HAVE_LLVM is false so we don't have to put
even more ifdefs).
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Wed, 7 May 2014 21:35:43 +0000 (14:35 -0700)]
i965: Fix depth (array slices) computation for 1D_ARRAY render targets.
1D array targets store the number of slices in the Height field.
Fixes Piglit's spec/!OpenGL 3.2/layered-rendering/clear-color-all-types
1d_array single_level, at least when used with Meta clears.
Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Wed, 7 May 2014 21:35:42 +0000 (14:35 -0700)]
mesa: Fix MaxNumLayers for 1D array textures.
1D array targets store the number of slices in the Height field.
Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Tue, 6 May 2014 23:04:40 +0000 (16:04 -0700)]
i965: Enable GL_ARB_texture_view on Broadwell.
This is a port of commit
c9c08867ed07ceb10b67ffac5f0a33812710a5e8.
A tiny bit of extra work was necessary to not break stencil texturing.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ilia Mirkin [Tue, 6 May 2014 06:51:45 +0000 (02:51 -0400)]
mesa: pass target through to driver when choosing texture format
This only matters for TextureView where the texObj's target has not been
set yet, in all other instances, texObj->target should be the same as
the passed-in target parameter.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ilia Mirkin [Tue, 6 May 2014 23:54:59 +0000 (19:54 -0400)]
nv50/ir/gk110: fix set with f32 dest
Should fix comparison opcodes like SGE/SLT/etc which expected a float to
be returned. These were previously getting integer 0/-1 values.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Cc: 10.2 <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 3 May 2014 04:26:14 +0000 (00:26 -0400)]
nv50/ir: allow load propagation when flags are defined
The old condition disallowed load propagation any time flags were
defined, even with e.g. set and a constbuf reference. The new condition
disallows it only with immediate propagation. (There are no opcodes that
set the condition flag and have an immediate argument.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 27 Apr 2014 03:47:14 +0000 (23:47 -0400)]
mesa/st: pass 4-offset TG4 without lowering if supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Sun, 27 Apr 2014 03:44:57 +0000 (23:44 -0400)]
gallium: add a cap for supporting 4-offset TG4 opcodes
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Mon, 5 May 2014 16:19:56 +0000 (10:19 -0600)]
svga: add switch case for PIPE_SHADER_CAP_PREFERRED_IR, remove default case
Remove default switch case so we're warned of missing cases at compile
time.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 5 May 2014 16:18:49 +0000 (10:18 -0600)]
tgsi: add missing switch cases in tgsi_exec_get_shader_param()
Add cases for PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS and
PIPE_SHADER_CAP_PREFERRED_IR. Remove default switch case so we
learn of missing cases at compile time.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 5 May 2014 16:17:48 +0000 (10:17 -0600)]
gallivm: add PIPE_SHADER_CAP_PREFERRED_IR switch case, remove default
Return PIPE_SHADER_IR_TGSI for the PIPE_SHADER_CAP_PREFERRED_IR query.
Remove default switch case so we learn of missing switch cases at
compile time.
Reviewed-by: José Fonseca <jfonseca@vmware.com>