Zack Rusin [Sat, 27 Apr 2013 06:51:26 +0000 (02:51 -0400)]
draw/so: fix overflow calculation
Only report overflow for missing targets if they're actually being
used. If the targets are missing but are not being used by any
slot in the stream output declaration we should correctly just
ignore them.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
José Fonseca [Mon, 29 Apr 2013 14:40:06 +0000 (15:40 +0100)]
llvmpipe: Fix queries when screen->num_threads == 0.
That is, when llvmpipe is run in single-threaded mode.
Trivial.
Tested with
LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
José Fonseca [Mon, 29 Apr 2013 14:12:26 +0000 (15:12 +0100)]
Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents"
This reverts commit
5649f886f76023532538b8792605a3578cec1ed1.
It causes segfaults when size is zero.
Jerome Glisse [Wed, 24 Apr 2013 23:15:52 +0000 (19:15 -0400)]
r600g: force full cache for hyperz
It seems that in some cases allowing half cache usage confuses the GPU
and triggers a lockup. Force full cache use.
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Rob Clark [Mon, 29 Apr 2013 11:36:27 +0000 (07:36 -0400)]
freedreno: fix rebase screw-up
Add back 2nd arg to emit_vertexbufs() which got lost in rebase.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Chris Forbes [Fri, 26 Apr 2013 23:00:46 +0000 (11:00 +1200)]
i965/fs: Don't try to use bogus interpolation modes pre-Gen6.
Interpolation modes other than perspective-barycentric-pixel-center (and
their associated coefficients in the WM payload) only exist in Gen6 and
later.
Unfortunately, if a varying was declared as `centroid`, we would blindly
read the nonexistent values, and so produce all manner of bad behavior
-- texture swimming, snow, etc.
Fixes rendering in Counter-Strike Source and Team Fortress 2 on
Ironlake.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Matt Turner [Sun, 28 Apr 2013 21:35:01 +0000 (14:35 -0700)]
i965/vs: Fix order of source arguments to LRP.
The order of arguments matches DirectX, and is backwards from GLSL's
mix() built-in.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983
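A rough scalar illustration of the argument-order mismatch described in the
LRP fix above. It assumes the DirectX-style definition
lrp(a, y, x) = a*y + (1 - a)*x and the GLSL definition
mix(x, y, a) = x*(1 - a) + y*a. The function names are hypothetical and this
is not the i965 backend code.

```c
#include <assert.h>

/* DirectX-style LRP: the interpolation weight is the first argument. */
float lrp(float a, float y, float x)
{
    return a * y + (1.0f - a) * x;
}

/* GLSL-style mix(): the weight is the last argument, so the sources
 * have to be reversed when mapping mix() onto LRP. */
float glsl_mix(float x, float y, float a)
{
    return lrp(a, y, x);
}

/* Small demonstration: both spellings describe the same blend. */
void lrp_demo(void)
{
    assert(glsl_mix(4.0f, 8.0f, 0.25f) == 5.0f);
    assert(lrp(0.25f, 8.0f, 4.0f) == 5.0f);
}
```

Passing the operands to LRP in mix() order would instead compute
4*8 + (1 - 4)*0.25, which is why the source order matters.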
Zack Rusin [Sat, 27 Apr 2013 04:52:49 +0000 (00:52 -0400)]
llvmpipe: stop crashing when one of the so targets is null
Fixes a crash when one of the so targets is null.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Zack Rusin [Sat, 27 Apr 2013 04:49:23 +0000 (00:49 -0400)]
draw/so: indicate overflow when buffer is missing
We were crashing if one of the buffers wasn't set, we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Zack Rusin [Sat, 27 Apr 2013 02:53:07 +0000 (22:53 -0400)]
gallivm: fix indirect addressing of temps in soa mode
We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
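A minimal sketch of the indexing problem described in the SoA fix above,
under a simplified layout that is an assumption of this example: registers
are stored as regs[reg * LANES + lane], and each lane carries its own
indirect register index. LANES and soa_gather() are hypothetical names, not
gallivm API.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* hypothetical SoA vector width */

/* Gather one value per lane from a register file laid out as
 * regs[reg * LANES + lane]; reg_index[lane] picks the register
 * that lane wants to read. */
void soa_gather(const float *regs, size_t num_regs,
                const unsigned reg_index[LANES], float out[LANES])
{
    for (unsigned lane = 0; lane < LANES; lane++) {
        assert(reg_index[lane] < num_regs);
        /* The "+ lane" term is the per-lane SoA offset. Without it,
         * every lane would read lane 0 of the selected register. */
        out[lane] = regs[reg_index[lane] * LANES + lane];
    }
}
```

With two registers {0,1,2,3} and {10,11,12,13} and per-lane indices
{1,0,1,0}, the gather yields {10,1,12,3}. Dropping the lane offset would
yield {10,0,10,0}.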
Zack Rusin [Wed, 24 Apr 2013 03:36:40 +0000 (23:36 -0400)]
tgsi/ureg: Add a function to return the number of outputs
We already hold the variable, just weren't providing access
to it.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Zack Rusin [Tue, 23 Apr 2013 22:56:47 +0000 (18:56 -0400)]
draw/so: Fix overflow calculations
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
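A sketch of an overflow check that accounts for the three quantities named
in the overflow fix above (buffer offset, destination offset, stride). All
names and the byte-based units are assumptions made for illustration. This
is not the draw module's code, and it ignores wrap-around in the
multiplication.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if writing `write_size` bytes for vertex number `vertex`
 * would run past the end of a buffer of `buffer_size` bytes. */
bool so_write_overflows(size_t buffer_size, size_t buffer_offset,
                        size_t stride, size_t dst_offset,
                        size_t vertex, size_t write_size)
{
    /* In this sketch one slot must fit inside one vertex record. */
    assert(dst_offset + write_size <= stride);

    size_t start = buffer_offset + vertex * stride + dst_offset;
    return start > buffer_size || write_size > buffer_size - start;
}
```

For a 64-byte buffer with offset 16, stride 16, destination offset 4 and
12-byte writes, vertices 0 to 2 fit and vertex 3 overflows.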
Zack Rusin [Tue, 23 Apr 2013 22:47:08 +0000 (18:47 -0400)]
draw/llvm: fix viewport transformations
This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputting untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Sat, 27 Apr 2013 03:00:38 +0000 (23:00 -0400)]
gallium: increase the number of available stream output decls
There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Tue, 23 Apr 2013 10:19:14 +0000 (06:19 -0400)]
llvmpipe: implement so_overflow query
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 26 Apr 2013 19:49:35 +0000 (13:49 -0600)]
mesa: fix the compressed TexSubImage size checking code
Before, we'd incorrectly generate an error if we tried to
replace a non-4x4 block near the edge of a NPOT compressed texture.
For example, if the dest image was 15 texels wide and xoffset=12
and width=3 we'd incorrectly generate GL_INVALID_OPERATION.
Verified with new tests added to piglit s3tc-errors test.
Note: This is a candidate for the stable branches.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
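A sketch of the kind of per-dimension rule the TexSubImage fix above
describes, assuming the usual S3TC-style constraint: the offset must be
block-aligned, and a partial block is only allowed when the region ends
exactly at the image edge. The function name is hypothetical and this is not
Mesa's checking code.

```c
#include <assert.h>
#include <stdbool.h>

/* Checks one dimension (x or y) of a compressed sub-image update. */
bool compressed_subimage_dim_ok(unsigned image_size, unsigned block_size,
                                unsigned offset, unsigned size)
{
    assert(block_size > 0);

    if (offset % block_size != 0)
        return false;               /* must start on a block boundary */
    if (offset + size > image_size)
        return false;               /* must stay inside the image */
    if (size % block_size != 0 && offset + size != image_size)
        return false;               /* partial block only at the edge */
    return true;
}
```

With the numbers from the commit message (image 15 texels wide, 4-texel
blocks, xoffset=12, width=3) the check accepts the update, because the
3-texel partial block ends exactly at the image edge.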
Brian Paul [Fri, 26 Apr 2013 13:31:49 +0000 (07:31 -0600)]
llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 26 Apr 2013 13:26:46 +0000 (07:26 -0600)]
llvmpipe: bump LP_MAX_THREADS to 16
On the mesa-users list, Burlen Loring reported a speed-up with 16 cores
and his test/app.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 26 Apr 2013 13:26:06 +0000 (07:26 -0600)]
mesa: updated read_buffer_enum_to_index() comment
Remove the part about the value of gl_framebuffer::Name.
Christian König [Fri, 26 Apr 2013 09:49:55 +0000 (11:49 +0200)]
r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2
That is just not supported by the hardware.
v2: fix compare
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 26 Apr 2013 09:16:19 +0000 (11:16 +0200)]
radeon/uvd: stop using anonymous unions
Signed-off-by: Christian König <christian.koenig@amd.com>
Tapani Pälli [Thu, 18 Apr 2013 06:21:27 +0000 (09:21 +0300)]
mesa: fix type comparison errors in sub-texture error checking code
This patch fixes a crash that happens if glTexSubImage2D is called with a
negative xoffset.
NOTE: This is a candidate for stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Thu, 25 Apr 2013 17:51:17 +0000 (18:51 +0100)]
Revert "draw: Yield zeros for LLVM fetches of non-existing vertex elements."
After more thought/discussion, it seems it is better to handle this sort
of stuff in the state tracker.
So this reverts commit
12096f334b82340dc165ed15e6f8f44d4cf94df4, except the
variant->key -> key shorthands.
Chia-I Wu [Wed, 12 Dec 2012 22:01:23 +0000 (06:01 +0800)]
ilo: add the driver to the build system
Add ilo to targets/egl-static and add a new target dri-ilo. Update autoconf
and automake rules.
Chia-I Wu [Wed, 12 Dec 2012 21:48:46 +0000 (05:48 +0800)]
ilo: compile VS/GS/FS with the toy compiler
Chia-I Wu [Wed, 12 Dec 2012 21:48:28 +0000 (05:48 +0800)]
ilo: add a toy shader compiler
This is a simple shader compiler that performs almost zero optimizations. The
generated code is usually much larger than that generated by i965.
The generated code also requires many more registers.
Function-wise, it lacks register spilling and does not support most TGSI
indirections. Other than those, it works alright.
Chia-I Wu [Wed, 12 Dec 2012 21:44:41 +0000 (05:44 +0800)]
ilo: hook up pipe context GPGPU functions
This just adds a stub.
Chia-I Wu [Wed, 12 Dec 2012 21:43:04 +0000 (05:43 +0800)]
ilo: hook up pipe context video functions
This just hooks them up with auxiliary/vl layer.
Chia-I Wu [Wed, 12 Dec 2012 21:35:37 +0000 (05:35 +0800)]
ilo: add support for time/occlusion/primitive queries
Chia-I Wu [Tue, 16 Apr 2013 08:36:03 +0000 (16:36 +0800)]
ilo: hook up pipe context 3D functions
Chia-I Wu [Tue, 16 Apr 2013 10:09:35 +0000 (18:09 +0800)]
ilo: add GEN7 support for 3D pipeline
Chia-I Wu [Wed, 12 Dec 2012 21:28:42 +0000 (05:28 +0800)]
ilo: add 3D pipeline for GEN6
The 3D pipeline is a high-level interface to emit 3D commands and states. It
uses GEN6 GPE to do the real work.
Chia-I Wu [Tue, 16 Apr 2013 10:09:01 +0000 (18:09 +0800)]
ilo: add GEN7 GPE
Chia-I Wu [Wed, 12 Dec 2012 21:23:34 +0000 (05:23 +0800)]
ilo: add GEN6 GPE
GEN6 GPE (Graphics Processing Engine) is a low-level interface to emit 3D
commands and states.
Chia-I Wu [Wed, 12 Dec 2012 21:20:40 +0000 (05:20 +0800)]
ilo: hook up pipe context query functions
None of the query types are supported yet.
Chia-I Wu [Wed, 12 Dec 2012 21:18:25 +0000 (05:18 +0800)]
ilo: hook up pipe context transfer functions
Chia-I Wu [Wed, 12 Dec 2012 21:15:10 +0000 (05:15 +0800)]
ilo: hook up pipe context blit functions
Chia-I Wu [Tue, 16 Apr 2013 08:27:50 +0000 (16:27 +0800)]
ilo: hook up pipe context state functions
Chia-I Wu [Wed, 12 Dec 2012 21:05:01 +0000 (05:05 +0800)]
ilo: add functions to manage shaders
This commit adds the shader cache, shader state, shader variants, etc. It does
not add the shader compiler though.
Chia-I Wu [Tue, 16 Apr 2013 08:24:40 +0000 (16:24 +0800)]
ilo: hook up pipe context flush function
Chia-I Wu [Wed, 12 Dec 2012 20:36:41 +0000 (04:36 +0800)]
ilo: add command parser
The command parser manages batch buffers and command submissions.
Chia-I Wu [Wed, 12 Dec 2012 20:44:21 +0000 (04:44 +0800)]
ilo: hook up pipe screen resource functions
Chia-I Wu [Wed, 12 Dec 2012 20:43:01 +0000 (04:43 +0800)]
ilo: hook up pipe screen format functions
Chia-I Wu [Wed, 12 Dec 2012 20:26:23 +0000 (04:26 +0800)]
ilo: hook up pipe_screen param and fence functions
Chia-I Wu [Wed, 12 Dec 2012 20:24:40 +0000 (04:24 +0800)]
ilo: add debug flags settable through ILO_DEBUG
Chia-I Wu [Wed, 12 Dec 2012 20:07:16 +0000 (04:07 +0800)]
ilo: new pipe driver for Intel GEN6+
This commit adds some boilerplate code. The header files found under include/
are copied from i965.
Chia-I Wu [Wed, 12 Dec 2012 19:52:50 +0000 (03:52 +0800)]
winsys/intel: new winsys for intel
This is a wrapper for libdrm_intel to allow the pipe driver to stay OS
agnostic.
José Fonseca [Fri, 26 Apr 2013 07:43:00 +0000 (08:43 +0100)]
gallivm: Fix trivial out-of-bounds indirection in lp_build_cube_lookup().
Courtesy of clang:
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
Matt Turner [Thu, 25 Apr 2013 18:03:38 +0000 (11:03 -0700)]
i965/vs: Add support for LRP instruction.
Only 13 affected programs in shader-db, but they were all helped.
total instructions in shared programs: 368877 -> 368851 (-0.01%)
instructions in affected programs: 1576 -> 1550 (-1.65%)
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Thu, 25 Apr 2013 18:02:02 +0000 (11:02 -0700)]
i965/vs: Add a function to fix-up uniform arguments for 3-src insts.
Three-source instructions have a vertical stride overloaded to 4, which
prevents directly using vec4 uniforms as arguments. Instead we need to
insert a MOV instruction to do the replication for the three-source
instruction.
With this in place, we can use three-source instructions in the vertex
shader. While some thought needs to go into deciding whether it's better
to use a three-source instruction rather than a sequence of equivalent
instructions (when one or more sources are uniforms or immediates), this
will allow us to skip a lot of ugly lowering code and use the BFE and
BFI2 instructions directly.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jerome Glisse [Tue, 23 Apr 2013 23:22:33 +0000 (19:22 -0400)]
winsys/radeon: consolidate tracing into winsys v2
This moves the tracing timeout and printing into the winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).
Many files are touched because of winsys API changes.
v2: Do not write lockup file if ib uniq id does not match last one
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Mon, 22 Apr 2013 16:12:07 +0000 (09:12 -0700)]
r600g/compute: Removed unused and untested code
There was a lot of code in evergreen_compute_internal.c that was not
being used at all and most of it was duplicating code from other parts
of the driver.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tom Stellard [Mon, 22 Apr 2013 15:38:40 +0000 (08:38 -0700)]
r600g/compute: Use a constant buffer to store kernel parameters v2
v2:
- Fix usage of set_constant_buffer()
- Fix typo in comment
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Tue, 23 Apr 2013 03:06:54 +0000 (20:06 -0700)]
r600g: Add evergreen_emit_cs_constant_buffers() v2
v2:
- Bump R600_NUM_ATOMS
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Mon, 22 Apr 2013 15:34:18 +0000 (08:34 -0700)]
r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel
The state tracker should be responsible for waiting for the kernel to
finish.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tom Stellard [Mon, 22 Apr 2013 14:32:10 +0000 (07:32 -0700)]
r600g/compute: Fix input buffer size calculation
Buffer size should be in bytes not dwords.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Adam Jackson [Tue, 23 Apr 2013 18:07:33 +0000 (14:07 -0400)]
linux: Don't emit a .note.ABI-tag section anymore (#26663)
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed. Just remove it.
Note: this is a candidate for the stable branches.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Rob Clark [Thu, 25 Apr 2013 19:00:58 +0000 (15:00 -0400)]
freedreno: use writecombine buffers
Better than uncached for writes, which are common for vertex buffer
upload, etc.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Thu, 25 Apr 2013 15:17:02 +0000 (11:17 -0400)]
freedreno: don't patch and re-emit same shader as much
New textures or vertex buffers don't always require patching and
re-emitting the shaders. So do a better job of figuring out when we
actually have to patch the shader.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Eric Anholt [Mon, 15 Apr 2013 23:44:55 +0000 (16:44 -0700)]
i965: Avoid recompiles for fragment clamping on non-clamping APIs.
Removes 75/78 state-dependent recompiles in GLB2.7 (the remaining 3 are
due to FBO-rendering size predictions). We currently expose
GL_ARB_color_buffer_float on GL core, so we may mis-predict there, but I'm
about to send a patch for removing that silly extension in that case.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Alex Deucher [Thu, 25 Apr 2013 18:22:46 +0000 (14:22 -0400)]
radeonsi: add new SI pci ids
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 25 Apr 2013 18:21:15 +0000 (14:21 -0400)]
r600g: add new richland pci ids
Note: this is a candidate for the stable branches.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
José Fonseca [Thu, 25 Apr 2013 15:16:21 +0000 (16:16 +0100)]
draw: Yield zeros for LLVM fetches of non-existing vertex elements.
If a bug in an app/state-tracker causes vertex buffer to fetch vertex
elements that are not bound, simply return zeros instead of crashing.
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Thu, 25 Apr 2013 13:18:33 +0000 (14:18 +0100)]
trace: Only close trace files on exit.
Many applications don't exit cleanly, others may create and destroy a
screen multiple times, so we only write </trace> tag and close at exit
time.
José Fonseca [Thu, 25 Apr 2013 13:06:50 +0000 (14:06 +0100)]
graw: Set the vertex shader constant buffer.
We were setting the fragment shader, which wasn't needed.
José Fonseca [Wed, 24 Apr 2013 12:08:46 +0000 (13:08 +0100)]
graw: Simple utilities to dump and disassemble TGSI tokens.
Useful for core dumps, where calling tgsi_dump() from gdb is not an
alternative.
José Fonseca [Wed, 24 Apr 2013 21:02:18 +0000 (22:02 +0100)]
scons: Support clang.
clang supports most gcc options / extensions, with some exceptions.
The biggest advantage of using clang is that compilation times are much
shorter.
One can tell scons to use clang when building by invoking it as
CC=clang CXX=clang++ scons libgl-xlib
José Fonseca [Wed, 24 Apr 2013 20:58:20 +0000 (21:58 +0100)]
util/u_sse: Fix _mm_shuffle_epi8 prototype for clang.
Clang does not support __artificial__. Instead match precisely what's
in the clang headers.
José Fonseca [Wed, 24 Apr 2013 21:00:32 +0000 (22:00 +0100)]
scons: Remove redundant code.
-fvisibility=hidden is already elsewhere for the whole tree.
Chris Forbes [Mon, 22 Apr 2013 05:07:57 +0000 (17:07 +1200)]
mesa: fix bogus comment about PrimitiveRestart fields
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Chris Forbes [Tue, 23 Apr 2013 18:44:02 +0000 (06:44 +1200)]
i965: report correct sample positions
From low to high bits, the sample positions are packed y0,x0,y1,x1...
Fixes arb_texture_multisample-sample-position piglit.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
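A sketch of unpacking the layout stated in the sample position fix above
(y0, x0, y1, x1, ... from low to high bits). The 4-bit field width and the
function name are assumptions of this example. The commit message only
states the ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Extracts the raw x/y fields of one sample from a packed 32-bit word,
 * assuming 4 bits per coordinate with y stored below x. */
void unpack_sample_pos(uint32_t packed, unsigned sample,
                       unsigned *x, unsigned *y)
{
    assert(sample < 4);  /* 4 samples * 2 coordinates * 4 bits = 32 bits */

    unsigned shift = sample * 8;
    *y = (packed >> shift) & 0xf;        /* low nibble of the pair */
    *x = (packed >> (shift + 4)) & 0xf;  /* high nibble of the pair */
}
```

Reading x from the low nibble instead would swap the two coordinates of
every sample, which is the kind of mistake the ordering note guards against.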
Rob Clark [Wed, 24 Apr 2013 21:45:00 +0000 (17:45 -0400)]
freedreno: fix bogus IMM const reg index
We were assigning incorrect const register for immediates, and
potentially writing immediate const to the wrong location. This fixes
an incorrect-rendering bug with xonotic.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 24 Apr 2013 14:50:51 +0000 (10:50 -0400)]
freedreno: clear fixes and debugging
Set a few extra registers to make sure we are in proper state for
clearing. And also add some debug options to mark all state dirty in
clear and gmem operations to aid in debugging.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 24 Apr 2013 14:48:59 +0000 (10:48 -0400)]
freedreno: fix texture fetch type
There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether
there are two or three valid input components.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 24 Apr 2013 14:44:56 +0000 (10:44 -0400)]
freedreno: fix temp register usage
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases. For example, if the dst
register is the same as one of the src registers.
For now, just simplify it and always allocate a new register to use as
an intermediate. In some cases this will result in more registers used
than required. I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
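A toy illustration of the aliasing problem described in the temp register
fix above, using a plain float array as the register file. The two-step
lowering of dst = a*b + a and both function names are hypothetical. They
only show why reusing dst as the intermediate breaks when dst is also a
source.

```c
#include <assert.h>

/* Lowers regs[dst] = regs[a] * regs[b] + regs[a] using dst itself as the
 * intermediate. Wrong when dst == a: the first step clobbers regs[a]. */
void mad_via_dst(float *regs, int dst, int a, int b)
{
    regs[dst] = regs[a] * regs[b];
    regs[dst] = regs[dst] + regs[a];
}

/* Same lowering with a freshly allocated intermediate register. */
void mad_via_temp(float *regs, int dst, int a, int b, int tmp)
{
    assert(tmp != dst && tmp != a && tmp != b);
    regs[tmp] = regs[a] * regs[b];
    regs[dst] = regs[tmp] + regs[a];
}
```

With regs = {2, 3, 0, 0}, dst = a = 0 and b = 1, the temp version produces
2*3 + 2 = 8 while the dst version produces 6 + 6 = 12.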
Rob Clark [Tue, 23 Apr 2013 15:20:25 +0000 (11:20 -0400)]
freedreno: add noop driver
It is useful for debugging.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 22 Apr 2013 17:55:14 +0000 (13:55 -0400)]
freedreno: use u_math macros/helpers more
Get rid of a few self-defined macros:
ALIGN() -> align()
min() -> MIN2()
max() -> MAX2()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 22 Apr 2013 17:42:55 +0000 (13:42 -0400)]
freedreno: implement fd_screen_destroy()
Oops, didn't notice that I had left it stubbed out.
Also, make things fail a bit more gracefully when things go wrong.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 22 Apr 2013 17:21:21 +0000 (13:21 -0400)]
freedreno: set SWAP bit based on format
Really this should be set based on buffer format, not on color vs
depth/stencil. Probably there should be more formats that set the bit
as we add support for more render target formats.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tom Stellard [Tue, 23 Apr 2013 15:08:30 +0000 (08:08 -0700)]
radeon/llvm: Fix segfault with a specific libelf implementation
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
requires calling elf_version() prior to calling elf_memory()
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Alex Deucher [Wed, 24 Apr 2013 16:26:52 +0000 (12:26 -0400)]
r600g: use CP DMA for buffer clears on evergreen+
Lighter weight than using streamout. Only evergreen
and newer asics support embedded data as src with
CP DMA.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chia-I Wu [Wed, 10 Apr 2013 13:32:30 +0000 (21:32 +0800)]
i965/gen7: fix encoding of (huge) surface size for BRW_SURFACE_BUFFER
Unlike GEN6, the bits of entry count are distributed like this
width = (entry_count & 0x0000007f); /* bits [6:0] */
height = (entry_count & 0x001fff80) >> 7; /* bits [20:7] */
depth = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */
The maximum entry count is still limited to 2^27.
This was noted while going over the PRM. No test is impacted, because
1<<20 (the bit that moved) is much larger than GL_UNIFORM_BLOCK_MAX_SIZE,
GL_MAX_TEXTURE_BUFFER_SIZE, or MAX_*_UNIFORM_COMPONENTS.
v2: Explain more in the commit message (by anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
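The GEN7 bit split quoted in the surface size fix above can be wrapped in a
small self-contained helper. The masks come from the commit message. The
struct, the function names and the inverse helper are assumptions added for
illustration.

```c
#include <assert.h>
#include <stdint.h>

struct buffer_surface_dims {
    uint32_t width;   /* bits [6:0]   of the entry count */
    uint32_t height;  /* bits [20:7]  of the entry count */
    uint32_t depth;   /* bits [30:21] of the entry count */
};

/* Splits an entry count into the three GEN7 fields. */
struct buffer_surface_dims gen7_split_entry_count(uint32_t entry_count)
{
    struct buffer_surface_dims d;

    assert(entry_count <= 0x7fffffffu);  /* the masks cover 31 bits */
    d.width  =  entry_count & 0x0000007f;
    d.height = (entry_count & 0x001fff80) >> 7;
    d.depth  = (entry_count & 0x7fe00000) >> 21;
    return d;
}

/* Inverse of the split, useful for checking that no bit is lost. */
uint32_t gen7_join_entry_count(struct buffer_surface_dims d)
{
    return d.width | (d.height << 7) | (d.depth << 21);
}
```

For entry_count = 1 << 20, the bit mentioned in the commit message, the
split lands entirely in the height field (height = 8192, width = depth = 0).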
Chia-I Wu [Wed, 10 Apr 2013 13:32:13 +0000 (21:32 +0800)]
i965/gen7: fix 3DSTATE_LINE_STIPPLE_PATTERN
The inverse repeat count should take up bits 31:15 and is in U1.16. Fixes
the "Restarting lines within a single Begin/End block" subtest of piglit
linestipple, and gets the other failing subtests much closer to passing.
v2: Rewrite commit message with more detailed piglit info (by anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
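A sketch of packing the field described in the line stipple fix above: a
U1.16 fixed-point inverse repeat count placed in bits 31:15. The function
name and the truncating division are assumptions of this example. The
driver may compute or round the value differently.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a dword with 1/repeat_count in U1.16 stored in bits 31:15. */
uint32_t pack_inverse_repeat_count(unsigned repeat_count)
{
    assert(repeat_count >= 1);

    /* 1.0 in U1.16 is 1 << 16; integer division truncates. */
    uint32_t inv_u1_16 = (uint32_t)((1u << 16) / repeat_count);

    /* 17 bits of value shifted up by 15 exactly fill bits 31:15. */
    return inv_u1_16 << 15;
}
```

A repeat count of 1 sets only bit 31 (the integer bit of the U1.16 value),
and the low 15 bits of the result are always zero.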
Chia-I Wu [Wed, 10 Apr 2013 13:31:56 +0000 (21:31 +0800)]
i965: fix SURFACE_STATE dumping
Wrong fields were used when dumping width and height.
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Tue, 23 Apr 2013 04:43:34 +0000 (21:43 -0700)]
i965: Remove strange comments about math functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Tue, 23 Apr 2013 04:33:38 +0000 (21:33 -0700)]
i965: Remove traces of nonexistent TAN math function.
Never existed? At least never supported. Doesn't appear in 965, G45,
or ILK documentation.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Mon, 22 Apr 2013 21:02:00 +0000 (14:02 -0700)]
glsl: Teach basic block analysis about break/continue/discard.
Previously, the only kind of ir_jump that would terminate a basic
block was "return". However, the other possible types of ir_jump
("break", "continue", and "discard") should terminate a basic block
too. This patch modifies basic block analysis so that it terminates a
basic block on any type of ir_jump, not just ir_return.
Fixes piglit test dead-code-break-interaction.shader_test.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
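A simplified sketch of the rule introduced by the basic block fix above:
any kind of jump, not only a return, ends the current basic block. It works
on a flat instruction list and ignores ifs, loops and function calls, which
the real analysis also has to handle. The enum and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ir_kind { IR_ASSIGN, IR_RETURN, IR_BREAK, IR_CONTINUE, IR_DISCARD };

/* Every jump kind terminates a basic block. */
static bool is_jump(enum ir_kind kind)
{
    return kind != IR_ASSIGN;
}

/* Counts the basic blocks in a straight-line instruction list. */
size_t count_basic_blocks(const enum ir_kind *insts, size_t n)
{
    size_t blocks = 0;
    bool in_block = false;

    assert(insts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!in_block) {          /* first instruction of a new block */
            blocks++;
            in_block = true;
        }
        if (is_jump(insts[i]))    /* a jump closes the block it is in */
            in_block = false;
    }
    return blocks;
}
```

If is_jump() only recognised IR_RETURN, the sequence
assign, break, assign, assign, discard, assign would be treated as a single
block instead of three.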
Paul Berry [Mon, 22 Apr 2013 20:59:17 +0000 (13:59 -0700)]
glsl: Add virtual function ir_instruction::as_jump()
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tom Stellard [Wed, 24 Apr 2013 03:08:57 +0000 (20:08 -0700)]
r600g/llvm: Pass struct r600_bytecode to r600_llvm_compile
This way we don't need to update the function signature every time we
emit a new config value. This also fixes the build with
--enable-opencl.
José Fonseca [Wed, 24 Apr 2013 09:49:57 +0000 (10:49 +0100)]
winsys/sw/xlib: Prevent shared memory segment leakage.
Running piglit with this was causing all sorts of weird stuff to happen
to my desktop (Chromium webpages became blank, Qt Creator flickered,
etc). I tracked this down to shared memory segment leakage when GL is
not shutdown properly. The segments can be seen running `ipcs` and
looking for nattch==0.
This change fixes this by calling shmctl(IPC_RMID) soon after creation
(which does not remove the segment immediately, but simply marks it for
removal when no more processes are attached).
This matches src/mesa/drivers/x11/xm_buffer.c behaviour.
v2:
- move shmctl(IPC_RMID) after XShmAttach() for *BSD, per Chris Wilson
- remove stray debug printfs, spotted by Ian Romanick
NOTE: This is a candidate for stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
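A minimal sketch of the mark-for-removal pattern described in the shared
memory fix above, using only the System V calls and leaving out the X
server side (XShmAttach). The helper name is hypothetical. It follows the
Linux behaviour where a segment marked with IPC_RMID stays usable until the
last process detaches.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Allocates a private shared memory segment that cannot outlive its
 * users: it is marked for removal right after the local attach, so it
 * disappears once every attached process has detached or exited. */
void *alloc_self_cleaning_shm(size_t size)
{
    assert(size > 0);

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0)
        return NULL;

    void *addr = shmat(id, NULL, 0);

    /* Mark for removal whether or not the attach succeeded, so a
     * failed attach does not leave the segment behind either. */
    shmctl(id, IPC_RMID, NULL);

    return addr == (void *)-1 ? NULL : addr;
}
```

The caller releases the memory with shmdt(). Even if the process crashes
before that point, the kernel drops the segment when the process goes away,
so nothing with nattch==0 is left in the `ipcs` output.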
Zack Rusin [Tue, 23 Apr 2013 00:44:21 +0000 (20:44 -0400)]
draw/gs: preserve leading vertex info for gs
We need to handle the leading vertex information when
assembling primitives for the geometry shader otherwise
the resulting triangles will have vertices at incorrect
input locations.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Laurent Carlier [Wed, 24 Apr 2013 10:47:18 +0000 (12:47 +0200)]
r200: fix build regression introduced with
9a32203e1618486e87c7baf494134e05f0e38cf3
Signed-off-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Christian König [Sat, 20 Apr 2013 11:19:33 +0000 (13:19 +0200)]
radeonsi: cleanup disabling tiling for UVD v3
Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=63702
v2: add a comment that this is just a workaround
v3: fix typo in comment
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Chad Versace [Tue, 23 Apr 2013 02:17:48 +0000 (04:17 +0200)]
egl/dri2: Fix min/max swap interval of configs
The commit below exposed a bug in dri2_add_config.
commit
3998f8c6b5da1a223926249755e54d8f701f81ab
Author: Ralf Jung <post@ralfj.de>
Date: Tue Apr 9 14:09:50 2013 +0200
egl/x11: Fix initialisation of swap_interval
This little code snippet near the bottom of dri2_add_config,
if (double_buffer) {
...
conf->base.MinSwapInterval = dri2_dpy->min_swap_interval;
conf->base.MaxSwapInterval = dri2_dpy->max_swap_interval;
}
it never did what it claimed to do. The assignment never changed the value
of conf->base.MaxSwapInterval, because dri2_dpy->max_swap_interval was,
until the above exposing commit, uninitialized here. That is,
conf->base.MaxSwapInterval was 0 before and after assignment. Ditto for
the min swap interval.
Above the troublesome code snippet, the call to _eglFilterArray rejects
the config as unmatching if its swap interval bounds differ from the base
config's. Before the exposing commit, at the call to _eglFilterArray, the
swap interval bounds were always [0,0], and hence no config was rejected
due to swap interval.
After the exposing commit, _eglFilterArray incorrectly rejected some
configs, which prevented dri2_egl_config::dri_double_config from getting
set for the rejected config, which resulted in a NULL pointer getting
passed into dri2CreateNewDrawable, and then segfault.
The solution: set the swap interval bounds before _eglFilterArray.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63447
Tested-by: Lu Hua <huax.lu@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Kenneth Graunke [Tue, 23 Apr 2013 06:52:22 +0000 (23:52 -0700)]
mesa: Add unpack functions for A/I/L/LA [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Tue, 23 Apr 2013 06:37:06 +0000 (23:37 -0700)]
mesa: Add unpack functions for R/RG/RGB [U]INT8/16/32 formats.
NOTE: This is a candidate for stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Tue, 23 Apr 2013 06:13:47 +0000 (23:13 -0700)]
mesa: Add an unpack function for ARGB2101010_UINT.
v2: Remove extra parenthesis (suggested by Brian).
NOTE: This is a candidate for stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Tue, 23 Apr 2013 05:58:49 +0000 (22:58 -0700)]
mesa: Fix unpack function for ETC2_SRGB8_PUNCHTHROUGH_ALPHA1.
We accidentally set MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1 twice,
rather than setting the RGB8 and SRGB8 formats.
NOTE: This is a candidate for stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 21 Apr 2013 20:55:37 +0000 (13:55 -0700)]
mesa: Fix up some final license word wrapping issues by hand.
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 21 Apr 2013 20:52:08 +0000 (13:52 -0700)]
mesa: Restore 78-column wrapping of license text in C++-style comments.
The previous commit introduced extra words, breaking the formatting.
This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript2'
where 'vimscript2' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/^ *$/ !fmt -w 78 -p '// '
:wq
Reviewed-by: Brian Paul <brianp@vmware.com>