Michel Dänzer [Tue, 19 Mar 2013 16:57:11 +0000 (17:57 +0100)]
radeonsi: Emit pixel shader state even when only the vertex shader changed
Fixes random failures with piglit glsl-max-varyings.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Christian König <christian.koenig@amd.com>
Chad Versace [Mon, 18 Mar 2013 20:56:28 +0000 (13:56 -0700)]
android: Define PACKAGE_VERSION/BUGREPORT in CFLAGS
This fixes the Android build. Commit 439c3d4 broke it.
CC: Adrian M Negreanu <adrian.m.negreanu@intel.com>
CC: Matt Turner <mattst@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Kenneth Graunke [Mon, 11 Mar 2013 18:10:34 +0000 (11:10 -0700)]
i965/vs: Add IR dumping for immediates.
This makes dump_instructions more useful.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 19 Mar 2013 01:57:28 +0000 (18:57 -0700)]
glsl: Add built-in functions for GLSL 1.50.
This makes basic built-in functions work in GLSL 1.50. It supports
everything except the new Geometry Shader functions.
The new 150.glsl file is 140.glsl plus ARB_texture_multisample.glsl;
150.frag is identical to 140.frag except for the #version bump.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Tue, 19 Mar 2013 01:57:27 +0000 (18:57 -0700)]
glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50.
GLSL 1.50 includes support for the new sampler types introduced by
the ARB_texture_multisample extension.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Tue, 19 Mar 2013 01:57:26 +0000 (18:57 -0700)]
glsl: Bump standalone compiler versions to 1.50.
The version bumps are necessary in order to compile built-ins for 1.50.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Fri, 15 Mar 2013 21:48:24 +0000 (14:48 -0700)]
i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.
Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.
This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types. Setting it to XYZ1 means every shader used with
a RED, RG, or RGB texture has to be recompiled. This is a very common
case.
Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.
However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits. If not, the sampler already returns 1.0
for us without any special swizzling. XRGB8888, for example, is a very
common case where this occurs.
This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases. This at least helps
Warsow.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
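The decision described in the commit above can be sketched roughly as follows. This is only an illustration, not the driver's code: `enum alpha_swizzle`, `struct tex_info`, and `pick_alpha_swizzle()` are hypothetical names, and the real driver would derive both facts from its own format tables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical selectors for how the alpha channel is produced. */
enum alpha_swizzle {
   ALPHA_FROM_W,   /* XYZW: pass the sampled alpha through */
   ALPHA_ONE       /* XYZ1: override alpha with 1.0 */
};

/* Hypothetical description of a bound texture. */
struct tex_info {
   bool api_format_has_alpha;   /* false for RED, RG, and RGB */
   int  memory_alpha_bits;      /* alpha bits in the storage format */
};

static enum alpha_swizzle
pick_alpha_swizzle(const struct tex_info *tex)
{
   /* Override only when the API format has no alpha but the storage
    * format really carries alpha bits. Otherwise the sampler already
    * returns 1.0 and the default XYZW swizzle can stay in place. */
   if (!tex->api_format_has_alpha && tex->memory_alpha_bits > 0)
      return ALPHA_ONE;
   return ALPHA_FROM_W;
}
```

With this shape, an RGB texture stored as XRGB8888 keeps the XYZW swizzle that the precompile assumed, so no recompile is triggered for it.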
Kenneth Graunke [Thu, 14 Mar 2013 18:48:36 +0000 (11:48 -0700)]
i965: Don't print a fatal-looking message if intelCreateContext fails.
With the old context creation mechanism, an application asked the GL to
give it a context. Failing to produce a context was a fatal error.
Now, with GLX_ARB_create_context, the application can request a specific
version. If it's higher than the maximum version we support, context
creation will fail. But this is a normal error that applications
recover from.
In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1,
4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1
context. This led to it printing the following message 6 times:
"brwCreateContext: failed to init intel context"
There's no need to alarm users (and developers) with such a message.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
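To illustrate why these failures are routine, here is a minimal sketch of the newest-to-oldest probing described above. `try_create_context()` and `pick_context_version()` are hypothetical stand-ins, not GLX or driver entry points; the sketch only models "requests above the maximum supported version fail".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct gl_version { int major, minor; };

/* Hypothetical stand-in for context creation: succeeds only when the
 * requested version does not exceed the maximum supported one. */
static bool
try_create_context(struct gl_version want, struct gl_version max)
{
   return want.major < max.major ||
          (want.major == max.major && want.minor <= max.minor);
}

/* Walk the candidates from newest to oldest. Returns the index of the
 * first version that could be created, or -1 if none could. *failures
 * counts the rejected requests, which are expected and recoverable. */
static int
pick_context_version(const struct gl_version *candidates, size_t count,
                     struct gl_version max, int *failures)
{
   *failures = 0;
   for (size_t i = 0; i < count; i++) {
      if (try_create_context(candidates[i], max))
         return (int)i;
      (*failures)++;
   }
   return -1;
}
```

Probing 4.3, 4.2, 4.1, 4.0, 3.3, 3.2, and 3.1 against a maximum of 3.1 yields six rejected requests before the seventh succeeds, which matches the six messages mentioned in the commit.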
Eric Anholt [Mon, 18 Mar 2013 22:38:58 +0000 (15:38 -0700)]
i965/gen7: Align all depth miplevels to 8 in the X direction.
On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW
workaround: blit" (the printouts from the misaligned-depth workaround
blits) from 725 to 675.
It doesn't totally eliminate the workaround blit, because we still have
problems with Y offsets that we can't fix (since texturing can only align
miplevels up to 2 or 4, not 8).
No regressions on piglit/es3conform on IVB.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
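The alignment arithmetic behind this change can be sketched as below. The helper names `align_up()` and `x_offset_is_misaligned()` are illustrative assumptions, not functions from the driver; only the alignment of 8 comes from the commit text.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a power of two. */
static unsigned
align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Nonzero when a miplevel's X offset is not a multiple of 8, i.e. the
 * case that would otherwise need the workaround blit. */
static int
x_offset_is_misaligned(unsigned x_offset)
{
   return (x_offset & 7) != 0;
}
```

Placing every miplevel at an X offset produced by `align_up(x, 8)` removes the X-direction cause of the workaround; the Y-direction cause remains, as the commit notes.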
Christoph Bumiller [Fri, 15 Mar 2013 22:39:01 +0000 (23:39 +0100)]
nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way
The card spews an error if I use all 128 generic slots.
Apparently the real limit isn't just dictated by the address space
layout.
Christoph Bumiller [Fri, 15 Mar 2013 21:11:31 +0000 (22:11 +0100)]
gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to the introduction of the gl_varying_slot enum
Ian Romanick [Wed, 20 Mar 2013 00:44:31 +0000 (17:44 -0700)]
docs: import release notes for 9.1.1, add news item
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Kristian Høgsberg [Wed, 20 Mar 2013 00:16:57 +0000 (20:16 -0400)]
gallium-egl: Fix compile errors introduced in de315f76a
The commit changed API in a helper library shared by both egl_dri2 and
the gallium egl state tracker, but only egl_dri2 was updated to use the
new interface.
Tested-by: Giulio Camuffo <giuliocamuffo@gmail.com>
Paul Berry [Sat, 23 Feb 2013 00:40:41 +0000 (16:40 -0800)]
i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask.
Previous to this patch, when using fixed function fragment shading,
bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set
differently during precompiles and normal usage. During precompiles
it was being set only if the fragment shader reads from window
position (which it never does), so it was always being set to 0.
During normal usage it was being set if the vertex shader writes to
all 4 components of gl_Position (which it usually does), so it was
usually being set to 1. As a result, we were almost always doing an
extra recompile for the fixed function fragment shader.
The recompile was totally unnecessary, though, because
brw_wm_prog_key::proj_attrib_mask is only consulted for
fs_visitor::emit_general_interpolation(), which isn't used for
VARYING_SLOT_POS.
This patch avoids the unnecessary recompile by always setting bit
VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
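The mismatch described above can be modelled with a small sketch. `struct sketch_prog_key`, `make_key()`, and `needs_recompile()` are hypothetical; the real brw_wm_prog_key has many more fields. The only detail taken from the log is that the POS bit is bit 0 of proj_attrib_mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* POS is bit 0 of the mask, per the commit messages in this log. */
#define SKETCH_VARYING_BIT_POS (UINT64_C(1) << 0)

/* Hypothetical, heavily reduced program key. */
struct sketch_prog_key {
   uint64_t proj_attrib_mask;
};

/* A cached program is reusable only if the keys match exactly. */
static bool
needs_recompile(const struct sketch_prog_key *cached,
                const struct sketch_prog_key *current)
{
   return memcmp(cached, current, sizeof(*cached)) != 0;
}

/* Build a key. With always_set_pos the POS bit no longer depends on
 * pos_condition, which differed between precompile and draw time. */
static struct sketch_prog_key
make_key(uint64_t other_bits, bool always_set_pos, bool pos_condition)
{
   struct sketch_prog_key key;
   memset(&key, 0, sizeof(key));
   key.proj_attrib_mask = other_bits;
   if (always_set_pos || pos_condition)
      key.proj_attrib_mask |= SKETCH_VARYING_BIT_POS;
   return key;
}
```

Because the POS bit is never consulted for interpolation, forcing it to 1 on both paths makes the two keys compare equal without changing the generated code.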
Paul Berry [Fri, 22 Feb 2013 23:37:41 +0000 (15:37 -0800)]
ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup.
Previously, right after calling _mesa_glsl_link_shader(), the fixed
function fragment shader code made several calls with the ostensible
purpose of setting up uniforms for the fragment shader it just
created.
These calls are unnecessary, since _mesa_glsl_link_shader() calls
driver->LinkShader(), which takes care of calling these functions (or
their equivalent). Also, they are dangerous to call after
_mesa_glsl_link_shader() has returned, because on back-ends such as
i965 which do precompilation, _mesa_glsl_link_shader() may have
already cached pointers to the existing uniform structures; attempting
to set up the uniforms again invalidates those cached pointers.
It was only by sheer coincidence that this wasn't manifesting itself
as a bug. It turns out that i965's precompile mechanism was always
setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed
function fragment shaders, but during normal usage this bit usually
gets set to 1. As a result, the precompiled shader (with its invalid
uniform pointers) was not being used.
I'm about to introduce some changes that cause bit 0 of
proj_attrib_mask to be set consistently between precompilation and
normal usage, so to avoid regressions I need to get rid of the
dangerous duplicate uniform setup code first.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Fri, 8 Mar 2013 21:39:43 +0000 (13:39 -0800)]
i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.
Since apps typically begin rendering with a call to glClear(), it is
likely that when brw_workaround_depthstencil_alignment() moves a
miplevel to a temporary buffer, it can avoid doing a blit, since the
contents of the miplevel are about to be erased.
This patch adds the necessary plumbing to determine when
brw_workaround_depthstencil_alignment() is being called as a
consequence of glClear(), and avoids the unnecessary blit when it is
safe to do so.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Eliminate unnecessary call to _mesa_is_depthstencil_format(). Fix
handling of depth buffer in depth/stencil format.
v3: Use correct bitfields for clear_mask. Fix handling of depth
buffer in depth/stencil format when hardware uses separate stencil.
When invalidating, make sure we still reassociate the image to the new
miptree.
Reviewed-by: Eric Anholt <eric@anholt.net>
Alex Deucher [Tue, 19 Mar 2013 22:11:20 +0000 (18:11 -0400)]
r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman
Doesn't exist on the asic and will cause a CS rejection
if VM is disabled.
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 19 Mar 2013 18:25:32 +0000 (14:25 -0400)]
r600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx
Not using HiS yet, but matches what we do on evergreen+.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brian Paul [Tue, 19 Mar 2013 16:03:39 +0000 (10:03 -0600)]
winsys/svga: improve error/debug message output
Use vmw_printf() just for extra debugging info (off by default).
Use vmw_error() for real errors/failures/etc that we definitely
want to report.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 19 Mar 2013 19:49:42 +0000 (13:49 -0600)]
tgsi: fix uninitialized declaration array fields
Fixes a few regressions since the TGSI array changes.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Kristian Høgsberg [Tue, 19 Mar 2013 17:20:36 +0000 (13:20 -0400)]
egl_dri2: Lower __DRI_IMAGE version requirement back to 1
We check the extension version manually instead and verify that we have
the createImageFromFds function before enabling prime fd passing.
Maarten Lankhorst [Tue, 19 Mar 2013 19:17:57 +0000 (20:17 +0100)]
radeon/llvm: Do not link against libgallium when building statically.
NOTE: This is a candidate for the 9.1 branch.
Tested-by: Vincent Lejeune <vljn@ovi.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Matt Turner [Wed, 30 Jan 2013 01:37:02 +0000 (17:37 -0800)]
gles2: Add an ABI-check test
Checks that no functions are exported that are not part of the ABI.
Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.
Matt Turner [Tue, 12 Mar 2013 19:36:06 +0000 (12:36 -0700)]
gles1: Add an ABI-check test
Checks that no functions are exported that are not part of the ABI.
Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.
Andreas Boll [Sat, 16 Mar 2013 13:04:24 +0000 (14:04 +0100)]
gallium/egl: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1
NOTE: This is a candidate for the 9.1 branch.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Andreas Boll [Sat, 16 Mar 2013 13:00:44 +0000 (14:00 +0100)]
osmesa: fix out-of-tree build
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1
v2: Move the added line immediately after -I$(top_srcdir)/src/mapi
NOTE: This is a candidate for the 9.1 and 9.0 branches.
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Andreas Boll [Sat, 16 Mar 2013 12:50:19 +0000 (13:50 +0100)]
build: Enable x86 assembler on Hurd.
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/10-hurd-configure-tweaks.diff;h=984e17df1b8afdf8e4b36bee96aa5ab6a5691021;hb=refs/heads/ubuntu%2B1
Thanks to Pino Toscano.
v2: Don't bother with x86_64. AFAICT GNU/Hurd doesn't support it so far.
NOTE: This is a candidate for stable branches.
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Matt Turner <mattst88@gmail.com>
Andreas Boll [Sat, 16 Mar 2013 12:54:09 +0000 (13:54 +0100)]
mesa: use ieee fp on s390 and m68k
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1
Fixes Debian bug #349437.
Patch written by David Nusinow.
NOTE: This is a candidate for stable branches.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Roland Scheidegger [Sat, 16 Mar 2013 01:55:43 +0000 (02:55 +0100)]
gallivm: fix return opcode handling in main function of a shader
If we're in some conditional or loop we must not return, or the code
after the condition is never executed.
(v2): And, we also can't just continue as nothing happened, since the
mask update code would later check if we actually have a mask, so we
need to remember that there was a return in main where we didn't exit
(to illustrate this, a ret in an if clause would cause a mask update
which is still ok as we're in a conditional, but after the endif the
mask update code would drop the mask hence bringing execution back to
pixels which should have their execution mask set to zero by the ret).
Thanks to Christoph Bumiller for figuring this out.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.
Note: This is a candidate for the stable branches.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
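The masking problem can be pictured with a conceptual model. This is not gallivm's code: the 4-lane width, `struct exec_state`, and all helper names are assumptions made for illustration. The point is that a return inside a conditional must be remembered in a mask that outlives the conditional.

```c
#include <assert.h>
#include <stdint.h>

#define ALL_LANES 0x0f   /* hypothetical 4-pixel-wide execution */

/* One bit per pixel. */
struct exec_state {
   uint8_t cond_mask;   /* lanes enabled by the enclosing conditionals */
   uint8_t ret_mask;    /* lanes that have not yet executed a return */
};

/* Lanes that actually execute the current instruction. */
static uint8_t
exec_mask(const struct exec_state *s)
{
   return s->cond_mask & s->ret_mask;
}

/* Enter an if: narrow the condition mask, hand back the old one. */
static uint8_t
enter_if(struct exec_state *s, uint8_t lanes_taking_branch)
{
   uint8_t saved = s->cond_mask;
   s->cond_mask &= lanes_taking_branch;
   return saved;
}

/* Leave the if: restore the condition mask. ret_mask is untouched. */
static void
leave_if(struct exec_state *s, uint8_t saved)
{
   s->cond_mask = saved;
}

/* A return in main: disable the currently active lanes for the rest
 * of the shader instead of leaving the function. */
static void
ret_in_main(struct exec_state *s)
{
   s->ret_mask &= (uint8_t)~exec_mask(s);
}
```

If lanes 0 and 1 take the branch and hit the return, they stay disabled after the endif while lanes 2 and 3 continue, which is the behaviour the fix is after.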
Rob Clark [Tue, 5 Mar 2013 22:49:43 +0000 (17:49 -0500)]
freedreno: clear fixes
Some fixes for clearing only depth or only stencil.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Christian König [Thu, 7 Mar 2013 11:00:18 +0000 (12:00 +0100)]
radeonsi: enable indirect addressing
Fixing 16 piglit tests.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Thu, 7 Mar 2013 10:58:56 +0000 (11:58 +0100)]
radeonsi: implement indirect addressing of constants
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Thu, 28 Feb 2013 13:50:07 +0000 (14:50 +0100)]
radeonsi: switch to using resource descriptors for constants v2
v2: remove superfluous mask, use buffer_size instead of constant
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Christian König [Thu, 7 Mar 2013 10:01:07 +0000 (11:01 +0100)]
radeon/llvm: rework input fetch and output store
Cleanup the code and implement indirect addressing.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Brian Paul [Tue, 19 Mar 2013 13:55:48 +0000 (07:55 -0600)]
tgsi: add initializer data to fix MSVC compile error
Christian König [Thu, 14 Mar 2013 10:10:16 +0000 (11:10 +0100)]
tgsi: add ArrayID documentation v2
v2: further improve the text with comments from Christoph Bumiller.
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Thu, 7 Mar 2013 14:02:31 +0000 (15:02 +0100)]
tgsi: use separate structure for indirect address v2
To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.
Since most of the fields in tgsi_src_register don't apply to
an indirect addressing operand, replace it with a separate
tgsi_ind_register structure and so make room for extra information.
v2: rename Declaration to ArrayID, put the ArrayID into () instead of []
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Wed, 13 Mar 2013 13:58:15 +0000 (14:58 +0100)]
tgsi: add ArrayID to declarations
Remember which declarations are declared as "arrays" and so
can be indirectly addressed. ArrayIDs start at 1 because, for
compatibility reasons, zero is treated as no array present.
Signed-off-by: Christian König <christian.koenig@amd.com>
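The "zero means no array" convention can be sketched like this. `struct sketch_declaration`, `decl_is_array()`, and `find_array()` are hypothetical and are not the TGSI structures or API; only the ID convention comes from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced register declaration. */
struct sketch_declaration {
   unsigned first, last;   /* register range covered */
   unsigned array_id;      /* 0 = not an array; valid IDs start at 1 */
};

static bool
decl_is_array(const struct sketch_declaration *d)
{
   return d->array_id != 0;
}

/* Find the declaration that backs an indirect access tagged with
 * array_id. Returns NULL for ID 0 (no array information) or for an
 * ID that matches no declaration. */
static const struct sketch_declaration *
find_array(const struct sketch_declaration *decls, unsigned count,
           unsigned array_id)
{
   if (array_id == 0)
      return NULL;
   for (unsigned i = 0; i < count; i++) {
      if (decls[i].array_id == array_id)
         return &decls[i];
   }
   return NULL;
}
```

Reserving zero keeps zero-initialized declarations from older producers valid: they simply report that no array is present.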
Christian König [Thu, 7 Mar 2013 15:52:54 +0000 (16:52 +0100)]
tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY
Nobody seems to be using it, and only nv50 had a partial implementation.
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Sun, 10 Mar 2013 13:36:13 +0000 (14:36 +0100)]
glsl_to_tgsi: remove indirect addressing limitations
They shouldn't be necessary any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Sun, 10 Mar 2013 13:33:29 +0000 (14:33 +0100)]
glsl_to_tgsi: allocate arrays separately v2
Instead of allocating everything as temporaries, use the
new array allocation functions.
v2: fix bug in simplify_cmp, declare arrays on demand
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Fri, 8 Mar 2013 12:17:05 +0000 (13:17 +0100)]
glsl_to_tgsi: use get_temp for all allocations
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Sun, 10 Mar 2013 12:44:25 +0000 (13:44 +0100)]
tgsi/ureg: implement support for array temporaries
Don't bother with free temporaries, just allocate them at
the end and also emit them in their own declaration.
Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Fri, 8 Mar 2013 16:55:46 +0000 (17:55 +0100)]
tgsi/ureg: cleanup local temporary emission v2
Instead of emitting each temporary separately, emit them in a chunk.
v2: keep separate function for emitting temps
Signed-off-by: Christian König <christian.koenig@amd.com>
Andreas Boll [Tue, 19 Mar 2013 10:55:41 +0000 (11:55 +0100)]
radeon/llvm: Link against libgallium.la to fix an undefined symbol
Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1
Fixes a regression introduced with f70c3853513637fa6ed38e75f73d472a9fa61213
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Kristian Høgsberg [Sat, 2 Feb 2013 17:26:12 +0000 (12:26 -0500)]
wayland: Add prime fd passing as a buffer sharing mechanism
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Kristian Høgsberg [Sat, 2 Feb 2013 13:38:07 +0000 (08:38 -0500)]
Add dri image entry point for creating image from fd
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Kristian Høgsberg [Sat, 2 Feb 2013 12:40:51 +0000 (07:40 -0500)]
wayland: allocate a __DRIimage for the color buffer
No functional change here, but this will let us query the image
for an fd handle later.
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Rob Clark [Tue, 12 Mar 2013 23:31:58 +0000 (19:31 -0400)]
DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap
If ddx does not support swap, don't advertise it. This is a hack to
work around current xservers which advertise this extension even when it
is clearly not supported. When:
http://lists.x.org/archives/xorg-devel/2013-February/035449.html
is merged in upstream xserver and makes its way into most distros then
this hack can be removed. In the meantime, it is required to allow
gnome-shell/clutter/etc to work properly with a DDX driver which does
not support ScheduleSwap.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Paul Berry [Sat, 16 Mar 2013 17:32:21 +0000 (10:32 -0700)]
i965/blorp: Add INTEL_DEBUG=blorp flag.
This debug flag prints out the native GEN assembly for a blitting
shader produced using BLORP. Hopefully this should be useful in
developing additional BLORP features.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Alex Deucher [Fri, 15 Mar 2013 19:11:01 +0000 (15:11 -0400)]
r600g: properly set non_disp tiling mode for DMA (v2)
Needs to be set for depth, stencil, and fmask just
like other blocks.
v2: drop additional cayman bits for now
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 15 Mar 2013 18:29:24 +0000 (14:29 -0400)]
r600g: Use blitter rather than DMA for 128bpp on cayman (v3)
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces. When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.
v2: cayman/TN only
v3: fix comments
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Berry [Wed, 13 Mar 2013 20:48:13 +0000 (13:48 -0700)]
i965: Simplify separate stencil check
The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.
Reviewed-by: Eric Anholt <eric@anholt.net>
Maarten Lankhorst [Thu, 21 Feb 2013 17:07:52 +0000 (18:07 +0100)]
gallium/build: Fix visibility CFLAGS in automake
v2: Andreas Boll <andreas.boll.dev@gmail.com>
- Fix formatting - use one CFLAG per line
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
José Fonseca [Fri, 15 Mar 2013 15:23:54 +0000 (15:23 +0000)]
scons: Warn when using MSVS versions prior to 2012.
Reviewed-by: Brian Paul <brianp@vmware.com>
Paul Berry [Fri, 8 Mar 2013 20:03:10 +0000 (12:03 -0800)]
i965: Apply depthstencil alignment workaround when doing fast clears.
Fast depth clears have the same depth/stencil alignment requirements
as other drawing operations. Therefore, we need to call
brw_workaround_depthstencil_alignment() from both the clear and
drawing paths.
Without this fix, we get image corruption if the following conditions
hold: (a) the first ever drawing operation to a depth miplevel (or the
first drawing operation after having used the texture for sampling) is
a clear, (b) the depth miplevel has a size that is eligible for fast
depth clears, and (c) the depth miplevel has an offset within the
miptree that isn't 8x8 aligned.
Fixes piglit "depthstencil-render-miplevels" tests with size 273.
NOTE: This is a candidate for stable branches
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Paul Berry [Sat, 23 Feb 2013 17:00:58 +0000 (09:00 -0800)]
Replace gl_frag_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 16:36:40 +0000 (08:36 -0800)]
Get rid of _mesa_frag_attrib_to_vert_result().
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 16:28:18 +0000 (08:28 -0800)]
Get rid of _mesa_vert_result_to_frag_attrib().
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function. But we still need to be able to detect when a given vertex
output has no corresponding fragment input. So it is replaced by a
new function, _mesa_varying_slot_in_fs(), which tells whether the
given varying slot exists as an FS input or not.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 16:09:27 +0000 (08:09 -0800)]
mtypes.h: Modify gl_frag_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_frag_attrib enum entirely.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 15:49:04 +0000 (07:49 -0800)]
Replace gl_geom_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:
gl_geom_result -> gl_varying_slot
GEOM_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 15:45:07 +0000 (07:45 -0800)]
mtypes.h: Modify gl_geom_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_result enum entirely.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 15:34:06 +0000 (07:34 -0800)]
Replace gl_geom_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:
gl_geom_attrib -> gl_varying_slot
GEOM_ATTRIB_* -> VARYING_SLOT_*
GEOM_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 15:31:33 +0000 (07:31 -0800)]
mtypes.h: Modify gl_geom_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_attrib enum entirely.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sat, 23 Feb 2013 15:22:01 +0000 (07:22 -0800)]
Replace gl_vert_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Fri, 22 Feb 2013 19:49:44 +0000 (11:49 -0800)]
mtypes.h: Modify gl_vert_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_vert_result enum entirely.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Fri, 22 Feb 2013 19:32:54 +0000 (11:32 -0800)]
mtypes.h: Add new gl_varying_slot enum, and bitfield defines.
Future patches will make use of the enum. It will eventually take the
place of the existing enums gl_vert_result, gl_geom_attrib,
gl_geom_result, and gl_frag_attrib, all of which represent essentially
the same information but using inconsistent values.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
Paul Berry [Sun, 24 Feb 2013 18:53:35 +0000 (10:53 -0800)]
i965: Change fragment input related bitfields to 64-bit.
This patch updates the bitfields brw_context::wm.input_size_masks,
tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of
which are indexed by gl_frag_attrib, from 32-bit to 64-bit.
This paves the way for supporting geometry shaders, and for merging
the gl_frag_attrib and gl_vert_result enums. The combination of these
two will require at least 55 bits in the bitfields.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
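A short sketch of why 32 bits stop being enough. The macro and helper names below are illustrative assumptions, not the driver's definitions; the figure of 55 bits is from the commit above, so the highest slot index used here is 54.

```c
#include <assert.h>
#include <stdint.h>

/* Build the bit with a 64-bit constant. A plain (1 << slot) would be
 * evaluated as int and is undefined for slot >= 32. */
#define SKETCH_BIT64(slot) (UINT64_C(1) << (slot))

static uint64_t
mark_slot(uint64_t mask, unsigned slot)
{
   return mask | SKETCH_BIT64(slot);
}

static int
slot_is_set(uint64_t mask, unsigned slot)
{
   return (mask & SKETCH_BIT64(slot)) != 0;
}

/* Nonzero when the mask could still be stored in a 32-bit field. */
static int
fits_in_32_bits(uint64_t mask)
{
   return (mask >> 32) == 0;
}
```

Any mask that marks slot 54 fails `fits_in_32_bits()`, which is why the fields indexed by the merged enum had to be widened first.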
Alex Deucher [Fri, 8 Mar 2013 18:52:37 +0000 (13:52 -0500)]
r600g: add Richland APU pci ids
Note: this is a candidate for the stable branches.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brian Paul [Thu, 14 Mar 2013 02:24:05 +0000 (20:24 -0600)]
st/dri: add support for the always_have_depth_buffer option
This involved adding another driOptionCache to dri_screen. The
existing one just held the default values. But now we also need
to have the values from the DRI config file so that we can get at
the always_have_depth_buffer config option, which is per-screen.
Brian Paul [Thu, 14 Mar 2013 02:19:44 +0000 (20:19 -0600)]
driconf: add a miscellaneous section and always_have_depth_buffer option
This option is needed for some applications that neglect to request
a depth buffer when choosing a visual/fbconfig.
The Linux app Topogun is an example of this problem.
Brian Paul [Wed, 13 Mar 2013 18:08:48 +0000 (12:08 -0600)]
driconf: reorder options, reformat comments, etc
Move the options into the proper section (Debug, Quality, Performance,
etc).
Update comments and add some whitespace to improve readability.
Philipp Brüschweiler [Fri, 8 Mar 2013 20:32:36 +0000 (21:32 +0100)]
wayland: fix segfault when using software rendering
wayland_roundtrip() was given an incorrect parameter.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62362
Note: This is a candidate for the stable branches.
Signed-off-by: Brian Paul <brianp@vmware.com>
Brian Paul [Thu, 14 Mar 2013 13:45:59 +0000 (07:45 -0600)]
softpipe: fix up NUM_ENTRIES confusion
There were two different NUM_ENTRIES #defines for the framebuffer
tile cache and the texture tile cache. Rename the latter to fix
the warnings:
In file included from sp_flush.c:40:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
In file included from sp_context.c:50:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
Also, replace occurrences of NUM_ENTRIES with Element() macro to
be safer.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
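The safety argument for deriving the count from the array itself can be sketched as follows. All names and both sizes (50 and 16) are made up for illustration and are not softpipe's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define SKETCH_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* Two caches with distinctly named, hypothetical sizes. */
#define SKETCH_TILE_CACHE_ENTRIES     50
#define SKETCH_TEX_TILE_CACHE_ENTRIES 16

struct sketch_tile_cache     { int entries[SKETCH_TILE_CACHE_ENTRIES]; };
struct sketch_tex_tile_cache { int entries[SKETCH_TEX_TILE_CACHE_ENTRIES]; };

/* Clear every entry and return how many were cleared. The loop bound
 * follows the array declaration, so it cannot pick up the other
 * cache's constant by mistake. */
static size_t
clear_tex_cache(struct sketch_tex_tile_cache *tc)
{
   size_t i;
   for (i = 0; i < SKETCH_ELEMENTS(tc->entries); i++)
      tc->entries[i] = 0;
   return i;
}
```

With one shared macro name, whichever header was included last decided the loop bound; tying the bound to the array avoids that dependency on include order.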
Brian Paul [Wed, 13 Mar 2013 14:43:04 +0000 (08:43 -0600)]
st/osmesa: silence some optimized build warnings
Brian Paul [Wed, 13 Mar 2013 14:35:39 +0000 (08:35 -0600)]
draw: init pre_clip_pos = NULL to fix optimized build warning
Brian Paul [Wed, 13 Mar 2013 14:35:21 +0000 (08:35 -0600)]
glx: init screen = 0 to fix optimized build warning
Kenneth Graunke [Thu, 7 Feb 2013 07:26:36 +0000 (23:26 -0800)]
i965: Make INTEL_DEBUG=shader_time use the RAW surface format.
Untyped Atomic Operation messages are illegal for non-RAW formats. The
IVB hardware proceeds happily (after all, who cares what the format of the
surface is if you're doing untyped ops on it?), but later hardware
apparently doesn't. The simulator for gen7 does complain, though.
v2: Rebase against updates to previous patches. (by anholt)
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 7 Feb 2013 07:26:35 +0000 (23:26 -0800)]
i965: Specialize SURFACE_STATE creation for shader time.
This is basically a copy and paste of gen7_create_constant_surface, but
with the parameters filled in to offer a simpler interface.
It will diverge shortly.
I didn't bother adding it to the vtable for now since shader time is only
exposed on Gen7+.
v2: Replace tabs in the new code (by anholt)
Add back dropped memset() and add a comment about HSW channel selects.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 7 Feb 2013 07:26:34 +0000 (23:26 -0800)]
i965: Fix INTEL_DEBUG=shader_time for Haswell.
Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.
Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.
v2: Use the #defines from the previous commit. (by anholt)
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Eric Anholt [Mon, 11 Mar 2013 21:56:38 +0000 (14:56 -0700)]
i965: Add definitions for gen7+ data cache messages.
We were sparsely using some of these message types, but I'll just fill
them all in now. It will be used for fixing shader_time on HSW.
v2: Add missing MEDIA_BLOCK_READ.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 11 Mar 2013 19:59:06 +0000 (12:59 -0700)]
i965: Split shader_time entries into separate cachelines.
This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).
Improves performance of a Minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).
v2: Add a define for the stride with a comment explaining its units and
why.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
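The cacheline split described in the shader_time change above can be illustrated with a minimal sketch. It assumes a 64-byte cacheline and a stride expressed in bytes; the names CACHELINE_BYTES, ENTRY_STRIDE_BYTES, entry_offset and same_cacheline are hypothetical, and the actual define and its units in the i965 driver are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines. */
#define CACHELINE_BYTES    64u
/* Hypothetical stride, in bytes: one full cacheline per entry. */
#define ENTRY_STRIDE_BYTES CACHELINE_BYTES

/* Byte offset of entry 'index' in the buffer. */
static size_t entry_offset(unsigned index)
{
    assert(index <= SIZE_MAX / ENTRY_STRIDE_BYTES);
    return (size_t)index * ENTRY_STRIDE_BYTES;
}

/* True if two byte offsets land in the same cacheline. */
static int same_cacheline(size_t a, size_t b)
{
    return a / CACHELINE_BYTES == b / CACHELINE_BYTES;
}
```

With tightly packed 8-byte entries, offsets 0 and 8 would share a cacheline; with the padded stride, consecutive entries never do, which is the property the commit relies on to reduce snooping between EUs.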
José Fonseca [Thu, 14 Mar 2013 17:40:14 +0000 (17:40 +0000)]
scons: Define _ALLOW_KEYWORD_MACROS on MSVC builds.
scons/llvm.py defines inline globally to work around issues with LLVM C
binding headers, so the only way to avoid aggravating xkeycheck.h errors
is to set _ALLOW_KEYWORD_MACROS.
This fixes the MSVC 2012 build with LLVM.
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Thu, 14 Mar 2013 11:44:21 +0000 (11:44 +0000)]
softpipe: Shrink context size.
- each softpipe_tex_tile_cache is 50*64*64*4*4 = 3,276,800 bytes
- each softpipe_context has 3*32 softpipe_tex_tile_cache, i.e., each softpipe
context is 314,572,800 bytes, i.e., 300MB
That is, in a 32-bit process (around 3GB virtual memory max), we can
only fit 10 contexts.
This change is a short-term hack to shrink the context size. Longer
term we'll need to change how the texture cache works.
Reviewed-by: Brian Paul <brianp@vmware.com>
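The arithmetic in the softpipe context size change above can be checked with a few lines of C. The function names are illustrative only; the factors are copied from the commit message, and the 3GB figure is taken as 3 * 1024^3 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one texture tile cache; factors copied from the commit message. */
static uint64_t tex_tile_cache_bytes(void)
{
    return (uint64_t)50 * 64 * 64 * 4 * 4;
}

/* A context holds 3*32 such caches; other members are ignored here. */
static uint64_t context_bytes(void)
{
    return (uint64_t)3 * 32 * tex_tile_cache_bytes();
}

/* How many whole contexts fit into the given amount of address space. */
static uint64_t contexts_that_fit(uint64_t address_space_bytes)
{
    assert(context_bytes() > 0);
    return address_space_bytes / context_bytes();
}
```

Calling contexts_that_fit() with 3 * 1024^3 bytes reproduces the limit of 10 contexts quoted above.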
Christian König [Thu, 14 Mar 2013 11:37:02 +0000 (12:37 +0100)]
radeon/llvm: fix LLVM dependencies
Since commit
1c4f283151b191c51cbd76d7f304cc1fe7be3019 we obviously depend on this.
Signed-off-by: Christian König <christian.koenig@amd.com>
Anuj Phogat [Thu, 7 Mar 2013 22:05:38 +0000 (14:05 -0800)]
mesa: Fix FB blitting in case of zero size src or dst rect
Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.
V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.
Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495
Note: Candidate for all the stable branches.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
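The ordering described in the FB blitting change above (error checking first, then skipping zero-size rectangles) might look roughly like this. It is a sketch only: struct rect, rect_is_empty, blit_sketch and the args_valid flag are hypothetical stand-ins and do not reproduce the code in _mesa_BlitFramebuffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle given by two corners, as in glBlitFramebuffer. */
struct rect { int x0, y0, x1, y1; };

enum blit_result { BLIT_ERROR, BLIT_SKIPPED, BLIT_DONE };

/* Width or height is zero when the two corners share a coordinate. */
static bool rect_is_empty(const struct rect *r)
{
    return r->x0 == r->x1 || r->y0 == r->y1;
}

/* 'args_valid' stands in for the real error checking, which must run
 * before the zero-size test so invalid calls still raise their errors. */
static enum blit_result blit_sketch(bool args_valid,
                                    const struct rect *src,
                                    const struct rect *dst)
{
    assert(src != NULL && dst != NULL);

    if (!args_valid)
        return BLIT_ERROR;
    if (rect_is_empty(src) || rect_is_empty(dst))
        return BLIT_SKIPPED;   /* nothing to copy */
    return BLIT_DONE;          /* the actual blit would happen here */
}
```

Placing the emptiness test after validation keeps error reporting unchanged for invalid calls, which matches the V2 note in the commit message.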
Roland Scheidegger [Wed, 13 Mar 2013 21:10:18 +0000 (22:10 +0100)]
tgsi: fix sample_d emit for arrays
Those cases were apparently forgotten.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 13 Mar 2013 20:23:18 +0000 (21:23 +0100)]
llvmpipe: don't assert when trying to render to surfaces with multiple layers
Instead, just warn when creating the surface; rendering will simply happen
to the first layer.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 13 Mar 2013 20:19:20 +0000 (21:19 +0100)]
softpipe: don't assert when creating surfaces with multiple layers
We can't handle them yet, however we can safely just warn (we will
just render to the first layer, which is fine since we can't handle the
rendertarget system value either).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface;
instead, create the surface but assert when trying to set a buffer
in the framebuffer).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
José Fonseca [Wed, 13 Mar 2013 21:21:17 +0000 (21:21 +0000)]
llvmpipe: Fix geometry shader token leak.
Trivial. Matches softpipe's code.
Tom Stellard [Thu, 7 Mar 2013 21:51:14 +0000 (16:51 -0500)]
radeon/llvm: Add missing license headers
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Tom Stellard [Thu, 7 Mar 2013 21:51:13 +0000 (16:51 -0500)]
radeon/llvm: Make radeon_llvm_util.cpp a C file
All the functions in this file are now implemented in C.
Tom Stellard [Thu, 7 Mar 2013 21:51:12 +0000 (16:51 -0500)]
radeon/llvm: Optimize radeon_llvm_strip_unused_kernels()
Just delete unused kernels rather than marking them as internal and
running the GlobalDCE pass.
Also implement this function in C and inline it into
radeon_llvm_get_kernel_module()
Tom Stellard [Thu, 7 Mar 2013 21:51:11 +0000 (16:51 -0500)]
radeon/llvm: Implement radeon_llvm_get_kernel_module() using the C API
Tom Stellard [Thu, 7 Mar 2013 21:51:10 +0000 (16:51 -0500)]
radeon/llvm: Implement radeon_llvm_get_num_kernels() using the C API
Tom Stellard [Thu, 7 Mar 2013 21:51:09 +0000 (16:51 -0500)]
radeon/llvm: Implement radeon_llvm_parse_bitcode() using the C API
Also make the function static since it is not used anywhere else.
Tom Stellard [Thu, 7 Mar 2013 21:51:08 +0000 (16:51 -0500)]
r600g/llvm: Move llvm wrapper functions into the radeon directory
Jon TURNEY [Wed, 27 Feb 2013 15:32:37 +0000 (15:32 +0000)]
Properly check GLX_INDIRECT_RENDERING in glapi/tests/check_table
Actually use $DEFINES, so we can see if GLX_INDIRECT_RENDERING is defined
If GLX_INDIRECT_RENDERING is defined, _GLAPI_SKIP_PROTO_ENTRY_POINTS will
be defined, and libglapi won't contain the 'protocol entry points', so we
should provide stubs in check_table.cpp
Jon TURNEY [Wed, 27 Feb 2013 12:58:17 +0000 (12:58 +0000)]
Fix glapi/tests/check_table.cpp for standardized OpenGL function names
It looks like this has been broken since commit
1a1db1746db82efc7f0643508886dfc78a15eb71 "Standardize names of OpenGL
functions."
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Jon TURNEY [Tue, 26 Feb 2013 16:02:13 +0000 (16:02 +0000)]
Fix out-of-tree build of 'make check' in src/mapi/glapi/tests/
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>