Matt Turner [Thu, 22 May 2014 23:30:07 +0000 (16:30 -0700)]
i965: Skip IR annotations with INTEL_DEBUG=noann.
Running shader-db with INTEL_DEBUG=noann reduces the runtime
from ~90 to ~80 seconds on my machine. It also reduces the disk space
consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on
disk).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 7 Apr 2014 17:25:50 +0000 (10:25 -0700)]
i965/fs: Debug the optimization passes by dumping instr to file.
With INTEL_DEBUG=optimizer, write the output of dump_instructions() to a
file each time an optimization pass makes progress. This lets you easily
diff successive files to see what an optimization pass did.
Example filenames written when running glxgears:
fs8-0000-00-start
fs8-0000-01-04-opt_copy_propagate
fs8-0000-01-06-dead_code_eliminate
fs8-0000-01-12-compute_to_mrf
fs8-0000-02-06-dead_code_eliminate
    |    |  |  |
    |    |  |  `-- optimization pass name
    |    |  |
    |    |  `-- optimization pass number in the loop
    |    |
    |    `-- optimization loop iteration
    |
    `-- shader program number
Note that with INTEL_DEBUG=optimizer, we disable compact_virtual_grfs,
so that we can diff instruction lists across loop iterations without
the register numbers being changed.
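The naming scheme above can be sketched as a small helper. This is an illustration only: make_dump_filename and its parameters are hypothetical names, not the driver's code, and the field widths are inferred from the example filenames.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds "fs<width>-<program>-<iteration>-<pass>-<name>".
// The zero-padded widths (4, 2, 2) are inferred from the example filenames.
static std::string make_dump_filename(int dispatch_width, unsigned program,
                                      int iteration, int pass_num,
                                      const char *pass_name)
{
    assert(pass_name != nullptr);
    char buf[128];
    std::snprintf(buf, sizeof(buf), "fs%d-%04u-%02d-%02d-%s",
                  dispatch_width, program, iteration, pass_num, pass_name);
    return std::string(buf);
}
```

Because every field is zero-padded, the files sort in the order the passes ran, which is what makes diffing successive files convenient. The "-start" file has no pass number, so it would need its own format string.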
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 29 May 2014 20:08:59 +0000 (13:08 -0700)]
i965: Give dump_instructions() a filename argument.
This will allow debugging code to dump the IR after an optimization pass
makes progress (the next patch). Only let it open and write to a file if
the effective user isn't root.
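A minimal sketch of the root check, assuming a POSIX system. The helper names open_dump_target and close_dump_target are hypothetical, and falling back to stderr is an assumption about the intended behaviour, not something this message states.

```cpp
#include <cassert>
#include <cstdio>
#include <unistd.h>   // geteuid() (POSIX)

// Hypothetical helper: returns the stream a dump should go to.
// Falls back to stderr when no name is given, when the effective user
// is root, or when the file cannot be opened.
static FILE *open_dump_target(const char *name)
{
    if (name != nullptr && geteuid() != 0) {
        FILE *file = std::fopen(name, "w");
        if (file != nullptr)
            return file;
    }
    return stderr;
}

// Closes the stream only if it is a file opened by open_dump_target().
static void close_dump_target(FILE *file)
{
    assert(file != nullptr);
    if (file != stderr)
        std::fclose(file);
}
```

Refusing to create files as root avoids a debug option silently overwriting files with elevated privileges.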
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Thu, 29 May 2014 18:45:15 +0000 (11:45 -0700)]
i965: Give dump_instruction() a FILE* argument.
Use function overloading rather than default arguments, since gdb
doesn't know about default arguments.
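The overloading approach can be sketched as follows. The instruction struct and the printed text are toy stand-ins, not the driver's types.

```cpp
#include <cassert>
#include <cstdio>

struct instruction { int opcode; };   // toy stand-in for an IR instruction

// Full version: the caller chooses the output stream.
static int dump_instruction(const instruction &inst, FILE *file)
{
    assert(file != nullptr);
    return std::fprintf(file, "opcode %d\n", inst.opcode);
}

// One-argument overload: a real, separately callable function that
// forwards to the full version with stderr.
static int dump_instruction(const instruction &inst)
{
    return dump_instruction(inst, stderr);
}
```

With a default argument there is only one function, and the debugger has to be given every argument explicitly. With the overload, the one-argument form exists as its own symbol and can be called directly from a debugging session.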
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Sat, 12 Apr 2014 04:10:53 +0000 (21:10 -0700)]
i965: Add envvar to debug the optimization passes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Roland Scheidegger [Thu, 29 May 2014 22:53:36 +0000 (00:53 +0200)]
llvmpipe: (trivial) drop "unswizzled" from some function names
This made sense when swizzled storage layout was used for rendering to tiles.
But nowadays the name just adds confusion (and makes for long lines).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Thu, 29 May 2014 22:37:17 +0000 (00:37 +0200)]
llvmpipe: fix crash when not all attachments are populated in a fb
Framebuffers have been able to have NULL attachments for a while. llvmpipe handled
that properly for lp_rast_shade_quads_mask but it seems the change didn't
make it to lp_rast_shade_tile.
This fixes piglit fbo-drawbuffers-none test (though I need to increase
the FB_SIZE from 32 to 256 so the tris cover some tiles fully).
https://bugs.freedesktop.org/show_bug.cgi?id=79421
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 28 May 2014 23:22:19 +0000 (01:22 +0200)]
softpipe: honor the render_condition_enable bit in blits.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 28 May 2014 23:22:11 +0000 (01:22 +0200)]
llvmpipe: honor the render_condition_enable bit in blits.
This fixes piglit nv_conditional_render-blitframebuffer.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 28 May 2014 23:21:20 +0000 (01:21 +0200)]
gallium/docs: improve documentation of render condition wrt blits.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 29 May 2014 19:56:48 +0000 (13:56 -0600)]
svga: use svga_shader_too_large() in compile_vs()
And rework the dummy shader code to match the fragment shader case.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 29 May 2014 19:56:22 +0000 (13:56 -0600)]
svga: use svga_shader_too_large() in compile_fs()
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 29 May 2014 19:55:46 +0000 (13:55 -0600)]
svga: added svga_shader_too_large() helper
To check whether a shader's bytecode exceeds the device limit. There's no
limit when using GBS.
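A sketch of what such a helper might look like. The device_info struct, its field names, and the comparison are assumptions for illustration; the real helper's signature and the way the limit is computed are not given here.

```cpp
#include <cstddef>

// Toy device description; both fields are hypothetical.
struct device_info {
    bool has_gbs;                    // device uses GBS, so no size limit
    std::size_t max_shader_bytes;    // limit that applies without GBS
};

// Returns true when the bytecode cannot be submitted to the device.
static bool shader_too_large(const device_info &dev, std::size_t bytecode_bytes)
{
    if (dev.has_gbs)
        return false;                // no limit applies
    return bytecode_bytes > dev.max_shader_bytes;
}
```

Putting the check in one helper lets the vertex and fragment shader compile paths share it.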
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Jeremy Huddleston Sequoia [Sat, 31 May 2014 10:44:51 +0000 (03:44 -0700)]
darwin: Remove extra kCGLPFAColorSize attribute when requesting an offscreen context
https://xquartz.macosforge.org/trac/ticket/650
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Vinson Lee [Sat, 31 May 2014 02:40:26 +0000 (19:40 -0700)]
util: Do not use __builtin_clrsb with Intel C++ Compiler.
This patch fixes this build error with icc 14.0.2.
In file included from state_tracker/st_glsl_to_tgsi.cpp(63):
../../src/gallium/auxiliary/util/u_math.h(583): error: identifier "__builtin_clrsb" is undefined
return 31 - __builtin_clrsb(i);
^
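A sketch of the kind of guard this implies, assuming 32-bit int. The function name last_bit_signed and the fallback loop are illustrative, not the code in u_math.h. The Intel compiler also defines __GNUC__, which is why it has to be excluded explicitly.

```cpp
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

// Number of bits needed for the magnitude of a signed value, i.e. the
// position (counting from 1) of the highest bit that differs from the sign bit.
static unsigned last_bit_signed(int i)
{
#if defined(__GNUC__) && !defined(__INTEL_COMPILER) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
    // GCC 4.7 and later provide the builtin.
    return 31 - __builtin_clrsb(i);
#else
    // Portable fallback: fold negative values onto non-negative ones,
    // then count up to the highest set bit.
    unsigned v = (i >= 0) ? (unsigned)i : ~(unsigned)i;
    unsigned bits = 0;
    while (v != 0) {
        ++bits;
        v >>= 1;
    }
    return bits;
#endif
}
```

Both branches give the same result, so callers do not need to know which one was compiled in.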
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Lubomir Rintel [Wed, 28 May 2014 06:56:12 +0000 (08:56 +0200)]
i915: add a missing NULL pointer check
mesaVisual can be NULL with configless context since this commit:
commit 551d459af421a2eb937e9e16301bb64da4624f89
Author: Neil Roberts <neil@linux.intel.com>
Date: Fri Mar 7 18:05:47 2014 +0000
Add the EGL_MESA_configless_context extension
...
Previously the i965 and i915 drivers were explicitly creating a zeroed visual
whenever 0 is passed for the EGLConfig.
We attempt to dereference the visual in i915, and now that we don't create a
zeroed-out one it crashes, breaking at least weston on i915. There's
no point in doing so as it would be zero anyway.
v2: Fixed a typo in commit message. Added some tags.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1100967
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ian Romanick [Fri, 30 May 2014 20:55:28 +0000 (13:55 -0700)]
glapi: Duplicate GLES1 prototypes in glapi_dispatch.c
These prototypes are necessary because GLES1 library builds will create
dispatch functions for them. We can't directly include GLES/gl.h
because it would conflict with the previously-included GL/gl.h. Since the GLES1
ABI is not expected to ever add more functions, the path of least
resistance is to just duplicate the prototypes for the functions that
aren't already in desktop OpenGL.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79294
Acked-by: Matt Turner <mattst88@gmail.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Matt Turner [Thu, 29 May 2014 23:47:39 +0000 (16:47 -0700)]
i965/vec4: Allow writemasking on math instructions on Gen7+.
The math instruction was Align1-only on Gen6 and we never updated this
to let it use Align16 features like writemasking on newer platforms.
total instructions in shared programs: 1686120 -> 1685507 (-0.04%)
instructions in affected programs: 48593 -> 47980 (-1.26%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Pavel Popov [Fri, 30 May 2014 03:50:34 +0000 (10:50 +0700)]
i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
Brian Paul [Wed, 28 May 2014 16:01:30 +0000 (10:01 -0600)]
st/wgl: use _debug_printf() instead of fprintf()
This should print output both for debug and release builds.
Suggested by Jose.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 21 May 2014 17:41:59 +0000 (11:41 -0600)]
st/wgl: formatting fixes in stw_framebuffer.c
And remove some unneeded #includes and INLINE qualifiers.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 21 May 2014 17:32:30 +0000 (11:32 -0600)]
st/wgl: make stw_lookup_context_locked() an inline function
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 20 May 2014 20:56:41 +0000 (14:56 -0600)]
st/wgl: fix implementation of wglCreateContextAttribsARB()
wglCreateContextAttribsARB() didn't work previously since it returned
a context ID that wasn't allocated by OPENGL32.DLL. So if that context
ID was later passed to wglMakeCurrent(), etc. it was rejected.
Now when wglCreateContextAttribsARB() is called we actually call
wglCreateContext() in order to get a valid context ID. Then we
replace the context data which was created with new context data
which reflects the arguments passed to wglCreateContextAttribsARB().
If there were a DrvCreateContextAttribs() function in the ICD this
work-around wouldn't be necessary.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Conflicts:
src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c
src/gallium/state_trackers/wgl/stw_getprocaddress.c
Brian Paul [Fri, 21 Mar 2014 17:06:41 +0000 (11:06 -0600)]
st/wgl: add debug code to check that pixel format initialization worked
If the assertion fails, it means something is really broken. Before,
if this happened we reverted to the GDI renderer without any warning.
Reviewed-by: Matthew McClure <mcclurem@vmware.com>
Brian Paul [Mon, 19 May 2014 15:26:04 +0000 (09:26 -0600)]
st/wgl: change PFD_SWAP_COPY to PFD_SWAP_EXCHANGE.
To reflect our actual SwapBuffers implementation. See
stw_st_swap_framebuffer_locked(). This fixes various rendering issues
with SolidEdge.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
José Fonseca [Thu, 29 May 2014 19:02:31 +0000 (20:02 +0100)]
docs: Document how to replace Windows built-in OpenGL software rasterizer with llvmpipe.
Just happened to stumble across this registry key while debugging
something else.
This technique is much neater than trying to override opengl32.dll.
Also a few minor cleanups.
Tapani Pälli [Fri, 30 May 2014 07:10:09 +0000 (10:10 +0300)]
scons: add common.c as part of glcpp build
to have _mesa_error_no_memory function available
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79440
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Juha-Pekka Heikkila [Thu, 13 Feb 2014 14:04:23 +0000 (16:04 +0200)]
mesa: Add missing null checks into prog_hash_table.c
Check calloc return values in hash_table_insert() and
hash_table_replace()
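The pattern being added is roughly the following. The table and node types here are toy stand-ins, and reporting the failure via fprintf is an assumption made to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

struct node { const void *key; void *data; node *next; };
struct table { node *head; };

// Returns false, leaving the table untouched, if the allocation fails.
static bool table_insert(table *t, const void *key, void *data)
{
    assert(t != nullptr);
    node *n = static_cast<node *>(std::calloc(1, sizeof(*n)));
    if (n == nullptr) {
        std::fprintf(stderr, "out of memory in table_insert\n");
        return false;
    }
    n->key = key;
    n->data = data;
    n->next = t->head;
    t->head = n;
    return true;
}
```

The point is that the freshly allocated node is never dereferenced before the NULL check.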
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tapani Pälli [Fri, 30 May 2014 04:47:05 +0000 (07:47 +0300)]
glcpp: link with tests/common.c
So that prog_hash_table can use _mesa_error_no_memory function.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Juha-Pekka Heikkila [Mon, 12 May 2014 08:01:48 +0000 (11:01 +0300)]
mesa/main: Add missing null check in _mesa_CreatePerfQueryINTEL()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Juha-Pekka Heikkila [Fri, 25 Apr 2014 08:34:12 +0000 (11:34 +0300)]
mesa/drivers: Add extra null check in blitframebuffer_texture()
If texObj == NULL here it means there is already a GL_INVALID_VALUE
or GL_OUT_OF_MEMORY error set on the context.
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Juha-Pekka Heikkila [Thu, 3 Apr 2014 13:51:14 +0000 (16:51 +0300)]
glsl: Add null check in loop_analysis.cpp
Check return value from hash_table_find before using it as a pointer
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Juha-Pekka Heikkila [Wed, 26 Feb 2014 12:03:19 +0000 (14:03 +0200)]
mesa: add missing null check in _mesa_NewHashTable()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Gary Wong [Thu, 22 May 2014 03:07:42 +0000 (21:07 -0600)]
loader: add optional /sys filesystem method for PCI identification.
Introduce a simple PCI identification method of looking up the answer
in the /sys filesystem (available on Linux). Attempted after libudev, but
before DRM.
Disabled by default (available only when the --enable-sysfs configure
option is specified).
Signed-off-by: Gary Wong <gtw@gnu.org>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Gary Wong [Thu, 22 May 2014 02:39:15 +0000 (20:39 -0600)]
loader: allow attempting more than one method of PCI identification.
loader_get_pci_id_for_fd() and loader_get_device_name_for_fd() now attempt
all available strategies to identify the hardware, instead of conditionally
compiling in a single test. The existing libudev and DRM approaches have
been retained, attempting first libudev (if available) and then DRM (if
necessary).
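Trying each strategy in turn can be sketched as a loop over function pointers. Everything here is illustrative: the pci_id struct, the strategy signature, and the two stubs (which stand in for the libudev and DRM lookups and return arbitrary example values) are not the loader's actual code.

```cpp
#include <cassert>
#include <cstddef>

struct pci_id { int vendor; int chip; };

// A strategy returns true and fills *out when it can identify the device.
typedef bool (*pci_id_strategy)(int fd, pci_id *out);

// Try each strategy in order; the first one that succeeds wins.
static bool get_pci_id(int fd, const pci_id_strategy *strategies,
                       std::size_t count, pci_id *out)
{
    assert(out != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        if (strategies[i](fd, out))
            return true;
    }
    return false;
}

// Stub that always fails, standing in for an unavailable method.
static bool stub_first(int, pci_id *) { return false; }

// Stub that always succeeds with arbitrary example values.
static bool stub_second(int, pci_id *out)
{
    out->vendor = 0x8086;
    out->chip = 0x0166;
    return true;
}
```

Compared with selecting a single method at compile time, a later method can still succeed when an earlier one is compiled in but fails at run time.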
Signed-off-by: Gary Wong <gtw@gnu.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Wed, 28 May 2014 14:43:35 +0000 (15:43 +0100)]
st/egl: do not link against libloader
Move the link to the final targets, like any other place in
mesa/gallium. This allows better visibility and will prevent
us from including the library archive twice.
Resolves multiple definition of `loader_get_pci_id_for_fd'
multiple definition of `loader_get_pci_id_for_fd'
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79263
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79382
Cc: Chia-I Wu <olv@lunarg.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Emil Velikov [Wed, 28 May 2014 13:36:46 +0000 (14:36 +0100)]
egl_dri2: fix wayland_platform when drm_platform is not set
The build fails with an implicit declaration of drmGetCap (xf86drm.h),
as we're including the header only when building the DRM_PLATFORM.
Wayland backend can operate without DRM_PLATFORM so replace the
guard, and fold in drmGetCap() usage to silence compiler warnings.
Cc: Chad Versace <chad.versace@linux.intel.com>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Matt Turner [Tue, 27 May 2014 22:26:06 +0000 (15:26 -0700)]
i965/fs: Set correct number of regs_written for MCS fetches.
regs_written is in units of virtual GRFs.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jerome Glisse [Thu, 29 May 2014 17:32:21 +0000 (13:32 -0400)]
glx: load dri driver with RTLD_LOCAL so dlclose never fails to unload
There is no longer any reason to load with RTLD_GLOBAL, and for some drivers
this even results in dlclose failing to unload, leading to catastrophic
failure with swrast fallback.
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Stéphane Marchesin [Wed, 28 May 2014 21:15:07 +0000 (14:15 -0700)]
i915g: Support B5G5R5A1 render targets and textures
Stéphane Marchesin [Wed, 28 May 2014 21:00:20 +0000 (14:00 -0700)]
i915g: Support R4G4B4A4 render targets and textures
Stéphane Marchesin [Wed, 28 May 2014 17:29:40 +0000 (10:29 -0700)]
i915g: Fix copy region code
This fixes a few issues with it, also cleans up the code.
Connor Abbott [Wed, 28 May 2014 01:23:05 +0000 (21:23 -0400)]
glsl/tests: remove generated tests from the repo
They were made unnecessary by the last commit.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Connor Abbott [Wed, 28 May 2014 01:23:04 +0000 (21:23 -0400)]
glsl/tests: call create_test_cases.py in optimization-test
This way, when someone modifies create_test_cases.py and forgets to
commit their changes again, people will notice.
v2: make sure we parse the right directories and check for existence the
right way.
v3 (Ken): Use $PYTHON2 instead of calling python directly.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Connor Abbott [Wed, 28 May 2014 01:23:03 +0000 (21:23 -0400)]
glsl/tests/lower_jumps: fix generated sexpr's for loops
In 088494aa (as well as other commits in the series) Paul Berry modified
the tests for lower_jumps to account for the fact that the s-expression
for the loop IR instruction changed from
(loop () () () () (statements...)) to (loop (statements...)), but he
forgot to update create_test_cases.py which he used to create the tests.
Fix that, so that now create_test_cases.py is synced with the generated
tests.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Connor Abbott [Wed, 28 May 2014 01:23:02 +0000 (21:23 -0400)]
glsl: be more consistent about printing constants
Make sure that we print the same number of digits when printing 0.0 as
any other floating-point number. This will make generating expected
output files for tests easier. To avoid breaking "make check," update
the generated tests for lower_jumps before the next commit which will
bring create_test_cases.py in line with them.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Fri, 23 May 2014 20:57:49 +0000 (14:57 -0600)]
glsl: replace strncmp("gl_") calls with new is_gl_identifier() helper
Makes things a little easier to read.
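Such a helper might look like the sketch below. The NULL check is an addition for safety in this example and is not implied by the message.

```cpp
#include <cstring>

// True if the identifier starts with the reserved "gl_" prefix.
static inline bool is_gl_identifier(const char *s)
{
    return s != nullptr && std::strncmp(s, "gl_", 3) == 0;
}
```

Call sites then read as is_gl_identifier(name) instead of repeating the prefix string and its length.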
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Brian Paul [Fri, 23 May 2014 20:59:33 +0000 (14:59 -0600)]
glsl: fix use-after free bug/crash in ast_declarator_list::hir()
The call to get_variable_being_redeclared() may delete 'var' so we
can't reference var->name afterward. We fix that by examining the
var's name before making that call.
Fixes valgrind warnings and possible crash when running the piglit
tests/spec/glsl-1.30/execution/clipping/vs-clip-distance-in-param.shader_test
test (and probably others).
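The shape of the fix can be shown with a toy model. The variable struct and maybe_replace() are stand-ins invented for this sketch; only the ordering (read from 'var' first, then make the call that may delete it) reflects the message.

```cpp
#include <string>

struct variable { std::string name; };

// Toy stand-in for a call that may delete its argument: when an earlier
// declaration exists, 'var' is freed and the earlier one is returned.
static variable *maybe_replace(variable *var, variable *earlier)
{
    if (earlier != nullptr) {
        delete var;          // 'var' is dangling from here on
        return earlier;
    }
    return nullptr;
}

static bool redeclared_name_is_builtin(variable *var, variable *earlier)
{
    // Read everything needed from 'var' before the call that may free it.
    const bool is_builtin = var->name.compare(0, 3, "gl_") == 0;
    maybe_replace(var, earlier);
    return is_builtin;       // safe: does not touch 'var' again
}
```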
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 28 May 2014 01:16:01 +0000 (18:16 -0700)]
i965: Fix repeated usage of rectangle texture coordinate scaling.
Previously, we set up new entries in the params[] array on every access
of a rectangle texture. Unfortunately, we only reserve space for
(2 * MaxTextureImageUnits) extra entries, so programs which accessed
rectangle textures more times than that would write off the end of the
array and likely crash.
We don't really have a decent mapping between the index returned by
_mesa_add_state_reference and our index into the params array, so we
have to manually search for it.
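The search-before-append idea can be sketched like this. The param_list type, its capacity, and the function name are hypothetical; the driver's params[] array and its bookkeeping differ.

```cpp
#include <cstddef>

const std::size_t MAX_PARAMS = 8;   // arbitrary capacity for the sketch

struct param_list {
    const float *entries[MAX_PARAMS];
    std::size_t count;
};

// Returns the index of 'value' in the list, appending it only if it is
// not present yet. Returns -1 if the list is full.
static int find_or_add_param(param_list *list, const float *value)
{
    for (std::size_t i = 0; i < list->count; ++i) {
        if (list->entries[i] == value)
            return static_cast<int>(i);   // reuse the existing slot
    }
    if (list->count == MAX_PARAMS)
        return -1;
    list->entries[list->count] = value;
    return static_cast<int>(list->count++);
}
```

With this, the number of slots used depends on how many distinct values are referenced, not on how many times they are accessed.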
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78691
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: mesa-stable@lists.freedesktop.org
José Fonseca [Wed, 28 May 2014 09:33:33 +0000 (10:33 +0100)]
egl-static: Fix undefined reference to `loader_*'
Trivial. Better than a broken build.
Topi Pohjolainen [Tue, 27 May 2014 12:39:06 +0000 (15:39 +0300)]
meta/blit: Use gl_FragColor also in the msaa blit shader
Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
the meta path.
No piglit regressions on IVB.
Further input from Ken:
"Unfortunately, this doesn't fix MRT for integer data.
In the single-sampled case, since we're directly copying data, we were
read/copy/write data as "float" values, which actually contained the
integer bits. Here, we can't do that since we need to process the
actual integer data.
I do wonder if we could use intBitsToFloat/uintBitsToFloat to stuff the
integer bits in the float gl_FragColor output. Just a crazy idea.
In the long term (post 10.2), I think we should draft an extension that
allows you to do "layout(location = all)" on user-defined fragment
shader outputs. (Or some similar syntax.)"
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Alexandre Courbot [Tue, 27 May 2014 07:03:02 +0000 (16:03 +0900)]
nvc0/ir: use SM35 ISA with GK20A
GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
the GK110 path when this chip is detected.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Alexandre Courbot [Tue, 27 May 2014 07:03:01 +0000 (16:03 +0900)]
nvc0: add GK20A 3D class
GK20A is mostly compatible with GK104, but features a new 3D
class. Add it to the relevant header and use it when GK20A is
detected.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Kenneth Graunke [Sun, 25 May 2014 08:08:56 +0000 (01:08 -0700)]
i965/sf: Replace push/pop in brw_emit_anyprim_setup.
Each of the subroutine emitters alters the predication state, but
otherwise don't change anything (or put it back when they do).
Resetting predication at the end makes these functions idempotent with
regard to the default instruction state - which is a nice property.
With that in place, push/pop is no longer necessary.
v2: Improve whitespace (requested by Matt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:55 +0000 (01:08 -0700)]
i965/sf: Drop unnecessary push/pop in copy_z_inv_w.
brw_MOV doesn't alter the default instruction state, so this does
nothing.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:54 +0000 (01:08 -0700)]
i965/sf: Drop unnecessary push/pop in flatshading code.
brw_JMPI sets predicate_control to BRW_PREDICATE_NONE, but that's
already the value coming in. Otherwise, nothing changes state.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:53 +0000 (01:08 -0700)]
i965/sf: Move brw_compile::flag_value to brw_sf_compile.
This field is only used to track the current value of the flag register
during the SF compile. It has no place in the common compiler code.
While we're changing every call, drop the 'brw' prefix from the function
since it's static.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:52 +0000 (01:08 -0700)]
i965/sf: Move brw_set_predicate_control_flag_value to brw_sf_emit.c.
Only the Gen4-5 SF program compiler actually uses this function; move
it there. Soon the fields will be moved out of brw_compile.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:51 +0000 (01:08 -0700)]
i965/sf: Drop useless push/pop state from flag register mashing code.
There's no point in pushing and popping the default state; the code
between the two stack operations doesn't alter anything.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:50 +0000 (01:08 -0700)]
i965/sf: Drop unnecessary push/pop in do_twoside_color.
None of the assembly emitters called between push and pop actually
change the state. So, we can drop these.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:49 +0000 (01:08 -0700)]
i965: Don't implicitly set predicate default state in brw_CMP.
Previously, brw_CMP with a null destination implicitly set the default
state to make future instructions predicated. This is messy and
confusing - emitting a CMP that populates the flag register and later
using it to predicate instructions are logically separate. With the
main compiler, we may even schedule instructions between the CMP and the
user of the flag value.
This patch simplifies brw_CMP to just emit a CMP instruction, and not
mess with predication. It also updates all necessary callers. These
mostly fell into two patterns:
1. brw_CMP followed by brw_IF.
We don't need to do anything special here; brw_IF already sets up
predication appropriately.
2. brw_CMP followed by a single predicated instruction.
The old model was to call brw_CMP, emit the next (predicated)
instruction, then disable predication for any instructions beyond
that. Instead, just explicitly set predicate_control on the single
instruction we want to predicate. It's no more code, and requires
less cross-module knowledge.
This drops setting flag_value to 0xff as well, which is a field only
used by the SF compile. There is only one brw_CMP call in the SF code,
which is in do_twoside_color, and called at the start of
brw_emit_tri_setup, where flag_value is already 0xff.
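Pattern 2 above can be illustrated with a toy emitter. None of these types or functions are the i965 assembler API; they only model the idea of a default instruction state versus a per-instruction predicate.

```cpp
#include <vector>

enum predicate { PREDICATE_NONE, PREDICATE_NORMAL };

struct inst { const char *op; predicate pred; };

struct emitter {
    predicate default_pred = PREDICATE_NONE;   // copied into every new instruction
    std::vector<inst> insts;

    inst &emit(const char *op)
    {
        insts.push_back(inst{op, default_pred});
        return insts.back();
    }
};

// A CMP followed by a single predicated instruction.
static void emit_cmp_then_predicated_mov(emitter &e)
{
    e.emit("cmp");                    // only writes the flag; default state untouched
    inst &mov = e.emit("mov");
    mov.pred = PREDICATE_NORMAL;      // predicate exactly this instruction
    e.emit("add");                    // later instructions stay unpredicated
}
```

Since the CMP leaves the default state alone, there is nothing to undo afterwards.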
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:48 +0000 (01:08 -0700)]
i965: Drop unnecessary predication default state resets in clip code.
Presumably, this was to reset the default state of predication_control
from brw_CMP. But brw_CMP only sets that if dst is ARF null, which it
isn't here.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sun, 25 May 2014 08:08:47 +0000 (01:08 -0700)]
i965/sf: Reset flag_value to 0xff before emitting SF subroutines.
When compiling any of the SF program variants, flag_value starts off as
0xff and will be modified when generating code.
brw_emit_anyprim_setup emits several subroutines, saving and restoring
flag_value across each of them. Since it starts out as 0xff, this is
equivalent to simply setting it to 0xff at the start of each subroutine.
Resetting the value makes more logical sense; each subroutine doesn't
know whether one of the others even executed, much less what it did
to the flag register.
This also lets us drop the brw_set_predicate_control_flag_value call
from brw_init_compile: predicate is already initialized to
BRW_PREDICATE_NONE by the memset, and the value of flag_value is
irrelevant (as it's only used by the SF compiler).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Leo Liu [Tue, 27 May 2014 14:12:02 +0000 (10:12 -0400)]
st/omx/enc: implement restricted b frames pattern
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Leo Liu [Tue, 27 May 2014 14:12:01 +0000 (10:12 -0400)]
radeon/vce: implement non-referenced frames
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Leo Liu [Tue, 27 May 2014 14:12:00 +0000 (10:12 -0400)]
vl: add interface for non-referenced frames
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Topi Pohjolainen [Wed, 21 May 2014 08:44:34 +0000 (11:44 +0300)]
i965/meta: Store stencil texturing mode
Meta path needs to keep the current texture object's state. Fixes
the following gles3 cts tests on bdw:
framebuffer_blit_functionality_negative_width_blit.test: fail
framebuffer_blit_functionality_all_buffer_blit.test: fail
framebuffer_blit_functionality_negative_height_blit.test: fail
framebuffer_blit_functionality_missing_buffers_blit.test: fail
framebuffer_blit_functionality_negative_dimensions_blit.test: fail
framebuffer_blit_functionality_minifying_blit.test: fail
framebuffer_blit_functionality_magnifying_blit.test: fail
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Wed, 21 May 2014 08:43:56 +0000 (11:43 +0300)]
meta/blit: Add stencil texturing mode save and restore
v2 (Ken): Only restore the mode if it has changed.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Stéphane Marchesin [Tue, 27 May 2014 05:12:17 +0000 (22:12 -0700)]
i915g: Fix shader disasm code
This broke when I separated declarations/shader.
Stéphane Marchesin [Tue, 27 May 2014 00:31:57 +0000 (17:31 -0700)]
i915g: Fallback to sw for npot copies
i915g's npot support is incomplete, so let's not use it for copies.
This fixes a bunch of piglit tests.
Stéphane Marchesin [Mon, 26 May 2014 13:48:11 +0000 (06:48 -0700)]
i915g: handle more formats in copy
We can handle depth, luminance,... copies by simply replacing the
format with a known format of the same bpp.
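The substitution can be sketched as a lookup on pixel size alone. The copy_format enum and the function name are hypothetical; the actual formats the driver substitutes are not listed in this message.

```cpp
// Hypothetical set of formats the copy path understands.
enum copy_format { COPY_FMT_NONE, COPY_FMT_8, COPY_FMT_16, COPY_FMT_32 };

// Pick a copy format based only on bytes per pixel. A raw copy moves bits
// unchanged, so the channel layout of the original format does not matter.
static copy_format copy_format_for_size(unsigned bytes_per_pixel)
{
    switch (bytes_per_pixel) {
    case 1:  return COPY_FMT_8;
    case 2:  return COPY_FMT_16;
    case 4:  return COPY_FMT_32;
    default: return COPY_FMT_NONE;   // caller must fall back to another path
    }
}
```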
Tobias Klausmann [Tue, 27 May 2014 00:19:01 +0000 (02:19 +0200)]
nvc0: implement clear_buffer
Provide an accelerated path for ARB_clear_buffer_object
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Matt Turner [Sat, 17 May 2014 22:54:05 +0000 (15:54 -0700)]
i965: Switch types D->UD when possible to allow compaction.
Number of compacted instructions: 827404 -> 833045 (0.68%)
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Mon, 26 May 2014 18:45:48 +0000 (11:45 -0700)]
Revert "i965: Don't make instructions with a null dest a barrier to scheduling."
This reverts commit 42a26cb5e441a01d5288b299980f23affaad53fe.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648
Matt Turner [Mon, 26 May 2014 18:44:57 +0000 (11:44 -0700)]
Revert "i965/fs: Simplify interference scan in register coalescing."
This reverts commit 5ff1e446d44bb9d50f84883c7058635cb070e069.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704
Matt Turner [Mon, 26 May 2014 18:44:53 +0000 (11:44 -0700)]
Revert "i965/fs: Give up in interference check if we see a WHILE."
This reverts commit 55de1c035cbca2b7087b3aa21a8c3dfc900a4ad9.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Matt Turner [Mon, 26 May 2014 18:44:09 +0000 (11:44 -0700)]
Revert "i965/fs: Reduce restrictions on interference in register coalescing."
This reverts commit f770123f58b46459e8dbd27525162ee8ba89f30b.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692
Ilia Mirkin [Fri, 23 May 2014 15:31:39 +0000 (11:31 -0400)]
nvc0: revert mistaken logic to collapse color outputs to the beginning
In commit af38ef907, I added a "fix" to color outputs not being assigned
correctly when sample mask was being output. This was totally wrong --
the color indices (i.e. "si" values) were the ones that were wrong. Undo
that hunk.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Ilia Mirkin [Fri, 23 May 2014 15:18:16 +0000 (11:18 -0400)]
mesa/st: fix color outputs in presence of sample mask output
Commit c5d822dad90 added support for sample mask incorrectly. It became
treated as a color output, and messed up the color output indices.
Revert the hunk that did that, and add explicit support just like for
depth/stencil writes.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Rob Clark [Mon, 26 May 2014 13:03:09 +0000 (09:03 -0400)]
freedreno/a3xx: texture fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 26 May 2014 12:58:17 +0000 (08:58 -0400)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 24 May 2014 14:07:13 +0000 (10:07 -0400)]
freedreno: few caps fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Vinson Lee [Mon, 26 May 2014 04:32:49 +0000 (21:32 -0700)]
mesa/x86: Fix build with clang <= 3.3.
clang <= 3.3 cpuid.h does not define constants for feature bits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79095
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
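A minimal sketch of the kind of fallback the clang fix above implies:
define the feature-bit macro only when the compiler's cpuid.h did not.
The bit position (CPUID leaf 1, ECX bit 19 for SSE4.1) is architectural;
the helper name has_feature() is hypothetical and is not taken from the
Mesa source.

```cpp
#include <cstdint>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   // GCC/clang header; old clang lacks the bit_* macros
#endif

// Fallback: SSE4.1 is reported in CPUID leaf 1, ECX bit 19.
#ifndef bit_SSE4_1
#define bit_SSE4_1 (1 << 19)
#endif

// Hypothetical helper: test one feature bit in a CPUID result register.
static inline bool has_feature(std::uint32_t reg, std::uint32_t bit)
{
    return (reg & bit) != 0;
}
```

With this guard the same source builds whether or not the header already
provides the constant.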
Matt Turner [Thu, 17 Apr 2014 20:55:06 +0000 (13:55 -0700)]
i965: Don't treat HW_REGs as barriers if they're immediates.
We had a handful of cases where we'd used brw_imm_*() to generate an
immediate, rather than fs_reg(). We shouldn't do that but we shouldn't
limit scheduling flexibility on account of immediate arguments either.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
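The barrier rule from the commit above can be sketched as a predicate.
This is a simplified model, not the driver's code: RegFile, Reg and
needs_barrier() are hypothetical stand-ins for the i965 register types
and scheduler logic.

```cpp
// Hypothetical, simplified stand-ins for the i965 register files.
enum class RegFile { GRF, MRF, IMM, HW_REG, UNIFORM };

struct Reg {
    RegFile file;
    bool    hw_is_immediate;  // for HW_REG: does it wrap an immediate?
};

// A HW_REG source forces a scheduling barrier only when it is a real
// hardware register, not an immediate built with a brw_imm_*() style
// helper.
static bool needs_barrier(const Reg &src)
{
    return src.file == RegFile::HW_REG && !src.hw_is_immediate;
}
```

Under this model an immediate wrapped in a HW_REG no longer limits
scheduling, while a genuine hardware register still does.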
Matt Turner [Thu, 17 Apr 2014 18:53:22 +0000 (11:53 -0700)]
i965/fs: Don't use brw_imm_* unnecessarily.
Using brw_imm_* creates a source with file=HW_REG, and the scheduler
inserts barrier dependencies when it sees HW_REG. None of these are
hardware-registers in the sense that they're special and scheduling
shouldn't touch them. A few of the modified cases already have HW_REGs
for other sources, so it won't allow extra flexibility in some cases.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Emil Velikov [Sun, 25 May 2014 02:23:42 +0000 (03:23 +0100)]
automake: correctly append the version-script
Turns out that the AC conditional did not include the
version-scripts as expected. Rather it truncated
the remaining linker flags.
Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Emil Velikov [Sun, 25 May 2014 00:54:42 +0000 (01:54 +0100)]
targets/libgl-xlib: hide all the exported symbol mayhem
Leave only the gl/glx and mangled gl symbols.
XMesa* was never an official interface and the only
user of it was mesa-demos, while they were still in
the same repo as mesa.
v2: Conditionally use the version-script.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sun, 25 May 2014 00:46:42 +0000 (01:46 +0100)]
targets/osmesa: include mangled gl symbols
Missed out with commit d4c3968c25885f6eb53dee4cc0c60d8d3f8fec32
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Fri, 2 May 2014 21:02:15 +0000 (22:02 +0100)]
targets/xa: limit the amount of exported symbols
In the presence of LLVM the final library exports every symbol from
the llvm namespace. Resolve this by using a version script (w/o the
version/name tag).
Considering that there are only ~35 symbols, explicitly list them
to minimize the chances of rogue symbols sneaking in.
v2: Conditionally include the version-script.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Fri, 2 May 2014 21:02:14 +0000 (22:02 +0100)]
dri_util: keep __dri2ConfigOptions symbol private
The symbol was added with commit 45e2b51c853 (DRI2/GLX: check for
vblank_mode in DRI2 GLX code) but was never used as such according
to git log.
Possibly it was marked as public due to confusion with
__driConfigOptions which was used for dri1 drivers.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Kai Wasserbäch [Mon, 19 May 2014 15:02:49 +0000 (17:02 +0200)]
targets/opencl: Fix (static) linking with LLVM (v2)
Without this, I get linking failures (static linking).
The static linking is sort of required for me, because otherwise Steam and
applications using the Steam runtime regularly fail because my LLVM was
compiled and linked against a newer libgcc_s, libstdc++, etc. and uses
features from those newer versions. And instead of Steam just not
starting, my X starts crashing, whenever libGL fails to load a (32 bit)
driver.
Since I hate crashes of X and I don't think Valve/Steam will behave like
a proper distribution soon (rebuilds versus current Debian Testing, since
they base their Steam OS off that), I need a radeonsi which carries its
own LLVM within and doesn't care about what the runtime sets. This means
linking Mesa statically.
v1 → v2: Move logic to configure.ac
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Emil Velikov [Sat, 10 May 2014 02:41:45 +0000 (03:41 +0100)]
glx: do not leak dri3Display
v2: Do not wrap the code in ifdef HAVE_DRI3 (suggested by Keith)
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 10 May 2014 02:41:44 +0000 (03:41 +0100)]
gallium/egl: st_profiles are build time decision, treat them as such
The profiles are present depending on the defines at build time.
Drop the extra functions and feed the defines directly into the
state-tracker at build time.
v2: Drop unused variable i.
Acked-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 10 May 2014 02:41:43 +0000 (03:41 +0100)]
dri_util: set implemented version of the DRI_CORE extension
... rather than the one defined in our internal interface (dri_interface.h)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Sun, 25 May 2014 17:30:13 +0000 (10:30 -0700)]
i965/fs: Don't modify ann_count if not debugging.
If we make ann_count non-zero, annotation_finalize() won't bail.
Not modifying it seems to make the code more clear than would modifying
annotation_finalize().
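A rough model of the interaction described above, assuming a counter
that the finalize step checks before doing any work. AnnotationInfo,
annotate() and finalize() are hypothetical names chosen for
illustration; only ann_count and the early bail-out come from the commit
message.

```cpp
#include <cstddef>

// Hypothetical, simplified annotation bookkeeping.
struct AnnotationInfo {
    std::size_t ann_count = 0;
    bool        finalized = false;
};

// Record an annotation only when debugging output was requested, so
// ann_count stays zero otherwise.
static void annotate(AnnotationInfo &info, bool debugging)
{
    if (!debugging)
        return;
    info.ann_count++;
}

// Bails out early when nothing was recorded.
static void finalize(AnnotationInfo &info)
{
    if (info.ann_count == 0)
        return;
    info.finalized = true;
}
```

Leaving the counter untouched in the non-debug path means the existing
early return in the finalize step keeps working without modification.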
Matt Turner [Thu, 22 May 2014 16:39:13 +0000 (09:39 -0700)]
Revert "i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6"
This reverts commit a6860100b87415ab510d0d210cabfeeccebc9a0a.
Why this code didn't work in all circumstances is unknown and without a
working Ironlake simulator (which uses a different AUB format) we'll
probably never know, short of a lot of experimentation, and spending a
bunch of time to try to optimize a few instructions on Ironlake is not
time well spent.
Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a
dependence between the otherwise independent per-component calculations.
Not using the accumulator, even if it means an extra instruction per
component might be preferable. We don't know, we don't have data, and
we don't have the necessary register on Ironlake for shader_time to tell
us.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
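To illustrate the dependence argument above: GLSL's mix(x, y, a) is
x*(1-a) + y*a per component. The sketch below contrasts an independent
per-component form with one that routes every component through a single
shared accumulator. It is a scalar model under stated assumptions, not
the reverted instruction sequence; Vec4, mix_independent() and
mix_accumulator() are hypothetical names.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;

// Per-component form: each lane is computed independently of the others.
static Vec4 mix_independent(const Vec4 &x, const Vec4 &y, const Vec4 &a)
{
    Vec4 r{};
    for (int i = 0; i < 4; i++)
        r[i] = x[i] * (1.0f - a[i]) + y[i] * a[i];
    return r;
}

// Accumulator form: every lane passes through one shared accumulator,
// modelling the serial dependence a MUL + MAC pair would introduce.
static Vec4 mix_accumulator(const Vec4 &x, const Vec4 &y, const Vec4 &a)
{
    Vec4 r{};
    float acc = 0.0f;                  // single shared accumulator
    for (int i = 0; i < 4; i++) {
        acc = y[i] * a[i];             // MUL: write the accumulator
        acc += x[i] * (1.0f - a[i]);   // MAC: multiply-add into it
        r[i] = acc;
    }
    return r;
}
```

Both forms compute the same values; the difference is that the second
cannot start component i+1 until the accumulator is free again, which is
the scheduling cost the commit message weighs against the saved
instruction.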
Matt Turner [Thu, 22 May 2014 16:38:24 +0000 (09:38 -0700)]
Revert "i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6"
This reverts commit 2dfbbeca50b95ccdd714d9baa4411c779f6a20d9 with the
comment about MAC and implicit accumulator removed.
Why this code didn't work in all circumstances is unknown and without a
working Ironlake simulator (which uses a different AUB format) we'll
probably never know, short of a lot of experimentation, and spending a
bunch of time to try to optimize a few instructions on Ironlake is not
time well spent.
Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a
dependence between the otherwise independent per-component calculations.
Not using the accumulator, even if it means an extra instruction per
component might be preferable. We don't know, we don't have data, and
we don't have the necessary register on Ironlake for shader_time to tell
us.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 19 May 2014 21:08:37 +0000 (14:08 -0700)]
i965: Remove useless typo'd debugging messages.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 19 May 2014 21:02:26 +0000 (14:02 -0700)]
i965: Move brw_land_fwd_jump() to compilation unit of its use.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Sun, 18 May 2014 18:16:26 +0000 (11:16 -0700)]
i965/fs: Use next_insn_offset rather than nr_insn.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>