Ian Romanick [Wed, 26 Feb 2014 20:48:56 +0000 (12:48 -0800)]
i915: Allocate the sys_buffer using _mesa_align_malloc
Though it won't matter on Linux, use _mesa_align_free to release it.
Since i965 doesn't have sys_buffer, I overlooked this in the
GL_ARB_map_buffer_alignment work a few months ago. Fixes i915 (and
presumably i830) regressions in ARB_map_buffer_range tests and the
failure in arb_map_buffer_alignment-sanity_test.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74960
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
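A minimal sketch of why the allocation and release calls have to be paired, assuming a generic over-allocating aligned allocator. sketch_align_malloc and sketch_align_free are hypothetical names for illustration, not Mesa's _mesa_align_malloc implementation, and the alignment is assumed to be a power of two no smaller than sizeof(void *):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: over-allocate, round the address up to the requested
 * alignment, and stash the pointer returned by malloc() just below the
 * aligned block so it can be recovered later. */
static void *sketch_align_malloc(size_t bytes, size_t alignment)
{
   /* Room for the payload, worst-case padding, and the stashed pointer. */
   void *raw = malloc(bytes + alignment + sizeof(void *));
   if (raw == NULL)
      return NULL;

   uintptr_t start = (uintptr_t)raw + sizeof(void *);
   uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

   ((void **)aligned)[-1] = raw;
   return (void *)aligned;
}

/* Must be used instead of free(): the aligned pointer is generally not the
 * pointer that malloc() returned. */
static void sketch_align_free(void *ptr)
{
   if (ptr != NULL)
      free(((void **)ptr)[-1]);
}
```

With an allocator of this shape, passing the aligned pointer straight to free() would be undefined behaviour, which is the kind of mismatch a matching release function avoids.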
Ian Romanick [Wed, 26 Feb 2014 20:32:29 +0000 (12:32 -0800)]
i915: Only allow 8 vertex texture units
There's no reason to have more vertex texture units than fragment
texture units on this hardware. Since the default maximum number of
texture units was increased from 16 to 32, this has triggered segfaults
in the i915 driver. There's probably some array or bitfield that isn't
properly sized now. This really papers over the bug, but I don't think
I'll lose any sleep over that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74071
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Petri Latvala [Thu, 27 Feb 2014 14:15:05 +0000 (16:15 +0200)]
i965: Assert array index on access to vec4_visitor's arrays.
v2: vec4_visitor::pack_uniform_registers(): Use correct comparison in the
assert, this->uniforms is already adjusted. Compare the actual value used to
index uniform_size and uniform_vector_size instead.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Petri Latvala [Thu, 27 Feb 2014 14:15:04 +0000 (16:15 +0200)]
i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
v2: Don't add function parameters, pass the required size in
prog_data->nr_params.
v3:
- Use the name uniform_array_size instead of uniform_param_count.
- Round up when dividing param_count by 4.
- Use MAX2() instead of taking the maximum by hand.
- Don't crash if prog_data passed to vec4_visitor constructor is NULL
v4: Rebase for current master
v5 (idr): Trivial whitespace change.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
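The sizing rule described in the v3 notes (round up when dividing by 4, take the maximum with MAX2()) could look roughly like the sketch below. The helper name, the macro names and the lower bound of 1 are assumptions made for illustration; the log does not state what the other MAX2() operand is:

```c
#define SKETCH_MAX2(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Number of vec4-sized slots needed to hold nr_params scalar parameters.
 * The floor of 1 is an assumption so the arrays are never zero-sized. */
static unsigned sketch_uniform_array_size(unsigned nr_params)
{
   return SKETCH_MAX2(SKETCH_DIV_ROUND_UP(nr_params, 4u), 1u);
}
```

Rounding up matters because five scalar parameters need two vec4 slots, not one.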
Marek Chalupa [Thu, 27 Feb 2014 08:23:21 +0000 (09:23 +0100)]
gbm: export gbm_device_is_format_supported
Probably depending on compiler settings, the definition can be hidden,
so an undefined reference error can be encountered during linking.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75528
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Thu, 27 Feb 2014 18:20:53 +0000 (18:20 +0000)]
configure: use enable_dri_glx local variable
GLX can be either dri or xlib based, while enable_dri is
used in a variety of contexts.
With enable_dri_glx the context is clearly visible.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 24 Feb 2014 22:58:13 +0000 (22:58 +0000)]
configure: enable the drm pipe-loader for non swrast drivers
All hardware drivers including the virtual vmwgfx require
the drm pipe-loader in order to be properly loaded by xa,
gbm and opencl.
Note this does _not_ add support for the above three; it only
allows the pipe driver to be loaded by the library.
E.g. GBM will now properly open the pipe-i915 driver, should
one be working on such hardware.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75453
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 24 Feb 2014 22:58:10 +0000 (22:58 +0000)]
configure: error out when building xa only with swrast
Building to provide acceleration using swrast does not make
sense.
Note: update your build script to explicitly mention svga
in the gallium drivers list, if you are building the vmwgfx
xa library.
v2: Update error message to provide more clarity, add an example.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 24 Feb 2014 22:58:07 +0000 (22:58 +0000)]
configure: avoid setting variables as empty strings
A recent patch converted our logic to use test -n and test -z.
An empty string variable (empty_str="") returns true for both,
thus making the check unreliable.
Fix this by correctly setting the variable when applicable.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 24 Feb 2014 22:57:59 +0000 (22:57 +0000)]
configure: avoid constantly building megadrivers 'core'
The issue is caused by a thinko that an empty string will be
considered of zero length by 'test'. This is not the case,
thus we were building the 'core' of megadrivers even when no
classic drivers were built.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tom Stellard [Mon, 24 Feb 2014 21:51:05 +0000 (16:51 -0500)]
r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs
This prevents clover from using unsupported devices.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Matt Turner [Sun, 23 Feb 2014 00:35:15 +0000 (16:35 -0800)]
glsl: Don't vectorize horizontal expressions.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75224
Matt Turner [Sun, 23 Feb 2014 00:35:14 +0000 (16:35 -0800)]
glsl: Add is_horizontal() method to ir_expression.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Matt Turner [Mon, 24 Feb 2014 23:00:45 +0000 (15:00 -0800)]
glsl: Optimize lrp(x, 0, a) into x - (x * a).
Helps one program in shader-db:
instructions in affected programs: 96 -> 92 (-4.17%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Mon, 24 Feb 2014 23:00:44 +0000 (15:00 -0800)]
glsl: Optimize lrp(0, y, a) into y * a.
Helps two programs in shader-db:
instructions in affected programs: 254 -> 234 (-7.87%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
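Both lrp simplifications in this log follow from the usual definition lrp(x, y, a) = x * (1 - a) + y * a. The sketch below checks the two identities numerically; the function names are hypothetical and this is plain C arithmetic, not the GLSL IR optimization itself:

```c
/* GLSL mix()/lrp semantics: blend from x to y by weight a. */
static float sketch_lrp(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* lrp(x, 0, a): the y term vanishes, leaving x * (1 - a) = x - (x * a). */
static float sketch_lrp_y_zero(float x, float a)
{
   return x - (x * a);
}

/* lrp(0, y, a): the x term vanishes, leaving y * a. */
static float sketch_lrp_x_zero(float y, float a)
{
   return y * a;
}
```

Each simplified form uses fewer operations than the general expression, which is consistent with the instruction-count reductions quoted from shader-db.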
Brian Paul [Thu, 27 Feb 2014 15:36:13 +0000 (08:36 -0700)]
mesa: do depth/stencil format conversion in glGetTexImage
glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just
using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row()
to convert texels from the hardware format to the GL format.
Fixes issue reported by David Meng at Intel. The new piglit
ext_packed_depth_stencil-getteximage test checks for this bug.
Also, add some format/type assertions. We don't yet handle the
GL_FLOAT_32_UNSIGNED_INT_24_8_REV type. That should be fixed in
a follow-on patch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>
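To illustrate why a memcpy() is not enough, here is a sketch that assumes a hardware texel layout with stencil in the top 8 bits and depth in the low 24 bits. That layout and the function name are assumptions for illustration; the real row unpacker handles whichever hardware formats Mesa supports:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed source layout (hypothetical): stencil in bits 31..24, depth in
 * bits 23..0. GL_UNSIGNED_INT_24_8 wants depth in bits 31..8 and stencil in
 * bits 7..0, so copying the words unchanged would return swapped fields. */
static void sketch_unpack_row_s8z24_to_24_8(const uint32_t *src,
                                            uint32_t *dst, size_t n)
{
   for (size_t i = 0; i < n; i++) {
      uint32_t stencil = src[i] >> 24;
      uint32_t depth = src[i] & 0x00ffffffu;
      dst[i] = (depth << 8) | stencil;
   }
}
```

When the hardware layout already matches the GL layout the conversion degenerates to a copy, which may be why the bug went unnoticed for some formats.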
Brian Paul [Thu, 27 Feb 2014 16:11:51 +0000 (09:11 -0700)]
mesa: fix depth/stencil comments in formats.h
Thomas Hellstrom [Thu, 20 Feb 2014 13:32:07 +0000 (14:32 +0100)]
winsys/svga: Avoid calling drm getparam for max surface size on older kernels
This avoids the kernel driver spewing out errors about the param not being
supported.
Also correct the max surface size used when the kernel does not support the
query.
Reported-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Thu, 27 Feb 2014 01:40:43 +0000 (17:40 -0800)]
meta: Drop ctx->API checks.
API is always API_OPENGL_COMPAT (since commit
4e4a537ad55f61a25,
"meta: Push into desktop GL mode when doing meta operations."),
so most of these checks do nothing.
We could instead check save->API to only bother setting/restoring
relevant GL state, but I'm not sure saving a few _mesa_set_enable
calls is worth the complexity. My understanding is the point of
the ctx->API guards was to avoid raising GL errors.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 27 Feb 2014 06:19:33 +0000 (22:19 -0800)]
meta: Restore API at the end of _mesa_meta_end(), not the start.
In _mesa_meta_begin(), we switch to API_OPENGL_COMPAT, then munge a lot
of state (including some that doesn't exist in the actual API - like
PolygonStipple in API_OPENGL_CORE).
It seems reasonable that in _mesa_meta_end(), we should restore it,
then switch back to the original API. This at least makes it symmetric.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Roland Scheidegger [Thu, 27 Feb 2014 16:35:08 +0000 (17:35 +0100)]
util/u_format: don't crash in util_format_translate if we can't do translation
Some formats can't be handled; in particular, int/uint formats lack the
pack_rgba_float/unpack_rgba_float functions. Instead of trying to call
these (and crashing), return an error. (I'm not sure yet whether we should
try to translate such formats here too; it might not make much sense.)
v2: suggested by Jose, use separate checks for pack/unpack of rgba_8unorm and
rgba_float functions (right now if one exists the other should as well).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
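The guard can be sketched as a NULL check on the function pointers before they are called. The struct, the two example format descriptions and the function names below are hypothetical, cut-down stand-ins, not the real util_format_description:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced format description: integer formats leave the
 * float pack/unpack hooks NULL. */
struct sketch_format_desc {
   void (*unpack_rgba_float)(float *dst, const void *src, unsigned count);
   void (*pack_rgba_float)(void *dst, const float *src, unsigned count);
};

static void sketch_unpack_f32(float *dst, const void *src, unsigned count)
{
   memcpy(dst, src, count * 4 * sizeof(float));
}

static void sketch_pack_f32(void *dst, const float *src, unsigned count)
{
   memcpy(dst, src, count * 4 * sizeof(float));
}

static const struct sketch_format_desc sketch_float_fmt = {
   sketch_unpack_f32, sketch_pack_f32
};
static const struct sketch_format_desc sketch_int_fmt = { NULL, NULL };

/* Returns false instead of calling through a NULL hook. */
static bool sketch_translate(const struct sketch_format_desc *src_fmt,
                             const struct sketch_format_desc *dst_fmt,
                             void *dst, const void *src,
                             float *tmp, unsigned count)
{
   if (src_fmt->unpack_rgba_float == NULL || dst_fmt->pack_rgba_float == NULL)
      return false;

   src_fmt->unpack_rgba_float(tmp, src, count);
   dst_fmt->pack_rgba_float(dst, tmp, count);
   return true;
}
```

Checking the unpack hook of the source and the pack hook of the destination separately mirrors the v2 suggestion of not assuming that one implies the other.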
Kenneth Graunke [Tue, 25 Feb 2014 20:21:41 +0000 (12:21 -0800)]
i965: Convert VUE map generation checks to if rather than switch.
There are currently only two VUE map layouts: one for Gen4-5, and one
for everything else. We keep having to add new "case N+1" labels for
every new hardware generation, and so far it's always been the same.
This patch makes it so we only have to do work in the case where
something actually changes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 25 Feb 2014 20:21:40 +0000 (12:21 -0800)]
i965: Only emit VS state pipe control workaround on IVB and BYT.
According to the BSpec's 3D workarounds page, this is unnecessary on
shipping Haswell hardware, and was never necessary on Broadwell. It
unfortunately doesn't say anything about Baytrail.
The workaround database confirms those results for Ivybridge, Haswell,
and Broadwell. Baytrail is less clear - one page says it's necessary,
while the other says it isn't. For now, be conservative and leave it
enabled.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Ilia Mirkin [Fri, 21 Feb 2014 06:05:10 +0000 (01:05 -0500)]
nouveau: add a nouveau_compiler binary to compile TGSI into shader ISA
This makes it easy to compare output between different cards, especially
for ones that you don't have (and/or that are not in the current machine).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 21 Feb 2014 07:22:31 +0000 (02:22 -0500)]
nv30: remove nv30_context use from nvfx_*prog
This should pave the way to being able to use the compiler without a
context. Also leads to cleaner code.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 21 Feb 2014 06:57:49 +0000 (01:57 -0500)]
nv30: remove unused sprite flipping parameter
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 21 Feb 2014 06:52:30 +0000 (01:52 -0500)]
nv30: remove unused render_mode and hw_pointsprite_control
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 21 Feb 2014 06:49:48 +0000 (01:49 -0500)]
nv30: remove use_nv4x, it is identical to is_nv4x
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 27 Feb 2014 04:35:14 +0000 (23:35 -0500)]
docs: update nvc0 state
ARB_texture_buffer_object_rgb32 has been supported for a while already.
Michel Daenzer [Thu, 13 Feb 2014 06:37:11 +0000 (15:37 +0900)]
radeonsi: Prevent geometry shader from emitting too many vertices
Anuj Phogat [Wed, 8 Jan 2014 01:46:45 +0000 (17:46 -0800)]
i965: Fix the region's pitch condition to use blitter
intelEmitCopyBlit uses a signed 16-bit integer to represent
buffer pitch, so it can only handle buffer pitches < 32k.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
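The limit follows directly from the range of a signed 16-bit integer. A minimal sketch of such a check, with a hypothetical function name and assuming the pitch is given in bytes:

```c
#include <stdbool.h>
#include <stdint.h>

/* A signed 16-bit field tops out at 32767, so any pitch of 32768 (32k) or
 * more cannot be represented and the blitter path has to be skipped. */
static bool sketch_pitch_fits_blitter(uint32_t pitch_bytes)
{
   return pitch_bytes <= (uint32_t)INT16_MAX;
}
```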
Brian Paul [Tue, 25 Feb 2014 19:38:45 +0000 (12:38 -0700)]
glsl: add switch case for MESA_SHADER_COMPUTE
To fix warning about unhandled enum value.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Kenneth Graunke [Sat, 22 Feb 2014 03:15:10 +0000 (19:15 -0800)]
meta: Use a #define for the vector type to avoid %svec4 everywhere.
By adding "#define gvec4 %svec4" to the top of our fragment shader, we
can write generic code without needing to specialize it to vec4, ivec4,
or uvec4 via asprintf.
This also makes the INT and UNSIGNED_INT merge function code identical,
so I combined those two cases.
It's not a big savings, but a little bit tidier.
v2: Rebase on Vinson's MSVC build fixes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
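A sketch of the idea using snprintf: only the #define line is specialized, and the rest of the shader text is written once in terms of gvec4. The shader body and the function name are made up for illustration and are not the meta shaders themselves:

```c
#include <stdio.h>

/* prefix is "" for float, "i" for signed int, "u" for unsigned int.
 * Returns what snprintf() returns: the length the full text needs. */
static int sketch_build_fs(char *buf, size_t size, const char *prefix)
{
   return snprintf(buf, size,
                   "#version 130\n"
                   "#define gvec4 %svec4\n"
                   "out gvec4 out_color;\n"
                   "void main() { out_color = gvec4(0); }\n",
                   prefix);
}
```

Because the type prefix appears in exactly one place, the signed and unsigned integer variants differ only in that one argument.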
Kenneth Graunke [Wed, 26 Feb 2014 06:15:30 +0000 (22:15 -0800)]
i965: Don't try to dump shader source for fixed-function FS programs.
sh->Source is NULL and this will segfault.
Fixes MESA_GLSL=dump with "The Swapper".
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Sun, 23 Feb 2014 07:47:30 +0000 (23:47 -0800)]
i965: Don't forget to subtract mt->first_level in minify calls.
This fixes fbo-clear-formats GL_ARB_depth_texture on Ironlake, which
regressed since commit
f128bcc7c293013f4b44e4b661638333de0077c2
("i965: Drop mt->levels[].width/height.") intel_miptree_copy_slice was
calling minify(.., 7) on a 2x2 texture with mt->first_level == 7.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75292
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
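The arithmetic behind the regression can be sketched as follows, assuming minify() halves a dimension once per level with a floor of 1, and that the size stored in the miptree is the size of its first level. The helper names are hypothetical:

```c
/* Halve value once per level, never going below 1. */
static unsigned sketch_minify(unsigned value, unsigned levels)
{
   unsigned shifted = value >> levels;
   return shifted > 1 ? shifted : 1;
}

/* Size of GL level `level` in a miptree whose stored base size belongs to
 * GL level `first_level`: only the difference should be minified away. */
static unsigned sketch_level_size(unsigned base_size, unsigned first_level,
                                  unsigned level)
{
   return sketch_minify(base_size, level - first_level);
}
```

For the case in the message, a stored size of 2 with first_level == 7 gives sketch_minify(2, 7) == 1 when the level is used directly, but 2 once first_level is subtracted.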
Kenneth Graunke [Mon, 24 Feb 2014 00:34:04 +0000 (16:34 -0800)]
glsl: Delete LRP_TO_ARITH lowering pass flag.
It's kind of a trap---calling do_common_optimization() after
lower_instructions() may cause opt_algebraic() to reintroduce
ir_triop_lrp expressions that were lowered, effectively defeating the
point. Because of this, nobody uses it.
v2: Delete more code (caught by Ian Romanick).
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Mon, 24 Feb 2014 00:32:39 +0000 (16:32 -0800)]
i965: Stop lowering ir_triop_lrp.
Both the vector and scalar backends now support it natively, so there's
no point in lowering it.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Mon, 24 Feb 2014 00:29:46 +0000 (16:29 -0800)]
i965/vec4: Handle ir_triop_lrp on Gen4-5 as well.
When the vec4 backend encountered an ir_triop_lrp, it always emitted an
actual LRP instruction, which only exists on Gen6+. Gen4-5 used
lower_instructions() to decompose ir_triop_lrp at the IR level.
Since commit
8d37e9915a3b21 ("glsl: Optimize open-coded lrp into lrp."),
we've had a bug where lower_instructions translates ir_triop_lrp into
arithmetic, but opt_algebraic reassembles it back into a lrp.
To avoid this ordering concern, just handle ir_triop_lrp in the backend.
The FS backend already does this, so we may as well do likewise.
v2: Add a comment reminding us that we could emit better assembly if we
implemented the infrastructure necessary to support using MAC.
(Assembly code provided by Eric Anholt).
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Mon, 24 Feb 2014 00:08:56 +0000 (16:08 -0800)]
i965/vec4: Add a brw->gen >= 6 assertion in three-source emitters.
Three source instructions didn't exist until Gen6. vec4_generator has
assertions to catch this, but catching it in the visitor provides a
nicer backtrace.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Chia-I Wu [Wed, 26 Feb 2014 03:29:18 +0000 (11:29 +0800)]
ilo: create u_upload_mgr last
Similar to u_blitter, u_upload_mgr is now a client of the pipe context. Its
creation needs to be delayed until the context has been (almost) initialized.
Fredrik Höglund [Sat, 15 Feb 2014 17:48:40 +0000 (18:48 +0100)]
glx: Fix the GLXFBConfig attrib sort priorities
The sort priorities for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are
not defined in GL_ARB_multisample, but they are defined in
the GLX 1.4 specification.
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fredrik Höglund [Thu, 13 Feb 2014 20:07:09 +0000 (21:07 +0100)]
glx: Fix the default values for GLXFBConfig attributes
The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are
GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in
the GLX 1.4 specification.
This fixes the glx-choosefbconfig-defaults piglit test.
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tom Stellard [Tue, 25 Feb 2014 21:32:37 +0000 (13:32 -0800)]
Re-commit 'clover: Fix build with LLVM 3.5'
This was accidentally reverted in
9dfd7c5f75c806801b1b4b4d405899236c09ba75
Vinson Lee [Tue, 25 Feb 2014 20:18:41 +0000 (12:18 -0800)]
mesa: Add GL_ARB_buffer_storage to dispatch_sanity.cpp.
Fixes 'make check' failure introduced with commit
119ffa7307d62e7310ce3902fded662ee4021c92.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75503
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Timothy Arceri [Tue, 25 Feb 2014 21:46:08 +0000 (08:46 +1100)]
Revert "Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa"
This reverts commit
1b79582f322d4a89dd6d197c8d4962c788ae7f25, reversing
changes made to
376a98d345dfc3da8d5b0f1e489196f861c4e754.
Timothy Arceri [Tue, 25 Feb 2014 21:39:32 +0000 (08:39 +1100)]
Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa
Tom Stellard [Tue, 25 Feb 2014 21:32:37 +0000 (13:32 -0800)]
clover: Fix build with LLVM 3.5
Timothy Arceri [Tue, 25 Feb 2014 21:31:25 +0000 (08:31 +1100)]
glsl: removed unused dimension_count variable
This variable is no longer needed after the cleanup to the
code prior to the first arrays of array series
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ilia Mirkin [Mon, 24 Feb 2014 17:43:17 +0000 (12:43 -0500)]
build: llvm libs may not be in system search path, add rpath
On my gentoo system, llvm libs are in /usr/lib64/llvm, and llvm-config
--ldflags does not provide the rpath (it does, of course, provide a -L).
This adds the llvm dir to the rpath. It should be harmless if the path
is a system path, and should make things work when it's not.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Emil Velikov <emil.l.velikov@gmail.com>
Eric Anholt [Tue, 25 Feb 2014 19:35:49 +0000 (11:35 -0800)]
i965: Fix segfaults since the buffer_storage changes.
Ilia Mirkin [Tue, 25 Feb 2014 19:41:15 +0000 (14:41 -0500)]
docs: update nv50 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 20 Feb 2014 02:29:40 +0000 (21:29 -0500)]
nv50: enable txg where supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 7 Feb 2014 22:42:58 +0000 (17:42 -0500)]
nv50: enable cube map array texture support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Brian Paul [Tue, 25 Feb 2014 16:56:49 +0000 (09:56 -0700)]
libgl-xlib: add -Isrc/gallium/winsys flag
So that sw/xlib/xlib_sw_winsys.h can be found. Fixes a build break.
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Brian Paul [Tue, 25 Feb 2014 16:53:49 +0000 (09:53 -0700)]
st/mesa: add comment to explain _min(), _maxf(), etc. functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Sun, 9 Feb 2014 22:25:06 +0000 (23:25 +0100)]
r600g,radeonsi: consolidate create_surface and surface_destroy
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 11 Feb 2014 02:02:35 +0000 (03:02 +0100)]
radeonsi: inline util_blitter_copy_texture
This will be used for changing texture properties without modifying
pipe_resource like r600g, but not in this series. For now, this change
allows consolidation of pipe_surface functions.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 11 Feb 2014 00:50:03 +0000 (01:50 +0100)]
radeonsi: remove useless psbox variable from resource_copy_region
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 22:05:13 +0000 (23:05 +0100)]
radeonsi: compute depth surface registers only once
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 21:16:48 +0000 (22:16 +0100)]
radeonsi: compute color surface registers only once
Same as r600g.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 18:34:59 +0000 (19:34 +0100)]
r600g: remove r600_resource.h
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 18:30:09 +0000 (19:30 +0100)]
r600g: remove r600_surface::htile_enabled
v2: use one of the htile registers instead
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 18:25:45 +0000 (19:25 +0100)]
r600g: use r600_surface::db_z_info
db_z_info was unused. This just renames the variable to match the register
name.
Now, db_depth_info is unused on Evergreen.
Both variables will be needed on SI though.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 18:23:58 +0000 (19:23 +0100)]
r600g,radeonsi: share r600_surface
I'm gonna use this in radeonsi.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 9 Feb 2014 16:42:00 +0000 (17:42 +0100)]
radeonsi: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to framebuffer state
It doesn't depend on anything else.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Thu, 6 Feb 2014 18:24:23 +0000 (19:24 +0100)]
mesa: allow buffers to be mapped multiple times
OpenGL allows a buffer to be mapped only once, but we also map buffers
internally, e.g. in the software primitive restart fallback, for PBOs,
vbo_get_minmax_index, etc. This has always been a problem, but it will
be a bigger problem with persistent buffer mappings, which will prevent
all Mesa functions from mapping buffers for internal purposes.
This adds a driver interface to core Mesa which supports multiple buffer
mappings and allows 2 mappings: one for the GL user and one for Mesa.
Note that Gallium supports an unlimited number of buffer and texture
mappings, so it's not really an issue for Gallium.
v2: fix unmapping in xm_dd.c, remove the GL errors there
v3: fix the intel driver (by Fredrik)
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
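One way to picture the two-mapping scheme is a small array of mapping slots per buffer, indexed by who is doing the mapping. All names below are hypothetical and the buffer is plain system memory; this is a sketch of the idea, not the driver interface the commit adds:

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_map_index {
   SKETCH_MAP_USER,      /* mapping requested through the GL API */
   SKETCH_MAP_INTERNAL,  /* mapping made by Mesa for its own purposes */
   SKETCH_MAP_COUNT
};

struct sketch_buffer {
   unsigned char *data;
   size_t size;
   void *mappings[SKETCH_MAP_COUNT];  /* NULL while a slot is unmapped */
};

/* Each slot can be mapped at most once, independently of the other. */
static void *sketch_map(struct sketch_buffer *buf, enum sketch_map_index idx)
{
   if (buf->mappings[idx] != NULL)
      return NULL;
   buf->mappings[idx] = buf->data;
   return buf->mappings[idx];
}

static bool sketch_unmap(struct sketch_buffer *buf, enum sketch_map_index idx)
{
   if (buf->mappings[idx] == NULL)
      return false;
   buf->mappings[idx] = NULL;
   return true;
}
```

With separate slots, an internal map and unmap leaves a long-lived user mapping untouched, which is the property persistent mappings need.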
Marek Olšák [Wed, 29 Jan 2014 02:20:32 +0000 (03:20 +0100)]
docs: update ARB_buffer_storage status
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 23:53:28 +0000 (00:53 +0100)]
gallium/upload_mgr: remove useless variable "size"
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 21:28:56 +0000 (22:28 +0100)]
gallium/upload_mgr: don't unmap buffers if persistent mappings are supported
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 20:57:42 +0000 (21:57 +0100)]
gallium: the other drivers don't support ARB_buffer_storage
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 20:46:21 +0000 (21:46 +0100)]
r300g,r600g,radeonsi: add support for ARB_buffer_storage
All GTT memory mappings are coherent and therefore can be persistent.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 20:45:54 +0000 (21:45 +0100)]
st/mesa: implement ARB_buffer_storage
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 20:42:07 +0000 (21:42 +0100)]
gallium: add interface for persistent and coherent buffer mappings
Required for ARB_buffer_storage.
Marek Olšák [Mon, 27 Jan 2014 20:36:53 +0000 (21:36 +0100)]
mesa: allow buffers mapped with the persistent flag to be used by the GPU
v2: also fixed InvalidateBufferData, added citations from the 4.4 spec
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 20:31:58 +0000 (21:31 +0100)]
mesa: add error checks to glMapBufferRange, glMapBuffer for ARB_buffer_storage
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 11:56:11 +0000 (12:56 +0100)]
glapi: add ARB_buffer_storage
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 20:22:43 +0000 (21:22 +0100)]
mesa: implement glBufferStorage, immutable buffers; add extension enable flag
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
v2: dropped the error that DYNAMIC_STORAGE is required for MAP_WRITE_BIT,
the error is removed in the latest revision of GL 4.4
Marek Olšák [Mon, 27 Jan 2014 20:15:19 +0000 (21:15 +0100)]
mesa: add storage flags parameter to Driver.BufferData
It will be used by glBufferStorage. The parameters are chosen according
to ARB_buffer_storage.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Marek Olšák [Mon, 27 Jan 2014 11:57:28 +0000 (12:57 +0100)]
mesa: remove unused driver hook BindBuffer
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Emil Velikov [Mon, 24 Feb 2014 16:46:19 +0000 (16:46 +0000)]
nv50: correctly calculate the number of vertical blocks during transfer map
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Dave Airlie [Fri, 7 Feb 2014 01:37:31 +0000 (11:37 +1000)]
st/mesa: add texture gather support. (v2)
This adds support for GL_ARB_texture_gather, and one step of
support for GL_ARB_gpu_shader5.
This adds support for passing the TG4 instruction, along
with non-constant texture offsets, and tracking them for the
optimisation passes.
This doesn't support native textureGatherOffsets hw, to do that
you'd need to add a CAP and if set disable the lowering pass,
and bump the MAX offsets to 4, then do the i0,j0 sampling using
those.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 21 Sep 2013 08:45:43 +0000 (18:45 +1000)]
gallium: add texture gather support to gallium (v3)
This adds support to gallium for a TG4 instruction,
and two CAPs. The first CAP is required for GL_ARB_texture_gather.
The second CAP is required to expose GL_ARB_gpu_shader5.
However so far we haven't found any hardware that natively
exposes the textureGatherOffsets feature from GL, so just
lower it for now. If hardware appears for this we can add
another CAP to allow TG4 to take 4 offsets.
v2: add component selection src and a cap to say
hw can do it. (st can use to help control
GL_ARB_gpu_shader5/GLSL 4.00). Add docs.
v3: rename to SM5, add docs.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 10 Feb 2014 23:41:44 +0000 (09:41 +1000)]
glsl/i965: move lower_offset_array up to GLSL compiler level.
This lowering pass will be useful for gallium drivers as well, in order to support
the GL TG4 oddity that is textureGatherOffsets.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Thu, 13 Feb 2014 22:46:25 +0000 (14:46 -0800)]
clover: Pass buffer offsets to the driver in set_global_binding() v3
The offsets will be stored in the handles parameter. This makes
it possible to use sub-buffers.
v2:
- Style fixes
- Add support for constant sub-buffers
- Store handles in device byte order
v3:
- Use endian helpers
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Wed, 19 Feb 2014 22:19:53 +0000 (14:19 -0800)]
radeonsi: Use SI_BIG_ENDIAN now that it exists
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tom Stellard [Thu, 20 Feb 2014 15:51:24 +0000 (07:51 -0800)]
r600g: Use util_cpu_to_le32() instead of bswap32() on big-endian systems
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tom Stellard [Thu, 20 Feb 2014 17:03:53 +0000 (09:03 -0800)]
radeonsi: Use util_cpu_to_le32() instead of bswap32() on big-endian systems
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tom Stellard [Thu, 20 Feb 2014 15:46:28 +0000 (07:46 -0800)]
util: Add util_cpu_to_le* helpers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Wed, 19 Feb 2014 22:17:33 +0000 (14:17 -0800)]
util: Add util_bswap64() v3
v2:
- Use __builtin_bswap64()
- Remove unnecessary mask
- Add util_le64_to_cpu() helper
v3:
- Remove unnecessary AC_SUBST
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
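A portable sketch of a 64-bit byte swap and a CPU-to-little-endian helper built on it. The names are hypothetical, and the loop is a fallback-style implementation; per the v2 note, the real helper can use __builtin_bswap64() where available:

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value, one byte per iteration. */
static uint64_t sketch_bswap64(uint64_t n)
{
   uint64_t out = 0;
   for (int i = 0; i < 8; i++) {
      out = (out << 8) | (n & 0xff);
      n >>= 8;
   }
   return out;
}

/* Runtime endianness probe: the first byte of 1 is 0 on big-endian CPUs. */
static int sketch_cpu_is_big_endian(void)
{
   const uint16_t probe = 1;
   return *(const unsigned char *)&probe == 0;
}

/* Swap only on big-endian hosts; a no-op on little-endian ones. */
static uint64_t sketch_cpu_to_le64(uint64_t n)
{
   return sketch_cpu_is_big_endian() ? sketch_bswap64(n) : n;
}
```

Helpers of this shape let callers state the byte order they want rather than swapping unconditionally, so the same code is correct on both kinds of host.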
Tom Stellard [Thu, 20 Feb 2014 15:31:16 +0000 (07:31 -0800)]
configure.ac: Use AX_GCC_BUILTIN to check availability of __builtin_bswap32 v2
v2:
- Remove unnecessary AC_SUBST
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Sun, 23 Feb 2014 20:29:57 +0000 (20:29 +0000)]
targets/opencl: resolve undefined symbols at link time
Current automake build does not try to resolve undefined
symbols thus we could end up with a broken library.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Emil Velikov [Mon, 24 Feb 2014 14:20:36 +0000 (14:20 +0000)]
gallium/targets: resolve undefined reference to pipe_loader_sw_probe_dri
With the introduction of the pipe_loader_sw_probe_dri helper we
require the sw/dri winsys during linking stage despite it being
unused by any of the targets. This will cause a minor increase
in the resulting library which will be cleaned up via linker
options with upcoming patches.
v2: Link with libswdri.la only when available.
Reported-and-tested-by: Tom Stellard <thomas.stellard@amd.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 22 Feb 2014 16:47:21 +0000 (16:47 +0000)]
configure: correctly report if we're building the sw/xlib winsys
While looking at bug 75356, I've noticed that the presence of
x11 egl platform pulls in sw/xlib as "needed" but fails to
report so at the end of configure.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 22 Feb 2014 16:44:14 +0000 (16:44 +0000)]
pipe-loader: wrap pipe_loader_sw_probe_xlib within HAVE_PIPE_LOADER_XLIB
The above function implies using the xlib winsys, which
has additional library dependencies that should not be forced.
Make the software xlib pipe loader optional thus avoid all
the dependency hell. A user that wishes to use the particular
pipe-loader would need to set the following within configure.ac.
enable_gallium_xlib_loader=yes
v2:
- Wrap sw/xlib/xlib_sw_winsys.h to handle compilation on systems
lacking X11 headers. Spotted by Christian Prochaska.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75356
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 22 Feb 2014 16:20:04 +0000 (16:20 +0000)]
targets/gbm: exit gracefully if pipe_loader_drm_probe_fd is not available
When one builds without gallium_drm_loader, the above function will
not be available, thus we'll segfault in gallium_screen_create due
to memory access violation.
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75335
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Kenneth Graunke [Sat, 22 Feb 2014 03:15:51 +0000 (19:15 -0800)]
i965: Don't try to use the hardware blitter for multisampled miptrees.
The blitter is completely ignorant of MSAA buffer layouts, so any
attempt to use BLT paths with MSAA buffers is likely to break
spectacularly.
In most cases, BLORP handles MSAA blits, so we never hit this bug.
Until recently, it also wasn't worth fixing, since Meta couldn't handle
MSAA either, so there was nothing to fall back to. But now there is.
+143 piglit tests on Broadwell (which doesn't have BLORP support).
Surprisingly, three also start failing. Since non-IMS MSAA buffers
store samples in successive array slices, using the blitter ought to
access sample 0 and ignore the rest, which is apparently good enough for
a few not-very-picky Piglit tests. Presumably the meta replacement code
is still broken.
No Piglit changes on Ivybridge.
v2: Move the early return to the top of the function (suggested by
Paul).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Rob Clark [Sat, 22 Feb 2014 14:46:39 +0000 (09:46 -0500)]
freedreno/a3xx/compiler: half-precision output
Using generic shaders caused a measurable fps drop, which was isolated to
use of full precision (vs half precision) output. This is an attempt to
regain that lost performance by using half precision solid/blit shaders
(when the output format is not float32).
Note: for the built-in shaders, I would not expect them to be register
starved. And in fact it is the solid frag shader that seems to have the
biggest impact. So I suspect you get double the pixel pipe units (or
half the cycles) when the output is half precision. So there may be
some gain to using half precision output for application shaders as
well, even though the rest of register usage is still full precision.
But for half precision to work for more complex shaders, we need to deal
with some constraints, like cat2 needing same precision for its two src
registers. So for now it is not enabled by default except for the
built-in shaders.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 22 Feb 2014 00:17:43 +0000 (19:17 -0500)]
freedreno/a3xx: add shader variants
Start putting in place infrastructure to deal with multiple shader
variants. Initially we'll use this for two sided color (frag) and
binning pass (vert) shaders. Possibly needed for others later (such
as YUV vs RGB eglImage?).
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 21 Feb 2014 23:03:30 +0000 (18:03 -0500)]
freedreno/a3xx/compiler: collapse nop's with repeat
Easier than making more extensive use of rpt, and the more compact
shaders seem to bring some bit of performance boost. (Perhaps repeat
flag benefits are more than just instruction cache, possibly it saves
on instruction decode as well?)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
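The collapsing step can be sketched as a single in-place pass that folds each run of nops into one nop with a repeat count. The instruction struct, the meaning of the repeat field (executes repeat + 1 times) and the max_repeat limit are all assumptions for illustration, not the a3xx encoding:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced instruction record. */
struct sketch_instr {
   bool is_nop;
   unsigned repeat;  /* instruction executes repeat + 1 times */
};

/* Folds runs of plain nops in place and returns the new instruction count.
 * max_repeat is the largest value the assumed repeat field can hold. */
static size_t sketch_collapse_nops(struct sketch_instr *instrs, size_t count,
                                   unsigned max_repeat)
{
   size_t out = 0;
   for (size_t i = 0; i < count; i++) {
      if (instrs[i].is_nop && instrs[i].repeat == 0 && out > 0 &&
          instrs[out - 1].is_nop && instrs[out - 1].repeat < max_repeat) {
         /* Absorb this nop into the previous one. */
         instrs[out - 1].repeat++;
         continue;
      }
      instrs[out++] = instrs[i];
   }
   return out;
}
```

Three consecutive nops become one nop with repeat == 2, so the delay is preserved while the shader gets shorter.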
Rob Clark [Fri, 21 Feb 2014 19:36:11 +0000 (14:36 -0500)]
freedreno/a3xx: drop hand-coded blit/solid shaders
Instead in the common code, construct these shaders from TGSI. For now
we let a2xx keep its hand coded shaders, as its compiler isn't quite
up to the job yet. All the same it is a net drop in code size and gets
rid of special cases.
Signed-off-by: Rob Clark <robclark@freedesktop.org>