Kenneth Graunke [Fri, 2 Oct 2015 23:40:14 +0000 (16:40 -0700)]
i965: Remove shader_prog from vec4_gs_visitor.
Unfortunately it has to stay in gen6_gs_visitor.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 2 Oct 2015 23:45:09 +0000 (16:45 -0700)]
i965: Use nir->has_transform_feedback_varyings to avoid shader_prog.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 2 Oct 2015 22:55:05 +0000 (15:55 -0700)]
nir: Add a nir_shader_info::has_transform_feedback_varyings flag.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 1 Oct 2015 07:46:19 +0000 (00:46 -0700)]
nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.
Geometry and tessellation shaders process multiple vertices; their
inputs are arrays indexed by the vertex number. While GLSL makes
this look like a normal array, it can be very different behind the
scenes.
On Intel hardware, all inputs for a particular vertex are stored
together - as if they were grouped into a single struct. This means
that consecutive elements of these top-level arrays are not contiguous.
In fact, they may sometimes be in completely disjoint memory segments.
NIR's existing load_input intrinsics are awkward for this case, as they
distill everything down to a single offset. We'd much rather keep the
vertex ID separate, but build up an offset as normal beyond that.
This patch introduces new nir_intrinsic_load_per_vertex_input
intrinsics to handle this case. They work like ordinary load_input
intrinsics, but have an extra source (src[0]) which represents the
outermost array index.
v2: Rebase on earlier refactors.
v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Kenneth Graunke [Thu, 1 Oct 2015 07:36:25 +0000 (00:36 -0700)]
nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects.
get_io_offset() already walks the dereference chain and discovers
whether or not we have an indirect; we can just return that rather than
computing it a second time via deref_has_indirect(). This means moving
the call a bit earlier.
By returning a nir_ssa_def *, we can pass back both an existence flag
(via NULL checking the pointer) and the value in one parameter. It
also simplifies the code somewhat. nir_lower_samplers works in a
similar fashion.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Timothy Arceri [Sun, 4 Oct 2015 06:42:41 +0000 (17:42 +1100)]
glsl: fix whitespace
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Marek Olšák [Mon, 28 Sep 2015 21:50:12 +0000 (23:50 +0200)]
radeonsi: enable PIPE_CAP_FORCE_PERSAMPLE_INTERP
Now st/mesa won't generate 2 variants for this state.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 28 Sep 2015 21:46:04 +0000 (23:46 +0200)]
radeonsi: do force_persample_interp in shaders for non-trivial cases
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 28 Sep 2015 15:21:10 +0000 (17:21 +0200)]
radeonsi: implement the simple case of force_persample_interp
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 28 Sep 2015 15:01:21 +0000 (17:01 +0200)]
radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state
This will be a derived state used for changing center->sample and
centroid->sample at runtime.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 28 Sep 2015 19:44:54 +0000 (21:44 +0200)]
tgsi/scan: add interpolation info into tgsi_shader_info
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 27 Sep 2015 17:54:57 +0000 (19:54 +0200)]
st/mesa: automatically set per-sample interpolation if using SampleID/Pos
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 27 Sep 2015 17:43:00 +0000 (19:43 +0200)]
st/mesa: set force_persample_interp if ARB_sample_shading is used
This is only a half of the work. The next patch will handle
gl_SampleID/SamplePos, which is the other half of ARB_sample_shading.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 27 Sep 2015 17:32:07 +0000 (19:32 +0200)]
gallium: add per-sample interpolation control into rasterizer statOAe
Required by ARB_sample_shading for drivers that don't want a shader variant
in st/mesa.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:08:46 +0000 (21:08 +0200)]
st/mesa: add ST_DEBUG=precompile support for tessellation shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.BindImageTexture
Nothing sets it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.DeleteSamplerObject
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.EndCallList
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.BeginCallList
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.EndList
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.NewList
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.NotifySaveBegin
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.SaveFlushVertices
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.FlushVertices
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.BeginVertices
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.BindArrayObject
Nothing sets it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.DeleteArrayObject
Nothing reimplements it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.NewArrayObject
Nothing reimplements it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.Hint
Nothing sets it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.ColorMaskIndexed
Nothing sets it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove some Driver.Blend* hooks
Nothing sets them.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.Accum
Nothing calls it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.ResizeBuffers
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.DeleteShaderProgram
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.NewShaderProgram
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Sun, 27 Sep 2015 19:28:22 +0000 (21:28 +0200)]
mesa: remove Driver.DeleteShader
Nothing overrides it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Fri, 25 Sep 2015 20:48:00 +0000 (22:48 +0200)]
egl/dri2: don't require a context for ClientWaitSync (v2)
The spec doesn't require it. This fixes a crash on Android.
v2: don't set any flags if ctx == NULL
v3: add the spec note
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Marek Olšák [Fri, 25 Sep 2015 20:44:41 +0000 (22:44 +0200)]
st/dri: don't use _ctx in client_wait_sync
Not needed and it can be NULL.
v2: fix dri2_get_fence_from_cl_event - thanks Albert
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>
Marek Olšák [Sun, 6 Sep 2015 15:37:38 +0000 (17:37 +0200)]
r600g: only do depth-only or stencil-only in-place decompression
instead of always doing both.
Usually, only depth is needed, so stencil decompression is useless.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 6 Sep 2015 15:37:38 +0000 (17:37 +0200)]
radeonsi: only do depth-only or stencil-only in-place decompression
instead of always doing both.
Usually, only depth is needed, so stencil decompression is useless.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 6 Sep 2015 15:35:06 +0000 (17:35 +0200)]
gallium/radeon: add separate stencil level dirty flags
We will only do depth-only or stencil-only decompress blits, whichever is
needed by textures, instead of always doing both.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 23:38:48 +0000 (01:38 +0200)]
radeonsi: dump buffer lists while debugging
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 22:58:15 +0000 (00:58 +0200)]
winsys/radeon: implement cs_get_buffer_list
This is more complicated, because tracking priority_usage needed changing
the relocs_bo type.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 22:52:32 +0000 (00:52 +0200)]
winsys/amdgpu: add winsys function cs_get_buffer_list
For debugging.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 22:10:00 +0000 (00:10 +0200)]
gallium/radeon: stop using "reloc" in a few places
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 21:18:55 +0000 (23:18 +0200)]
gallium/radeon: tell the winsys the exact resource binding types
Use the priority flags and expand them.
This information will be used for debugging.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 01:15:40 +0000 (03:15 +0200)]
radeonsi: add an option for debugging VM faults
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 01:14:43 +0000 (03:14 +0200)]
radeonsi: move dumping the last IB into its own function
v2: indentation fix
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sat, 26 Sep 2015 01:13:11 +0000 (03:13 +0200)]
ddebug: separate creation of debug files
This will be used by radeonsi for logging.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Emil Velikov [Sat, 3 Oct 2015 12:23:13 +0000 (13:23 +0100)]
docs: add news item and link release notes for 10.6.9
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 3 Oct 2015 12:16:18 +0000 (13:16 +0100)]
docs: add sha256 checksums for 10.6.9
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit
8957b696f9cc8a92b2c160c551c34545447ec28a)
Emil Velikov [Sat, 3 Oct 2015 11:37:15 +0000 (12:37 +0100)]
docs: add release notes for 10.6.9
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit
ab9aacce2d26a802bac81fc25748320428996692)
Matthew Waters [Mon, 14 Sep 2015 17:35:45 +0000 (18:35 +0100)]
egl: rework handling EGL_CONTEXT_FLAGS
As of version 15 of the EGL_KHR_create_context spec, debug contexts
are allowed for ES contexts. We should allow creation instead of
erroring.
While we're here provide a more comprehensive checking for the other two
flags - ROBUST_ACCESS_BIT_KHR and FORWARD_COMPATIBLE_BIT_KHR
v2 [Emil Velikov] Rebase. Minor tweak in commit message.
Cc: Boyan Ding <boyan.j.ding@gmail.com>
Cc: Chad Versace <chad.versace@intel.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91044
Signed-off-by: Matthew Waters <ystreet00@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Jason Ekstrand [Sat, 3 Oct 2015 01:31:17 +0000 (18:31 -0700)]
i965/wm: Make compute_barycentric_interp_modes take a nir_shader and a devinfo
Now that everything comes in through NIR, we can pick this directly out of
the shader source and don't need to reference the gl_fragment_program.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 3 Oct 2015 01:16:10 +0000 (18:16 -0700)]
i965: Use nir_foreach_variable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 3 Oct 2015 01:15:06 +0000 (18:15 -0700)]
nir: Add a nir_foreach_variable macro
This is a common enough operation that it's nice to not have to think about
the arguments to foreach_list_typed every time.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 2 Oct 2015 23:39:51 +0000 (16:39 -0700)]
i965/nir: Remove the prog parameter from brw_nir_lower_inputs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tom Stellard [Thu, 24 Sep 2015 16:29:56 +0000 (16:29 +0000)]
radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
This fixes a race condition in the glx-multithreaded-shader-compile
test.
v2:
- Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets().
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Thu, 24 Sep 2015 15:57:02 +0000 (15:57 +0000)]
gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2
Drivers and state trackers that use LLVM for generating code, must
register the targets they use with LLVM's global TargetRegistry.
The TargetRegistry is not thread-safe, so all targets must be added
to the registry before it can be queried for target information.
When drivers and state trackers initialize their own targets, they need
a way to force gallivm to initialize its targets at the same time.
Otherwise, there can be a race condition in some multi-threaded
applications (e.g. glx-multihreaded-shader-compile in piglit),
when one thread creates a context for a driver that uses LLVM (e.g.
radeonsi) and another thread creates a gallivm context (glxContextCreate
does this).
The race happens when the driver thread initializes its LLVM targets and
then starts using the registry before the gallivm thread has a chance to
register its targets.
This patch allows users to force gallivm to register its targets by
calling the gallivm_init_llvm_targets() function.
v2:
- Use call_once and remove mutexes and static initializations.
- Replace gallivm_init_llvm_{begin,end}() with
gallivm_init_llvm_targets().
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Wed, 30 Sep 2015 15:00:39 +0000 (15:00 +0000)]
gallium/radeon: Use call_once() when initailizing LLVM targets
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Thu, 1 Oct 2015 22:21:57 +0000 (15:21 -0700)]
i965/shader: Get rid of the shader, prog, and shader_prog fields
Unfortunately, we can't get rid of them entirely. The FS backend still
needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still
needs gl_shader_program for handling transfom feedback. However, the VS
needs neither and we can substantially reduce the amount they are used.
One day we will be free from their tyranny.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 2 Oct 2015 01:15:21 +0000 (18:15 -0700)]
i965/fs,vec4: Get rid of the sanity_param_count
It doesn't exist for anything other than an assert that, as far as I can
tell, isn't possible to trip. Soon, we will remove prog from the visitor
entirely and this will become even more impossible to hit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 22:22:23 +0000 (15:22 -0700)]
i965/vec4: Use nir info instead of pulling things out of [shader_]prog
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 22:12:59 +0000 (15:12 -0700)]
i965/fs: Use the nir info instead of pulling things out of [shader_]prog
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 23:38:59 +0000 (16:38 -0700)]
i965/fs: Move sampler unit lookup into rescale_texcoord
The texunit variable we create and assign in nir_emit_texture gets passed
through two more layers of function calls before it gets to its sole use in
rescale_texcoord. The best part is that we already pass the sampler into
rescale_texcoord so we can just look it up there.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 20:52:21 +0000 (13:52 -0700)]
i965/cs: Remove the prog argument from local_id_payload_dwords
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 19:23:53 +0000 (12:23 -0700)]
i965/backend_shader: Add a field to store the NIR shader
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 2 Oct 2015 01:27:38 +0000 (18:27 -0700)]
nir: Move GS data to nir_shader_info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 6 Aug 2015 00:14:59 +0000 (17:14 -0700)]
nir: Add a a nir_shader_info struct
This commit also adds code to glsl_to_nir and prog_to_nir to fill it out.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 5 Aug 2015 23:39:32 +0000 (16:39 -0700)]
nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 00:01:03 +0000 (17:01 -0700)]
i965: Move prog_data uniform setup to the codegen level
As of now, uniform setup is more-or-less unified between vec4 and fs and no
longer requires the fs_visitor. This makes uniform setup more of a
language/API thing than a backend compiler thing. This commit moves
setting up the stage_prog_data.params arrays to the same place as we set up
the rest of stage_prog_data.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 15:55:20 +0000 (08:55 -0700)]
i965: Move binding table setup to codegen time.
Setting up binding tables really has little to do with the actual process
of turning shaders into instructions; it's more part of setting up
prog_data. This commit moves it out of the visitors and with the rest of
the prog_data setup stuff.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 15:30:56 +0000 (08:30 -0700)]
i965/shader: Pull assign_common_binding_table_offsets out of backend_shader
This really has nothing to do with the backend compiler and we'd like to
eventually be able to set this up earlier in the compile process.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Oct 2015 00:37:36 +0000 (17:37 -0700)]
i965/nir: Simplify uniform setup
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 2 Oct 2015 17:45:53 +0000 (10:45 -0700)]
i965/nir: Pull GLSL uniform handling into a common function
The way we deal with GLSL uniforms and builtins is basically the same in
both the vec4 and the fs backend. This commit takes the best parts of both
implementations and pulls the common code into a shared helper function.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Sep 2015 21:07:20 +0000 (14:07 -0700)]
i965/nir: Pull common ARB program uniform handling into a common function
The way we deal with ARB program uniforms is basically the same in both the
vec4 and the fs backend. This commit takes the best parts of both
implementations and pulls the common code into a shared helper function.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 20:11:23 +0000 (13:11 -0700)]
i965/vec4: Use the uniform count from nir_assign_var_locations
Previously, we were counting up uniforms as we set them up. However, this
count should be exactly identical to shader->num_uniforms provided by
nir_assign_var_locations. (If it's not, we're in trouble anyway because
that means that locations don't match up.) This matches what the fs
backend is already doing.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 18:47:01 +0000 (11:47 -0700)]
i965/shader: Get rid of the setup_vec4_uniform_value helper
It's not used by anything anymore
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 18:46:36 +0000 (11:46 -0700)]
i965/shader: Pull setup_image_uniform_values out of backend_shader
I tried to do this once before but Curro pointed out that having it in
backend_shader meant it could use the setup_vec4_uniform_values helper
which did different things in vec4 and fs. Now the setup_uniform_values
function differs only by an assert in the two backends so there's no real
good reason to be using it anymore.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 18:02:14 +0000 (11:02 -0700)]
i965/vec4: Get rid of the uniform_vector_size array
The uniform_vector_size array was only ever used by pack_uniform_registers
which no longer needs it.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 17:31:03 +0000 (10:31 -0700)]
i965/vec4: Use the actual channels used in pack_uniform_registers
Previously, pack_uniform_registers worked based on the size of the uniform
as given to us when we initially set up the uniforms. However, we have to
walk through the uniforms and figure out liveness anyway, so we migh as
well record the number of channels used as we go. This may also allow us
to pack things tighter in a few cases.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 22:19:49 +0000 (15:19 -0700)]
glsl/types: Make subroutine types have a single matrix column
That way, if we do the usual thing of multiplying vector_elements by
matrix_columns we get the actual number of components in the type as per
component_slots().
While we're at it, we also switch to using the actual C++ field
initializers for vector_elements and matrix_columns.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 23:09:01 +0000 (16:09 -0700)]
i965: Pull stage_prog_data.nr_params out of the NIR shader
Previously, we had a bunch of code in each stage to figure out how many
slots we needed in stage_prog_data.param. This code was mostly identical
across the stages and had been copied and pasted around. Unfortunately,
this meant that any time you did something special, you had to add code for
it to each of these places. In particular, none of the stages took
subroutines into account; they were working entirely by accident. By
taking this data from the NIR shader, we know the exact number of entries
we need and everything goes a bit smoother.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Sep 2015 23:06:43 +0000 (16:06 -0700)]
i965/vs: Move lazy NIR creation to codegen_vs_prog
The next commit will add code to codegen_vs_prog that requires the NIR
shader to be there in all cases. It doesn't hurt anything to just move it
from brw_vs_emit to its only caller.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Mon, 21 Sep 2015 18:07:32 +0000 (11:07 -0700)]
i965/vec4: Delete the old vec4_vp code
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Mon, 21 Sep 2015 18:03:29 +0000 (11:03 -0700)]
i965/vec4: Delete the old ir_visitor code
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Mon, 21 Sep 2015 17:42:19 +0000 (10:42 -0700)]
i965/vec4: Always use NIR
GLSL IR vs. NIR shader-db results for vec4 programs on i965:
total instructions in shared programs: 1499328 -> 1388354 (-7.40%)
instructions in affected programs: 1245199 -> 1134225 (-8.91%)
helped: 7469
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on G4x:
total instructions in shared programs: 1436799 -> 1325825 (-7.72%)
instructions in affected programs: 1205599 -> 1094625 (-9.20%)
helped: 7469
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake:
total instructions in shared programs: 1436654 -> 1325682 (-7.72%)
instructions in affected programs: 1205503 -> 1094531 (-9.21%)
helped: 7468
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge:
total instructions in shared programs: 2016249 -> 1787033 (-11.37%)
instructions in affected programs: 1850547 -> 1621331 (-12.39%)
helped: 14856
HURT: 1481
GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
GLSL IR vs. NIR shader-db results for vec4 programs on Haswell:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
I also ran our full suite of benchmarks on a Haswell and had the following
statistically significant (according to ministat) changes:
Test master-glsl master-nir diff
bench_OglGeomPoint 461.556 463.006 1.450
bench_OglTerrainFlyInst 184.484 187.574 3.090
bench_OglTerrainPanInst 132.412 136.307 3.895
bench_OglTexFilterAniso 19.653 19.645 -0.008
bench_OglTexFilterTri 58.333 58.009 -0.324
bench_OglVSInstancing 65.049 65.327 0.278
bench_trexoff 69.474 69.694 0.220
bench_valley 40.708 41.125 0.417
v2 (Jason Ekstrand):
- Remove more uses of NirOptions as a switch
- New shader-db numbers
- Added benchmark numbers
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ilia Mirkin [Fri, 2 Oct 2015 00:27:06 +0000 (20:27 -0400)]
i965: don't forget to free image_param on prog_data free
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Ilia Mirkin [Fri, 2 Oct 2015 00:21:47 +0000 (20:21 -0400)]
glsl: avoid leaking hiddenUniforms map when there are no uniforms
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Ilia Mirkin [Fri, 2 Oct 2015 00:18:19 +0000 (20:18 -0400)]
mesa: avoid leaking closure when iterating over a string_to_uint_map
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Chris Wilson [Fri, 2 Oct 2015 17:39:39 +0000 (10:39 -0700)]
nir: Fix uninitialized 'progress' variable in nir_lower_system_values.
Commit
0a1adaf11d051b71b4c46aabee2e5342f2d6aef3 (nir: Report progress
from nir_lower_system_values().) introduced a bug caught by Valgrind:
==823== Conditional jump or move depends on uninitialised value(s)
==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68)
==823== by 0xB079FB8: foreach_cf_node (nir.c:1310)
==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336)
==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79)
...
==823== Uninitialised value was created by a stack allocation
==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76)
which is trivially fixed by initializing progress.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Connor Abbott [Thu, 21 May 2015 23:26:01 +0000 (19:26 -0400)]
nir/remove_phis: handle trivial back-edges
Some loops may have phi nodes that look like:
foo = ...
loop {
bar = phi(foo, bar)
...
}
in which case we can remove the phi node and replace all uses of 'bar'
with 'foo'. In particular, there are some L4D2 vertex shaders with loops
that, after optimization, look like:
/* succs: block_1 */
loop {
block block_1:
/* preds: block_0 block_4 */
vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994
vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321
vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324
vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327
vec1 ssa_8139 = intrinsic load_uniform () () (232)
vec1 ssa_588 = ige ssa_2195, ssa_8139
/* succs: block_2 block_3 */
if ssa_588 {
block block_2:
/* preds: block_1 */
break
/* succs: block_5 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
vec1 ssa_994 = iadd ssa_2195, ssa_2150
/* succs: block_1 */
}
where after removing the second, third, and fourth phi nodes, the loop becomes
entirely dead, and this patch will cause the loop to be deleted entirely.
No piglit regressions.
Shader-db results on bdw:
instructions in affected programs: 5824 -> 5664 (-2.75%)
total loops in shared programs: 2234 -> 2202 (-1.43%)
helped: 32
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Kyle Brenneman [Mon, 28 Sep 2015 18:12:05 +0000 (12:12 -0600)]
glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)
Add a macro GL_LIB_NAME to hold the filename that configure comes up with
based on the --with-gl-lib-name and --enable-mangling options.
In driOpenDriver, use the GL_LIB_NAME macro instead of hard-coding
"libGL.so.1".
v2: Add an #ifndef/#define for GL_LIB_NAME so that non-autoconf builds will
work.
v3: Fix the library filename in the Makefile.
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Kyle Brenneman [Mon, 28 Sep 2015 17:59:22 +0000 (11:59 -0600)]
mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.
When USE_MGL_NAMESPACE is defined, _glapi_get_stub will check for the "m"
prefix before trying to skip it, so that "glFoo" and "mglFoo" are
equivalent.
This should let it work with all the places where something calls
_glapi_get_proc_offset with a hard-coded name that starts with the normal
"gl" prefix.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Kyle Brenneman [Mon, 28 Sep 2015 17:59:21 +0000 (11:59 -0600)]
glx: Fix build errors with --enable-mangling (v2)
Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
the renames from glx_mangle.h.
Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
defined.
v2: Add a comment clarifying why GLX_ALIAS needs two macros.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tapani Pälli [Fri, 2 Oct 2015 07:08:55 +0000 (10:08 +0300)]
glsl: validate binding qualifier on block members
Fixes following Piglit test:
member-invalid-binding-qualifier.frag
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Samuel Iglesias Gonsalvez [Wed, 30 Sep 2015 13:03:15 +0000 (15:03 +0200)]
glsl: emit row_major matrix's SSBO stores only for components in writemask
When writing to a column of a row-major matrix, each component of the
vector is stored to non-consecutive memory addresses, so we generate
one instruction per component.
This patch skips the disabled components in the writemask, saving some
store instructions plus avoid storing wrong data on each disabled
component.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tapani Pälli [Thu, 1 Oct 2015 11:54:33 +0000 (14:54 +0300)]
glsl: error out if non-constant indexing of SSBO arrays with GLSL ES
Fixes a failing subtest in:
ES31-CTS.shader_storage_buffer_object.negative-glsl-compileTime
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Daniel Scharrer [Thu, 1 Oct 2015 12:36:31 +0000 (14:36 +0200)]
mesa: Add abs input modifier to base for POW in ffvertex_prog
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer <daniel@constexpr.org>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Kenneth Graunke [Tue, 25 Aug 2015 23:09:46 +0000 (16:09 -0700)]
i965/fs: Print reg and reg_offset separately for ATTR files.
Reading this output was really confusing. reg represents attribute
slots; reg_offset is the x/y/z/w component (0..3) within a vec4 slot.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>