Emil Velikov [Thu, 19 Nov 2015 15:36:03 +0000 (15:36 +0000)]
automake: loader: don't create an empty dri3 helper
Seems that creating an empty one does not fair too well with MacOSX's
ar. Considering that all the users of the helper include it only when
needed, let's reshuffle the makefile.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92985
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 19 Nov 2015 15:34:20 +0000 (15:34 +0000)]
automake: loader: honour the XCB_DRI3 cflags
Without this the compilation will fail, as the headers are installed in
a non-default location.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 19 Nov 2015 15:50:50 +0000 (15:50 +0000)]
automake: egl: add symbols test
Should help us catch issues where we expose any extra symbols by
mistake. Just like the ones fixes with previous commit.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 19 Nov 2015 15:31:06 +0000 (15:31 +0000)]
automake: loader: rework the CPPFLAGS
Rather than duplicating things, just use the generic AM_CPPFLAGS. This
has the fortunate side-effect of adding VISIBILITY_CFLAGS for the dri3
helper. The latter of which was erroneously exposing some internal
symbols.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 18 Nov 2015 01:57:08 +0000 (17:57 -0800)]
i965: Enable EXT_shader_samples_identical
On the vec4 backend, textureSamplesIdentical() will always return
false. There are currently no test cases for the vec4 backend, so we
don't have much confidence in any implementation. We also don't think
anyone is likely to miss it.
v2: Handle immediate value for MCS smarter. Rebase on changes to
nir_texop_sampels_identical (missing second parameter). Suggested by
Jason.
v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of
f9a9ba5e. Stub out the vec4 implementation.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2]
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]
Ian Romanick [Wed, 18 Nov 2015 03:31:39 +0000 (19:31 -0800)]
i965/vec4: Handle nir_tex_src_ms_index more like the scalar
v2: Rebase on top of
f9a9ba5e.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Wed, 18 Nov 2015 01:09:09 +0000 (17:09 -0800)]
nir: Add nir_texop_samples_identical opcode
This is the NIR analog to GLSL IR ir_samples_identical.
v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by
Ken and Jason.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Wed, 18 Nov 2015 00:59:40 +0000 (16:59 -0800)]
glsl: Add textureSamplesIdenticalEXT built-in functions
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Wed, 18 Nov 2015 00:54:31 +0000 (16:54 -0800)]
glsl: Add ir_samples_identical opcode
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Tue, 17 Nov 2015 23:36:15 +0000 (15:36 -0800)]
glsl: Extension tracking for EXT_shader_samples_indentical
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Tue, 17 Nov 2015 23:32:10 +0000 (15:32 -0800)]
mesa: Extension tracking for EXT_shader_samples_indentical
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ian Romanick [Tue, 17 Nov 2015 23:26:27 +0000 (15:26 -0800)]
Import current draft of EXT_shader_samples_identical spec
v2: Add Neil to the list of contributors. I meant to do that before,
but Matt reminded me.
v3: Fix typos noticed by Nicolai.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Rob Clark [Thu, 5 Nov 2015 15:23:48 +0000 (10:23 -0500)]
nir: add nir_ssa_for_alu_src()
Using something like:
numer = nir_ssa_for_src(bld, alu->src[0].src,
nir_ssa_alu_instr_src_components(alu, 0));
for alu src's with swizzle, like:
vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
vec2 ssa_2 = udiv ssa_10.xx, ssa_11
ends up turning into something like:
vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
vec2 ssa_13 = imov ssa_10
...
because nir_ssa_for_src() ignore's the original nir_alu_src's swizzle.
Instead for alu instructions, nir_src_for_alu_src() should be used to
ensure the original alu src's swizzle doesn't get lost in translation:
vec1 ssa_10 = intrinsic load_uniform () () (0, 0)
vec2 ssa_11 = intrinsic load_uniform () () (1, 0)
vec2 ssa_13 = imov ssa_10.xx
...
v2: check for abs/neg, and re-use existing nir_alu_src
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Rob Clark [Wed, 4 Nov 2015 21:10:52 +0000 (16:10 -0500)]
nir: fix missing increments of num_inputs/num_outputs
Note: not quite perfect, we should use type_size vfunc (in
compiler_options or nir_shader?) to determine how much we
increment num_inputs/outputs/uniforms. But we don't have
that yet, so let's at least fix things for the existing
users of these passes.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Rob Clark [Wed, 4 Nov 2015 15:05:32 +0000 (10:05 -0500)]
nir/print: show # of uniforms/inputs/outputs
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 26 Oct 2015 17:29:45 +0000 (13:29 -0400)]
nir/print: show shader name/label if set
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Rob Clark [Mon, 19 Oct 2015 15:57:51 +0000 (11:57 -0400)]
nir: add nir_var_all enum
Otherwise, passing -1 gets you:
error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive]
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Ilia Mirkin [Thu, 19 Nov 2015 06:37:14 +0000 (01:37 -0500)]
freedreno/a4xx: fix 5_5_5_1 texture sampler format
This fixes teximage-colors, fbo-generatemipmap-formats, and probably
others (in relation to the RGB5 formats, others still fail).
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Thu, 19 Nov 2015 05:32:39 +0000 (00:32 -0500)]
freedreno/a4xx: add depth clamp and halfz clip
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 19 Nov 2015 05:06:46 +0000 (00:06 -0500)]
freedreno/a4xx: allow seamless cubemap filtering to be enabled per-texture
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Thu, 19 Nov 2015 04:54:25 +0000 (23:54 -0500)]
freedreno/a4xx: support lod_bias
The lower layers assume that we support this, and it's been core since
GL 1.4. This fixes a slew of piglit tests, especially around
tex-miplevel-selection.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Samuel Pitoiset [Thu, 19 Nov 2015 08:51:03 +0000 (09:51 +0100)]
nv50: allow using inline vertex data submit when gl_VertexID is used
The hardware can actually generates vertexid when vertices come from
a client-side buffer like when glDrawElements is used.
This doesn't fix (or break) any piglit tests but it improves the
previous attempt of Ilia (c830d19 "nv50: avoid using inline vertex
data submit when gl_VertexID is used")
The only disadvantage is that only works on G84+, but we don't really
care of that weird and old NV50 chipset.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Thu, 19 Nov 2015 08:51:02 +0000 (09:51 +0100)]
nv50: add NV84_3D macro
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Matt Turner [Mon, 2 Nov 2015 20:25:24 +0000 (12:25 -0800)]
i965: Drop IMM fs_reg/src_reg -> brw_reg conversions.
The previous two commits make this unnecessary.
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 20:12:44 +0000 (12:12 -0800)]
i965/vec4: Replace src_reg(imm) constructors with brw_imm_*().
Cuts 1.5k of .text.
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 19:28:35 +0000 (11:28 -0800)]
i965/fs: Use brw_imm_uw().
W/UW immediates are 16-bits, but those 16-bits must be replicated
in the high 16-bits of the 32-bit field.
Remove the useless W/UW immediate saturating code, since we'll now be
using the appropriate immediate (and W/UW immediates in the IR can now
no longer be larger than 16-bits).
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 19:26:16 +0000 (11:26 -0800)]
i965/fs: Replace fs_reg(imm) constructors with brw_imm_*().
Cuts 10k of .text, of which only 776 bytes are the fs_reg constructor
implementations themselves.
text data bss dec hex filename
5204535 214112 27784 5446431 531b1f i965_dri.so before
5193977 214112 27784 5435873 52f1e1 i965_dri.so after
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 2 Nov 2015 18:29:45 +0000 (10:29 -0800)]
i965: Make brw_imm_vf4() take 8-bit restricted floats.
This partially reverts commit
bbf8239f92ecd79431dfa41402e1c85318e7267f.
I didn't like that commit to begin with -- computing things at compile
time is fine -- but for purposes of verifying that the resulting values
are correct, looking up 0x00 and 0x30 in a table is a lot better than
evaluating a recursive function.
Anyway, by making brw_imm_vf4() take the actual 8-bit restricted floats
directly (instead of only integral values that would be converted to
restricted float), we can use this function as a replacement for the
vector float src_reg/fs_reg constructors.
brw_float_to_vf() is not currently an inline function, so it will not be
evaluated at compile time. I'll address that in a follow-up patch.
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Nanley Chery [Wed, 18 Nov 2015 23:01:44 +0000 (15:01 -0800)]
mesa: Add test for sorted extension table
Enable developers to know if the table's alphabetical sorting
is maintained or lost.
v2: Move "*" next to pointer name (Matt)
Include extensions_table.h instead of extensions.h (Ian)
Remove extra " *" in comment (Ian)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Nanley Chery [Wed, 18 Nov 2015 23:01:43 +0000 (15:01 -0800)]
mesa/extensions: Sort the extension table alphabetically
Make it easier to determine where to add new extensions.
Performed with the vim sort command.
v2: Insert newline after last #define (Matt)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ilia Mirkin [Thu, 19 Nov 2015 17:25:53 +0000 (12:25 -0500)]
docs: GL3.1 for a3xx and a4xx
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:07:08 +0000 (11:07 -0600)]
mesa: enable EXT_blend_func_extended if the driver supports the ARB version
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:05:58 +0000 (11:05 -0600)]
mesa: allow MAX_DUAL_SOURCE_DRAW_BUFFERS to be available to ES
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:05:17 +0000 (11:05 -0600)]
mesa: enable usage of blend_func_extended blend factors in GLES2
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 17:03:44 +0000 (11:03 -0600)]
glsl: add a parse check to check for the index layout qualifier
This can only be used if EXT_blend_func_extended is enabled
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Fri, 6 Nov 2015 02:44:03 +0000 (20:44 -0600)]
glsl: add GL_EXT_blend_func_extended preprocessor define
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 16:59:32 +0000 (10:59 -0600)]
glsl: add support for EXT_blend_func_extended builtins
gl_MaxDualSourceDrawBuffersEXT - Maximum dual-source draw buffers supported
For ESSL 1.0, it provides two builtins since you can't have user-defined
color output variables:
gl_SecondaryFragColorEXT
gl_SecondaryFragDataEXT[MaxDSDrawBuffers]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 16:53:40 +0000 (10:53 -0600)]
glsl: add EXT_blend_func_extended parser enables
This adds a state for the maximum dual source draw variables available
and the variable for determining if the extension has been enabled
in the program shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ryan Houdek [Thu, 5 Nov 2015 16:52:35 +0000 (10:52 -0600)]
glapi: add EXT_blend_func_extended XML definitions
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Brian Paul [Wed, 18 Nov 2015 16:25:48 +0000 (09:25 -0700)]
os: check for GALLIUM_PROCESS_NAME to override os_get_process_name()
Useful for debugging and for glretrace.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Connor Abbott [Fri, 14 Aug 2015 18:58:45 +0000 (11:58 -0700)]
glsl: fix ir_constant::equals() for doubles
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Connor Abbott [Fri, 14 Aug 2015 18:58:07 +0000 (11:58 -0700)]
glsl: fix isinf() for doubles
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Connor Abbott [Tue, 4 Aug 2015 21:04:34 +0000 (14:04 -0700)]
nir: fix constant folding of bfi
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Brian Paul [Thu, 19 Nov 2015 00:08:39 +0000 (17:08 -0700)]
hud: fix Windows build break
Protect signal-related code with PIPE_OS_UNIX test.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Ian Romanick [Wed, 18 Nov 2015 02:35:00 +0000 (18:35 -0800)]
glsl: Fix off-by-one error in array size check assertion
Apparently, this has been a bug since 2010 (
c30f6e5d).
Also use ARRAY_SIZE instead of open coding it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Ian Romanick [Tue, 17 Nov 2015 23:27:59 +0000 (15:27 -0800)]
mesa: Don't expose GL_EXT_shader_integer_mix in GLES 1.x
There are no shaders, so it doesn't even make sense to expose the
extension.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Ian Romanick [Wed, 18 Nov 2015 00:58:02 +0000 (16:58 -0800)]
glsl: Silence unused parameter warnings
builtin_functions.cpp:5289:52: warning: unused parameter 'num_arguments' [-Wunused-parameter]
unsigned num_arguments,
^
builtin_functions.cpp:5290:52: warning: unused parameter 'flags' [-Wunused-parameter]
unsigned flags)
^
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ian Romanick [Wed, 18 Nov 2015 00:25:06 +0000 (16:25 -0800)]
glsl: Silence ignored qualifier warning
I think the intention was to mark the "this" parameter as const, but
const goes on the other end to do that.
In file included from glsl_symbol_table.cpp:26:0:
ast.h:339:35: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
const bool is_single_dimension()
^
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Kenneth Graunke [Sun, 8 Nov 2015 02:58:59 +0000 (18:58 -0800)]
i965: Allow indirect GS input indexing in the scalar backend.
This allows arbitrary non-constant indices on GS input arrays,
both for the vertex index, and any array offsets beyond that.
All indirects are handled via the pull model. We could potentially
handle indirect addressing of pushed data as well, but it would add
additional code complexity, and we usually have to pull inputs anyway
due to the sheer volume of input data. Plus, marking pushed inputs
as live due to indirect addressing could exacerbate register pressure
problems pretty badly. We'd need to be careful.
v2: Use updated MOV_INDIRECT opcode.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Jimmy Berry [Wed, 4 Nov 2015 05:24:47 +0000 (23:24 -0600)]
gallium/hud: document GALLIUM_HUD_PERIOD in envvars.html.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Jimmy Berry [Tue, 10 Nov 2015 05:20:37 +0000 (23:20 -0600)]
gallium/hud: control visibility at startup and runtime.
- env GALLIUM_HUD_VISIBLE: control default visibility
- env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Mon, 16 Nov 2015 19:48:05 +0000 (11:48 -0800)]
i965/nir: Add hooks for testing nir_shader_clone
This commit adds code for testing nir_shader_clone by running it after each
and every optimization pass and throwing away the old shader. Testing
nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment
variable.
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 11 Nov 2015 16:31:29 +0000 (08:31 -0800)]
nir: Add support for cloning shaders
This commit is heavily based on one by Rob Clark <robdclark@gmail.com> but
reworked to re-use nir_create functions and do less hashing.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Kenneth Graunke [Tue, 3 Nov 2015 08:31:22 +0000 (00:31 -0800)]
i965/nir: Validate that NIR passes call nir_metadata_preserve().
Failing to call nir_metadata_preserve() can have nasty consequences:
some pass breaks dominance information, but leaves it marked as valid,
causing some subsequent pass to go haywire and probably crash.
This pass adds a simple validation mechanism to ensure passes handle
this properly. We add a new bogus metadata flag that isn't used for
anything in particular, set it before each pass, and ensure it *isn't*
still set after the pass. nir_metadata_preserve will reset the flag,
so correct passes will work, and bad passes will assert fail.
(I would have made these functions static inline, but nir.h is included
in C++, so we can't bit-or enums without lots of casting...)
Thanks to Dylan Baker for the idea.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Tue, 3 Nov 2015 08:31:15 +0000 (00:31 -0800)]
i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes.
OPT() is the normal macro for passes that return booleans, while OPT_V()
is a variant that works for passes that don't properly report progress.
(Such passes should be fixed to return a boolean, eventually.)
These macros take care of calling nir_validate_shader() and setting
progress appropriately. In the future, it would be easy to add shader
dumping similar to INTEL_DEBUG=optimizer by extending the macro.
v2 (Jason Ekstrand):
- Fix an unused variable warning
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Rob Clark [Fri, 6 Nov 2015 16:35:21 +0000 (11:35 -0500)]
nir: add array length field
This will simplify things somewhat in clone.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Rob Clark [Fri, 6 Nov 2015 16:35:20 +0000 (11:35 -0500)]
nir: remove nir_variable::max_ifc_array_access
No users.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Rob Clark [Tue, 17 Nov 2015 17:35:09 +0000 (12:35 -0500)]
freedreno/a4xx: add fake RGTC support (required for GL3)
The a4xx bits corresponding to 'freedreno/a3xx: add fake RGTC support
(required for GL3)'
TODO some more r/e.. maybe we get lucky and hw supports some of this
directly? For now this will help us enable gl3.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 17 Nov 2015 16:42:53 +0000 (11:42 -0500)]
freedreno/a4xx: add compressed texture formats
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 17 Nov 2015 16:42:34 +0000 (11:42 -0500)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 05:28:34 +0000 (00:28 -0500)]
freedreno: expose GLSL 140 and fake MSAA for GL3.0/3.1 support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Thu, 17 Sep 2015 05:43:36 +0000 (01:43 -0400)]
freedreno/a3xx: fix texture buffers, enable offsets
The main issue is that the current logic looked into cso->u.tex, which
is the wrong side of the union to look into for texture buffers. While I
was at it, it was easy enough to add the logic to handle offsets
(first_element).
- reduce texture buffer size limit (determined experimentally)
- don't look at first/last levels, instead look at first/last element
- include the first element offset
- set offset alignment to 16 (determined experimentally)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 04:20:31 +0000 (23:20 -0500)]
freedreno: add support for conditional rendering, required for GL3.0
A smarter implementation would make it possible to attach this to emit
state for the BY_REGION versions to avoid breaking the tiling. But this
is a start.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 03:13:16 +0000 (22:13 -0500)]
freedreno/a3xx: add fake RGTC support (required for GL3)
Also throw in LATC while we're at it (same exact format). This could be
made more efficient by keeping a shadow compressed texture to use for
returning at map time. However... it's not worth it for now...
presumably compressed textures are not updated often.
Lastly fix up Z32S8 transfers to non-0 layers.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 00:32:32 +0000 (19:32 -0500)]
freedreno/a3xx: add missing formats to enable ARB_vertex_type_2_10_10_10_rev
The previously RE'd formats were from an ES driver implementing
OES_vertex_type_10_10_10_2 and thus backwards. A future change could add
the 2_10_10_10 support.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 16 Nov 2015 20:07:29 +0000 (15:07 -0500)]
freedreno/a3xx+a4xx: fix for stk binning pass hang
We'd end up in a state where shader uses no inputs, yet num_elements is
greater than zero. Triggered by a TF vertex shader which did:
gl_Position = vec4(0.0, 0.0, 0.0, 0.0);
resulting in a binning pass variant with no inputs.
Includes equiv fix in a4xx, even though we don't have binning-pass
enabled yet on a4xx.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Mon, 16 Nov 2015 19:58:50 +0000 (14:58 -0500)]
freedreno/a3xx+a4xx: fix GL_POINTS lockup w/ GLES
point_size_per_vertex is always TRUE for GLES, causing us to configure
the hw as if gl_PointSize was written, even if it was not. Which makes
for grumpy hw.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Sun, 8 Nov 2015 18:43:07 +0000 (13:43 -0500)]
nir: fix typo in idiv lowering, causing large-udiv-udiv failures
In nv50, and in the python script that Rob circulated, we do:
bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b);
Do the same in the nir div lowering pass. This fixes the large-udiv-udiv
piglit tests on freedreno.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Oded Gabbay [Tue, 17 Nov 2015 14:16:46 +0000 (16:16 +0200)]
llvmpipe: disable VSX in ppc due to LLVM PPC bug
This patch disables the use of VSX instructions, as they cause some
piglit tests to fail
For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7
With this patch, ppc64le reaches parity with x86-64 as far as piglit test
suite is concerned.
v2:
- Added check that we have at least LLVM 3.4
- Added the LLVM bug URL as a comment in the code
v3:
- Only disable VSX if Altivec is supported, because if Altivec support
is missing, then VSX support doesn't exist anyway.
- Change original patch description.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Wed, 18 Nov 2015 19:23:35 +0000 (14:23 -0500)]
nvc0/ir: actually emit AFETCH on kepler
Looks like this was forgotten in the commit which added the AFETCH
logic.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Kenneth Graunke [Tue, 17 Nov 2015 22:56:32 +0000 (14:56 -0800)]
nir: Store the size of the TCS output patch in nir_shader_info.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Kenneth Graunke [Thu, 12 Nov 2015 03:24:01 +0000 (19:24 -0800)]
i965: Add enums for 3DSTATE_TE field values.
3DSTATE_TE has partitioning, output topology, and domain fields,
each of which has several enumerated values. We'll also need to
switch on the domain, so enums (rather than #defines) seem like a
natural fit.
I chose to put these in brw_compiler.h because they'll be stored
in struct brw_tes_prog_data, which will live there.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Ian Romanick [Sat, 14 Nov 2015 00:09:37 +0000 (16:09 -0800)]
meta/generate_mipmap: Don't leak the framebuffer object
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Brian Paul [Mon, 16 Nov 2015 17:41:20 +0000 (10:41 -0700)]
svga: use more VGPU10 formats
We always want to prefer the VGPU10 formats over the VGPU9 ones when
we have VGPU10 support.
Original patch by Jose and updated by Brian.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Mon, 16 Nov 2015 17:31:46 +0000 (10:31 -0700)]
svga: add/use new svga_sampler_format() function
This is important for the case of sampling from a depth texture. In
that case, we need to sample the texture as if it were a single-channel
color texture. For other/color formats, we can use the format as-is.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Nicolai Hähnle [Thu, 12 Nov 2015 23:38:36 +0000 (00:38 +0100)]
radeon: count cs dwords separately for query begin and end
This will be important for perfcounter queries.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 18 Nov 2015 11:06:58 +0000 (12:06 +0100)]
radeon: expose r600_query_hw functions for reuse
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
Nicolai Hähnle [Thu, 12 Nov 2015 23:27:34 +0000 (00:27 +0100)]
radeon: implement r600_query_hw_get_result via function pointers
We will need the clear_result override for the batch query implementation.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 18 Nov 2015 11:05:11 +0000 (12:05 +0100)]
radeon: split hw query buffer handling from cs emit
The idea here is that driver queries implemented outside of common code
will use the same query buffer handling with different logic for starting
and stopping the corresponding counters.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
Nicolai Hähnle [Wed, 18 Nov 2015 10:59:21 +0000 (11:59 +0100)]
radeon: convert hardware queries to the new style
Move r600_query and r600_query_hw into the header because we will want to
reuse the buffer handling and suspend/resume logic outside of the common
radeon code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
Nicolai Hähnle [Wed, 18 Nov 2015 10:55:09 +0000 (11:55 +0100)]
radeon: convert software queries to the new style
Software queries are all queries that do not require suspend/resume
and explicit handling of result buffers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
Nicolai Hähnle [Wed, 18 Nov 2015 10:40:00 +0000 (11:40 +0100)]
radeon: add query handler function pointers
The goal here is to be able to move the implementation details of hardware-
specific queries (in particular, performance counters) out of the common code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
Nicolai Hähnle [Thu, 12 Nov 2015 21:04:50 +0000 (22:04 +0100)]
radeon: move R600_QUERY_* constants into a new query header file
More query-related structures will have to be moved into their own
header file to support hardware-specific performance counters.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 10 Nov 2015 19:42:02 +0000 (20:42 +0100)]
radeon: cleanup driver query list
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 6 Nov 2015 11:52:51 +0000 (12:52 +0100)]
radeon: move get_driver_query_info to r600_query.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Neil Roberts [Mon, 16 Nov 2015 13:03:11 +0000 (14:03 +0100)]
i965: Prevent fast clears for MSRTs on SKL
There are currently a bunch of formats that behave strangely when
sampling the cleared color from the MCS buffer on SKL. They seem to
mostly be formats that don't have an alpha component, although it's
not all of them, and we haven't yet found anything in the specs which
would explain this. For now to be on the safe side this patch just
prevents fast clears for MSRTs on SKL altogether so that when fast
clears are eventually enabled it will only be for single-sampled
surfaces. The assumption is that clears are probably more likely to be
used in single-sampled applications anyway so we can at least get them
working and we can enable MSRTs later once we understand the problem
better.
This patch should have no functional effect other than perhaps
receiving fewer perf_debug messages on SKL+.
v2: Improve the commit message to avoid saying the patch disables fast
clears because it will be merged before fast clears are enabled
for any surfaces so it doesn't actually disable anything.
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Eric Anholt [Thu, 12 Nov 2015 01:09:40 +0000 (17:09 -0800)]
vc4: Don't bother lowering uniforms when the same value is used twice.
DEQP likes to do math on uniforms, and the "fmaxabs dst, uni, uni" to get
the absolute value would get lowered. The lowering doesn't bother to try
to restrict the lifetime of the lowered uniforms, so we'd end up register
allocation failng due to this on 5 of the tests (More tests still fail in
RA, which look like we'll need to reduce lowered uniform lifetimes to
fix).
No changes on shader-db, though fewer extra MOVs are generated on even
glxgears (MOVs pair well enough that it ends up being the same instruction
count).
Eric Anholt [Tue, 17 Nov 2015 04:45:46 +0000 (20:45 -0800)]
vc4: Fix uniform reordering to support reading the same uniform twice.
This does actually happen in the wild (particularly fabs of a uniform), so
we'd like to support it.
Eric Anholt [Thu, 12 Nov 2015 00:50:29 +0000 (16:50 -0800)]
vc4: Fix documentation on vc4_qir_lower_uniforms.c.
Eric Anholt [Tue, 10 Nov 2015 23:37:47 +0000 (15:37 -0800)]
vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Wed, 18 Nov 2015 00:31:14 +0000 (16:31 -0800)]
i965: Fix PIPE_CONTOL typo.
PIPE_CONTOL!!!
Ben Widawsky [Tue, 17 Nov 2015 01:23:01 +0000 (17:23 -0800)]
i965: Add assertion for src_stencil payload size
This helps address a coverity warning and prevents future questions about this
code.
Reported-by: Coverity (via Ilia)
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Wed, 28 Oct 2015 23:26:15 +0000 (16:26 -0700)]
i965: Implement ARB_pipeline_statistics_query tessellation counters.
We basically just need to uncomment Ben's code.
v2: Fix obvious bugs caught by Ben.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Timothy Arceri [Fri, 13 Nov 2015 04:43:13 +0000 (15:43 +1100)]
glsl: rename location layout helper
Change name from validate -> apply to more accurately describe what
the function does.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Timothy Arceri [Fri, 13 Nov 2015 00:41:52 +0000 (11:41 +1100)]
glsl: don't validate binding when its not needed
Checking that the flag has been set is all the validation thats
needed here.
Also not calling the binding validation function will make things
much simpler when adding compile time constant support as we
won't need to resolve the binding value.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Fri, 13 Nov 2015 00:28:20 +0000 (11:28 +1100)]
glsl: remove temp variable to make code easier to read
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Fri, 13 Nov 2015 00:21:42 +0000 (11:21 +1100)]
glsl: cleanup and fix validate matrix function for arrays
Previously if the member was an array of matrices then a
warning message would be incorrectly given.
Also the struct case could never be met so it has been removed.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Thu, 12 Nov 2015 23:49:48 +0000 (10:49 +1100)]
glsl: use better location in struct and block error messages
Previously we only gave the location for some members and never
gave the variable location. In those cases we were just giving
the location of the struct/block.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Thu, 12 Nov 2015 23:27:00 +0000 (10:27 +1100)]
glsl: only do type and qualifier validation once per declaration
For struct and block members previously we were doing it for
every variable declaration.
So for example
struct S {
atomic_uint x, y, z;
};
Would previously generate three error messages when one is sufficient.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Thu, 12 Nov 2015 22:49:31 +0000 (09:49 +1100)]
glsl: rename function that processes struct and iface members
As of the previous commit this function handles only struct/iface
members.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>