Kenneth Graunke [Wed, 11 Jun 2014 01:50:03 +0000 (18:50 -0700)]
i965: Fix Haswell discard regressions since Gen4-5 line AA fix.
In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally
moved fire_fb_write() above the brw_pop_insn_state(), which caused the
SEND to lose its predication and change from WE_normal to WE_all.
Haswell uses predicated SENDs for discards, so this broke Piglit's
tests for discards.
We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked,
but the actual FB write itself should respect those. So, pop state
first, and force it again around the single MOV.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
Michel Dänzer [Tue, 3 Jun 2014 07:45:23 +0000 (16:45 +0900)]
gbm: Remove 64x64 restriction from GBM_BO_USE_CURSOR
GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to
build, but it no longer rejects widths or heights other than 64.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Wed, 11 Jun 2014 00:44:56 +0000 (17:44 -0700)]
i965: Use brw->gen in some generation checks.
Will simplify the automated conversion if we want to allow compiling the
driver for a single generation.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Wed, 11 Jun 2014 20:01:31 +0000 (13:01 -0700)]
i965/fs: Clean up tabs in brw_fs_cse.cpp.
I'm adding vec4 CSE, and I want to diff the files.
Matt Turner [Wed, 11 Jun 2014 01:18:39 +0000 (18:18 -0700)]
configure.ac: Simplify DUSE_EXTERNAL_DXTN_LIB logic.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Matt Turner [Wed, 11 Jun 2014 01:11:56 +0000 (18:11 -0700)]
configure.ac: Alphabetize AC_CONFIG_FILES.
This isn't supposed to be difficult.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Matt Turner [Wed, 11 Jun 2014 01:08:10 +0000 (18:08 -0700)]
configure.ac: Remove single quotes to fix syntax highlighting.
Please stop adding them.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Robert Bragg [Sun, 8 Jun 2014 18:02:41 +0000 (19:02 +0100)]
meta: save and restore swizzle for _GenerateMipmap
This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap; otherwise we may lose components and effectively
apply the swizzle twice by the time these levels are sampled.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ian Romanick [Wed, 11 Jun 2014 01:07:50 +0000 (18:07 -0700)]
i965/vec4: Emit smarter code for b2f of a comparison
Previously we would emit the comparison, emit an AND to mask off extra
bits from the comparison result, then convert the result to float. Now,
do the comparison, then use a cleverly constructed SEL to pick either
0.0f or 1.0f.
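The two lowerings can be contrasted with a scalar C sketch. This is only an illustration, not the vec4 backend code: the function names are hypothetical, and the raw comparison result is modelled as an all-ones mask, which is an assumption about what the "extra bits" look like.

```c
#include <stdint.h>

/* Old lowering (sketch): the comparison result may carry extra bits,
 * modelled here as an all-ones mask, so it is ANDed with 1 before the
 * integer-to-float conversion. */
float b2f_and_convert(float a, float b)
{
    uint32_t cmp = (a < b) ? UINT32_MAX : 0u; /* hypothetical raw CMP result */
    uint32_t bit = cmp & 1u;                  /* mask off the extra bits */
    return (float)bit;                        /* convert 0/1 to 0.0f/1.0f */
}

/* New lowering (sketch): compare, then select one of two float constants. */
float b2f_select(float a, float b)
{
    return (a < b) ? 1.0f : 0.0f;
}
```

Both functions return the same value for any pair of inputs; the second form simply needs fewer steps, which matches the small instruction savings reported.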
No piglit regressions on Ivybridge.
total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
instructions in affected programs: 136533 -> 133671 (-2.10%)
GAINED: 0
LOST: 0
Programs that are affected appear to save between 1 and 5 instructions
(just by skimming the output from shader-db report.py).
v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove
extraneous fix_3src_operand (suggested by Matt). The latter change
required swapping the order of the operands and using predicate_inverse.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 11 Jun 2014 00:50:04 +0000 (17:50 -0700)]
i965/vec4: Silence a couple unused parameter warnings
brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter]
brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter]
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Tue, 10 Jun 2014 17:41:32 +0000 (10:41 -0700)]
glsl: Store gl_uniform_driver_storage::format as the actual type
And delete the incorrect comment.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Dave Airlie [Wed, 11 Jun 2014 04:03:11 +0000 (14:03 +1000)]
softpipe: fix pt->resource assert placement
oops meant to move this.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 03:54:13 +0000 (13:54 +1000)]
softpipe: enable AMD_vertex_shader_layer.
This passes tests now on softpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 03:32:57 +0000 (13:32 +1000)]
softpipe: enable GLSL 3.30 support.
This enables GL3.3 on softpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 04:19:18 +0000 (14:19 +1000)]
softpipe: bump the softpipe geometry limits
This just aligns the limits with llvmpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 04:19:10 +0000 (14:19 +1000)]
tgsi_exec: use defines for max inputs/outputs
This fixes the limits for GL 3.2, and subsequently fixes
some segfaults in some varying packing tests and max varying tests
after the limits were bumped.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 03:32:25 +0000 (13:32 +1000)]
softpipe: add layered rendering support.
This adds support for GL 3.2 layered rendering to softpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 02:12:27 +0000 (12:12 +1000)]
softpipe: add layering to the surface tile cache.
This adds the layer info to the tile cache.
This changes clear_flags to be dynamically allocated, as
MAX_LAYERS seems like too big a step.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 10 Jun 2014 00:56:51 +0000 (10:56 +1000)]
softpipe: add depth clamping support. (v2)
This passes the piglit depth clamp tests.
This is required for GL 3.2.
v2: move min/max up one level, could go further, thanks
to Roland for suggestion.
v1: Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 11 Jun 2014 01:38:19 +0000 (11:38 +1000)]
tgsi/gs: bound max output vertices in shader
This limits the number of emitted vertices to the shader's max output
vertices, and avoids writing vertices into memory that isn't big
enough for them.
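A minimal sketch of this kind of bound, assuming a simplified output record; the struct and function names are hypothetical and not taken from tgsi_exec:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified geometry-shader output state. */
struct gs_output {
    unsigned emitted;      /* vertices written so far */
    unsigned max_vertices; /* the shader's declared max output vertices */
};

/* Returns true if the vertex was accepted, false if the emit was dropped
 * because the declared limit has already been reached. */
bool gs_emit_vertex(struct gs_output *out)
{
    assert(out != NULL);
    if (out->emitted >= out->max_vertices)
        return false; /* over the limit: do not write past the buffer */
    out->emitted++;
    return true;
}
```

Dropping the extra emit is one possible policy; the commit message only says that the count is limited, not how excess vertices are handled.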
Reviewed-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jon Ashburn [Wed, 5 Mar 2014 00:34:44 +0000 (17:34 -0700)]
i965: Add GPU BLIT of texture image to PBO in Intel driver
Add Intel driver hook for glGetTexImage to accelerate the case of reading
texture image into a PBO. This case gets huge performance gains by using
GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed
by memcpy.
No regressions on Piglit tests with Intel driver.
Performance gain (1280 x 800 FBO, Ivybridge):
glGetTexImage + glMapBufferRange with patch 1.45 msec
glGetTexImage + glMapBufferRange without patch 4.68 msec
v3: (by Kenneth Graunke)
- Fix compile after Eric's change to drop the tiling argument
to intel_miptree_create_for_bo.
- Add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit
regressions.
- Squash in several whitespace and coding style fixes.
Kenneth Graunke [Mon, 9 Jun 2014 09:59:22 +0000 (02:59 -0700)]
i965: Invalidate live intervals when inserting Gen4 SEND workarounds.
We need to invalidate the live intervals when inserting new
instructions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Kenneth Graunke [Mon, 9 Jun 2014 09:59:21 +0000 (02:59 -0700)]
i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.
When walking backwards, we want to stop at the head sentinel, which is
where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL.
Fixes random crashes, as well as valgrind errors.
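The sentinel rule can be sketched with a simplified list. This is an assumption-laden illustration: the structs below use two separate sentinel nodes and are not Mesa's actual exec_list layout, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified sentinel list; not Mesa's actual exec_list layout. */
struct node {
    struct node *prev;
    struct node *next;
    int value; /* only meaningful for real nodes, never for sentinels */
};

struct list {
    struct node head; /* head sentinel: head.prev == NULL */
    struct node tail; /* tail sentinel: tail.next == NULL */
};

void list_init(struct list *l)
{
    l->head.prev = NULL;
    l->head.next = &l->tail;
    l->tail.prev = &l->head;
    l->tail.next = NULL;
}

void list_push_tail(struct list *l, struct node *n)
{
    n->prev = l->tail.prev;
    n->next = &l->tail;
    l->tail.prev->next = n;
    l->tail.prev = n;
}

/* Sum the values of all real nodes before 'from', walking backwards.
 * The loop stops on the head sentinel (prev == NULL) and therefore never
 * treats the sentinel as a real node. */
int sum_before(const struct node *from)
{
    int sum = 0;
    assert(from != NULL);
    for (const struct node *scan = from->prev; scan->prev != NULL; scan = scan->prev)
        sum += scan->value;
    return sum;
}
```

Testing `scan != NULL` instead would run the loop body once more with `scan` pointing at the head sentinel, reading a payload that does not exist.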
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Kenneth Graunke [Mon, 9 Jun 2014 09:13:25 +0000 (02:13 -0700)]
meta: Label the meta GLSL clear program.
Giving the meta clear program a meaningful name makes it easier to find
in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time. We already
did so for integer programs, but neglected to label the primary program.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sat, 7 Jun 2014 09:21:47 +0000 (02:21 -0700)]
i965/fs: Combine generate_math[12]_gen6 methods.
These used to call different math emitters (brw_math vs. brw_math2).
Now that they both call gen6_math, they're virtually identical.
When unrolling SIMD16 to multiple SIMD8 operations, we should take care
not to apply sechalf to brw_null_reg for src1. Otherwise, we'd end up
with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's
valid.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Sat, 7 Jun 2014 09:27:43 +0000 (02:27 -0700)]
i965/fs: Drop the generate_math[12]_gen7 methods.
These functions are basically identical, so we should combine them.
However, they're so trivial, we may as well just fold them into their
only call sites.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Sat, 7 Jun 2014 09:32:40 +0000 (02:32 -0700)]
i965/vec4: Combine generate_math[12]_gen6 methods.
These are trivial to combine: we should just avoid checking the second
operand if it's brw_null_reg.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Sat, 7 Jun 2014 09:39:37 +0000 (02:39 -0700)]
i965/vec4: Drop the generate_math2_gen7() method.
It's now a single line of code, so we may as well fold it into the
caller.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Sat, 7 Jun 2014 09:12:46 +0000 (02:12 -0700)]
i965: Rename brw_math to gen4_math.
Usually, I try to use "brw" for functions that apply to all generations,
and "gen4" for dead end/legacy code that is only used on Gen4-5.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Sat, 7 Jun 2014 08:56:12 +0000 (01:56 -0700)]
i965: Split Gen4-5 and Gen6+ MATH instruction emitters.
Our existing functions, brw_math and brw_math2, had unclear roles:
Gen4-5 used brw_math for both unary and binary math functions; it never
used brw_math2. Since operands are already in message registers, this
is reasonable.
Gen6+ used brw_math for unary math functions, and brw_math2 for binary
math functions, duplicating a lot of code. The only real difference was
that brw_math used brw_null_reg() for src1.
This patch improves brw_math2's assertions to allow both unary and
binary operations, renames it to gen6_math(), and drops the Gen6+ code
out of brw_math().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Wed, 6 Mar 2013 16:51:44 +0000 (08:51 -0800)]
i965: Make src_reg::equals() take a constant reference, not a pointer.
This is more typical C++ style.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Kenneth Graunke [Thu, 13 Dec 2012 02:01:00 +0000 (18:01 -0800)]
i965: Don't set the "switch" flag on control flow instructions on Gen6+.
Thread switching on control flow instructions is a documented workaround
for Gen4-5 errata. As far as I can tell, it hasn't been needed since
Sandybridge. Thread switching is not free, so in theory this may help
performance slightly.
Flow control instructions with the "switch" flag cannot be compacted, so
removing it will make these instructions compactable. (Of course, we
still have to implement compaction for flow control instructions...)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 25 Jul 2013 07:30:05 +0000 (00:30 -0700)]
i965/fs: Allow CSE on math opcodes on Gen6+.
total instructions in shared programs: 2081469 -> 2081248 (-0.01%)
instructions in affected programs: 22606 -> 22385 (-0.98%)
No programs were hurt by this patch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Thomas Helland [Mon, 9 Jun 2014 22:57:42 +0000 (00:57 +0200)]
glsl: Remove unused include in expr.flatt.
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:41 +0000 (00:57 +0200)]
glsl: Remove unused include in ir.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:40 +0000 (00:57 +0200)]
glsl: Remove unused include from ir_constant_expression.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:39 +0000 (00:57 +0200)]
glsl: Remove unused include from ir_basic_block.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:38 +0000 (00:57 +0200)]
glsl: Remove unused include from hir_field_selection.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:37 +0000 (00:57 +0200)]
glsl: Remove unused include from glsl_symbol_table.h
Only function-defs use glsl_type so forward declare instead.
Compile-tested on my Ivy-bridge system.
IWYU also suggests removing #include <new>, and this compiles fine.
I'm not familiar enough with memory management in C/C++ that I feel
comfortable removing this. Insights would be appreciated.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:36 +0000 (00:57 +0200)]
glsl: Remove unused include from glsl_types.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Added comment about core.h being used for MAX2.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:35 +0000 (00:57 +0200)]
glsl: Remove unused include from builtin_variables.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:34 +0000 (00:57 +0200)]
glsl: Remove unused include in ast_to_hir.cpp
Found with IWYU. Comment says it's for struct gl_extensions.
Grepping for gl_extensions shows no uses.
Tested by compiling on my Ivy-bridge system.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:33 +0000 (00:57 +0200)]
glsl: Remove unused includes in link_uniform_block_active_visitor.h
Found with IWYU, compile-tested on my Ivy-bridge system.
This is not used in the header, and is included in the source.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Thomas Helland [Mon, 9 Jun 2014 22:57:32 +0000 (00:57 +0200)]
glsl: Remove unused includes in link_uniform_init.
Found with IWYU, confirmed with grepping for "hash" and "symbol".
No negative effects on compilation.
IWYU also reported core.h and linker.h could be removed,
but I'm unsure if those are false positives.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Matt Turner [Tue, 10 Jun 2014 09:08:10 +0000 (02:08 -0700)]
i965: Replace open-coded linked list with exec_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 09:06:23 +0000 (02:06 -0700)]
glsl: Add an exec_node_init() function, usable from C.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 08:00:01 +0000 (01:00 -0700)]
glsl: Make foreach macros usable from C by adding struct keyword.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 07:23:41 +0000 (00:23 -0700)]
glsl: Make exec_list members just wrap the C API.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 07:28:53 +0000 (00:28 -0700)]
glsl: Make exec_node members just wrap the C API.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 07:14:41 +0000 (00:14 -0700)]
glsl: Add C API for exec_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 07:14:24 +0000 (00:14 -0700)]
glsl: Add C API for exec_node.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 05:44:56 +0000 (22:44 -0700)]
glsl: Move definition of exec_list member functions out of the struct.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 10 Jun 2014 05:37:44 +0000 (22:37 -0700)]
glsl: Move definition of exec_node member functions out of the struct.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:57 +0000 (18:14 +0200)]
r600g/compute: Use %u as the unsigned format
This fixes an issue when running the cl-program-bitcoin-phatk
piglit test, where some of the inputs have negative values.
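One plausible reading is that unsigned values with the top bit set were being printed through a signed conversion. A small sketch of formatting such a value as unsigned, using a hypothetical helper rather than the driver's debug macro:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a 32-bit unsigned value in decimal.
 * Returns the number of characters snprintf would write. */
int format_u32(char *buf, size_t len, uint32_t value)
{
    assert(buf != NULL && len > 0);
    /* PRIu32 expands to the unsigned conversion matching uint32_t. */
    return snprintf(buf, len, "%" PRIu32, value);
}
```

With a signed conversion, a value such as 0xFFFFFFFF would typically show up as -1; with the unsigned conversion it prints as 4294967295.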
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:56 +0000 (18:14 +0200)]
r600g/compute: align items correctly
Now, items whose size is a multiple of 1024 dw won't leave
1024 dw between themselves and the following item.
The rest of the cases are left as they were.
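The difference between the two behaviours can be sketched as follows. The commit message does not show the expression actually used, so both helpers and the ITEM_ALIGNMENT name are assumptions chosen to reproduce the described results:

```c
#include <assert.h>
#include <stdint.h>

#define ITEM_ALIGNMENT 1024 /* dwords, the granularity named in the message */

/* Always advances to the next multiple, even when the size is already
 * aligned, so an aligned item is followed by a full 1024 dw gap. */
int64_t pad_to_next_multiple(int64_t size_in_dw)
{
    assert(size_in_dw >= 0);
    return size_in_dw + ITEM_ALIGNMENT - (size_in_dw % ITEM_ALIGNMENT);
}

/* Rounds up to a multiple, leaving already-aligned sizes unchanged. */
int64_t align_up(int64_t size_in_dw)
{
    assert(size_in_dw >= 0);
    return (size_in_dw + ITEM_ALIGNMENT - 1) / ITEM_ALIGNMENT * ITEM_ALIGNMENT;
}
```

For unaligned sizes the two agree (1025 becomes 2048 either way); they differ only when the size is already a multiple of 1024.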
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:55 +0000 (18:14 +0200)]
r600g/compute: Cleanup of compute_memory_pool.h
Removed compute_memory_defrag declaration because it seems
to be unimplemented.
I think that this function would have been the one that solves
the problem with fragmentation that compute_memory_finalize_pending has.
Also removed comments that are already at compute_memory_pool.c
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:54 +0000 (18:14 +0200)]
r600g/compute: Tidy a bit compute_memory_finalize_pending
Explanation of the changes, as requested by Tom Stellard:
Let's take need after it is calculated as
item->size_in_dw+2048 - (pool->size_in_dw - allocated)
BEFORE:
If need is positive or 0:
we calculate need += 1024 - (need % 1024), which is like
taking the ceiling to the next multiple of 1024; for example,
0 goes to 1024, 512 goes to 1024 as well, 1025 goes
to 2048 and so on. So now need is always positive,
we do compute_memory_grow_pool, check its output
and continue.
If need is negative:
we calculate need += 1024 - (need % 1024); in this case
we will have negative numbers, or 0 if need is in
[-1024:-1], so now we take the else, recalculate
need as need = pool->size_in_dw / 10 and
need += 1024 - (need % 1024), we do
compute_memory_grow_pool, check its output and continue.
AFTER:
If need is positive or 0:
we jump the if, calculate need += 1024 - (need % 1024)
compute_memory_grow_pool, check its output and continue.
If need is negative:
we enter the if, and need is now pool->size_in_dw / 10.
Now we calculate need += 1024 - (need % 1024)
compute_memory_grow_pool, check its output and continue.
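The AFTER flow can be written out as a short sketch. The function name and the use of plain integer parameters are hypothetical; only the arithmetic is taken from the explanation above.

```c
#include <stdint.h>

/* Sketch of the "AFTER" flow: returns the amount, in dwords, by which the
 * pool would be grown. Parameter names mirror the commit message. */
int64_t compute_growth(int64_t item_size_in_dw, int64_t pool_size_in_dw,
                       int64_t allocated)
{
    int64_t need = item_size_in_dw + 2048 - (pool_size_in_dw - allocated);

    /* Negative need: use a tenth of the current pool size instead. */
    if (need < 0)
        need = pool_size_in_dw / 10;

    /* need is non-negative here, so the remainder is in [0, 1023]. */
    need += 1024 - (need % 1024);
    return need;
}
```

Handling the negative case first means the rounding step only ever sees non-negative values, which sidesteps the question of how `%` treats negative operands.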
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:53 +0000 (18:14 +0200)]
r600g/compute: Add more NULL checks
In this case, NULL checks are added to compute_memory_grow_pool,
so it returns -1 when it fails. This makes it necessary
to handle such cases in compute_memory_finalize_pending
when it is needed to grow the pool.
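The pattern of returning -1 on allocation failure and checking it in the caller can be sketched like this. The pool struct and both function names are hypothetical simplifications, not the r600g code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified pool; the real structure has more fields. */
struct pool {
    uint32_t *shadow;
    size_t size_in_dw;
};

/* Returns 0 on success, -1 on failure; the pool is unchanged on failure. */
int pool_grow(struct pool *p, size_t new_size_in_dw)
{
    uint32_t *grown;

    assert(p != NULL);
    if (new_size_in_dw == 0 || new_size_in_dw > SIZE_MAX / sizeof(uint32_t))
        return -1; /* size is zero or the byte count would overflow */

    grown = realloc(p->shadow, new_size_in_dw * sizeof(*grown));
    if (grown == NULL)
        return -1; /* allocation failed: keep the old buffer */

    p->shadow = grown;
    p->size_in_dw = new_size_in_dw;
    return 0;
}

/* Caller propagates the failure instead of continuing with a bad buffer. */
int pool_reserve(struct pool *p, size_t needed_in_dw)
{
    assert(p != NULL);
    if (needed_in_dw <= p->size_in_dw)
        return 0;
    if (pool_grow(p, needed_in_dw) == -1)
        return -1;
    return 0;
}
```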
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:52 +0000 (18:14 +0200)]
r600g/compute: Adding checks for NULL after CALLOC
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bruno Jiménez [Mon, 19 May 2014 16:14:51 +0000 (18:14 +0200)]
r600g/compute: Fixing a typo and some indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Cody Northrop [Thu, 5 Jun 2014 17:27:51 +0000 (11:27 -0600)]
mesa: Fix substitution of large shaders
Signed-off-by: Cody Northrop <cody@lunarg.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Michel Dänzer [Tue, 10 Jun 2014 02:25:04 +0000 (11:25 +0900)]
configure: Only check for OpenCL without LLVM when the latter is certain
LLVM is enabled by default for some architectures, but the test was failing
before that.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
David Heidelberger [Sun, 1 Jun 2014 03:02:44 +0000 (05:02 +0200)]
r600g,radeonsi: implement PIPE_QUERY_TIMESTAMP_DISJOINT
v2 Marek: set the query result correctly
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Jon TURNEY [Fri, 9 May 2014 12:54:09 +0000 (13:54 +0100)]
configure: Always default to --enable-driglx-direct
Always default to --enable-driglx-direct, now that it will build driswrast
but won't try to use dri[123] on platforms which don't have them.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jon TURNEY [Mon, 2 Jun 2014 17:52:15 +0000 (18:52 +0100)]
glx: Fix build in GLX_DIRECT_RENDERING !GLX_USE_APPLEGL !GLX_USE_DRM case
Some untangling to fix building in the dri_platform=none, --enable-driglx-direct
case, where only driswrast can be used.
Turn the test for including the glXGetScreenDriver()/glXGetDriverConfig()
interface used by xdriinfo from !GLX_USE_APPLEGL into a positive form, as it is
only useful when dri_platform=drm
Add additional GLX_USE_DRM tests so DRI[123] renderers are only used when
dri_platform=drm
Note that swrast and indirect must still be disabled in the APPLEGL case at the
moment, which makes things more complex than they need to be. More untangling
is needed to allow that
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Kristian Høgsberg [Sun, 1 Jun 2014 20:49:36 +0000 (13:49 -0700)]
i965: Make gen7_pi field of brw_instruction use unsigned instead of GLuint
Nothing else uses GL-types here.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kristian Høgsberg [Sun, 1 Jun 2014 20:48:46 +0000 (13:48 -0700)]
i965: Don't include mtypes.h in brw_disasm.c
It's not used.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Matt Turner [Tue, 10 Jun 2014 04:03:38 +0000 (21:03 -0700)]
i965/fs: initialize src as reg_undef for texture opcodes on Gen4.
Untested.
Tapani Pälli [Mon, 9 Jun 2014 09:30:55 +0000 (12:30 +0300)]
i965/fs: initialize src as reg_undef for texture opcodes on Gen5/6.
Commit 07af0ab changed fs_inst to have 0 sources for texture opcodes
in emit_texture_gen5 (Ironlake, Sandybridge) while fs_generator still
uses a single source from brw_reg struct. Patch sets src as reg_undef
which matches the behavior before the constructor got changed.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79534
Emil Velikov [Mon, 2 Jun 2014 11:26:17 +0000 (12:26 +0100)]
egl/dri2: do not leak dri2_dpy->driver_name
Originally all hardware drivers duplicate the driver_name string
from an external source, while for the software rasterizer we set
it to "swrast". Follow the example set by hw drivers; this way
we can free the string at dri2_terminate().
v2: Use strdup over strndup. Suggested by Ilia Mirkin.
v3: Handle platform_drm in a similar manner. Cleanup swrast
driver_name in error path.
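The ownership rule described here can be sketched as follows, assuming a stand-in struct; the type and function names are hypothetical and only illustrate why duplicating the literal lets one free() serve every driver:

```c
#define _POSIX_C_SOURCE 200809L /* for strdup */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the display structure. */
struct display {
    char *driver_name; /* always heap-allocated, so it can always be freed */
};

/* Software path: duplicate the literal instead of pointing at it. */
int display_init_swrast(struct display *dpy)
{
    assert(dpy != NULL);
    dpy->driver_name = strdup("swrast");
    return dpy->driver_name != NULL ? 0 : -1;
}

/* Teardown is identical for hardware and software drivers. */
void display_terminate(struct display *dpy)
{
    assert(dpy != NULL);
    free(dpy->driver_name);
    dpy->driver_name = NULL;
}
```

If the software path stored the string literal directly, the teardown function would need to know which drivers own their name and which do not.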
Cc: Chia-I Wu <olv@lunarg.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sun, 1 Jun 2014 14:19:46 +0000 (15:19 +0100)]
egl/dri2/x11: use standard strndup function
Using a custom version of the function brings no benefit.
Cc: Chad Versace <chad.versace@linux.intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:12 +0000 (12:16 +0300)]
android, dricore: undefined reference to _mesa_streaming_load_memcpy
_mesa_streaming_load_memcpy is defined in main/streaming-load-memcpy.c
I'm adding it to the dricore lib
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:11 +0000 (12:16 +0300)]
android, mesa_gen_matypes: pull in timespec POSIX definition
This fixes:
include/c11/threads_posix.h: In function 'cnd_timedwait':
include/c11/threads_posix.h:140:21: error: storage size of 'abs_time' isn't known
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:10 +0000 (12:16 +0300)]
android, egl: typo dri2_fallback_pixmap_surface -> dri2_fallback_create_pixmap_surface
I used commit bc8b07a6 as reference, and only the droid_display_vtbl had this issue.
This fixes:
src/egl/drivers/dri2/platform_android.c:641:29:
error: 'dri2_fallback_pixmap_surface' undeclared here (not in a function)
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:09 +0000 (12:16 +0300)]
android, egl: add correct drm include for libmesa_egl_dri2
Fixes:
src/egl/drivers/dri2/platform_android.c:38:
include/GL/internal/dri_interface.h:51:17:
fatal error: drm.h: No such file or directory
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:08 +0000 (12:16 +0300)]
android: add src/gallium/auxiliary as include path for libmesa_dricore
This fixes:
In file included from
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_exec_api.c:445:0:
/home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_attrib_tmp.h:28:38:
fatal error: util/u_format_r11g11b10f.h: No such file or directory
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:07 +0000 (12:16 +0300)]
android: add libloader to libGLES_mesa and libmesa_egl_dri2
This fixes
src/egl/drivers/dri2/platform_android.c:664: error: undefined reference to 'loader_set_logger'
src/egl/drivers/dri2/platform_android.c:678: error: undefined reference to 'loader_get_driver_for_fd'
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:06 +0000 (12:16 +0300)]
android: adapt to the megadriver mechanism
Fixes linker error:
ld:
.../libmesa_dri_common_intermediates/libmesa_dri_common.a(dri_util.o):
in function globalDriverAPI:dri_util.c(.data.rel+0x0): error:
undefined reference to 'driDriverAPI'
As an example, you can see that mesa_dri_drivers
also uses common/libmegadriver_stub (src/mesa/drivers/dri/Makefile.am)
The _stub part might be confusing, but
it actually provides the dri-driver shared lib constructor,
megadriver_stub_init, which will later on load the real
platform dependent part and call
__driDriverGetExtensions_<platform>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Adrian Negreanu [Fri, 6 Jun 2014 09:16:05 +0000 (12:16 +0300)]
add megadriver_stub_FILES
So that android part can also use $(megadriver_stub_FILES)
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Emil Velikov [Thu, 15 May 2014 18:32:52 +0000 (19:32 +0100)]
scons: remove dri-i915 build target
Unmaintained and broken.
Cc: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Jakob Bornecrantz <jakob@vmware.com>
Emil Velikov [Thu, 15 May 2014 21:54:48 +0000 (22:54 +0100)]
configure: error out when building opencl without LLVM
Cc: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Abdiel Janulgue [Thu, 5 Jun 2014 18:05:33 +0000 (11:05 -0700)]
i965/disasm: Properly debug negate source modifier for logical instructions
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Abdiel Janulgue [Thu, 5 Jun 2014 18:05:31 +0000 (11:05 -0700)]
i965/vec4: skip copy-propagate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell when
used with logical operations. Don't copy propagate when negate src modifier is set
and when the destination instruction is a logical op.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Abdiel Janulgue [Thu, 5 Jun 2014 18:05:29 +0000 (11:05 -0700)]
i965/fs: skip copy-propagate for logical instructions with negated src entries
The negation source modifier on src registers has changed meaning in Broadwell
when used with logical operations. Don't copy propagate when the negate src
modifier is set and the destination instruction is a logical op.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
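The rule described in the two copy-propagation commits above can be sketched
roughly as follows. This is a minimal illustration, not the driver's actual
code: the opcode enum, is_logic_op() and can_copy_propagate() are hypothetical
names, and the sketch assumes only that Broadwell corresponds to Gen8.

```c
#include <stdbool.h>

/* Hypothetical opcode set, for illustration only. */
enum opcode { OP_MOV, OP_ADD, OP_AND, OP_OR, OP_XOR, OP_NOT };

/* True for the bitwise logical operations. */
static bool is_logic_op(enum opcode op)
{
    return op == OP_AND || op == OP_OR || op == OP_XOR || op == OP_NOT;
}

/*
 * Decide whether a copy may be propagated into an instruction.
 * Assumption: on Gen8 and later, a negated source feeding a logical op
 * is not propagated because the negate modifier means something else there.
 */
static bool can_copy_propagate(int gen, enum opcode dst_op, bool src_negate)
{
    if (gen >= 8 && src_negate && is_logic_op(dst_op))
        return false;
    return true;
}
```

Under these assumptions a negated source still propagates into an ADD on Gen8,
and into an AND on Gen7, but not into an AND on Gen8.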
Abdiel Janulgue [Thu, 5 Jun 2014 18:05:28 +0000 (11:05 -0700)]
i965/fs: Refactor check for potential copy propagated instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Brian Paul [Mon, 9 Jun 2014 13:13:41 +0000 (06:13 -0700)]
docs: add link to 10.1.5 on news page
Brian Paul [Mon, 9 Jun 2014 13:10:35 +0000 (06:10 -0700)]
docs: fix version number in 10.2.1 release notes
Brian Paul [Mon, 9 Jun 2014 13:10:18 +0000 (06:10 -0700)]
docs: import the 10.1.5 release notes
Chris Forbes [Sat, 12 Apr 2014 01:21:09 +0000 (13:21 +1200)]
glsl: Validate aux storage qualifier combination with other qualifiers.
We've been allowing `centroid` and `sample` in all kinds of weird places
where they're not valid.
Insist that `sample` is combined with `in` or `out`;
and that `centroid` is combined with `in`, `out`, or the deprecated
`varying`.
V2: Validate this in a more sensible place. This does require an extra
case for uniform block members and struct members, though, since they
don't go through the normal path.
V3: Improve error message wording; eliminate redundant error generation
for inputs in VS or outputs in FS.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
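The combination rule stated in this commit can be sketched as a small
predicate. This is only an illustration of the rule as worded above, not the
GLSL compiler's implementation: the QUAL_* flags and
aux_storage_qualifiers_valid() are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical qualifier bit flags, for illustration only. */
enum {
    QUAL_IN       = 1 << 0,
    QUAL_OUT      = 1 << 1,
    QUAL_VARYING  = 1 << 2,
    QUAL_UNIFORM  = 1 << 3,
    QUAL_CENTROID = 1 << 4,
    QUAL_SAMPLE   = 1 << 5
};

/*
 * `sample` must be combined with `in` or `out`;
 * `centroid` must be combined with `in`, `out`, or the deprecated `varying`.
 */
static bool aux_storage_qualifiers_valid(unsigned quals)
{
    if ((quals & QUAL_SAMPLE) && !(quals & (QUAL_IN | QUAL_OUT)))
        return false;
    if ((quals & QUAL_CENTROID) &&
        !(quals & (QUAL_IN | QUAL_OUT | QUAL_VARYING)))
        return false;
    return true;
}
```

With this predicate, `centroid varying` is accepted while `sample varying` and
`centroid uniform` are rejected.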
Iago Toral Quiroga [Mon, 9 Jun 2014 10:00:04 +0000 (12:00 +0200)]
i965: Ensure that we end instruction streams properly.
Threads must terminate with a SEND message to a particular shared function,
such as a URB write or FB write, so the instruction stream really shouldn't
ever end in an IF/ELSE/ENDIF or similar block structure.
However, if the instruction stream (incorrectly) ends in a block structure,
the last block's end pointer will not be set, leading to a crash later on in
fs_live_variables::setup_def_use(). It is better to detect this earlier, so
assert on that.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
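The early check described above might look roughly like the sketch below. The
struct layout and the last_block_is_closed() helper are hypothetical and only
model the idea that an unterminated stream leaves the last block's end pointer
unset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's data structures. */
struct insn { int opcode; };
struct basic_block {
    const struct insn *start;
    const struct insn *end;   /* NULL if the block was never closed */
};

/* Returns true when the final block in the list has its end pointer set. */
static bool last_block_is_closed(const struct basic_block *blocks, size_t count)
{
    return count > 0 && blocks[count - 1].end != NULL;
}
```

A caller could assert on last_block_is_closed() right after building the block
list, so the failure shows up there instead of in a later analysis pass.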
Iago Toral Quiroga [Thu, 5 Jun 2014 13:03:08 +0000 (15:03 +0200)]
i965/fs: Add Gen < 6 runtime checks for line antialiasing.
In Gen < 6 the hardware generates a runtime bit that indicates whether AA data
has to be sent as part of the framebuffer write SEND message. This affects the
specific case where we have set up antialiased line rendering and we render
polygons which have one face set up in GL_LINE mode (line antialiasing
will be used) and the other one in GL_FILL mode (no line antialiasing needed).
Currently we are not doing this runtime test and instead we always send AA
data, which produces incorrect rendering of the GL_FILL face of the polygon in
the aforementioned scenario (verified on Ironlake and GM45).
In Gen4 this is likely a regression introduced with commit 098acf6c843. In
Gen5 this has never worked properly. Gen > 5 are not affected by this.
The patch fixes the problem by adding the appropriate runtime check and
adjusting the framebuffer write message accordingly in the affected
scenario.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Iago Toral Quiroga [Thu, 5 Jun 2014 13:03:06 +0000 (15:03 +0200)]
i965/fs: Let the gen < 8 generator know about runtime_check_aads_emit
In gen < 6 we need to produce conditional code based on this flag when doing
framebuffer writes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Mon, 2 Jun 2014 04:44:40 +0000 (16:44 +1200)]
docs: Mark off ARB_compressed_texture_pixel_storage
.. and add to release notes for 10.3
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chris Forbes [Tue, 20 May 2014 09:28:21 +0000 (21:28 +1200)]
mesa: Add extension enable for ARB_compressed_texture_pixel_storage
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chris Forbes [Mon, 2 Jun 2014 04:29:06 +0000 (16:29 +1200)]
mesa: Add pixel storage support for GetCompressedTexImage
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chris Forbes [Mon, 2 Jun 2014 03:50:09 +0000 (15:50 +1200)]
mesa: Compute proper strides for compressed texture pixel storage.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chris Forbes [Mon, 2 Jun 2014 03:47:47 +0000 (15:47 +1200)]
mesa: Extract computation of compressed pixel store params
This logic is reusable across CompressedTex*Image* and
GetCompressedTexImage; the strides calculated will also be needed
in the PBO validation functions to ensure that the referenced range of
bytes is valid.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
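As a rough illustration of the kind of stride computation this series refers
to, the sketch below derives row and image strides for a block-compressed
format. It is a simplified model, not Mesa's code: the struct and function
names are hypothetical, and it assumes that a nonzero row length or image
height overrides the image's own width or height, as with uncompressed pixel
storage.

```c
#include <stddef.h>

/* Hypothetical description of a block-compressed format. */
struct block_layout {
    unsigned block_width;   /* texels per block, horizontally */
    unsigned block_height;  /* texels per block, vertically */
    unsigned block_size;    /* bytes per block */
};

/* Number of blocks needed to cover `texels`, rounding up. */
static size_t blocks_across(unsigned texels, unsigned block_dim)
{
    return ((size_t)texels + block_dim - 1) / block_dim;
}

/* Bytes between consecutive rows of blocks. */
static size_t compressed_row_stride(const struct block_layout *b,
                                    unsigned width, unsigned row_length)
{
    unsigned texels = row_length ? row_length : width;
    return blocks_across(texels, b->block_width) * b->block_size;
}

/* Bytes between consecutive 2D images (slices). */
static size_t compressed_image_stride(const struct block_layout *b,
                                      unsigned width, unsigned height,
                                      unsigned row_length,
                                      unsigned image_height)
{
    unsigned rows = image_height ? image_height : height;
    return compressed_row_stride(b, width, row_length) *
           blocks_across(rows, b->block_height);
}
```

For a format with 4x4 blocks of 8 bytes, a 10-texel-wide image needs 3 blocks
per row, giving a 24-byte row stride. A PBO bounds check would use the same
strides to find the last byte referenced.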
Chris Forbes [Tue, 20 May 2014 11:41:59 +0000 (23:41 +1200)]
mesa: Emit errors for inconsistent compressed pixel store state
V2: Use bool rather than GLboolean for internal function
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chris Forbes [Tue, 20 May 2014 10:37:13 +0000 (22:37 +1200)]
mesa: Add new pixel pack/unpack state for ARB_compressed_texture_pixel_storage
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chris Forbes [Tue, 20 May 2014 09:53:02 +0000 (21:53 +1200)]
tests: Add new enum strings for ARB_compressed_texture_pixel_storage
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>