Kenneth Graunke [Tue, 15 Jan 2013 05:26:28 +0000 (21:26 -0800)]
i965: Make swizzle_to_scs non-static.
We'll need this for Broadwell code as well.
Normally, when we make things public, we add the "brw" prefix. I'm not
crazy about that in this case, since it deals with prog_instruction.h's
SWIZZLE_XYZW values, rather than the BRW_SWIZZLE_XYZW enums. However,
I can't think of a better name, and at least the comments and code make
it clear.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Kenneth Graunke [Fri, 1 Nov 2013 19:50:16 +0000 (12:50 -0700)]
i965: Move enum brw_urb_write_flags from brw_eu.h to brw_defines.h.
Broadwell code should not include brw_eu.h (since it is for Gen4-7
assembly encoding), but needs the URB write flags enum.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Kenneth Graunke [Tue, 1 Jan 2013 23:09:26 +0000 (15:09 -0800)]
i965/fs: Remove force_sechalf stack
Only Gen4 color write setup uses the force_sechalf flag, and it only
sets it on a single instruction. It also already has to get a pointer
to the instruction and manually set the saturate flag, so we may as well
just set force_sechalf the same way and avoid the complexity of a stack.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 23:00:14 +0000 (23:00 +0000)]
targets/dri: move linker flags out of configure into Automake.inc
Previous assumption was that the same set of flags can be reused
for both classic and gallium drivers. With megadriver work done
the classic drivers ended up using their own (single) instance of
the flags.
Move these into Automake.inc and rename to indicate that those
are gallium specific. Additionally silence an automake/autoconf
warning "XXX is not a standard libtool library name", due to
the parsing issues of the module tag.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 2 Nov 2013 15:33:07 +0000 (15:33 +0000)]
targets/dri: compact compiler flags into Automake.inc
Greatly reduce duplication and provide a sane minimum of
CFLAGS for all DRI targets.
Note: This commit adds VISIBILITY_CFLAGS to the following:
* freedreno
* i915
* ilo
* nouveau
* vmwgfx
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:56:33 +0000 (22:56 +0000)]
targets/xvmc: do not link against libtrace.la
In order to use the trace driver, one needs to define
GALLIUM_TRACE. Neither one of the two targets was
defining it, thus we're safe to remove libtrace.la.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:55:59 +0000 (22:55 +0000)]
targets/xvmc: consolidate lib deps into Automake.inc
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:54:58 +0000 (22:54 +0000)]
targets/xvmc: move linker flags to Automake.inc
Minimise duplication and sources of error
(eg nouveau was missing shared and no-undefined)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:53:15 +0000 (22:53 +0000)]
targets/xvmc: drop duplicated compiler flags
Automake.inc already has GALLIUM_VIDEO_CFLAGS, which
provide the essential compiler flags needed.
Note: this commit adds VISIBILITY_CFLAGS to nouveau.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 2 Nov 2013 02:02:47 +0000 (02:02 +0000)]
gallium/winsys: compact compiler flags into Automake.inc
Clean up the duplicated flags and consolidate them into a single variable.
Note: this patch adds VISIBILITY_CFLAGS to the following targets
* freedreno/drm
* i915/{drm,sw}
* nouveau/drm
* sw/fbdev
* sw/null
* sw/wayland
* sw/wrapper
* sw/xlib
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:51:56 +0000 (22:51 +0000)]
targets/vdpau: drop unused libraries from linker
In order for one to use trace, noop, rbug and/or galahad, they must
set the corresponding GALLIUM_* CFLAG.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:51:19 +0000 (22:51 +0000)]
targets/vdpau: consolidate lib deps into Automake.inc
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:50:47 +0000 (22:50 +0000)]
targets/vdpau: move linker flags to Automake.inc
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:48:44 +0000 (22:48 +0000)]
targets/vdpau: compact compiler flags into Automake.inc
Store the compiler flags into a variable, in order to minimise
flags duplication (amongst vdpau and xvmc).
Note: this commit adds VISIBILITY_CFLAGS to the nouveau target
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Fri, 1 Nov 2013 18:58:27 +0000 (18:58 +0000)]
gallium/drivers: compact compiler flags into Automake.inc
* minimise flags duplication
* distinguish between VISIBILITY C and CXX flags
* set only required flags - C and/or CXX
v2: add LLVM_CFLAGS back to AM_CFLAGS (add missing backslash)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:46:20 +0000 (22:46 +0000)]
targets/radeonsi: move drm_target.c to a common folder
... and symlink to each target.
Make automake's subdir-objects work for radeonsi.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:42:44 +0000 (22:42 +0000)]
targets/r600: move drm_target.c to common folder
... and symlink for each target.
Make automake's subdir-objects work for r600.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 9 Nov 2013 22:30:20 +0000 (22:30 +0000)]
targets/r300: move drm_target.c to common folder
... and symlink for each target.
Make automake's subdir-objects work for r300.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 26 Oct 2013 16:48:22 +0000 (17:48 +0100)]
gallium/drivers: enable automake subdir-objects
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 26 Oct 2013 16:46:17 +0000 (17:46 +0100)]
r300: move the final sources list to Makefile.sources
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 26 Oct 2013 17:11:36 +0000 (18:11 +0100)]
r300: add symlink to ralloc.c and register_allocate.c
Make automake's subdir-objects work.
Update includes.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Fri, 4 Oct 2013 12:06:37 +0000 (13:06 +0100)]
st/xvmc: enable automake subdir-objects
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 28 Sep 2013 17:11:20 +0000 (18:11 +0100)]
dri/common: move source file lists to Makefile.sources
* Allow the lists to be shared among build systems.
* Update automake and Android build systems.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Emil Velikov [Fri, 8 Nov 2013 19:08:51 +0000 (19:08 +0000)]
gtest: enable subdir-objects to prevent automake warnings
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Fri, 8 Nov 2013 18:49:30 +0000 (18:49 +0000)]
gbm: enable subdir-objects to prevent automake warnings
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 30 Sep 2013 21:13:54 +0000 (22:13 +0100)]
scons: move SConscript from gallium/targets/ to mesa/drivers/dri/common/
Store scons side by side with the other build systems.
v2: cleanup after a failed rebase
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Johannes Obermayr [Fri, 1 Nov 2013 18:38:14 +0000 (18:38 +0000)]
freedreno: compact a2xx and a3xx makefiles into parent ones
Nearly everything within the three Makefile.am's is identical.
Let's simplify things a little.
v2: Rebase and rewrite the commit message (Emil Velikov)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sun, 29 Sep 2013 12:05:07 +0000 (13:05 +0100)]
scons: drop obsolete enabled_apis variable
The variable was forgotten during the FEATURE_* removal.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 30 Sep 2013 14:58:51 +0000 (15:58 +0100)]
Android: remove unused MESA_ENABLED_APIS variable
The variable was forgotten during the FEATURE_* removal.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Fri, 4 Oct 2013 12:18:25 +0000 (13:18 +0100)]
st/egl: use *_FILE over *_SOURCES names for filelists
Silence automake warnings about missing program/library whenever
the _SOURCES suffix is used for temporary variable names.
warning: variable 'gdi_SOURCES' is defined but no program or
library has 'gdi' as canonical name (possible typo)
Acked-by: Matt Turner <mattst88@gmail.com>
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reported-by: Johannes Obermayr <johannesobermayr@gmx.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70581
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Matt Turner [Thu, 14 Nov 2013 18:36:12 +0000 (10:36 -0800)]
i965: Assert that IF with cmod is Gen6 only.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Vinson Lee [Fri, 15 Nov 2013 06:47:33 +0000 (22:47 -0800)]
i965: Add missing break in SHADER_OPCODE_GEN7_SCRATCH_READ case.
Fixes "Missing break in switch" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
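A minimal sketch of the class of defect Coverity flags here. The opcode names and latency values are hypothetical and are not taken from the i965 sources; the point is only that a case without a break falls through and has its result overwritten by the next case.

```c
/* Hypothetical opcodes, for illustration only (not the i965 enum). */
enum opcode { OP_SCRATCH_READ, OP_SCRATCH_WRITE };

/* Buggy version: OP_SCRATCH_READ has no break, so control continues
 * into the OP_SCRATCH_WRITE case and the value 50 is overwritten. */
static int latency_buggy(enum opcode op)
{
   int latency = 0;

   switch (op) {
   case OP_SCRATCH_READ:
      latency = 50;
   case OP_SCRATCH_WRITE:
      latency = 100;
      break;
   }
   return latency;
}

/* Fixed version: each case ends with its own break. */
static int latency_fixed(enum opcode op)
{
   int latency = 0;

   switch (op) {
   case OP_SCRATCH_READ:
      latency = 50;
      break;
   case OP_SCRATCH_WRITE:
      latency = 100;
      break;
   }
   return latency;
}
```

With these made-up values, the buggy function reports the write latency for a read, while the fixed one keeps the two cases separate.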
Eric Anholt [Fri, 20 Sep 2013 17:13:32 +0000 (10:13 -0700)]
mesa: Dynamically allocate the storage for program local parameters.
The array was 64kb per struct gl_program, plus we statically stored a copy
of one on disk for _mesa_DummyProgram. Given that most struct gl_programs
we generate are for GLSL shaders that don't have local parameters, this
was a waste.
Since you can store and fetch parameters beyond what the program actually
uses, we do have to do a late allocation if necessary at
GetProgramLocalParameter time.
Reduces peak memory usage in the dota2 trace I made by 76MB (4.5%)
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
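The late allocation described above can be sketched roughly as follows. This is an illustration under assumptions, not the Mesa code: struct program, program_local_param() and MAX_LOCAL_PARAMS are hypothetical stand-ins, and the limit of 4096 vec4 entries is assumed only because it matches the 64kb figure quoted in the message.

```c
#include <stdlib.h>

/* Assumed limit: 4096 entries * 4 floats * 4 bytes = 64 KiB. */
#define MAX_LOCAL_PARAMS 4096

/* Simplified, hypothetical stand-in for struct gl_program. */
struct program {
   float (*local_params)[4];   /* NULL until a parameter is first touched */
};

/* Returns a pointer to the 4 floats of parameter `index`, allocating the
 * zero-filled array on first use. Returns NULL if the index is out of
 * range or the allocation fails. */
static float *program_local_param(struct program *prog, unsigned index)
{
   if (index >= MAX_LOCAL_PARAMS)
      return NULL;

   if (!prog->local_params) {
      prog->local_params = calloc(MAX_LOCAL_PARAMS,
                                  sizeof(*prog->local_params));
      if (!prog->local_params)
         return NULL;
   }
   return prog->local_params[index];
}
```

Because both the store and the fetch paths go through the same accessor, a program that never touches its local parameters never pays for the array, and a fetch of a never-written parameter still reads zeros.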
Eric Anholt [Wed, 13 Nov 2013 21:41:28 +0000 (13:41 -0800)]
mesa: Remove PROGRAM_ENV_PARAM enum.
This has been replaced with referring to env parameters using
PROGRAM_STATE_VAR and _mesa_load_state_parameters.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Wed, 13 Nov 2013 21:38:37 +0000 (13:38 -0800)]
mesa: Remove PROGRAM_LOCAL_PARAM enum.
This has been replaced with referring to local parameters using
PROGRAM_STATE_VAR and _mesa_load_state_parameters.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Wed, 13 Nov 2013 21:36:30 +0000 (13:36 -0800)]
mesa: Update a comment about valid values of a field.
Notably, ENV and LOCAL aren't used any more (replaced by STATE_VAR), but
apparently CONSTANT is.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Thu, 7 Nov 2013 20:15:13 +0000 (12:15 -0800)]
glsl: Apply the transformation "1/rsq(x) == sqrt(x)" in opt_algebraic.
The comment was stale, because the lowering in question wasn't happening
in lower_instructions.cpp. Presumably if the lowering ever moves there,
we can plumb the lowering mask through to opt_algebraic.
total instructions in shared programs: 1618696 -> 1616810 (-0.12%)
instructions in affected programs: 243018 -> 241132 (-0.78%)
GAINED: 0
LOST: 0
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Eric Anholt [Thu, 7 Nov 2013 20:10:25 +0000 (12:10 -0800)]
glsl: Apply the transformation "(a ^^ a) -> false" in opt_algebraic.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Eric Anholt [Thu, 31 Oct 2013 16:32:42 +0000 (09:32 -0700)]
glsl: Apply the transformation "(a && a) -> a" in opt_algebraic.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Eric Anholt [Thu, 31 Oct 2013 07:10:32 +0000 (00:10 -0700)]
glsl: Apply the transformation "(a || a) -> a" in opt_algebraic.
total instructions in shared programs: 1732385 -> 1732373 (-0.00%)
instructions in affected programs: 416 -> 404 (-2.88%)
GAINED: 0
LOST: 0
(That's 4 already-short fragment shaders in dota2)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
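The three logical identities in this series, (a || a) -> a, (a && a) -> a and (a ^^ a) -> false, can be sketched on a toy expression tree. Everything here is hypothetical: struct expr, expr_equals() and simplify() are illustrative stand-ins and not the GLSL IR classes or the real opt_algebraic pass. The sketch assumes operands have no side effects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression kinds; not the GLSL IR. */
enum kind { VAR, CONST_FALSE, OR, AND, XOR };

struct expr {
   enum kind kind;
   int var;                     /* variable id, used when kind == VAR */
   const struct expr *a, *b;    /* operands, used for OR/AND/XOR */
};

/* Structural equality, standing in for the shared IR equality helpers. */
static bool expr_equals(const struct expr *x, const struct expr *y)
{
   if (x->kind != y->kind)
      return false;

   switch (x->kind) {
   case VAR:
      return x->var == y->var;
   case CONST_FALSE:
      return true;
   default:
      return expr_equals(x->a, y->a) && expr_equals(x->b, y->b);
   }
}

static const struct expr false_expr = { CONST_FALSE, 0, NULL, NULL };

/* Applies the identities at the root of the tree only:
 * (a || a) -> a, (a && a) -> a, (a ^^ a) -> false. */
static const struct expr *simplify(const struct expr *e)
{
   if ((e->kind == OR || e->kind == AND || e->kind == XOR) &&
       expr_equals(e->a, e->b))
      return e->kind == XOR ? &false_expr : e->a;

   return e;
}
```

The equality test is what makes the rewrite possible, which is presumably why the CSE equality functions were moved somewhere the algebraic pass could reach them.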
Eric Anholt [Thu, 31 Oct 2013 06:56:18 +0000 (23:56 -0700)]
glsl: Move the CSE equality functions to the ir class.
I want to reuse them in opt_algebraic.
v2: Merge in Chris Forbes's break fix.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Matt Turner [Mon, 11 Nov 2013 23:54:16 +0000 (15:54 -0800)]
clover: Remove dead file from Makefile.sources.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Kenneth Graunke [Wed, 16 Oct 2013 02:23:53 +0000 (19:23 -0700)]
i965: Rework brw_new_batch to actually start a new batch.
Previously, brw_new_batch was called just after execbuf, but before
intel_batchbuffer_reset. Essentially, it prepared for the creation of a
new batch, that wasn't yet available, and which it didn't create. This
was a bit awkward.
This patch makes brw_new_batch call intel_batchbuffer_reset as the very
first operation. This means that brw_new_batch actually creates a new
batchbuffer, and thus has it available. It brings the creation of the
new batchbuffer and BRW_NEW_BATCH flagging together into one place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Wed, 16 Oct 2013 02:21:34 +0000 (19:21 -0700)]
i965: Move cache_used_by_gpu flag setting to brw_finish_batch.
It really makes more sense here.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Ian Romanick [Mon, 11 Nov 2013 19:12:08 +0000 (11:12 -0800)]
i915: Actually enable __DRI2rendererQueryExtensionRec
More rebase fail. This code was written long before i915 and i965 were
split, so most of the code in i9[16]5/intel_screen.c only needed to
exist in one place. It looks like I fixed n-1 of those places after
rebasing on the split.
I only found this from the defined-but-not-used warning for
intelRendererQueryExtension. I noticed this while fixing the other,
related warnings.
(Note: During review, we decided to *not* pick this back to 10.0.)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
Aaron Watry [Thu, 14 Nov 2013 18:17:44 +0000 (12:17 -0600)]
radeon/llvm: Free elf_buffer after use
Prevents a memory leak.
v2: Remove null check
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Thu, 14 Nov 2013 18:17:43 +0000 (12:17 -0600)]
r600/llvm: Free binary.code/binary.config in r600_llvm_compile
radeon_llvm_compile allocates memory for binary.code, binary.config,
or neither depending on what's being done.
We need to make sure to free that memory after it's no longer needed.
v2: Don't bother checking for null before FREE()
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Thu, 14 Nov 2013 18:17:42 +0000 (12:17 -0600)]
r600/llvm: initialize radeon_llvm_binary
use memset to initialize to 0's... otherwise code_size and config_size
could be uninitialized when read later in this method.
It's also hard to do NULL checks on uninitialized pointers.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
v2: Fix indentation
CC: "10.0" <mesa-stable@lists.freedesktop.org>
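A rough sketch of why the memset matters for the cleanup in the two r600/llvm patches. struct binary, compile() and build() are hypothetical simplifications, not the radeon_llvm_binary definition or r600_llvm_compile itself; the sketch assumes a compile step that fills in only one of the two buffers.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-in for struct radeon_llvm_binary. */
struct binary {
   unsigned char *code;
   unsigned code_size;
   unsigned char *config;
   unsigned config_size;
};

/* Hypothetical compile step that produces code but no config. */
static void compile(struct binary *bin)
{
   bin->code = malloc(16);
   bin->code_size = bin->code ? 16 : 0;
}

/* Returns the total number of bytes produced by the compile step. */
static unsigned build(void)
{
   struct binary bin;
   unsigned total;

   /* All-bits-zero: sizes are 0 and, on mainstream platforms,
    * the pointers are NULL. */
   memset(&bin, 0, sizeof(bin));

   compile(&bin);
   total = bin.code_size + bin.config_size;   /* config_size is a defined 0 */

   free(bin.code);
   free(bin.config);   /* free(NULL) is a no-op, so no NULL check is needed */
   return total;
}
```

Without the memset, config and config_size would hold indeterminate values, and both the size read and the free() of config would be undefined behavior.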
Brian Paul [Fri, 15 Nov 2013 17:25:19 +0000 (10:25 -0700)]
svga: remove unused vars in svga_hwtnl_simple_draw_range_elements()
And simplify the code.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 14 Nov 2013 20:41:19 +0000 (13:41 -0700)]
svga: print warning for unsupported indirect dest reg indexing
For DX9-level shaders, there's only limited support for indirect
indexing of registers (with the loop counter register, not the
general address register.)
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 14 Nov 2013 20:33:52 +0000 (13:33 -0700)]
svga: mark dest image as defined in svga_surface_copy()
After we blit/copy to a dest texture image we need to mark it as
being defined. This fixes broken mipmap generation for quite a
few texture formats. Mipgen involves making texture views and
svga_texture_view_surface() skips texture images that are undefined.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 13 Nov 2013 18:26:15 +0000 (11:26 -0700)]
svga: do primitive trimming in translate_indices()
The index translation code expects the number of indexes to be
consistent with the primitive type (ex: a multiple of 3 for
PIPE_PRIM_TRIANGLES). If it's not, we can write out of bounds
in the destination buffer.
Fixes failed assertions in the pipebuffer debug code found with
Piglit primitive-restart-draw-mode test.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
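Primitive trimming of the kind described above can be sketched as follows. The enum and trim_index_count() are hypothetical and cover only a few primitive types; they are not the gallium helpers the driver actually calls.

```c
/* Hypothetical subset of primitive types, loosely mirroring PIPE_PRIM_*. */
enum prim {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_LINE_STRIP,
   PRIM_TRIANGLES,
   PRIM_TRIANGLE_STRIP
};

/* Returns the largest index count <= count that forms only whole
 * primitives of the given type. */
static unsigned trim_index_count(enum prim prim, unsigned count)
{
   switch (prim) {
   case PRIM_POINTS:
      return count;
   case PRIM_LINES:
      return count - count % 2;
   case PRIM_LINE_STRIP:
      return count >= 2 ? count : 0;
   case PRIM_TRIANGLES:
      return count - count % 3;
   case PRIM_TRIANGLE_STRIP:
      return count >= 3 ? count : 0;
   }
   return 0;
}
```

For example, 8 indices drawn as triangles are trimmed to 6, so a translation loop that consumes three indices per step never touches a partial primitive.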
Brian Paul [Wed, 13 Nov 2013 18:24:41 +0000 (11:24 -0700)]
indices: add comments, assertions in u_indices.c file
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Tue, 12 Nov 2013 22:09:44 +0000 (15:09 -0700)]
mesa: remove duplicated prototypes in varray.h
Aaron Watry [Wed, 6 Nov 2013 22:49:24 +0000 (16:49 -0600)]
gallium/pipe_loader: un-reference udev resources when we're done with them.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:23 +0000 (16:49 -0600)]
radeonsi/compute: Dispose of LLVM module after compiling kernels
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:22 +0000 (16:49 -0600)]
radeonsi/compute: Free program and program.kernels on shutdown
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:21 +0000 (16:49 -0600)]
radeon/llvm: Free created llvm memory buffer
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:20 +0000 (16:49 -0600)]
radeon/llvm: Free libelf resources
v2: Fix indentation
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Wed, 6 Nov 2013 22:49:19 +0000 (16:49 -0600)]
radeon/llvm: fix spelling error
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Thu, 11 Apr 2013 14:37:55 +0000 (10:37 -0400)]
clover: Support multiple devices in clCreateContextFromType() v2
v2:
- Use clGetDeviceIDs to query devices.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Paul Berry [Tue, 29 Oct 2013 21:41:32 +0000 (14:41 -0700)]
glsl: Rework interface block linking.
Previously, when doing intrastage and interstage interface block
linking, we only checked the interface type; this prevented us from
catching some link errors.
We now check the following additional constraints:
- For intrastage linking, the presence/absence of interface names must
match.
- For shader ins/outs, the interface names themselves must match when
doing intrastage linking (note: it's not clear from the spec whether
this is necessary, but Mesa's implementation currently relies on
it).
- Array vs. nonarray must be consistent, taking into account the
special rules for vertex-geometry linkage.
- Array sizes must be consistent (exception: during intrastage
linking, an unsized array matches a sized array).
Note: validate_interstage_interface_blocks currently handles both
uniforms and in/out variables. As a result, if all three shader types
are present (VS, GS, and FS), and a uniform interface block is
mentioned in the VS and FS but not the GS, it won't be validated. I
plan to address this in later patches.
Fixes the following piglit tests in spec/glsl-1.50/linker:
- interface-blocks-vs-fs-array-size-mismatch
- interface-vs-array-to-fs-unnamed
- interface-vs-unnamed-to-fs-array
- intrastage-interface-unnamed-array
v2: Simplify logic in intrastage_match() for handling array sizes.
Make extra_array_level const. Use an unnamed temporary
interface_block_definition in validate_interstage_interface_blocks()'s
first call to definitions->store().
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
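A much-simplified sketch of the intrastage constraints listed above. struct iface_def and intrastage_match_sketch() are hypothetical; the real intrastage_match() also compares the interface type and the names themselves, which this sketch leaves out.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of an interface block definition. */
struct iface_def {
   bool has_name;     /* whether the block has an interface name */
   int array_size;    /* -1: not an array, 0: unsized array, >0: sized */
};

/* Checks three of the intrastage rules: presence/absence of the
 * interface name, array vs. non-array, and array size consistency. */
static bool intrastage_match_sketch(const struct iface_def *a,
                                    const struct iface_def *b)
{
   if (a->has_name != b->has_name)
      return false;

   /* Array vs. non-array must be consistent. */
   if ((a->array_size == -1) != (b->array_size == -1))
      return false;

   /* Two sized arrays must agree; an unsized array matches a sized one. */
   if (a->array_size > 0 && b->array_size > 0 &&
       a->array_size != b->array_size)
      return false;

   return true;
}
```

The unsized-array exception applies only to intrastage linking in this sketch, matching the wording of the commit message; interstage linking would need its own rules for vertex-geometry linkage.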
Paul Berry [Tue, 12 Nov 2013 18:55:18 +0000 (10:55 -0800)]
i965: Fix vertical alignment for multisampled buffers.
From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit
Size):
j [vertical alignment] = 4 for any render target surface is
multisampled (4x)
From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most
messages), under the "Surface Vertical Alignment" heading:
This field is intended to be set to VALIGN_4 if the surface was
rendered as a depth buffer, for a multisampled (4x) render target,
or for a multisampled (8x) render target, since these surfaces
support only alignment of 4.
Back in 2012 when we added multisampling support to the i965 driver,
we forgot to update the logic for computing the vertical alignment, so
we were often using a vertical alignment of 2 for multisampled
buffers, leading to subtle rendering errors.
Note that the specs also require a vertical alignment of 4 for all
Y-tiled render target surfaces; I plan to address that in a separate
patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
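The effect of the alignment choice can be sketched as below. vertical_alignment() is a hypothetical reduction of the rule quoted from the PRMs (4 for depth buffers and multisampled render targets) with 2 assumed as the fallback; the driver's real logic handles more cases than this.

```c
#include <stdbool.h>

/* Rounds value up to a multiple of alignment (must be a power of two). */
static unsigned align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Simplified rule: depth buffers and multisampled render targets
 * need a vertical alignment of 4; everything else is assumed to use 2. */
static unsigned vertical_alignment(bool is_depth, unsigned num_samples)
{
   if (is_depth || num_samples > 1)
      return 4;
   return 2;
}
```

With a height of 10 rows, the old behavior would leave a 4x multisampled surface at 10 rows, while the corrected alignment pads it to 12, which changes where later rows and miplevels land in memory.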
Paul Berry [Wed, 13 Nov 2013 22:24:09 +0000 (14:24 -0800)]
main: Fix MaxUniformComponents for geometry shaders.
For both vertex and fragment shaders we default MaxUniformComponents
to 4 * MAX_UNIFORMS. It makes sense to do this for geometry shaders
too; if back-ends have different limits they can override them as
necessary.
Fixes piglit test:
spec/glsl-1.50/built-in constants/gl_MaxGeometryUniformComponents
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
José Fonseca [Fri, 15 Nov 2013 15:42:02 +0000 (15:42 +0000)]
tools/trace: Several bugfixes/improvements to dump_state.py
- Don't crash with user memory pointers.
- Support old bind_*_sampler_* methods. Useful when comparing dumps
from old branches.
- Misc.
José Fonseca [Fri, 15 Nov 2013 15:32:33 +0000 (15:32 +0000)]
trace: Dump user_buffer members.
Fredrik Höglund [Mon, 11 Nov 2013 17:54:15 +0000 (18:54 +0100)]
mesa: Fix derived vertex state not being updated in glCallList()
AEcontext::NewState is not always set when the vertex array state
is changed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71492
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Alex Deucher [Tue, 24 Sep 2013 16:13:42 +0000 (12:13 -0400)]
radeonsi: add Hawaii pci ids
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 24 Sep 2013 16:12:29 +0000 (12:12 -0400)]
radeonsi: add support for Hawaii asics (v2)
Update additional register fields.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Vinson Lee [Fri, 15 Nov 2013 06:33:56 +0000 (22:33 -0800)]
i965: Initialize schedule_node::delay.
Fixes "Uninitialized scalar field" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Alexander von Gluck IV [Wed, 13 Nov 2013 23:51:00 +0000 (23:51 +0000)]
haiku/swrast: Inherit gl_config, fix flush
* Inherit gl_context so we always have access to it
* Thanks curro for the idea.
* Last Haiku candidate for 10.0.0
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Roland Scheidegger [Thu, 14 Nov 2013 15:48:30 +0000 (15:48 +0000)]
llvmpipe: (trivial) fix more fallout from the setup cleanup.
Oops... Should have done some more testing.
Roland Scheidegger [Thu, 14 Nov 2013 14:42:28 +0000 (14:42 +0000)]
llvmpipe: (trivial) fix misplaced bld context assignment.
Should fix polygon offset crashes...
José Fonseca [Thu, 14 Nov 2013 14:02:24 +0000 (14:02 +0000)]
gallivm: Compile flag to debug TGSI execution through printfs.
It is similar to tgsi_exec.c's DEBUG_EXECUTION compile flag.
I had prototyped this for a while while debugging an issue, but finally
cleaned this up and added a few more bells and whistles.
v2: Use '$' as marker; better output. Thanks to Brian, Zack and Roland
for their reviews.
Here is a sample output.
CONST[0].x = 0.00625000009 0.00625000009 0.00625000009 0.00625000009
CONST[0].y = -0.00714285718 -0.00714285718 -0.00714285718 -0.00714285718
CONST[0].z = -1 -1 -1 -1
CONST[0].w = 1 1 1 1
IN[0].x = 143.5 175.5 175.5 143.5
IN[0].y = 123.5 123.5 155.5 155.5
IN[0].z = 0 0 0 0
IN[0].w = 1 1 1 1
$ 1: RCP TEMP[0].w, IN[0].wwww
TEMP[0].w = 1 1 1 1
$ 2: MAD TEMP[0].xy, IN[0], CONST[0], CONST[0].zwzw
TEMP[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
TEMP[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
$ 3: MUL OUT[0].xy, TEMP[0], TEMP[0].wwww
OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
$ 4: MUL OUT[0].z, IN[0].zzzz, TEMP[0].wwww
OUT[0].z = 0 0 0 0
$ 5: MOV OUT[0].w, TEMP[0]
OUT[0].w = 1 1 1 1
$ 6: END
OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
OUT[0].z = 0 0 0 0
OUT[0].w = 1 1 1 1
Roland Scheidegger [Thu, 14 Nov 2013 12:21:02 +0000 (12:21 +0000)]
softpipe: (trivial) fix debug code
The debug printfs wouldn't actually compile when enabled, so kill them off,
insert a new one in another place, and make sure it keeps compiling
by enclosing it in an if-0 clause.
Roland Scheidegger [Tue, 12 Nov 2013 20:02:15 +0000 (20:02 +0000)]
llvmpipe: clean up state setup code a bit
In particular get rid of home-grown vector helpers which didn't add much.
And while here fix formatting a bit. No functional change.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Mon, 11 Nov 2013 14:29:25 +0000 (14:29 +0000)]
gallivm,llvmpipe: fix float->srgb conversion to handle NaNs
d3d10 requires us to convert NaNs to zero for any float->int conversion.
We don't really do that, but it mostly seems to work. In particular I suspect the
very common float->unorm8 path only really passes because it relies on sse2
pack intrinsics which just happen to work by luck for NaNs (float->int
conversion in hw gives integer indeterminate value, which just happens to be
-0x80000000 hence gets converted to zero in the end after pack intrinsics).
However, float->srgb didn't get so lucky, because we need to clamp before
blending and clamping resulted in NaN behavior being undefined (and actually
got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp
with defined nan behavior as we can handle the NaN for free this way.
I suspect there are more bugs lurking in this area (e.g. converting floats to
snorm) as we don't really use defined NaN behavior everywhere but this seems
to be good enough.
While here respecify nan behavior modes a bit, in particular the return_second
mode didn't really do what we wanted. From the caller's perspective, we really
wanted to say we need the non-nan result, but we already know the second arg
isn't a NaN. So we use this now instead, which means that cpu architectures
which actually implement min/max by always returning non-nan (that is adhering
to ieee754-2008 rules) don't need to bend over backwards for nothing.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Ian Romanick [Mon, 11 Nov 2013 19:08:26 +0000 (11:08 -0800)]
dri: Change value param to unsigned
This silences some compiler warnings in i915 and i965. See also
75982a5.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Mon, 11 Nov 2013 18:57:55 +0000 (10:57 -0800)]
i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiB
Systems with little physical memory installed will report less than
2GiB, and some systems may (hypothetically?) have a larger address space
for the GPU. My IVB still reports 1534.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ian Romanick [Mon, 11 Nov 2013 18:55:34 +0000 (10:55 -0800)]
i915: Use drm_intel_get_aperture_sizes instead of drmAgpSize
Send the zombie back to the grave before it infects the townsfolk.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Alexander Monakov [Sun, 3 Nov 2013 21:34:32 +0000 (01:34 +0400)]
i965: implement blit path for PBO glDrawPixels
This patch implements accelerated path for glDrawPixels from a PBO in
i965. The code follows what intel_pixel_read, intel_pixel_copy,
intel_pixel_bitmap and intel_tex_image are doing. Piglit quick.tests
show no regressions. In my testing on IVB, performance improvement is
huge (about 30x, didn't measure exactly) since generic path goes via
_mesa_unpack_color_span_float, memcpy, extract_float_rgba.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Wed, 13 Nov 2013 17:06:23 +0000 (10:06 -0700)]
docs: fill in md5 checksums for 9.2.3 release
Brian Paul [Wed, 13 Nov 2013 17:00:46 +0000 (10:00 -0700)]
docs: fix 9.2.2 -> 9.2.3 typos
Alexander von Gluck IV [Wed, 13 Nov 2013 05:39:19 +0000 (05:39 +0000)]
haiku: add swrast driver
* This is pretty small and upkeep should be minimal.
* Currently fully working.
* Candidate for 10.0.0 branch
Acked-by: Brian Paul <brianp@vmware.com>
Carl Worth [Wed, 13 Nov 2013 15:31:42 +0000 (07:31 -0800)]
docs: Import 9.2.3 release notes, add news item.
Kristian Høgsberg [Tue, 12 Nov 2013 00:35:35 +0000 (16:35 -0800)]
dri: Remove redundant createNewContext function from __DRIimageDriverExtension
createContextAttribs is a superset of what createNewContext provides.
Also remove the function typedef, since createNewContext is deprecated
and no longer used in multiple interfaces.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kristian Høgsberg [Sat, 9 Nov 2013 06:10:36 +0000 (22:10 -0800)]
wayland: Use __DRIimage based getBuffers implementation when available
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kristian Høgsberg [Sat, 9 Nov 2013 06:06:51 +0000 (22:06 -0800)]
gbm: Add support for __DRIimage based getBuffers when available
This lets us allocate color buffers as __DRIimages and pass them into
the driver instead of having to create a __DRIbuffer with the flink
that requires.
With this patch, we can now run gbm on render-nodes. A render-node is a
drm device that supports neither modesetting nor the legacy DRI ioctls.
flink is also not supported, but now that gbm doesn't need flink, we can
run piglit on headless gbm or headless GPGPU.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ander Conselvan de Oliveira [Tue, 12 Nov 2013 12:47:08 +0000 (14:47 +0200)]
dri/i915, dri/i965: Fix support for planar images
Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that
moved the conversion from dri_format to the mesa format made it
impossible to allocate an image with that format.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Eric Anholt [Thu, 7 Nov 2013 01:38:23 +0000 (17:38 -0800)]
i965/fs: Try a different pre-scheduling heuristic if the first spills.
Since LIFO fails on some shaders in one particular way, and non-LIFO
systematically fails in another way on different kinds of shaders, try
them both, and pick whichever one successfully register allocates first.
Slightly prefer non-LIFO in case we produce extra dependencies in register
allocation, since it should start out with fewer stalls than LIFO.
This is madness, but I haven't come up with another way to get unigine
tropics to not spill while keeping other programs from spilling and
retaining the non-unigine performance wins from texture-grf.
total instructions in shared programs: 1626728 -> 1626288 (-0.03%)
instructions in affected programs: 1015 -> 575 (-43.35%)
GAINED: 50
LOST: 0
Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
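For illustration, the "try them both, and pick whichever one successfully
register allocates first" idea from the commit above can be sketched in C
roughly as follows. This is a minimal sketch and not the driver's code:
presched_mode, try_alloc_fn, allocate_with_fallback and the two demo_*
callbacks are hypothetical names made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-scheduling heuristics; names are illustrative only. */
enum presched_mode { PRESCHED_NON_LIFO, PRESCHED_LIFO };

/* Hypothetical callback: schedule with the given heuristic, then attempt
 * register allocation. Returns true if allocation succeeded without spilling. */
typedef bool (*try_alloc_fn)(enum presched_mode mode, void *shader);

/* Try each heuristic in preference order and keep the first one that
 * register allocates. Non-LIFO comes first, matching the stated preference. */
bool allocate_with_fallback(try_alloc_fn try_alloc, void *shader,
                            enum presched_mode *chosen)
{
    static const enum presched_mode order[] = {
        PRESCHED_NON_LIFO,
        PRESCHED_LIFO,
    };

    for (size_t i = 0; i < sizeof order / sizeof order[0]; i++) {
        if (try_alloc(order[i], shader)) {
            *chosen = order[i];
            return true;
        }
    }
    return false; /* neither heuristic fit; the caller would have to spill */
}

/* Demo stand-ins for shaders that fit under only one heuristic, or none. */
bool demo_only_lifo_fits(enum presched_mode mode, void *shader)
{
    (void)shader;
    return mode == PRESCHED_LIFO;
}

bool demo_nothing_fits(enum presched_mode mode, void *shader)
{
    (void)mode;
    (void)shader;
    return false;
}
```

With demo_only_lifo_fits the helper falls through to the LIFO heuristic;
with demo_nothing_fits it reports failure so the caller can spill.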
Eric Anholt [Thu, 7 Nov 2013 01:43:25 +0000 (17:43 -0800)]
i965/fs: Do instruction pre-scheduling just before register allocation.
Long ago, the HW_REG usages in assign_curb/urb_setup() were scheduling
barriers, so we had to run the scheduler before them in order for it to be
able to do basically anything. Now that that's fixed, we can delay the
scheduling until we go to allocate (which will make the next change less
scary).
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Wed, 6 Nov 2013 07:30:33 +0000 (23:30 -0800)]
i965/fs: Ignore actual latency pre-reg-alloc.
We care about depth-until-program-end, as a proxy for "make sure I
schedule those early instructions that open up the other things that can
make progress while keeping register pressure low", not actual latency
(since we're relying on the post-register-alloc scheduling to actually
schedule for the hardware).
total instructions in shared programs: 1609931 -> 1609931 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 55
LOST: 43
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
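As a rough illustration of "depth-until-program-end" with actual latency
ignored, the sketch below gives every instruction a unit cost and computes
the longest dependency chain from each instruction to the end of the
program. It assumes a simplified model that is not taken from the driver:
dep_edge and compute_unit_depth are hypothetical names, and every
dependency edge is assumed to point forward in program order (from < to).

```c
/* Hypothetical dependency edge: instruction `to` depends on `from`.
 * Assumes from < to, i.e. edges only point forward in program order. */
struct dep_edge {
    int from;
    int to;
};

/* depth[i] becomes the number of instructions on the longest dependency
 * chain starting at i, counting each instruction as one unit rather than
 * using its real latency. */
void compute_unit_depth(int num_insts, const struct dep_edge *edges,
                        int num_edges, int *depth)
{
    /* Walk backwards so every successor's depth is already known. */
    for (int i = num_insts - 1; i >= 0; i--) {
        depth[i] = 1;
        for (int e = 0; e < num_edges; e++) {
            if (edges[e].from == i && depth[edges[e].to] + 1 > depth[i])
                depth[i] = depth[edges[e].to] + 1;
        }
    }
}
```

A scheduler built on this would prefer candidates with the largest depth,
since those unblock the most downstream work.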
Eric Anholt [Tue, 5 Nov 2013 06:56:33 +0000 (22:56 -0800)]
i965/fs: Fix message setup for SIMD8 spills.
In the SIMD16 spilling changes, I replaced a "1" in the spill path with
"mlen", but obviously it wasn't mlen before because spills have the g0
header along with the payload. The interface I was trying to use was
asking for how many physical regs we're writing, so we're looking for "1"
or "2".
I'm guessing this actually passed piglit because the high 8 bits of the
execution mask in SIMD8 mode are all 0s.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
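The distinction between the message length and the number of physical
registers written can be sketched as below. This is only an illustration
under stated assumptions: the function names are hypothetical, and it
assumes one physical register holds 8 32-bit channels, so a SIMD8 spill
writes 1 register and a SIMD16 spill writes 2, with one extra header
register (g0) counted in the message length.

```c
/* Assumption: one physical register holds 8 32-bit channels. */
int spill_regs_written(int dispatch_width)
{
    return dispatch_width / 8;          /* 1 for SIMD8, 2 for SIMD16 */
}

/* The message also carries the g0 header, so its length is one more than
 * the payload. This is why mlen is not the number of registers written. */
int spill_message_length(int dispatch_width)
{
    return 1 + spill_regs_written(dispatch_width);
}
```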
Eric Anholt [Mon, 14 Oct 2013 18:38:09 +0000 (11:38 -0700)]
i965/fs: Prefer things we know reduce reg pressure when pre-scheduling.
Previously, the best thing we had was to schedule the things unblocked by
the last chosen instruction, in the hope that it would be consuming two
values at the end of their live intervals while only producing one new
value. But that's just a guess, and we can count register uses to know
when an instruction would (almost surely) reduce register pressure.
The only failure mode I know of in this new dominant heuristic is that
inside of a loop when scheduling the iterator (for example), choosing the
last use of the iterator doesn't actually reduce the live interval of the
iterator. But it doesn't seem to matter in shader-db:
total instructions in shared programs: 1618700 -> 1618700 (0.00%)
instructions in affected programs: 0 -> 0
GAINED: 13
LOST: 0
Note: The new functions are made virtual because I expect we'll soon lift
the pre-regalloc scheduling heuristic over to the vec4 backend.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
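A minimal sketch of the use-counting idea, for illustration only: an
instruction is credited for every source register whose remaining reads it
fully consumes, and debited if it creates a new live value. The struct and
function names here are hypothetical, and the model (one destination, up
to three sources, no loops) is a simplification rather than the driver's
implementation.

```c
#include <stdbool.h>

#define MAX_SRCS 3

/* Hypothetical, simplified instruction: one destination, up to three sources. */
struct sched_inst {
    int dst;               /* virtual register written, or -1 for none */
    int srcs[MAX_SRCS];    /* virtual registers read */
    int num_srcs;
};

/* Estimate how many live values scheduling `inst` would retire.
 * remaining_uses[r] counts reads of r not yet scheduled; live[r] says
 * whether r already holds a live value. A positive result means the
 * instruction would (almost surely) reduce register pressure. */
int pressure_benefit(const struct sched_inst *inst,
                     const int *remaining_uses, const bool *live)
{
    int benefit = 0;

    for (int i = 0; i < inst->num_srcs; i++) {
        int reg = inst->srcs[i];
        int reads = 0;
        bool first = true;

        /* Count how often this instruction reads reg, and make sure each
         * distinct register is only credited once. */
        for (int j = 0; j < inst->num_srcs; j++) {
            if (inst->srcs[j] == reg) {
                reads++;
                if (j < i)
                    first = false;
            }
        }

        /* These are the last reads of reg: its live interval ends here. */
        if (first && remaining_uses[reg] == reads)
            benefit++;
    }

    /* Writing a register that is not live yet starts a new live interval. */
    if (inst->dst >= 0 && !live[inst->dst])
        benefit--;

    return benefit;
}
```

An add that reads two registers for the last time and writes one new
register scores +1, which matches the "consuming two values while only
producing one" case described above.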
Eric Anholt [Wed, 6 Nov 2013 00:24:58 +0000 (16:24 -0800)]
i965: Fix undefined value usage in ABO setup.
Fixes a compiler warning.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Anholt [Thu, 31 Oct 2013 17:14:17 +0000 (10:14 -0700)]
i965: Add a warning if something ever hits a bug I noticed.
We'd have to map the VBO and rewrite things to a lower stride to fix it.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ben Skeggs [Tue, 12 Nov 2013 07:58:18 +0000 (17:58 +1000)]
nvc0: release 3d bufctx after drawing
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Francisco Jerez [Tue, 12 Nov 2013 19:14:20 +0000 (11:14 -0800)]
clover: Fix the const variant of adaptor_range::end to deal with mismatching range sizes.
Fixes infinite loop in find_grid_optimal_factor() in cases where the
user specifies a grid size with fewer dimensions than the device
supports.
Reported-by: Tom Stellard <thomas.stellard@amd.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
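The underlying hazard is general: when two ranges of different lengths are
walked in lockstep, the end position has to come from the shorter one, or
the walk never reaches it. The C analogue below is only a sketch of that
idea; it is not clover's C++ code, and zip_end and clamp_to_limits are
hypothetical names.

```c
#include <stddef.h>

/* End index for walking two ranges in lockstep: stop at the shorter one. */
size_t zip_end(size_t len_a, size_t len_b)
{
    return len_a < len_b ? len_a : len_b;
}

/* Clamp each value to the limit at the same index. The two arrays may have
 * different lengths, e.g. a 2-dimensional grid checked against limits for
 * 3 dimensions. Returns the number of pairs visited. */
size_t clamp_to_limits(size_t *values, size_t num_values,
                       const size_t *limits, size_t num_limits)
{
    size_t end = zip_end(num_values, num_limits);

    for (size_t i = 0; i < end; i++) {
        if (values[i] > limits[i])
            values[i] = limits[i];
    }
    return end;
}
```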
Roland Scheidegger [Mon, 11 Nov 2013 15:11:59 +0000 (15:11 +0000)]
draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset
Since we explicitly require an integer input, we should avoid using exp2 math
(even if we were using optimized versions) and manipulate the float exponent
directly, which turns the exp2 into an int sub (plus some casts).
v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
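To illustrate what "exponent manipulation instead of exp2" can look like,
here is a small standalone C sketch. It assumes IEEE 754 single-precision
floats and inputs and results in the normal (non-denormal) range; the
function names are hypothetical and this is not the draw/llvmpipe code
itself. Signed integer math is used for the exponent field.

```c
#include <stdint.h>
#include <string.h>

/* Build 2^n for an integer n in [-126, 127] by writing the biased
 * exponent directly, with no call to exp2(). */
float exp2_int(int n)
{
    uint32_t bits = (uint32_t)(n + 127) << 23;
    float result;

    memcpy(&result, &bits, sizeof result);
    return result;
}

/* Take the power of two at or below |x| and scale it by 2^-k using an
 * integer subtract on the exponent field. Assumes x is a normal float,
 * k >= 0, and the result stays in the normal range. */
float pow2_floor_scaled(float x, int k)
{
    uint32_t bits;
    int32_t exp_field;
    float result;

    memcpy(&bits, &x, sizeof bits);
    exp_field = (int32_t)(bits & 0x7f800000u);  /* drop sign and mantissa */
    exp_field -= k * (1 << 23);                 /* the "int sub" */
    bits = (uint32_t)exp_field;
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

For example, pow2_floor_scaled(6.0f, 1) yields 2.0f: the power of two at
or below 6 is 4, and subtracting 1 from the exponent halves it.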
Cyril Brulebois [Tue, 12 Nov 2013 09:51:00 +0000 (02:51 -0700)]
gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detection
Thanks to Pino Toscano. Patch from Debian package.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>