Tapani Pälli [Thu, 31 May 2012 06:32:45 +0000 (09:32 +0300)]
automake: use -m32 in CCASFLAGS when using --enable-32-bit
This fixes the libdricore directory build with --enable-32-bit on an x86_64 system.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tom Stellard [Fri, 1 Jun 2012 18:49:03 +0000 (14:49 -0400)]
radeon/llvm: Fix VTX_READ patterns
The VTX_READ instructions were using the ADDRParam ComplexPattern which
allows a load instruction's offset to be a register, but VTX_READ
instructions can only handle an immediate offset.
Also, the load_param pattern fragment had an erroneous return true;
statement that was causing it to match the wrong load instructions.
Tom Stellard [Fri, 1 Jun 2012 20:08:41 +0000 (16:08 -0400)]
radeon/llvm: Emit 2 bytes for vertex fetch offsets
Tom Stellard [Fri, 1 Jun 2012 20:10:06 +0000 (16:10 -0400)]
radeon/llvm: Only use indirect (vertex fetch) parameters for kernels
Kernel parameters can only be retrieved via vertex fetches. Direct
parameters (i.e. parameters stored in the constant buffer) are not
supported yet.
Kenneth Graunke [Thu, 31 May 2012 23:19:17 +0000 (16:19 -0700)]
intel: Change vendor string to "Intel Open Source Technology Center".
Tungsten Graphics has not existed for several years, and the majority of
ongoing development and support is done by Intel. I chose to include
"Open Source Technology Center" to distinguish it from, say, the closed
source Windows OpenGL driver.
The one downside to this patch is that applications that pattern match
against "Intel" may start applying workarounds meant for the Windows
driver. However, it does seem like the right thing to do.
This does change oglconform behavior.
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Acked-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Fri, 18 May 2012 23:25:31 +0000 (16:25 -0700)]
glsl: Remove spurious printf messages
These look like debug messages from the switch-statement development.
NOTE: This is a candidate for the 8.0 release branch.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tom Stellard [Fri, 1 Jun 2012 00:35:18 +0000 (20:35 -0400)]
radeon/llvm: Eliminate CFGStructurizer dependency on AMDIL instructions
Add some hooks to the R600 and SI InstrInfo and RegisterInfo classes, so
that the CFGStructurizer pass can run without relying on AMDIL
instructions.
Tom Stellard [Wed, 30 May 2012 23:23:39 +0000 (19:23 -0400)]
radeon/llvm: Change prefix on tablegen files to AMDGPU
Tom Stellard [Thu, 31 May 2012 20:02:37 +0000 (16:02 -0400)]
radeon/llvm: Remove dead code from the R600LowerInstructions pass
Tom Stellard [Thu, 31 May 2012 19:58:17 +0000 (15:58 -0400)]
radeon/llvm: Remove AMDIL GLOBALSTORE* instructions
Tom Stellard [Thu, 31 May 2012 18:03:29 +0000 (14:03 -0400)]
radeon/llvm: Remove AMDIL GLOBALLOAD* instructions
Adam Rak [Wed, 30 Nov 2011 21:20:41 +0000 (22:20 +0100)]
r600g: compute support for evergreen
Tom Stellard:
- Updated for gallium interface changes
- Fixed a few bugs:
+ Set the loop counter
+ Calculate the correct number of pipes
- Added hooks into the LLVM compiler
Tom Stellard [Tue, 24 Apr 2012 16:44:53 +0000 (12:44 -0400)]
clover: Add function for building a clover::module for non-TGSI targets v6
v2:
-Separate IR type and LLVM triple
-Do the OpenCL C->LLVM IR and linking steps for all PIPE_SHADER_IR
types.
v3:
- Coding style fixes
- Removed compatibility code for LLVM < 3.1
- Split build_module_llvm() into three functions:
compile(), link(), and build_module_llvm()
v4:
- Use struct pipe_compute_program
v5:
- Don't malloc memory for struct pipe_llvm_program
v6:
- Fix serialization of llvm bytecode
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Fri, 25 May 2012 12:20:06 +0000 (08:20 -0400)]
gallium: Add struct pipe_llvm_program_header v3
This structure is used as a header that precedes LLVM bytecode programs
that are passed to the drivers.
v2:
- s/pipe_compute_program/pipe_llvm_program/
v3:
- Rename to struct pipe_llvm_program_header
- Drop the char * prog member
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
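The entry above describes a header that precedes an LLVM bytecode program handed to a driver. As a rough illustration only, the following Python sketch frames a byte string with a length header and reads it back. The single 32-bit little-endian length field and the function names are assumptions made for the example; the log does not list the members of struct pipe_llvm_program_header.

```python
import struct

# Assumed layout for illustration: one 32-bit little-endian length field,
# immediately followed by the bytecode itself.
HEADER_FORMAT = "<I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def pack_program(bytecode):
    """Prefix `bytecode` with a header holding its length in bytes."""
    return struct.pack(HEADER_FORMAT, len(bytecode)) + bytecode

def unpack_program(blob):
    """Read the header, then return exactly the bytecode it describes."""
    (num_bytes,) = struct.unpack_from(HEADER_FORMAT, blob, 0)
    return blob[HEADER_SIZE:HEADER_SIZE + num_bytes]
```

Because the header carries the length, a consumer can slice out the program without a separate pointer member, which is consistent with the "Drop the char * prog member" note in v3.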
Tom Stellard [Fri, 25 May 2012 12:15:02 +0000 (08:15 -0400)]
clover: Remove target argument from compile_program_tgsi()
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Tue, 24 Apr 2012 16:36:34 +0000 (12:36 -0400)]
clover: Add constructors to some of the module classes v3
This is for the llvm code that can't use extended initializers.
v2:
- Use const references for vector arguments
- Move constructor defs before data members
- Initialize all values in the default constructors
v3:
- Fix typo
Tom Stellard [Tue, 24 Apr 2012 14:42:56 +0000 (10:42 -0400)]
clover: Add necessary flags to libclllvm_la_CXXFLAGS
$(LLVM_CFLAGS) for LLVM defines
-DLIBCLC_PATH for libclc path
-DCLANG_RESOURCE_DIR for clang includes
$(DEFINES) for -DHAVE_LLVM
Tom Stellard [Wed, 2 May 2012 15:06:13 +0000 (11:06 -0400)]
clover: Link to the necessary LLVM and Clang libs
Tom Stellard [Tue, 24 Apr 2012 14:34:57 +0000 (10:34 -0400)]
configure.ac: Add variables LLVM_CPPFLAGS and LLVM_LIBDIR
Tom Stellard [Mon, 12 Mar 2012 17:53:20 +0000 (13:53 -0400)]
configure.ac: Add option for libclc path
Tom Stellard [Mon, 23 Apr 2012 16:09:08 +0000 (12:09 -0400)]
clover: Add a function for retrieving a device's preferred ir v3
A device now has two functions for getting information about the IR
it needs to return.
ir_format() => returns the preferred IR
ir_target() => returns the triple for the target that is understood by
clang/llvm.
v2:
- renamed ir_target() to ir_format()
- renamed llvm_triple() to ir_target()
v3:
- Remove unnecessary include
- Do proper conversion from std::vector<char> to std::string
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Francisco Jerez [Fri, 23 Mar 2012 00:40:40 +0000 (01:40 +0100)]
gallium/compute: Add PIPE_COMPUTE_CAP_IR_TARGET v4
v2: Tom Stellard
- Update CAP description
v3: Tom Stellard
- TGSI targets should pass an empty string for this CAP.
v4: Tom Stellard
- TGSI targets can ignore this CAP.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Mon, 23 Apr 2012 16:08:02 +0000 (12:08 -0400)]
gallium: Add PIPE_SHADER_IR_LLVM to enum pipe_shader_ir v2
v2:
- s/PIPE_SHADER_IR_LLVM_R600/PIPE_SHADER_IR_LLVM/
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Fri, 20 Apr 2012 18:46:45 +0000 (14:46 -0400)]
configure.ac: Add HAVE_OPENCL AM_CONDITIONAL v2
v2:
- Drop HAVE_OPENCL variable for non-automake builds
- s/HAVE_OPENCL/HAVE_GALLIUM_COMPUTE
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Brian Paul [Fri, 1 Jun 2012 14:27:21 +0000 (08:27 -0600)]
scons: generate the glapitable.h file too
Brian Paul [Wed, 30 May 2012 22:47:34 +0000 (16:47 -0600)]
svga: fix saturated TEX instructions
TEX instructions can't do saturation. Do the TEX into a temp reg w/out
saturation, then do a MOV_SAT.
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
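The rewrite described in the svga entry above (TEX into a temporary without saturation, followed by a saturating MOV) can be sketched as a small lowering pass. This is a Python model, not the svga driver code: the tuple layout (opcode, destination, sources, saturate flag) and the temporary register name are hypothetical.

```python
def lower_saturated_tex(instructions, temp="TEMP[tmp]"):
    """Split each saturated TEX into an unsaturated TEX plus a saturating MOV.

    Each instruction is a hypothetical tuple: (opcode, dst, srcs, saturate).
    """
    lowered = []
    for opcode, dst, srcs, saturate in instructions:
        if opcode == "TEX" and saturate:
            # TEX writes the temporary without saturation...
            lowered.append(("TEX", temp, srcs, False))
            # ...and the MOV applies saturation while copying to the real destination.
            lowered.append(("MOV", dst, (temp,), True))
        else:
            lowered.append((opcode, dst, srcs, saturate))
    return lowered
```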
Brian Paul [Wed, 30 May 2012 16:08:11 +0000 (10:08 -0600)]
scons: add code to generate the various GL API files
This fixes recent build breakage when we began building the generated
API files from xml as part of the normal build process.
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=50475
Brian Paul [Fri, 25 May 2012 15:44:53 +0000 (09:44 -0600)]
draw: simplify index buffer specification
Replace draw_set_index_buffer() and draw_set_mapped_index_buffer() with
draw_set_indexes() which simply takes a pointer and an index size.
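To illustrate why a pointer plus an index size is enough to describe an index buffer, here is a minimal Python sketch that decodes indices from raw bytes. The helper name read_indexes, the supported sizes, and the little-endian byte order are assumptions for the example and are not taken from the draw module.

```python
import struct

# Hypothetical mapping from index size in bytes to a struct format code.
_INDEX_FORMATS = {1: "B", 2: "H", 4: "I"}

def read_indexes(buf, index_size, count):
    """Decode `count` unsigned little-endian indices of `index_size` bytes."""
    if index_size not in _INDEX_FORMATS:
        raise ValueError("unsupported index size: %d" % index_size)
    fmt = "<%d%s" % (count, _INDEX_FORMATS[index_size])
    return list(struct.unpack_from(fmt, buf, 0))
```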
Kenneth Graunke [Tue, 29 May 2012 18:16:34 +0000 (11:16 -0700)]
glsl/tests: Plumb $(PYTHON2) and $(PYTHON_FLAGS) into optimization-test.
Some distributions (like Arch Linux) make /usr/bin/python Python 3,
rather than Python 2. Since compare_ir uses /usr/bin/env python,
such systems will fail to run optimization-test, causing 'make check' to
always fail.
Automake's TESTS_ENVIRONMENT variable provides a mechanism to run
programs or set environment variables in the test environment.
Ideally, I think we would want to use AM_TESTS_ENVIRONMENT, since
TESTS_ENVIRONMENT is supposed to be user-overridable. However, it isn't
supported using the default/serial test runner.
Fixes 'make check' on Arch Linux and Gentoo.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Tue, 22 May 2012 02:23:48 +0000 (19:23 -0700)]
ralloc: Add some basic unit tests.
I started writing unit tests for a new piece of code, and discovered
they all failed due to a bug in ralloc. Clearly it needs a test suite.
v2: Rename to 'ralloc-test' and fix copyright date. (idr review)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Tue, 22 May 2012 02:34:13 +0000 (19:34 -0700)]
ralloc: Fix ralloc_parent() of memory allocated out of the NULL context.
If an object is allocated out of the NULL context, info->parent will be
NULL. Using the PTR_FROM_HEADER macro would be incorrect: it would say
that ralloc_parent(ralloc_context(NULL)) == sizeof(ralloc_header).
Fixes the new "null_parent" unit test.
NOTE: This is a candidate for the 7.9, 7.10, 7.11, and 8.0 branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
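The ralloc_parent() problem described above can be modelled with plain integers standing in for addresses. This is a sketch of the arithmetic only: HEADER_SIZE is an assumed value for sizeof(ralloc_header), and the function names are illustrative rather than the ralloc source.

```python
HEADER_SIZE = 48  # assumed sizeof(ralloc_header); the real value depends on the platform
NULL = 0

def ptr_from_header(header_addr):
    """Model of PTR_FROM_HEADER: user data starts right after the header."""
    return header_addr + HEADER_SIZE

def ralloc_parent_buggy(parent_header_addr):
    """Applies the macro unconditionally, even to a NULL parent."""
    return ptr_from_header(parent_header_addr)

def ralloc_parent_fixed(parent_header_addr):
    """Returns NULL when the object was allocated out of the NULL context."""
    if parent_header_addr == NULL:
        return NULL
    return ptr_from_header(parent_header_addr)
```

With a NULL parent, the unconditional version yields HEADER_SIZE instead of NULL, which is the sizeof(ralloc_header) result the commit message mentions.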
Kenneth Graunke [Tue, 29 May 2012 23:03:05 +0000 (16:03 -0700)]
automake: Check for 'indent' and fall back to 'cat' if not found.
The glapi generator code uses indent to produce more readable code.
However, we don't want to make GNU indent a hard build dependency; check
for it in configure.ac and fall back to 'cat' if it's not available.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50484
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
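The fallback described above lives in configure.ac; the same decision can be sketched in Python for illustration. The function name pick_indent and the injectable lookup parameter are hypothetical and exist only to make the logic testable.

```python
import shutil

def pick_indent(which=shutil.which):
    """Return 'indent' if it is found on PATH, otherwise fall back to 'cat'.

    `which` is injectable so the choice can be exercised without touching PATH.
    """
    return "indent" if which("indent") else "cat"
```

Since cat copies its input unchanged, the generated code is still produced, just without the extra formatting.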
Oliver McFadden [Sat, 26 May 2012 07:13:07 +0000 (10:13 +0300)]
mesa: don't compile integer clear shaders for unsupported APIs
Discovered while running the Khronos conformance test suite and
receiving "implementation error: meta program compile failed."
This bug was recently introduced by the i965 clear patch set and would
only be detected while using the ES2 API and only on gen6+ hardware.
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Mon, 14 May 2012 17:19:08 +0000 (10:19 -0700)]
i965/blorp: Implement destination clipping and scissoring
This patch implements clipping and scissoring of the destination rect
for blits that use the blorp engine (e.g. MSAA blits).
Eric Anholt [Thu, 24 May 2012 22:53:09 +0000 (15:53 -0700)]
mesa: Clean up some dricore-related detritus in the old Makefile.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 24 May 2012 20:59:21 +0000 (13:59 -0700)]
automake: Convert dricore building to automake.
This is performed in a subdirectory to avoid needing to convert all of
src/mesa/Makefile in one go.
I can now cherry-pick a commit containing glapi XML changes, do "(cd
src/mapi/glapi/gen && make) && make", and get a working driver.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 24 May 2012 22:25:09 +0000 (15:25 -0700)]
automake: Add a prefix variable to the common sources lists.
In order to do the minimal change for libdricore conversion to
automake, I need to put its Makefile.am in a subdirectory. Automake
gets whiny/broken if you use GNU make features like "addprefix" or
"$(FILES:%=../%)" to munge your *_SOURCES. So, use a plain old
variable to be able to substitute in that "../"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 16 May 2012 16:09:18 +0000 (09:09 -0700)]
automake: Rename variables in sources.mak to be automake compatible.
*_SOURCES is reserved for files lists for particular automake targets.
Also, "-" in the variable names is not allowed.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 24 May 2012 17:55:08 +0000 (10:55 -0700)]
mesa: Remove generated source files during make clean.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 24 May 2012 22:56:27 +0000 (15:56 -0700)]
glapi: Enable silent rules for generation when used from automake.
This variable won't be set when called from non-automake makefiles,
but it cleans up shared-glapi's output.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 24 May 2012 22:54:40 +0000 (15:54 -0700)]
shared-glapi: Don't forget to clean our built file.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 24 May 2012 23:16:28 +0000 (16:16 -0700)]
mesa: Restore installing of libGL for non-dri builds.
Reported-by: Sven Joachim <svenjoac@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 15 May 2012 20:06:22 +0000 (13:06 -0700)]
mesa: Remove the generated glapi from source control, and just build it.
Mesa already always depends on python to build. The checked in
changes are not reviewed (because any trivial change rewrites the
world). We also have been pushing commits between xml change and
regen where at-build-time xml-generated code disagrees with committed
xml-generated code. And worst of all, sometimes we ("I") check in
*stale* xml-generated code.
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Kurt Roeckx [Thu, 10 May 2012 22:19:42 +0000 (00:19 +0200)]
i830: Fix crash for GL_STENCIL_TEST in i830Enable()
Commit 87f12bb2d95236c7b025d1a8be56b5ab1683d702 tried to fix rb->mt
being NULL, but changed this case incorrectly.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marcin Slusarz [Tue, 29 May 2012 18:13:55 +0000 (20:13 +0200)]
nv50: hook up forgotten short constant buffer upload method
Fixes crash in xorg st.
Tom Stellard [Tue, 29 May 2012 15:59:01 +0000 (11:59 -0400)]
radeon/llvm: Update and fix some comments
Tom Stellard [Tue, 29 May 2012 15:36:29 +0000 (11:36 -0400)]
radeonsi: Remove use.sgpr* intrinsics, use load instructions instead
We now model loading user sgpr values with LLVM IR load instructions that
use the USER_SGPR address space.
The definition of the sgpr parameter to the use_sgpr() helper function
in radeonsi_shader.c has changed so that you can pass raw sgpr values
rather than having to divide the sgpr value you want to use by the dword
width of the type you want to load.
Tom Stellard [Wed, 16 May 2012 19:15:35 +0000 (15:15 -0400)]
radeonsi: Handle TGSI CONST registers
We now emit LLVM load instructions for TGSI CONST register reads,
which are lowered in the backend to S_LOAD_DWORD* instructions.
Tom Stellard [Mon, 28 May 2012 16:07:41 +0000 (12:07 -0400)]
radeon/llvm: Remove AMDILIntrinsicInfo::GetDeclaration function body
This function was causing compile errors in the tablegen'd code for
some intrinsic definitions. I don't think we really need this function,
so I'm removing the function body just as a temporary solution. I'll
look into removing the entire AMDILIntrinsicInfo class later.
Tom Stellard [Mon, 28 May 2012 02:11:53 +0000 (22:11 -0400)]
radeon/llvm: Remove AMDILTargetMachine
Christoph Bumiller [Mon, 28 May 2012 16:01:15 +0000 (18:01 +0200)]
nouveau: unreference fences on resource destruction
Christoph Bumiller [Thu, 24 May 2012 19:18:22 +0000 (21:18 +0200)]
nvc0: optimize blend cso by checking which by-RT data actually differs
Can save about 200 bytes of command buffer space.
Christoph Bumiller [Sat, 26 May 2012 11:54:55 +0000 (13:54 +0200)]
nvc0: don't upload UCPs if the shader doesn't use them
Christoph Bumiller [Tue, 29 May 2012 15:00:10 +0000 (17:00 +0200)]
nvc0/ir: allow 64-bit constant loads on nve4
Looks like only 128-bit access doesn't work.
Christoph Bumiller [Fri, 25 May 2012 15:27:03 +0000 (17:27 +0200)]
nvc0/ir: fix texture barrier insertion to prevent WAW hazards
Fixes, for instance, object highlighting in Diablo 3 (wine).
Christoph Bumiller [Mon, 28 May 2012 20:38:10 +0000 (22:38 +0200)]
nvc0/ir: TEX doesn't support JOIN modifier either
Christoph Bumiller [Mon, 21 May 2012 21:46:11 +0000 (23:46 +0200)]
gallium: add st_api feature mask to prevent advertising MS visuals
v2: use a define for the maximum sample count
v3: also test odd sample counts (r300 supports MS3)
While multisample renderbuffers are supported by mesa, MS visuals
are not, so we need a way to tell dri/st not to advertise them even
if the gallium driver does support multisampled surfaces.
Otherwise applications selecting these non-functional visuals would
run into trouble ...
Reviewed-by: Brian Paul <brianp@vmware.com>
Roy Spliet [Tue, 22 May 2012 13:14:26 +0000 (15:14 +0200)]
nv30: Fix generic passing to fragment program in NV34.
Christoph Bumiller [Tue, 22 May 2012 13:33:12 +0000 (15:33 +0200)]
nv30: handle user index buffers
Tom Stellard [Fri, 25 May 2012 16:53:22 +0000 (12:53 -0400)]
radeon/llvm: Use a custom inserter for MASK_WRITE
Tom Stellard [Fri, 25 May 2012 16:18:14 +0000 (12:18 -0400)]
radeon/llvm: Use tablegen pattern to lower bitconvert
Tom Stellard [Fri, 25 May 2012 14:59:52 +0000 (10:59 -0400)]
radeon/llvm: Use a custom inserter to lower FNEG
Tom Stellard [Fri, 25 May 2012 14:50:35 +0000 (10:50 -0400)]
radeon/llvm: Use a custom inserter to lower CLAMP
Tom Stellard [Fri, 25 May 2012 14:29:09 +0000 (10:29 -0400)]
radeon/llvm: Use a custom inserter to lower FABS
Kai Wasserbäch [Fri, 25 May 2012 14:27:08 +0000 (16:27 +0200)]
r600g: handle R16G16B16_FLOAT and R32G32B32_FLOAT in translate_colorswap
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=50318
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Brian Paul [Thu, 24 May 2012 17:09:18 +0000 (11:09 -0600)]
draw: fix primitive restart bug by using the index buffer offset
The code which scans the index buffer for restart indexes wasn't adding
the index buffer offset so we were always starting at offset=0. The
offset is usually zero so it wasn't noticed before.
Fixes a failure in the piglit primitive-restart test when testing
vertex data + index data in a single VBO.
NOTE: This is a candidate for the 8.0 branch.
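The effect of ignoring the index buffer offset can be shown with a short Python sketch of a restart scan. The function name, the 16-bit little-endian index format, and the byte-based offset are assumptions for the example, not the draw module's code.

```python
import struct

def find_restarts(buf, offset, count, restart_index):
    """Return positions, relative to the first scanned index, of restart indices.

    Assumes 16-bit little-endian indices; `offset` is a byte offset into `buf`.
    """
    indices = struct.unpack_from("<%dH" % count, buf, offset)
    return [i for i, value in enumerate(indices) if value == restart_index]
```

Scanning from offset 0 when the indices actually start later reads the wrong data, for example vertex data stored ahead of the indices in the same VBO.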
Brian Paul [Tue, 22 May 2012 22:53:04 +0000 (16:53 -0600)]
svga: remove the special zero-stride vertex array code
This code actually hasn't been needed for some time now. We can just
treat a zero-stride vertex array like any other non-zero-stride array.
Brian Paul [Tue, 22 May 2012 19:03:36 +0000 (13:03 -0600)]
gallium/docs: beef up the docs related to color clamping
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Brian Paul [Tue, 22 May 2012 15:32:50 +0000 (09:32 -0600)]
util: add GALLIUM_LOG_FILE option for logging output to a file
Useful for logging different runs to files and diffing, etc.
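As a rough sketch of how an environment variable such as GALLIUM_LOG_FILE can redirect log output, the Python below picks an output stream. The fallback to stderr and the truncating open mode are assumptions; the log entry only says that the option sends output to a file.

```python
import os
import sys

def open_log(environ=os.environ):
    """Return a writable stream: the named file if GALLIUM_LOG_FILE is set, else stderr."""
    path = environ.get("GALLIUM_LOG_FILE")
    if path:
        # Assumed behaviour: start a fresh file on each run so runs can be diffed.
        return open(path, "w")
    return sys.stderr
```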
Paul Berry [Wed, 9 May 2012 22:51:11 +0000 (15:51 -0700)]
i965/msaa: Enable 4x MSAA on Gen7.
Basic 4x MSAA support now works on Gen7. This patch enables it.
As with Gen6, MSAA support is still fairly preliminary. In
particular, the following are not yet supported:
- 8x oversampling (Gen7 has hardware support for this, but we do not
yet expose it).
- Fully general blits between MSAA and non-MSAA buffers.
- Formats other than RGBA8, DEPTH24, and STENCIL8.
- Centroid interpolation.
- Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE,
GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE,
GL_SAMPLE_COVERAGE_INVERT).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 9 May 2012 14:20:10 +0000 (07:20 -0700)]
i965/msaa: Implement manual blending operation for Gen7.
On Gen6, the blending necessary to blit an MSAA surface to a non-MSAA
surface could be accomplished with a single texturing operation. On
Gen7, the WM program must fetch each sample and blend them together
manually. From the Bspec (Shared Functions/Messages/Initiating
Message/Message Types/sample):
[DevIVB+]:Number of Multisamples on the associated surface must be
MULTISAMPLECOUNT_1.
This patch implements the manual blend operation.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
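The manual blend described above amounts to fetching every sample of a pixel and combining them into one value. A minimal Python sketch follows, assuming the blend is an equal-weight average per channel; the actual WM program may combine the samples in a different order or precision.

```python
def resolve_pixel(samples):
    """Average a list of per-sample colour tuples into one colour tuple.

    Assumes an equal-weight average across samples, channel by channel.
    """
    count = len(samples)
    return tuple(sum(channel) / count for channel in zip(*samples))
```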
Paul Berry [Tue, 8 May 2012 20:39:10 +0000 (13:39 -0700)]
i965/msaa: Modify blorp code to account for Gen7 MSAA layouts.
Since blorp uses color textures and render targets to do all its work
(even when blitting stencil and depth data), it always has to
configure the Gen7 GPU to use the new "sliced" MSAA layout. However,
when blitting stencil or depth data, the actual MSAA layout is
interleaved (as in Gen6). Therefore, blorp has to do extra coordinate
transformation work to account for the interleaving manually.
This patch causes blorp to perform the necessary extra coordinate
transformations.
It also modifies the blorp SURFACE_STATE setup code for Gen7, so that
it does not try to correct the surface width and height to account for
MSAA, since "sliced" MSAA layout doesn't affect the surface width or
height.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 8 May 2012 22:30:33 +0000 (15:30 -0700)]
i965/msaa: Validate Gen7 surface state constraints.
When a Gen7 SURFACE_STATE is configured for MSAA, a number of
additional constraints come into play. This patch adds a function
gen7_check_surface_setup() which verifies that all of those
constraints are met.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 8 May 2012 20:39:10 +0000 (13:39 -0700)]
i965/msaa: Properly handle sliced layout for Gen7.
Starting in Gen7, there are two possible layouts for MSAA surfaces:
- Interleaved, in which additional samples are accommodated by scaling
up the width and height of the surface. This is the only layout
available in Gen6. On Gen7 it is used for depth and stencil
surfaces only.
- Sliced, in which the surface is stored as a 2D array, with array
slice n containing all pixel data for sample n. On Gen7 this layout
is used for color surfaces.
The "Sliced" layout has an additional requirement: it must be used in
ARYSPC_LOD0 mode, which means that the surface doesn't leave any extra
room between array slices for miplevels other than 0.
This patch modifies the surface allocation functions to use the
correct layout when allocating MSAA surfaces in Gen7, and to set the
array offsets properly when using ARYSPC_LOD0 mode. It also modifies
the code that populates SURFACE_STATE structures to ensure that
ARYSPC_LOD0 mode is selected in the appropriate circumstances.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
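The two layouts described above differ in where the extra samples go. The Python sketch below contrasts the resulting allocation shapes; the 2x2 scale factor used for interleaved 4x MSAA is an assumption for illustration, and the function names are hypothetical.

```python
def interleaved_allocation(width, height, scale_x=2, scale_y=2):
    """Interleaved: samples are absorbed by a wider, taller single surface.

    The default 2x2 scale (for 4x MSAA) is an assumption, not taken from the log.
    Returns (width, height, array_slices).
    """
    return (width * scale_x, height * scale_y, 1)

def sliced_allocation(width, height, num_samples):
    """Sliced: a 2D array where slice n holds all pixel data for sample n."""
    return (width, height, num_samples)

def sliced_location(x, y, sample):
    """Address of one sample in the sliced layout: same (x, y), slice = sample."""
    return (x, y, sample)
```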
Paul Berry [Tue, 8 May 2012 22:29:43 +0000 (15:29 -0700)]
i965/msaa: Add defines for Gen7.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 9 May 2012 23:00:43 +0000 (16:00 -0700)]
i965/blorp: Enable blorp blits on Gen7.
Gen7 support for blorp (blits using the render path) now works for
non-MSAA purposes. This patch enables it.
Since blorp operations re-use the logic for HiZ ops, this required
adding a case to the switch statement in gen7_blorp_emit_wm_config(),
to allow for the case where no HiZ op is being performed.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 9 May 2012 13:57:06 +0000 (06:57 -0700)]
i965/blorp: Implement proper texel fetch messages for Gen7.
On Gen6, texel fetch is always accomplished using the SAMPLE_LD
message, which accepts arguments (u, v, r, lod, si). On Gen7, there
are two* texel fetch messages: SAMPLE_LD for non-MSAA surfaces, taking
arguments (u, lod, v), and SAMPLE_LD2DSS for MSAA surfaces, taking
arguments (si, u, v).
*Technically, there are other texel fetch messages, but they are used
for "compressed" MSAA surfaces, which we don't yet support.
This patch adds the proper message types and argument orderings for
Gen7.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
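The message types and argument orders listed in the entry above can be summarised as a small lookup. The Python sketch below encodes only what the commit message states; the function name and the tuple return shape are illustrative.

```python
def texel_fetch_message(gen, msaa):
    """Return (message name, argument order) for a texel fetch.

    Covers only the cases named in the commit message: Gen6, and Gen7
    with or without an MSAA surface.
    """
    if gen == 6:
        return ("SAMPLE_LD", ("u", "v", "r", "lod", "si"))
    if gen == 7:
        if msaa:
            return ("SAMPLE_LD2DSS", ("si", "u", "v"))
        return ("SAMPLE_LD", ("u", "lod", "v"))
    raise ValueError("unhandled hardware generation: %r" % (gen,))
```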
Paul Berry [Thu, 10 May 2012 00:14:56 +0000 (17:14 -0700)]
i965/blorp: Use 16 pixel dispatch on Gen7.
Gen7 hardware requires us to enable at least one WM dispatch mode,
even if there is no program being dispatched to. When this code was
only used for HiZ operations (which don't use a WM program), we used
32-pixel dispatch, because it didn't matter. But blit programs are
compiled for 16-pixel dispatch. So just enable 16-wide dispatch
unconditionally.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Enable 16-wide dispatch unconditionally rather than add the
unnecessary complication of using 32-wide dispatch when there is no WM
program.
Paul Berry [Tue, 8 May 2012 23:04:22 +0000 (16:04 -0700)]
i965/blorp: Allocate space for push constants on Gen7.
On Gen7, push constants for shader programs are stored in the URB, so
blorp code needs to set aside space for them. This was previously
unnecessary because blorp code was based on HiZ operations, which
don't require any shaders.
This patch adds a call from gen7_blorp_exec() to
gen7_allocate_push_constants(), to ensure that push constants are
assigned the correct location in the URB. It also extracts a new
function gen7_emit_urb_state() from gen7_upload_urb(), which is
re-used by gen7_blorp_emit_urb_config() to ensure that the URB regions
used by all the pipeline stages leave room for the push constants.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 22 May 2012 23:16:43 +0000 (16:16 -0700)]
i965/blorp: Set the dynamic state upper bound.
We know from previous bug fixes (commits
c25e5300cba7628b58df93ead14ebc3cc32f338c and
b2ace06cbbbb1021e2d7ace12a985c6406821939) that texture border color
doesn't work if the dynamic state upper bound is set to 0. Although
the blorp engine doesn't make use of texture borders, it seems like we
ought to err on the safe side and set this value properly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 8 May 2012 23:00:25 +0000 (16:00 -0700)]
i965/blorp: Factor gen6_blorp_emit_batch_head into separate functions.
This patch separates out the portions of gen6_blorp_emit_batch_head()
that emit 3DSTATE_MULTISAMPLE, 3DSTATE_SAMPLE_MASK, and
STATE_BASE_ADDRESS. This paves the way for making the blorp code work
on Gen7, where additional command packets
(3DSTATE_PUSH_CONSTANT_ALLOC_VS and 3DSTATE_PUSH_CONSTANT_ALLOC_PS)
need to be emitted before 3DSTATE_MULTISAMPLE.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 9 May 2012 15:29:33 +0000 (08:29 -0700)]
i965/blorp: Use MSDISPMODE_PERSAMPLE rendering when necessary
This patch modifies the "blorp" WM program so that it can be run in
MSDISPMODE_PERSAMPLE (which means that every single sample of a
multisampled render target is dispatched to the WM program, not just
every pixel).
Previously we were using the ugly hack of configuring multisampled
destination surfaces as single-sampled, and generating sample indices
other than zero by swizzling the pixel coordinates in the WM program.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Wed, 9 May 2012 13:57:06 +0000 (06:57 -0700)]
i965/blorp: Emit sample index in SAMPLE_LD message when necessary
This patch modifies the function brw_blorp_blit_program::texel_fetch()
to emit the SI (sample index) argument to the SAMPLE_LD message when
reading from a sample index other than zero.
Previously we were using the ugly hack of configuring multisampled
source surfaces as single-sampled, and accessing sample indices other
than zero by swizzling the texture coordinates in the WM program.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 8 May 2012 23:28:43 +0000 (16:28 -0700)]
i965/blorp: Generalize sampling code in preparation for Gen7
This patch generalizes the function
brw_blorp_blit_program::texture_lookup() so that it prepares the
arguments to the sampler message based on a caller-provided array
rather than assuming the argument order is always (u, v).
This paves the way for the messages we will need to use in Gen7, which
use argument orders (u, lod, v) and (si, u, v) (si=sample index).
It will also will allow us to read from arbitrary sample indices on
Gen6, by supplying the arguments (u, v, r, lod, si) to the SAMPLE_LD
message instead of just (u, v).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Fri, 11 May 2012 00:52:09 +0000 (17:52 -0700)]
i965/msaa: Expand odd-sized MSAA surfaces to account for interleaving pattern.
Gen6 MSAA buffers (and Gen7 MSAA depth/stencil buffers) interleave
MSAA samples in a complex pattern that repeats every 2x2 pixel block.
Therefore, when allocating an MSAA buffer, we need to make sure to
allocate an integer number of 2x2 blocks; if we don't, then some of
the samples in the last row and column will be cut off.
Fixes piglit tests "EXT_framebuffer_multisample/unaligned-blit {2,4}
color msaa" on i965/Gen6.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
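The requirement above, allocating a whole number of 2x2 pixel blocks, comes down to rounding each dimension up to a multiple of two before any MSAA scaling is applied. A minimal Python sketch with hypothetical helper names:

```python
def round_up(value, multiple):
    """Round `value` up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple

def pad_to_2x2_blocks(width, height):
    """Pad pixel dimensions so the surface covers an integer number of 2x2 blocks."""
    return (round_up(width, 2), round_up(height, 2))
```

An odd-sized 5x7 request is padded to 6x8, so the samples for the last row and column still fall inside the allocation.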
Thomas Gstädtner [Wed, 23 May 2012 16:55:51 +0000 (18:55 +0200)]
gallium/targets: pass ldflags parameter to MKLIB
Without passing the -ldflags parameter before $(LDFLAGS) in some cases
flags will be passed to MKLIB which it does not understand.
This might be -m64, -m32 or similar.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Thomas Gstädtner <thomas@gstaedtner.net>
Signed-off-by: Brian Paul <brianp@vmware.com>
Vadim Girlin [Fri, 25 May 2012 13:28:08 +0000 (17:28 +0400)]
Revert "r600g: set round_mode to truncate and get rid of tgsi_f2i on evergreen"
This reverts commit 60bf0f05b472e66bf1175fcec7a274dab6f7e2a3.
It seems round_mode behaves differently in some cases depending on the
instruction/slot. Reverting it for now.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=50232
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Vadim Girlin [Fri, 25 May 2012 13:27:46 +0000 (17:27 +0400)]
radeon/llvm: add FLT_TO_UINT, UINT_TO_FLT instructions
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Vadim Girlin [Fri, 25 May 2012 13:27:33 +0000 (17:27 +0400)]
radeon/llvm: prepare to revert the round mode state to default
Use TRUNC before FLT_TO_INT on evergreen/cayman.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
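Why a TRUNC ahead of FLT_TO_INT matters can be seen in a small numeric model. The Python sketch below assumes that the conversion rounds to nearest under the default round mode; that behaviour is an assumption for illustration, since the log only says that round_mode behaves differently in some cases.

```python
import math

def flt_to_int_default(x):
    """Model of a float-to-int conversion that rounds to nearest (assumed default)."""
    return int(round(x))

def flt_to_int_with_trunc(x):
    """Truncate toward zero first, so the later rounding step cannot change the result."""
    return flt_to_int_default(float(math.trunc(x)))
```

After truncation the value is already integral, so the result no longer depends on which rounding mode the conversion uses.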
Vadim Girlin [Fri, 25 May 2012 13:27:23 +0000 (17:27 +0400)]
radeon/llvm: fix sampler index in llvm_emit_tex
Sampler index isn't a second source operand for some tgsi texture
instructions. Let's assume it's always the last.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=50230
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Vadim Girlin [Fri, 25 May 2012 13:23:06 +0000 (17:23 +0400)]
radeon/llvm: fix opcode for RECIP_UINT_r600
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=50312
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Vadim Girlin [Fri, 25 May 2012 13:22:38 +0000 (17:22 +0400)]
radeon/llvm/loader: convert hardcoded gpu name to option
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Vadim Girlin [Fri, 25 May 2012 13:22:12 +0000 (17:22 +0400)]
r600g: add RECIP_INT, PRED_SETE_INT to r600_bytecode_get_num_operands
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=50315
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Vinson Lee [Thu, 24 May 2012 05:36:47 +0000 (22:36 -0700)]
i915g: Check for geometry shader earlier in i915_set_constant_buffer.
Fix resource leak defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Vinson Lee [Thu, 24 May 2012 00:26:20 +0000 (17:26 -0700)]
scons: Fix SCons build infrastructure for FreeBSD.
This patch gets the FreeBSD SCons build working again. The build still
fails though.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tom Stellard [Thu, 24 May 2012 16:17:58 +0000 (12:17 -0400)]
radeon/llvm: Lower UDIV using the Selection DAG
Tom Stellard [Thu, 24 May 2012 13:28:44 +0000 (09:28 -0400)]
radeon/llvm: Remove auto-generated AMDIL->ISA conversion code
Tom Stellard [Thu, 24 May 2012 13:01:33 +0000 (09:01 -0400)]
radeon/llvm: Remove AMDIL instructions MULHI, SMUL
Tom Stellard [Thu, 24 May 2012 12:55:15 +0000 (08:55 -0400)]
radeon/llvm: Remove AMDIL bitshift instructions (SHL, SHR, USHR)
Tom Stellard [Thu, 24 May 2012 12:37:49 +0000 (08:37 -0400)]
radeon/llvm: Remove AMDIL FTOI and ITOF instructions