Christoph Bumiller [Wed, 28 Mar 2012 21:50:32 +0000 (23:50 +0200)]
nv50/ir: add setFlagsDef/Src helper
Will be used by nv50 target.
Christoph Bumiller [Fri, 6 Apr 2012 16:34:44 +0000 (18:34 +0200)]
nv50/ir: add isAccessSupported check for memory access coalescing
Christoph Bumiller [Wed, 28 Mar 2012 19:30:59 +0000 (21:30 +0200)]
nv50/ir: add function for splitting a BasicBlock
Fixes to initial implementation by Francisco Jerez.
Francisco Jerez [Tue, 15 Nov 2011 20:39:52 +0000 (21:39 +0100)]
nv50/ir: Allow attaching two nodes when either one is already inside the graph.
Francisco Jerez [Tue, 15 Nov 2011 20:39:22 +0000 (21:39 +0100)]
nv50/ir: Allow inserting isolated nodes to a graph.
Francisco Jerez [Mon, 14 Nov 2011 23:38:15 +0000 (00:38 +0100)]
nv50/ir: Fix memory corruption in Function::orderInstructions().
"iter" doesn't reference a BasicBlock directly, but a Node::Graph,
i.e. BasicBlock::get() is casting to the wrong pointer type.
Francisco Jerez [Tue, 15 Nov 2011 14:58:04 +0000 (15:58 +0100)]
nv50/ir: Fix up insertion of PHI instructions using bb->insertHead().
Christoph Bumiller [Tue, 15 Nov 2011 23:39:41 +0000 (00:39 +0100)]
nv50/ir: fix insertHead and remove for BBs with PHI ops only
Francisco Jerez [Sat, 19 Nov 2011 20:31:28 +0000 (21:31 +0100)]
nv50/ir: Don't crash on zero sized BitSets.
Francisco Jerez [Tue, 15 Nov 2011 00:50:58 +0000 (01:50 +0100)]
nv50/ir: Fix Interval::clear().
Christoph Bumiller [Sun, 25 Dec 2011 17:34:35 +0000 (18:34 +0100)]
nv50/ir/tgsi: handle inferSrcType(NOT) to be u32
Francisco Jerez [Mon, 14 Nov 2011 22:09:45 +0000 (23:09 +0100)]
nv50/ir/opt: Fix OP_NOT to modifier conversion.
Dave Airlie [Sat, 14 Apr 2012 19:25:59 +0000 (20:25 +0100)]
r600g: disable dual-src blending, it hangs evergreen for some reason.
This did work previously, so I've broken something.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Sat, 14 Apr 2012 16:11:29 +0000 (12:11 -0400)]
r300/compiler: Exit immediately from rc_vert_fc() if there is an error
This way we correctly report "Too many temporaries" errors.
https://bugs.freedesktop.org/show_bug.cgi?id=48680
Note: This is a candidate for the stable branches.
Tom Stellard [Sat, 14 Apr 2012 14:02:19 +0000 (10:02 -0400)]
r300/compiler: Copy all instruction attributes during local transforms
Instruction attributes like WriteALUResult and ALUResultCompare
were being discarded during some of the local transformations.
This fixes the following piglit tests:
glsl1-inequality (vec2, pass)
loopfunc
fs-any-bvec2-using-if
fs-op-ne-bvec2-bvec2-using-if
fs-op-ne-ivec2-ivec2-using-if
fs-op-ne-mat2-mat2-using-if
fs-op-ne-vec2-vec2-using-if
fs-op-ne-mat2x3-mat2x3-using-if
fs-op-ne-mat2x4-mat2x4-using-if
https://bugs.freedesktop.org/show_bug.cgi?id=45921
NOTE: This is a candidate for the stable branches.
Tom Stellard [Wed, 21 Sep 2011 04:05:55 +0000 (21:05 -0700)]
r300/compiler: Fix nested flow control in r500 vertex shaders
Tom Stellard [Fri, 13 Apr 2012 02:07:40 +0000 (22:07 -0400)]
r300/compiler: Clear loop registers in vertex shaders w/o loops
The loop registers weren't being cleared, so any shader that was
executed after a shader containing loops was at risk of having a loop
randomly inserted into it.
This fixes over one hundred piglit tests, although these tests
only failed during full piglit runs and would pass if
run individually. The exact number of piglit tests that this patch
fixes will vary depending on the version of piglit and the order the
tests are run.
NOTE: This is a candidate for the stable branches.
Eric Anholt [Fri, 16 Mar 2012 22:44:25 +0000 (15:44 -0700)]
glsl: If an "if" has no "then" or "else" code left, remove it.
Cuts 8/1068 instructions from glyphy's fragment shaders on i965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
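The transformation can be pictured with a small sketch. This is not the GLSL IR code from the commit: struct node, its fields and remove_empty_ifs() are hypothetical names, and the sketch assumes the condition of an "if" has no side effects, so dropping the whole node is safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified IR node; not Mesa's ir_if. */
enum node_kind { NODE_OTHER, NODE_IF };

struct node {
    enum node_kind kind;
    struct node *then_body;  /* first instruction of the "then" list, or NULL */
    struct node *else_body;  /* first instruction of the "else" list, or NULL */
    struct node *next;       /* next instruction in the enclosing list */
};

/* Unlink every "if" whose then and else lists are both empty.
 * Returns the number of nodes removed; *head is updated in place. */
static int remove_empty_ifs(struct node **head)
{
    int removed = 0;
    struct node **link = head;

    assert(head != NULL);
    while (*link != NULL) {
        struct node *n = *link;
        if (n->kind == NODE_IF && n->then_body == NULL && n->else_body == NULL) {
            *link = n->next;   /* skip over the dead node */
            n->next = NULL;
            removed++;
        } else {
            link = &n->next;
        }
    }
    return removed;
}
```

Walking the list through a pointer-to-link avoids special-casing removal of the first node.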
Eric Anholt [Mon, 19 Mar 2012 23:37:23 +0000 (16:37 -0700)]
glsl: Add a helper for generating temporary variables in ir_builder.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 23:27:34 +0000 (16:27 -0700)]
glsl: Add a helper for ir_builder to make dereferences for assignments.
v2: Fix writemask setup for non-vec4 assignments.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 23:01:52 +0000 (16:01 -0700)]
glsl: Make a little tracking class for emitting IR lists.
This lets us significantly shorten p->instructions->push_tail(ir), and
will be used in a few more places.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 21:26:04 +0000 (14:26 -0700)]
glsl: Add common swizzles to ir_builder.
Now we can fold a bunch of our expression setup in ff_fragment_shader
into single-line, parseable commits.
v2: Make it actually work. I wasn't setting num_components in the
mask structure, and not setting up a mask structure is way easier.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 21:04:23 +0000 (14:04 -0700)]
glsl: Let ir_builder expressions take un-dereferenced variables.
Having to explicitly dereference is irritating and bloats the code,
when the compiler can detect and do the right thing.
v2: Use a little shim class to produce the automatic dereference
generation at compile time as opposed to runtime, while also
allowing compile-time type checking.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 Mar 2012 20:27:06 +0000 (13:27 -0700)]
glsl: Create an ir_builder helper for hand-generating IR.
The C++ constructors with placement new, while functional, are
extremely verbose, leading to generation of simple GLSL IR expressions
like (a * b + c * d) expanding to many lines of code and using lots of
temporary variables. By creating a new ir_builder.h that puts simple
generators in our namespace and taking advantage of ralloc_parent(),
we can generate much more compact code, at a minor runtime cost.
v2: Replace ir_instruction usage with just ir_rvalue.
v3: Drop remaining missed as_rvalue() in v2.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Christoph Bumiller [Thu, 8 Mar 2012 20:41:41 +0000 (21:41 +0100)]
nv50,nvc0: fix handling of user vbufs with stride < access size
Christoph Bumiller [Tue, 28 Feb 2012 18:25:57 +0000 (19:25 +0100)]
nvc0: prefix all macro methods with MACRO
Some of them have non-macro counterparts.
Christoph Bumiller [Sat, 14 Apr 2012 04:08:08 +0000 (06:08 +0200)]
nvc0: replace VERTEX_DATA push mode with translate to buffer
While pushing vertices through the FIFO is relatively fast on nv50,
it's horribly slow on nvc0.
Christoph Bumiller [Fri, 16 Mar 2012 16:37:32 +0000 (17:37 +0100)]
nvc0: improve vertex state validation
Now updating vertex attribute format only when necessary.
Christoph Bumiller [Thu, 8 Mar 2012 14:56:11 +0000 (15:56 +0100)]
nvc0: track texture dirty state individually
Christoph Bumiller [Thu, 1 Mar 2012 20:28:29 +0000 (21:28 +0100)]
nv50,nvc0: use new scratch buffers code
Christoph Bumiller [Sat, 14 Apr 2012 03:38:16 +0000 (05:38 +0200)]
nouveau: add new shared scratch buffers
Christoph Bumiller [Thu, 1 Mar 2012 20:23:06 +0000 (21:23 +0100)]
nvc0: only force early fragment tests if requested by shader
Christoph Bumiller [Wed, 7 Mar 2012 18:44:10 +0000 (19:44 +0100)]
nv50,nvc0: hold references to the framebuffer surfaces
Marek Olšák [Fri, 13 Apr 2012 15:51:42 +0000 (17:51 +0200)]
r300g: align vertex buffer suballocations to 4
Marek Olšák [Fri, 13 Apr 2012 15:51:42 +0000 (17:51 +0200)]
u_blitter: align vertex buffer suballocations to 4
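Aligning a suballocation offset to 4 can be sketched as below. The helper name align_up() is an assumption made for illustration, not a function taken from r300g or u_blitter, and it only handles power-of-two alignments.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment.
 * alignment must be a non-zero power of two (4 in the commits above). */
static unsigned align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

An offset of 13 would be bumped to 16, while an offset that is already a multiple of 4 is returned unchanged.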
Brian Paul [Fri, 13 Apr 2012 20:31:16 +0000 (14:31 -0600)]
docs: document another viewperf bug in Maya-03
Marcin Slusarz [Fri, 13 Apr 2012 19:55:56 +0000 (21:55 +0200)]
xorg/nouveau: switch to libdrm_nouveau-2.0
Martin Peres [Fri, 13 Apr 2012 18:53:02 +0000 (20:53 +0200)]
targets/{egl-static,gbm}: further clean-up the nvfx remains
Christoph Bumiller [Sat, 14 Apr 2012 01:05:02 +0000 (03:05 +0200)]
nvc0: remove include of old libdrm_nouveau's nouveau_reloc.h
Christoph Bumiller [Sat, 14 Apr 2012 00:39:16 +0000 (02:39 +0200)]
nv50,nvc0: handle PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS
Christoph Bumiller [Sat, 14 Apr 2012 00:38:25 +0000 (02:38 +0200)]
nv30: s/DUAL_SOURCE_BLEND/MAX_DUAL_SOURCE_RENDER_TARGETS
Merge accident.
Ben Skeggs [Wed, 11 Jan 2012 11:42:07 +0000 (12:42 +0100)]
nv30: import new driver for GeForce FX/6/7 chipsets, and Quadro variants
The primary motivation for this rewrite was to have a maintainable driver
going forward, as nvfx was quite horrible in a lot of ways.
The driver is heavily based on the design of the nv50/nvc0 3d drivers we
already have, and uses the same common buffer/fence code. It also passes
a HEAP more piglit tests than nvfx did, supports a couple more features,
and a few more to come still probably.
The CPU footprint of this driver is far far less than nvfx, and translates
into far greater framerates in a lot of applications (unless you're using
a CPU that's way way newer than the GPUs of these generations....)
Basically, we once again have a maintained driver for these chipsets \o/
Feel free to report bugs now!
Christoph Bumiller [Fri, 6 Apr 2012 13:41:55 +0000 (15:41 +0200)]
nouveau: switch to libdrm_nouveau-2.0
Christoph Bumiller [Sun, 12 Feb 2012 23:33:55 +0000 (00:33 +0100)]
nvc0: remove obsolete nvc0_push2.c
Slower version of nvc0_push.c, was only used to ascertain that
bugs were not the new version's fault.
Christoph Bumiller [Fri, 10 Feb 2012 12:18:13 +0000 (13:18 +0100)]
nouveau: remove automatic buffer migration heuristics
Ben Skeggs [Thu, 16 Feb 2012 12:08:41 +0000 (22:08 +1000)]
nvfx: completely remove this driver (GeForce FX/6/7)
This driver hasn't been maintained properly for a very long time, and for
many very good reasons. It's horrible.
A new driver supporting these chipsets will appear with the commits that
port vieux/nv50/nvc0 to libdrm_nouveau-2.0.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 13 Apr 2012 07:50:37 +0000 (17:50 +1000)]
nouveau: rework and simplify nv04/nv05 driver a bit
TEXTURED_TRIANGLE and MULTITEX_TRIANGLE are both a bit special in that if
you use any other graph object in the meantime they'll forget their state
and spew a lovely METHOD_CNT error at you when you try to draw.
The pre-newlib driver has a flush_notify() hook which does this state
re-emit, and a number of random workarounds like extra flushes and state
dirtying after various operations to solve this issue.
I'm taking a slightly different approach to things instead, which has the
nice side-effect of removing the divergent code-paths for ttri/mtri, the
flush/dirty workarounds and the need for flush_notify. Also gives a few
FPS boost in OA, yay.
Ben Skeggs [Fri, 23 Dec 2011 04:03:49 +0000 (14:03 +1000)]
nouveau/vieux: switch to libdrm_nouveau-2.0
Dave Airlie [Fri, 13 Apr 2012 16:15:47 +0000 (17:15 +0100)]
docs: update GL3.txt for ARB_blend_func_extended
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 13 Apr 2012 16:13:01 +0000 (17:13 +0100)]
gallium: document dual source blending restrictions on gallium
As per Brian's suggestion, document the restrictions on dual src blending.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:37:16 +0000 (13:37 +0000)]
r600g: initial r600 dual src blending support
survives piglit with no regressions on rv610/evergreen
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:36:59 +0000 (13:36 +0000)]
softpipe: add dual source blending support
This adds support for a single dual source blending MRT to softpipe.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 14:28:03 +0000 (14:28 +0000)]
util: add dual blend helper function (v2)
This is just a function to tell if a certain blend mode requires dual sources.
v2: move to inlines as per Brian's suggestion
Signed-off-by: Dave Airlie <airlied@redhat.com>
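A helper of this kind might look roughly like the sketch below. The enum, the struct and both function names are local to the example and are not the gallium definitions; only the idea (a blend mode needs dual sources if any of its factors reads the second source color) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative blend factors; names and values are local to this sketch. */
enum blend_factor {
    FACTOR_ZERO,
    FACTOR_ONE,
    FACTOR_SRC_ALPHA,
    FACTOR_INV_SRC_ALPHA,
    FACTOR_SRC1_COLOR,
    FACTOR_SRC1_ALPHA,
    FACTOR_INV_SRC1_COLOR,
    FACTOR_INV_SRC1_ALPHA
};

struct blend_func {
    enum blend_factor rgb_src, rgb_dst, alpha_src, alpha_dst;
};

/* True if a single factor reads the second source color. */
static bool factor_is_dual_src(enum blend_factor f)
{
    switch (f) {
    case FACTOR_SRC1_COLOR:
    case FACTOR_SRC1_ALPHA:
    case FACTOR_INV_SRC1_COLOR:
    case FACTOR_INV_SRC1_ALPHA:
        return true;
    default:
        return false;
    }
}

/* True if any of the four factors of a blend function needs dual sources. */
static bool blend_needs_dual_src(const struct blend_func *b)
{
    assert(b != NULL);
    return factor_is_dual_src(b->rgb_src) || factor_is_dual_src(b->rgb_dst) ||
           factor_is_dual_src(b->alpha_src) || factor_is_dual_src(b->alpha_dst);
}
```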
Dave Airlie [Sat, 24 Mar 2012 13:36:17 +0000 (13:36 +0000)]
st/mesa: add ARB_blend_func_extended support to state tracker.
This adds the blend mode mapping. It also uses the var->index in the
GLSL to TGSI converter - this is the other half of my using 4 in the GLSL
compiler.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:34:45 +0000 (13:34 +0000)]
gallium: rename DUAL_SOURCE_BLEND cap to MAX_DUAL_SOURCE_RENDER_TARGETS
Though I don't think we'll ever expose > 1.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:33:41 +0000 (13:33 +0000)]
glsl: add support for ARB_blend_func_extended (v3)
This adds index support to the GLSL compiler.
I'm not 100% sure of my approach here, especially without knowing how output
ordering happens with respect to location, index pairs in the "mark" function.
Since current hw doesn't ever have a location > 0 with an index > 0,
we don't have to work out if the output ordering the hw requires is
location, index, location, index or location, location, index, index.
But we have no hw to know, so punt on it for now.
v2: index requires layout - catch and error
setup explicit index properly.
v3: drop idx_offset stuff, assume index follows location
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sat, 24 Mar 2012 13:33:00 +0000 (13:33 +0000)]
mesa: add support for ARB_blend_func_extended (v4)
Add implementations of the two API functions,
Add a new string-to-uint mapping for index bindings
Add the blending mode validation for SRC1 + SRC_ALPHA_SATURATE
Add get for MAX_DUAL_SOURCE_DRAW_BUFFERS
v2:
Add check in valid_to_render to address case in spec ERRORS.
v3:
Add index to ir.h so this patch compiles on its own
fixup comment
v4: fixup Brian's comments
The GLSL patch will setup the indices.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Fri, 6 Jan 2012 22:38:37 +0000 (17:38 -0500)]
radeonsi: initial WIP SI code
This commit adds initial support for acceleration
on SI chips. egltri is starting to work.
The SI/R600 llvm backend is currently included in mesa
but that may change in the future.
The plan is to write a single gallium driver and
use gallium to support X acceleration.
This commit contains patches from:
Tom Stellard <thomas.stellard@amd.com>
Michel Dänzer <michel.daenzer@amd.com>
Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The following commits were squashed in:
======================================================================
radeonsi: Remove unused winsys pointer
This was removed from r600g in commit:
commit 96d882939d612fcc8332f107befec470ed4359de
Author: Marek Olšák <maraeo@gmail.com>
Date: Fri Feb 17 01:49:49 2012 +0100
gallium: remove unused winsys pointers in pipe_screen and pipe_context
A winsys is already a private object of a driver.
======================================================================
radeonsi: Copy color clamping CAPs from r600
Not sure if the values of these CAPS are correct for radeonsi, but the
same changes were made to r600g in commit:
commit bc1c8369384b5e16547c5bf9728aa78f8dfd66cc
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Jan 23 03:11:17 2012 +0100
st/mesa: do vertex and fragment color clamping in shaders
For ARB_color_buffer_float. Most hardware can't do it and st/mesa is
the perfect place for a fallback.
The exceptions are:
- r500 (vertex clamp only)
- nv50 (both)
- nvc0 (both)
- softpipe (both)
We also have to take into account that r300 can do CLAMPED vertex colors only,
while r600 can do UNCLAMPED vertex colors only. The difference can be expressed
with the two new CAPs.
======================================================================
radeonsi: Remove PIPE_CAP_OUTPUT_READ
This CAP was dropped in commit:
commit 04e324008759282728a95a1394bac2c4c2a1a3f9
Author: Marek Olšák <maraeo@gmail.com>
Date: Thu Feb 23 23:44:36 2012 +0100
gallium: remove PIPE_SHADER_CAP_OUTPUT_READ
r600g is the only driver which has made use of it. The reason the CAP was
added was to fix some piglit tests when the GLSL pass lower_output_reads
didn't exist.
However, not removing output reads breaks the fallback for glClampColorARB,
which assumes outputs are not readable. The fix would be non-trivial
and my personal preference is to remove the CAP, considering that reading
outputs is uncommon and that we can now use lower_output_reads to fix
the issue that the CAP was supposed to work around in the first place.
======================================================================
radeonsi: Add missing parameters to rws->buffer_get_tiling() call
This was changed in commit:
commit c0c979eebc076b95cc8d18a013ce2968fe6311ad
Author: Jerome Glisse <jglisse@redhat.com>
Date: Mon Jan 30 17:22:13 2012 -0500
r600g: add support for common surface allocator for tiling v13
Tiled surfaces have all kinds of alignment constraints that need to
be met. Instead of having all this code duplicated between ddx and
mesa, use common code in libdrm_radeon; this also ensures that both
ddx and mesa compute those alignments in the same way.
v2 fix evergreen
v3 fix compressed texture and workaround cube texture issue by
disabling 2D array mode for cubemap (need to check if r7xx and
newer are also affected by the issue)
v4 fix texture array
v5 fix evergreen and newer, split surface values computation from
mipmap tree generation so that we can get them directly from the
ddx
v6 final fix to evergreen tile split value
v7 fix mipmap offset to avoid using a random value, use color view
depth view to address different layer as hardware is doing some
magic rotation depending on the layer
v8 fix COLOR_VIEW on r6xx for linear array mode, use COLOR_VIEW on
evergreen, align bytes per pixel to a multiple of a dword
v9 fix handling of stencil on evergreen, half fix for compressed
texture
v10 fix evergreen compressed texture proper support for stencil
tile split. Fix stencil issue when array mode was cleared by
the kernel, always program stencil bo. On evergreen depth
buffer bo need to be big enough to hold depth buffer + stencil
buffer as even with stencil disabled things get written there.
v11 rebase on top of mesa, fix pitch issue with 1d surface on evergreen,
old ddx overestimate those. Fix linear case when pitch*height < 64.
Fix r300g.
v12 Fix linear case when pitch*height < 64 for old path, adapt to
libdrm API change
v13 add libdrm check
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
======================================================================
radeonsi: Remove PIPE_TRANSFER_MAP_PERMANENTLY
This was removed in commit:
commit 62f44f670bb0162e89fd4786af877f8da9ff607c
Author: Marek Olšák <maraeo@gmail.com>
Date: Mon Mar 5 13:45:00 2012 +0100
Revert "gallium: add flag PIPE_TRANSFER_MAP_PERMANENTLY"
This reverts commit 0950086376b1c8b7fb89eda81ed7f2f06dee58bc.
It was decided to refactor the transfer API instead of adding workarounds
to address the performance issues.
======================================================================
radeonsi: Handle PIPE_VIDEO_CAP_PREFERED_FORMAT.
Reintroduced in commit 9d9afcb5bac2931d4b8e6d1aa571e941c5110c90.
======================================================================
radeonsi: nuke the fallback for vertex and fragment color clamping
Ported from r600g commit c2b800cf38b299c1ab1c53dc0e4ea00c7acef853.
======================================================================
radeonsi: don't expose transform_feedback2 without kernel support
Ported from r600g commit 15146fd1bcbb08e44a1cbb984440ee1a5de63d48.
======================================================================
radeonsi: Handle PIPE_CAP_GLSL_FEATURE_LEVEL.
Ported from r600g part of commit 171be755223d99f8cc5cc1bdaf8bd7b4caa04b4f.
======================================================================
radeonsi: set minimum point size to 1.0 for non-sprite non-aa points.
Ported from r600g commit f183cc9ce3ad1d043bdf8b38fd519e8f437714fc.
======================================================================
radeonsi: rework and consolidate stencilref state setting.
Ported from r600g commit a2361946e782b57f0c63587841ca41c0ea707070.
======================================================================
radeonsi: cleanup setting DB_SHADER_CONTROL.
Ported from r600g commit 3d061caaed13b646ff40754f8ebe73f3d4983c5b.
======================================================================
radeonsi: Get rid of register masks.
Ported from r600g commits
3d061caaed13b646ff40754f8ebe73f3d4983c5b..9344ab382a1765c1a7c2560e771485edf4954fe2.
======================================================================
radeonsi: get rid of r600_context_reg.
Ported from r600g commits
9344ab382a1765c1a7c2560e771485edf4954fe2..bed20f02a771f43e1c5092254705701c228cfa7f.
======================================================================
radeonsi: Fix regression from 'Get rid of register masks'.
======================================================================
radeonsi: optimize r600_resource_va.
Ported from r600g commit 669d8766ff3403938794eb80d7769347b6e52174.
======================================================================
radeonsi: remove u8,u16,u32,u64 types.
Ported from r600g commit 78293b99b23268e6698f1267aaf40647c17d95a5.
======================================================================
radeonsi: merge r600_context with r600_pipe_context.
Ported from r600g commit e4340c1908a6a3b09e1a15d5195f6da7d00494d0.
======================================================================
radeonsi: Miscellaneous context cleanups.
Ported from r600g commits
e4340c1908a6a3b09e1a15d5195f6da7d00494d0..621e0db71c5ddcb379171064a4f720c9cf01e888.
======================================================================
radeonsi: add a new simple API for state emission.
Ported from r600g commits
621e0db71c5ddcb379171064a4f720c9cf01e888..f661405637bba32c2cfbeecf6e2e56e414e9521e.
======================================================================
radeonsi: Also remove sbu_flags member of struct r600_reg.
Requires using sid.h instead of r600d.h for the new CP_COHER_CNTL definitions,
so some code needs to be disabled for now.
======================================================================
radeonsi: Miscellaneous simplifications.
Ported from r600g commits 38bf2763482b4f1b6d95cd51aecec75601d8b90f and
b0337b679ad4c2feae59215104cfa60b58a619d5.
======================================================================
radeonsi: Handle PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION.
Ported from commit 8b4f7b0672d663273310fffa9490ad996f5b914a.
======================================================================
radeonsi: Use a fake reloc to sleep for fences.
Ported from r600g commit 8cd03b933cf868ff867e2db4a0937005a02fd0e4.
======================================================================
radeonsi: adapt to get_query_result interface change.
Ported from r600g commit 4445e170bee23a3607ece0e010adef7058ac6a11.
Dylan Noblesmith [Sun, 1 Apr 2012 19:47:07 +0000 (19:47 +0000)]
st/vega: silence enum cast warnings
clang warns on these:
stroker.c:626:19: warning: implicit conversion from enumeration
type 'VGPathCommand' to different enumeration type 'VGPathSegment'
[-Wconversion]
No change in the underlying value.
Reviewed-by: Brian Paul <brianp@vmware.com>
Dylan Noblesmith [Sun, 1 Apr 2012 19:04:47 +0000 (19:04 +0000)]
i965: fix typo
Noticed by clang:
brw_wm_surface_state.c:330:30: warning: initializer overrides prior
initialization of this subobject [-Winitializer-overrides]
[MESA_FORMAT_Z24_S8] = 0,
^
brw_wm_surface_state.c:326:30: note: previous initialization is here
[MESA_FORMAT_Z24_S8] = 0,
^
No functionality change, since the array is declared static so
it was zero-initialized by default.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Dylan Noblesmith [Sun, 1 Apr 2012 18:59:28 +0000 (18:59 +0000)]
mesa: fix truncated value warning
Silences a clang warning:
format_pack.c:2546:30: warning: implicit conversion from 'int' to
'GLubyte' (aka 'unsigned char') changes value from 65535 to 255
[-Wconstant-conversion]
d[i] = d[i] ? 0xffff : 0x0;
~ ^~~~~~
Reviewed-by: Brian Paul <brianp@vmware.com>
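The warning comes from storing a 16-bit constant in an 8-bit element. A minimal sketch of the effect and of one way to avoid it follows; expand_to_mask8() is a hypothetical helper, and the commit message does not say which constant the actual fix uses.

```c
#include <assert.h>
#include <stdint.h>

/* Expand a zero/non-zero byte to an all-zeros/all-ones byte.
 * Using 0xffff here would not fit: conversion to an 8-bit type keeps
 * only the low byte, 0xff, which is what clang warns about. */
static uint8_t expand_to_mask8(uint8_t v)
{
    return v ? 0xff : 0x00;
}
```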
Dylan Noblesmith [Sun, 1 Apr 2012 18:55:23 +0000 (18:55 +0000)]
mesa: don't cast away const
Reviewed-by: Brian Paul <brianp@vmware.com>
Dylan Noblesmith [Sun, 1 Apr 2012 19:57:57 +0000 (19:57 +0000)]
egl-static: fix printf warning
Noticed by clang:
egl_st.c:57:50: warning: field precision should have type 'int',
but argument has type 'size_t' (aka 'unsigned long') [-Wformat]
ret = util_snprintf(path, sizeof(path), "%.*s/%s" UTIL_DL_EXT,
~~^~
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
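The "%.*s" conversion takes its precision as an int, so a size_t length has to be converted explicitly. The sketch below shows one way to do that with the standard snprintf(); join_path() and its parameters are assumptions for illustration and do not reproduce the egl-static code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Write "<first dir_len bytes of dir>/<name>" into out.
 * Returns snprintf's result, or -1 if dir_len cannot be passed as an int. */
static int join_path(char *out, size_t out_size,
                     const char *dir, size_t dir_len, const char *name)
{
    assert(out != NULL && dir != NULL && name != NULL);
    if (dir_len > INT_MAX)
        return -1;
    /* The precision argument of "%.*s" must have type int. */
    return snprintf(out, out_size, "%.*s/%s", (int)dir_len, dir, name);
}
```

Checking the length against INT_MAX before the cast keeps the conversion well defined for very long inputs.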
Dylan Noblesmith [Sun, 1 Apr 2012 19:48:21 +0000 (19:48 +0000)]
st/vega: fix uninitialized values
C still treats array arguments exactly like pointer arguments.
By sheer coincidence, this still worked fine on 64-bit
machines where 2 * sizeof(float) == sizeof(void*), but not
on 32-bit.
Noticed by clang:
text.c:76:51: warning: sizeof on array function parameter will
return size of 'const VGfloat *' (aka 'const float *') instead of
'const VGfloat [2]' [-Wsizeof-array-argument]
memcpy(glyph->glyph_origin, glyphOrigin, sizeof(glyphOrigin));
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
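The pitfall can be reproduced in a few lines. copy_origin() is a hypothetical stand-in for the st/vega code; the buggy form is kept in a comment only, so the sketch compiles without the warning.

```c
#include <assert.h>
#include <string.h>

/* "src[2]" in a parameter list is adjusted to "const float *src",
 * so sizeof(src) is the size of a pointer, not of two floats. */
static void copy_origin(float dst[2], const float src[2])
{
    assert(dst != NULL && src != NULL);
    /* Buggy form: memcpy(dst, src, sizeof(src));
     * copies sizeof(const float *) bytes, i.e. only one float on 32-bit. */
    memcpy(dst, src, 2 * sizeof(src[0]));
}
```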
Dylan Noblesmith [Sun, 1 Apr 2012 18:48:13 +0000 (18:48 +0000)]
egl: fix uninitialized values
Noticed by clang:
eglimage.c:48:28: warning: argument to 'sizeof' in 'memset' call is
the same expression as the destination; did you mean to dereference
it? [-Wsizeof-pointer-memaccess]
memset(attrs, 0, sizeof(attrs));
~~~~~ ^~~~~
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
Dylan Noblesmith [Sun, 1 Apr 2012 18:35:29 +0000 (18:35 +0000)]
util: fix uninitialized table
Most of the 256 values in the 'generic_to_slot' table were supposed to
be initialized with the default value 0xff, but were left at zero
(from CALLOC_STRUCT()) instead.
Noticed by clang:
u_linkage.h:60:31: warning: argument to 'sizeof' in 'memset' call is the same expression as the destination;
did you mean to provide an explicit length? [-Wsizeof-pointer-memaccess]
memset(table, 0xff, sizeof(table));
~~~~~ ^~~~~
Also fix a signed/unsigned comparison and a comment typo here.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
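When the table is reached through a pointer, the length has to be supplied explicitly, as clang suggests. A minimal sketch, assuming a 256-entry byte table; SLOT_TABLE_SIZE and init_slot_table() are hypothetical names, not the u_linkage.h interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_TABLE_SIZE 256  /* hypothetical name; 256 entries as in the commit text */

/* table is a pointer here, so its length must be passed in. */
static void init_slot_table(unsigned char *table, size_t table_size)
{
    assert(table != NULL);
    /* Buggy form: memset(table, 0xff, sizeof(table));
     * fills only sizeof(unsigned char *) bytes and leaves the rest untouched. */
    memset(table, 0xff, table_size);
}
```

A caller that owns the array can pass sizeof on the array itself, for example init_slot_table(t, sizeof(t)) with unsigned char t[SLOT_TABLE_SIZE].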
Dylan Noblesmith [Sun, 1 Apr 2012 18:21:47 +0000 (18:21 +0000)]
util: fix undefined behavior
container_of() can legally return anything, even invalid addresses
that cause segfaults, when 'sample' is an uninitialized pointer.
Bug exposed by clang.
NOTE: This is a candidate for the 8.0 branch.
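The commit message does not spell out the fix. One common way to avoid evaluating an uninitialized 'sample' pointer is to compute the member offset from the type with offsetof(), sketched below; CONTAINER_OF, struct list_head and struct item are illustrative names, not the definitions used in util.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *prev, *next;
};

struct item {
    int value;
    struct list_head link;
};

/* Type-based variant: the offset comes from offsetof(), so no pointer
 * variable of the container type is ever read. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

Given a pointer l to the link member of some struct item, CONTAINER_OF(l, struct item, link) yields the enclosing item.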
Vinson Lee [Thu, 12 Apr 2012 06:05:44 +0000 (23:05 -0700)]
ir_to_mesa: Fix uninitialized member in add_uniform_to_shader.
Fix uninitialized scalar field defect reported by Coverity.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Neil Roberts [Wed, 11 Apr 2012 16:07:56 +0000 (17:07 +0100)]
wayland-drm: Implement wl_buffer.damage in old versions of Wayland
Commit 272bc48976 removed the damage implementation for the
wl_buffer_interface because that has been removed from git master of
Wayland. However this breaks building with the 0.85 branch of Wayland
because it would end up initialising the struct incorrectly.
For the time being it's quite convenient for some compositors to track
the 0.85 branch of Wayland because the protocol is stable but they
will also want to track the master branch of Mesa so that they can use
the gbm surface changes.
This patch adds a compile-time check for the version of Wayland so
that it can work with either Wayland master or the 0.85 branch.
krh: Edited to also account for API changes in 6802eaa68, which
removes the timestamp argument from wl_resource_destroy().
Stéphane Marchesin [Fri, 13 Apr 2012 01:31:10 +0000 (18:31 -0700)]
Revert "i915g: Implement stipple with draw."
This reverts commit 3cff45fdb182a1327f6b89fdc4e0ddc5d680372a.
Stéphane Marchesin [Fri, 13 Apr 2012 01:30:59 +0000 (18:30 -0700)]
Revert "i915g: Remove unused poly stipple state."
This reverts commit be6a02266d1a934c6eff9aaf12fc618588b2d586.
Ian Romanick [Thu, 29 Mar 2012 22:31:55 +0000 (15:31 -0700)]
tests/glx: Point at the imported copy of gtest
This is just in case there's one installed on the system.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Eric Anholt [Tue, 6 Mar 2012 01:01:13 +0000 (17:01 -0800)]
glx: Hook up the unit tests again using the internal gtest.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ian Romanick [Thu, 29 Mar 2012 22:31:27 +0000 (15:31 -0700)]
gtest: Fix up import of gtest 1.6.0
The include files were all missing.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Eric Anholt [Tue, 6 Mar 2012 01:01:12 +0000 (17:01 -0800)]
gtest: Build as a convenience library.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Tue, 6 Mar 2012 01:01:11 +0000 (17:01 -0800)]
gtest: Import sources from gtest 1.6.0.
The upstream of gtest has decided that the intended usage model is for
projects to import the source and use it, which is reflected in their
recent removal of the gtest-config tool.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Marek Olšák [Tue, 10 Apr 2012 06:28:23 +0000 (08:28 +0200)]
cso: unreference saved vertex buffers when restoring
Reviewed-by: Brian Paul <brianp@vmware.com>
Eric Anholt [Wed, 21 Mar 2012 21:35:14 +0000 (14:35 -0700)]
i965: When the kernel lacks the LLC check, assume it's present on gen >= 6.
The param wasn't added until drm-intel-next for 3.4, so we were
missing our various LLC fast-paths.
Eric Anholt [Wed, 21 Mar 2012 21:31:53 +0000 (14:31 -0700)]
intel: Drop backwards compat code for not having libdrm with the LLC check.
Eric Anholt [Tue, 13 Mar 2012 21:19:31 +0000 (14:19 -0700)]
i965/fs: Avoid generating extra AND instructions on bool logic ops.
By making a bool fs_reg only have a defined low bit (matching CMP
output), instead of being a full 0 or 1 value, we reduce the ANDs
generated in logic chains like:
if (v_texcoord.x < 0.0 || v_texcoord.x > texwidth ||
v_texcoord.y < 0.0 || v_texcoord.y > 1.0)
discard;
My concern originally when writing this code was that we would end up
generating unnecessary ANDs on bool uniforms, so I put the ANDs right
at the point of doing the CMPs that otherwise set only the low bit.
However, in order to use a bool, we're generating some instruction
anyway (e.g. moving it so as to produce a condition code update), and
those instructions can often be turned into an AND at that point. It
turns out in the shaders I have on hand, none of them regress in
instruction count:
Total instructions: 262649 -> 262545
39/2148 programs affected (1.8%)
14253 -> 14149 instructions in affected programs (0.7% reduction)
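The idea can be modelled on the CPU. This is only a sketch of the representation, not the i965 backend: cmp_lt(), bool_or() and bool_is_true() are made-up names, and the garbage constant stands in for the undefined upper bits of a CMP result.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a comparison whose result defines only bit 0; the upper bits
 * are filled with garbage on purpose to mimic "undefined". */
static uint32_t cmp_lt(float a, float b)
{
    return 0xdeadbee0u | (a < b ? 1u : 0u);
}

/* Logic ops are bitwise, so bit 0 stays correct without masking each input. */
static uint32_t bool_or(uint32_t a, uint32_t b)
{
    return a | b;
}

/* A single AND at the point where the bool is finally consumed. */
static int bool_is_true(uint32_t v)
{
    return (v & 1u) != 0;
}
```

A chain of ORs over several comparisons then needs one mask in total instead of one per comparison.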
Eric Anholt [Sat, 10 Mar 2012 21:48:42 +0000 (13:48 -0800)]
i965/fs: Try to avoid generating extra MOVs to do saturates.
This change (before the previous two) produced a .23% +/- .11%
performance improvement in Unigine Tropics at 1024x768 on IVB.
Total instructions: 269270 -> 262649
614/2148 programs affected (28.6%)
179386 -> 172765 instructions in affected programs (3.7% reduction)
v2: Move some of the logic of finding the instruction that produced
the result of an expression tree to a helper.
Eric Anholt [Thu, 22 Mar 2012 20:22:51 +0000 (13:22 -0700)]
glsl: Extend the array splitting optimization pass to matrices.
This should fit in well with our lower_mat_op_to_vec code: now, in
addition to having expressions on each column of a matrix, we also
split the columns to separate variables so they can be tracked
individually by the copy propagation, dead code, and other passes.
This optimizes out some more code generation in unigine and gstreamer
shaders.
Total instructions: 269342 -> 269270
14/2148 programs affected (0.7%)
2226 -> 2154 instructions in affected programs (3.2% reduction)
Eric Anholt [Sun, 3 Oct 2010 05:57:17 +0000 (22:57 -0700)]
glsl: Add an array splitting pass.
I've had this code lying around almost done for a long time. The
idea is like opt_structure_splitting, that we've got a bunch of
transforms at the GLSL IR level that only understand scalars and
vectors, which just skip complicated dereferences. While driver
backends may manage some optimization after they split matrices up
themselves, it would be better to bring all of our optimization to
bear on the problem.
While I wasn't expecting changes quite yet, a few programs end up
winning: a gstreamer convolution shader, and the Humus dynamic
branching demo:
Total instructions: 269430 -> 269342
3/2148 programs affected (0.1%)
1498 -> 1410 instructions in affected programs (5.9% reduction)
Eric Anholt [Thu, 22 Mar 2012 15:58:33 +0000 (08:58 -0700)]
glsl: Don't apply optimization passes to builtins.
The builtins we have are generally optimized, having been
hand-written. This avoids generating bad code when an optimization
pass prints debug output.
Brian Paul [Wed, 11 Apr 2012 17:53:33 +0000 (11:53 -0600)]
docs: document yet another viewperf bug
Brian Paul [Fri, 6 Apr 2012 21:45:39 +0000 (15:45 -0600)]
mesa: add _mesa_total_texture_memory() debug function
This function can be called in gdb to find out how much memory is used
by all texture objects.
Brian Paul [Fri, 6 Apr 2012 21:44:56 +0000 (15:44 -0600)]
mesa: new _mesa_total_buffer_object_memory() debug function
This function can be called in gdb to find out how much memory is used
by buffer objects.
Chad Versace [Tue, 10 Apr 2012 22:36:07 +0000 (15:36 -0700)]
mapi: Fix Android build
The Android build was broken by
commit ca760181b4420696c7e86aa2951d7203522ad1e8
Author: Kristian Høgsberg <krh@bitplanet.net>
Date: Fri Mar 16 12:55:40 2012 -0400
shared-glapi: Convert to automake
The offending change was that it redefined the filepaths in sources.mak
like this:
- FOO_FILES := bar.c
+ FOO_FILES := $(TOP)/src/mapi/mapi/bar.c
This broke the build because source filepaths in Android makefiles must be
relative to the makefile.
Ideally, this could be fixed by reverting the change in sources.mak and
making shared-glapi's Makefile.am use $(addprefix $(TOP)/src/mapi/mapi,
$(FOO_FILES)). However, automake doesn't understand builtin GNU make
functions, such as addprefix. So, it seems that automake and Android can
no longer share sources.mak.
Fix the build by duplicating the source lists from sources.mak into
Android.mk.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Mandeep Singh Baines [Tue, 10 Apr 2012 21:48:14 +0000 (14:48 -0700)]
egl_dri2: fix aux buffer leak in drm platform
Keep a reference to any newly allocated aux buffers to avoid
re-allocating for every st_framebuffer_validate() (i.e. leaking).
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Paul Berry [Fri, 6 Apr 2012 19:14:28 +0000 (12:14 -0700)]
i965: Stop lying about cpp and height of a stencil buffer.
When using a separate stencil buffer, i965 requires that the pitch of
the buffer (in the 3DSTATE_STENCIL_BUFFER command) be specified as 2x
the actual pitch.
Previously this was accomplished by doubling the "cpp" and "pitch"
values stored in the intel_region data structure, and halving the
height. However, this was confusing, and it led to a subtle (but
benign) bug: since a stencil buffer is W-tiled, its true height must
be aligned to a multiple of 64; we were accidentally aligning its faux
height to a multiple of 64, causing memory to be wasted.
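The wasted memory can be seen with a little arithmetic. The following C sketch is not the driver's code: align_up(), rows_true_align() and rows_faux_align() are hypothetical helpers, and rounding the halved height up is an assumption made for the example. It compares the number of real rows allocated when the true height is aligned to 64 against the number allocated when the halved "faux" height is aligned to 64 and then interpreted with the doubled pitch.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Rows allocated when the true height is aligned to the W-tile height. */
static uint32_t rows_true_align(uint32_t true_height)
{
    return align_up(true_height, 64);
}

/* Rows allocated when the halved height is aligned instead. Each faux
 * row spans two real rows because the pitch was doubled.
 * Assumption: the halved height is rounded up. */
static uint32_t rows_faux_align(uint32_t true_height)
{
    uint32_t faux_height = (true_height + 1) / 2;
    return 2 * align_up(faux_height, 64);
}
```

Under these assumptions a 60-row stencil buffer needs 64 rows when the true height is aligned, but gets 128 when the faux height is aligned, so half the allocation is wasted. A 100-row buffer gets 128 rows either way, which is why the bug was benign and easy to miss.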
Note that for window system stencil buffers, the DDX also doubles the
cpp and pitch values. To facilitate fixing this DDX server bug in the
future, we fix the cpp and pitch values we receive from the X server
only if cpp has the "incorrect" value of 2.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
v2: Clarify comments about the DDX.
Pekka Paalanen [Tue, 10 Apr 2012 12:35:06 +0000 (15:35 +0300)]
wayland-drm: remove wl_buffer.damage
This is a related fix for the Wayland change:
commit 83685c506e76212ae4e5cb722205d98d3b0603b9
Author: Kristian Høgsberg <krh@bitplanet.net>
Date: Mon Mar 26 16:33:24 2012 -0400
Remove wl_buffer.damage and simplify shm implementation
Apparently, this should also fix a memory leak. When wl_buffer.damage
was removed from Wayland and Mesa was not fixed, wl_buffer.destroy ended
up in the (empty) damage function instead of calling
wl_resource_destroy().
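The failure mode is a request table that is indexed by opcode but was built against the old request order. The C sketch below models it with hypothetical names (request_fn, stale_table, fixed_table, handle_destroy); it does not use the real Wayland types, and the old ordering (damage first, destroy second) is inferred from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request handler: sets *destroyed when the resource dies. */
typedef void (*request_fn)(int *destroyed);

static void buffer_damage(int *destroyed)  { (void)destroyed; /* empty */ }
static void buffer_destroy(int *destroyed) { *destroyed = 1; }

/* Table written for the old protocol: damage = 0, destroy = 1. */
static const request_fn stale_table[] = { buffer_damage, buffer_destroy };

/* Table matching the new protocol: destroy = 0. */
static const request_fn fixed_table[] = { buffer_destroy };

/* In the new protocol, destroy is the first request. */
enum { NEW_OPCODE_DESTROY = 0 };

/* Dispatch a destroy request and report whether the resource was freed. */
static int handle_destroy(const request_fn *table)
{
    int destroyed = 0;
    assert(table != NULL);
    table[NEW_OPCODE_DESTROY](&destroyed);
    return destroyed;
}
```

With stale_table the destroy request lands in the empty damage handler and the resource is never released; with fixed_table it reaches the destroy handler.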
Spotted during build as:
CC wayland-drm-protocol.lo
wayland-drm.c:80:2: warning: initialization from incompatible pointer type
wayland-drm.c:82:1: warning: excess elements in struct initializer
wayland-drm.c:82:1: warning: (near initialization for 'drm_buffer_interface')
Signed-off-by: Pekka Paalanen <ppaalanen@gmail.com>
Vinson Lee [Mon, 9 Apr 2012 05:28:34 +0000 (22:28 -0700)]
st/mesa: Fix uninitialized members in glsl_to_tgsi_visitor constructor.
Fixes uninitialized member defects reported by Coverity.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Chad Versace [Mon, 9 Apr 2012 20:59:03 +0000 (13:59 -0700)]
main: Fix memory leak in _mesa_make_extension_string()
I forgot to free the string returned by strdup().
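As a generic illustration of this class of leak (not the actual body of _mesa_make_extension_string()), the sketch below duplicates a string so it can be tokenized destructively and then frees the copy before returning. count_tokens() is a hypothetical helper, and using strtok() on the copy is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count space-separated tokens without modifying the caller's string. */
static size_t count_tokens(const char *list)
{
    char *copy;
    size_t n = 0;

    assert(list != NULL);
    copy = strdup(list);       /* heap copy that strtok() may modify */
    if (copy == NULL)
        return 0;

    for (char *tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " "))
        n++;

    free(copy);                /* omitting this leaks one copy per call */
    return n;
}
```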
Note: This is a candidate for the stable branches.
CC: Johannes Obermayr <johannesobermayr@gmx.de>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Vadim Girlin [Mon, 9 Apr 2012 20:44:52 +0000 (00:44 +0400)]
r600g: check gpr count limit
This should help to prevent gpu lockups.
See https://bugs.freedesktop.org/show_bug.cgi?id=48472
NOTE: This is a candidate for the stable branches.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Vadim Girlin [Thu, 5 Apr 2012 01:07:03 +0000 (05:07 +0400)]
glsl: fix variable ordering in the output_read_remover
Use the hash of the variable name instead of the pointer value.
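Pointer values depend on where the allocator happened to place each variable, so an ordering keyed on them can change from run to run. Keying on a hash of the name gives the same order every time. The sketch below shows the idea with hypothetical names (struct var, hash_name, compare_vars); the hash is 32-bit FNV-1a, chosen only for illustration and not necessarily the hash the GLSL compiler uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct var {
    const char *name;
};

/* 32-bit FNV-1a over the bytes of the name. */
static uint32_t hash_name(const char *name)
{
    uint32_t h = UINT32_C(2166136261);
    assert(name != NULL);
    for (; *name != '\0'; name++) {
        h ^= (unsigned char)*name;
        h *= UINT32_C(16777619);
    }
    return h;
}

/* qsort() comparator: order by name hash, break ties by the name itself,
 * so the result never depends on the addresses of the variables. */
static int compare_vars(const void *a, const void *b)
{
    const struct var *va = a;
    const struct var *vb = b;
    uint32_t ha = hash_name(va->name);
    uint32_t hb = hash_name(vb->name);

    if (ha != hb)
        return ha < hb ? -1 : 1;
    return strcmp(va->name, vb->name);
}
```

Sorting two arrays that hold the same names in different storage and in a different initial order with qsort() and compare_vars yields the same sequence for both.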
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Tue, 27 Mar 2012 16:37:40 +0000 (09:37 -0700)]
i965: Add support for sampling texture buffer objects on gen7+.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 28 Mar 2012 16:38:57 +0000 (09:38 -0700)]
i965: Add real support for texturing/rendering with MESA_FORMAT_RGBA8888_REV.
This was hacked in in one place for EGL image stuff, but the right
thing to do was just to provide the mapping from the mesa format to
the native hardware format, which includes render target support.
This turns out to be required for GL_ARB_texture_buffer_object, which
sees data in this layout.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 27 Mar 2012 21:03:26 +0000 (14:03 -0700)]
i965/gen7: Fix the /* ignored */ comment on constant surface setup.
It turns out this field *is* used, and it's the stride between samples
from the buffer. Discovered during TBO debugging.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 27 Mar 2012 22:48:21 +0000 (15:48 -0700)]
mesa: Add support for the GL 3.1 R/RG formats in texture buffer objects.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 27 Mar 2012 17:29:04 +0000 (10:29 -0700)]
mesa: Track a gl_format for the texture buffer format.
There was a function full of unused mappings from the GLenum to
datatype/comps, but that wasn't all the information a driver would
want, which includes the other fields that a gl_format has. Given
that all the texture buffer formats were represented in gl_format,
just use that as our description.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>