Chris Wilson [Thu, 24 Feb 2011 12:29:51 +0000 (12:29 +0000)]
intel: Reset the buffer offset after releasing reference to packed upload
Fixes oglc/vbo(basic.bufferdata)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34603
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 24 Feb 2011 10:58:22 +0000 (10:58 +0000)]
i965: Unmap the correct pointer after discontiguous upload
Fixes piglit/fbo-depth-sample-compare:
==14722== Invalid free() / delete / delete[]
==14722== at 0x4C240FD: free (vg_replace_malloc.c:366)
==14722== by 0x84FBBFD: intel_upload_unmap (intel_buffer_objects.c:695)
==14722== by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
==14722== by 0x852F975: brw_validate_state (brw_state_upload.c:394)
==14722== by 0x851FA24: brw_draw_prims (brw_draw.c:365)
==14722== by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
==14722== by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
==14722== by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
==14722== by 0x86D6A16: _mesa_set_enable (enable.c:351)
==14722== by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== Address 0xc606310 is 0 bytes after a block of size 18,720 alloc'd
==14722== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==14722== by 0x85202AB: copy_array_to_vbo_array (brw_draw_upload.c:256)
==14722== by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
==14722== by 0x852F975: brw_validate_state (brw_state_upload.c:394)
==14722== by 0x851FA24: brw_draw_prims (brw_draw.c:365)
==14722== by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
==14722== by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
==14722== by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
==14722== by 0x86D6A16: _mesa_set_enable (enable.c:351)
==14722== by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34604
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 24 Feb 2011 10:12:37 +0000 (10:12 +0000)]
intel: Protect against waiting on a NULL render target bo
If we fall back to software rendering due to the render target being
absent (GPU hang or other error in creating the named target), then we
do not need to nor should we wait upon the results.
Reported-by: Magnus Kessler <Magnus.Kessler@gmx.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34656
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Dave Airlie [Thu, 17 Feb 2011 05:07:57 +0000 (15:07 +1000)]
r600g: EXT_texture_array support.
This adds EXT_texture_array support to r600g, it passes the piglit
array-texture test but I suspect may not be complete.
It currently requires a kernel patch to fix the CS checker to allow
these, so you need to use R600_ARRAY_TEXTURE=true for now
to enable them.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 18 Feb 2011 04:51:58 +0000 (14:51 +1000)]
st/mesa: treat 1D ARRAY upload like a depth or 2D array upload.
This is because the HW doesn't always store a 1D array like a
2D texture, it more likely stores it like 2D texture (i.e.
alignments etc).
This means we upload each slice separately and let the driver
work out where to put it.
this might break nvc0 as I can't test it, I have only nv50 here.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Vinson Lee [Thu, 24 Feb 2011 02:21:14 +0000 (18:21 -0800)]
scons: Fix Cygwin platform names.
Fixes immediate Python exceptions with SCons on Cygwin.
Jakob Bornecrantz [Tue, 22 Feb 2011 23:12:08 +0000 (23:12 +0000)]
i915g: Lazy emit dynamic state
Jakob Bornecrantz [Mon, 21 Feb 2011 23:39:10 +0000 (23:39 +0000)]
i915g: Lazy emit immediate state
Jakob Bornecrantz [Tue, 22 Feb 2011 22:07:03 +0000 (22:07 +0000)]
i915g: Disable LIS7 state updates for now
Jakob Bornecrantz [Mon, 21 Feb 2011 23:09:43 +0000 (23:09 +0000)]
i915g: Clean up in i915_state_immediate
Jakob Bornecrantz [Mon, 21 Feb 2011 22:47:40 +0000 (22:47 +0000)]
i915g: Remove outdated comment
Jakob Bornecrantz [Tue, 22 Feb 2011 00:20:39 +0000 (00:20 +0000)]
i915g: Use dump function in sw winsys
Jakob Bornecrantz [Wed, 23 Feb 2011 00:11:09 +0000 (00:11 +0000)]
i915g: Enable mirror repeat wrap mode
Jakob Bornecrantz [Tue, 22 Feb 2011 22:28:06 +0000 (22:28 +0000)]
i915g: Always set vbo to flush on flushes
Reported-by Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Wed, 23 Feb 2011 23:09:36 +0000 (23:09 +0000)]
intel: gen3 is particular sensitive to batch size
... and prefers a small batch whereas gen4+ prefer a large batch to
carry more state.
Tuning using openarena/padman indicate that a batch size of just 4096 is
best for those cases.
Bugzilla: https://bugs.freedesktop.org/process_bug.cgi
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Wed, 23 Feb 2011 22:09:12 +0000 (22:09 +0000)]
i915: And remember assign the new value to the state reg...
Fixes regression from
298ebb78de8a6b6edf0aa0fe8d784d00bbc2930e.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34589
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tom Fogal [Tue, 22 Feb 2011 05:32:18 +0000 (22:32 -0700)]
Fix GLX_USE_TLS define.
It was only getting set in the case of DRI drivers.
Fabian Bieler [Mon, 14 Feb 2011 21:44:42 +0000 (22:44 +0100)]
r600g: Request DWORD aligned vertex buffers.
The spec says that the offsets in the vertex-fetch instructions need to be byte-aligned and makes no specification with regard to the required alignment of the offset and stride in the vertex resource constant register.
However, testing indicates that all three values need to be DWORD aligned.
Wiktor Janas [Wed, 23 Feb 2011 06:10:12 +0000 (07:10 +0100)]
st/mesa: fix computing the lowest address for interleaved attribs
Ptr can be very well NULL, so when there are two arrays, with one having
offset 0 (and thus NULL Ptr), and the other having a non-zero offset,
the non-zero value is taken as minimum (because of !low_addr ? start ...).
On 32-bit systems, this somehow works. On 64-bit systems, it leads to crashes.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Brian Paul [Tue, 22 Feb 2011 21:31:15 +0000 (14:31 -0700)]
vbo: added vbo_check_buffers_are_unmapped() debug function
Brian Paul [Tue, 22 Feb 2011 21:23:50 +0000 (14:23 -0700)]
vbo: removed unused #defines, add comments
Brian Paul [Tue, 22 Feb 2011 20:37:30 +0000 (13:37 -0700)]
mesa: move comment, change debug code
Brian Paul [Tue, 22 Feb 2011 20:31:09 +0000 (13:31 -0700)]
vbo: simplify NeedFlush flag clearing
Brian Paul [Tue, 22 Feb 2011 20:24:56 +0000 (13:24 -0700)]
vbo: use ctx intstead of exec->ctx
Brian Paul [Tue, 22 Feb 2011 19:44:42 +0000 (12:44 -0700)]
r300g: fix missing initializers warning
Brian Paul [Tue, 22 Feb 2011 19:44:10 +0000 (12:44 -0700)]
i915g: remove extra semicolons
Andy Skinner [Fri, 11 Feb 2011 15:31:25 +0000 (07:31 -0800)]
xlib: pass Display pointer to XMesaGarbageCollect()
Fixes an issue when different displays are used on different threads.
Signed-off-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Sun, 20 Feb 2011 01:22:47 +0000 (17:22 -0800)]
i965: Increase Sandybridge point size clamp.
255.875 matches the hardware documentation. Presumably this was a typo.
Found by inspection. Not known to fix any issues.
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Sun, 20 Feb 2011 00:48:24 +0000 (16:48 -0800)]
i965/fs: Correctly set up gl_FragCoord.w on Sandybridge.
pixel_w is the final result; wpos_w is used on gen4 to compute it.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Sun, 20 Feb 2011 00:12:28 +0000 (16:12 -0800)]
i965/fs: Refactor control flow stack handling.
We can't safely use fixed size arrays since Gen6+ supports unlimited
nesting of control flow.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Sat, 19 Feb 2011 09:05:11 +0000 (01:05 -0800)]
i965/fs: Avoid register coalescing away gen6 MATH workarounds.
The code that generates MATH instructions attempts to work around
the hardware ignoring source modifiers (abs and negate) by emitting
moves into temporaries. Unfortunately, this pass coalesced those
registers, restoring the original problem. Avoid doing that.
Fixes several OpenGL ES2 conformance failures on Sandybridge.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Sat, 19 Feb 2011 09:03:08 +0000 (01:03 -0800)]
i965/fs: Apply source modifier workarounds to POW as well.
Single-operand math already had these workarounds, but POW (the only two
operand function) did not. It needs them too - otherwise we can hit
assertion failures in brw_eu_emit.c when code is actually generated.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 22 Feb 2011 18:04:18 +0000 (10:04 -0800)]
i965: Fix shaders that write to gl_PointSize on Sandybridge.
gl_PointSize (VERT_RESULT_PSIZ) doesn't take up a message register,
as it's part of the header. Without this fix, writing to gl_PointSize
would cause the SF to read and use the wrong attributes, leading to all
kinds of random looking failure.
Reviewed-by: Eric Anholt <eric@anholt.net>
José Fonseca [Tue, 22 Feb 2011 14:59:09 +0000 (14:59 +0000)]
mesa: Avoid undeclared ffs function warning on mingw.
José Fonseca [Tue, 22 Feb 2011 14:14:45 +0000 (14:14 +0000)]
gallium: s/PIPE_TRANSFER_CPU_READ/PIPE_TRANSFER_READ/ in comments.
José Fonseca [Tue, 22 Feb 2011 14:14:22 +0000 (14:14 +0000)]
gallium/docs: Update PIPE_TRANSFER_xx docs. Reformat to use definitions.
Keith Whitwell [Wed, 5 Jan 2011 17:33:43 +0000 (17:33 +0000)]
gallium: new transfer flag: DISCARD_WHOLE_RESOURCE
Marek Olšák [Sun, 20 Feb 2011 17:05:24 +0000 (18:05 +0100)]
st/mesa: fix crash when using both user and vbo buffers with the same stride
If two buffers had the same stride where one buffer is a user one and
the other is a vbo, it was considered to be one interleaved buffer,
resulting in incorrect rendering and crashes.
This patch makes sure that the interleaved buffer is either user or vbo,
not both.
Marek Olšák [Sun, 20 Feb 2011 15:50:48 +0000 (16:50 +0100)]
st/mesa: fix crash when DrawBuffer->_ColorDrawBuffers[0] is NULL
This fixes the game Tiny and Big.
Chris Wilson [Tue, 22 Feb 2011 11:17:39 +0000 (11:17 +0000)]
i965: Trim the interleaved upload to the minimum number of vertices
... should have no impact on a properly formatted draw operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Tue, 22 Feb 2011 11:19:32 +0000 (11:19 +0000)]
i965: Reinstate max-index paranoia
Don't trust the applications not to reference beyond the end of the
vertex buffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Tue, 22 Feb 2011 11:18:25 +0000 (11:18 +0000)]
i965: Zero the offset into the vbo when uploading non-interleaved
Fixes regression from
559435d9152acc7162e4e60aae6591c7c6c8274b.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Jakob Bornecrantz [Wed, 1 Dec 2010 04:04:25 +0000 (05:04 +0100)]
st/dri: Track drawable context bindings
Needs to track this ourself since because we get into a race condition with
the dri_util.c code on make current when rendering to the front buffer.
This is what happens:
Old context is rendering to the front buffer.
App calls MakeCurrent with a new context. dri_util.c sets
drawable->driContextPriv to the new context and then calls the driver make
current. st/dri make current flushes the old context, which calls back into
st/dri via the flush frontbuffer hook. st/dri calls dri loader flush
frontbuffer, which calls invalidate buffer on the drawable into st/dri.
This is where things gets wrong. st/dri grabs the context from the dri
drawable (which now points to the new context) and calls invalidate
framebuffer to the new context which has not yet set the new drawable as its
framebuffers since we have not called make current yet, it asserts.
Eric Anholt [Tue, 22 Feb 2011 00:24:41 +0000 (16:24 -0800)]
i965: Fix VB packet reuse when offset for the new buffer isn't stride aligned.
Fixes regression in scissor-stencil-clear and 5 other tests.
Brian Paul [Tue, 22 Feb 2011 00:01:00 +0000 (17:01 -0700)]
Revert "mesa: convert macros to inline functions"
This reverts commit
e9ff76aa81d9bd973d46b7e46f1e4ece2112a5b7.
Need to use macros so __FUNCTION__ reports the caller.
Brian Paul [Mon, 21 Feb 2011 23:54:23 +0000 (16:54 -0700)]
st/mesa: need to translate clear color according to surface's base format
When clearing a GL_LUMINANCE_ALPHA buffer, for example, we need to convert
the clear color (R,G,B,A) to (R,R,R,A). We were doing this for texture border
colors but not renderbuffers. Move the translation function to st_format.c
and share it.
This fixes the piglit fbo-clear-formats test.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
Brian Paul [Mon, 21 Feb 2011 23:46:02 +0000 (16:46 -0700)]
st/mesa: fix the default case in st_format_datatype()
Part of the fix for piglit fbo-clear-formats
NOTE: This is a candidate for the 7.9 and 7.10 branches.
Daniel Vetter [Mon, 21 Feb 2011 18:14:02 +0000 (19:14 +0100)]
i915g: add some throttling
Intel classic drivers switched to this, too, so it must be good.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Mon, 21 Feb 2011 17:25:20 +0000 (18:25 +0100)]
i915g: s/bool/boolean/ style-fixup in winsys
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jakob Bornecrantz [Mon, 21 Feb 2011 22:00:02 +0000 (22:00 +0000)]
i915g: Fix warning
Jakob Bornecrantz [Sun, 20 Feb 2011 12:41:18 +0000 (13:41 +0100)]
i915g: Add option to lie about caps
Jakob Bornecrantz [Sun, 20 Feb 2011 11:52:55 +0000 (12:52 +0100)]
i915g: Move debug fields to screen
Jakob Bornecrantz [Sun, 20 Feb 2011 10:41:32 +0000 (11:41 +0100)]
i915g: Use debug get once options
Jakob Bornecrantz [Sun, 20 Feb 2011 11:52:11 +0000 (12:52 +0100)]
i915g: Rework texture tiling a bit
Jakob Bornecrantz [Mon, 21 Feb 2011 21:27:05 +0000 (21:27 +0000)]
i915g: Anisotropic filtering works
Jakob Bornecrantz [Sun, 20 Feb 2011 13:00:03 +0000 (14:00 +0100)]
i915g: TODO about point sprites
Jakob Bornecrantz [Sun, 20 Feb 2011 12:58:11 +0000 (13:58 +0100)]
i915g: TODO about untested code hidden behind caps
Should be fairly easy to test and fix since you can look at
the code in the classic driver.
Jakob Bornecrantz [Sun, 20 Feb 2011 10:45:48 +0000 (11:45 +0100)]
i915g: Reorg caps
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
st/mesa: fix incorrect texture size allocation in st_finalize_texture()
If finalizing a non-POW mipmapped texture with an odd-sized base texture
image we were allocating the wrong size of gallium texture (off by one).
Need to be more careful about computing the base texture image size.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=34463
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
st/mesa: refactor guess_and_alloc_texture() code
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
st/mesa: fix mipmap generation for non-POW textures
This is part of the fix for https://bugs.freedesktop.org/show_bug.cgi?id=34463
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
mesa: convert macros to inline functions
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: more comments
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: make vbo_exec_FlushVertices_internal() static
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: remove old debug code, add comments
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: rename, document function params
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: comments
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: replace assert(0) with proper assertions
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: rename some vars, add new comments, fix formatting, etc.
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
vbo: use ctx instead of exec->ctx
Brian Paul [Mon, 21 Feb 2011 22:11:44 +0000 (15:11 -0700)]
radeon: add default switch case to silence unhandled enum warning
Ian Romanick [Thu, 10 Feb 2011 18:26:42 +0000 (10:26 -0800)]
Use C-style system headers in C++ code to avoid issues with std:: namespace
Chris Wilson [Mon, 21 Feb 2011 20:56:06 +0000 (20:56 +0000)]
intel: Fix insufficient integer width for upload buffer offset
I was being overly miserly and gave the offset of the buffer into the bo
insufficient bits, distracted by the adjacency of the buffer[4096].
Ref: https://bugs.freedesktop.org/show_bug.cgi?id=34541
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
José Fonseca [Mon, 21 Feb 2011 18:24:36 +0000 (18:24 +0000)]
svga: Remove some remaining fake S3TC rendering support.
Chris Wilson [Mon, 21 Feb 2011 16:02:26 +0000 (16:02 +0000)]
i965: Remove spurious duplicate ADVANCE_BATCH
... a leftover from a bad merge.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 25 Nov 2010 15:41:37 +0000 (15:41 +0000)]
i915: Emit a single relocation per vbo
Reducing the number of relocations has lots of nice knock-on effects,
not least including reducing batch buffer size, auxilliary array sizes
(vmalloced and copied into the kernel), processing of uncached
relocations etc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 26 Nov 2010 11:18:50 +0000 (11:18 +0000)]
i915: Suppress emission of redundant stencil updates
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 26 Nov 2010 10:57:06 +0000 (10:57 +0000)]
i915: Separate BLEND from general context state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 26 Nov 2010 10:25:23 +0000 (10:25 +0000)]
i915: Only flag context changes if the actual state is changed
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 25 Nov 2010 22:27:37 +0000 (22:27 +0000)]
i915: suppress repeated sampler state emission
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 25 Nov 2010 21:39:21 +0000 (21:39 +0000)]
i915: Eliminate redundant CONSTANTS updates
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Tue, 8 Feb 2011 22:58:35 +0000 (22:58 +0000)]
i965: Use compiler builtins when available
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sun, 20 Feb 2011 15:36:52 +0000 (15:36 +0000)]
i965: Micro-optimise check_state
Replace the intermediate tests due to the logical or with the bitwise
or.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 30 Dec 2010 21:47:39 +0000 (21:47 +0000)]
intel: use throttle ioctl for throttling
Rather than waiting on the first batch after the last swapbuffers to be
retired, call into the kernel to wait upon the retirement of any request
less than 20ms old. This has the twofold advantage of (a) not blocking
any other clients from utilizing the device whilst we wait and (b) we
attain higher throughput without overloading the system.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sat, 12 Feb 2011 11:28:25 +0000 (11:28 +0000)]
i965: Remove unused 'next_free_page' member
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 00:03:48 +0000 (00:03 +0000)]
intel: Skip the flush before read-pixels via blit
As we will flush when reading the return values of the blit, we can forgo
the earlier flush.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 18 Feb 2011 10:37:43 +0000 (10:37 +0000)]
intel: extend current vertex buffers
If the next vertex arrays are a (discontiguous) continuation of the
current arrays, such that the new vertices are simply offset from the
start of the current vertex buffer definitions we can reuse those
defintions and avoid the overhead of relocations and invalidations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 18 Feb 2011 12:30:37 +0000 (12:30 +0000)]
intel: Use specified alignment for writes into the upload buffer
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 15:29:26 +0000 (15:29 +0000)]
i965: Clean up brw_prepare_vertices()
Use a temporary glarray variable to replace the numerous input->glarray.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 19:40:08 +0000 (19:40 +0000)]
intel: combine short memcpy using a temporary allocated buffer
Using a temporary buffer for large discontiguous uploads into the common
buffer and a single buffered upload is faster than performing the
discontiguous copies through a mapping into the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 14:14:18 +0000 (14:14 +0000)]
i965: upload normal arrays as interleaved
Upload the non-vbo arrays into a single interleaved buffer object, and
so need to just emit a single vertex buffer relocation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 14:45:19 +0000 (14:45 +0000)]
i965: interleaved vbo
If the user passed in several arrays interleaved in the same vbo, only
emit a single vertex buffer and relocation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 14:45:10 +0000 (14:45 +0000)]
i965: emit one vb packet per vbo
Track reuse of the vertex buffer objects and so minimise the number of
vertex buffers used by the hardware (and their relocations).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Fri, 11 Feb 2011 00:18:21 +0000 (00:18 +0000)]
i965: upload transient indices into the same discontiguous buffer
As we now pack the indices into a common upload buffer, we can reuse a
single CMD_INDEX_BUFFER packet and translate each invocation with a
start vertex offset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sun, 20 Feb 2011 13:37:00 +0000 (13:37 +0000)]
i965: suppress repeat-emission of identical vertex elements
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sun, 20 Feb 2011 13:23:47 +0000 (13:23 +0000)]
i965: Move repeat-instruction-suppression to batchbuffer core
Move the tracking of the last emitted instructions into the core
batchbuffer routines and take advantage of the shadow batch copy to
avoid extra memory allocations and copies.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 10 Feb 2011 20:25:51 +0000 (20:25 +0000)]
intel: use pwrite for batch
It's faster. Not only is the memcpy more efficiently performed in the
kernel (making up for the system call overhead), but by not using mmap
we remove the greater overhead of tracking the vma of every batch.
And it means we can read back from the batch buffer without incurring
the cost of a uncached read through the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 10 Feb 2011 18:31:13 +0000 (18:31 +0000)]
i965: drop state_bo references to batch_bo
As we use state relocations and we know that all the state belongs to
the same bo, we can drop the multiple references to the same bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 10 Feb 2011 18:14:40 +0000 (18:14 +0000)]
i965: directly write wm state to batch
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Thu, 10 Feb 2011 18:11:58 +0000 (18:11 +0000)]
i965: write cc straight to batch
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>