platform/upstream/mesa.git
8 years agoradeonsi: move si_shader_dump call out of si_compile_llvm
Marek Olšák [Sun, 3 Jan 2016 16:18:04 +0000 (17:18 +0100)]
radeonsi: move si_shader_dump call out of si_compile_llvm

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: inline si_shader_binary_read
Marek Olšák [Sun, 3 Jan 2016 16:05:05 +0000 (17:05 +0100)]
radeonsi: inline si_shader_binary_read

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move si_shader_dump call out of si_shader_binary_read
Marek Olšák [Sun, 3 Jan 2016 16:03:24 +0000 (17:03 +0100)]
radeonsi: move si_shader_dump call out of si_shader_binary_read

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: separate shader dumping code to si_shader_dump and *_dump_stats
Marek Olšák [Sun, 3 Jan 2016 15:39:24 +0000 (16:39 +0100)]
radeonsi: separate shader dumping code to si_shader_dump and *_dump_stats

Eventually, I'd like to dump stats for several combined binaries, which is
why you don't see a binary parameter in si_shader_dump_stats

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add si_shader_destroy_binary
Marek Olšák [Sun, 27 Dec 2015 23:53:29 +0000 (00:53 +0100)]
radeonsi: add si_shader_destroy_binary

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't pass si_shader to si_compile_llvm
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_compile_llvm

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move si_shader_binary_upload out of si_compile_llvm
Marek Olšák [Sun, 27 Dec 2015 22:47:00 +0000 (23:47 +0100)]
radeonsi: move si_shader_binary_upload out of si_compile_llvm

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: always keep shader code, rodata, and relocs in memory
Marek Olšák [Sun, 27 Dec 2015 22:35:08 +0000 (23:35 +0100)]
radeonsi: always keep shader code, rodata, and relocs in memory

We won't compile shaders in draw calls, but we will concatenate shader
binaries according to states in draw calls, so keep the binaries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't pass si_shader to si_shader_binary_read
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_shader_binary_read

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't pass si_shader to si_shader_binary_read_config
Marek Olšák [Mon, 28 Dec 2015 00:45:00 +0000 (01:45 +0100)]
radeonsi: don't pass si_shader to si_shader_binary_read_config

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: add struct si_shader_config
Marek Olšák [Sun, 27 Dec 2015 23:14:05 +0000 (00:14 +0100)]
radeonsi: add struct si_shader_config

There will be 1 config per variant, which will be a union of configs
from {prolog, main, epilog}. For now, just add the structure.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move NULL exporting into a separate function
Marek Olšák [Sun, 27 Dec 2015 19:05:19 +0000 (20:05 +0100)]
radeonsi: move NULL exporting into a separate function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move MRT color exporting into a separate function
Marek Olšák [Sun, 27 Dec 2015 19:02:41 +0000 (20:02 +0100)]
radeonsi: move MRT color exporting into a separate function

This will be used by a fragment shader epilog.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: use EXP_NULL for pixel shaders without outputs
Marek Olšák [Sun, 27 Dec 2015 18:36:33 +0000 (19:36 +0100)]
radeonsi: use EXP_NULL for pixel shaders without outputs

This never happens currently.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: only use LLVMBuildLoad once when updating color outputs at the end
Marek Olšák [Sun, 27 Dec 2015 16:53:44 +0000 (17:53 +0100)]
radeonsi: only use LLVMBuildLoad once when updating color outputs at the end

without LLVMBuildStore.

So:
- do LLVMBuildLoad
- update the values as necessary
- export

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: export "undef" values for undefined PS outputs
Marek Olšák [Sun, 27 Dec 2015 16:45:52 +0000 (17:45 +0100)]
radeonsi: export "undef" values for undefined PS outputs

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: move MRTZ export into a separate function
Marek Olšák [Sun, 27 Dec 2015 16:38:37 +0000 (17:38 +0100)]
radeonsi: move MRTZ export into a separate function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: simplify setting the DONE bit for PS exports
Marek Olšák [Wed, 23 Dec 2015 17:06:04 +0000 (18:06 +0100)]
radeonsi: simplify setting the DONE bit for PS exports

First find out what the last export is and simply set the DONE bit there.
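
As an illustration only (not code from this patch), the approach can be sketched as follows. The struct export_cmd type and the mark_last_export_done() helper are hypothetical names invented for this sketch; the real driver builds LLVM export intrinsics instead.

```c
/* Hypothetical record for one queued export; not the driver's real type. */
struct export_cmd {
   unsigned target;   /* which export target this command writes */
   int done;          /* 1 if the DONE bit is set on this export */
};

/* Clear DONE everywhere, then set it only on the last export, if any. */
void mark_last_export_done(struct export_cmd *exports, unsigned count)
{
   for (unsigned i = 0; i < count; i++)
      exports[i].done = 0;

   if (count > 0)
      exports[count - 1].done = 1;
}
```

Collecting the exports first means the DONE decision is made in one place instead of inside every export path.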

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation
Marek Olšák [Wed, 23 Dec 2015 15:43:54 +0000 (16:43 +0100)]
radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: write all MRTs only if there is exactly one output
Marek Olšák [Wed, 23 Dec 2015 15:24:02 +0000 (16:24 +0100)]
radeonsi: write all MRTs only if there is exactly one output

This doesn't fix a known bug, but better safe than sorry.

Also, simplify the expression in si_shader.c.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation
Marek Olšák [Wed, 23 Dec 2015 15:02:46 +0000 (16:02 +0100)]
radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: determine DB_SHADER_CONTROL outside of shader compilation
Marek Olšák [Wed, 23 Dec 2015 14:36:05 +0000 (15:36 +0100)]
radeonsi: determine DB_SHADER_CONTROL outside of shader compilation

because the API pixel shader binary will not emulate alpha test one day,
so the KILL_ENABLE bit must be determined elsewhere.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agotgsi/scan: set which color components are read by a fragment shader
Marek Olšák [Fri, 1 Jan 2016 18:42:44 +0000 (19:42 +0100)]
tgsi/scan: set which color components are read by a fragment shader

This will be used by radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agotgsi/scan: fix tgsi_shader_info::reads_z
Marek Olšák [Sat, 2 Jan 2016 16:28:19 +0000 (17:28 +0100)]
tgsi/scan: fix tgsi_shader_info::reads_z

This has no users in Mesa.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agotgsi/scan: set if a fragment shader writes sample mask
Marek Olšák [Wed, 23 Dec 2015 02:01:32 +0000 (03:01 +0100)]
tgsi/scan: set if a fragment shader writes sample mask

This will be used by radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoglsl: Disallow vectorization of vector_insert/extract.
Kenneth Graunke [Tue, 5 Jan 2016 13:34:24 +0000 (05:34 -0800)]
glsl: Disallow vectorization of vector_insert/extract.

vector_insert takes a vector, a scalar location, and a scalar value,
and produces a new vector with that component updated.  As such, it
can't be vectorized properly.

vector_extract takes a vector and a scalar location, and returns
that scalar component of the vector.  Vectorization doesn't really
make any sense.

Treating both as horizontal operations makes sure the vectorizer
won't try to touch these.

Found by inspection.
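
A minimal C sketch of the two operations as described above, for illustration only. The struct vec4 type and both function names are assumptions made for this example and are not the GLSL IR implementation.

```c
/* Hypothetical 4-component vector type, used only for this illustration. */
struct vec4 {
   float c[4];
};

/* vector_insert: copy the vector and overwrite one component.
 * The index is masked to 0..3 so the sketch never writes out of bounds. */
struct vec4 vector_insert(struct vec4 v, unsigned i, float value)
{
   v.c[i & 3] = value;
   return v;
}

/* vector_extract: return the single component selected by the index. */
float vector_extract(struct vec4 v, unsigned i)
{
   return v.c[i & 3];
}
```

Both operations mix a whole-vector operand with scalar operands, which is why they do not map onto per-component vectorization.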

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agosoftpipe: tell draw about the vertex layout we want
Roland Scheidegger [Tue, 22 Dec 2015 02:42:33 +0000 (03:42 +0100)]
softpipe: tell draw about the vertex layout we want

This makes it more similar to llvmpipe. It also allows us to let draw emit
code handle things like getting zeros for non-existing vs outputs
automatically. There probably isn't any real overhead either way: there is no
"simply copy everything" code in the emit path, so it would copy each
attrib individually just the same. Likewise, we still do another mapping step
in softpipe, as the layout may still not match exactly (same as in llvmpipe;
we should probably nuke the pointless mapping in both drivers).

This fixes the piglit arb_fragment_layer_viewport no_gs/no_write tests.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agollvmpipe: use ints not unsigned for slots
Roland Scheidegger [Sat, 19 Dec 2015 05:12:27 +0000 (06:12 +0100)]
llvmpipe: use ints not unsigned for slots

They can't actually be 0 (as position is there) but should avoid confusion.

This was supposed to have been done by af7ba989fb5a39925a0a1261ed281fe7f48a16cf
but I accidentally pushed an older version of the patch in the end...
Also prettify slightly. And make some notes about the confusing and useless
fs input "map".

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agodraw: nuke the interp parameter from vertex_info
Roland Scheidegger [Sat, 19 Dec 2015 02:43:14 +0000 (03:43 +0100)]
draw: nuke the interp parameter from vertex_info

draw emit couldn't care less what the interpolation mode is...
This somehow looked like it would matter, all drivers more or less
dutifully filled that in correctly. But this is only used for emit,
if draw needs to know about interpolation mode (for clipping for instance)
it will get that information from the vs anyway.
softpipe actually used to depend on that interpolation parameter, as it
abused that structure quite a bit but no longer.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agosoftpipe: don't abuse the draw vertex_info struct for something different
Roland Scheidegger [Sat, 19 Dec 2015 02:37:17 +0000 (03:37 +0100)]
softpipe: don't abuse the draw vertex_info struct for something different

softpipe would calculate two "vertex layouts". The second one was however
just used for internal purposes, draw would know nothing about it even though
it looked exactly the same as the other one we tell draw about.
So, store that information separately as this was just confusing.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agosoftpipe: fix mapping of "special" vs outputs
Roland Scheidegger [Sat, 19 Dec 2015 01:33:25 +0000 (02:33 +0100)]
softpipe: fix mapping of "special" vs outputs

Unlike llvmpipe, softpipe always tells draw to emit the vertices as-is.
The two vertex layouts it calculates are a bit confusing, one which is just
used to tell draw to emit vertices as-is, and the other which has draw written
all over it but draw is completely unaware of and is used only to look up the
correct interpolation info later in setup.
Thus, the slots used are different to what llvmpipe does (I'm going to clean
up the confusing two layout stuff).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agollvmpipe: scratch some special handling of vp_index/layer
Roland Scheidegger [Fri, 18 Dec 2015 20:44:06 +0000 (21:44 +0100)]
llvmpipe: scratch some special handling of vp_index/layer

It was actually slightly buggy (missing initialization / setup not dependent
on new vs albeit I didn't see issues), but the case of non-existing attributes
is now handled by draw emit code, so we don't need that anymore.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agodraw: rework handling of non-existing outputs in emit code
Roland Scheidegger [Thu, 7 Jan 2016 00:52:39 +0000 (01:52 +0100)]
draw: rework handling of non-existing outputs in emit code

Previously the code would just redirect requests for attributes which
don't exist to use output 0. Rework this to output all zeros instead which
seems more useful - in particular some extensions like
ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by
previous stages. That way, drivers don't have to special case this depending
if the vs/gs outputs some attribute or not.
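
A rough sketch of the new behavior, for illustration only. NO_OUTPUT_SLOT and emit_attrib() are hypothetical names for this example; the actual draw emit code is structured differently.

```c
#include <string.h>

/* Hypothetical marker meaning "the vs/gs did not write this attribute". */
#define NO_OUTPUT_SLOT (-1)

/*
 * Copy one 4-float attribute into the emitted vertex.
 * outputs:  shader outputs stored as 4 floats per slot
 * src_slot: slot to read, or NO_OUTPUT_SLOT if the attribute does not exist
 */
void emit_attrib(float dst[4], const float *outputs, int src_slot)
{
   if (src_slot == NO_OUTPUT_SLOT) {
      /* Emit zeros instead of redirecting the request to output 0. */
      memset(dst, 0, 4 * sizeof(float));
   } else {
      memcpy(dst, outputs + 4 * src_slot, 4 * sizeof(float));
   }
}
```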

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agomesa: Add KBL PCI IDs and platform information.
Sarah Sharp [Mon, 21 Sep 2015 21:22:53 +0000 (14:22 -0700)]
mesa: Add KBL PCI IDs and platform information.

Add PCI IDs for the Intel Kabylake platforms.  The IDs are taken
directly from the Linux kernel patches, which are under review:

http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html
http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2

The Kabylake PCI IDs taken from the kernel are rearranged to be in order
of GT type, then PCI ID.

Please note that if this patch is backported, the following fixes will
need to be added before this patch:

commit 28ed1e08e8ba98e "i965/skl: Remove early platform support"
commit c1e38ad37042b0e "i965/skl: Use larger URB size where available."

Thanks to Ben for fixing a bug around setting urb.size, and being
patient with my questions about what the various fields mean.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2)
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
8 years agosvga: Rename SVGA_HINT_FLAG_DRAW_EMITTED
Sinclair Yeh [Thu, 10 Dec 2015 22:26:29 +0000 (14:26 -0800)]
svga: Rename SVGA_HINT_FLAG_DRAW_EMITTED

Rename SVGA_HINT_FLAG_DRAW_EMITTED to SVGA_HINT_FLAG_CAN_PRE_FLUSH
because preemptive flush can be unblocked by more commands than
draw.

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agosvga: allow preemptive flushing on DMA, update, and readback commands
Sinclair Yeh [Wed, 9 Dec 2015 23:05:49 +0000 (15:05 -0800)]
svga: allow preemptive flushing on DMA, update, and readback commands

The existing code effectively turns off preemptive flushing for all
but the regions used for draws.  This turns out to be overly
restrictive as some memory regions, e.g. GMR, may never get a draw
when used as a DMA upload staging area, causing problems for apps
that upload a large amount of textures, e.g. Unigine Heaven.

This patch fixes the Unigine Heaven memory allocation error and
has been verified to not cause a regression in the previous extended
retina display issue.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agosvga: skip vertex attribute instruction with zero usage_mask
Charmaine Lee [Mon, 4 Jan 2016 18:36:48 +0000 (10:36 -0800)]
svga: skip vertex attribute instruction with zero usage_mask

In emit_input_declarations(), we are skipping declarations for those
registers that are not being used. But in emit_vertex_attrib_instructions(),
we are still emitting instructions to tweak the vertex attributes even if
they are not being used. This causes an assert in the backend because an
input register is not declared in the shader. This patch fixes the problem
by skipping the instruction if the vertex attribute is not being used.
The changes in this patch originate from a code snippet that Jose
suggested in bug 1530161.
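
The idea can be sketched like this, for illustration only. The struct input_decl type and count_emitted_adjustments() are hypothetical stand-ins; the real svga shader emitter uses its own structures and emits actual instructions rather than counting them.

```c
/* Hypothetical per-input record; the real svga structures differ. */
struct input_decl {
   unsigned usage_mask;   /* bitmask of components the shader reads */
};

/* Count the inputs that get an adjustment instruction, skipping unused ones. */
unsigned count_emitted_adjustments(const struct input_decl *inputs, unsigned n)
{
   unsigned emitted = 0;

   for (unsigned i = 0; i < n; i++) {
      /* An unused input has no declaration, so nothing may reference it. */
      if (inputs[i].usage_mask == 0)
         continue;

      emitted++;   /* stand-in for emitting the real instruction */
   }
   return emitted;
}
```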

Tested with piglit, Heaven, Turbine, glretrace.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agost/mesa: minor clean-ups in st_atom.c
Brian Paul [Wed, 6 Jan 2016 22:45:08 +0000 (15:45 -0700)]
st/mesa: minor clean-ups in st_atom.c

Remove useless comment.  Reformat code.

8 years agost/mesa: replace bitmap size checks with assertion
Brian Paul [Wed, 6 Jan 2016 18:48:52 +0000 (11:48 -0700)]
st/mesa: replace bitmap size checks with assertion

The _mesa_Bitmap() caller already checks for zero-sized bitmaps.

8 years agost/mesa: check texture target in allocate_full_mipmap()
Brian Paul [Thu, 17 Dec 2015 21:16:24 +0000 (14:16 -0700)]
st/mesa: check texture target in allocate_full_mipmap()

Some kinds of textures never have mipmaps.  3D textures seldom have
mipmaps.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agost/mesa: move mipmap allocation check logic into a function
Brian Paul [Thu, 17 Dec 2015 21:06:11 +0000 (14:06 -0700)]
st/mesa: move mipmap allocation check logic into a function

Better readability and easier to extend.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agomain: s/GLuint/GLbitfield for state bitmasks
Brian Paul [Wed, 6 Jan 2016 15:38:33 +0000 (08:38 -0700)]
main: s/GLuint/GLbitfield for state bitmasks

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agovbo: s/GLuint/GLbitfield/ for state bitmasks
Brian Paul [Wed, 6 Jan 2016 15:38:03 +0000 (08:38 -0700)]
vbo: s/GLuint/GLbitfield/ for state bitmasks

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agost/mesa: use GLbitfield in st_state_flags, add comments
Brian Paul [Wed, 6 Jan 2016 15:32:02 +0000 (08:32 -0700)]
st/mesa: use GLbitfield in st_state_flags, add comments

Use GLbitfield instead of GLuint to be consistent with other variables.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agos/GLuint/GLbitfield/ for st_invalidate_state() parameter
Brian Paul [Wed, 6 Jan 2016 15:33:36 +0000 (08:33 -0700)]
s/GLuint/GLbitfield/ for st_invalidate_state() parameter

To match dd_function_table::UpdateState().

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agost/mesa: be more careful about state validation in st_Bitmap()
Brian Paul [Wed, 6 Jan 2016 01:11:14 +0000 (18:11 -0700)]
st/mesa: be more careful about state validation in st_Bitmap()

If the only dirty state is mesa's _NEW_PROGRAM_CONSTANTS flag, we can
skip state validation before drawing a bitmap since that state doesn't
affect bitmap rendering.
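
A small sketch of that test, for illustration only. The DIRTY_* bit values and bitmap_needs_validation() are hypothetical; they stand in for Mesa's real state flags, and the sketch looks only at the mesa-side flags.

```c
/* Hypothetical dirty bits; the real flag values in Mesa are different. */
#define DIRTY_PROGRAM_CONSTANTS (1u << 0)
#define DIRTY_OTHER_STATE       (1u << 1)

/* Return 1 if anything other than program constants is dirty. */
int bitmap_needs_validation(unsigned dirty_mesa)
{
   return (dirty_mesa & ~DIRTY_PROGRAM_CONSTANTS) != 0;
}
```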

This further increases the performance of the ipers demo on llvmpipe
to about what it was before commit 36c93a6fae27561.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agost/mesa: move bitmap cache flushing out of state validation
Brian Paul [Wed, 6 Jan 2016 01:28:57 +0000 (18:28 -0700)]
st/mesa: move bitmap cache flushing out of state validation

Just do it where needed (before drawing, clearing, etc).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agost/mesa: check state->mesa in early return check in st_validate_state()
Brian Paul [Wed, 6 Jan 2016 00:10:12 +0000 (17:10 -0700)]
st/mesa: check state->mesa in early return check in st_validate_state()

We were checking the dirty->st flags but not the dirty->mesa flags.
When we took the early return, we didn't clear the dirty->mesa flags
so the next time we called st_validate_state() we'd often flush the
glBitmap cache.  And since st_validate_state() is called from
st_Bitmap(), it meant we flushed the bitmap cache for every glBitmap()
call.
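
A simplified model of the fixed early-return check, for illustration only. The struct dirty_flags type and validate_state() are assumptions for this sketch and leave out everything st_validate_state() really does.

```c
/* Hypothetical pair of dirty bitmasks, mirroring dirty->mesa and dirty->st. */
struct dirty_flags {
   unsigned mesa;
   unsigned st;
};

/* Return 1 if validation work was done, 0 if the early return was taken. */
int validate_state(struct dirty_flags *dirty)
{
   /* Only return early when BOTH sets of flags are clean. */
   if (dirty->st == 0 && dirty->mesa == 0)
      return 0;

   /* ... the real state updates would happen here ... */

   /* Clear both sets so the next call can take the early return. */
   dirty->st = 0;
   dirty->mesa = 0;
   return 1;
}
```

With the check on both masks, a second call with no new state changes takes the early return and has nothing left to flush.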

This change seems to recover most of the performance loss observed
with the ipers demo on llvmpipe since commit 36c93a6fae27561.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
8 years agost/mesa: protect debug printf() with a conditional instead of comment
Brian Paul [Wed, 6 Jan 2016 00:26:29 +0000 (17:26 -0700)]
st/mesa: protect debug printf() with a conditional instead of comment

8 years agost/mesa: fix comment indentation in st_flush_bitmap_cache()
Brian Paul [Wed, 6 Jan 2016 00:38:00 +0000 (17:38 -0700)]
st/mesa: fix comment indentation in st_flush_bitmap_cache()

8 years agoglsl: fix varying slot allocation for blocks and structs with explicit locations
Timothy Arceri [Wed, 6 Jan 2016 09:22:46 +0000 (20:22 +1100)]
glsl: fix varying slot allocation for blocks and structs with explicit locations

Previously each member was being counted as using a single slot,
count_attribute_slots() fixes the count for array and struct members.

Also don't assign a negative value to the unsigned expl_location variable.
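
To show why one slot per member undercounts, here is a simplified slot counter, for illustration only. The struct type layout and count_slots() are hypothetical and ignore details that the real count_attribute_slots() handles, such as matrices and doubles.

```c
/* Hypothetical, simplified type description (not glsl_type). */
enum type_kind { KIND_VECTOR, KIND_ARRAY, KIND_STRUCT };

struct type {
   enum type_kind kind;
   unsigned length;                     /* array length or member count */
   const struct type *element;          /* element type for arrays */
   const struct type *const *members;   /* member types for structs */
};

/* Count varying slots recursively: arrays and structs use more than one. */
unsigned count_slots(const struct type *t)
{
   unsigned total = 0;

   switch (t->kind) {
   case KIND_ARRAY:
      return t->length * count_slots(t->element);
   case KIND_STRUCT:
      for (unsigned i = 0; i < t->length; i++)
         total += count_slots(t->members[i]);
      return total;
   default:
      return 1;   /* a scalar or vector occupies one slot in this sketch */
   }
}
```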

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agoglsl: don't try adding built-ins to explicit locations bitmask
Timothy Arceri [Tue, 15 Dec 2015 05:23:29 +0000 (16:23 +1100)]
glsl: don't try adding built-ins to explicit locations bitmask

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoglsl: fix overlapping of varying locations for arrays and structs
Timothy Arceri [Tue, 15 Dec 2015 05:40:26 +0000 (16:40 +1100)]
glsl: fix overlapping of varying locations for arrays and structs

Previously we were only reserving a single location for arrays and
structs.

We also didn't take into account implicit locations clashing with
explicit locations when assigning locations for their arrays or
structs.

This patch fixes both issues.
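
The reservation logic can be pictured with a bitmask, for illustration only. reserve_locations() is a hypothetical helper, not the linker's code, and it assumes 1 <= slots <= 63 and location + slots <= 64.

```c
#include <stdint.h>

/*
 * Try to reserve `slots` consecutive locations starting at `location`.
 * Returns 1 on success, 0 if any of them is already taken.
 * Assumes 1 <= slots <= 63 and location + slots <= 64.
 */
int reserve_locations(uint64_t *reserved, unsigned location, unsigned slots)
{
   uint64_t mask = ((UINT64_C(1) << slots) - 1) << location;

   if (*reserved & mask)
      return 0;   /* overlaps an earlier explicit or implicit location */

   *reserved |= mask;
   return 1;
}
```

Reserving the full range, rather than only the first location, is what lets a later array or struct detect the clash.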

V5: fix regression for patch inputs/outputs in tessellation shaders
V4: just use count_attribute_slots() to get the number of slots,
also calculate the correct number of slots to reserve for gs and
tess stages by making use of the new get_varying_type() helper.
V3: handle arrays of structs
V2: also fix for arrays of arrays and structs.

Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoglsl: create helper to remove outer vertex index array used by some stages
Timothy Arceri [Fri, 18 Dec 2015 02:53:27 +0000 (13:53 +1100)]
glsl: create helper to remove outer vertex index array used by some stages

This will be used in the following patch for calculating array sizes correctly
when reserving explicit varying locations.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoglsl: remove unused varyings before packing them
Timothy Arceri [Mon, 21 Dec 2015 23:14:45 +0000 (10:14 +1100)]
glsl: remove unused varyings before packing them

Previously we would pack varyings before trying to remove them, this
relied on the packing pass not packing varyings with a location of -1
to avoid packing varyings that should be removed.
However this meant unused varyings with an explicit location would be
packed before they could be removed when we enable packing of them in a
later patch.

V2: fix regression in V1 removing unused varyings in multi-stage SSO,
fix regression with single stage programs.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agogallium/r600: Replace ALIGN_DIVUP with DIV_ROUND_UP
Krzysztof Sobiecki [Tue, 29 Dec 2015 19:27:44 +0000 (20:27 +0100)]
gallium/r600: Replace ALIGN_DIVUP with DIV_ROUND_UP

ALIGN_DIVUP is a driver-specific (r600g) macro that duplicates DIV_ROUND_UP functionality.
Replacing it with DIV_ROUND_UP eliminates this duplication.
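
For reference, a round-up division macro is commonly written as below. This is a sketch of the usual idiom, assumed here rather than copied from the Mesa headers, and blocks_needed() is a hypothetical helper added only to show a use.

```c
/* Integer division that rounds up; valid for non-negative A and positive B. */
#define DIV_ROUND_UP(A, B) (((A) + (B) - 1) / (B))

/* Hypothetical example: how many fixed-size blocks cover `bytes` bytes. */
unsigned blocks_needed(unsigned bytes, unsigned block_size)
{
   return DIV_ROUND_UP(bytes, block_size);
}
```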

Signed-off-by: Krzysztof A. Sobiecki <sobkas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agovc4: Fix driver build from last minute rebase fix.
Eric Anholt [Wed, 6 Jan 2016 20:48:19 +0000 (12:48 -0800)]
vc4: Fix driver build from last minute rebase fix.

I had the driver all tested for the last series, and in my last build I
noticed that get_swizzled_channel was unused now, and removed
it... apparently without testing to find that I removed the wrong channel
swizzle function.

8 years agovc4: Optimize out a comparison for bcsel based on an ALU comparison
Eric Anholt [Wed, 6 Jan 2016 01:18:09 +0000 (17:18 -0800)]
vc4: Optimize out a comparison for bcsel based on an ALU comparison

We routinely have code like:

vec1 ssa_220 = fge ssa_104, ssa_61
vec1 ssa_199 = bcsel ssa_220, ssa_106, ssa_105

and we would compare fge's args and choose between ~0 and 0 to generate
ssa_220, then compare ssa_220 to 0 and choose between bcsel's args.
Instead, try to notice the pattern and compare between fge's args to
select between bcsel's args.
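
In plain C terms the two forms look like this, for illustration only. Both function names are made up for this sketch, which models the NIR pattern rather than the QIR code the driver generates.

```c
#include <stdint.h>

/* Unoptimized form: materialize the boolean, then compare it again. */
float select_two_step(float a, float b, float x, float y)
{
   uint32_t cond = (a >= b) ? ~0u : 0u;   /* fge producing ~0 or 0 */
   return (cond != 0) ? x : y;            /* bcsel comparing against 0 */
}

/* Optimized form: a single comparison drives the select directly. */
float select_fused(float a, float b, float x, float y)
{
   return (a >= b) ? x : y;
}
```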

total instructions in shared programs: 88019 -> 87574 (-0.51%)
instructions in affected programs:     9985 -> 9540 (-4.46%)
total estimated cycles in shared programs: 245752 -> 245237 (-0.21%)
estimated cycles in affected programs:     17232 -> 16717 (-2.99%)

8 years agovc4: Add missing sRGB decode to texel fetches.
Eric Anholt [Wed, 6 Jan 2016 00:36:28 +0000 (16:36 -0800)]
vc4: Add missing sRGB decode to texel fetches.

We only see txf on MSAA textures, currently, and apparently this didn't
impact any of our piglit tests.

8 years agovc4: Add support for GL_ARB_texture_swizzle.
Eric Anholt [Wed, 6 Jan 2016 00:25:07 +0000 (16:25 -0800)]
vc4: Add support for GL_ARB_texture_swizzle.

We already had the code supporting it, since it's needed for the depth
mode when doing shadow comparisons.

8 years agovc4: Use NIR texture lowering for texture swizzling.
Eric Anholt [Sat, 19 Dec 2015 03:15:03 +0000 (19:15 -0800)]
vc4: Use NIR texture lowering for texture swizzling.

We can't use its other features currently (mostly because we don't want
Newton-Raphson on rcps for texture coordinates), but it gets us started.

This eliminates some comparisons with constants in GLB2.7 and ETQW traces
at the QIR level by moving the comparisons into NIR, where they get
constant-folded out.

instructions in affected programs:     165 -> 156 (-5.45%)
total uniforms in shared programs: 32087 -> 32085 (-0.01%)
total estimated cycles in shared programs: 245762 -> 245752 (-0.00%)
estimated cycles in affected programs:     461 -> 451 (-2.17%)

8 years agovc4: Replace the SSA-style SEL operators with conditional MOVs.
Eric Anholt [Tue, 22 Dec 2015 21:37:36 +0000 (13:37 -0800)]
vc4: Replace the SSA-style SEL operators with conditional MOVs.

I'm moving away from QIR being SSA (since NIR is doing lots of SSA
optimization for us now) and instead having QIR just be QPU operations
with virtual registers.  By making our SELs be composed of two MOVs, we
could potentially coalesce the registers for the MOV's src and dst and
eliminate the MOV.
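
A C model of the two shapes, for illustration only. sel() and sel_as_movs() are hypothetical names, and the local variable stands in for a virtual register; this is not QIR or QPU code.

```c
/* SSA-style select: one operation yields the chosen value. */
int sel(int cond, int a, int b)
{
   return cond ? a : b;
}

/* The same select as two conditional moves into one virtual register. */
int sel_as_movs(int cond, int a, int b)
{
   int dst = 0;   /* placeholder; exactly one move below overwrites it */

   if (cond)
      dst = a;    /* move taken when the condition is set */
   if (!cond)
      dst = b;    /* move taken when the condition is clear */

   return dst;
}
```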

total instructions in shared programs: 88448 -> 88028 (-0.47%)
instructions in affected programs:     39845 -> 39425 (-1.05%)
total estimated cycles in shared programs: 246306 -> 245762 (-0.22%)
estimated cycles in affected programs:     162887 -> 162343 (-0.33%)

8 years agovc4: Don't try the SF coalescing unless it's on a def.
Eric Anholt [Mon, 4 Jan 2016 21:56:39 +0000 (13:56 -0800)]
vc4: Don't try the SF coalescing unless it's on a def.

If you want the SF of the value of a register produced from a series of
packing MOVs or conditional MOVs, we can't just SF on the last MOV into
the register.

8 years agogallium/drivers/svga: Use unsigned for loop index
Edward O'Callaghan [Tue, 5 Jan 2016 10:07:23 +0000 (21:07 +1100)]
gallium/drivers/svga: Use unsigned for loop index

Fix a 's/unsigned int/unsigned/' consistency case while here.

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agogallium/drivers/r600: Use unsigned for loop index
Edward O'Callaghan [Tue, 5 Jan 2016 10:07:22 +0000 (21:07 +1100)]
gallium/drivers/r600: Use unsigned for loop index

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agogallium/drivers/ilo: Use unsigned for loop index
Edward O'Callaghan [Tue, 5 Jan 2016 10:07:21 +0000 (21:07 +1100)]
gallium/drivers/ilo: Use unsigned for loop index

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agogallium: Use unsigned for loop index
Edward O'Callaghan [Tue, 5 Jan 2016 10:07:20 +0000 (21:07 +1100)]
gallium: Use unsigned for loop index

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agogallium/drivers: Remove unnecessary semicolons
Edward O'Callaghan [Tue, 5 Jan 2016 10:07:19 +0000 (21:07 +1100)]
gallium/drivers: Remove unnecessary semicolons

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agogallium: Remove unnecessary semicolons
Edward O'Callaghan [Tue, 5 Jan 2016 10:07:18 +0000 (21:07 +1100)]
gallium: Remove unnecessary semicolons

Fix silly issue with MSVC case fall-through support to need
an extra 'break;'

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agollvmpipe: Optimize lp_rast_triangle_32_3_16 for POWER8
Oded Gabbay [Tue, 29 Dec 2015 16:12:35 +0000 (18:12 +0200)]
llvmpipe: Optimize lp_rast_triangle_32_3_16 for POWER8

This patch converts the SSE-optimized lp_rast_triangle_32_3_16()
to VMX/VSX.

I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.

                      FPS/Score
 Name            Before     After    Delta
------------------------------------------------
openarena        16.35      16.7     2.14%
xonotic          4.707      4.97     5.57%

glmark2 didn't show a significant (more than 1%) difference.

v2: Make sure code is built only on POWER8 LE machines

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agollvmpipe: Optimize BUILD_MASK(_LINEAR) for POWER8
Oded Gabbay [Tue, 29 Dec 2015 16:12:34 +0000 (18:12 +0200)]
llvmpipe: Optimize BUILD_MASK(_LINEAR) for POWER8

This patch converts the SSE-optimized build_mask_32() and
build_mask_linear_32() to VMX/VSX.

I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.

                      FPS/Score
  Name            Before     After    Delta
------------------------------------------------
glmark2 (score)   139.8      142.7    2.07%

openarena and xonotic didn't show a significant (more than 1%)
difference.

v2: Make sure code is built only on POWER8 LE machines

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agollvmpipe: Optimize do_triangle_ccw for POWER8
Oded Gabbay [Sun, 13 Dec 2015 15:49:32 +0000 (17:49 +0200)]
llvmpipe: Optimize do_triangle_ccw for POWER8

This patch converts the SSE optimization done in do_triangle_ccw to
VMX/VSX.

I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.

                      FPS/Score
  Name            Before     After    Delta
------------------------------------------------
glmark2 (score)   136.6      139.8    2.34%
openarena         16.14      16.35    1.30%
xonotic           4.655      4.707    1.11%

v2:

- Convert loads to use aligned loads
- Make sure code is built only on POWER8 LE machines

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agollvmpipe: add POWER8 portability file - u_pwr8.h
Oded Gabbay [Thu, 3 Dec 2015 07:11:13 +0000 (09:11 +0200)]
llvmpipe: add POWER8 portability file - u_pwr8.h

This file provides a portability layer that will make it easier to convert
SSE-based functions to VMX/VSX-based functions.

All the functions implemented in this file are prefixed using "vec_".
Therefore, when converting from SSE-based function, one needs to simply
replace the "_mm_" prefix of the SSE function being called to "vec_".

Having said that, not all functions could be converted as such, due to the
differences between the architectures. So, where such a
conversion would hurt performance, I preferred to implement a more ad-hoc
solution. For example, converting the _mm_shuffle_epi32 needed to be done
using ad-hoc masks instead of a generic function.

All the functions in this file support both little-endian and big-endian
but currently the file is built only on POWER8 LE machines.

All of the functions are implemented using the Altivec/VMX intrinsics,
except one where I needed to use inline assembly (due to a missing
intrinsic).

v2:
- Use vec_vgbbd instead of __builtin_vec_vgbbd
- Add an aligned load function
- Don't use typeof()
- Make file build only on POWER8 LE machine

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agoconfigure.ac: Detect if running on POWER8 arch
Oded Gabbay [Thu, 3 Dec 2015 07:11:04 +0000 (09:11 +0200)]
configure.ac: Detect if running on POWER8 arch

To determine whether we can use special POWER8 assembly directives, we
first need to detect whether we are running on the POWER8 architecture.
This patch adds this detection to configure.ac and adds the necessary
compilation flags accordingly.

v2:

- Add option to disable POWER8 instructions generation
- Detect whether building on BE or LE machine and build with
  -mpower8-vector only on LE machine
- Make the printed messages more standard

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years agonir: Add a lower_fdiv option, turn fdiv into fmul/frcp.
Kenneth Graunke [Tue, 5 Jan 2016 13:09:46 +0000 (05:09 -0800)]
nir: Add a lower_fdiv option, turn fdiv into fmul/frcp.

The nir_opt_algebraic rule

(('fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv', a, b))),

can produce new fdiv operations, which need to be lowered on i965,
as we don't actually implement fdiv.  (Normally, we handle this in
GLSL IR's lower_instructions pass, but in the above case we introduce
an fdiv after that point.  So, make NIR do it for us.)
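
The lowering itself can be sketched on a toy expression tree. This is
only an illustration under stated assumptions: struct expr, make_expr,
lower_fdiv, count_fdiv and eval are hypothetical names, not NIR's data
structures or API. Every fdiv(a, b) node is rewritten in place into
fmul(a, frcp(b)).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical mini-IR; NIR's real instructions look nothing like this. */
enum op { OP_CONST, OP_FMUL, OP_FDIV, OP_FRCP };

struct expr {
   enum op op;
   float value;           /* used by OP_CONST only */
   struct expr *src[2];   /* operands; src[1] is NULL for OP_FRCP */
};

static struct expr *
make_expr(enum op op, float value, struct expr *a, struct expr *b)
{
   struct expr *e = malloc(sizeof(*e));

   assert(e != NULL);
   e->op = op;
   e->value = value;
   e->src[0] = a;
   e->src[1] = b;
   return e;
}

/* Rewrite every fdiv(a, b) in the tree into fmul(a, frcp(b)). */
static void
lower_fdiv(struct expr *e)
{
   if (e == NULL)
      return;
   lower_fdiv(e->src[0]);
   lower_fdiv(e->src[1]);
   if (e->op == OP_FDIV) {
      e->src[1] = make_expr(OP_FRCP, 0.0f, e->src[1], NULL);
      e->op = OP_FMUL;
   }
}

/* Count the fdiv nodes left in the tree. */
static int
count_fdiv(const struct expr *e)
{
   if (e == NULL)
      return 0;
   return (e->op == OP_FDIV) + count_fdiv(e->src[0]) + count_fdiv(e->src[1]);
}

static float
eval(const struct expr *e)
{
   switch (e->op) {
   case OP_CONST: return e->value;
   case OP_FMUL:  return eval(e->src[0]) * eval(e->src[1]);
   case OP_FDIV:  return eval(e->src[0]) / eval(e->src[1]);
   case OP_FRCP:  return 1.0f / eval(e->src[0]);
   }
   return 0.0f;
}
```

Running the pass after the algebraic optimizations means an fdiv
created by the flog2 rule is caught as well. Note that a * rcp(b) is not
always bit-identical to a / b, and that the sketch never frees its nodes.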

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
8 years agoi965: Only turn on ARB_compute_shader if we can write registers.
Kenneth Graunke [Tue, 5 Jan 2016 10:54:50 +0000 (02:54 -0800)]
i965: Only turn on ARB_compute_shader if we can write registers.

Compute shaders require reconfiguring the L3 for shared local memory
support.  We have to be able to write the L3 registers to do that.

This effectively turns off compute shaders on kernels older than 4.2.

(Previously, the extension enable was in an API_OPENGL_CORE conditional.
However, that isn't necessary - core Mesa extension handling already
restricts it properly.  I've moved it out in this patch.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
8 years agoi965: Use rcp in brw_lower_texture_gradients rather than 1.0 / x.
Kenneth Graunke [Tue, 5 Jan 2016 12:46:33 +0000 (04:46 -0800)]
i965: Use rcp in brw_lower_texture_gradients rather than 1.0 / x.

That's what it's for.  Plus, we actually implement rcp.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years agomesa: fix GL_MAX_NAME_LENGTH query for tessellation shaders
Timothy Arceri [Wed, 6 Jan 2016 00:27:05 +0000 (11:27 +1100)]
mesa: fix GL_MAX_NAME_LENGTH query for tessellation shaders

This fixes some piglit subtests for ARB_program_interface_query.

V3: remove some of the unnecessary parentheses
V2: fix alignment

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoglsl: don't change the varying type in validation code
Timothy Arceri [Wed, 23 Dec 2015 03:26:49 +0000 (14:26 +1100)]
glsl: don't change the varying type in validation code

There is a function dedicated to demoting unused varyings; let's
trust it to do its job.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoglsl: move lowering after matching validation
Timothy Arceri [Wed, 23 Dec 2015 03:11:04 +0000 (14:11 +1100)]
glsl: move lowering after matching validation

After lowering, the matching flag is_unmatched_generic_inout is lost, so
we need to run this validation before lowering.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoglsl: only add outward facing varyings to resourse list for SSO
Timothy Arceri [Wed, 23 Dec 2015 22:50:59 +0000 (09:50 +1100)]
glsl: only add outward facing varyings to resourse list for SSO

An SSO program can have multiple stages and we only want to add the externally
facing varyings. The current code was adding both the packed inputs and outputs
for the first and last stage of each program.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
8 years agoi965/gen9: Modify the conditions to use blitter on skl+
Anuj Phogat [Tue, 24 Mar 2015 23:07:40 +0000 (16:07 -0700)]
i965/gen9: Modify the conditions to use blitter on skl+

The modified conditions allow skl+ to use the blitter:
 - for all tiling formats
 - to write data to YF/YS tiled surfaces

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
8 years agoi965/gen9: Return false in place of assert in intelEmitCopyBlit()
Anuj Phogat [Tue, 10 Nov 2015 23:33:53 +0000 (15:33 -0800)]
i965/gen9: Return false in place of assert in intelEmitCopyBlit()

This allows the fallback paths to handle it correctly.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoi965/gen9: Remove regions overlap check in fast copy blit
Anuj Phogat [Tue, 3 Nov 2015 18:31:45 +0000 (10:31 -0800)]
i965/gen9: Remove regions overlap check in fast copy blit

Overlapping blits are undefined in OpenGL anyway, so there is no need
for an overlap check here.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoi965/gen9: Don't use fast copy blit in case of non power of 2 cpp
Anuj Phogat [Tue, 28 Jul 2015 17:47:35 +0000 (10:47 -0700)]
i965/gen9: Don't use fast copy blit in case of non power of 2 cpp

Fast copy blit is currently enabled for use only with Yf/Ys tiling.
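
The cpp restriction named in the subject can be expressed with the usual
bit trick. This is a minimal sketch, not the driver's code:
is_power_of_two and can_use_fast_copy_blit are hypothetical names, and
combining the cpp test with the Yf/Ys condition is an assumption based
on the sentence above.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for 1, 2, 4, 8, 16, ...; false for 0 and for values such as 3 or 12. */
static bool
is_power_of_two(uint32_t cpp)
{
   return cpp != 0 && (cpp & (cpp - 1)) == 0;
}

/* Hypothetical gate: Yf/Ys tiling only, and only for power-of-two cpp. */
static bool
can_use_fast_copy_blit(uint32_t cpp, bool yf_or_ys_tiled)
{
   return yf_or_ys_tiled && is_power_of_two(cpp);
}
```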

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
8 years agoi915/i965: Fix typo in perf_debug message
Ian Romanick [Fri, 18 Dec 2015 01:50:34 +0000 (17:50 -0800)]
i915/i965: Fix typo in perf_debug message

Trivial

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
8 years agost/mesa: minor indentation fixes
Brian Paul [Tue, 5 Jan 2016 20:04:46 +0000 (13:04 -0700)]
st/mesa: minor indentation fixes

8 years agodraw: minor indentation fix
Brian Paul [Tue, 5 Jan 2016 20:03:05 +0000 (13:03 -0700)]
draw: minor indentation fix

8 years agomesa: minor clean-up of some memcpy/sizeof() calls in m_matrix.c
Brian Paul [Tue, 5 Jan 2016 20:03:05 +0000 (13:03 -0700)]
mesa: minor clean-up of some memcpy/sizeof() calls in m_matrix.c

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agoutil: add debug_dump_ubyte_rgba_bmp()
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
util: add debug_dump_ubyte_rgba_bmp()

Like debug_dump_float_rgba_bmp() but takes ubyte values.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agomesa: check for z=0 in _mesa_Vertex3dv()
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
mesa: check for z=0 in _mesa_Vertex3dv()

It's very rare that a GL app calls glVertex3dv(), but one in particular
calls it a lot, always with Z = 0.  Check for that condition and convert
the call into glVertex2f.  This reduces VBO memory used and reduces
the number of times we have to switch between float[2] and float[3]
vertex formats in the svga driver.  This results in a small but
measurable performance improvement.
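
The check can be sketched as follows. This is an illustration only:
emit_vertex2f, emit_vertex3f and vertex3dv are hypothetical stand-ins
that record what was emitted, not Mesa's dispatch code.

```c
/* Records of the most recent emitted vertex, for illustration only. */
static int last_components;
static float last_v[3];

static void
emit_vertex2f(float x, float y)
{
   last_components = 2;
   last_v[0] = x;
   last_v[1] = y;
   last_v[2] = 0.0f;
}

static void
emit_vertex3f(float x, float y, float z)
{
   last_components = 3;
   last_v[0] = x;
   last_v[1] = y;
   last_v[2] = z;
}

/* Stand-in for the glVertex3dv entry point: take the cheaper two-component
 * path whenever z is exactly zero. */
static void
vertex3dv(const double *v)
{
   if (v[2] == 0.0)
      emit_vertex2f((float)v[0], (float)v[1]);
   else
      emit_vertex3f((float)v[0], (float)v[1], (float)v[2]);
}
```

As long as every call has z equal to zero, the vertex format stays at
two components and no format switch is triggered.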

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agosvga: fix test for SVGA_NEW_STIPPLE
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
svga: fix test for SVGA_NEW_STIPPLE

We only want to set the SVGA_NEW_STIPPLE dirty flag when the polygon
stipple state changes.  Before, we only set the flag when we were
enabling stipple, but not disabling.

We don't really have to add SVGA_NEW_STIPPLE to the dirty FS state
set since it's a subset of SVGA_NEW_RAST, but let's be explicit.

This doesn't fix any known bugs.
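
A minimal sketch of the corrected test, with hypothetical names
(NEW_STIPPLE, struct raster_state and stipple_dirty_bits are not the
svga driver's own): the flag is raised whenever the stipple enable bit
differs between the old and new state, in either direction.

```c
#include <stdbool.h>
#include <stddef.h>

#define NEW_STIPPLE 0x1u   /* hypothetical dirty bit */

struct raster_state {
   bool poly_stipple_enable;
};

/* Return the dirty bits to set when switching rasterizer state.
 * A NULL state is treated as "stipple disabled". */
static unsigned
stipple_dirty_bits(const struct raster_state *old_state,
                   const struct raster_state *new_state)
{
   bool was_enabled = old_state != NULL && old_state->poly_stipple_enable;
   bool is_enabled = new_state != NULL && new_state->poly_stipple_enable;

   /* Comparing the two covers enabling and disabling alike. */
   return (was_enabled != is_enabled) ? NEW_STIPPLE : 0;
}
```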

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agosvga: add some comments in svga_state_vs.c
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
svga: add some comments in svga_state_vs.c

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agosvga: change svga_hw_view_state::dirty to boolean
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
svga: change svga_hw_view_state::dirty to boolean

Since it's a true/false value.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agosvga: avoid emitting redundant SetVertexBuffers() commands
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
svga: avoid emitting redundant SetVertexBuffers() commands

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agosvga: check for no-ops in svga_bind_sampler_states()
Brian Paul [Tue, 5 Jan 2016 20:03:04 +0000 (13:03 -0700)]
svga: check for no-ops in svga_bind_sampler_states()

and svga_set_sampler_views().  If there's no change, return early
and don't set a SVGA_NEW_x dirty state flag.
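
The early return can be sketched like this. The names are hypothetical
(struct ctx, NEW_SAMPLER, MAX_SAMPLERS and bind_sampler_states are not
the driver's definitions); the point is only that the incoming pointers
are compared with the bound ones before any dirty flag is touched.

```c
#include <string.h>

#define MAX_SAMPLERS 16
#define NEW_SAMPLER  0x2u   /* hypothetical dirty bit */

struct ctx {
   const void *samplers[MAX_SAMPLERS];
   unsigned dirty;
};

static void
bind_sampler_states(struct ctx *c, unsigned start, unsigned num,
                    const void *const *states)
{
   if (start + num > MAX_SAMPLERS)
      return;

   /* No-op check: nothing to bind, or the same states are already bound. */
   if (num == 0 ||
       memcmp(&c->samplers[start], states, num * sizeof(states[0])) == 0)
      return;

   memcpy(&c->samplers[start], states, num * sizeof(states[0]));
   c->dirty |= NEW_SAMPLER;
}
```

Skipping the dirty flag means later state validation has nothing to
re-emit for a bind that changed nothing.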

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
8 years agoi965: quieten compiler warning about out-of-bounds access
Ilia Mirkin [Tue, 5 Jan 2016 04:28:52 +0000 (23:28 -0500)]
i965: quieten compiler warning about out-of-bounds access

gcc 4.9.3 shows the following warning:

brw_vue_map.c:260:20: warning: array subscript is above array bounds
[-Warray-bounds]
    return brw_names[slot - VARYING_SLOT_MAX];

This is because BRW_VARYING_SLOT_COUNT is a valid value for the enum
type. Adding an assert will generate no additional code but will teach
the compiler to not complain.
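
The pattern looks roughly like this. It is a sketch with hypothetical
names (enum slot, EXTRA_COUNT and extra_slot_name are not the i965
definitions): the enum has a count value that is a legal value of the
type but not a legal index, and the assert states the range the
function actually accepts.

```c
#include <assert.h>

enum slot {
   SLOT_POS,
   SLOT_COLOR,
   SLOT_MAX,
   EXTRA_NDC = SLOT_MAX,
   EXTRA_PAD,
   EXTRA_COUNT   /* valid enum value, but one past the last name */
};

static const char *
extra_slot_name(enum slot slot)
{
   static const char *const names[] = { "NDC", "PAD" };

   /* Document (and check in debug builds) that slot indexes names[]. */
   assert(slot >= SLOT_MAX && slot < EXTRA_COUNT);
   return names[slot - SLOT_MAX];
}
```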

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
8 years agobuild: enable st/va with nouveau driver
Julien Isorce [Thu, 9 Apr 2015 12:45:17 +0000 (13:45 +0100)]
build: enable st/va with nouveau driver

vainfo fails in vaDriverInit because "dd_create_screen"
does not reach strcmp(driver_name, "nouveau") code.
Indeed when compiling the va target.c, the macro GALLIUM_NOUVEAU
is not defined.
This patch defines the macro the same way it is done for the dri and
vdpau targets.

Tested with:
./autogen.sh --enable-glx --enable-gles2 --enable-egl --enable-vdpau --enable-glx-tls=yes --enable-va
--with-gallium-drivers=swrast,nouveau --with-dri-drivers=swrast,nouveau --with-egl-platforms=x11

LIBVA_DRIVER_NAME=gallium vainfo
Output:
vainfo: Driver version: mesa gallium vaapi
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG4Simple            : VAEntrypointVLD
      VAProfileMPEG4AdvancedSimple    : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264Baseline           : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonvc0: add support for st/va
Julien Isorce [Wed, 23 Dec 2015 09:25:53 +0000 (09:25 +0000)]
nvc0: add support for st/va

- split nvc0_decoder_bsp in begin/next/end
- preserve content buffer when calling nvc0_decoder_bsp_next
- implement pipe_video_codec::begin_frame/end_frame

https://bugs.freedesktop.org/show_bug.cgi?id=89969

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agonouveau: split nouveau_vp3_bsp in begin/next/end
Julien Isorce [Wed, 23 Dec 2015 09:25:52 +0000 (09:25 +0000)]
nouveau: split nouveau_vp3_bsp in begin/next/end

This allows nouveau_vp3_bsp_next to be called multiple times
between a single begin/end pair.

It is required to support st/va.

https://bugs.freedesktop.org/show_bug.cgi?id=89969

Signed-off-by: Julien Isorce <j.isorce@samsung.com>
[imirkin: create strparm_bsp function, simplified w0 calculation]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>