review.tizen.org Git - platform/upstream/mesa.git/log

main: get rid of needless conditional

We already check if the driver changed the completeness, we don't
need to duplicate that check. Let's just early out there instead.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>

gallium/util: removed unused header-file

This hasn't been in use since c476305 ("gallium/util: pregenerate
half float tables"), where the last bit of run-time init using this
was killed. So let's just get rid of the pointless header.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>

nvc0: do not force re-binding of compute constbufs on Fermi

Re-binding compute constant buffers after launching a grid has no effect
because they are not currently validated and because dirty_cp is not updated
accordingly. This might also prevent weird behaviour in the future, once
UBOs are bound for compute.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

meta: Unconditionally set GL_SKIP_DECODE_EXT

The path that depends on this will be avoided (by fallback_required) if
the extension is not supported. _mesa_set_sampler_srgb_decode does not
generate GL errors (by design), so there are no problems there.

I kept this change separate and last because it is one of the few in the
series that is not a candidate for the stable branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta: Only bind the sampler in one place

All of the calls after the first _mesa_bind_sampler call are DSA style
calls that don't depend on the current binding.

I kept this change separate and last because it is one of the few in the
series that is not a candidate for the stable branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/decompress: Don't pollute the sampler object namespace

tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

- Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

- Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

- Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.
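
The scenario above can be replayed with a toy model. This is only a
sketch: toy_namespace, toy_gen and toy_store are hypothetical names made
up for illustration, not Mesa functions, and the "private" path merely
stands in for keeping the meta object out of the application's namespace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one GL object namespace (not Mesa code). */
enum { TOY_MAX_NAMES = 16 };

struct toy_namespace {
   int data[TOY_MAX_NAMES];    /* payload stored under each name */
   bool used[TOY_MAX_NAMES];   /* true once a name holds an object */
};

/* Models a Gen call: hands out the lowest unused name, starting at 1. */
static unsigned toy_gen(struct toy_namespace *ns)
{
   for (unsigned name = 1; name < TOY_MAX_NAMES; name++) {
      if (!ns->used[name]) {
         ns->used[name] = true;
         return name;
      }
   }
   return 0;
}

/* Models binding and filling an object; the name need not come from Gen. */
static void toy_store(struct toy_namespace *ns, unsigned name, int data)
{
   assert(name > 0 && name < TOY_MAX_NAMES);
   ns->used[name] = true;
   ns->data[name] = data;
}

/* Replays the three steps; returns true if the application's data survives. */
static bool toy_app_data_survives(bool meta_uses_gen)
{
   struct toy_namespace ns = {0};
   unsigned meta_name = 0;
   int meta_private = 0;   /* object kept outside the shared namespace */

   /* Step 1: the meta function creates its object. */
   if (meta_uses_gen) {
      meta_name = toy_gen(&ns);   /* returns 1 in an empty namespace */
      toy_store(&ns, meta_name, 111);
   } else {
      meta_private = 111;
   }

   /* Step 2: the application uses name 1 without calling Gen. */
   toy_store(&ns, 1, 42);

   /* Step 3: the meta function runs again and replaces its data. */
   if (meta_uses_gen)
      toy_store(&ns, meta_name, 222);
   else
      meta_private = 222;
   (void)meta_private;

   return ns.data[1] == 42;
}
```

With meta_uses_gen set, the application's value under name 1 is
overwritten in step 3; with the private object it is left alone.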

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/decompress: Save and restore the sampler using gl_sampler_object instead of GL API object handle

Some meta operations can be called recursively. Future changes (the
"Don't pollute the ... namespace" changes) will cause objects with
invalid names to be used. If a nested meta operation tries to restore
an object named 0xDEADBEEF, it will fail.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/decompress: Track sampler using gl_sampler_object instead of GL API object handle

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/decompress: Use internal functions for sampler object access

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/generate_mipmap: Don't pollute the sampler object namespace

tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

- Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

- Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

- Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/generate_mipmap: Save and restore the sampler using gl_sampler_object instead of GL API object handle

Some meta operations can be called recursively. Future changes (the
"Don't pollute the ... namespace" changes) will cause objects with
invalid names to be used. If a nested meta operation tries to restore
an object named 0xDEADBEEF, it will fail.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/generate_mipmap: Track sampler using gl_sampler_object instead of GL API object handle

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/generate_mipmap: Use internal functions for sampler object access

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/blit: Don't pollute the sampler object namespace in _mesa_meta_setup_sampler

tl;dr: For many types of GL object, we can *NEVER* use the Gen function.

In OpenGL ES (all versions!) and OpenGL compatibility profile,
applications don't have to call Gen functions.  The GL spec is very
clear about how you can mix-and-match generated names and non-generated
names: you can use any name you want for a particular object type until
you call the Gen function for that object type.

Here's the problem scenario:

- Application calls a meta function that generates a name.  The first
   Gen will probably return 1.

- Application decides to use the same name for an object of the same
   type without calling Gen.  Many demo programs use names 1, 2, 3,
   etc. without calling Gen.

- Application calls the meta function again, and the meta function
   replaces the data.  The application's data is lost, and the app
   fails.  Have fun debugging that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/blit: Save and restore the sampler using gl_sampler_object instead of GL API object handle

Some meta operations can be called recursively.  Future changes (the
"Don't pollute the ... namespace" changes) will cause objects with
invalid names to be used.  If a nested meta operation tries to restore
an object named 0xDEADBEEF, it will fail.

v2: Add a comment explaining why samp_obj_save is set to NULL in
_mesa_meta_fb_tex_blit_begin.  This came out of review feedback from
Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/blit: Use internal functions for sampler object access

This requires tracking the sampler object using the gl_sampler_object*
instead of the object name.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

meta/blit: Group the SamplerParameteri calls with the other sampler operations

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

mesa: Refactor _mesa_BindSampler to make _mesa_bind_sampler

Pulls the parts of _mesa_BindSampler that aren't just parameter
validation out into a function that can be called from other parts of
Mesa (e.g., meta).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

mesa: Add _mesa_set_sampler_srgb_decode method

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

mesa: Add _mesa_set_sampler_filters method

v2: Add filter enum assertions. Suggested by Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

mesa: Add _mesa_set_sampler_wrap method

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>

nvc0: remove useless goto in nvc0_launch_grid()

Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

mesa: Mark Identity as const

I was going to send this as review for dce1e1a8, but I missed that
window.  This saves 64 bytes of unshared data and replaces it with 96
bytes of shared text.  My guess is that some of the calls to memcpy get
optimized to something else.

   text    data     bss     dec     hex filename
7847613 220208   27432 8095253 7b8615 i965_dri.so before
7847709 220144   27432 8095285 7b8635 i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Brian Paul <brianp@vmware.com>

configure.ac: always define __STDC_CONSTANT_MACROS

The ISO C99 standard (7.18.4) specifies that C++
implementations should define UINT64_C only when
__STDC_CONSTANT_MACROS is defined.

Because we now use UINT64_C in our cpp files (since commit
208bfc493debe0344d0b9cb93975981f14412628), we need to add this define.

This also solves compilation errors with GCC 4.8.x on ppc64le machines.

v2: add this define to SCons build system

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

i965: Upload 3DSTATE_BINDING_TABLE_POINTERS_HS when !TCS on Gen9+.

Gen9+ requires us to emit 3DSTATE_BINDING_TABLE_POINTERS_HS for the
hull shader push constants to take effect. The passthrough TCS uses
push constants for the default tessellation levels. So, when those
change, we need to re-upload the binding table as well.

Fixes five Piglit tests on Skylake:
- spec/arb_tessellation_shader/vs-tes-vertex
- spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-quads
- spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-tris
- spec/arb_tessellation_shader/tes-read-texture
- spec/arb_tessellation_shader/tess_with_geometry

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Add missing platform information for KBL

In testing KBL, I found:

- urb size was not set for slices gt1.5, gt2, and gt3.  The value I
   used for these slices (384) was taken from an earlier patch authored
   by Ben Widawsky.

- slice count was missing.  This field was added by
   a403ad4f5a034e52a3cd845e91c4aa3e6927b731

With this commit, KBL passes piglit at parity with SKL.

Note: As requested by Kristian, Sarah modified this patch to drop
setting urb size for gt1.5, gt2, and gt3, since the correct default is
set in the GEN9 macro by commit c1e38ad37042b0ec261eb0ba5631b7ff0ee7a9da
"i965/skl: Use larger URB size where available."

Signed-off-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Cc: "11.1" <mesa-stable@lists.freedesktop.org>

nv50/ir: the whole point of data array is to hand out regular registers

Fixes: 0d3051f75a (nv50/ir: Fix scratch allocation size and file)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

mesa/uniform_query: add IROUNDD and use for doubles->ints (v2)

For the case where we convert a double to an int, we should
round the same as we do for floats.

This fixes GL41-CTS.gpu_shader_fp64.state_query

v2: add IROUNDD (Ilia)
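
A minimal sketch of what such a double-to-int rounding helper could look
like. The name iround_double is hypothetical, and rounding half away from
zero is an assumption about "the same as we do for floats"; the real
IROUNDD may differ in detail.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: round a double to the nearest int, with halfway
 * cases rounded away from zero (assumed to match the float path). */
static int iround_double(double d)
{
   /* Values outside the int range (and NaN) cannot be converted safely. */
   assert(d >= (double)INT_MIN && d <= (double)INT_MAX);
   return (int)((d >= 0.0) ? (d + 0.5) : (d - 0.5));
}
```

A plain (int) cast would truncate toward zero instead, so 2.5 would
become 2 rather than 3.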

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>

glsl: replace unreachable code path with assert

The lower_named_interface_blocks() pass is called before we try to
assign locations to varyings, so this shouldn't be reachable.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

Revert "glsl: replace unreachable code path with assert"

This reverts commit 98270fd20d4d58db8ae5af3b6f10ed6a81c058a6.

Something went terribly wrong: the commit is not what the commit
message says.

glsl: replace unreachable code path with assert

The lower_named_interface_blocks() pass is called before we try to
assign locations to varyings, so this shouldn't be reachable.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

glsl: combine if blocks

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

mesa: Update todo regarding StencilOp and StencilOpSeparate.

OpenGL 2.0 function StencilOp() is in part internally implemented via
StencilOpSeparate(). This change happened some time ago, however the
accompanying doxygen todo comment was not accordingly updated.

Replace the outdated portion of this doxygen todo comment, leaving the
remainder unchanged.

Also better respect the 80 character suggested line length in this file.

v2: Fully remove comment, following code review by t_arceri@yahoo.com.au

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>

glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable.

Currently, opt_vectorize() tries to combine:

    result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
    result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
    result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
    result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);

into a single ir_quadop_bitfield_insert opcode, which operates on
ivec4s.  However, GLSL IR's opcodes currently require the bits and
offset parameters to be scalar integers.  So, this breaks.
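
The mismatch can be illustrated with a C model of the scalar operation.
This is a sketch only: bitfield_insert and bitfield_insert_vec4 are
hypothetical helpers written for this note, not GLSL IR code.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of bitfieldInsert: replace `bits` bits of `base`, starting
 * at bit `offset`, with the low `bits` bits of `insert`. */
static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                unsigned offset, unsigned bits)
{
   assert(offset + bits <= 32);   /* results are undefined otherwise */
   if (bits == 0)
      return base;
   /* Shifting a 32-bit value by 32 is undefined, so handle it apart. */
   uint32_t mask = (bits == 32) ? UINT32_MAX
                                : (((UINT32_C(1) << bits) - 1u) << offset);
   return (base & ~mask) | ((insert << offset) & mask);
}

/* A four-wide form that takes ONE scalar offset and ONE scalar bits value
 * for all components.  It cannot express the four statements above, where
 * each component has its own src2 (offset) and src3 (bits). */
static void bitfield_insert_vec4(uint32_t result[4], const uint32_t base[4],
                                 const uint32_t insert[4],
                                 unsigned offset, unsigned bits)
{
   for (int i = 0; i < 4; i++)
      result[i] = bitfield_insert(base[i], insert[i], offset, bits);
}
```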

We want to be able to vectorize this eventually, but for now, just
chicken out and make opt_vectorize() bail by marking all the bitfield
insert/extract related opcodes as horizontal.  This is a relatively
uncommon case today, so we'll do the simple fix for stable branches,
and fix it properly on master.

Fixes assertion failures when compiling Shadow of Mordor vertex shaders
on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org

nv50/ir: Fix scratch allocation size and file

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

mesa: merge bind_atomic_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa: merge bind_shader_storage_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa: merge bind_uniform_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa: merge bind_xfb_buffers_{base|range}

Reduced code duplication should make the code more maintainable.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

glsl: Don't add nir files to libglsl_la_SOURCES

SCons doesn't understand nir yet and doesn't want to compile the glsl to
nir pass. Move the files to their own variable so we can add it only for
automake.

Tested-by: Brian Paul <brianp@vmware.com>

nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

glsl: Move _mesa_shader_stage_to_string/abbrev to shader_enums.c

These are used by code that doesn't necessarily link to libglsl.la. Move
them to shader_enums.[ch] where we keep similar helpers.

Reviewed-by: Matt Turner <mattst88@gmail.com>

i965: Move GLSL lowering passes out of libi965_compiler.la

The scope of libi965_compiler.la is to be able to take nir shaders and
generate i965 EU code. As such, we don't want the GLSL IR lowering
passes in the library. With this change, libi965_compiler.la no longer
needs to link to libglsl.la.

Reviewed-by: Matt Turner <mattst88@gmail.com>

glsl: Move glsl_to_nir files to LIBGLSL_FILES

libglsl_la_SOURCES includes both NIR_FILES and LIBGLSL_FILES, so for
libglsl.la consumers, this is a no-op. libnir.la however no longer uses
any GLSL IR infrastructure and can be used without also linking to
libglsl.la.

Acked-by: Matt Turner <mattst88@gmail.com>

mesa: Use separate indices for UBO & SSBO during binding

Previously we were treating the binding index for Uniform Buffer
Objects and Shader Storage Buffer Objects as being part of the
combined BufferInterfaceBlocks array.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93322
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

mesa: Map program UBOs and SSBOs to Interface Blocks

v2:
* Fill UboInterfaceBlockIndex and SsboInterfaceBlockIndex in
split_ubos_and_ssbos (Iago)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

mesa: docs: Add link to planet.freedesktop.org

The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or
any of the graphics project wikis (including the DRI wiki) on
freedesktop.org. Fix that by linking to it from the sidebar.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

freedreno: add ir3_compiler to gitignore

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

gallium: add a RESQ opcode to query info about a resource

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

tgsi: update atomic op docs

Specify that the operation only applies to the x component, not
per-component as previously specified. This is unnecessary for GL and
creates additional complications for images which need to support these
operations as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

tgsi: add an is_store property

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

tgsi: provide a way to encode memory qualifiers for SSBO

Each load/store on most hardware can specify what caching to do. Since
SSBO allows individual variables to also have separate caching modes,
allow loads/stores to have the qualifiers instead of attempting to
encode them in declarations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

ureg: add buffer support to ureg

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

tgsi: add ureg support for image decls

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

glsl: Ensure 64bits shift is used.

I believe that `1u << x`, where x >= 32 yields undefined results
according to the C standard.

Particularly MSVC says `warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)`.
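
A minimal sketch of the pattern, assuming the goal is a 64-bit mask with
a single bit set. The helper name bit64 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a 64-bit value with only bit `index` set. */
static uint64_t bit64(unsigned index)
{
   /* Shifting by the operand width or more is undefined, so bound it. */
   assert(index < 64);
   /* The left operand is 64 bits wide, so indices 32..63 are valid.
    * Writing `1u << index` would shift a 32-bit value instead. */
   return UINT64_C(1) << index;
}
```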

Reviewed-by: Brian Paul <brianp@vmware.com>

mesa/main: Avoid `void function returning a value` warning.

Trivial.

Reviewed-by: Brian Paul <brianp@vmware.com>

configure.ac: add --enable-profile

For profiling mesa's code, especially llvmpipe, PROFILE should be
defined. Currently, this define can only be generated if mesa is
built using scons.
This patch makes it possible to generate this define also when building
mesa through automake tools.

v2:

- Change --enable-llvmpipe-profile to --enable-profile
- Add -fno-omit-frame-pointer to CFLAGS and CXXFLAGS when enabling profile

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

nine: allow fragment shader POSITION and FACE to be system values

Reported-by: Axel Davy <axel.davy@ens.fr>

vl: allow fragment shader POSITION to be a system value

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

util/pstipple: allow fragment shader POSITION to be a system value

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

st/mesa: add support for POSITION and FACE system values

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

tgsi/scan: update for POSITION and FACE system values

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

gallium: add caps for POSITION and FACE system values

v2: document the integer behavior

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

program: add a helper for rewriting FP position input to sysval

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

glsl: optionally declare gl_FragCoord & gl_FrontFacing as system values

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

tgsi/ureg: handle redundant declarations in ureg_DECL_system_value

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

tgsi/ureg: remove index parameter from ureg_DECL_system_value

It can be trivially derived from the number of already declared system
values. This allows ureg users not to worry about which index to choose.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

st/mesa: remove dead code from mesa_to_tgsi

These aren't part of ARB_fragment_program.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

gallium/aux: Use TGSI chan name defines in place of literals

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

mesa: check that internalformat of CopyTexImage*D is not 1, 2, 3, 4

The piglit copyteximage check has recently been augmented to test this, but
apparently it hasn't been fixed in Mesa so far.

This language also already appears in the OpenGL 2.1 spec (Ian).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

i965/compiler: Enable more lowering in NIR

We don't need these for GLSL or ARB, but we need them for SPIR-V

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

nir/algebraic: Add more lowering

This commit adds lowering options for the following opcodes:

- nir_op_fmod
- nir_op_bitfield_insert
- nir_op_uadd_carry
- nir_op_usub_borrow

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>

nir/opcodes: Fix up uadd_carry and usub_borrow

Both were defined as returning bool, but the gpu_shader5 functions are
defined to return int. Also, we had the parameters for usub_borrow
backwards in the folding expression.
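
For reference, a C sketch of the intended semantics, following the GLSL
uaddCarry/usubBorrow definitions. These are standalone illustrations, not
the NIR constant-folding expressions themselves.

```c
#include <stdint.h>

/* Carry out of a + b: an integer 1 if the 32-bit sum wrapped, else 0. */
static uint32_t uadd_carry(uint32_t a, uint32_t b)
{
   return ((uint32_t)(a + b) < a) ? 1u : 0u;
}

/* Borrow of a - b: an integer 1 if a < b, else 0.  The operand order
 * matters; swapping a and b gives the wrong answer. */
static uint32_t usub_borrow(uint32_t a, uint32_t b)
{
   return (a < b) ? 1u : 0u;
}
```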

Reviewed-by: Matt Turner <mattst88@gmail.com>

nvc0: add ARB_indirect_parameters support

I chose to make separate macros for this due to the additional
complexity and extra scratch usage.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

st/mesa: expose ARB_indirect_parameters when the backend driver allows

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

mesa: add support for ARB_indirect_parameters draw functions

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

mesa: add parameter buffer, used for ARB_indirect_parameters

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

glapi: add ARB_indirect_parameters definitions

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

nvc0: add support for real ARB_multi_draw_indirect

The draw groups are now split up into groups of 32 if there's a
non-packed stride, or in groups of 400-500 if the draw data is packed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

nvc0: adjust indirect draw macros to handle multiple draws at once

These are still invoked one at a time, but the underlying macro can
handle multiple draws.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>

st/mesa: add support for new mesa indirect draw interface

This shifts all indirect draws to go through the new function. If the
driver doesn't have support for multi draws, we break those up and
perform N draws. Otherwise, we pass everything through for just a single
draw call.
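
The dispatch described above might be sketched as follows. Every name
here (toy_indirect, toy_draw_fn, toy_draw_indirect, toy_count_draw) is
hypothetical and chosen for illustration; the real state tracker and
gallium interfaces are different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of an indirect (multi) draw. */
struct toy_indirect {
   size_t offset;         /* byte offset of the first draw record */
   size_t stride;         /* byte distance between draw records */
   unsigned draw_count;   /* number of draw records */
};

typedef void (*toy_draw_fn)(const struct toy_indirect *info, void *user);

/* Pass the whole multi draw through if the driver supports it; otherwise
 * split it into draw_count single draws.  Returns the number of driver
 * calls that were issued. */
static unsigned toy_draw_indirect(bool has_multi_draw,
                                  const struct toy_indirect *info,
                                  toy_draw_fn draw, void *user)
{
   if (has_multi_draw) {
      draw(info, user);
      return 1;
   }
   for (unsigned i = 0; i < info->draw_count; i++) {
      struct toy_indirect single = {
         info->offset + (size_t)i * info->stride, 0, 1
      };
      draw(&single, user);
   }
   return info->draw_count;
}

/* Example callback: counts how many times the driver was called. */
static void toy_count_draw(const struct toy_indirect *info, void *user)
{
   (void)info;
   (*(int *)user)++;
}
```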

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: add caps to expose support for multi indirect draws

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: add sufficient draw interface to allow new indirect features

This makes it possible to support indirect multidraws as well as having
the number of such draws come from a separate GPU resource.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

vbo: create a new draw function interface for indirect draws

All indirect draws are passed to the new draw function. By default
there's a fallback implementation which pipes it right back to
draw_prims, but eventually both the fallback and draw_prim's support for
indirect drawing should be removed.

This should allow a backend to properly support ARB_multi_draw_indirect
and ARB_indirect_parameters.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

llvmpipe: do 64bit plane calculations in the sse path

The sse path was pretty much disabled for practical purposes because the
largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations.
This is actually not that difficult, though a problem is that we can't do
a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall,
the code still looks reasonable, though it's not like changes there in
setup really make much of a difference in the end...
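
One standard way to build a signed 32x32->64bit product from an unsigned
multiply, shown in scalar C. This is a sketch of the general technique
under the assumption that only an unsigned widening multiply is
available; it is not the llvmpipe SSE code, and mul_s32_via_u32 is a
made-up name.

```c
#include <stdint.h>

/* Signed 32x32->64 multiply using only an unsigned widening multiply.
 * Reinterpreting a negative a as unsigned adds 2^32 to it, which adds
 * 2^32 * b to the product; subtracting that term back (modulo 2^64)
 * recovers the signed result.  The same holds with a and b swapped. */
static int64_t mul_s32_via_u32(int32_t a, int32_t b)
{
   uint32_t ua = (uint32_t)a;
   uint32_t ub = (uint32_t)b;
   uint64_t p = (uint64_t)ua * ub;   /* unsigned 32x32->64 */

   if (a < 0)
      p -= (uint64_t)ub << 32;
   if (b < 0)
      p -= (uint64_t)ua << 32;

   /* Assumes the usual two's-complement conversion back to signed. */
   return (int64_t)p;
}
```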

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

llvmpipe: don't store eo as 64bit int

eo, just like dcdx and dcdy, cannot overflow 32bit.
Store it as unsigned though just in case (it cannot be negative, but
in theory twice as big as dcdx or dcdy so this gives it one more bit).
This doesn't really change anything, albeit it might help minimally on
32bit archs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

llvmpipe: use aligned data for the assembly program in setup

Back in the day (before 24678700edaf5bb9da9be93a1367f1a24cfaa471) the values
were not actually in a struct but even then I can't see why we didn't simply
align the values. Especially since it's trivial to do so.
(Not that it actually matters since the code is pretty much unused for now.)

Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

draw: initialize prim header flags when clipping lines

Otherwise, clipped lines would have an undefined stipple reset bit if
line stippling is enabled.
(Untested, and I just assume copying over the bits from the original line
is actually the right thing to do.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

draw: fix line stippling with unfilled prims

The unfilled stage was not filling in the prim header, and the line stage
then decided to reset the stipple counter or not based on the uninitialized
data. This causes some failures in conform linestipple test (albeit quite
randomly happening depending on environment).
So fill in the prim header in the unfilled stage - I am not entirely sure
if anybody really needs determinant after that stage, but there's at least
later stages (wide line for instance) which copy over the determinant as well.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

glsl: replace null check with assert

This was added in 54f583a20; since then, error handling has improved.

The test this was added to fix now fails earlier, since 01822706ec.

Reviewed-by: Matt Turner <mattst88@gmail.com>

i965: use _mesa_delete_buffer_object

This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

i915: use _mesa_delete_buffer_object

This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

radeon: use _mesa_delete_buffer_object

This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

st/mesa: use _mesa_delete_buffer_object

This is more future-proof than the current code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>

mesa/bufferobj: make _mesa_delete_buffer_object externally accessible

gl_buffer_object has grown more complicated and requires cleanup. Using this
function from drivers will be more future-proof.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

llvmpipe: use sse2 conv code for altivec

In lp_build_conv() and lp_build_conv_auto(), there is a special case of
conversion when sse2 is present. That code path is suitable without any
changes to altivec, because all the functions that are called in that
code path already support altivec.

This patch increases the FPS on the POWER architecture across the board
by 10%-25%.

I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

radeonsi: adjust the parameters of si_shader_dump

The function will be extended to dump all binaries that shaders will
consist of, so si_shader* makes sense here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>