review.tizen.org Git - platform/upstream/mesa.git/log

Nayan Deshmukh [Wed, 22 Feb 2017 08:25:02 +0000 (13:55 +0530)]

vl: u_upload_alloc might fail to allocate buffer in bicubic filter

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>

Marek Olšák [Mon, 20 Feb 2017 18:34:02 +0000 (19:34 +0100)]

gallium: reorder fields in pipe_draw_info

sizeof(struct pipe_draw_info) = 104 -> 88

Also, vertices_per_patch is switched to ubyte, because it can't be more
than 32.

Seemed-reasonable-to: Roland Scheidegger
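
Note on the size reduction: ordering members from largest to smallest alignment removes compiler-inserted padding. The sketch below only illustrates that effect. It is not the real pipe_draw_info layout; the struct names and every field other than vertices_per_patch are hypothetical, and the exact byte counts depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: small fields sit between 8-byte fields,
 * so the compiler has to pad after each of them. */
struct draw_info_before {
    uint8_t  indexed;
    uint64_t indirect_offset;
    uint8_t  vertices_per_patch;
    uint32_t count;
};

/* Same fields, sorted from largest to smallest alignment.
 * vertices_per_patch fits in one byte because it never exceeds 32. */
struct draw_info_after {
    uint64_t indirect_offset;
    uint32_t count;
    uint8_t  indexed;
    uint8_t  vertices_per_patch;
};

/* Number of bytes the reordering saves on the current ABI. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct draw_info_after) <= sizeof(struct draw_info_before));
    return sizeof(struct draw_info_before) - sizeof(struct draw_info_after);
}
```

Calling bytes_saved() on a common 64-bit ABI should report a non-zero saving; the same idea, applied to more fields, is what a 104 -> 88 change reflects.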

Marek Olšák [Sun, 19 Feb 2017 18:29:06 +0000 (19:29 +0100)]

gallium/hud: handle a thread switch for API-thread-busy monitoring

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Sun, 19 Feb 2017 18:28:14 +0000 (19:28 +0100)]

gallium/hud: prevent an infinite loop

v2: use UINT64_MAX / 11

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
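
A loop that scales a power of ten upward can spin forever once the unsigned value wraps around. The sketch below assumes that this is the failure mode the UINT64_MAX / 11 bound guards against. bounded_pow10 is a hypothetical helper written for illustration, not the HUD code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: grow a power of ten while exp10 * 9 < value.
 * The first condition stops the loop before exp10 * 9 (or the next
 * exp10 * 10) could overflow a uint64_t. */
static uint64_t bounded_pow10(uint64_t value)
{
    uint64_t exp10 = 1;

    assert(value > 0);
    while (exp10 <= UINT64_MAX / 11 && exp10 * 9 < value)
        exp10 *= 10;

    return exp10;
}
```

Without the first condition, a value close to UINT64_MAX makes exp10 * 9 wrap, so the comparison can keep succeeding on garbage values.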

Marek Olšák [Mon, 20 Feb 2017 17:42:41 +0000 (18:42 +0100)]

gallium/u_queue: isolate util_queue_fence implementation

It's cleaner this way.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Mon, 20 Feb 2017 14:27:07 +0000 (15:27 +0100)]

gallium/u_queue: fix random crashes when the app calls exit()

This fixes:
    vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
    llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
    "Pass ID not registered!"' failed.

v2: use list_head, switch the call order in destroy

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Robert Bragg [Fri, 13 Mar 2015 22:10:47 +0000 (22:10 +0000)]

i965: Implement INTEL_performance_query backend

This adds a bare-bones backend for the INTEL_performance_query extension
that exposes pipeline statistics.

Although this could be considered redundant given that the same
statistics are already available via query objects, they are a simple
starting point for this extension and it's expected to be convenient for
tools wanting to have a single go-to API to introspect what performance
counters are available, along with names, descriptions and semantic/data
types.

This code is derived from Kenneth Graunke's work, temporarily removed
while the frontend and backend interface were reworked.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Robert Bragg [Wed, 29 Apr 2015 07:41:34 +0000 (08:41 +0100)]

mesa: Model INTEL perf query backend after query obj BE

Instead of using the same backend interface as AMD_performance_monitor
this defines a dedicated INTEL_performance_query interface that is
modelled more on the ARB_query_buffer_object interface (considering the
similarity of the extensions) with the addition of vfuncs for
initializing and enumerating query and counter info.

Compared to the previous backend, some notable differences are:

- The backend is free to represent counters using whatever data
  structures are optimal/convenient since queries and counters are
  enumerated via an iterator API instead of declaring them using
  structures directly shared with the frontend.

  This is also done to help us support the full range of data and
  semantic types available with INTEL_performance_query which is awkward
  while using a structure shared with the AMD_performance_monitor
  backend since neither extension's types are a subset of the other.

- The backend must support waiting for a query instead of the frontend
  simply using glFinish().

- Objects go through 'Active' and 'Ready' states consistent with the
  query object backend (hopefully making them more familiar). There is
  no 'Ended' state (which used to show that a query has ended at least
  once for a given object). There is a new 'Used' state, set when a
  query is first begun which implies that we are expecting to get
  results back for the object at some point. There's no equivalent to
  the 'EverBound' state since the spec doesn't require there to be a
  limbo state between generating IDs and associating them with an object
  on query Begin.

The INTEL_performance_query and AMD_performance_monitor extensions are
now completely orthogonal within Mesa main (though a driver could
optionally choose to implement both extensions within a unified backend
if that were convenient for the sake of sharing state/code).

v2: (Samuel Pitoiset)
- init PerfQuery.NumQueries in frontend
- s/return_string/output_clipped_string/
- s/backed/backend/ typo
- remove redundant *bytesWritten = 0
v3:
- Add InitPerfQueryInfo for lazy probing of available queries
v4:
- Clean up some internal usage of GL typedefs (Ken)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
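
The object states described above ('Used', 'Active', 'Ready') can be pictured as three flags that change on begin, end and wait. This is a minimal sketch under that reading. The struct and function names are hypothetical and do not reproduce the Mesa frontend or backend interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-object state, named after the states in the message. */
struct perf_query_state {
    bool used;   /* a query has been begun at least once */
    bool active; /* currently between begin and end */
    bool ready;  /* results are available to read */
};

static void query_begin(struct perf_query_state *q)
{
    assert(!q->active);
    q->used = true;   /* results are now expected at some point */
    q->active = true;
    q->ready = false;
}

static void query_end(struct perf_query_state *q)
{
    assert(q->active);
    q->active = false;
}

/* Stands in for the backend's wait hook; a real backend would block
 * here until the counters have landed instead of calling glFinish(). */
static void query_wait(struct perf_query_state *q)
{
    assert(q->used && !q->active);
    q->ready = true;
}
```

A zero-initialized object has none of the flags set, which matches the absence of a separate limbo state between ID generation and the first begin.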

Robert Bragg [Wed, 29 Apr 2015 07:17:20 +0000 (08:17 +0100)]

mesa: Separate INTEL_performance_query frontend

To allow the backend interfaces for AMD_performance_monitor and
INTEL_performance_query to evolve independently based on the more
specific requirements of each extension this starts by separating
the frontends of these extensions.

Even though there wasn't much tying these frontends together, this
separation intentionally copies the few helpers/utilities that were
shared between the two extensions, avoiding any re-factoring specific to
INTEL_performance_query so that the evolution will be easier to follow
later.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Thomas Hellstrom [Fri, 23 Sep 2016 11:23:05 +0000 (13:23 +0200)]

gallium/vl: Simplify the matrix filter fragment shader

It looks like it was partly copied from the median filter fragment shader
and unnecessarily saved a lot of temporary values.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Thomas Hellstrom [Fri, 1 Apr 2016 05:31:18 +0000 (07:31 +0200)]

st/vdpau: Fix multithreading

The vdpau state tracker allows multiple threads access to the same gallium
context simultaneously. We can fix this either by locking the same mutex
each time the context is used or by using a different gallium context for
each mutex domain. Here we do the latter, although I'm not sure that's really
the best option.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Thomas Hellstrom [Fri, 4 Mar 2016 11:11:23 +0000 (12:11 +0100)]

gallium/vl: Parameter substitution in the csc matrix computation

Makes the code significantly more readable.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Thomas Hellstrom [Fri, 4 Mar 2016 10:49:31 +0000 (11:49 +0100)]

gallium/vl: Simplify usage of full range matrices

When looking at the full range matrices, it becomes obvious that the difference
between the standard matrices and the full range matrices is that the full
range matrices are multiplied by 1.164. Together with offsetting the y value
with -16/255, this will scale and offset RGB with the desired quantities.

However, the standard SMPTE 240M matrix seems to differ a bit since the
U and V coefficients are only multiplied with 1.138 to get the full range
matrix. This would actually alter the color somewhat so I figure that's an
error. The full range matrix is consistent with Nvidia's VDPAU implementation.

We can also incorporate the ybias in the brightness, simplifying the
calculation somewhat.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
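
The relation described above (full range matrix = standard matrix scaled by 1.164, with the y value offset by -16/255) can be sketched as follows. The coefficients in example_std are illustrative BT.601-style values and are an assumption, not a copy of the matrices in the Mesa source. The function names are hypothetical.

```c
#include <assert.h>

#define FULL_RANGE_SCALE 1.164f            /* factor named in the message */
#define Y_OFFSET         (-16.0f / 255.0f) /* y offset named in the message */

/* Illustrative 3x3 standard matrix, row-major (assumed values). */
static const float example_std[9] = {
    1.0f,  0.0f,    1.371f,
    1.0f, -0.336f, -0.698f,
    1.0f,  1.732f,  0.0f,
};

/* Derive the full range matrix by scaling every coefficient. */
static void make_full_range(const float std[9], float full[9])
{
    for (int i = 0; i < 9; i++)
        full[i] = std[i] * FULL_RANGE_SCALE;
}

/* Apply one matrix row to a normalized Y'CbCr sample. */
static float apply_row(const float row[3], float y, float cb, float cr)
{
    assert(y >= 0.0f && y <= 1.0f);
    return row[0] * (y + Y_OFFSET)
         + row[1] * (cb - 0.5f)
         + row[2] * (cr - 0.5f);
}
```

With neutral chroma, a y of 16/255 maps to zero and a y of 235/255 maps to a value close to one, which is the scale and offset the message refers to.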

Thomas Hellstrom [Tue, 8 Mar 2016 06:30:52 +0000 (07:30 +0100)]

gallium/vl: Fix brightness matrix description

The brightness matrix doesn't actually match the procamp matrix and
what's calculated in vl_csc_get_matrix.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Thomas Hellstrom [Mon, 29 Feb 2016 12:33:39 +0000 (13:33 +0100)]

gallium/vl: Don't map vertex buffers on creation

It will cause multiple simultaneous maps of the same vertex buffer and
flushed-while-mapped warnings.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Thomas Hellstrom [Fri, 23 Sep 2016 10:57:54 +0000 (12:57 +0200)]

gallium/vl: Add sampler views to video filter fragment shaders

Needed for at least the svga driver.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Thomas Hellstrom [Thu, 18 Aug 2016 11:41:42 +0000 (13:41 +0200)]

gallium/vl: declare sampler views in compositor shaders

The svga driver relies on the existence of these sampler views.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

Brian Paul [Tue, 21 Feb 2017 22:52:40 +0000 (15:52 -0700)]

util: fix MSVC build issue in disk_cache.h

Windows doesn't have dlfcn.h. Protect the code in question
with an #if ENABLE_SHADER_CACHE test. Also fix indentation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>

Dave Airlie [Wed, 22 Feb 2017 02:20:18 +0000 (02:20 +0000)]

radv: fix typo in the subpass barrier patch.

Fixes: dbb0eaccc radv: handle subpass cache flushes

Signed-off-by: Dave Airlie <airlied@redhat.com>

Rafael Antognolli [Fri, 20 Jan 2017 17:53:27 +0000 (09:53 -0800)]

i965/gen6+: Enable ARB_transform_feedback_overflow_query.

This extension adds new query types which can be used to detect overflow
of transform feedback buffers. The new query types are also accepted by
conditional rendering commands.

v3:
- s/gen7+/gen6+/ in the relnotes (Jordan Justen)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Rafael Antognolli [Fri, 20 Jan 2017 17:53:26 +0000 (09:53 -0800)]

i965: Add support for xfb overflow query on conditional render.

Enable the use of a transform feedback overflow query with
glBeginConditionalRender. The render commands will only execute if the
query is true (i.e. if there was an overflow).

Use ARB_conditional_render_inverted to change this behavior.

v4:
    - reuse MI_MATH calcs from hsw_queryob (Kenneth)
    - fall back to software conditional rendering when MI_MATH is not
      available (Kenneth)

v5:
    - check query->Target (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Rafael Antognolli [Fri, 20 Jan 2017 17:53:25 +0000 (09:53 -0800)]

i965: Add support for xfb overflow on query buffer objects.

Enable getting the results of a transform feedback overflow query with a
buffer object.

v4:
    - make hsw_overflow_result_to_gpr0 a public function, so it can be used
      by conditional render. (Kenneth)
    - fix typo grp0/gpr0 (Kenneth)
    - rename load_gen_written_data_to_regs to
      load_overflow_data_to_cs_gprs (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Rafael Antognolli [Fri, 20 Jan 2017 17:53:24 +0000 (09:53 -0800)]

i965: add plumbing for ARB_transform_feedback_overflow_query.

When querying for transform feedback overflow on one or all of the
streams, store information about number of generated and written
primitives. Then check whether generated == written.

v2:
- use only SO_PRIM_STORAGE_NEEDED, do not fall back to
CL_INVOCATION_COUNT. (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
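
The check described above (store generated and written primitive counts, then compare them) can be sketched like this. The snapshot struct and function names are hypothetical, and counter wrap-around is ignored for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-stream snapshot taken at query begin and end. */
struct xfb_counters {
    uint64_t generated; /* primitives the pipeline tried to emit */
    uint64_t written;   /* primitives that fit in the buffer */
};

/* A single stream overflowed if fewer primitives were written than
 * generated between the two snapshots. */
static bool xfb_overflowed(struct xfb_counters begin, struct xfb_counters end)
{
    assert(end.generated >= begin.generated);
    assert(end.written >= begin.written);

    uint64_t generated = end.generated - begin.generated;
    uint64_t written = end.written - begin.written;
    return generated != written;
}

/* "Any stream" variant: true if at least one stream overflowed. */
static bool xfb_any_overflowed(const struct xfb_counters *begin,
                               const struct xfb_counters *end,
                               unsigned num_streams)
{
    for (unsigned i = 0; i < num_streams; i++) {
        if (xfb_overflowed(begin[i], end[i]))
            return true;
    }
    return false;
}
```

The per-stream and any-stream functions correspond to querying one stream or all of the streams.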

Rafael Antognolli [Fri, 20 Jan 2017 17:53:23 +0000 (09:53 -0800)]

mesa: Track transform feedback overflow query objects.

Also update checks on conditional rendering.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Rafael Antognolli [Fri, 20 Jan 2017 17:53:22 +0000 (09:53 -0800)]

mesa: Add types for ARB_transform_feedback_overflow_query.

Add some basic types and storage for the queries of this extension.

v2:
- update date of extension (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Engestrom [Tue, 21 Feb 2017 14:15:39 +0000 (14:15 +0000)]

gallium/docs: use imgmath instead of pngmath

WARNING: sphinx.ext.pngmath has been deprecated. Please use
sphinx.ext.imgmath instead.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Eric Engestrom [Tue, 21 Feb 2017 14:15:38 +0000 (14:15 +0000)]

gallium/docs: fix section title formatting

src/gallium/docs/source/tgsi.rst:3488: WARNING: Title underline too short.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Eric Engestrom [Tue, 21 Feb 2017 14:15:37 +0000 (14:15 +0000)]

gallium/docs: add missing newlines

Without these, mathjax considers these as the continuation of the
previous line.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Eric Engestrom [Tue, 21 Feb 2017 14:15:36 +0000 (14:15 +0000)]

gallium/docs: add missing math formatting

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Eric Engestrom [Tue, 21 Feb 2017 14:15:35 +0000 (14:15 +0000)]

gallium/docs: fix sublist formatting

src/gallium/docs/source/context.rst:95: ERROR: Unexpected indentation.

Sub lists need to be surrounded by a blank line.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>

Timothy Arceri [Tue, 21 Feb 2017 05:34:49 +0000 (16:34 +1100)]

util/disk_cache: create timestamp and gpu_id dirs when MESA_GLSL_CACHE_DIR is used

The make check test is also updated to make sure these dirs are created.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Timothy Arceri [Fri, 10 Feb 2017 02:02:22 +0000 (13:02 +1100)]

util/radv: move *_get_function_timestamp() to utils

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Kenneth Graunke [Fri, 17 Feb 2017 18:18:35 +0000 (10:18 -0800)]

docs: Update features.txt and relnotes for GL_ARB_transform_feedback2

Kenneth Graunke [Fri, 17 Feb 2017 04:05:39 +0000 (20:05 -0800)]

i965: Enable ARB_transform_feedback2 on Sandybridge.

The only feature over and above ES 3.0 is DrawTransformFeedback().

We already have to do the whole SO_NUM_PRIMS_WRITTEN counter dance in
order to compute the SVBI value for ResumeTransformFeedback(), at which
point our existing GetTransformFeedbackVertexCount() implementation will
do the trick (though with a stall to CPU map the buffer).

Someday, we could probably implement DrawTransformFeedback() more
efficiently, using the "Load Internal Vertex Count" feature of
3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit.

Rumor has it this allows people to use WebGL 2.0 on Sandybridge.

Note that we don't need pipelined register writes like Gen7+ because
we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Kenneth Graunke [Fri, 17 Feb 2017 05:24:20 +0000 (21:24 -0800)]

i965: Properly reset SVBI counters on ResumeTransformFeedback().

This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused
GLES 3.0 test.  When resuming the transform feedback object, we need to
reset the SVBI counters so we continue writing at the correct point in
the buffer.

Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the
Streamed Vertex Buffer Index (SVBI) counters, which contain a count of
vertices emitted.

Unfortunately, there's no straightforward way to store the current SVBI
counter values to a buffer.  They're not available in a register.  You
can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another
internal counter which 3DPRIMITIVE can use...but there's no good way to
extract that either.

So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex
numbers.  Thankfully, we can reuse most of the existing Gen7+ code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
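
Deriving a vertex count from a primitives-written counter comes down to one multiplication. The sketch below shows that arithmetic under the assumption that every primitive in the interval has the same type; the enum and function names are hypothetical, not the i965 helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical primitive modes; each value is its vertex count. */
enum prim_mode {
    PRIM_POINTS = 1,
    PRIM_LINES = 2,
    PRIM_TRIANGLES = 3,
};

/* Vertices emitted between two snapshots of a primitives-written
 * counter, which is where writing should continue after a resume. */
static uint64_t vertices_written(uint64_t prims_begin, uint64_t prims_end,
                                 enum prim_mode mode)
{
    assert(prims_end >= prims_begin);
    return (prims_end - prims_begin) * (uint64_t)mode;
}
```

For example, five triangles written between the two snapshots correspond to fifteen vertices.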

Kenneth Graunke [Fri, 17 Feb 2017 05:31:57 +0000 (21:31 -0800)]

i965: Save max_index in brw_transform_feedback_object.

I'm going to need this in a new Resume hook shortly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Kenneth Graunke [Fri, 17 Feb 2017 05:17:44 +0000 (21:17 -0800)]

i965: Update brw_save_primitives_written_counters for pre-Gen7.

Sandybridge and earlier only have a single counter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Kenneth Graunke [Fri, 17 Feb 2017 05:15:07 +0000 (21:15 -0800)]

i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS.

This way on Sandybridge we'll only do 1 stream worth of math, since
we only have one SO_NUM_PRIMS_WRITTEN counter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Kenneth Graunke [Fri, 17 Feb 2017 05:06:34 +0000 (21:06 -0800)]

i965: Move some code from gen7_sol_state.c to gen6_sol.c.

I plan to use these functions on Sandybridge soon. I changed the prefix
on a couple of functions to "brw" instead of "gen7" as in theory they
should be usable all the way back to G45.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Kenneth Graunke [Fri, 17 Feb 2017 05:22:51 +0000 (21:22 -0800)]

i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.

These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG
are supported, which Gen8+ can always do. So this code is dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Marek Olšák [Mon, 20 Feb 2017 18:24:18 +0000 (19:24 +0100)]

vbo: kill primitive restart lowering in glDrawArrays

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Marek Olšák [Sat, 18 Feb 2017 16:08:34 +0000 (17:08 +0100)]

radeonsi: fix issues with monolithic shaders

R600_DEBUG=mono has had no effect since:

    commit 1fabb297177069e95ec1bb7053acb32f8ec3e092
    Author: Marek Olšák <marek.olsak@amd.com>
    Date:   Tue Feb 14 22:08:32 2017 +0100

    radeonsi: have separate LS and ES main shader parts in the shader selector

Also, this assertion was failing:
    si_state_shaders.c:1307: si_shader_select_with_key: Assertion
    `!shader->is_optimized' failed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Thu, 19 Jan 2017 13:36:17 +0000 (14:36 +0100)]

radeonsi: set no-signed-zeros-fp-math

Recommended by Matt Arsenault.

46757 shaders in 28742 tests
Totals:
SGPRS: 2068851 -> 2066907 (-0.09 %)
VGPRS: 1604056 -> 1602676 (-0.09 %)
Spilled SGPRs: 1402 -> 1382 (-1.43 %)
Spilled VGPRs: 113 -> 113 (0.00 %)
Private memory VGPRs: 1332 -> 1332 (0.00 %)
Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread
Code Size: 58815520 -> 58716788 (-0.17 %) bytes
LDS: 1162 -> 1162 (0.00 %) blocks
Max Waves: 354616 -> 354905 (0.08 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 786452 -> 784508 (-0.25 %)
VGPRS: 530000 -> 528620 (-0.26 %)
Spilled SGPRs: 958 -> 938 (-2.09 %)
Spilled VGPRs: 85 -> 85 (0.00 %)
Private memory VGPRs: 636 -> 636 (0.00 %)
Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread
Code Size: 26349936 -> 26251204 (-0.37 %) bytes
LDS: 304 -> 304 (0.00 %) blocks
Max Waves: 108962 -> 109251 (0.27 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Sun, 29 Jan 2017 21:45:36 +0000 (22:45 +0100)]

gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2)

v2: define lp_float_mode

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Sat, 18 Feb 2017 15:55:50 +0000 (16:55 +0100)]

radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them

We were unconditionally storing these outputs, sometimes even one component
at a time, but apps never read them in TES.

Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can
easily disable them on demand.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Sat, 18 Feb 2017 14:30:25 +0000 (15:30 +0100)]

radeonsi: skip LDS stores in TCS if there are no LDS output reads

This removes a lot of useless LDS stores.

A few games read TESSINNER/OUTER, but not any other outputs. Most games
don't read any outputs.

The only app doing LDS output reads is UE4 Lightsroom Interior.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Marek Olšák [Sat, 18 Feb 2017 14:26:42 +0000 (15:26 +0100)]

tgsi/scan: add basic info about tessellation OUT and IN uses

Not all of them will be used immediately.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Jason Ekstrand [Mon, 20 Feb 2017 19:04:12 +0000 (11:04 -0800)]

anv: Take a device parameter in anv_state_flush

This allows the helper to check for llc instead of having to do it
manually at all the call sites.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Mon, 20 Feb 2017 18:37:34 +0000 (10:37 -0800)]

anv: Pull all clflushing into a clflush_range helper

All this cache line address calculation stuff is tricky. Let's not
duplicate it more places than we have to.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
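
The tricky part mentioned above is rounding the start address down to a cache line boundary and then visiting every line that overlaps the range. The sketch below shows only that address arithmetic; a 64-byte line size is assumed, the function name is hypothetical, and the actual clflush instruction is left as a comment so the example stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE 64u                    /* assumed line size */
#define CACHELINE_MASK (CACHELINE_SIZE - 1u)

/* Visit every cache line overlapping [start, start + size) and return
 * how many lines were visited. */
static size_t for_each_cacheline(uintptr_t start, size_t size)
{
    assert(size > 0);

    uintptr_t p = start & ~(uintptr_t)CACHELINE_MASK; /* round down */
    uintptr_t end = start + size;
    size_t lines = 0;

    while (p < end) {
        /* A real helper would flush the line at address p here. */
        lines++;
        p += CACHELINE_SIZE;
    }
    return lines;
}
```

An 8-byte range starting at offset 60 straddles two lines, which is the case that is easy to get wrong when the start address is not rounded down.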

Jason Ekstrand [Mon, 20 Feb 2017 18:30:20 +0000 (10:30 -0800)]

anv: Remove the unused state_pool_emit macro

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Mon, 20 Feb 2017 18:28:27 +0000 (10:28 -0800)]

anv: Rename clflush_range and state_clflush

It's a bit shorter and easier to work with. Also, we're about to add a
helper called clflush which does the clflush but without any memory
fencing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Mon, 20 Feb 2017 19:03:04 +0000 (11:03 -0800)]

intel/blorp: Explicitly flush all allocated state

Found by inspection. However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>

Jason Ekstrand [Sat, 18 Feb 2017 23:58:40 +0000 (15:58 -0800)]

anv: Put everything about queries in genX_query.c

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Sat, 18 Feb 2017 23:51:11 +0000 (15:51 -0800)]

anv/Makefile: alphabetize

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Sat, 18 Feb 2017 23:21:04 +0000 (15:21 -0800)]

anv/query: Perform CmdResetQueryPool on the GPU

This fixes some rendering corruption in The Talos Principle.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>

Jason Ekstrand [Sat, 18 Feb 2017 23:18:31 +0000 (15:18 -0800)]

genxml: Make MI_STORE_DATA_IMM more consistent

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>

Jason Ekstrand [Sat, 18 Feb 2017 21:25:04 +0000 (13:25 -0800)]

anv/query: clflush the bo map on non-LLC platforms

Found by inspection

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>

Jason Ekstrand [Mon, 20 Feb 2017 18:18:57 +0000 (10:18 -0800)]

anv: Add an invalidate_range helper

This is similar to clflush_range except that it puts the mfence on the
other side to ensure caches are flushed prior to reading.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
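
The difference between flushing and invalidating is only the side on which the fence sits. The sketch below records the order of operations with stub functions instead of issuing real mfence and clflush instructions. The ordering shown for the flush helper (fence first) is an assumption inferred from "the other side"; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static char op_log[16];
static size_t op_len;

/* Append one operation marker to the log. */
static void record(char op)
{
    assert(op_len + 1 < sizeof(op_log));
    op_log[op_len++] = op;
    op_log[op_len] = '\0';
}

static void fence_stub(void)   { record('F'); } /* stands in for mfence */
static void clflush_stub(void) { record('C'); } /* stands in for the clflush loop */

/* CPU wrote data for the GPU: fence so the stores are complete,
 * then flush the lines out to memory (assumed ordering). */
static void flush_range_sketch(void)
{
    fence_stub();
    clflush_stub();
}

/* GPU wrote data for the CPU: drop the stale lines first, then fence
 * so that later loads cannot be satisfied before the flush is done. */
static void invalidate_range_sketch(void)
{
    clflush_stub();
    fence_stub();
}
```

Calling the flush sketch followed by the invalidate sketch leaves the two fence markers on opposite sides of their clflush markers in op_log.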

Christian Gmeiner [Wed, 8 Feb 2017 12:14:05 +0000 (13:14 +0100)]

etnaviv: remove number of pixel pipes validation

This validation was added before the etnaviv drm driver landed in
the Linux kernel. Due to some pre-merge API changes we had to fix up
this value but with a mainline kernel this is not a problem anymore.

Let's remove that validation, which also gets rid of a problem caught
by Coverity, reported to me by imirkin.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Christian Gmeiner [Wed, 8 Feb 2017 12:07:25 +0000 (13:07 +0100)]

etnaviv: move pctx initialisation to avoid a null dereference

In case ctx->stream == NULL the fail label gets executed where
pctx gets dereferenced - too bad pctx is NULL in that case.

Caught by Coverity, reported to me by imirkin.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
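
The bug pattern is a shared fail label that uses a pointer which is only assigned after the first possible jump to that label. The sketch below shows the fixed ordering with hypothetical types and names; it is not the etnaviv code, and a zero stream size is used to simulate the allocation failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base context with a destroy hook. */
struct base_ctx {
    void (*destroy)(struct base_ctx *pctx);
};

/* Hypothetical driver context embedding the base context. */
struct my_ctx {
    struct base_ctx base;
    void *stream;
};

static void my_ctx_destroy(struct base_ctx *pctx)
{
    assert(pctx != NULL);
    struct my_ctx *ctx = (struct my_ctx *)pctx;
    free(ctx->stream); /* free(NULL) is a no-op */
    free(ctx);
}

/* stream_size == 0 simulates a failed stream allocation. */
static struct base_ctx *my_ctx_create(size_t stream_size)
{
    struct my_ctx *ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL)
        return NULL;

    /* Initialise pctx before anything can jump to fail. */
    struct base_ctx *pctx = &ctx->base;
    pctx->destroy = my_ctx_destroy;

    ctx->stream = stream_size ? malloc(stream_size) : NULL;
    if (ctx->stream == NULL)
        goto fail;

    return pctx;

fail:
    pctx->destroy(pctx); /* safe: pctx is already valid here */
    return NULL;
}
```

If the pctx assignment came after the stream check, the fail path would call through a NULL pointer, which is the dereference described above.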

Christian Gmeiner [Wed, 8 Feb 2017 12:03:19 +0000 (13:03 +0100)]

etnaviv: add missing fallthrough annotation

Caught by Coverity, reported to me by imirkin.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Emil Velikov [Mon, 20 Feb 2017 19:27:49 +0000 (19:27 +0000)]

docs/releasing.html: reword "distro breaking changes" hunk

v2: s/rare/rarely/ (Eric)

Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

Emil Velikov [Sun, 19 Feb 2017 11:49:21 +0000 (11:49 +0000)]

radv: make radv_resolve_entrypoint static

Used only within the generated source file.

Fixes: 12301c54186 ("radv: drop the RADV_CALL macro.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

Emil Velikov [Sun, 19 Feb 2017 11:49:20 +0000 (11:49 +0000)]

radv: remove unused radv_dispatch_table dtable

Fixes: 12301c54186 ("radv: drop the RADV_CALL macro.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

Emil Velikov [Sun, 19 Feb 2017 11:49:19 +0000 (11:49 +0000)]

anv: remove unused anv_dispatch_table dtable

Fixes: 4c9dec80ede ("anv: Get rid of the ANV_CALL macro")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:43 +0000 (15:16 +0000)]

i915: remove extern "C" guards

None of this code is used in C++ context.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:42 +0000 (15:16 +0000)]

i915: remove 'virtual' and extern C workarounds

Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:41 +0000 (15:16 +0000)]

i965: remove 'virtual' and extern C workarounds

The headers are properly annotated thus we don't need these.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:40 +0000 (15:16 +0000)]

i965: add extern C notation in headers

Otherwise symbols won't be annotated with C linkage and we'll fail at
link time.

Currently this is worked around by wrapping the header inclusion itself.
The latter is in itself fragile and not recommended.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
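
Annotating the header itself means every includer gets C linkage without wrapping the #include line. The sketch below shows the usual guard pattern in a single file; example_util.h and example_clamp are hypothetical names, and the header portion is marked by comments.

```c
#include <assert.h>

/* ---- contents of a hypothetical example_util.h ---- */
#ifdef __cplusplus
extern "C" {
#endif

/* Declared with C linkage for both C and C++ includers. */
int example_clamp(int v, int lo, int hi);

#ifdef __cplusplus
} /* extern "C" */
#endif
/* ---- end of example_util.h ---- */

/* ---- contents of the matching C source file ---- */
int example_clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}
```

A C++ file can then include the header directly. Wrapping the #include in extern "C" { } instead would also apply C linkage to any system headers pulled in transitively, which is the fragility the message refers to.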

Emil Velikov [Thu, 16 Feb 2017 15:16:39 +0000 (15:16 +0000)]

gallium: do not #include foo.h within extern C {}

Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:38 +0000 (15:16 +0000)]

nir: do not #include util/debug.h within extern C {}

It's a problem waiting to happen. Individual headers should be annotated
if needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:37 +0000 (15:16 +0000)]

glsl: resolve extern C workarounds/hacks

Do not wrap header inclusion in extern C since it can cause issues.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:36 +0000 (15:16 +0000)]

st/mesa: move extern C wrappers where applicable

Namely, after the include directives. The headers are properly annotated
so keeping things as-is is only asking for trouble.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Emil Velikov [Thu, 16 Feb 2017 15:16:35 +0000 (15:16 +0000)]

mesa/tests: remove unneeded extern C { #include foo } hack

The header itself (enums.h) is already properly annotated.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Emil Velikov [Thu, 16 Feb 2017 15:16:34 +0000 (15:16 +0000)]

mesa: remove unneeded extern C {} wrapper

compiler.h defines a few mesa specific macros which are not C specific.
This allows us to avoid buggy extern C { #include $system_header }
constructs.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Emil Velikov [Thu, 16 Feb 2017 15:16:33 +0000 (15:16 +0000)]

mesa: annotate functions for C linkage

i.e. add extern C {} in program/symbol_table.h

It will allow us to remove a workaround we have elsewhere in the code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
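
A minimal sketch of what annotating a header for C linkage usually looks like, assuming a hypothetical header with made-up names (example_table and its functions are not the contents of program/symbol_table.h). The header is shown inline so the example fits in one file.

```cpp
#include <cassert>

// --- contents of a hypothetical C header, shown inline ---
#ifndef EXAMPLE_TABLE_H
#define EXAMPLE_TABLE_H

#ifdef __cplusplus
extern "C" {            /* only seen by C++ compilers */
#endif

struct example_table {
   int depth;
};

void example_table_push_scope(struct example_table *table);
int example_table_depth(const struct example_table *table);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_TABLE_H */
// --- end of header ---

// Definitions; in a real tree these would live in a .c file.
void example_table_push_scope(struct example_table *table)
{
   assert(table != nullptr);
   table->depth++;
}

int example_table_depth(const struct example_table *table)
{
   assert(table != nullptr);
   return table->depth;
}
```

With the guard inside the header, a C++ caller can include it directly and no longer needs to wrap the include directive itself.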

commit | commitdiff | tree

Emil Velikov [Thu, 16 Feb 2017 15:16:32 +0000 (15:16 +0000)]

anv: remove unneeded extern C notation

Analogous to previous commit - never used in any C++ code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Emil Velikov [Thu, 16 Feb 2017 15:16:31 +0000 (15:16 +0000)]

radv: remove unneeded extern C notation

Header is never #include(d) by a C++ source.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Rhys Kidd [Sat, 11 Feb 2017 22:31:09 +0000 (17:31 -0500)]

glsl/tests: Add UINT64 and INT64 types

glsl/tests/uniform_initializer_utils.cpp:83:14: warning: enumeration value ‘GLSL_TYPE_UINT64’ not handled in switch [-Wswitch]
switch (type->base_type) {
^
glsl/tests/uniform_initializer_utils.cpp:83:14: warning: enumeration value ‘GLSL_TYPE_INT64’ not handled in switch [-Wswitch]

Fixes: 8ce53d4a2f3 ("glsl: Add basic ARB_gpu_shader_int64 types")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
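
The -Wswitch warning quoted above fires when a switch over an enum has no default label and misses an enumerator. A minimal sketch of the general fix, using a hypothetical enum and function rather than the real glsl_base_type or the test utility:

```cpp
#include <cassert>

// Hypothetical stand-in for a base-type enum that gained 64-bit integers.
enum example_base_type {
   EXAMPLE_TYPE_UINT,
   EXAMPLE_TYPE_INT,
   EXAMPLE_TYPE_FLOAT,
   EXAMPLE_TYPE_DOUBLE,
   EXAMPLE_TYPE_UINT64,
   EXAMPLE_TYPE_INT64
};

// Size in bytes of one component. Every enumerator has its own case and
// there is no default label, so the compiler can warn again if another
// enumerator is added later.
unsigned example_component_size(example_base_type type)
{
   switch (type) {
   case EXAMPLE_TYPE_UINT:
   case EXAMPLE_TYPE_INT:
   case EXAMPLE_TYPE_FLOAT:
      return 4;
   case EXAMPLE_TYPE_DOUBLE:
   case EXAMPLE_TYPE_UINT64:
   case EXAMPLE_TYPE_INT64:
      return 8;
   }
   assert(!"unhandled base type");
   return 0;
}
```

Leaving out the default label is deliberate here: a catch-all would silence the warning that pointed at the missing cases in the first place.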

commit | commitdiff | tree

Eric Engestrom [Tue, 14 Feb 2017 22:48:56 +0000 (22:48 +0000)]

docs: fix gamma correction link

That link has been dead for 15 years...
We could link to Archive.org [1] to get the last time this page existed,
but I feel like Wikipedia is a better choice.

[1] http://web.archive.org/web/20021211151318/http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

commit | commitdiff | tree

Eric Engestrom [Tue, 14 Feb 2017 21:18:52 +0000 (21:18 +0000)]

docs: add link to gallium doc

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

commit | commitdiff | tree

Nicolai Hähnle [Mon, 20 Feb 2017 11:07:21 +0000 (12:07 +0100)]

radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK

The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.

Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
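
For reference, the arithmetic behind integer clamping to an N-bit channel can be sketched on the CPU as below. This is not the shader epilog code; the helper names are hypothetical, and the only facts used are the ranges of N-bit unsigned and two's-complement integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp an unsigned value to what fits in `bits` bits: [0, 2^bits - 1].
uint32_t example_clamp_uint(uint32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 31);
   const uint32_t max_value = (1u << bits) - 1u;
   return std::min(value, max_value);
}

// Clamp a signed value to the two's-complement range of `bits` bits:
// [-2^(bits-1), 2^(bits-1) - 1].
int32_t example_clamp_sint(int32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 31);
   const int32_t max_value = (int32_t(1) << (bits - 1)) - 1;
   const int32_t min_value = -max_value - 1;
   return std::max(min_value, std::min(value, max_value));
}
```

For a 10-bit channel this gives [0, 1023] unsigned and [-512, 511] signed; for a 2-bit channel it gives [0, 3] and [-2, 1].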

commit | commitdiff | tree

Nicolai Hähnle [Mon, 20 Feb 2017 09:46:13 +0000 (10:46 +0100)]

radeonsi: handle MultiDrawIndirect in si_get_draw_start_count

Also handle the GL_ARB_indirect_parameters case where the count itself
is in a buffer.

Use transfers rather than mapping the buffers directly. This anticipates
the possibility that the buffers are sparse (once ARB_sparse_buffer is
implemented), in which case they cannot be mapped directly.

Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type
on <= CIK.

v2:
- unmap the indirect buffer correctly
- handle the corner case where we have indirect draws, but all of them
have count 0.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
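
The corner case from v2 can be illustrated with a small sketch that computes the vertex range covered by a list of draws. It assumes the per-draw start and count have already been read back from the indirect buffer; the struct and function names are hypothetical and this is not the si_get_draw_start_count code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct example_draw {
   uint32_t count;   // number of vertices or indices in this draw
   uint32_t start;   // first vertex or index of this draw
};

struct example_range {
   uint32_t start;
   uint32_t count;
};

// Returns the smallest range covering every draw with a non-zero count.
// If all draws have count 0, or there are no draws, the range is empty.
example_range example_draw_bounds(const std::vector<example_draw> &draws)
{
   bool any = false;
   uint32_t begin = 0, end = 0;

   for (const example_draw &d : draws) {
      if (d.count == 0)
         continue;   // empty draws must not influence the range

      assert(d.start <= UINT32_MAX - d.count);   // no 32-bit overflow
      const uint32_t d_end = d.start + d.count;

      if (!any) {
         begin = d.start;
         end = d_end;
         any = true;
      } else {
         begin = std::min(begin, d.start);
         end = std::max(end, d_end);
      }
   }

   if (!any)
      return {0, 0};
   return {begin, end - begin};
}
```

Skipping zero-count draws matters because their start values may be arbitrary and would otherwise widen the range.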

commit | commitdiff | tree

Nicolai Hähnle [Sun, 19 Feb 2017 09:42:57 +0000 (10:42 +0100)]

winsys/amdgpu: reduce max_alloc_size based on GTT limits

Allocating huge buffers in VRAM is not a problem, but when those buffers
start being migrated, the kernel runs into errors because it cannot split
those buffers up for moving through GTT.

This should fix intermittent failures of
GL45-CTS.texture_buffer.texture_buffer_max_size

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Bas Nieuwenhuizen [Mon, 20 Feb 2017 08:27:17 +0000 (09:27 +0100)]

radv: Don't flush at the start of a command buffer.

The preamble flushes now and the rest is the responsibility of the app.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Bas Nieuwenhuizen [Mon, 20 Feb 2017 08:26:00 +0000 (09:26 +0100)]

radv: Flush in the initial preamble CS.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Bas Nieuwenhuizen [Mon, 20 Feb 2017 08:08:31 +0000 (09:08 +0100)]

radv: Special case the initial preamble.

For flushing we don't want to flush every third IB.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Bas Nieuwenhuizen [Mon, 20 Feb 2017 00:57:46 +0000 (01:57 +0100)]

radv: Split emitting the cache flush out.

So that we can use it without a cmd_buffer.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Bas Nieuwenhuizen [Mon, 20 Feb 2017 01:22:39 +0000 (02:22 +0100)]

radv: Free empty_cs on device destruction.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Ben Skeggs [Tue, 21 Feb 2017 00:01:16 +0000 (10:01 +1000)]

nvc0: use PascalB for most Pascal boards

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Dave Airlie [Mon, 20 Feb 2017 05:13:33 +0000 (15:13 +1000)]

radv: handle subpass cache flushes

This splits out the cache flush bit setting code
dependent on the src/dest access flags.

It then calls it from the subpass barrier code.

It also marks a TODO to remove the aggressive CS/PS
flushes at some point.

This fixes a bunch of the
dEQP-VK.renderpass.attachment_allocation.input_output.*
tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>

commit | commitdiff | tree

Grazvydas Ignotas [Fri, 10 Feb 2017 23:01:40 +0000 (01:01 +0200)]

r300g: only allow byteswapped formats on big endian

They cause regressions on little endian.

Fixes: 172bfdaa9e ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Timothy Arceri [Sun, 19 Feb 2017 23:16:20 +0000 (10:16 +1100)]

mesa: remove unused variable warning in release builds

This assert might have made sense before but we no longer use
gl_linked_shader here. Unless the caller has really done something
crazy this assert is fairly useless.

We also do some small tidy ups in this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Emil Velikov [Mon, 13 Feb 2017 19:23:41 +0000 (19:23 +0000)]

docs/submittingpatches.html: document the Fixes tag

Provide information and an example.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Emil Velikov [Mon, 13 Feb 2017 19:23:40 +0000 (19:23 +0000)]

docs/submittingpatches.html: remove version tag for nominations

The version tag used to nominate has bitten even experienced mesa
developers. Not to mention that it deviates from the one used in the
kernel leading to further confusion.

Simplify things and omit it altogether.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Emil Velikov [Mon, 13 Feb 2017 19:23:39 +0000 (19:23 +0000)]

docs/submittingpatches.html: add #backports section

Provide information about merge conflicts resolution and sending
backports.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Emil Velikov [Mon, 13 Feb 2017 19:23:38 +0000 (19:23 +0000)]

docs/submittingpatches.html: rework the #criteria section

Reword the section to focus on what is allowed, using briefer yet
descriptive wording.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Emil Velikov [Thu, 16 Feb 2017 14:09:41 +0000 (14:09 +0000)]

travis: bring the scons build on par with AppVeyor

Namely, always build with LLVM and run the check target.

Cc: Rhys Kidd <rhyskidd@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

commit | commitdiff | tree

Ben Crocker [Thu, 19 Jan 2017 01:44:09 +0000 (20:44 -0500)]

gallivm: Reenable PPC VSX (v3)

Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1,
now that LLVM bug 26775 and its corollary, 25503, are fixed.

Amendment: remove extraneous spaces in macro def & invocations.

We would prefer a runtime check, e.g. via an LLVMQueryString
(analogous to glGetString, eglQueryString) or LLVMGetVersion API,
but no such API exists at this time.

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
[Emil Velikov: remove LLVM_VERSION macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
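
Since no runtime version query is available, the check has to be made at build time. A minimal sketch of a ">= 3.8.1" comparison on a (major, minor, patch) triple, assuming the three numbers are supplied by the build system; the function name is hypothetical and this is not the macro used in the patch.

```cpp
#include <cassert>

// True when version (major, minor, patch) is at least
// (req_major, req_minor, req_patch), compared field by field.
constexpr bool example_version_at_least(unsigned major, unsigned minor,
                                        unsigned patch, unsigned req_major,
                                        unsigned req_minor, unsigned req_patch)
{
   return major != req_major ? major > req_major
        : minor != req_minor ? minor > req_minor
        : patch >= req_patch;
}

// Evaluated by the compiler, so a wrong comparison fails the build.
static_assert(example_version_at_least(3, 8, 1, 3, 8, 1), "3.8.1 qualifies");
static_assert(!example_version_at_least(3, 8, 0, 3, 8, 1), "3.8.0 does not");
```

Comparing the fields in order avoids the common mistake of testing each field independently, which would wrongly reject a version such as 4.0.0.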

commit | commitdiff | tree

Ben Crocker [Fri, 10 Feb 2017 23:08:07 +0000 (18:08 -0500)]

gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)

If llvm::sys::getHostCPUName() returns "generic", override
it with "pwr8" (on PPC64LE).

This is a work-around for a bug in LLVM: a table entry for "POWER8NVL"
is missing, resulting in (big-endian) "generic" being returned on
little-endian Power8NVL systems. The result is that code that
attempts to load the least significant 32 bits of a 64-bit quantity in
memory loads the wrong half.

This omission should be fixed in the next version of LLVM (4.0),
but this work-around should be left in place in case some
future version of POWER<n> also ends up unrepresented in LLVM's table.

This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion
tests on POWER8NVL processors.

(V4: add similar comment in the code.)

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
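
The override described above amounts to a small string substitution. A minimal sketch, assuming the host CPU name has already been obtained (in the real code from llvm::sys::getHostCPUName()) and that the caller knows whether it is building for PPC64LE; the function name and the boolean parameter are hypothetical.

```cpp
#include <cassert>
#include <string>

// Replace the unhelpful "generic" answer with "pwr8" on PPC64LE only;
// every other name, and every other architecture, is passed through.
std::string example_select_cpu(const std::string &host_cpu, bool is_ppc64le)
{
   if (is_ppc64le && host_cpu == "generic")
      return "pwr8";
   return host_cpu;
}
```

Because the substitution is keyed on "generic" rather than on a specific processor, it also covers any later POWER variant that is missing from the lookup table.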

Domain: Graphics System / GL;
