platform/upstream/mesa.git
6 years ago ac: add ac_count_scratch_private_memory()
Samuel Pitoiset [Thu, 1 Mar 2018 21:12:54 +0000 (22:12 +0100)]
ac: add ac_count_scratch_private_memory()

Imported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago ac/nir: only enable used channels when exporting parameters
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:22 +0000 (11:54 +0100)]
ac/nir: only enable used channels when exporting parameters

This allows us to generate, for example,
"exp param0 v0, off, off, off" if only the first channel is needed.

Not sure if this improves performance but it's worth trying.
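
As a rough illustration of the idea above (not the actual ac/nir code), the sketch below turns a 4-bit channel mask into the operand list of a parameter export. The helper name `format_param_export` and the `v0`..`v3` register naming are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only; it is not Mesa code.
 * Bit i of enabled_channels set means channel i is exported. Naming the
 * source registers v0..v3 is an assumption made for readability. */
static void format_param_export(char *buf, size_t size, unsigned param,
                                unsigned enabled_channels)
{
    int len = snprintf(buf, size, "exp param%u", param);

    for (unsigned chan = 0; chan < 4; chan++) {
        if (len < 0 || (size_t)len >= size)
            return; /* truncated output or encoding error: stop early */

        const char *sep = chan == 0 ? " " : ", ";
        if (enabled_channels & (1u << chan))
            len += snprintf(buf + len, size - (size_t)len, "%sv%u", sep, chan);
        else
            len += snprintf(buf + len, size - (size_t)len, "%soff", sep);
    }
}

/* Usage: mask 0x1 keeps only the first channel, mask 0xf keeps all four. */
static void example_export_usage(void)
{
    char buf[64];

    format_param_export(buf, sizeof(buf), 0, 0x1);
    assert(strcmp(buf, "exp param0 v0, off, off, off") == 0);

    format_param_export(buf, sizeof(buf), 0, 0xf);
    assert(strcmp(buf, "exp param0 v0, v1, v2, v3") == 0);
}
```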

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago ac: update enabled channels mask when optimizing PARAM exports
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:21 +0000 (11:54 +0100)]
ac: update enabled channels mask when optimizing PARAM exports

When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago ac/nir: pass the number of enabled channels to si_llvm_init_export_args()
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:20 +0000 (11:54 +0100)]
ac/nir: pass the number of enabled channels to si_llvm_init_export_args()

Currently, it's always 0xf but an upcoming patch will reduce the
number of channels for parameters export.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago ac/shader: scan output usage mask for VS and TES
Samuel Pitoiset [Thu, 1 Mar 2018 10:54:19 +0000 (11:54 +0100)]
ac/shader: scan output usage mask for VS and TES

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago intel: Add missing includes for building on Android
Clayton Craft [Tue, 6 Mar 2018 01:00:05 +0000 (17:00 -0800)]
intel: Add missing includes for building on Android

This adds a missing library to the i965/Android.mk file, and updates
intel/Android.mk to include the new library. Without this, mesa does not
build on Android.

Fixes: 272bef0601a "intel: Split gen_device_info out into
libintel_dev"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years ago vulkan: do not expose surface/swapchain extensions on Android
Tapani Pälli [Mon, 5 Mar 2018 16:46:20 +0000 (18:46 +0200)]
vulkan: do not expose surface/swapchain extensions on Android

On Android, surface/swapchain extensions are implemented by the loader. This
patch modifies both the anv and radv extension scripts to disable the currently
exposed ones. See also earlier commit 9f763c1f9b.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years ago anv: Don't expose VK_KHX_multiview on android.
Tapani Pälli [Mon, 5 Mar 2018 08:57:29 +0000 (10:57 +0200)]
anv: Don't expose VK_KHX_multiview on android.

Just like commit 2ffe395 does for radv.

Fixes the following dEQP test on i965:
   dEQP-VK.api.info.android.no_unknown_extensions

v2: make it !ANDROID since this extension is not about
    surfaces/swapchain

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years ago gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128
Roland Scheidegger [Tue, 27 Feb 2018 02:38:17 +0000 (03:38 +0100)]
gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128

Some state trackers require 128.
(There are no plans to increase PIPE_MAX_SAMPLERS too, since with the gl
state tracker it's unlikely more than 32 will be needed. If you need
more, use bindless.)

6 years ago tgsi/scan: use wrap-around shift behavior explicitly for file_mask
Roland Scheidegger [Fri, 2 Mar 2018 02:00:41 +0000 (03:00 +0100)]
tgsi/scan: use wrap-around shift behavior explicitly for file_mask

The comment said it will only represent the lowest 32 regs. This was
not entirely true in practice, since at least on x86 you'll get
masked shifts (unless the compiler could recognize it already and toss
it out). It turns out this actually works out alright (presumably
no one uses it for temp regs) when increasing max sampler views, so
make that behavior explicit.
Admittedly it feels a bit hacky (but in any case, explicit behavior there
is better than undefined behavior).
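
A minimal sketch of the difference, assuming a 32-bit mask and a hypothetical helper named `mark_used` (this is not the actual tgsi_scan code): masking the shift count with 31 makes the wrap-around part of the program's defined behavior, whereas shifting a 32-bit value by 32 or more is undefined in C even if x86 happens to mask the count in hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only. Marks register `index` as
 * used in a 32-bit mask. The "& 31" makes the wrap-around explicit:
 * index 32 aliases bit 0, index 33 aliases bit 1, and so on. */
static uint32_t mark_used(uint32_t file_mask, unsigned index)
{
    return file_mask | (UINT32_C(1) << (index & 31));
}

/* Usage: indices 3 and 35 land on the same bit, index 64 lands on bit 0. */
static void example_mask_usage(void)
{
    uint32_t mask = 0;

    mask = mark_used(mask, 3);
    mask = mark_used(mask, 35); /* wraps to bit 3 */
    mask = mark_used(mask, 64); /* wraps to bit 0 */

    assert(mask == UINT32_C(0x9));
}
```

The cost of this choice is the aliasing itself: the mask can no longer tell register 3 from register 35, which is only acceptable if callers never rely on bits for indices above 31.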

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
6 years ago clover: Allow overriding platform/device version numbers
Aaron Watry [Thu, 10 Aug 2017 03:02:30 +0000 (22:02 -0500)]
clover: Allow overriding platform/device version numbers

Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.
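
The general pattern can be sketched in plain C as follows. This is only an assumption-laden illustration: the changelog mentions debug_get_option, but the sketch uses the standard `getenv` instead, and the helper name `get_version_string` and the variable name `EXAMPLE_CL_VERSION_OVERRIDE` are hypothetical, not the names clover uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only: return the value of the
 * environment variable env_name if it is set and non-empty, otherwise
 * fall back to the built-in default version string. */
static const char *get_version_string(const char *env_name,
                                      const char *default_version)
{
    const char *override = getenv(env_name);

    if (override != NULL && override[0] != '\0')
        return override;
    return default_version;
}

/* Usage: with the (hypothetical) variable unset, the default is reported. */
static void example_version_usage(void)
{
    const char *v = get_version_string("EXAMPLE_CL_VERSION_OVERRIDE_UNSET",
                                       "1.1");
    assert(strcmp(v, "1.1") == 0);
}
```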

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
v4: Remove redundant std::string wrapper around debug_get_option calls
v3: mark CL version overrides as static and const
v2: Make version_string in platform const in case

6 years ago clover/llvm: Pass device down to compile
Aaron Watry [Sun, 6 Aug 2017 01:41:40 +0000 (20:41 -0500)]
clover/llvm: Pass device down to compile

We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.

v2: Rebase after removing the previous patch (Pierre)
  - Removed "clover: Add device_clc_version to llvm::create_compiler_instance"

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
6 years ago clover: Pass device to llvm::create_compiler_instance
Aaron Watry [Sun, 6 Aug 2017 01:18:48 +0000 (20:18 -0500)]
clover: Pass device to llvm::create_compiler_instance

We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
    instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
    patch to prevent temporary build breakage.
    (Jan) Use device_clc_version instead of device_version for compile/link

6 years ago clover/llvm: Use device in llvm compilation instead of copying fields
Aaron Watry [Sat, 10 Feb 2018 20:03:13 +0000 (14:03 -0600)]
clover/llvm: Use device in llvm compilation instead of copying fields

Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.

v3: Rebase on current master
v2: Use device in function args before making additional changes in
    following patches

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
6 years ago radeonsi/nir: fix handling of doubles for gs inputs
Timothy Arceri [Thu, 1 Mar 2018 04:21:52 +0000 (15:21 +1100)]
radeonsi/nir: fix handling of doubles for gs inputs

Fixes piglit test:
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago ac: pass the unmodified number of components to load gs inputs
Timothy Arceri [Thu, 1 Mar 2018 04:37:25 +0000 (15:37 +1100)]
ac: pass the unmodified number of components to load gs inputs

Currently both users of this would overflow an array when the
input was a dual slot double as they expected the number of
components to be a max of 4.

Since we pass the type we can just let the functions handle
doubles in a way they choose.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago radeonsi: move si_nir_load_input_gs() to si_shader.c
Timothy Arceri [Thu, 1 Mar 2018 04:17:34 +0000 (15:17 +1100)]
radeonsi: move si_nir_load_input_gs() to si_shader.c

All the tess shader and tgsi equivalents are here and it allows
us to use llvm_type_is_64bit() in the following patch without
exposing it externally.

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago broadcom/vc4: Add support for HW perfmon
Boris Brezillon [Thu, 11 Jan 2018 09:22:04 +0000 (10:22 +0100)]
broadcom/vc4: Add support for HW perfmon

The V3D engine provides several perf counters.
Implement ->get_driver_query_[group_]info() so that these counters are
exposed through the GL_AMD_performance_monitor extension.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
6 years ago drm-uapi: Update vc4 header with perfmon related definitions
Boris Brezillon [Thu, 11 Jan 2018 09:22:03 +0000 (10:22 +0100)]
drm-uapi: Update vc4 header with perfmon related definitions

v2: Update to the final version with the documentation.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
6 years ago r600: fix color export mask
Roland Scheidegger [Mon, 5 Mar 2018 19:12:32 +0000 (20:12 +0100)]
r600: fix color export mask

The r600 code (not the eg one) forgot to copy the ps_color_export_mask
in commit 5b14e06d8b42e2b08ebc52b6c314ef8647d87a1f when updating the
pixel state, leading to misrenderings (probably with MRT).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262

Tested-by: LoneVVolf <lonewolf@xs4all.nl>
Tested-by: Pavel Vinogradov <public@sourcemage.org>
6 years ago travis: keep meson version below 0.45.0
Andres Gomez [Mon, 5 Mar 2018 15:25:36 +0000 (17:25 +0200)]
travis: keep meson version below 0.45.0

Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is
not available in Trusty.

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
6 years ago intel: Drop SURFACE_FORMAT enum from genxml.
Kenneth Graunke [Wed, 14 Feb 2018 02:13:51 +0000 (18:13 -0800)]
intel: Drop SURFACE_FORMAT enum from genxml.

We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago intel/common: Use isl for decoder surface formats
Jordan Justen [Tue, 27 Feb 2018 04:31:22 +0000 (20:31 -0800)]
intel/common: Use isl for decoder surface formats

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago intel/isl: Add isl_format_is_valid
Jordan Justen [Tue, 27 Feb 2018 01:57:19 +0000 (17:57 -0800)]
intel/isl: Add isl_format_is_valid

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago intel: Split gen_device_info out into libintel_dev
Jordan Justen [Mon, 26 Feb 2018 23:39:59 +0000 (15:39 -0800)]
intel: Split gen_device_info out into libintel_dev

Split out the device info so isl doesn't depend on intel/common. Now
it will depend on the new intel/dev device info lib.

This will allow the decoder in intel/common to use isl, allowing us to
apply Ken's patch that removes the genxml duplication of surface
formats.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago gallium/aux/hud: Avoid possible buffer overflow
Gert Wollny [Wed, 28 Feb 2018 13:50:21 +0000 (14:50 +0100)]
gallium/aux/hud: Avoid possible buffer overflow

Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.
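
The kind of check described above can be sketched as follows. This is not the HUD code: the struct `example_info`, the field size `EXAMPLE_NAME_SIZE`, and the helper `example_set_name` are hypothetical stand-ins for a fixed-size name field and the code that fills it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EXAMPLE_NAME_SIZE 16 /* hypothetical size of the fixed name field */

struct example_info {
    char name[EXAMPLE_NAME_SIZE];
};

/* Hypothetical helper, for illustration only: reject names that would not
 * fit (including the terminating NUL) instead of copying them blindly. */
static bool example_set_name(struct example_info *info, const char *name)
{
    size_t len = strlen(name);

    if (len >= sizeof(info->name))
        return false; /* too long: copying would overflow the field */

    memcpy(info->name, name, len + 1); /* length was checked above */
    return true;
}

/* Usage: a short name is accepted, an over-long one is rejected. */
static void example_name_usage(void)
{
    struct example_info info;

    assert(example_set_name(&info, "cpu0"));
    assert(strcmp(info.name, "cpu0") == 0);
    assert(!example_set_name(&info, "a-name-that-is-far-too-long"));
}
```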

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years ago gbm: give a name to rgba fields
Eric Engestrom [Wed, 28 Feb 2018 16:08:54 +0000 (16:08 +0000)]
gbm: give a name to rgba fields

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years ago egl: remove duplicated initialization
Andres Gomez [Fri, 2 Mar 2018 11:28:28 +0000 (13:28 +0200)]
egl: remove duplicated initialization

Found by inspection.

The line removed is a duplicate of the line literally just above the
3 lines of context usually printed in a commit log.

v2: enhance the commit log (Emil).

Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years ago freedreno/ir3: start dealing with half-precision
Rob Clark [Fri, 2 Mar 2018 15:21:55 +0000 (10:21 -0500)]
freedreno/ir3: start dealing with half-precision

Some instructions assume src and/or dst is half-precision based on a
type field (i.e. f32/s32/u32 are full precision but others are half
precision).  So add some code to sanity check the src/dst registers to
catch mixups.
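
The rule in the paragraph above can be sketched as a small predicate. The enum and function names here are hypothetical and are not the ir3 definitions; the sketch only encodes "f32/s32/u32 are full precision, everything else is half" and the resulting register check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type enum, for illustration only. */
enum example_type {
    EXAMPLE_F16, EXAMPLE_F32,
    EXAMPLE_U16, EXAMPLE_U32,
    EXAMPLE_S16, EXAMPLE_S32,
};

/* Only the 32-bit float/signed/unsigned types count as full precision. */
static bool example_type_is_half(enum example_type type)
{
    return !(type == EXAMPLE_F32 || type == EXAMPLE_U32 ||
             type == EXAMPLE_S32);
}

/* Sanity check: a register's half-precision flag must agree with the
 * precision implied by the instruction's type field. */
static bool example_reg_matches_type(bool reg_is_half, enum example_type type)
{
    return reg_is_half == example_type_is_half(type);
}

/* Usage: a half register with an f32 type field is a mixup. */
static void example_precision_usage(void)
{
    assert(example_reg_matches_type(true, EXAMPLE_F16));
    assert(example_reg_matches_type(false, EXAMPLE_U32));
    assert(!example_reg_matches_type(true, EXAMPLE_F32));
}
```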

Also propagate half-precision flag for SSA sources.  The instruction
consuming a SSA value needs to be of the same type as the one producing
it.

This is probably not complete half-precision support, but a useful first
step.  We do still need to add support for nir alu instructions for
converting between half/full precision.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/ir3: fix fixing-up register footprint
Rob Clark [Wed, 28 Feb 2018 22:33:29 +0000 (17:33 -0500)]
freedreno/ir3: fix fixing-up register footprint

It isn't just vertex shaders that need to fix up reg footprint for inputs
populated before the shader starts.

This problem showed up with compute shaders.  If you have (for example)
a localregid sysval, but only the .x component is used, the hw still
writes the .yz components, which could overflow into other threads
causing corruption.  Showed up in cl cts 'basic/test_basic intmath_int'.
But in theory the same problem could crop up elsewhere.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: surfaces can be PIPE_BUFFER
Rob Clark [Tue, 27 Feb 2018 17:59:57 +0000 (12:59 -0500)]
freedreno: surfaces can be PIPE_BUFFER

At least for clover.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/a5xx: handle compute resources
Rob Clark [Mon, 26 Feb 2018 18:38:22 +0000 (13:38 -0500)]
freedreno/a5xx: handle compute resources

Not *entirely* sure why this is a different BIND bit, but it is.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/ir3: ignore return jump
Rob Clark [Mon, 26 Feb 2018 18:04:21 +0000 (13:04 -0500)]
freedreno/ir3: ignore return jump

I think this should also always only occur at the end of a BB (by
definition), and the BB successor should be the end block.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: add some more compute caps
Rob Clark [Mon, 26 Feb 2018 16:29:05 +0000 (11:29 -0500)]
freedreno: add some more compute caps

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/a5xx: don't expose 64b pointers yet
Rob Clark [Mon, 26 Feb 2018 16:24:13 +0000 (11:24 -0500)]
freedreno/a5xx: don't expose 64b pointers yet

Temporary hack, but since we can't do 64b math yet in ir3, pretend that
we don't support 64b pointers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: steal handy macro for compute caps from nouveau
Rob Clark [Mon, 26 Feb 2018 16:22:33 +0000 (11:22 -0500)]
freedreno: steal handy macro for compute caps from nouveau

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: add global_bindings state
Rob Clark [Sun, 25 Feb 2018 21:11:06 +0000 (16:11 -0500)]
freedreno: add global_bindings state

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/ir3: small cleanup
Rob Clark [Sun, 25 Feb 2018 20:05:01 +0000 (15:05 -0500)]
freedreno/ir3: small cleanup

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: add pctx->memory_barrier()
Rob Clark [Sun, 25 Feb 2018 20:01:07 +0000 (15:01 -0500)]
freedreno: add pctx->memory_barrier()

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/ir3: cmdline compiler updates for spv shaders
Rob Clark [Sat, 24 Feb 2018 16:33:09 +0000 (11:33 -0500)]
freedreno/ir3: cmdline compiler updates for spv shaders

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago ac: add ac_build_fsign()
Samuel Pitoiset [Fri, 2 Mar 2018 14:01:32 +0000 (15:01 +0100)]
ac: add ac_build_fsign()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years ago ac: add ac_build_isign()
Samuel Pitoiset [Fri, 2 Mar 2018 14:01:31 +0000 (15:01 +0100)]
ac: add ac_build_isign()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years ago ac: add ac_build_fract()
Samuel Pitoiset [Fri, 2 Mar 2018 14:01:30 +0000 (15:01 +0100)]
ac: add ac_build_fract()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years ago virgl: add offset alignment values to v2 caps struct
gurchetansingh@chromium.org [Fri, 23 Feb 2018 02:02:18 +0000 (18:02 -0800)]
virgl: add offset alignment values to v2 caps struct

glBindBufferRange(..) in vrend_draw_bind_ubo is failing with
more than one uniform block. This is due to improper alignment
of the start of the second block. Let's query the proper
alignment from the driver and pass it back to Mesa.

Let's query for the texture alignment too, even though the Virgl
renderer doesn't call glTexBufferRange yet.

The default values are the widest workable range possible (for example,
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256).
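
To make the alignment requirement concrete, here is a minimal sketch of rounding a buffer offset up to a required alignment. The helper `example_align_up` is hypothetical and not part of virgl; the value 256 is the Nvidia figure quoted above, and the 80-byte block size is an arbitrary example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: round offset up to the next
 * multiple of alignment. alignment must be non-zero; it does not have to
 * be a power of two. */
static uint32_t example_align_up(uint32_t offset, uint32_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

/* Usage: if the first uniform block occupies 80 bytes, the second block
 * may not start at offset 80 when the required alignment is 256; it has
 * to start at offset 256 instead. */
static void example_align_usage(void)
{
    assert(example_align_up(0, 256) == 0);
    assert(example_align_up(80, 256) == 256);
    assert(example_align_up(256, 256) == 256);
    assert(example_align_up(257, 256) == 512);
}
```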

Fixes:
dEQP-GLES3.functional.ubo.* on Nvidia

Example test:
dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex

Note: This is based on "virgl: reduce some default capset limits.",
which hasn't landed in Mesa yet but should relatively soon.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago virgl: reduce some default capset limits.
Dave Airlie [Wed, 21 Feb 2018 01:48:05 +0000 (11:48 +1000)]
virgl: reduce some default capset limits.

Since v2 might take a while to roll out, we should reduce
these to within some gathered minimums and then v2 can increase
them using host values.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago virgl: handle getting new capsets.
Dave Airlie [Thu, 15 Feb 2018 04:20:37 +0000 (14:20 +1000)]
virgl: handle getting new capsets.

This checks the kernel api is new enough and asks for the
larger caps size since the kernel won't mess it up now.

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago radeonsi/nir: call ac_lower_indirect_derefs()
Timothy Arceri [Mon, 5 Mar 2018 01:06:01 +0000 (12:06 +1100)]
radeonsi/nir: call ac_lower_indirect_derefs()

Fixes piglit tests:
tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test
tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radeonsi: add chip class to compiler_ctx_state
Timothy Arceri [Mon, 5 Mar 2018 01:04:47 +0000 (12:04 +1100)]
radeonsi: add chip class to compiler_ctx_state

This will be used in the following patch.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c
Timothy Arceri [Mon, 5 Mar 2018 00:13:11 +0000 (11:13 +1100)]
ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c

Until llvm handles indirects better we will need to use these
workarounds in the radeonsi backend also.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: Fix copying from 3D images starting at non-zero depth.
Bas Nieuwenhuizen [Sun, 4 Mar 2018 13:47:20 +0000 (14:47 +0100)]
radv: Fix copying from 3D images starting at non-zero depth.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years ago swr/rast: Fix macOS macro.
Vinson Lee [Sat, 24 Feb 2018 23:49:32 +0000 (15:49 -0800)]
swr/rast: Fix macOS macro.

Fixes: a25093de7188 ("swr/rast: Implement JIT shader caching to disk")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
6 years ago vbo: Try to reuse the same VAO more often for successive dlists.
Mathias Fröhlich [Wed, 28 Feb 2018 07:31:44 +0000 (08:31 +0100)]
vbo: Try to reuse the same VAO more often for successive dlists.

The change tries to catch more opportunities to reuse the same set
of VAOs when building up display lists. Instead of checking the
offset with respect to the beginning of the vertex buffer object,
the change tries to apply this same optimization with respect to the
previous display list node.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
6 years ago mesa: Silence unused parameter warnings from TEXSTORE_PARAMS
Ian Romanick [Thu, 22 Feb 2018 03:23:44 +0000 (19:23 -0800)]
mesa: Silence unused parameter warnings from TEXSTORE_PARAMS

Reduces my build from 1717 warnings to 1547 warnings by silencing 170
instances of things like

In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0,
                 from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31:
../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’:
../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter]
  mesa_format dstFormat, \
              ^
../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’
 _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
                                ^~~~~~~~~~~~~~~
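
The log does not show how each of these warnings was silenced. One common approach, sketched here under that caveat, is to annotate the parameter with an "unused" attribute behind a macro. The macro name `EXAMPLE_UNUSED` and the function `example_add_one` are hypothetical.

```c
#include <assert.h>

/* Hypothetical macro, for illustration only. On GCC and Clang it expands
 * to the "unused" attribute, which suppresses -Wunused-parameter for the
 * annotated parameter; elsewhere it expands to nothing. */
#if defined(__GNUC__)
#define EXAMPLE_UNUSED __attribute__((unused))
#else
#define EXAMPLE_UNUSED
#endif

/* The signature must keep `reserved` (for example because the function is
 * used through a shared function-pointer type), but the body ignores it. */
static int example_add_one(int value, EXAMPLE_UNUSED int reserved)
{
    return value + 1;
}

/* Usage: the ignored argument has no effect on the result. */
static void example_unused_usage(void)
{
    assert(example_add_one(1, 0) == 2);
    assert(example_add_one(1, 99) == 2);
}
```

An alternative with the same effect is a `(void)reserved;` statement in the function body, which works on any C compiler but cannot be applied to parameters declared through a shared macro without editing each function.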

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago i965: Silence unused parameter warnings in genX_state_upload
Ian Romanick [Thu, 22 Feb 2018 00:16:53 +0000 (16:16 -0800)]
i965: Silence unused parameter warnings in genX_state_upload

Reduces my build from 1772 warnings to 1717 warnings by silencing 55
instances of things like

../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter]
                                unsigned end_offset,
                                         ^~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter]
 genX(emit_sampler_state_pointers_xs)(struct brw_context *brw,
                                                          ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter]
                                      struct brw_stage_state *stage_state)
                                                              ^~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter]
                            mesa_format format, GLenum base_format,
                                        ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter]
 translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest)
                                         ^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter]
                            uint32_t batch_offset_for_sampler_state)
                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago isl: Silence unused parameter warnings in __gen_combine_address implementations
Ian Romanick [Wed, 21 Feb 2018 02:42:02 +0000 (18:42 -0800)]
isl: Silence unused parameter warnings in __gen_combine_address implementations

Reduces my build from 1808 warnings to 1772 warnings by silencing 36
instances of things like

../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’:
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                             ^~~~
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter]
 __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
                                         ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago genxml: Silence unused parameter warnings in generated pack code
Ian Romanick [Sat, 17 Feb 2018 03:09:13 +0000 (19:09 -0800)]
genxml: Silence unused parameter warnings in generated pack code

Reduces my build from 1960 warnings to 1808 warnings by silencing 152
instances of things like

In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0,
                 from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36:
src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’:
src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_uint(uint64_t v, uint32_t start, uint32_t end)
                                                 ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’:
src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                   ^~~~~
src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
                                                   ^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’:
src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter]
 __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
                                                ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago i965: Silence unused parameter warnings in blorp
Ian Romanick [Sat, 17 Feb 2018 03:00:21 +0000 (19:00 -0800)]
i965: Silence unused parameter warnings in blorp

Reduces my build from 2023 warnings to 1960 warnings by silencing 63
instances of things like

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter]
                        const struct blorp_params *params)
                                                   ^~~~~~
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter]
                          const struct blorp_params *params)
                                                     ^~~~~~
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter]
                     const struct blorp_params *params)
                                                ^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                       ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                    ^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter]
 blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
                                                                  ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago nir: Silence unused parameter warnings in generated nir_constant_expressions code
Ian Romanick [Sat, 17 Feb 2018 01:48:57 +0000 (17:48 -0800)]
nir: Silence unused parameter warnings in generated nir_constant_expressions code

Reduces my build from 2075 warnings to 2023 warnings by silencing 52
instances of things like

src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’:
src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter]
 evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size,
                                                             ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years ago i965: Silence unused parameter warnings in generated OA code
Ian Romanick [Mon, 12 Feb 2018 19:26:39 +0000 (11:26 -0800)]
i965: Silence unused parameter warnings in generated OA code

Reduces my build from 6301 warnings to 2075 warnings by silencing 4226
instances of things like

src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’:
src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter]
 hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw,
                                                              ^~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoi965: Silence warnings about mixing enum and non-enum in conditional
Ian Romanick [Mon, 12 Feb 2018 19:16:55 +0000 (11:16 -0800)]
i965: Silence warnings about mixing enum and non-enum in conditional

Reduces my build from 6451 warnings to 6301 warnings by silencing 150
instances of

../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
    unsigned file = __builtin_strcmp("dst", #reg) == 0 ?                       \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
                    BRW_GENERAL_REGISTER_FILE :                                \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                    brw_inst_##reg##_reg_file(devinfo, inst);                  \
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’
 REG_TYPE(src1)
 ^~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agointel/compiler: Silence unused parameter warnings in release builds
Ian Romanick [Mon, 12 Feb 2018 18:57:06 +0000 (10:57 -0800)]
intel/compiler: Silence unused parameter warnings in release builds

Reduces my build from 7005 warnings to 6451 warnings by silencing 554
instances of

In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0:
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                         ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
                                               ^~~~~~~
In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0,
                 from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29:
../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’:
../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
 brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo,
                                                             ^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’:
../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter]
         unsigned _reg_file,
                  ^~~~~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoi965: Silence unused parameter warnings
Ian Romanick [Mon, 12 Feb 2018 18:52:49 +0000 (10:52 -0800)]
i965: Silence unused parameter warnings

Reduces my build from 7119 warnings to 7005 warnings by silencing 114
instances of

In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0,
                 from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter]
 static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; }
                                               ^~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
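
A minimal sketch of one common way to silence this class of warning, assuming GCC or Clang: mark the intentionally ignored parameter with the `unused` attribute. The `EXAMPLE_UNUSED` macro, `struct example_bo` and `example_bo_unmap()` are hypothetical stand-ins for illustration, not the names or the exact approach used in the patch.

```c
/* Hypothetical macro: expands to the GCC/Clang "unused" attribute and
 * to nothing on compilers that do not support it. */
#if defined(__GNUC__)
#define EXAMPLE_UNUSED __attribute__((unused))
#else
#define EXAMPLE_UNUSED
#endif

struct example_bo;   /* hypothetical stand-in for a buffer object type */

/* A stub that ignores its argument.  The attribute tells the compiler
 * that this is intentional, so -Wunused-parameter stays quiet. */
static inline int
example_bo_unmap(EXAMPLE_UNUSED struct example_bo *bo)
{
   return 0;
}
```

Because the attribute sits on the parameter itself, every file that includes the header benefits, which is why a single annotation can remove many warning instances.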
6 years agointel: Drop program size pointer from vec4/fs assembly getters.
Kenneth Graunke [Tue, 27 Feb 2018 00:34:55 +0000 (16:34 -0800)]
intel: Drop program size pointer from vec4/fs assembly getters.

These days, we're just passing a pointer to a prog_data field, which
we already have access to.  We can just use it directly.

(In the past, it was a pointer to a separate value.)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoi965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.
Kenneth Graunke [Tue, 27 Feb 2018 07:41:33 +0000 (23:41 -0800)]
i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.

This should have no practical impact.  For the default uploader, we
don't really care, but for others, we may want to append more data
as the GPU is reading existing data, which means we need async and
persistent flags.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
6 years agoi965: Generalize intel_upload.c to support multiple uploaders.
Kenneth Graunke [Tue, 27 Feb 2018 07:17:35 +0000 (23:17 -0800)]
i965: Generalize intel_upload.c to support multiple uploaders.

I'd like to reuse the upload logic for a new program cache, but the
buffers will need to have a different lifetime than the default
uploader, and also some address space restrictions.  So, we can't
use a single uploader for both situations - we'll need two of them.

This creates a public 'uploader' structure, and adjusts the interface
to take an uploader rather than always using brw->upload.  It should
have no functional change at the moment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
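
The interface change described above can be sketched roughly as follows. This is only an illustration of passing the uploader explicitly instead of reaching into one context-wide instance; `struct example_uploader` and `example_upload_space()` are hypothetical, and the real structure tracks a buffer object rather than a bare offset.

```c
#include <stddef.h>

/* Hypothetical uploader: a buffer of fixed capacity that hands out
 * space linearly.  Each uploader has its own independent state. */
struct example_uploader {
   size_t capacity;      /* total bytes available */
   size_t next_offset;   /* first byte not yet handed out */
};

/* Reserve `size` bytes at the given power-of-two alignment from the
 * uploader passed in.  Returns the offset of the reservation, or
 * (size_t)-1 if the uploader does not have enough room left. */
static size_t
example_upload_space(struct example_uploader *up, size_t size,
                     size_t alignment)
{
   size_t offset = (up->next_offset + alignment - 1) & ~(alignment - 1);

   if (offset > up->capacity || size > up->capacity - offset)
      return (size_t)-1;

   up->next_offset = offset + size;
   return offset;
}
```

With the uploader as a parameter, two instances with different lifetimes or placement rules can coexist without sharing any state.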
6 years agointel/compiler: Memory fence commit must always be enabled for gen10+
Anuj Phogat [Wed, 7 Feb 2018 01:09:09 +0000 (17:09 -0800)]
intel/compiler: Memory fence commit must always be enabled for gen10+

The commit bit in the message descriptor (bit 13) must always be set
on CNL+ for memory fence messages. This also fixes a piglit GPU hang
on CNL+ in the simulation environment.
Piglit test: arb_shader_image_load_store-shader-mem-barrier
See HSD ES # 1404612949

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
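
The rule stated above can be sketched as follows. Only the bit position (13) and the gen10+ condition come from the commit message; the helper name, its parameters and the macro are hypothetical and do not reflect the actual code generator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 13 of the message descriptor is the commit bit. */
#define EXAMPLE_FENCE_COMMIT_BIT (UINT32_C(1) << 13)

/* Hypothetical helper: return the descriptor with the commit bit
 * applied.  On gen10+ the bit is forced on regardless of what the
 * caller asked for; on older gens it follows the request. */
static uint32_t
example_fence_desc(uint32_t desc, int gen, bool commit_requested)
{
   if (gen >= 10 || commit_requested)
      desc |= EXAMPLE_FENCE_COMMIT_BIT;

   return desc;
}
```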
6 years agoRevert "i965/fs: Predicate byte scattered writes if needed"
Francisco Jerez [Sun, 25 Feb 2018 00:05:21 +0000 (16:05 -0800)]
Revert "i965/fs: Predicate byte scattered writes if needed"

This reverts commit a4031bdfa927fb4c3c5d0bdadc70634f3c1a5eac.  It's
redundant with the sample mask predication done at this point by the
common logical send lowering infrastructure, and rather buggy because
it wasn't applying the correct sample mask in shaders using discard,
since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS
doesn't reflect samples discarded by the shader, so it could have led
to data corruption in fragment shader invocations that execute discard
based on a non-dynamically uniform condition.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agointel/fs: Handle surface opcode sample masks via predication.
Francisco Jerez [Tue, 12 Dec 2017 20:05:04 +0000 (12:05 -0800)]
intel/fs: Handle surface opcode sample masks via predication.

The main motivation is to enable HDC surface opcodes on ICL which no
longer allows the sample mask to be provided in a message header, but
this is enabled all the way back to IVB when possible because it
decreases the instruction count of some shaders using HDC messages
significantly, e.g. one of the SynMark2 CSDof compute shaders
decreases instruction count by about 40% due to the removal of header
setup boilerplate which in turn makes a number of send message
payloads more easily CSE-able.  Shader-db results on SKL:

 total instructions in shared programs: 15325319 -> 15314384 (-0.07%)
 instructions in affected programs: 311532 -> 300597 (-3.51%)
 helped: 491
 HURT: 1

Shader-db results on BDW where the optimization needs to be disabled
in some cases due to hardware restrictions:

 total instructions in shared programs: 15604794 -> 15598028 (-0.04%)
 instructions in affected programs: 220863 -> 214097 (-3.06%)
 helped: 351
 HURT: 0

The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL
laptop with this change.  According to Eero this improves performance
of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2
desktop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
6 years agointel/eu: Plumb header present bit to codegen helpers for HDC messages.
Francisco Jerez [Tue, 12 Dec 2017 20:05:03 +0000 (12:05 -0800)]
intel/eu: Plumb header present bit to codegen helpers for HDC messages.

This makes sure that the header-present bit of the message descriptor
is in sync with the IR instruction fields, which gives the optimizer
more control to avoid the overhead of setting up a message header when
it's possible to do so.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agointel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.
Francisco Jerez [Thu, 22 Feb 2018 20:49:01 +0000 (12:49 -0800)]
intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.

This shouldn't cause any functional change at this point.  It changes
SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
the IR level instead of the hard-coded f1.0, now that it can be
represented in backend_instruction::flag_subreg.  This will be
necessary for scheduling to behave correctly once more things start
making use of f1.0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agointel/ir: Allow representing additional flag subregisters in the IR.
Francisco Jerez [Tue, 12 Dec 2017 20:05:02 +0000 (12:05 -0800)]
intel/ir: Allow representing additional flag subregisters in the IR.

This allows representing conditional mods and predicates on f1.0-f1.1
at the IR level by adding an extra bit to the flag_subreg
backend_instruction field.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
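
As an illustration of what the extra bit buys, the sketch below decodes a 2-bit flag_subreg value into a flag register and subregister. It assumes the four values map to f0.0, f0.1, f1.0 and f1.1 in that order, which is an assumption made here rather than something stated in the commit; the type and function names are hypothetical.

```c
/* Hypothetical decoded form: flag register number and subregister. */
struct example_flag {
   unsigned reg;      /* 0 for f0, 1 for f1 */
   unsigned subreg;   /* 0 or 1 */
};

/* Split a 2-bit flag_subreg value: the high bit selects the flag
 * register and the low bit selects the subregister within it. */
static struct example_flag
example_decode_flag_subreg(unsigned flag_subreg)
{
   struct example_flag f;

   f.reg = (flag_subreg >> 1) & 1;
   f.subreg = flag_subreg & 1;
   return f;
}
```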
6 years agointel/l3: Don't allocate SLM partition on ICL+.
Francisco Jerez [Tue, 12 Dec 2017 20:05:00 +0000 (12:05 -0800)]
intel/l3: Don't allocate SLM partition on ICL+.

SLM has a chunk of special-purpose memory separate from L3 on ICL+, so
we shouldn't allocate a partition for it on L3 anymore.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agosvga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs
Charmaine Lee [Tue, 27 Feb 2018 12:09:58 +0000 (04:09 -0800)]
svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs

Since the geometry shader also consumes prescale constants, the
geometry shader constant buffer needs to be updated when the prescale
factor changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agosvga: fix blending regression
Brian Paul [Thu, 22 Feb 2018 04:00:38 +0000 (21:00 -0700)]
svga: fix blending regression

The earlier Mesa commit 3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.

In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.

We now have to scan the blend targets to find the first one that's
enabled (if any).  We have to use the index of that target for getting
the src/dst blend terms.  And note that we have to set identical blend
terms for all targets.

This fixes the Piglit fbo-drawbuffers2-blend test.  VMware bug 2063493.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
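
The scan described above can be sketched as follows. The struct is a simplified, hypothetical stand-in for the per-RT blend state (only an enable flag and two integer blend terms), and the function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-render-target blend state. */
struct example_rt_blend {
   bool blend_enabled;
   int src_factor;
   int dst_factor;
};

/* Return the index of the first target with blending enabled,
 * or -1 if no target has it enabled. */
static int
example_first_enabled_target(const struct example_rt_blend *rt,
                             int num_targets)
{
   for (int i = 0; i < num_targets; i++) {
      if (rt[i].blend_enabled)
         return i;
   }
   return -1;
}

/* Fill the device-side state: every target gets the blend terms of
 * the first enabled target, while the enable flag stays per-target. */
static void
example_fill_device_terms(const struct example_rt_blend *rt,
                          int num_targets, struct example_rt_blend *out)
{
   int first = example_first_enabled_target(rt, num_targets);
   /* If nothing is enabled the terms are unused; fall back to [0]. */
   int src = first >= 0 ? first : 0;

   for (int i = 0; i < num_targets; i++) {
      out[i].blend_enabled = rt[i].blend_enabled;
      out[i].src_factor = rt[src].src_factor;
      out[i].dst_factor = rt[src].dst_factor;
   }
}
```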
6 years agosvga: check svga_have_vgpu10() in svga_delete_blend_state()
Brian Paul [Thu, 22 Feb 2018 20:22:11 +0000 (13:22 -0700)]
svga: check svga_have_vgpu10() in svga_delete_blend_state()

We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not
enabled (bs->id==0 by default), resulting in lots of device errors.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
6 years agosvga: if svga_update_state() fails, skip the draw call
Brian Paul [Wed, 21 Feb 2018 20:57:39 +0000 (13:57 -0700)]
svga: if svga_update_state() fails, skip the draw call

If svga_update_state() fails, we flush the command buffer and retry.
If it fails again, it likely means we were unable to translate a shader
for some reason (uses too many resources, for example).  In that case,
let's just skip the draw call.  The alternative, just disabling the
shader stage in question, would certainly lead to bad rendering anyway,
and probably device errors.

Fixes failed assertion running Piglit glsl-1.50/execution/
variable-indexing/gs-output-array-vec4-index-wr.shader_test since it
uses too many GS output registers (though the test still fails).
VMware bug 2063492.

v2: also call pipe_debug_message() so apps or apitrace can be notified
when this issue occurs.
v3: use svga_update_state_retry().

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
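
The control flow described above (try, flush and retry once, then skip the draw) can be sketched like this. Everything here is a hypothetical model: the context struct simulates failures with a counter, and none of the names match the driver's real functions.

```c
#include <stdbool.h>

/* Hypothetical context that simulates state-update failures. */
struct example_ctx {
   int failures_left;   /* number of upcoming update attempts that fail */
   int flushes;         /* how many times the command buffer was flushed */
   int draws;           /* how many draws were actually emitted */
};

static bool
example_update_state(struct example_ctx *ctx)
{
   if (ctx->failures_left > 0) {
      ctx->failures_left--;
      return false;
   }
   return true;
}

static void
example_flush(struct example_ctx *ctx)
{
   ctx->flushes++;
}

/* Try to update state; on failure flush and try exactly once more. */
static bool
example_update_state_retry(struct example_ctx *ctx)
{
   if (example_update_state(ctx))
      return true;

   example_flush(ctx);
   return example_update_state(ctx);
}

/* Emit a draw only if state could be updated; otherwise skip it. */
static bool
example_draw(struct example_ctx *ctx)
{
   if (!example_update_state_retry(ctx))
      return false;   /* second failure: skip the draw call */

   ctx->draws++;
   return true;
}
```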
6 years agosvga: let svga_update_state_retry() return a bool
Brian Paul [Thu, 22 Feb 2018 16:32:33 +0000 (09:32 -0700)]
svga: let svga_update_state_retry() return a bool

This will allow minor simplifications elsewhere.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
6 years agosvga: s/unsigned/boolean/ for a few local vars
Brian Paul [Thu, 22 Feb 2018 21:43:41 +0000 (14:43 -0700)]
svga: s/unsigned/boolean/ for a few local vars

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
6 years agomeson: install vulkan_intel.h header
Dylan Baker [Fri, 2 Mar 2018 18:28:11 +0000 (10:28 -0800)]
meson: install vulkan_intel.h header

Fixes: d1992255bb29054fa51763376d125183a9f602f3
       ("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years agost/omx_bellagio: add picture profile and entry point
Boyuan Zhang [Fri, 2 Mar 2018 16:11:01 +0000 (11:11 -0500)]
st/omx_bellagio: add picture profile and entry point

Profile and entry point were missing in the picture structure.
Therefore, add them back.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agoradeonsi: fix radeon create encoder return
Boyuan Zhang [Tue, 27 Feb 2018 22:29:44 +0000 (17:29 -0500)]
radeonsi: fix radeon create encoder return

Previous patch missed a "return" when trying to modify the create encoder
function, which made the whole logic fail. Therefore, add the return back.

Fixes: b38b208ff8886e799d6a2 "radeonsi:create uvd hevc enc entry"

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years agoloader: Add support for platform and host1x busses
Thierry Reding [Wed, 21 Dec 2016 13:15:06 +0000 (14:15 +0100)]
loader: Add support for platform and host1x busses

ARM SoCs usually have their DRM/KMS devices on the platform bus, so add
support for this bus in order to allow use of the DRI_PRIME environment
variable with those devices.

While at it, also support the host1x bus, which is effectively the same
but uses an additional layer in the bus hierarchy.

Note that it isn't enough to support the bus that has the rendering GPU
because the loader code will also try to construct an ID path tag for a
scanout-only device if it is the default that is being opened.

The ID path tag for a device can be obtained by running udevadm info on
the device node, as shown in this example on NVIDIA Tegra:

$ udevadm info /dev/dri/card0 | grep ID_PATH_TAG
E: ID_PATH_TAG=platform-50000000_host1x

The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is
constructed, can be found in the sysfs "uevent" attribute for the card0
device's parent:

$ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent
OF_FULLNAME=/host1x@50000000

Similarly, /dev/dri/card1 corresponds to the GPU:

$ udevadm info /dev/dri/card1 | grep ID_PATH_TAG
E: ID_PATH_TAG=platform-57000000_gpu

and:

$ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent
OF_FULLNAME=/gpu@57000000

Changes in v2:
- avoid confusing pre-increment in strdup()
- add examples of tags to commit message

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
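
A rough sketch of the mapping shown in the examples above, turning an OF_FULLNAME such as "/gpu@57000000" into "platform-57000000_gpu". The format is inferred only from the two tags quoted in the commit message, and `example_id_path_tag()` is a hypothetical helper, not the loader's actual code.

```c
#include <stdio.h>
#include <string.h>

/* Build "platform-<address>_<name>" from a path ending in
 * "/<name>@<address>".  Returns 0 on success, or -1 if the input is
 * not in that form or the result does not fit in `out`. */
static int
example_id_path_tag(const char *of_fullname, char *out, size_t out_size)
{
   const char *name = strrchr(of_fullname, '/');
   const char *at;
   int n;

   if (!name)
      return -1;
   name++;   /* skip the '/' */

   at = strchr(name, '@');
   if (!at || at == name || at[1] == '\0')
      return -1;

   n = snprintf(out, out_size, "platform-%s_%.*s",
                at + 1, (int)(at - name), name);
   if (n < 0 || (size_t)n >= out_size)
      return -1;

   return 0;
}
```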
6 years agodisk cache: Link with -latomic if necessary
Thierry Reding [Fri, 23 Feb 2018 13:13:27 +0000 (14:13 +0100)]
disk cache: Link with -latomic if necessary

The disk cache implementation uses 64-bit atomic operations. For some
architectures, such as 32-bit ARM, GCC will not be able to translate
these operations into atomic, lock-free instructions and will instead
rely on the external atomics library to provide these operations.

Check at configuration time whether or not linking against libatomic
is necessary and if so, create a dependency that can be used while
linking the mesautil library.

This is the meson equivalent of 2ef7f23820a6 ("configure: check if
-latomic is needed for __atomic_*").

For some background information on this, see:

https://gcc.gnu.org/wiki/Atomic/GCCMM

Changes in v2:
- clarify meaning of lock-free in commit message
- fix build if -latomic is not necessary

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
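
For illustration, this is the kind of code that triggers the dependency, assuming GCC or Clang and their `__atomic` builtins. The function name is hypothetical; on targets without native 64-bit atomics the builtins below become calls into libatomic, and linking fails unless -latomic is given.

```c
#include <stdint.h>

/* Atomically add `value` to a 64-bit counter and return the new
 * contents.  On targets that lack lock-free 64-bit atomics, the
 * compiler emits calls to libatomic for these builtins. */
static uint64_t
example_atomic_add64(uint64_t *counter, uint64_t value)
{
   __atomic_fetch_add(counter, value, __ATOMIC_SEQ_CST);
   return __atomic_load_n(counter, __ATOMIC_SEQ_CST);
}
```

A configuration-time probe would typically try to link a small program using such a function without -latomic first, and add the library only if that link fails.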
6 years agoradv: do not set pending_reset_query in BeginCommandBuffer()
Samuel Pitoiset [Thu, 1 Mar 2018 09:53:49 +0000 (10:53 +0100)]
radv: do not set pending_reset_query in BeginCommandBuffer()

This is just useless for two reasons:
1) flush_bits is not set accordingly, so nothing will be flushed
   in BeginQuery().
2) we always flush caches in EndCommandBuffer(), so if a reset
   is done in a previous command buffer we are safe.

Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agor600/cayman: fix fragcoord loading recip generation.
Dave Airlie [Thu, 1 Mar 2018 03:38:32 +0000 (03:38 +0000)]
r600/cayman: fix fragcoord loading recip generation.

This fixes some hangs seen where the recip_ieee opcodes would
end up split across the wrong slots.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoi965: Allow 48-bit addressing on Gen8+.
Kenneth Graunke [Mon, 12 Feb 2018 15:18:29 +0000 (07:18 -0800)]
i965: Allow 48-bit addressing on Gen8+.

This allows most GPU objects to use the full 48-bit address space
offered by Gen8+ platforms, rather than being stuck with 32-bit.
This expands the available GPU memory from 4G to 256TB or so.

A few objects - instruction, scratch, and vertex buffers - need to
remain pinned in the low 4GB of the address space for various reasons.
We default everything to 48-bit but disable it in those cases.

Thanks to Jason Ekstrand for blazing this trail in anv first and
finding the nasty undocumented hardware issues.  This patch simply
rips off all of his findings.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
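
The sizes quoted above follow directly from the address widths: 2^32 bytes is 4 GiB and 2^48 bytes is 256 TiB. The sketch below spells that out and adds a hypothetical range check for the objects that must stay in the low 4GB; the macro and function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_4GB   (UINT64_C(1) << 32)   /* 32-bit address space */
#define EXAMPLE_48BIT (UINT64_C(1) << 48)   /* 48-bit space: 256 TiB */

/* Hypothetical check: does the range [offset, offset + size) lie
 * entirely below the 4 GiB boundary?  Written to avoid overflow. */
static bool
example_fits_in_low_4gb(uint64_t offset, uint64_t size)
{
   return size <= EXAMPLE_4GB && offset <= EXAMPLE_4GB - size;
}
```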
6 years agoi965: Shorten the name of the workaround BO.
Kenneth Graunke [Mon, 26 Feb 2018 23:51:04 +0000 (15:51 -0800)]
i965: Shorten the name of the workaround BO.

This makes the name shorter in debug printouts.  If "workaround_bo"
is good enough for the code, it's probably good enough for debugging.

6 years agoi965: Add debugging code to dump the validation list.
Kenneth Graunke [Tue, 28 Nov 2017 18:07:43 +0000 (10:07 -0800)]
i965: Add debugging code to dump the validation list.

When anything goes wrong with this code, dumping the validation list
is a useful way to figure out what's happening.

6 years agointel/fs: Set up sampler message headers in the visitor on gen7+
Jason Ekstrand [Thu, 1 Mar 2018 03:57:44 +0000 (19:57 -0800)]
intel/fs: Set up sampler message headers in the visitor on gen7+

This gives the scheduler visibility into the headers which should
improve scheduling.  More importantly, however, it lets the scheduler
know that the header gets written.  As-is, the scheduler thinks that a
texture instruction only reads its payload and is unaware that it may
write to the first register so it may reorder it with respect to a read
from that register.  This is causing issues in a couple of Dota 2 vertex
shaders.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
6 years agoac: fix nir_intrinsic_shared_atomic_comp_swap handling
Timothy Arceri [Thu, 1 Mar 2018 09:17:38 +0000 (20:17 +1100)]
ac: fix nir_intrinsic_shared_atomic_comp_swap handling

Following on from 49879f377870 this makes sure we use the correct
src index.

Fixes cts test:
KHR-GL46.compute_shader.atomic-case3

Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agost/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs
Timothy Arceri [Thu, 1 Mar 2018 02:39:20 +0000 (13:39 +1100)]
st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs

We only need to check for previously processed locations on user-
defined varyings as they are the only ones that support component
packing. Therefore a single instance of processed_locs can be
shared by regular varyings and patches.

For simplicity we make processed_locs an array in order to handle
dual source blending.

Fixes the following piglit test on radeonsi:
tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
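
A small sketch of the bookkeeping described above, assuming processed_locs is a pair of 64-bit masks selected by the dual source blend index (0 or 1). That layout is an assumption made for this example, and the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bitmask of already-seen locations per dual source index. */
struct example_loc_tracker {
   uint64_t processed_locs[2];
};

/* Report whether `location` (0..63) was already processed for the
 * given index (0 or 1), and mark it as processed either way. */
static bool
example_test_and_set_loc(struct example_loc_tracker *t, unsigned index,
                         unsigned location)
{
   uint64_t bit = UINT64_C(1) << location;
   bool seen = (t->processed_locs[index] & bit) != 0;

   t->processed_locs[index] |= bit;
   return seen;
}
```

Keeping one mask per index means two outputs that share a location but differ in index are not mistaken for component-packed duplicates of each other.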
6 years agoanv: Enable MSAA fast-clears
Jason Ekstrand [Sat, 24 Feb 2018 05:12:50 +0000 (21:12 -0800)]
anv: Enable MSAA fast-clears

This speeds up the Sascha Willems multisampling demo by around 25% when
using 8x or 16x MSAA.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/cmd_buffer: Add support for MCS fast-clears and resolves
Jason Ekstrand [Sat, 24 Feb 2018 05:12:35 +0000 (21:12 -0800)]
anv/cmd_buffer: Add support for MCS fast-clears and resolves

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/cmd_buffer: Add helpers for computing resolve predicates
Jason Ekstrand [Sat, 24 Feb 2018 05:00:52 +0000 (21:00 -0800)]
anv/cmd_buffer: Add helpers for computing resolve predicates

We'll want to re-use the complex resolve predicate computations for MCS
resolves so it's nice to have them as helper functions.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage
Jason Ekstrand [Sat, 24 Feb 2018 04:45:26 +0000 (20:45 -0800)]
anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage

This doesn't actually do anything because att_state->fast_clear is
determined based on the return value of anv_layout_to_fast_clear_type
which currently returns NONE for multisampled images.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/blorp: Pass the clear address to blorp for subpass MSAA resolves
Jason Ekstrand [Sat, 24 Feb 2018 05:11:58 +0000 (21:11 -0800)]
anv/blorp: Pass the clear address to blorp for subpass MSAA resolves

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/blorp: Allow indirect clear colors on blorp sources on gen7
Jason Ekstrand [Sat, 24 Feb 2018 06:05:39 +0000 (22:05 -0800)]
anv/blorp: Allow indirect clear colors on blorp sources on gen7

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agoanv/blorp: Add partial clear support to anv_image_mcs_op
Jason Ekstrand [Sat, 11 Nov 2017 22:32:21 +0000 (14:32 -0800)]
anv/blorp: Add partial clear support to anv_image_mcs_op

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
6 years agointel/blorp: Add indirect clear color support to mcs_partial_resolve
Jason Ekstrand [Sat, 11 Nov 2017 22:28:17 +0000 (14:28 -0800)]
intel/blorp: Add indirect clear color support to mcs_partial_resolve

This is a bit complicated because we have to get the indirect clear
color in there somehow.  In order to not do any more work in the shader
than needed, we set it up as its own vertex binding which points
directly at the clear color address specified by the client.

Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
6 years agointel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE
Jason Ekstrand [Sat, 11 Nov 2017 21:40:03 +0000 (13:40 -0800)]
intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE

There are enough #ifs in there that it's kind of pointless to duplicate
it for each buffer.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>