platform/upstream/mesa.git
5 years agoanv: Use gen_mi_builder for indirect dispatch
Jason Ekstrand [Sat, 30 Mar 2019 23:17:56 +0000 (18:17 -0500)]
anv: Use gen_mi_builder for indirect dispatch

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Use gen_mi_builder for indirect draw parameters
Jason Ekstrand [Sat, 30 Mar 2019 22:30:00 +0000 (17:30 -0500)]
anv: Use gen_mi_builder for indirect draw parameters

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Use gen_mi_builder for computing resolve predicates
Jason Ekstrand [Sat, 30 Mar 2019 22:21:12 +0000 (17:21 -0500)]
anv: Use gen_mi_builder for computing resolve predicates

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: Use gen_mi_builder for CmdDrawIndirectByteCount
Jason Ekstrand [Sat, 30 Mar 2019 21:08:13 +0000 (16:08 -0500)]
anv: Use gen_mi_builder for CmdDrawIndirectByteCount

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/common: Add unit tests for gen_mi_builder
Jason Ekstrand [Tue, 2 Apr 2019 15:26:09 +0000 (10:26 -0500)]
intel/common: Add unit tests for gen_mi_builder

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/common: Add a MI command builder
Jason Ekstrand [Sat, 30 Mar 2019 18:09:10 +0000 (13:09 -0500)]
intel/common: Add a MI command builder

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agov3d: Add an optimization pass for redundant flags updates.
Eric Anholt [Fri, 22 Feb 2019 22:26:26 +0000 (14:26 -0800)]
v3d: Add an optimization pass for redundant flags updates.

Our exec masking introduces lots of redundant flags updates, and even
without that there will be cases where NIR comparisons on the same sources
for different reasons may generate the same comparison instruction before
the selection.

total instructions in shared programs: 6492930 -> 6460934 (-0.49%)
total uniforms in shared programs: 2117460 -> 2115106 (-0.11%)
total spills in shared programs: 4983 -> 4987 (0.08%)
total fills in shared programs: 6408 -> 6416 (0.12%)

5 years agokmsro: Extend to include armada-drm
Lubomir Rintel [Thu, 21 Mar 2019 21:19:34 +0000 (22:19 +0100)]
kmsro: Extend to include armada-drm

This allows using the Marvell Armada display controllers (with the
armada drm modesetting driver) along with the render-only drivers,
such as Etnaviv on an OLPC XO-1.75 laptop.

v2:
- Add to Android.mk too

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agolima: implement blit with util_blitter
Icenowy Zheng [Thu, 11 Apr 2019 07:26:12 +0000 (15:26 +0800)]
lima: implement blit with util_blitter

As we have already prepared for using util_blitter, use it to implement
lima_blit.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agolima: make lima_context_framebuffer subtype of pipe_framebuffer_state
Icenowy Zheng [Thu, 11 Apr 2019 07:24:59 +0000 (15:24 +0800)]
lima: make lima_context_framebuffer subtype of pipe_framebuffer_state

Currently the lima driver saves the framebuffer state in its
from-scratch struct lima_context_framebuffer. However, util_blitter
requires the framebuffer to be saved as a standard struct
pipe_framebuffer_state.

Make the lima_context_framebuffer a subtype of the standard
pipe_framebuffer_state, so that the standard part can be used for
util_blitter framebuffer state saving.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
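
The "subtype" relationship described in this commit is commonly expressed in C by embedding the base struct as the first member. The following is a minimal sketch under that assumption; all `sketch_*` names and fields are hypothetical and are not taken from the lima driver or from gallium.

```c
/* Stand-in for the standard framebuffer state (hypothetical fields). */
struct sketch_pipe_framebuffer_state {
   unsigned width;
   unsigned height;
};

/* Driver-specific state that "derives" from the standard one by
 * embedding it as the first member. */
struct sketch_lima_context_framebuffer {
   struct sketch_pipe_framebuffer_state base; /* must stay first */
   int tiled_w;                               /* hypothetical driver field */
   int tiled_h;                               /* hypothetical driver field */
};

/* A generic helper that only knows about the standard state. */
static unsigned
sketch_saved_width(const struct sketch_pipe_framebuffer_state *fb)
{
   return fb->width;
}
```

Because `base` is the first member, a pointer to the driver struct and a pointer to `base` refer to the same address, so generic code can be handed `&fb.base` without copying.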
5 years agolima: add dummy set_sample_mask function
Icenowy Zheng [Thu, 11 Apr 2019 07:24:01 +0000 (15:24 +0800)]
lima: add dummy set_sample_mask function

The set_sample_mask function is required in util_blitter.

Add a dummy one to make util_blitter work.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agogitlab-ci: build gallium extra hud
Eric Engestrom [Tue, 19 Mar 2019 14:23:59 +0000 (14:23 +0000)]
gitlab-ci: build gallium extra hud

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomeson: remove meson-created megadrivers symlinks
Eric Engestrom [Tue, 9 Apr 2019 08:28:17 +0000 (09:28 +0100)]
meson: remove meson-created megadrivers symlinks

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110356
Fixes: aa7afe324c2092fb31f9 "meson: strip rpath from megadrivers"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: initialise some variables in opt_if_loop_last_continue()
Timothy Arceri [Wed, 10 Apr 2019 23:38:02 +0000 (09:38 +1000)]
nir: initialise some variables in opt_if_loop_last_continue()

Fixes a couple of Coverity warnings CID 1444626.

Fixes: e30804c6024f ("nir/radv: remove restrictions on opt_if_loop_last_continue()")

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agonir/xfb: do not use bare interface type
Juan A. Suarez Romero [Wed, 10 Apr 2019 15:13:19 +0000 (17:13 +0200)]
nir/xfb: do not use bare interface type

In commit 3b3653c4cfb we decided not to use bare types; hence do not use
bare type when comparing with interface type to find out if the xfb
variable is an array block.

This fixes dEQP-VK.transform_feedback.* tests.

Fixes: 3b3653c4cfb ("nir/spirv: don't use bare types, remove assert in
                     split vars for testing")
CC: Dave Airlie <airlied@redhat.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agogitlab-ci: Run CI pipeline for all branches in the main repository
Michel Dänzer [Wed, 10 Apr 2019 08:33:13 +0000 (10:33 +0200)]
gitlab-ci: Run CI pipeline for all branches in the main repository

In turn, do not run the pipeline for the master branch in forked
repositories.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovirgl: use debug_printf instead of fprintf
Erik Faye-Lund [Wed, 10 Apr 2019 11:40:56 +0000 (13:40 +0200)]
virgl: use debug_printf instead of fprintf

While we're at it, prefix the string with "VIRGL: ", to match similar
code elsewhere in virgl.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: do not warn about display-target binding
Erik Faye-Lund [Wed, 10 Apr 2019 10:22:34 +0000 (12:22 +0200)]
virgl: do not warn about display-target binding

We never want to display a transfer-temp surface, so let's ignore that
flag when calculating the new binding flags.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: only warn about unchecked flags
Erik Faye-Lund [Wed, 10 Apr 2019 10:18:33 +0000 (12:18 +0200)]
virgl: only warn about unchecked flags

The other flags are already vetted, so there's no point in reporting
them.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
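
The flag filtering in this commit, together with the display-target change listed above it, amounts to a bitmask operation. This is a rough sketch only: the flag names and values are hypothetical and do not match the real bind flags used by virgl.

```c
/* Hypothetical bind flags, for illustration only. */
#define SKETCH_BIND_SAMPLER_VIEW   (1u << 0)
#define SKETCH_BIND_RENDER_TARGET  (1u << 1)
#define SKETCH_BIND_DISPLAY_TARGET (1u << 2)
#define SKETCH_BIND_UNKNOWN        (1u << 3)

/* Return only the flags worth warning about: those that are neither
 * already vetted nor deliberately ignored. */
static unsigned
sketch_flags_to_warn_about(unsigned bind, unsigned vetted, unsigned ignored)
{
   return bind & ~(vetted | ignored);
}
```

A caller would print a warning only when the returned mask is non-zero.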
5 years agovirgl: unsigned int -> unsigned
Erik Faye-Lund [Wed, 10 Apr 2019 10:16:33 +0000 (12:16 +0200)]
virgl: unsigned int -> unsigned

We don't usually spell out the int part of unsigned.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agoegl: setup fds array correctly when exporting dmabuf
Tapani Pälli [Tue, 9 Apr 2019 07:43:59 +0000 (10:43 +0300)]
egl: setup fds array correctly when exporting dmabuf

For formats with multiple planes, the application will pass a
num_planes-sized fds array, which should be initialized properly in case
the number of fds used by the driver is less than the number of planes.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
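
One way to read "initialized properly" is that every slot is first set to an invalid descriptor and only the planes the driver actually exports are overwritten. The sketch below assumes -1 as the invalid value and uses a hypothetical helper name; it is not the code from this commit.

```c
/* Fill a caller-provided fds array of num_planes entries.
 * Slots the driver does not use are left at -1 (assumed "no fd" value). */
static void
sketch_export_fds(int *fds, int num_planes,
                  const int *driver_fds, int num_driver_fds)
{
   for (int i = 0; i < num_planes; i++)
      fds[i] = -1;

   for (int i = 0; i < num_driver_fds && i < num_planes; i++)
      fds[i] = driver_fds[i];
}
```

With this pattern the application never reads uninitialized entries, even when a three-plane format is backed by a single buffer.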
5 years agodocs: update calendar, and news item and link release notes for 19.0.2
Dylan Baker [Thu, 11 Apr 2019 03:51:58 +0000 (20:51 -0700)]
docs: update calendar, and news item and link release notes for 19.0.2

5 years agodocs: Add sha256 sums for 19.0.2
Dylan Baker [Thu, 11 Apr 2019 03:40:42 +0000 (20:40 -0700)]
docs: Add sha256 sums for 19.0.2

5 years agodocs: Add release notes for 19.0.2
Dylan Baker [Thu, 11 Apr 2019 03:34:09 +0000 (20:34 -0700)]
docs: Add release notes for 19.0.2

5 years agogallium/aux: Report error if loading of a pipe driver fails.
Jan Vesely [Tue, 2 Apr 2019 21:03:13 +0000 (17:03 -0400)]
gallium/aux: Report error if loading of a pipe driver fails.

Skip over non-existent files.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agokmsro: Add platform support for exynos and sun4i
Rob Herring [Fri, 25 Jan 2019 16:56:18 +0000 (10:56 -0600)]
kmsro: Add platform support for exynos and sun4i

v2:
- add Android.mk change

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agokmsro: Add lima renderonly support
Rob Herring [Fri, 25 Jan 2019 16:39:40 +0000 (10:39 -0600)]
kmsro: Add lima renderonly support

Enable using lima for KMS renderonly. This still needs KMS driver
name mapping to kmsro to be used automatically.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agogallium: add lima driver
Qiang Yu [Tue, 12 Mar 2019 19:49:26 +0000 (13:49 -0600)]
gallium: add lima driver

v2:
- use renamed util_dynarray_grow_cap
- use DEBUG_GET_ONCE_FLAGS_OPTION for debug flags
- remove DRM_FORMAT_MOD_ARM_AGTB_MODE0 usage
- compute min/max index in driver

v3:
- fix plbu framebuffer state calculation
- fix color_16pc assemble
- use nir_lower_all_source_mods for lowering neg/abs/sat
- use float array for static GPU data
- add disassemble comment for static shader code
- use drm_find_modifier

v4:
- use lima_nir_lower_uniform_to_scalar

v5:
- remove nir_opt_global_to_local when rebase

Cc: Rob Clark <robdclark@gmail.com>
Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Koen Kooi <koen@dominion.thruhere.net>
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: marmeladema <xademax@gmail.com>
Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rohan Garg <rohan@garg.io>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agodrm-uapi: add lima_drm.h
Qiang Yu [Sun, 10 Mar 2019 04:05:39 +0000 (12:05 +0800)]
drm-uapi: add lima_drm.h

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agogallium/u_vbuf: export u_vbuf_get_minmax_index
Qiang Yu [Wed, 20 Mar 2019 12:31:17 +0000 (20:31 +0800)]
gallium/u_vbuf: export u_vbuf_get_minmax_index

This helper function can be used by drivers that
always need the min/max index.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agou_dynarray: add util_dynarray_grow_cap
Qiang Yu [Mon, 19 Feb 2018 13:44:44 +0000 (21:44 +0800)]
u_dynarray: add util_dynarray_grow_cap

This is for the case where the user only knows the maximum size
it wants to append to the array and needs to enlarge the array
capacity before writing into it.

v2:
- rename newsize to newcap
- rename util_dynarray_enlarge to util_dynarray_grow_cap

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
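
The "grow the capacity first, write afterwards" idea can be sketched as follows. The struct layout, the doubling policy, and the return value are assumptions made for illustration; the `sketch_*` names are hypothetical and this is not the util_dynarray implementation.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_dynarray {
   void *data;
   size_t size;     /* bytes in use */
   size_t capacity; /* bytes allocated */
};

/* Make sure at least `extra` more bytes fit, without changing `size`.
 * Returns the current write position, or NULL if allocation fails. */
static void *
sketch_dynarray_grow_cap(struct sketch_dynarray *a, size_t extra)
{
   size_t needed = a->size + extra;

   if (needed > a->capacity) {
      size_t newcap = a->capacity ? a->capacity * 2 : 64;
      if (newcap < needed)
         newcap = needed;

      void *p = realloc(a->data, newcap);
      if (!p)
         return NULL;

      a->data = p;
      a->capacity = newcap;
   }

   return (char *)a->data + a->size;
}
```

The caller reserves the worst-case size once, writes however many bytes it actually produced, and then advances `size` by that amount.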
5 years agou_math: add ushort_to_float/float_to_ushort
Qiang Yu [Sat, 17 Jun 2017 16:37:39 +0000 (00:37 +0800)]
u_math: add ushort_to_float/float_to_ushort

v2:
- return 0 for NaN too

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
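
A possible shape for such a conversion pair, including the NaN handling mentioned in the v2 note, is sketched below. The rounding and clamping choices are assumptions and the `sketch_*` names are hypothetical; the actual u_math helpers may differ.

```c
#include <math.h>
#include <stdint.h>

/* Map a 16-bit unsigned normalized value to a float in [0, 1]. */
static inline float
sketch_ushort_to_float(uint16_t us)
{
   return (float)us * (1.0f / 65535.0f);
}

/* Map a float to a 16-bit unsigned normalized value, clamping to [0, 1].
 * NaN fails the "> 0" comparison, so it also yields 0. */
static inline uint16_t
sketch_float_to_ushort(float f)
{
   if (!(f > 0.0f))
      return 0;
   if (f >= 1.0f)
      return 65535;
   return (uint16_t)(f * 65535.0f + 0.5f);
}
```

Writing the lower bound as `!(f > 0.0f)` rather than `f <= 0.0f` is what folds NaN into the zero case.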
5 years agogallium: trace: Add missing fence related wrappers
Guido Günther [Fri, 29 Mar 2019 16:48:31 +0000 (17:48 +0100)]
gallium: trace: Add missing fence related wrappers

Without that kmscube with GALLIUM_TRACE would segfault like:

  #0  0x0000000000000000 in  ()
  #1  0x0000ffff8f311760 in dri2_create_fence_fd (_ctx=0xaaaae266b8b0, fd=10) at ../src/gallium/state_trackers/dri/dri_helpers.c:122
  #2  0x0000ffff90788670 in dri2_create_sync (drv=0xaaaae2667910, disp=0xaaaae26691f0, type=12612, attrib_list=0xaaaae26b9290) at ../src/egl/drivers/dri2/egl_dri2.c:2993
  #3  0x0000ffff90776a9c in _eglCreateSync (disp=0xaaaae26691f0, type=12612, attrib_list=0xaaaae26b9290, orig_is_EGLAttrib=0, invalid_type_error=12292) at ../src/egl/main/eglapi.c:1823
  #4  0x0000ffff90776be4 in eglCreateSyncKHR (dpy=0xaaaae26691f0, type=12612, int_list=0xfffff662e828) at ../src/egl/main/eglapi.c:1848

Signed-off-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
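
Frame #0 at address 0 in the backtrace indicates a call through a NULL function pointer. The sketch below illustrates, with entirely hypothetical types and names, how a wrapping context has to forward every entry point the caller may use; it is not the gallium trace driver's code.

```c
#include <stddef.h>

struct sketch_context {
   int (*create_fence_fd)(struct sketch_context *ctx, int fd);
   struct sketch_context *wrapped; /* underlying driver context, if any */
   int calls_traced;
};

/* Stand-in for the real driver's implementation. */
static int
sketch_driver_create_fence_fd(struct sketch_context *ctx, int fd)
{
   (void)ctx;
   return fd;
}

/* Wrapper: record the call, then forward it to the wrapped context. */
static int
sketch_trace_create_fence_fd(struct sketch_context *ctx, int fd)
{
   ctx->calls_traced++;
   return ctx->wrapped->create_fence_fd(ctx->wrapped, fd);
}

/* Install wrappers. If this entry were forgotten, the pointer would stay
 * NULL and any caller would jump to address 0. */
static void
sketch_trace_wrap(struct sketch_context *tr, struct sketch_context *driver)
{
   tr->wrapped = driver;
   tr->calls_traced = 0;
   tr->create_fence_fd =
      driver->create_fence_fd ? sketch_trace_create_fence_fd : NULL;
}
```

Only installing the wrapper when the driver provides the hook keeps "not supported" distinguishable from "supported but not wrapped".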
5 years agointel/tools: Remove redundant definitions of INTEL_DEBUG
Mark Janes [Fri, 5 Apr 2019 18:39:18 +0000 (11:39 -0700)]
intel/tools: Remove redundant definitions of INTEL_DEBUG

INTEL_DEBUG is declared extern and defined in gen_debug.c

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/common: move gen_debug to intel/dev
Mark Janes [Fri, 5 Apr 2019 22:39:51 +0000 (15:39 -0700)]
intel/common: move gen_debug to intel/dev

libintel_common depends on libintel_compiler, but it contains debug
functionality that is needed by libintel_compiler.  Break the circular
dependency by moving gen_debug files to libintel_dev.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: support INTEL_NO_HW environment variable
Mike Blumenkrantz [Tue, 9 Apr 2019 16:40:06 +0000 (12:40 -0400)]
iris: support INTEL_NO_HW environment variable

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel: Fix the description of Coffeelake pci-id 0x3E98
Jian-Hong Pan [Wed, 10 Apr 2019 08:04:13 +0000 (16:04 +0800)]
intel: Fix the description of Coffeelake pci-id 0x3E98

According to Intel website [1], the description of chipset 8086:3E98 is
Intel(R) UHD Graphics 630.  Besides, xserver also mentions it as
"Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2)" in commit d3a26bbf
(DRI2: Add another Coffeelake PCI ID) [2].

This patch modifies the description to sync with xserver.

[1]: https://ark.intel.com/content/www/us/en/ark/products/134896/intel-core-i5-9600k-processor-9m-cache-up-to-4-60-ghz.html
[2]: https://gitlab.freedesktop.org/xorg/xserver/commit/d3a26bbf618507e1ca05b2bc99a880075b77db77

Fixes: commit 44f1dcf9b3fd "i965: Add a new CFL PCI ID."
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
5 years agoPartially revert "gallium: fix autotools build of pipe_msm.la"
Jan Vesely [Mon, 1 Apr 2019 16:00:22 +0000 (12:00 -0400)]
Partially revert "gallium: fix autotools build of pipe_msm.la"

This partially reverts commit 356ec7a21960d77db282f67af577dcdb46966b5a.
There are symbols needed by libglsl missing, so we might as well skip
the entire library.

Fixes: 356ec7a21960d77db282f67af577dcdb46966b5a
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Vinson Lee <vlee@freedesktop.org>
5 years agovc4: Upload CS/VS UBO uniforms together.
Eric Anholt [Thu, 1 Dec 2016 20:15:10 +0000 (12:15 -0800)]
vc4: Upload CS/VS UBO uniforms together.

Same as I did for V3D, drop all this code trying to GC the
non-indirectly-loaded uniforms from the UBO that's used for indirect
access of gallium cb[0].  While it does successfully drop some of those,
it came at the cost of uploading the VS's indirect uniforms twice, for the
bin and render versions of the shader.

With the UBO loads simplified, I was also able to easily backport V3D's
change to pack a UBO offset into the uniform_data[] field so that we don't
need to do the add of the uniform base in the shader.

As a bonus, now vc4 doesn't depend on mesa/st type_size functions.

total uniforms in shared programs: 25514 -> 25490 (-0.09%)
total instructions in shared programs: 77019 -> 76836 (-0.24%)

5 years agovc4: Split UBO0 and UBO1 address uniform handling.
Eric Anholt [Tue, 9 Apr 2019 04:39:08 +0000 (21:39 -0700)]
vc4: Split UBO0 and UBO1 address uniform handling.

I'm going to extend how UBO0 works in a moment.

5 years agovc4: Don't forget to set the range when scalarizing our uniforms.
Eric Anholt [Tue, 9 Apr 2019 04:01:02 +0000 (21:01 -0700)]
vc4: Don't forget to set the range when scalarizing our uniforms.

In the next commit, we'll want this for handling UBO access clamping.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost: Lower uniforms in st in the !PIPE_CAP_PACKED_UNIFORMS case as well.
Eric Anholt [Mon, 8 Apr 2019 23:32:01 +0000 (16:32 -0700)]
st: Lower uniforms in st in the !PIPE_CAP_PACKED_UNIFORMS case as well.

PIPE_CAP_PACKED_UNIFORMS conflates several things: Lowering uniforms i/o
at the st level instead of the backend, packing uniforms with no padding
at all, and lowering to UBOs.

Requiring backends to lower uniforms i/o for !PIPE_CAP_PACKED_UNIFORMS
leads to the driver needing to either link against the type size function
in mesa/st, or duplicating it in the backend.  Given that all backends
want this lower-io as far as I can tell, just move it to mesa/st to
resolve the link issue and avoid the driver author needing to understand
st's uniforms layout.

Incidentally, fixes uniform layout failures in nouveau in:

dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_fragment
dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex

and I think in Lima as well.

v2: fix indents

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoanv: don't use default pipeline cache for hits for VK_EXT_pipeline_creation_feedback
Lionel Landwerlin [Wed, 10 Apr 2019 17:28:20 +0000 (18:28 +0100)]
anv: don't use default pipeline cache for hits for VK_EXT_pipeline_creation_feedback

If the user didn't provide a pipeline cache and we're using the
default internal pipeline cache, then we shouldn't consider a cache
hit for VK_EXT_pipeline_creation_feedback as the application did not
provide a cache.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6601e5d6fc68cd ("anv: implement VK_EXT_pipeline_creation_feedback")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
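
The rule stated in this commit reduces to a single condition: a hit is reported only when the application supplied the cache. A minimal sketch, with a hypothetical cache type and function name rather than anv's own:

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_pipeline_cache {
   int unused; /* placeholder member */
};

/* app_cache is NULL when the application passed no pipeline cache and the
 * driver fell back to its internal default cache. */
static bool
sketch_report_cache_hit(const struct sketch_pipeline_cache *app_cache,
                        bool found_in_cache)
{
   return app_cache != NULL && found_in_cache;
}
```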
5 years agoRevert "glsl: fix shader_storage_blocks_write_access for SSBO block arrays"
Marek Olšák [Wed, 10 Apr 2019 14:48:37 +0000 (10:48 -0400)]
Revert "glsl: fix shader_storage_blocks_write_access for SSBO block arrays"

This reverts commit b7ca074cc0df6101c428b2dfa53a59a0c6620af2.

It broke a lot of tests.

5 years agoglsl/standalone: add GLES3.1 and GLES3.2 compatibility
Karol Herbst [Thu, 17 Jan 2019 20:05:00 +0000 (21:05 +0100)]
glsl/standalone: add GLES3.1 and GLES3.2 compatibility

also set some constants for SSBOs.

With that it can compile the shader from:
dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.18

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agovirgl: use debug_printf instead of fprintf
Erik Faye-Lund [Wed, 10 Apr 2019 11:43:34 +0000 (13:43 +0200)]
virgl: use debug_printf instead of fprintf

While we're at it, prefix the string with "VIRGL: ", to match similar
code elsewhere in virgl.

Fixes: d7b31969767 ("virgl: Return an error if we use fp64 on top of GLES")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years agovirgl: Enable passing arrays as input to fragment shaders
Gert Wollny [Tue, 15 Jan 2019 09:32:17 +0000 (10:32 +0100)]
virgl: Enable passing arrays as input to fragment shaders

This is needed to properly handle interpolateAt* when the input to be
interpolated is passed as array in the original GLSL.

Currently, the GLSL compiler would lower selecting the correct input so
that the interpolant parameter to interpolateAt* is a temporary, and this
cannot be used to create a valid shader on the host side, because there the
parameter must be a shader input.

By allowing arrays to be passed, the created TGSI allows proper GLSL to be
generated.
This is related to the virglrenderer bug
  https://gitlab.freedesktop.org/virgl/virglrenderer/issues/74

v2: Squash the two patches handling these flags into another

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agoGallium: Add new CAP that indicated whether IO array definitions can be shriked
Gert Wollny [Tue, 15 Jan 2019 09:31:16 +0000 (10:31 +0100)]
Gallium: Add new CAP that indicated whether IO array definitions can be shriked

PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS is added to indicate whether the TGSI
pass to shrink IO arrays should be skipped to enforce the originally declared array
sizes and locations instead.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agowsi: allow to override the present mode with MESA_VK_WSI_PRESENT_MODE
Samuel Pitoiset [Tue, 9 Apr 2019 13:30:05 +0000 (15:30 +0200)]
wsi: allow to override the present mode with MESA_VK_WSI_PRESENT_MODE

This is common to all Vulkan drivers and all WSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
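
An environment-variable override of this kind is typically a string lookup with a "no override" fallback. In the sketch below the accepted strings ("immediate", "mailbox", "fifo", "relaxed") and the enum are assumptions made for illustration; only the variable name MESA_VK_WSI_PRESENT_MODE comes from the commit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Vulkan present mode enum. */
enum sketch_present_mode {
   SKETCH_PRESENT_IMMEDIATE,
   SKETCH_PRESENT_MAILBOX,
   SKETCH_PRESENT_FIFO,
   SKETCH_PRESENT_FIFO_RELAXED,
   SKETCH_PRESENT_NO_OVERRIDE,
};

/* Translate a mode name; unknown or missing names mean "no override". */
static enum sketch_present_mode
sketch_parse_present_mode(const char *name)
{
   if (!name)
      return SKETCH_PRESENT_NO_OVERRIDE;
   if (!strcmp(name, "immediate"))
      return SKETCH_PRESENT_IMMEDIATE;
   if (!strcmp(name, "mailbox"))
      return SKETCH_PRESENT_MAILBOX;
   if (!strcmp(name, "fifo"))
      return SKETCH_PRESENT_FIFO;
   if (!strcmp(name, "relaxed"))
      return SKETCH_PRESENT_FIFO_RELAXED;
   return SKETCH_PRESENT_NO_OVERRIDE;
}

/* Read the override from the environment. */
static enum sketch_present_mode
sketch_present_mode_override(void)
{
   return sketch_parse_present_mode(getenv("MESA_VK_WSI_PRESENT_MODE"));
}
```

Keeping the parser separate from the `getenv` call makes the mapping easy to test without touching the process environment.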
5 years agoradv: enable VK_AMD_gpu_shader_half_float
Samuel Pitoiset [Thu, 21 Mar 2019 08:57:32 +0000 (09:57 +0100)]
radv: enable VK_AMD_gpu_shader_half_float

Should be safe to enable as all instructions seem to support 16-bit.
Unfortunately, there is no CTS test.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: add 16-bit support to ac_build_ddxy()
Rhys Perry [Mon, 8 Apr 2019 07:24:57 +0000 (09:24 +0200)]
ac: add 16-bit support to ac_build_ddxy()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/nir: fix nir_op_b2f16
Samuel Pitoiset [Mon, 8 Apr 2019 07:24:56 +0000 (09:24 +0200)]
ac/nir: fix nir_op_b2f16

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovirgl: Set bind when creating temp resource.
Lepton Wu [Mon, 8 Apr 2019 16:39:46 +0000 (09:39 -0700)]
virgl: Set bind when creating temp resource.

virgl render complains about "Illegal resource" when running
dEQP-EGL.functional.color_clears.single_context.gles2.rgb888_window,
the reason is that a zero bind value was given for temp resource.

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
5 years agoradv: Add non-uniform indexing lowering.
Bas Nieuwenhuizen [Sun, 7 Apr 2019 21:57:58 +0000 (23:57 +0200)]
radv: Add non-uniform indexing lowering.

This patch does it as late as possible so the potential extra
basic blocks don't inhibit other optimizations.

Big thanks to Jason for writing the lowering pass.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agonir: Add access qualifiers on load_ubo intrinsic.
Bas Nieuwenhuizen [Sun, 7 Apr 2019 22:34:14 +0000 (00:34 +0200)]
nir: Add access qualifiers on load_ubo intrinsic.

Otherwise nir_lower_non_uniform_access crashes when it tries
to get the access of a load_ubo.

Fixes: 8ed583fe523 "spirv: Handle the NonUniformEXT decoration"
Fixes: e50ab2c0f23 "nir: Add access flags to deref and SSBO atomics"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoglsl: fix shader_storage_blocks_write_access for SSBO block arrays
Marek Olšák [Mon, 8 Apr 2019 21:20:13 +0000 (17:20 -0400)]
glsl: fix shader_storage_blocks_write_access for SSBO block arrays

CTS: GL45-CTS.compute_shader.resources-max

Fixes: 4e1e8f684bf "glsl: remember which SSBOs are not read-only and pass it to gallium"

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agofreedreno: PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT unreachable statement
Khaled Emara [Tue, 9 Apr 2019 08:19:22 +0000 (10:19 +0200)]
freedreno: PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT unreachable statement

There seems to be a duplicate return statement,
as A2XX doesn't support shader buffers.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agogenxml: sort xml files using new script
Lionel Landwerlin [Tue, 9 Apr 2019 16:31:59 +0000 (17:31 +0100)]
genxml: sort xml files using new script

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agogenxml: add a sorting script
Lionel Landwerlin [Sun, 7 Apr 2019 15:52:55 +0000 (16:52 +0100)]
genxml: add a sorting script

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agobin: drop unused import from install_megadrivers.py
Eric Engestrom [Tue, 9 Apr 2019 08:12:13 +0000 (09:12 +0100)]
bin: drop unused import from install_megadrivers.py

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoanv: advertise 8 subtexel/mipmap precision bits
Juan A. Suarez Romero [Tue, 9 Apr 2019 15:28:42 +0000 (15:28 +0000)]
anv: advertise 8 subtexel/mipmap precision bits

So far ANV was advertising 4 bits for both subTexelPrecisionBits and
mipmapPrecisionBits. But these values were not actually verified.

But it seems the right value is actually 8 bits for both cases.

Unfortunately, the Intel PRM does not clarify how many bits the hardware uses.
For the mipmap case, there is the following reference in PRM Volume 6
(3D Media GPGPU), specifically in LOD Computation Pseudocode:

```
Bias: S4.8
MinLod: U4.8
MaxLod: U4.8
Base: U4.1
MIPCnt: U4
SurfMinLod: U4.8
ResMinLod: U4.8
```

We have other clues, though:

- On one side, dEQP-VK.texture.explicit_lod.* tests fail when using 4
bits, but work when using 8 bits. These tests try to mimic the expected
behaviour as closely as possible, and they use the reported
subTexelPrecisionBits and mipmapPrecisionBits to do so.

- On the other side, the equivalent driver for Windows is reporting 8
bits for both elements. Not sure if they got to verify it from the PRM
or from a different source.

CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
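
Reading the U4.8 entries in the pseudocode listing as unsigned fixed point with 4 integer bits and 8 fractional bits, the smallest LOD step is 1/256, which matches 8 bits of precision. The conversion below is a sketch under that reading; the function names are hypothetical and the rounding mode is an assumption.

```c
#include <stdint.h>

/* Convert a LOD to U4.8: 4 integer bits, 8 fractional bits (12 bits total).
 * Values are clamped to the representable range [0, 4095/256]. */
static uint16_t
sketch_lod_to_u4_8(float lod)
{
   const float max_lod = 4095.0f / 256.0f;

   if (!(lod > 0.0f))
      return 0;
   if (lod > max_lod)
      lod = max_lod;
   return (uint16_t)(lod * 256.0f + 0.5f);
}

/* Convert a U4.8 value back to a float LOD. */
static float
sketch_u4_8_to_lod(uint16_t v)
{
   return (float)v / 256.0f;
}
```

With only 4 fractional bits the step would be 1/16, which is consistent with the tests failing when the driver reports 4 bits while the hardware appears to resolve finer steps.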
5 years agost/va: reverse qt matrix back to its original order
Boyuan Zhang [Mon, 8 Apr 2019 18:34:44 +0000 (14:34 -0400)]
st/va: reverse qt matrix back to its original order

The quantiser matrix that VAAPI provides has been applied with inverse z-scan.
However, what we expect in MPEG2 picture description is the original order.
Therefore, we need to reverse it back to its original order.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110257
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
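
Converting an 8x8 matrix between zigzag scan order and raster order is a table-driven permutation. The sketch below builds the conventional 8x8 zigzag table and shows both directions; which direction the driver needs depends on how the incoming matrix was produced, and the `sketch_*` names are hypothetical rather than taken from st/va.

```c
#include <stdint.h>

/* zz[k] = raster index (row * 8 + col) of the k-th coefficient in
 * zigzag scan order. */
static void
sketch_build_zigzag(uint8_t zz[64])
{
   int k = 0;

   for (int d = 0; d < 15; d++) { /* anti-diagonals: row + col == d */
      int lo = d < 8 ? 0 : d - 7;
      int hi = d < 8 ? d : 7;

      for (int i = lo; i <= hi; i++) {
         /* Odd diagonals walk downwards, even diagonals walk upwards. */
         int row = (d & 1) ? i : (lo + hi - i);
         int col = d - row;
         zz[k++] = (uint8_t)(row * 8 + col);
      }
   }
}

/* Input in zigzag order, output in raster order. */
static void
sketch_zigzag_to_raster(const uint8_t in[64], uint8_t out[64])
{
   uint8_t zz[64];

   sketch_build_zigzag(zz);
   for (int k = 0; k < 64; k++)
      out[zz[k]] = in[k];
}

/* Input in raster order, output in zigzag order. */
static void
sketch_raster_to_zigzag(const uint8_t in[64], uint8_t out[64])
{
   uint8_t zz[64];

   sketch_build_zigzag(zz);
   for (int k = 0; k < 64; k++)
      out[k] = in[zz[k]];
}
```

Applying one function and then the other returns the original matrix, which is the property the commit relies on when undoing the scan.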
5 years agoglsl/linker: location aliasing requires types to have the same width
Andres Gomez [Fri, 29 Mar 2019 10:38:34 +0000 (12:38 +0200)]
glsl/linker: location aliasing requires types to have the same width

From the OpenGL 4.60.5 spec, section 4.4.1 Input Layout Qualifiers,
Page 67, (Location aliasing):

  " Further, when location aliasing, the aliases sharing the location
    must have the same underlying numerical type and bit
    width (floating-point or integer, 32-bit versus 64-bit, etc.) and
    the same auxiliary storage and interpolation qualification."

Additionally, we have improved the linker error descriptions.
Specifically, when taking structs into account we were producing a
linker error because we assumed that all components in each location
were used and that would cause component aliasing. This does not
accurately describe the actual problem. Now, the failure specifies that the
underlying numerical type incompatibility is the cause for the
failure.

Fixes the following piglit test:

tests/spec/arb_enhanced_layouts/linker/component-layout/vs-to-fs-width-mismatch-double-float.shader_test

v2:
  - Do not assert if we see invalid numerical types. These come
    straight from shader code, so we should produce linker errors if
    shaders attempt to do location aliasing on variables that are not
    numerical such as records.
  - While we are at it, improve error reporting for the case of
    numerical type mismatch to include the shader stage.

v3:
  - Allow location aliasing of images and samplers. If we get these
    it means bindless support is active and they should be handled
    as 64-bit integers (Ilia)
  - Make sure we produce link errors for any non-numerical type
    for which we attempt location aliasing, not just structs.

v4:
  - Rebased with minor fixes (Andres).
  - Added fixing tag to the commit log (Andres).

v5:
  - Remove the helper function and check individually for the
    underlying numerical type and bit width (Timothy).
  - Implicitly, assume that any non-treated type which is checked for
    its underlying numerical type is either integer or
    float and has a defined bit width (Timothy).
  - Implicitly, assume that structs are the only non-treated
    non-numerical type (Timothy).
  - Improve the linker error descriptions and commit log (Andres).

Fixes: 13652e7516a ("glsl/linker: Fix type checks for location aliasing")
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
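
The check implied by the quoted spec text can be sketched as a comparison of base numerical type and bit width, with non-numerical types rejected outright. The types and names below are hypothetical and much simpler than the GLSL linker's; the auxiliary storage and interpolation requirements from the quote are deliberately left out.

```c
#include <stdbool.h>

enum sketch_base_type {
   SKETCH_TYPE_FLOAT,
   SKETCH_TYPE_INT,
   SKETCH_TYPE_STRUCT, /* stands in for any non-numerical type */
};

struct sketch_var {
   enum sketch_base_type base;
   unsigned bit_width; /* e.g. 32 or 64; ignored for structs */
};

/* Two variables may share a location only if both are numerical and agree
 * on underlying numerical type and bit width. */
static bool
sketch_can_alias_location(const struct sketch_var *a,
                          const struct sketch_var *b)
{
   if (a->base == SKETCH_TYPE_STRUCT || b->base == SKETCH_TYPE_STRUCT)
      return false;

   return a->base == b->base && a->bit_width == b->bit_width;
}
```

Under this sketch a `double` and a `float` sharing a location are rejected because of the width mismatch, which corresponds to the piglit test named in the commit.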
5 years agosoftpipe: Enable PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT
Gert Wollny [Sun, 7 Apr 2019 06:40:52 +0000 (08:40 +0200)]
softpipe: Enable PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT

The offset alignment must be set to 16 because the tile cache is
implemented to require this.

This enables ARB_buffer_texture_range and OES_texture_buffer for
softpipe. The corresponding deqp-gles31 tests pass.

Also update the feature table.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agosoftpipe: Add an extra code path for the buffer texel lookup
Gert Wollny [Sun, 7 Apr 2019 06:37:45 +0000 (08:37 +0200)]
softpipe: Add an extra code path for the buffer texel lookup

With buffers the addressing is done on a per-byte basis, so the code
path for normal textures doesn't work properly. Also add an assert
to make sure that the bit count for storing the X coordinate is
large enough.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agosoftpipe: raise number of bits used for X coordinate texture lookup
Gert Wollny [Sun, 7 Apr 2019 06:33:34 +0000 (08:33 +0200)]
softpipe: raise number of bits used for X coordinate texture lookup

With buffers the addressing is done on a per-byte basis, and with
a maximal block size of 16 bytes we have to take into account four more
bits. For simplicity just remove the TEX_TILE_SIZE_LOG2, which is 5 bits.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agosoftpipe: Don't use mag filter for gather op
Gert Wollny [Sun, 7 Apr 2019 07:39:22 +0000 (09:39 +0200)]
softpipe: Don't use mag filter for gather op

For the gather op no magnification filter is provided, so always use
the filter given for minification (which is the linear filter).

Fixes: 0dff1533f25951adda3c36be6d9efa944741befb
    softpipe: Use mag texture filter also for clamped lod == 0

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agonir: Get rid of global registers
Jason Ekstrand [Sat, 6 Apr 2019 00:58:46 +0000 (19:58 -0500)]
nir: Get rid of global registers

We have a pass to lower global registers to locals and many drivers
dutifully call it.  However, no one ever creates a global register
so it's all dead code.  It's time we bury it.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: Get rid of nir_register::is_packed
Jason Ekstrand [Sat, 6 Apr 2019 00:41:03 +0000 (19:41 -0500)]
nir: Get rid of nir_register::is_packed

All we ever do is initialize it to zero, clone it, print it, and
validate it.  No one ever sets or uses it.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agovirgl: add support for ARB_indirect_parameters
Dave Airlie [Mon, 11 Feb 2019 02:46:10 +0000 (12:46 +1000)]
virgl: add support for ARB_indirect_parameters

The protocol changes are already in place for it.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
5 years agovirgl: add support for ARB_multi_draw_indirect
Dave Airlie [Mon, 11 Feb 2019 00:51:01 +0000 (10:51 +1000)]
virgl: add support for ARB_multi_draw_indirect

This will pass the multi draw through to the host if it has
support for it instead of using the st to emulate it.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
5 years agovirgl: add support for missing command buffer binding.
Dave Airlie [Mon, 11 Feb 2019 02:53:38 +0000 (12:53 +1000)]
virgl: add support for missing command buffer binding.

When I added indirect support I forgot this, however to use it
now we need to check for a new enough capability on the host side.

Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
5 years agodocs: Add NV_compute_shader_derivatives to 19.1.0 relnotes
Caio Marcelo de Oliveira Filho [Sun, 7 Apr 2019 05:34:05 +0000 (22:34 -0700)]
docs: Add NV_compute_shader_derivatives to 19.1.0 relnotes

5 years agoanv: Implement VK_NV_compute_shader_derivatives
Caio Marcelo de Oliveira Filho [Thu, 28 Mar 2019 17:36:43 +0000 (10:36 -0700)]
anv: Implement VK_NV_compute_shader_derivatives

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv: Add support for DerivativeGroup capabilities
Caio Marcelo de Oliveira Filho [Thu, 28 Mar 2019 17:23:02 +0000 (10:23 -0700)]
spirv: Add support for DerivativeGroup capabilities

As defined in SPV_NV_compute_shader_derivatives. These control how the
invocations are arranged in a CS when doing derivative and related
operations (which are also enabled by the extension).

Since we expect valid SPIR-V, we don't need to do more work at SPIR-V
level to enable the derivative and related operations to be called.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Enable NV_compute_shader_derivatives
Caio Marcelo de Oliveira Filho [Thu, 28 Mar 2019 08:48:38 +0000 (01:48 -0700)]
iris: Enable NV_compute_shader_derivatives

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agogallium: Add PIPE_CAP_COMPUTE_SHADER_DERIVATIVES
Caio Marcelo de Oliveira Filho [Thu, 28 Mar 2019 08:47:10 +0000 (01:47 -0700)]
gallium: Add PIPE_CAP_COMPUTE_SHADER_DERIVATIVES

To enable NV_compute_shader_derivatives, which allows derivatives (and
texture lookups with implicit derivatives) in compute shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoi965: Advertise NV_compute_shader_derivatives
Caio Marcelo de Oliveira Filho [Tue, 26 Mar 2019 06:41:03 +0000 (23:41 -0700)]
i965: Advertise NV_compute_shader_derivatives

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agointel/fs: Use NIR_PASS_V when lowering CS intrinsics
Caio Marcelo de Oliveira Filho [Wed, 3 Apr 2019 00:29:52 +0000 (17:29 -0700)]
intel/fs: Use NIR_PASS_V when lowering CS intrinsics

This will make that step visible in NIR_PRINT=1.

v2: Also use the macro for the cleanup passes.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/fs: Don't loop when lowering CS intrinsics
Caio Marcelo de Oliveira Filho [Wed, 3 Apr 2019 00:41:07 +0000 (17:41 -0700)]
intel/fs: Don't loop when lowering CS intrinsics

This was needed when certain intrinsics were lowered to other ones
that were defined by the same pass.  After 060817b2 "intel,nir: Move
gl_LocalInvocationID lowering to nir_lower_system_values" we don't
need the loop anymore.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/fs: Add support for CS to group invocations in quads
Caio Marcelo de Oliveira Filho [Wed, 27 Mar 2019 22:07:59 +0000 (15:07 -0700)]
intel/fs: Add support for CS to group invocations in quads

When using quads, instead of mapping the elements to the next 4 local
invocation indices, we map the two next in the "current" row and two
next in the "next row".  A side effect is that a thread will execute
the indices in a different order.

We now perform the lowering of both local invocation ID and index
together -- and don't rely anymore on lowering done by
nir_lower_system_values.  That is convenient when doing the math for
quads, because we need X and Y to get the right invocation index.

When the pass progresses, fold the constants and clean up to reduce
the noise from the indexing math.

This implements the derivative_group_quadsNV semantics from
NV_compute_shader_derivatives.

v2: Take subgroup_id into account, otherwise only values in the first
    subgroup would be used. (Jason)

v3: Calculate invocation index and ID together, to avoid duplicating
    some math in the quads case when both index and ID are used. (Jason)

v4: Don't call cleanup passes as part of the lowering, leave that to the
    call site. (Jason)
    Change calculation to use fewer instructions. (Jason)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
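
The quad arrangement described in the commit message above can be illustrated with a simplified sketch. This is not the Mesa lowering pass: the struct and function names are hypothetical, subgroup_id handling is left out, and the workgroup width is assumed to be even. It only shows how four consecutive indices land on two elements of the "current" row and two of the "next" row, instead of four elements in a line.

```c
#include <assert.h>

/* Hypothetical 2D local invocation ID, for illustration only. */
struct local_id {
    unsigned x;
    unsigned y;
};

/* Linear arrangement: consecutive indices walk along one row. */
static struct local_id linear_id(unsigned index, unsigned size_x)
{
    assert(size_x > 0);
    struct local_id id = { index % size_x, index / size_x };
    return id;
}

/* Quad arrangement: each group of four consecutive indices covers a 2x2
 * block, two invocations in the current row and two in the next row. */
static struct local_id quad_id(unsigned index, unsigned size_x)
{
    assert(size_x >= 2 && size_x % 2 == 0);
    unsigned quad = index / 4;           /* which 2x2 block */
    unsigned lane = index % 4;           /* position inside the block */
    unsigned quads_per_row = size_x / 2;
    struct local_id id = {
        (quad % quads_per_row) * 2 + (lane & 1),
        (quad / quads_per_row) * 2 + (lane >> 1),
    };
    return id;
}
```

For a workgroup that is 4 invocations wide, indices 0 to 3 map to (0,0), (1,0), (0,1), (1,1) under quad_id, whereas linear_id maps them to (0,0), (1,0), (2,0), (3,0). This is also why a thread ends up executing the indices in a different order.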
5 years agointel/fs: Use TEX_LOGICAL whenever implicit lod is supported
Caio Marcelo de Oliveira Filho [Fri, 5 Apr 2019 23:07:16 +0000 (16:07 -0700)]
intel/fs: Use TEX_LOGICAL whenever implicit lod is supported

Make sure we include compute shaders that have a derivative group
defined.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Don't set LOD=0 for compute shader that has derivative group
Caio Marcelo de Oliveira Filho [Fri, 5 Apr 2019 23:04:40 +0000 (16:04 -0700)]
nir: Don't set LOD=0 for compute shader that has derivative group

When using NV_compute_shader_derivatives to set a derivative group,
a compute shader supports texture with implicit LOD calculation, so
don't set an explicit LOD.

Note if the extension is used but the derivative group is not
specified, it will default to LOD=0 as before.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir/algebraic: Lower CS derivatives to zero when no group defined
Caio Marcelo de Oliveira Filho [Wed, 27 Mar 2019 22:07:20 +0000 (15:07 -0700)]
nir/algebraic: Lower CS derivatives to zero when no group defined

In compute shaders if no derivative group is defined, the derivatives
will always be zero.  Specified in NV_compute_shader_derivatives.

To make the check more convenient, add a "info" local variable to the
generated code so we can refer to it in the Python rules.  (Jason)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoglsl: Parse and propagate derivative_group to shader_info
Caio Marcelo de Oliveira Filho [Tue, 26 Mar 2019 07:04:57 +0000 (00:04 -0700)]
glsl: Parse and propagate derivative_group to shader_info

NV_compute_shader_derivatives allows selecting between two possible
arrangements (quads and linear) when calculating derivatives and
certain subgroup operations in case of Vulkan.  So parse and propagate
those up to shader_info.h.

v2: Do not fail when ARB_compute_variable_group_size is being used,
    since we are still clarifying what is the right thing to do here.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoglsl: Enable texture builtins for NV_compute_shader_derivatives
Caio Marcelo de Oliveira Filho [Thu, 28 Mar 2019 07:36:39 +0000 (00:36 -0700)]
glsl: Enable texture builtins for NV_compute_shader_derivatives

Renamed a few predicates from "fs_only" to be "derivative_only" (or
similar pairs).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoglsl: Enable derivative builtins for NV_compute_shader_derivatives
Caio Marcelo de Oliveira Filho [Tue, 26 Mar 2019 07:04:01 +0000 (00:04 -0700)]
glsl: Enable derivative builtins for NV_compute_shader_derivatives

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoglsl: Remove redundant conditions when asserting in_qualifier
Caio Marcelo de Oliveira Filho [Tue, 26 Mar 2019 06:47:21 +0000 (23:47 -0700)]
glsl: Remove redundant conditions when asserting in_qualifier

As the code evolved, we ended up with redundant conditions.  Clean
this up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa: Extension boilerplate for NV_compute_shader_derivatives
Caio Marcelo de Oliveira Filho [Mon, 18 Mar 2019 21:26:52 +0000 (14:26 -0700)]
mesa: Extension boilerplate for NV_compute_shader_derivatives

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/radv: remove restrictions on opt_if_loop_last_continue()
Timothy Arceri [Mon, 8 Apr 2019 10:13:49 +0000 (20:13 +1000)]
nir/radv: remove restrictions on opt_if_loop_last_continue()

When I implemented opt_if_loop_last_continue() I had restricted
this pass from moving other if-statements inside the branch opposite
the continue. At the time it was causing a bunch of spilling in
shader-db for i965.

However Samuel Pitoiset noticed that making this pass more aggressive
significantly improved the performance of Doom on RADV. Below are
the statistics he gathered.

28717 shaders in 14931 tests
Totals:
SGPRS: 1267317 -> 1267549 (0.02 %)
VGPRS: 896876 -> 895920 (-0.11 %)
Spilled SGPRs: 24701 -> 26367 (6.74 %)
Code Size: 48379452 -> 48507880 (0.27 %) bytes
Max Waves: 241159 -> 241190 (0.01 %)

Totals from affected shaders:
SGPRS: 23584 -> 23816 (0.98 %)
VGPRS: 25908 -> 24952 (-3.69 %)
Spilled SGPRs: 503 -> 2169 (331.21 %)
Code Size: 2471392 -> 2599820 (5.20 %) bytes
Max Waves: 586 -> 617 (5.29 %)

The code size increase is related to Wolfenstein II; it seems largely
due to an increase in phis rather than the existing jumps.

This gives +10% FPS with Doom on my Vega56.

Rhys Perry also benchmarked Doom on his VEGA64:

Before: 72.53 FPS
After:  80.77 FPS

v2: disable pass on non-AMD drivers

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
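
The percentages quoted in the statistics above follow the usual relative-change formula. A minimal sketch, assuming a hypothetical helper named percent_change that is not part of shader-db or Mesa:

```c
#include <assert.h>

/* Hypothetical helper: relative change from `before` to `after`,
 * expressed in percent. Negative values mean a decrease. */
static double percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

Applied to the Vega 64 numbers, percent_change(72.53, 80.77) comes out at roughly 11.4, in line with the "+10% FPS" reported for the Vega 56. The spilled SGPR count for affected shaders, 503 -> 2169, gives roughly 331.2, matching the table.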
5 years agosoftpipe: add support for vertex streams (v2)
Dave Airlie [Wed, 27 May 2015 07:41:32 +0000 (17:41 +1000)]
softpipe: add support for vertex streams (v2)

This enables the ARB_gpu_shader5 vertex streams on softpipe.

v2: only enable when not using llvm.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
5 years agodraw: add support to tgsi paths for geometry streams. (v2)
Dave Airlie [Wed, 27 May 2015 07:39:05 +0000 (17:39 +1000)]
draw: add support to tgsi paths for geometry streams. (v2)

This hooks up the geometry shader processing to the TGSI
support added in the previous commits.

It doesn't change the llvm interface other than to
keep things building.

v2: fix some regressions caused by primitiveoffsets

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
5 years agosoftpipe: add support for indexed queries.
Dave Airlie [Wed, 27 May 2015 07:37:46 +0000 (17:37 +1000)]
softpipe: add support for indexed queries.

We need indexed queries to retrieve the geom shader info.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
5 years agotgsi: add support for geometry shader streams.
Dave Airlie [Wed, 27 May 2015 07:35:32 +0000 (17:35 +1000)]
tgsi: add support for geometry shader streams.

This adds support to retrieve the primitive counts
for each stream, along with the offset for each
primitive into the output array.

It also adds support for parsing the stream argument
to the emit and end instructions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
5 years agodraw: add stream member to stats callback
Dave Airlie [Wed, 27 May 2015 07:22:13 +0000 (17:22 +1000)]
draw: add stream member to stats callback

This just adds space for the member to the callback, doesn't
change anything else.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
5 years agovulkan/wsi: make wl_drm optional
Chia-I Wu [Fri, 8 Feb 2019 22:57:33 +0000 (14:57 -0800)]
vulkan/wsi: make wl_drm optional

When wl_drm is missing and the driver supports modifiers, use
zwp_linux_dmabuf_v1 for the list of supported formats and for buffer
creation.

Limit the supported formats to those with modifiers, which are
WL_DRM_FORMAT_{ARGB8888,XRGB8888} currently.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years agovulkan/wsi: add wsi_wl_display_dmabuf
Chia-I Wu [Tue, 12 Feb 2019 20:31:57 +0000 (12:31 -0800)]
vulkan/wsi: add wsi_wl_display_dmabuf

Add wsi_wl_display_dmabuf for zwp_linux_dmabuf_v1-related states.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years agovulkan/wsi: add wsi_wl_display_drm
Chia-I Wu [Tue, 12 Feb 2019 20:31:57 +0000 (12:31 -0800)]
vulkan/wsi: add wsi_wl_display_drm

Add wsi_wl_display_drm for wl_drm-related states.  We will move
formats into the struct in a later commit.

Remove the unnecessary check for wl_registry_bind failures.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years agovulkan/wsi: refactor drm_handle_format
Chia-I Wu [Tue, 12 Feb 2019 05:25:49 +0000 (21:25 -0800)]
vulkan/wsi: refactor drm_handle_format

Refactor the switch statement in drm_handle_format out to
wsi_wl_display_add_wl_format.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
5 years agovulkan/wsi: create wl_drm wrapper as needed
Chia-I Wu [Tue, 12 Feb 2019 01:23:56 +0000 (17:23 -0800)]
vulkan/wsi: create wl_drm wrapper as needed

When modifiers are specified, we have to use dmabuf rather than
wl_drm.  We don't need the wrapper in that case.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>