Jason Ekstrand [Fri, 24 Feb 2017 00:18:00 +0000 (16:18 -0800)]
anv: Advertise shaderInt64 on Broadwell and above
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 1 Mar 2017 23:20:31 +0000 (15:20 -0800)]
nir/int64: Properly handle imod/irem
The previous implementation was fine for GLSL, which doesn't really have
a signed modulus/remainder. It just leaves the behavior undefined
whenever either source is negative. However, in SPIR-V, there is a
defined behavior for negative arguments. This commit beefs up the pass
so that it handles both correctly. Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.
Reviewed-by: Matt Turner <mattst88@gmail.com>
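The difference between the two operations can be sketched in plain C. This is an illustration only, not the NIR lowering code: the names irem64 and imod64 are hypothetical, and it assumes the usual convention that a remainder takes the sign of the dividend while a modulus takes the sign of the divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Remainder: the result takes the sign of the dividend `a`.
 * C's % operator already truncates toward zero, so it maps directly.
 * Callers must avoid b == 0 and INT64_MIN % -1. */
static int64_t irem64(int64_t a, int64_t b)
{
    assert(b != 0);
    return a % b;
}

/* Modulus: the result takes the sign of the divisor `b`.
 * Start from the truncated remainder and shift it by `b` when it is
 * non-zero and its sign disagrees with the divisor's. */
static int64_t imod64(int64_t a, int64_t b)
{
    assert(b != 0);
    int64_t r = a % b;
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

With these definitions, irem64(-7, 3) is -1 while imod64(-7, 3) is 2. The two only differ when the operands have opposite signs.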
Jason Ekstrand [Fri, 24 Feb 2017 05:35:00 +0000 (21:35 -0800)]
nir/builder: Add an int64 immediate helper
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sat, 4 Jun 2016 01:09:55 +0000 (18:09 -0700)]
genxml: Fill out Gen4 and G45 XML.
This is a work in progress - some things may still need fixing.
But it should be in pretty decent shape.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Sat, 25 Feb 2017 23:41:37 +0000 (00:41 +0100)]
ac: normalize build helper names
s/emit/build/
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sat, 25 Feb 2017 22:48:23 +0000 (23:48 +0100)]
ac: replace SI.vs.load.input with amdgcn.buffer.load.format
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sat, 25 Feb 2017 22:40:52 +0000 (23:40 +0100)]
radeonsi: move SI.vs.load.input building into amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Tue, 15 Nov 2016 23:26:47 +0000 (00:26 +0100)]
radeonsi: detect and mark loads/stores from read-only/write-only memory
Marek Olšák [Fri, 24 Feb 2017 01:09:47 +0000 (02:09 +0100)]
ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0
ADD_TID doesn't work. Needs more investigation.
v2: remove leftover dead code
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Marek Olšák [Fri, 24 Feb 2017 01:14:52 +0000 (02:14 +0100)]
radeonsi: use the writeonly LLVM attribute
Marek Olšák [Fri, 24 Feb 2017 19:23:23 +0000 (20:23 +0100)]
ac: remove offen parameter from ac_build_buffer_store_dword
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 00:45:31 +0000 (01:45 +0100)]
radeonsi: enable TC L2 for tessellation offchip stores
Vulkan does the same thing.
Marek Olšák [Fri, 24 Feb 2017 00:20:35 +0000 (01:20 +0100)]
radeonsi: merge and simplify tbuffer_store functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 22:14:35 +0000 (23:14 +0100)]
radeonsi: set noalias on input shader pointers
Marek Olšák [Fri, 24 Feb 2017 22:06:31 +0000 (23:06 +0100)]
radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 21:44:18 +0000 (22:44 +0100)]
radeonsi: move kill intrinsic building into amd/common
just a cleanup
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 16:16:28 +0000 (17:16 +0100)]
radeonsi: set readnone on reads from read-only memory
Marek Olšák [Fri, 24 Feb 2017 15:54:05 +0000 (16:54 +0100)]
radeonsi: replace SI.buffer.load.dword with amdgcn.buffer.load
Marek Olšák [Fri, 24 Feb 2017 15:38:25 +0000 (16:38 +0100)]
radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz
Marek Olšák [Thu, 23 Feb 2017 22:37:59 +0000 (23:37 +0100)]
ac: replace old image intrinsics with new ones
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 23:52:07 +0000 (00:52 +0100)]
radeonsi: remove last use of llvm.SI.resinfo
and move one function up to reuse the code.
Marek Olšák [Thu, 23 Feb 2017 22:00:19 +0000 (23:00 +0100)]
radeonsi: move image intrinsic building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 01:15:54 +0000 (02:15 +0100)]
ac: replace SI.export with amdgcn.exp.*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 01:06:40 +0000 (02:06 +0100)]
radeonsi: move llvm.SI.export building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 21:58:49 +0000 (22:58 +0100)]
ac: unify build_type_name_for_intr functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 21:15:17 +0000 (22:15 +0100)]
radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as well
It was harmless, because we also set unorm in the sampler state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 01:11:07 +0000 (02:11 +0100)]
gallivm, ac: add writeonly and inaccessiblememonly attributes
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Tue, 15 Nov 2016 23:47:35 +0000 (00:47 +0100)]
tgsi/scan: record load/store/atomic image usage
Reviewed-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Wed, 27 Feb 2013 20:58:58 +0000 (12:58 -0800)]
glapi: Fix a comment typo
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Thu, 2 Mar 2017 16:18:14 +0000 (17:18 +0100)]
mesa/main: *TextureSubImage* generates INVALID_OPERATION on wrong target
Equivalent *TexSubImage* methods generate INVALID_ENUM.
From OpenGL 4.5 spec, section 8.6 Alternate Texture Image
Specification Commands:
"An INVALID_ENUM error is generated by *TexSubImage* if target does
not match the command, as shown in table 8.15."
And:
"An INVALID_OPERATION error is generated by *TextureSubImage* if
the effective target of texture does not match the command, as
shown in table 8.15."
Fixes:
GL45-CTS.direct_state_access.textures_copy_errors
v2: slightly change commit summary (Samuel)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
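The rule quoted from the spec boils down to a choice between two error codes. A minimal sketch, assuming a hypothetical helper named wrong_target_error (not a Mesa function); the two enum values are the ones defined in the standard OpenGL headers.

```c
#include <stdbool.h>

/* Error codes as defined in the OpenGL headers. */
#define GL_INVALID_ENUM      0x0500
#define GL_INVALID_OPERATION 0x0502

/* Hypothetical helper: pick the error reported when the target does not
 * match the command.  `dsa` is true for the *TextureSubImage* (direct
 * state access) entry points and false for *TexSubImage*. */
static unsigned wrong_target_error(bool dsa)
{
    return dsa ? GL_INVALID_OPERATION : GL_INVALID_ENUM;
}
```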
Ben Widawsky [Fri, 3 Mar 2017 01:47:02 +0000 (17:47 -0800)]
i965: Add Kaby Lake brandstrings
While here, use the spacing defined in Ark.
https://ark.intel.com/products/codename/82879/Kaby-Lake
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Grazvydas Ignotas [Thu, 2 Mar 2017 22:46:53 +0000 (00:46 +0200)]
tgsi/ureg: return correct token count in ureg_get_tokens
Valgrind reports that the shader cache writes uninitialized data to disk.
Turns out ureg_get_tokens() is returning the count of allocated tokens
instead of how many are actually used, so the cache writes out unused
space at the end. Use the real count instead.
This change should not cause regressions elsewhere because the only
ureg_get_tokens() user that cares about token count is the shader cache.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
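The allocated-versus-used distinction can be sketched with a small growable buffer. All names here (token_buf, token_buf_push, token_buf_get) are hypothetical and only mirror the idea; this is not the ureg code.

```c
#include <stdlib.h>

/* Hypothetical growable token buffer. */
struct token_buf {
    unsigned *tokens;
    unsigned size;    /* slots allocated */
    unsigned count;   /* slots actually written */
};

/* Append one token, doubling the allocation when it is full.
 * Returns 1 on success and 0 if the allocation fails. */
static int token_buf_push(struct token_buf *b, unsigned tok)
{
    if (b->count == b->size) {
        unsigned new_size = b->size ? b->size * 2 : 4;
        unsigned *p = realloc(b->tokens, new_size * sizeof *p);
        if (!p)
            return 0;
        b->tokens = p;
        b->size = new_size;
    }
    b->tokens[b->count++] = tok;
    return 1;
}

/* Hand out the tokens.  Reporting b->count rather than b->size keeps a
 * consumer (such as a cache writer) from reading uninitialized slots. */
static const unsigned *token_buf_get(const struct token_buf *b,
                                     unsigned *nr_tokens)
{
    if (nr_tokens)
        *nr_tokens = b->count;
    return b->tokens;
}
```

After five pushes the buffer in this sketch holds eight allocated slots but reports five tokens, so only initialized data would be written out.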
Timothy Arceri [Tue, 24 Jan 2017 16:08:22 +0000 (17:08 +0100)]
radeonsi: add support for an on-disk shader cache
V2:
- when loading from disk cache also binary insert into memory cache.
- check that the binary loaded from disk is the correct size. If not
delete the cache item and skip loading from cache.
V3:
- remove unrequired variable
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 1 Mar 2017 05:04:23 +0000 (16:04 +1100)]
util/disk_cache: compress individual cache entries
This reduces the cache size for Deus Ex from ~160M to ~30M for
radeonsi (these numbers differ from Grigori's results below
probably due to different graphics quality settings).
I'm also seeing the following improvements in minimum fps in the
Shadow of Mordor benchmark on an i5-6400 CPU@2.70GHz, with an HDD:
no-cache: ~10fps
with-cache-no-compression: ~15fps
with-cache-and-compression: ~20fps
Note: The with-cache results are from the second run after closing
and opening the game to avoid the in-memory cache.
Since we mainly care about decompression I went with
Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
who has benchmarked decompression speeds.
Grigori Goronzy provided the following stats for Deus Ex: Mankind
Divided start-up times on an Athlon X4 860k with an SSD:
No Cache 215 sec
Cold Cache zlib BEST_COMPRESSION 285 sec
Warm Cache zlib BEST_COMPRESSION 33 sec
Cold Cache zlib BEST_SPEED 264 sec
Warm Cache zlib BEST_SPEED 33 sec
Cold Cache no compression 266 sec
Warm Cache no compression 34 sec
The total cache size for that game is 48 MiB with BEST_COMPRESSION,
56 MiB with BEST_SPEED and 170 MiB with no compression.
These numbers suggest that it may be ok to go with Z_BEST_SPEED
but we should gather some actual decompression times before doing
so. Another option might be to do the compression in a separate
thread; this might allow us to use a higher-compression algorithm
such as LZMA.
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 24 Feb 2017 04:14:56 +0000 (15:14 +1100)]
util/disk_cache: add support for detecting corrupt cache entries
V2: fix pointer increments for writing/reading crc
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
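Checksum-based corruption detection can be sketched as follows. This is a stand-alone illustration under stated assumptions: the [crc32][payload] layout and the names crc32_calc, entry_pack and entry_is_valid are hypothetical and are not taken from disk_cache. The checksum is the common reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit to avoid any library dependency.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320. */
static uint32_t crc32_calc(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical entry layout: 4-byte CRC followed by the payload.
 * `out` must have room for len + 4 bytes.  Returns the bytes written. */
static size_t entry_pack(uint8_t *out, const void *payload, size_t len)
{
    uint32_t crc = crc32_calc(payload, len);
    memcpy(out, &crc, sizeof crc);
    memcpy(out + sizeof crc, payload, len);   /* advance past the CRC */
    return len + sizeof crc;
}

/* Recompute the CRC over the payload and compare with the stored one. */
static int entry_is_valid(const uint8_t *entry, size_t total)
{
    uint32_t stored;
    if (total < sizeof stored)
        return 0;
    memcpy(&stored, entry, sizeof stored);
    return stored == crc32_calc(entry + sizeof stored, total - sizeof stored);
}
```

A reader that finds entry_is_valid() returning 0 would treat the entry as corrupt and fall back to recompiling, in the same spirit as the size check added to the radeonsi cache.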
Samuel Pitoiset [Wed, 1 Mar 2017 21:09:28 +0000 (22:09 +0100)]
glsl: fix subroutine mismatch between declarations/definitions
Previously, when q.subroutine was set to 1, a new subroutine
declaration was added to the AST, while 0 meant a subroutine
definition has been detected by the parser.
Thus, setting the q.subroutine flag in both situations is
obviously wrong because a new type identifier is added instead
of trying to match the declaration. To fix it up, introduce
ast_type_qualifier::is_subroutine_decl() to differentiate
declarations and definitions easily.
This fixes a regression with:
arb_shader_subroutine/compiler/direct-call.vert
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes:
be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Matt Turner [Thu, 2 Mar 2017 19:05:34 +0000 (11:05 -0800)]
genxml: Depend on Makefile.am for generated sources.
Depending on the generated Makefile means that all generated sources are
recreated after ./configure.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Matt Turner [Thu, 2 Mar 2017 04:43:21 +0000 (04:43 +0000)]
clover: Work around build failure with AltiVec.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504
Acked-by: Francisco Jerez <currojerez@riseup.net>
Nanley Chery [Tue, 31 Jan 2017 20:23:18 +0000 (12:23 -0800)]
anv/image: Allow HiZ on input attachment-capable depth/stencil images
While an input attachment may only take on one of those two layouts,
other depth/stencil attachments that use the same image may have
HiZ-enabled layouts. Improves the average frame rate on a release
candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my
SKL GT4.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Thu, 23 Feb 2017 20:11:58 +0000 (12:11 -0800)]
anv/cmd_buffer: Centralize automatic layout transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Wed, 1 Feb 2017 00:42:58 +0000 (16:42 -0800)]
anv/cmd_buffer: Add attachment transitioning functions
This is needed to transition input attachments.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 20:31:36 +0000 (12:31 -0800)]
anv/blorp: Encapsulate subpass id querying
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 22:09:03 +0000 (14:09 -0800)]
anv/cmd_buffer: Enable render pass awareness
v2: Update cmd_state_reset (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Sat, 25 Feb 2017 23:57:32 +0000 (15:57 -0800)]
anv/pass: Store subpass attachment reference list
We'll loop through this array when performing automatic layout
transitions.
v2: Adjust formatting of an assignment (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 17:38:25 +0000 (09:38 -0800)]
anv/pass: Fix size of anv_render_pass:subpass_attachments
Don't allocate space for resolve attachments if the subpass has none.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Wed, 1 Feb 2017 00:12:50 +0000 (16:12 -0800)]
anv: Store the user's VkAttachmentReference
We will be using the image layout. Store the full struct directly from
the user.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Tue, 31 Jan 2017 19:36:22 +0000 (11:36 -0800)]
anv/cmd_buffer: Remove extra resolve for certain depth buffers
Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer
when reading from a depth image subresource that is in the general
layout. Remove this unneeded resolve.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Tue, 31 Jan 2017 19:25:31 +0000 (11:25 -0800)]
anv/cmd_buffer: Conditionally choose the sampled image surface state
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Tue, 31 Jan 2017 19:13:44 +0000 (11:13 -0800)]
anv/descriptor_set: Store aux usage of sampled image descriptors
v2: Rebase onto latest changes
v3: Account for NULL image_view in aux_usage assignment
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Tue, 31 Jan 2017 19:04:42 +0000 (11:04 -0800)]
anv/image: Create an additional surface state for sampling
This will be used to sample a depth input attachment without having to
pass through the HiZ buffer.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Thu, 23 Feb 2017 18:02:17 +0000 (10:02 -0800)]
anv/image: Simplify setup of HiZ sampler surface state
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Wed, 22 Feb 2017 02:17:59 +0000 (18:17 -0800)]
anv/image: Remove extra dependency on HiZ-specific variable
surf_usage is only useful to image views that may use HiZ buffers.
Storage image views don't use HiZ buffers.
v2: Update commit message and add an assertion.
Fixes:
055ff2ec521 ("anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ")
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Fri, 17 Feb 2017 18:14:59 +0000 (10:14 -0800)]
anv: Update the HiZ sampling helper
Validate the inputs, verify that this image has a depth
buffer, use gen_device_info instead of gen.
- Add parenthesis (Jason Ekstrand)
- Make parameters const
- Use gen_device_info instead of gen
- Pass aspect to missed function in transition_depth_buffer
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Fri, 17 Feb 2017 01:35:39 +0000 (17:35 -0800)]
anv/cmd_buffer: Replace layout_to_hiz_usage()
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Wed, 1 Feb 2017 19:27:58 +0000 (11:27 -0800)]
anv/image: Add anv_layout_to_aux_usage()
This function supersedes layout_to_hiz_usage().
v2:
- Don't find the optimal buffer for layout transitions (Jason Ekstrand).
- Pass the devinfo instead of the gen (Jason Ekstrand)
- Update the function documentation.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 18:23:33 +0000 (10:23 -0800)]
anv/pass: Avoid accessing attachment array out of bounds
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jonas Pfeil [Wed, 1 Mar 2017 17:11:10 +0000 (18:11 +0100)]
ralloc: Make sure ralloc() allocations match malloc()'s alignment.
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.
Fixes SIGBUS on Raspberry Pi at high optimization levels.
This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.
v2: Commit message reword/typo fix, and add a bigger explanation in the
code (by anholt)
Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
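The alignment requirement can be sketched with a header-prefixed allocator. This is not the ralloc code: alloc_header, hdr_alloc and hdr_free are hypothetical names, and the sketch assumes C11 (_Alignas and max_align_t) rather than the compiler-specific attributes a portable library might need.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocation header.  Aligning the first member to
 * max_align_t pads sizeof(struct alloc_header) up to a multiple of the
 * strictest fundamental alignment. */
struct alloc_header {
    _Alignas(max_align_t) struct alloc_header *parent;
    size_t size;
};

/* Allocate `size` payload bytes behind a header.  malloc() returns
 * memory aligned for max_align_t, and the header size is a multiple of
 * that alignment, so the payload pointer is aligned the same way. */
static void *hdr_alloc(size_t size)
{
    struct alloc_header *h = malloc(sizeof *h + size);
    if (!h)
        return NULL;
    h->parent = NULL;
    h->size = size;
    return h + 1;   /* payload starts right after the header */
}

/* Step back over the header to recover the pointer malloc() returned. */
static void hdr_free(void *ptr)
{
    if (ptr)
        free((struct alloc_header *)ptr - 1);
}
```

Without the padding, a header whose size is not a multiple of the platform alignment would hand out payload pointers that the compiler still assumes are malloc-aligned.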
Bruce Cherniak [Thu, 2 Mar 2017 04:58:36 +0000 (22:58 -0600)]
swr: fix crash in swr_update_derived following st/mesa state changes
Recent change to st/mesa state update logic caused major regressions to
swr validation code.
swr uses the same validation logic (swr_update_derived) for both draw
and Clear calls. New st/mesa state update logic results in certain state
objects not being set/bound during Clear. This was causing null ptr
exceptions. Creation of static dummy state objects allows setting these
pointers during Clear validation, without interfering with relevant state
validation.
Once fixed, new logic also highlighted an error in dirty bit checking for
fragment shader and clip validation.
(The alternative is to have a simplified validation routine for Clear.
We may do that at some point.)
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Bruce Cherniak [Sun, 26 Feb 2017 03:17:07 +0000 (21:17 -0600)]
docs: update features.txt for GL_ARB_clear_texture with swr
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Bruce Cherniak [Sun, 26 Feb 2017 03:09:57 +0000 (21:09 -0600)]
swr: enable clear_texture with util_clear_texture
Passes corresponding piglit tests.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Gregory Hainaut [Fri, 24 Feb 2017 20:45:12 +0000 (21:45 +0100)]
doc: GL_ARB_buffer_storage is supported on llvmpipe/swr
At least, the extension is exported (gallium capability
PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT is 1)
Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 2 Mar 2017 15:49:48 +0000 (15:49 +0000)]
automake: i965: list correct header in Makefile.source
Fixes:
7ac47b1af767 ("i965: Add a header for brw_vec4_vs_visitor")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Brian Paul [Wed, 1 Mar 2017 22:29:55 +0000 (15:29 -0700)]
svga: fix crash regression since e027935a795
During the first update of the hw_clear_state atoms, we may not yet
have a current rasterizer state object. So, svga->curr.rast may be
NULL and we crash.
Add a few null pointer checks to work around this. Note that these
are only needed in the state update functions which are called for
'clear' validation.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 1 Mar 2017 21:52:46 +0000 (14:52 -0700)]
svga: s/unsigned/pipe_prim_type/
And add some default switch cases to silence compiler warnings.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 1 Mar 2017 20:50:48 +0000 (13:50 -0700)]
svga: whitespace fixes in svga_context.h
Trivial.
Brian Paul [Wed, 1 Mar 2017 20:48:28 +0000 (13:48 -0700)]
svga: whitespace and formatting fixes in svga_stage.c
Trivial.
Robert Foss [Thu, 2 Mar 2017 00:14:39 +0000 (19:14 -0500)]
mesa: Avoid read of uninitialized variable
The is_color_attachement variable is later read when handling two
separate error cases, where only one of the cases results in the
variable being initialized.
This can be avoided by giving the variable a safe default value.
Coverity-Id: 1398631
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Lionel Landwerlin [Tue, 17 Jan 2017 16:38:01 +0000 (16:38 +0000)]
anv: add VK_KHR_descriptor_update_template support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Thu, 12 Jan 2017 16:12:46 +0000 (16:12 +0000)]
anv: add VK_KHR_push_descriptor support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Tue, 17 Jan 2017 17:43:08 +0000 (17:43 +0000)]
anv: descriptor: make descriptor writing take a stream allocator
This allows us to allocate surface states from the command buffer when
pushing descriptor sets rather than allocating them through a
descriptor set pool.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Mon, 23 Jan 2017 15:33:37 +0000 (15:33 +0000)]
anv: descriptors: extract writing of descriptors elements
This will be reused later on.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Tue, 17 Jan 2017 14:30:19 +0000 (14:30 +0000)]
anv: make layout size computation helper available across compilation units
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Tue, 17 Jan 2017 14:28:20 +0000 (14:28 +0000)]
anv: move buffer_view declaration
We will need this declaration closer for readability later.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tomasz Figa [Thu, 23 Feb 2017 08:05:18 +0000 (17:05 +0900)]
mesa: Use _mesa_has_OES_geometry_shader() when validating draws
In validate_DrawElements_common() we need to check for OES_geometry_shader
extension to determine if we should fail if transform feedback is
unpaused. However current code reads ctx->Extensions.OES_geometry_shader
directly, which does not take context version into account. This means
that if the context is GLES 3.0, which makes the OES_geometry_shader
inapplicable, we would not validate the draw properly. To fix it, let's
replace the check with a call to _mesa_has_OES_geometry_shader().
Fixes following dEQP tests on i965 with a GLES 3.0 context:
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements_incomplete_primitive
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
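The difference between reading the extension flag and using a version-aware helper can be sketched like this. The struct and function names are hypothetical simplifications, not Mesa's gl_context or its generated helpers, and the sketch assumes the extension requires at least GLES 3.1.

```c
#include <stdbool.h>

/* Hypothetical, simplified context. */
struct toy_context {
    bool is_gles;
    unsigned version;               /* 30 == ES 3.0, 31 == ES 3.1 */
    bool ext_OES_geometry_shader;   /* driver is able to expose it */
};

/* Version-aware check: the raw flag alone is not enough, the context
 * version must also allow the extension. */
static bool toy_has_OES_geometry_shader(const struct toy_context *ctx)
{
    return ctx->ext_OES_geometry_shader &&
           ctx->is_gles && ctx->version >= 31;
}
```

On a GLES 3.0 context the raw flag may still be true, but the helper reports false, which is the case the draw validation needs to handle.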
Kenneth Graunke [Thu, 2 Mar 2017 07:31:15 +0000 (23:31 -0800)]
i965: Replace BRW_SURFACEFORMAT_* with ISL_FORMAT_*.
One less set of enums. Dropped the #defines from brw_defines.h and ran:
$ for file in *.cpp *.c *.h; do sed -i \
-e 's/BRW_SURFACEFORMAT_/ISL_FORMAT_/g' \
-e 's/ISL_FORMAT_ASTC_[A-Zxs0-9_]*/\U&/g' $file; \
done
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Chris Wilson [Tue, 10 Jan 2017 21:23:26 +0000 (21:23 +0000)]
i965: Only flush the batchbuffer if we need to zero the SO offsets
If we don't have pipelined register access (e.g. Haswell before kernel
v4.2), then we can only implement EXT_transform_feedback by resetting the
SO offsets *between* batches. However, if we do have pipelined access to
the SO registers on gen7, we can simply emit an inline reset of the SO
registers without a full batch flush.
v2 [by Ken]: Simplify after recent kernel feature detection changes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Iago Toral Quiroga [Wed, 22 Feb 2017 10:33:13 +0000 (11:33 +0100)]
anv: do not subtract the base layer to compute depth in 3DSTATE_DEPTH_BUFFER
According to the PRM description of the Depth field:
"This field specifies the total number of levels for a volume texture
or the number of array elements allowed to be accessed starting at the
Minimum Array Element for arrayed surfaces"
However, ISL defines array_len as the length of the range
[base_array_layer, base_array_layer + array_len], so it already represents
a value relative to the base array layer like the hardware expects.
v2: Depth is defined as a U11-1 field, so subtract 1 from
the actual value (Jason)
This fixes a number of new CTS tests that would crash otherwise:
dEQP-VK.pipeline.render_to_image.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
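The resulting arithmetic is small enough to show directly. A minimal sketch with a hypothetical helper name, assuming "U11-1" means an unsigned 11-bit field that stores the value minus one:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a view's array length into a U11-1 field.
 * array_len is already relative to the base layer, so the base layer is
 * not subtracted; only the "minus one" encoding is applied. */
static uint32_t depth_field_from_array_len(uint32_t array_len)
{
    assert(array_len >= 1 && array_len <= 2048);   /* fits in 11 bits */
    return array_len - 1;
}
```

For a view of 6 layers the field holds 5, whichever layer the view starts at.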
Iago Toral Quiroga [Fri, 24 Feb 2017 07:35:39 +0000 (08:35 +0100)]
isl: document the meaning of the array_len field in isl_view
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jacob Lifshay [Wed, 1 Mar 2017 04:30:57 +0000 (20:30 -0800)]
vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling users that they could probably
enable DRI3. More importantly, it includes a little heuristic to check
to see if we're running on AMD or NVIDIA's proprietary X11 drivers and,
if we are, doesn't emit the warning. This way, users with both a discrete
card and Intel graphics don't get the warning when they're just running
on the discrete card.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715
Co-authored-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Rene Lindsay <rjklindsay@hotmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
Jason Ekstrand [Thu, 23 Feb 2017 22:54:13 +0000 (14:54 -0800)]
i965: Do int64 lowering in NIR
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 23 Feb 2017 21:56:15 +0000 (13:56 -0800)]
nir: Add a simple int64 lowering pass
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.
v2: Properly handle vectors
v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the
calculations anyway and this is just extra instructions.
v4:
- Add back in the log2_denom stuff since it's needed for ensuring that
the shifts don't overflow.
- Rework the looping part of the pass to be easier to expand.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Wed, 15 Feb 2017 18:47:03 +0000 (10:47 -0800)]
spirv: Use nir_builder for control flow
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 15 Feb 2017 18:15:58 +0000 (10:15 -0800)]
nir/lower_indirect: Use nir_builder control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 15 Feb 2017 18:14:47 +0000 (10:14 -0800)]
nir/lower_gs_intrinsics: Use nir_builder control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 15 Feb 2017 18:04:47 +0000 (10:04 -0800)]
glsl/nir: Use nir_builder's new control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 15 Feb 2017 16:42:45 +0000 (08:42 -0800)]
nir/builder: Add support for easily building control-flow
Each of the pop functions (and push_else) take a control flow parameter as
their second argument. If NULL, it assumes that the builder is in a block
that's a direct child of the control-flow node you want to pop off the
virtual stack. This is what 90% of consumers will want. The SPIR-V pass,
however, is a bit more "creative" about how it walks the CFG and it needs
to be able to pop multiple levels at a time, hence the argument.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
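The meaning of the second argument can be sketched with a toy model. This is not the nir_builder API: it uses an explicit stack where the real builder tracks a virtual one, and every name here (cf_node, toy_builder, toy_push, toy_pop) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16

/* Hypothetical control-flow node (an if or a loop). */
struct cf_node { const char *name; };

struct toy_builder {
    struct cf_node *stack[MAX_DEPTH];
    int depth;
};

static void toy_push(struct toy_builder *b, struct cf_node *node)
{
    assert(b->depth < MAX_DEPTH);
    b->stack[b->depth++] = node;
}

/* Pop control flow.  With node == NULL the innermost node is popped,
 * which is the common case.  With an explicit node, everything up to
 * and including that node is popped, allowing several levels at once.
 * Returns the number of levels popped. */
static int toy_pop(struct toy_builder *b, struct cf_node *node)
{
    int popped = 0;
    assert(b->depth > 0);
    if (node == NULL)
        node = b->stack[b->depth - 1];
    while (b->depth > 0) {
        struct cf_node *top = b->stack[--b->depth];
        popped++;
        if (top == node)
            break;
    }
    return popped;
}
```

Passing NULL covers the usual structured case; passing a node models a consumer that leaves several nested levels in one step.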
Jason Ekstrand [Wed, 1 Mar 2017 19:20:25 +0000 (11:20 -0800)]
i965: Move intel_debug.h to intel/common/gen_debug.h
This is shared between the Vulkan and GL drivers as it's a requirement
of the back-end compiler. However, it doesn't really belong in the
compiler. We rename the file to match the prefix of the other stuff in
common and because libdrm defines an intel_debug.h and this avoids a
pile of possible name conflicts.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Jason Ekstrand [Wed, 1 Mar 2017 16:58:43 +0000 (08:58 -0800)]
i965: Reduce cross-pollination between the DRI driver and compiler
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 19:53:41 +0000 (11:53 -0800)]
i965: Move select_clip_planes to brw_vs.c
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Wed, 1 Mar 2017 03:08:22 +0000 (19:08 -0800)]
i965: Delete brw_do_cubemap_normalize
This hasn't been used for quite some time now but we never bothered to
get rid of it when we dropped GLSL IR support for vec4.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 03:00:45 +0000 (19:00 -0800)]
i965: Add a header for brw_vec4_vs_visitor
brw_vs.h is not a compiler file but brw_vec4_visitor is definitely a
compiler thing.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:49:45 +0000 (18:49 -0800)]
i965: Move a bunch of pre-compile and link stuff to brw_program.h
It's all GL-specific and brw_program.h is not part of i965_compiler.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:48:58 +0000 (18:48 -0800)]
i965: Move image uniform setup to brw_nir_uniforms.cpp
It's the only thing that's using it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:48:23 +0000 (18:48 -0800)]
i965: Move channel_expressions and vector_splitting to brw_program.h
They're GL-specific.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:52:29 +0000 (18:52 -0800)]
i965: Make mark_surface_used a static inline in brw_compiler.h
One of these days, I'd like to see this function go away altogether
but for now, let's at least put it near the struct it updates.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:38:27 +0000 (18:38 -0800)]
i965: Move BRW_ATTRIB_WA_* defines to brw_compiler.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:14:49 +0000 (18:14 -0800)]
i965: Move BRW_MAX_DRAW_BUFFERS to brw_compiler.h
It does sort-of go with MAX_UBO and friends but MAX_DRAW_BUFFERS is an
actual hardware constant based on the number of things we can blend
rather than an arbitrary "number of things allowed in GL" like some of
the other maximums are.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:12:35 +0000 (18:12 -0800)]
i965/inst: Stop using fi_type
It's a mesa define that's trivial to inline. This removes a dependence
on main/imports.h.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:11:33 +0000 (18:11 -0800)]
i965: Move brw_register_blocks to brw_fs.cpp
Its one and only caller is brw_compile_fs which lives there.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 1 Mar 2017 02:10:53 +0000 (18:10 -0800)]
i965: Move SHADER_TIME_STRIDE to brw_compiler.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>