Alex Smith [Mon, 6 Mar 2017 14:54:28 +0000 (14:54 +0000)]
radv: Emit pending flushes before executing a secondary command buffer
If we have any pending flushes on the primary command buffer, these
must be performed before executing the secondary buffer.
This fixes potential corruption when the contents of a subpass which
clears any of its render targets are given in a secondary buffer: the
flushes after a fast clear would not have been performed until the
vkCmdEndRenderPass call.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Samuel Pitoiset [Wed, 1 Mar 2017 21:53:52 +0000 (22:53 +0100)]
mesa/main: remove useless check in _mesa_IsSampler()
_mesa_lookup_samplerobj() returns NULL if sampler is 0.
v2: use _mesa_lookup...(...) != NULL
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 1 Mar 2017 21:54:09 +0000 (22:54 +0100)]
getteximage: avoid to lookup textures with id 0
This fixes the following assertion when the key is 0.
main/hash.c:181: _mesa_HashLookup_unlocked: Assertion `key' failed.
Fixes:
633c959fae ("getteximage: Return correct error value when texure object is not found")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Mon, 6 Mar 2017 16:34:41 +0000 (17:34 +0100)]
docs/relnotes/17.1.0: document the new LLVM requirement
Marek Olšák [Tue, 28 Feb 2017 20:27:59 +0000 (21:27 +0100)]
gallium/radeon: don't monitor SDMA busyness on EG/Cayman/SI
It's always busy.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99955
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sat, 4 Mar 2017 23:15:31 +0000 (00:15 +0100)]
radeonsi: drop support for LLVM 3.6 & 3.7
They are too old.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sun, 26 Feb 2017 18:00:44 +0000 (19:00 +0100)]
radeonsi: set the convergent attribute where needed
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sun, 22 Jan 2017 01:36:48 +0000 (02:36 +0100)]
gallivm,ac: add LP_FUNC_ATTR_CONVERGENT
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sun, 5 Mar 2017 22:19:57 +0000 (23:19 +0100)]
radeonsi: fix LLVM 3.9 - don't use non-matching attributes on declarations
Call site attributes are used since LLVM 4.0.
This also reverts commit
b19caecbd6f310c1663b0cfe483d113ae3bd5fe2
"radeon/ac: fix intrinsic version check", because this is the correct fix.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Mark Thompson [Sun, 5 Mar 2017 23:18:11 +0000 (23:18 +0000)]
st/omx: Set end-of-frame flag on bitstream output buffers
Since all output buffers are whole frames, this should always be set.
Technically, setting this flag is optional (see OpenMAX IL section
3.1.2.7.1), but some clients assume that it will be used and
therefore buffer indefinitely thinking that all output buffers are
fragments of the first frame when it is not set.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Sun, 5 Mar 2017 22:10:14 +0000 (22:10 +0000)]
st/omx: Fix port format enumeration
From OpenMAX IL section 4.3.5:
"The value of nIndex is the range 0 to N-1, where N is the number of
formats supported by the port. There is no need for the port to
report N, as the caller can determine N by enumerating all the
formats supported by the port. Each port shall support at least one
format. If there are no more formats, OMX_GetParameter returns
OMX_ErrorNoMore (i.e., nIndex is supplied where the value is N or
greater)."
Only one format is supported, so N = 1 and OMX_ErrorNoMore should be
returned if nIndex >= 1. The previous code here would return the
same format for all values of nIndex, resulting in an infinite loop
when a client attempts to enumerate all formats.
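The enumeration contract quoted above can be sketched as follows. This is only an illustration of the rule, not the st/omx code: fmt_status, port_format_at, count_formats and the one-entry format table are hypothetical names standing in for the real OpenMAX IL types.

```c
/* Hypothetical stand-ins for OMX_ErrorNone / OMX_ErrorNoMore. */
typedef enum { FMT_OK = 0, FMT_NO_MORE = 1 } fmt_status;

/* Hypothetical table of formats supported by the port; here N = 1. */
static const int supported_formats[] = { 7 /* placeholder format id */ };
#define NUM_FORMATS (sizeof(supported_formats) / sizeof(supported_formats[0]))

/* Return the format at n_index, or FMT_NO_MORE once n_index >= N, so that
 * a caller enumerating the formats is guaranteed to terminate. */
static fmt_status port_format_at(unsigned n_index, int *format_out)
{
    if (n_index >= NUM_FORMATS)
        return FMT_NO_MORE;
    *format_out = supported_formats[n_index];
    return FMT_OK;
}

/* Caller side: determine N by enumerating until FMT_NO_MORE is returned. */
static unsigned count_formats(void)
{
    unsigned n = 0;
    int fmt;

    while (port_format_at(n, &fmt) == FMT_OK)
        n++;
    return n;
}
```

If port_format_at ignored n_index and always reported success, the loop in count_formats would never exit, which is the failure mode described above.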
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Wed, 1 Mar 2017 20:07:09 +0000 (20:07 +0000)]
st/va: Fix forward/backward referencing for deinterlacing
The VAAPI documentation is not very clear here, but the intent
appears to be that a forward reference is forward from a frame in the
past, not forward to a frame in the future (that is, forward as in
forward prediction, not as in a forward reference in source code).
This interpretation is derived from other implementations, in
particular the i965 driver and the gstreamer client.
In order to match those other implementations, this patch swaps the
meaning of forward and backward references as they currently appear
for motion-adaptive deinterlacing.
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Fri, 27 Jan 2017 22:03:10 +0000 (22:03 +0000)]
st/va: Support fractional framerate in misc parameter
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
Andy Furniss [Sun, 29 Jan 2017 14:22:31 +0000 (14:22 +0000)]
st/va encode handle ntsc framerate rate control
Tested with ffmpeg and gst-vaapi. Without this, bits per
frame is set way too low for fractional framerates.
v2: Mark Thompson: simplify calculation.
Use float.
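A minimal sketch of a bits-per-frame calculation that keeps the fractional part of the frame rate. The function and parameter names are hypothetical and the rate is assumed to arrive as a numerator/denominator pair; the actual st/va code may differ.

```c
/* Target bits per frame for a frame rate given as the fraction
 * fps_num / fps_den (for example 30000/1001 for NTSC). The division is
 * done in floating point so the fractional part of the rate is kept.
 * fps_num and fps_den are assumed to be non-zero. */
static float bits_per_frame(unsigned bitrate_bps, unsigned fps_num,
                            unsigned fps_den)
{
    float fps = (float)fps_num / (float)fps_den;

    return (float)bitrate_bps / fps;
}
```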
Signed-off-by: Andy Furniss <adf.lists@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Bas Nieuwenhuizen [Mon, 6 Mar 2017 00:34:42 +0000 (01:34 +0100)]
radv: Use the new L2 writeback flag.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 6 Mar 2017 00:28:53 +0000 (01:28 +0100)]
radv: Add L2 writeback.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Mon, 6 Mar 2017 02:25:59 +0000 (13:25 +1100)]
util/disk_cache: fix make check
Fixes make check after
11f0efec2e615f5233d which caused disk cache
to create an additional directory.
Dave Airlie [Sun, 5 Mar 2017 22:32:24 +0000 (08:32 +1000)]
radv/ac: use bitfield extract new intrinsics.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 5 Mar 2017 22:29:07 +0000 (08:29 +1000)]
radv/ac: move to new kill build.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 5 Mar 2017 22:23:53 +0000 (08:23 +1000)]
radv/ac: move to using new export intrinsics.
This uses the new code in build to do exports.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 5 Mar 2017 22:06:04 +0000 (08:06 +1000)]
radv/ac: switch to new intrinsics for pkrtz and clamp.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 5 Mar 2017 23:26:16 +0000 (23:26 +0000)]
radv: drop Z24 support.
This isn't exposed in -pro, the hw docs say it is deprecated,
so let's not bother with it.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Grazvydas Ignotas [Sun, 5 Mar 2017 21:04:53 +0000 (23:04 +0200)]
radv: use VK_NULL_HANDLE for handles
Avoids warnings on 32bit.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Grazvydas Ignotas [Sun, 5 Mar 2017 21:04:52 +0000 (23:04 +0200)]
radv: check for upload alloc failure
Mainly to avoid gcc's complaints about uninitialized ptr and offset use
later in that code.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Grazvydas Ignotas [Sun, 5 Mar 2017 21:04:51 +0000 (23:04 +0200)]
radv: don't use uninitialized value on failure
Mainly to avoid a warning.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Grazvydas Ignotas [Sun, 5 Mar 2017 21:04:50 +0000 (23:04 +0200)]
radv: avoid casting warnings on 32bit
Use the same helpers as for other handle<->pointer conversions.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Sun, 5 Mar 2017 19:58:31 +0000 (20:58 +0100)]
radv/amdgpu: Add some debug flags.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 5 Mar 2017 21:25:20 +0000 (22:25 +0100)]
radv: Cache command buffers in command pool.
So that we don't keep allocating BOs for the IBs and upload buffers.
We run some risk of memory increase with e.g. a bimodal size
distribution of command buffers, but I haven't noticed a significant
increase with dota2 and talos.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Thu, 23 Feb 2017 02:09:03 +0000 (13:09 +1100)]
Revert "glsl: Switch to disable-by-default for the GLSL shader cache"
This reverts commit
0f60c6616e93cba72bff4fbfedb72a753ef78e05.
Piglit and all games tested so far seem to be working without
issue. This change will allow wide user testing and we can decide
before the next release if we need to turn it off again.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 2 Mar 2017 06:08:34 +0000 (17:08 +1100)]
docs: update envvars.html to reflect having a cache per arch
Timothy Arceri [Sat, 4 Mar 2017 21:07:22 +0000 (08:07 +1100)]
util/disk_cache: support caches for multiple architectures
Previously we were deleting the entire cache if a user switched
between 32 and 64 bit applications.
V2: make the check more generic, it should now work with any
platform we are likely to support.
V3: Use suggestion from Emil to make even more generic/fix issue
with __ILP32__ not being declared on gcc for regular 32-bit builds.
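One way to keep caches for different architectures apart is to derive a subdirectory name from the pointer width. The sketch below assumes that approach; cache_arch_dir, cache_path_for_arch and the directory names are illustrative and not necessarily what util/disk_cache uses.

```c
#include <stdio.h>

/* Pick a per-architecture subdirectory name from the pointer width, so
 * that 32-bit and 64-bit processes do not delete each other's entries. */
static const char *cache_arch_dir(void)
{
    if (sizeof(void *) == 4)
        return "32";
    if (sizeof(void *) == 8)
        return "64";
    return "unknown_arch";
}

/* Write "<root>/<arch>" into out. Returns what snprintf() returns: the
 * length the full path needs, which is >= out_size on truncation. */
static int cache_path_for_arch(char *out, size_t out_size, const char *root)
{
    return snprintf(out, out_size, "%s/%s", root, cache_arch_dir());
}
```

Using sizeof(void *) rather than a compiler macro avoids depending on macros such as __ILP32__ that are not defined on every 32-bit build.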
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Grazvydas Ignotas [Sun, 5 Mar 2017 20:58:52 +0000 (22:58 +0200)]
util/disk_cache: mark read-only arguments const
No functional changes.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Dave Airlie [Sun, 5 Mar 2017 20:05:58 +0000 (06:05 +1000)]
radeon/ac: fix intrinsic version check
Reported-by: 375gnu@gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100068
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 5 Mar 2017 10:19:08 +0000 (11:19 +0100)]
radv: Merge fast clear flushes.
Don't flush multiple times if we clear multiple attachments. Also allows
doing the depth clear in parallel with the fast color clears.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tim Rowley [Fri, 3 Mar 2017 17:20:35 +0000 (11:20 -0600)]
relnotes: [swr] note addition of gs, increased llvm requirement
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 2 Mar 2017 22:45:53 +0000 (16:45 -0600)]
docs: update features.txt for swr geometry shaders
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 2 Mar 2017 22:41:22 +0000 (16:41 -0600)]
swr: [rasterizer core] fix primID provoking vertex for GS
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 2 Mar 2017 22:41:02 +0000 (16:41 -0600)]
swr: implement geometry shaders
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 2 Mar 2017 22:38:43 +0000 (16:38 -0600)]
configure.ac: increase required swr llvm to 3.9.0
GS implementation uses the masked.{gather,store} intrinsics,
introduced in llvm-3.9.0. swr llvm version requirement in
automake and scons now match (scons already needed >= 3.9).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Kenneth Graunke [Fri, 3 Mar 2017 07:28:42 +0000 (23:28 -0800)]
i965: Clamp texture buffer size to GL_MAX_TEXTURE_BUFFER_SIZE.
The OpenGL 4.5 specification's description of TexBuffer says:
"The number of texels in the texture image is then clamped to an
implementation-dependent limit, the value of MAX_TEXTURE_BUFFER_SIZE."
We set GL_MAX_TEXTURE_BUFFER_SIZE to 2^27. For buffers with a byte
element size, this is the maximum possible size we can encode in
SURFACE_STATE. If you bind a buffer object larger than this as a
texture buffer object, we'll exceed that limit and hit an isl assert:
assert(num_elements <= (1ull << 27));
To fix this, clamp the size in bytes to MaxTextureBufferSize * texel_size.
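The spec language quoted above limits the number of texels, so the clamp can be sketched in those terms. This is an illustration only: MAX_TEXTURE_BUFFER_TEXELS and clamped_num_texels are hypothetical names, and the limit is taken from the 2^27 value mentioned above.

```c
#include <stdint.h>

/* Assumed limit, matching the 2^27 value mentioned in the text. */
#define MAX_TEXTURE_BUFFER_TEXELS (UINT64_C(1) << 27)

/* Number of texels exposed for a buffer of size_bytes bytes whose texels
 * are texel_size bytes each, clamped to the implementation limit.
 * texel_size is assumed to be non-zero. */
static uint64_t clamped_num_texels(uint64_t size_bytes, uint32_t texel_size)
{
    uint64_t num_texels = size_bytes / texel_size;

    return num_texels < MAX_TEXTURE_BUFFER_TEXELS ? num_texels
                                                  : MAX_TEXTURE_BUFFER_TEXELS;
}
```

With the clamp in place, the element count never exceeds the bound checked by the assert quoted above.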
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Emil Velikov [Sat, 4 Mar 2017 21:42:18 +0000 (21:42 +0000)]
automake: move wayland-drm prior to Vulkan
Earlier commit was picked from a larger series, but did not consider
that it removed the vulkan <> wayland-drm interdependency.
Rather than reverting everything, temporarily move wayland-drm further
up to resolve the issue. Since it [wayland-drm] does not have any
in-mesa dependencies that's perfectly safe.
Cc: Vedran Miletić <vedran@miletic.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100060
Fixes:
e135ce6f088 ("vulkan: Build common Vulkan code earlier")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Javier Jardón <jjardon@gnome.org>
Mauro Rossi [Sat, 4 Mar 2017 21:11:27 +0000 (22:11 +0100)]
android: fix libz dynamic library dependencies
Fixes a series of libz related building errors:
target SharedLib: gallium_dri_32
(out/target/prod...SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so)
external/elfutils/libelf/elf_compress.c:117: error: undefined reference to 'deflateInit_'
...
external/elfutils/libelf/elf_compress.c:244: error: undefined reference to 'inflateEnd'
clang++: error: linker command failed with exit code 1 (use -v to see
invocation)
Fixes: 85a9b1b "util/disk_cache: compress individual cache entries"
Timothy Arceri [Thu, 2 Mar 2017 23:06:24 +0000 (10:06 +1100)]
svga: pass NULL to ureg_get_tokens()
The number of tokens is never used and the pointer is NULL checked
so just pass NULL.
Reviewed-by: Brian Paul <brianp@vmware.com>
Ilia Mirkin [Fri, 3 Mar 2017 01:18:24 +0000 (20:18 -0500)]
nvc0: take extra pushbuf space into account for pushbuf_space calls
See detailed explanation of why this is needed in commit
eb60a89bc3a.
This spot was missed/overlooked. Basically as a result of the fact
that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8
to the requested amount, we have to be mindful of that when doing bare
nouveau_pushbuf_space calls.
Reportedly this fixes some crashes when replaying a hitman trace taken
on radeonsi.
Fixes:
eb60a89bc3a ("nouveau: take extra push space into account for pushbuf_space calls")
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Ilia Mirkin [Wed, 1 Mar 2017 16:09:30 +0000 (11:09 -0500)]
nvc0: increase alignment to 256 for texture buffers on fermi
When binding as textures, the alignment can be 16. However when binding
as an image, the address has to be aligned to 256. (Also when binding as
an RT, but that can't happen with GL or current gallium APIs.)
Reported-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tapani Pälli [Fri, 3 Mar 2017 10:52:56 +0000 (12:52 +0200)]
android: fix outdir for gen_enum_to_str files
When files are being generated, the content of the $intermediates variable
can be completely random; this makes sure that outdir is the wanted one.
Fixes:
3f2cb699 ("android: vulkan: add support for libmesa_vulkan_util")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Xiaosong Wei [Wed, 8 Feb 2017 02:46:02 +0000 (10:46 +0800)]
EGL/Android: Add EGL_EXT_buffer_age extension
This patch implements the EGL_EXT_buffer_age extension for Android.
https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_buffer_age.txt
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Sat, 4 Mar 2017 15:56:58 +0000 (15:56 +0000)]
docs: add news item and link release notes for 17.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Sat, 4 Mar 2017 15:53:51 +0000 (15:53 +0000)]
docs: add sha256 checksums for 17.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
5c9273152c59777771fa6c7b546316caf3f091d8)
Emil Velikov [Sat, 4 Mar 2017 15:44:59 +0000 (15:44 +0000)]
docs: add release notes for 17.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
8fee1d348cc3d91a88319c0d72689acabaa2bf47)
Emil Velikov [Fri, 3 Mar 2017 19:25:40 +0000 (19:25 +0000)]
gallium/targets: don't leave an empty target directory(ies)
Some drivers do not support certain targets - for example nouveau
doesn't do VAAPI, while freedreno doesn't do any of the video backends.
As such if we enter vdpau when building freedreno/ilo/etc, a vdpau/
folder will be created, an empty library will be built and almost
immediately removed, leaving an empty vdpau/ folder around.
There are two ways to fix this.
* add substantial tracking in configure/makefiles so that we never end
up in targets/vdpau
Downsides:
Error prone, as the configure checks and the 'include
gallium/drivers/foo/Automake.inc' can easily get out of sync.
* remove the folder, if empty, alongside the empty library.
Downsides:
In the latter case vdpau/ might be empty before the mesa build has
started, yet we'll remove it either way.
This patch implements the latter option, as the downside isn't that
significant, plus the patch is way shorter ;-)
v2: use has_drivers to track since TARGET_DRIVERS can contain space,
hence neither string comparison nor -n/-z works correctly.
Gentoo Bugzilla: https://bugs.gentoo.org/545230
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Wed, 1 Mar 2017 12:09:14 +0000 (12:09 +0000)]
radv: use enum_to_str util functions.
Port of
e9dcb17962f7e58a81c93bae7bd33885675b1043
vulkan/util: Add generator for enum_to_str functions
Cc: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jason Ekstrand [Thu, 2 Mar 2017 03:18:56 +0000 (19:18 -0800)]
vulkan: Build common Vulkan code earlier
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jason Ekstrand [Fri, 24 Feb 2017 00:18:00 +0000 (16:18 -0800)]
anv: Advertise shaderInt64 on Broadwell and above
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 1 Mar 2017 23:20:31 +0000 (15:20 -0800)]
nir/int64: Properly handle imod/irem
The previous implementation was fine for GLSL which doesn't really have
a signed modulus/remainder. They just leave the behavior undefined
whenever either source is negative. However, in SPIR-V, there is a
defined behavior for negative arguments. This commit beefs up the pass
so that it handles both correctly. Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.
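The difference between the two operations for negative operands can be illustrated in plain C. This only shows the expected results; it is not the NIR lowering pass. It assumes the usual convention that the remainder takes the sign of the dividend and the modulus takes the sign of the divisor, and the function names are hypothetical.

```c
#include <stdint.h>

/* Remainder: the result takes the sign of the dividend a, which is what
 * C's % operator produces. b == 0 and INT64_MIN % -1 are undefined in C
 * and are not handled here. */
static int64_t irem64(int64_t a, int64_t b)
{
    return a % b;
}

/* Modulus: the result takes the sign of the divisor b. Start from the
 * remainder and shift it by b when the signs disagree. */
static int64_t imod64(int64_t a, int64_t b)
{
    int64_t r = a % b;

    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

For example, with a = -7 and b = 3 the remainder is -1 while the modulus is 2; both functions agree whenever the operands have the same sign.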
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Fri, 24 Feb 2017 05:35:00 +0000 (21:35 -0800)]
nir/builder: Add an int64 immediate helper
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sat, 4 Jun 2016 01:09:55 +0000 (18:09 -0700)]
genxml: Fill out Gen4 and G45 XML.
This is a work in progress - some things may still need fixing.
But it should be in pretty decent shape.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Sat, 25 Feb 2017 23:41:37 +0000 (00:41 +0100)]
ac: normalize build helper names
s/emit/build/
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sat, 25 Feb 2017 22:48:23 +0000 (23:48 +0100)]
ac: replace SI.vs.load.input with amdgcn.buffer.load.format
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sat, 25 Feb 2017 22:40:52 +0000 (23:40 +0100)]
radeonsi: move SI.vs.load.input building into amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Tue, 15 Nov 2016 23:26:47 +0000 (00:26 +0100)]
radeonsi: detect and mark loads/stores from read-only/write-only memory
Marek Olšák [Fri, 24 Feb 2017 01:09:47 +0000 (02:09 +0100)]
ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0
ADD_TID doesn't work. Needs more investigation.
v2: remove leftover dead code
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Marek Olšák [Fri, 24 Feb 2017 01:14:52 +0000 (02:14 +0100)]
radeonsi: use the writeonly LLVM attribute
Marek Olšák [Fri, 24 Feb 2017 19:23:23 +0000 (20:23 +0100)]
ac: remove offen parameter from ac_build_buffer_store_dword
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 00:45:31 +0000 (01:45 +0100)]
radeonsi: enable TC L2 for tessellation offchip stores
Vulkan does the same thing.
Marek Olšák [Fri, 24 Feb 2017 00:20:35 +0000 (01:20 +0100)]
radeonsi: merge and simplify tbuffer_store functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 22:14:35 +0000 (23:14 +0100)]
radeonsi: set noalias on input shader pointers
Marek Olšák [Fri, 24 Feb 2017 22:06:31 +0000 (23:06 +0100)]
radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 21:44:18 +0000 (22:44 +0100)]
radeonsi: move kill intrinsic building into amd/common
just a cleanup
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 16:16:28 +0000 (17:16 +0100)]
radeonsi: set readnone on reads from read-only memory
Marek Olšák [Fri, 24 Feb 2017 15:54:05 +0000 (16:54 +0100)]
radeonsi: replace SI.buffer.load.dword with amdgcn.buffer.load
Marek Olšák [Fri, 24 Feb 2017 15:38:25 +0000 (16:38 +0100)]
radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz
Marek Olšák [Thu, 23 Feb 2017 22:37:59 +0000 (23:37 +0100)]
ac: replace old image intrinsics with new ones
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 23:52:07 +0000 (00:52 +0100)]
radeonsi: remove last use of llvm.SI.resinfo
and move one function up to reuse the code.
Marek Olšák [Thu, 23 Feb 2017 22:00:19 +0000 (23:00 +0100)]
radeonsi: move image intrinsic building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 01:15:54 +0000 (02:15 +0100)]
ac: replace SI.export with amdgcn.exp.*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 01:06:40 +0000 (02:06 +0100)]
radeonsi: move llvm.SI.export building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 21:58:49 +0000 (22:58 +0100)]
ac: unify build_type_name_for_intr functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 23 Feb 2017 21:15:17 +0000 (22:15 +0100)]
radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as well
It was harmless, because we also set unorm in the sampler state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 24 Feb 2017 01:11:07 +0000 (02:11 +0100)]
gallivm, ac: add writeonly and inaccessiblememonly attributes
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Tue, 15 Nov 2016 23:47:35 +0000 (00:47 +0100)]
tgsi/scan: record load/store/atomic image usage
Reviewed-by: Dave Airlie <airlied@redhat.com>
Eric Anholt [Wed, 27 Feb 2013 20:58:58 +0000 (12:58 -0800)]
glapi: Fix a comment typo
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Alejandro Piñeiro [Thu, 2 Mar 2017 16:18:14 +0000 (17:18 +0100)]
mesa/main: *TextureSubImage* generates INVALID_OPERATION on wrong target
Equivalent *TexSubImage* methods generate INVALID_ENUM.
From OpenGL 4.5 spec, section 8.6 Alternate Texture Image
Specification Commands:
"An INVALID_ENUM error is generated by *TexSubImage* if target does
not match the command, as shown in table 8.15."
And:
"An INVALID_OPERATION error is generated by *TextureSubImage* if
the effective target of texture does not match the command, as
shown in table 8.15."
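The two rules quoted above amount to choosing the error code by entry point. A minimal sketch, assuming a hypothetical helper that is told whether the call came from the direct state access (*TextureSubImage*) variant; the two constants are the standard GL enum values.

```c
#include <stdbool.h>

#define GL_INVALID_ENUM      0x0500
#define GL_INVALID_OPERATION 0x0502

/* Error to raise when the target does not match the command.
 * dsa is true for *TextureSubImage*, false for *TexSubImage*. */
static unsigned target_mismatch_error(bool dsa)
{
    return dsa ? GL_INVALID_OPERATION : GL_INVALID_ENUM;
}
```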
Fixes:
GL45-CTS.direct_state_access.textures_copy_errors
v2: slightly change commit summary (Samuel)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Ben Widawsky [Fri, 3 Mar 2017 01:47:02 +0000 (17:47 -0800)]
i965: Add Kaby Lake brandstrings
While here, use the spacing defined in Ark.
https://ark.intel.com/products/codename/82879/Kaby-Lake
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Grazvydas Ignotas [Thu, 2 Mar 2017 22:46:53 +0000 (00:46 +0200)]
tgsi/ureg: return correct token count in ureg_get_tokens
Valgrind reports that the shader cache writes uninitialized data to disk.
Turns out ureg_get_tokens() is returning the count of allocated tokens
instead of how many are actually used, so the cache writes out unused
space at the end. Use the real count instead.
This change should not cause regressions elsewhere because the only
ureg_get_tokens() user that cares about token count is the shader cache.
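The distinction between allocated and used tokens can be sketched with a small growable buffer. The struct and function names below are hypothetical and are not the tgsi/ureg API; the point is only that the getter reports the used count.

```c
#include <stdlib.h>

/* Hypothetical growable token buffer: 'size' slots are allocated, but
 * only the first 'count' of them have been written. */
struct token_buf {
    unsigned *tokens;
    unsigned size;
    unsigned count;
};

/* Append one token, doubling the allocation when it is full.
 * Returns 0 on success and -1 if the allocation fails. */
static int token_buf_push(struct token_buf *b, unsigned tok)
{
    if (b->count == b->size) {
        unsigned new_size = b->size ? b->size * 2 : 4;
        unsigned *p = realloc(b->tokens, new_size * sizeof(*p));

        if (!p)
            return -1;
        b->tokens = p;
        b->size = new_size;
    }
    b->tokens[b->count++] = tok;
    return 0;
}

/* Return the tokens and report how many are in use. Reporting b->size
 * instead would make a caller that serializes the buffer write out
 * uninitialized slots past the end of the used range. */
static const unsigned *token_buf_get(const struct token_buf *b,
                                     unsigned *nr_tokens)
{
    if (nr_tokens)
        *nr_tokens = b->count;
    return b->tokens;
}
```

Passing NULL for nr_tokens is allowed, which matches callers that do not care about the count.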
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Timothy Arceri [Tue, 24 Jan 2017 16:08:22 +0000 (17:08 +0100)]
radeonsi: add support for an on-disk shader cache
V2:
- when loading from disk cache also binary insert into memory cache.
- check that the binary loaded from disk is the correct size. If not
delete the cache item and skip loading from cache.
V3:
- remove unrequired variable
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 1 Mar 2017 05:04:23 +0000 (16:04 +1100)]
util/disk_cache: compress individual cache entries
This reduces the cache size for Deus Ex from ~160M to ~30M for
radeonsi (these numbers differ from Grigori's results below
probably due to different graphics quality settings).
I'm also seeing the following improvements in minimum fps in the
Shadow of Mordor benchmark on an i5-6400 CPU@2.70GHz, with a HDD:
no-cache: ~10fps
with-cache-no-compression: ~15fps
with-cache-and-compression: ~20fps
Note: The with cache results are from the second run after closing
and opening the game to avoid the in-memory cache.
Since we mainly care about decompression I went with
Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
who has benchmarked decompression speeds.
Grigori Goronzy provided the following stats for Deus Ex: Mankind
Divided start-up times on an Athlon X4 860k with a SSD:
No Cache 215 sec
Cold Cache zlib BEST_COMPRESSION 285 sec
Warm Cache zlib BEST_COMPRESSION 33 sec
Cold Cache zlib BEST_SPEED 264 sec
Warm Cache zlib BEST_SPEED 33 sec
Cold Cache no compression 266 sec
Warm Cache no compression 34 sec
The total cache size for that game is 48 MiB with BEST_COMPRESSION,
56 MiB with BEST_SPEED and 170 MiB with no compression.
These numbers suggest that it may be ok to go with Z_BEST_SPEED
but we should gather some actual decompression times before doing
so. Another option might be to do the compression in a separate
thread; this might allow us to use a higher-compression algorithm
such as LZMA.
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 24 Feb 2017 04:14:56 +0000 (15:14 +1100)]
util/disk_cache: add support for detecting corrupt cache entries
V2: fix pointer increments for writing/reading crc
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Samuel Pitoiset [Wed, 1 Mar 2017 21:09:28 +0000 (22:09 +0100)]
glsl: fix subroutine mismatch between declarations/definitions
Previously, when q.subroutine was set to 1, a new subroutine
declaration was added to the AST, while 0 meant a subroutine
definition has been detected by the parser.
Thus, setting the q.subroutine flag in both situations is
obviously wrong because a new type identifier is added instead
of trying to match the declaration. To fix it up, introduce
ast_type_qualifier::is_subroutine_decl() to differentiate
declarations and definitions easily.
This fixes a regression with:
arb_shader_subroutine/compiler/direct-call.vert
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes:
be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Matt Turner [Thu, 2 Mar 2017 19:05:34 +0000 (11:05 -0800)]
genxml: Depend on Makefile.am for generated sources.
Depending on the generated Makefile means that all generated sources are
recreated after ./configure.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Matt Turner [Thu, 2 Mar 2017 04:43:21 +0000 (04:43 +0000)]
clover: Work around build failure with AltiVec.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504
Acked-by: Francisco Jerez <currojerez@riseup.net>
Nanley Chery [Tue, 31 Jan 2017 20:23:18 +0000 (12:23 -0800)]
anv/image: Allow HiZ on input attachment-capable depth/stencil images
While an input attachment may only take on one of those two layouts,
other depth/stencil attachments that use the same image may have
HiZ-enabled layouts. Improves the average frame rate on a release
candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my
SKL GT4.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Thu, 23 Feb 2017 20:11:58 +0000 (12:11 -0800)]
anv/cmd_buffer: Centralize automatic layout transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Wed, 1 Feb 2017 00:42:58 +0000 (16:42 -0800)]
anv/cmd_buffer: Add attachment transitioning functions
This is needed to transition input attachments.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 20:31:36 +0000 (12:31 -0800)]
anv/blorp: Encapsulate subpass id querying
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 22:09:03 +0000 (14:09 -0800)]
anv/cmd_buffer: Enable render pass awareness
v2: Update cmd_state_reset (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Sat, 25 Feb 2017 23:57:32 +0000 (15:57 -0800)]
anv/pass: Store subpass attachment reference list
We'll loop through this array when performing automatic layout
transitions.
v2: Adjust formatting of an assignment (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Mon, 27 Feb 2017 17:38:25 +0000 (09:38 -0800)]
anv/pass: Fix size of anv_render_pass:subpass_attachments
Don't allocate space for resolve attachments if the subpass has none.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Wed, 1 Feb 2017 00:12:50 +0000 (16:12 -0800)]
anv: Store the user's VkAttachmentReference
We will be using the image layout. Store the full struct directly from
the user.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nanley Chery [Tue, 31 Jan 2017 19:36:22 +0000 (11:36 -0800)]
anv/cmd_buffer: Remove extra resolve for certain depth buffers
Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer
when reading from a depth image subresource that is in the general
layout. Remove this unneeded resolve.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>