Dylan Baker [Tue, 6 Mar 2018 18:36:09 +0000 (10:36 -0800)]
meson: Fix indent in omx meson.build
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Dylan Baker [Tue, 6 Mar 2018 18:36:42 +0000 (10:36 -0800)]
meson: Use include directory variables instead of traversing
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Dylan Baker [Tue, 6 Mar 2018 18:11:38 +0000 (10:11 -0800)]
meson: Re-add auto option for omx
This re-adds the auto option for omx. Without it we default to tizonia
and the build fails almost immediately, which is especially obnoxious
for those building a driver that doesn't support the OMX state tracker
to begin with.
v2: - Only define OMX_FOO for auto cases if the dependencies are found.
This fixes building tizonia with auto (Julien, Eric)
CC: Gurkirpal Singh <gurkirpal204@gmail.com>
Fixes: bb5e27fab6087a5c1528a5faf507acce700e883c ("st/omx/bellagio: Rename st and target directories")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com> (v1)
Dylan Baker [Tue, 6 Mar 2018 19:33:16 +0000 (11:33 -0800)]
meson: fix tizonia compilation
It needs to have src/egl in its includes as well.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Dylan Baker [Tue, 6 Mar 2018 19:32:23 +0000 (11:32 -0800)]
meson: combine state trackers and target if blocks
This is needed later since tizonia requires dri
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Marek Olšák [Fri, 23 Feb 2018 19:42:41 +0000 (20:42 +0100)]
st/mesa: expose 0 shader binary formats for compat profiles for Qt
Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Roland Scheidegger [Tue, 6 Mar 2018 20:33:16 +0000 (21:33 +0100)]
draw: fix line stippling with aa lines
In contrast to non-aa, where stippling is based on either dx or dy
(depending on whether it's an x- or y-major line), stippling is based on
actual distance with smooth lines, so adjust for this.
(It looks like there's some minor artifacts with mesa demos
line-sample and stippling, it looks like the line endpoints
aren't quite right with aa + stippling - maybe due to the
integer math in the stipple stage, but I can't quite pinpoint it.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Tue, 6 Mar 2018 18:16:45 +0000 (19:16 +0100)]
draw: simplify (and correct) aaline fallback (v2)
The motivation actually was to get rid of the additional tex
instruction, since that requires the draw fallback code to intercept
all sampler / view calls (even if the fallback is never hit).
Basically, the idea is to use coverage of the pixel to calculate
the alpha value, and coverage is simply based on the distance
to the center of the line (in both line direction, which is useful
for wide lines, as well as perpendicular to the line).
This is much closer to what hw supporting this natively actually does.
It also fixes an issue with line width not quite being correct, as
well as endpoints getting stretched too far (in line direction) with
wide lines, which is apparent with mesa demo line-sample.
(For llvmpipe, it would probably make sense to do something like this
directly when drawing lines, since rendering two tris is twice as
expensive as a line, but it would need some changes with state
management.)
Since we're no longer relying on mipmapping to get the alpha value,
we also don't need to draw 3 rects (6 tris), one is sufficient.
There are still issues (as before):
- quite sure it's not correct without half_pixel_center, but can't test
this with GL.
- aaline + line stipple is incorrect (evident with line-sample demo).
Looking at the spec the stipple pattern should actually be based on
distance (not just dx or dy for x/y major lines as without aa).
- outputs (other than pos + the one used for line aa) should be
reinterpolated since we actually increase line length by half a pixel
(but there's no tests which would care).
v2: simplify the math (should be equivalent), don't need immediate
v3: use float versions of atan2,cos,sin, minor cleanups
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Bas Nieuwenhuizen [Wed, 7 Mar 2018 15:38:32 +0000 (16:38 +0100)]
radv: Don't emit a warning on VI-GFX9.
We are conformant:
https://www.khronos.org/conformance/adopters/conformant-products#submission_308
v2: Actually not emit it on gfx9.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Tue, 6 Feb 2018 00:40:00 +0000 (01:40 +0100)]
radv: Enable vulkan 1.1.0 for configurations that can support it.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 22 Jan 2018 21:22:41 +0000 (22:22 +0100)]
radv: Disable sampler ycbcr conversion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 23:34:08 +0000 (00:34 +0100)]
radv: Expose that we don't support any VK_KHR_16_bit_storage parts.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 21:34:11 +0000 (22:34 +0100)]
radv: Implement vkEnumerateInstanceVersion.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 16:13:26 +0000 (17:13 +0100)]
radv: Add trivial device group implementation.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 15:32:38 +0000 (16:32 +0100)]
radv: Implement vkCmdDispatchBase.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 15:11:48 +0000 (16:11 +0100)]
radv: Implement vkGetDeviceQueue2.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:59:45 +0000 (15:59 +0100)]
radv: Support VkPhysicalDeviceProtectedMemoryFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:57:59 +0000 (15:57 +0100)]
radv: Support VkPhysicalDeviceShaderDrawParameterFeatures.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:53:03 +0000 (15:53 +0100)]
radv: Implement VK_KHR_maintenance3.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 14:06:10 +0000 (15:06 +0100)]
radv: Add minimal subgroup support.
Deliberately not implementing workgroup scopes as that is not needed
for core vulkan.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 12:55:26 +0000 (13:55 +0100)]
radv: Change client version check.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 21 Jan 2018 12:39:22 +0000 (13:39 +0100)]
radv: Update MAX_API_VERSION to 1.1.0
v2: Don't bump supported version.
v3: Update json files.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 5 Feb 2018 21:54:18 +0000 (22:54 +0100)]
ac/nir: Add vote_ieq/vote_feq lowering pass.
The old vote_eq implementation supported only booleans, but now
we have to support arbitrary values, so use the read_first_invocation
intrinsic + ballot.
I took this as an opportunity to figure out how easy it was to do this
in nir instead of in the nir_to_llvm pass, and it actually turned out
pretty okay IMO. Only creating the pass is some extra code.
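A minimal sketch of the read_first_invocation + ballot idea described above, written as a plain Python simulation rather than as NIR. The function names and the list-based model of a subgroup are assumptions made for illustration; the sketch assumes at least one invocation is active.

```python
def read_first_invocation(values, active):
    """Return the value held by the lowest-numbered active invocation."""
    first = next(i for i, on in enumerate(active) if on)
    return values[first]


def ballot(predicate, active):
    """Bitmask with bit i set when invocation i is active and predicate[i] holds."""
    mask = 0
    for i, (p, on) in enumerate(zip(predicate, active)):
        if on and p:
            mask |= 1 << i
    return mask


def vote_ieq(values, active):
    """True when every active invocation holds the same value."""
    first = read_first_invocation(values, active)
    same = [v == first for v in values]
    # All active invocations agree when the "same" ballot covers every active bit.
    return ballot(same, active) == ballot([True] * len(values), active)
```

Inactive invocations never contribute a bit to either ballot, so a differing value in an inactive lane does not affect the result.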
Reviewed-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Fri, 10 Nov 2017 03:17:29 +0000 (19:17 -0800)]
anv: Support version overrides
While always sketchy to do, this is useful for debugging.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 10 Nov 2017 03:17:17 +0000 (19:17 -0800)]
vulkan/util: Add a helper to get a version override
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 22 Sep 2017 14:44:10 +0000 (07:44 -0700)]
anv: Enable Vulkan 1.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 28 Apr 2017 08:22:39 +0000 (01:22 -0700)]
anv: Add support for SPIR-V 1.3 subgroup operations
This requires us to bump the subgroup size to 32 for all shader stages
because Vulkan requires that to be a physical device query.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 1 Sep 2017 22:18:02 +0000 (15:18 -0700)]
intel/fs: Add support for subgroup quad operations
NIR has code to lower these away for us but we can do significantly
better in many cases with register regioning and SIMD4x2.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 1 Sep 2017 05:12:48 +0000 (22:12 -0700)]
intel/fs: Implement reduce and scan operations
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 1 Sep 2017 04:50:31 +0000 (21:50 -0700)]
intel/fs: Add a helper for emitting scan operations
This commit adds a helper to the builder for emitting "scan" operations.
Given a binary operation #, a scan takes the vector [a0, a1, ..., aN]
and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each
channel contains the combination of that channel and all previous channels. The sequence
of instructions to perform the scan is fairly optimal; a 16-wide scan on
a 32-bit type is only 6 instructions. The subgroup scan and reduction
operations will be implemented in terms of this.
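To make the definition of a scan concrete, here is a small sequential sketch in Python. It only illustrates what the result vector contains; it says nothing about the instruction sequence the builder helper emits, and the names inclusive_scan and reduce_from_scan are hypothetical.

```python
import operator


def inclusive_scan(op, values):
    """Return [a0, a0 # a1, ..., a0 # a1 # ... # aN] for the binary operation op."""
    out = []
    acc = None
    for i, v in enumerate(values):
        acc = v if i == 0 else op(acc, v)
        out.append(acc)
    return out


def reduce_from_scan(op, values):
    """A reduction is the last channel of the inclusive scan."""
    return inclusive_scan(op, values)[-1]
```

For example, inclusive_scan(operator.add, [1, 2, 3, 4]) gives the running sums, and the last element is the reduction over the whole vector.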
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 1 Sep 2017 04:45:30 +0000 (21:45 -0700)]
intel/fs: Add a couple of simple helper opcodes
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Aug 2017 03:10:35 +0000 (20:10 -0700)]
spirv: Add support for subgroup arithmetic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Aug 2017 03:36:55 +0000 (20:36 -0700)]
nir: Add a helper for getting binop identities
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 30 Aug 2017 03:09:58 +0000 (20:09 -0700)]
nir: Add subgroup arithmetic reduction intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Aug 2017 17:21:31 +0000 (10:21 -0700)]
spirv: Add subgroup quad support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Aug 2017 17:20:56 +0000 (10:20 -0700)]
nir: Add quad operations and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Aug 2017 16:21:32 +0000 (09:21 -0700)]
i965/fs: Add support for nir_intrinsic_shuffle
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Aug 2017 16:44:44 +0000 (09:44 -0700)]
spirv: Add subgroup shuffle support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Thu, 7 Dec 2017 05:41:47 +0000 (21:41 -0800)]
nir: Add subgroup shuffle intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Aug 2017 00:38:53 +0000 (17:38 -0700)]
i965/fs: Support nir_intrinsic_vote_feq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 29 Aug 2017 02:55:34 +0000 (19:55 -0700)]
nir/lower_subgroups: Add scalarizing for vote_eq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 24 Aug 2017 18:01:22 +0000 (11:01 -0700)]
spirv: Add subgroup vote support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 29 Aug 2017 00:33:33 +0000 (17:33 -0700)]
nir: Generalize nir_intrinsic_vote_eq
The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type. This has two implications:
1) We need to be able to handle vectors so we switch the vote_eq
intrinsics to be vectorized intrinsics.
2) We need to handle floats which have different behavior with respect
to +-0, NaN, etc. than the integer variant so we need two variants.
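The difference between the float and integer variants can be shown with a short Python sketch. It compares 32-bit float values once with floating-point equality and once by their bit patterns; the helper names are hypothetical and the code is not taken from NIR.

```python
import struct


def float_bits(x):
    """Bit pattern of x when stored as a 32-bit IEEE 754 float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def all_equal_float(values):
    """Float semantics: +0 equals -0, and NaN never equals anything."""
    first = values[0]
    return all(v == first for v in values)


def all_equal_bits(values):
    """Integer semantics: compare the raw 32-bit patterns."""
    first = float_bits(values[0])
    return all(float_bits(v) == first for v in values)
```

With [0.0, -0.0] the float comparison reports equality while the bitwise one does not; with two identical NaN values the opposite happens.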
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 22 Aug 2017 23:53:05 +0000 (16:53 -0700)]
spirv: Add subgroup ballot support
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 22 Aug 2017 05:17:37 +0000 (22:17 -0700)]
i965/fs: Implement basic SPIR-V subgroup intrinsics
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Fri, 28 Apr 2017 11:45:50 +0000 (04:45 -0700)]
spirv: Add initial subgroup support
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 9 May 2017 23:44:13 +0000 (16:44 -0700)]
nir: Add new SPIR-V ballot intrinsics and lowering
Someone can make the lowering optional later if they want something
different for their hardware.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Sat, 30 Sep 2017 21:50:40 +0000 (14:50 -0700)]
compiler: Add two new system values for subgroups
This will be required for SPIR-V subgroup support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 3 Oct 2017 01:19:44 +0000 (18:19 -0700)]
nir: Add new SPIR-V ballot ALU intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 11 Oct 2017 23:29:28 +0000 (16:29 -0700)]
spirv: Handle the new OpModuleProcessed instruction
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 11 Oct 2017 23:06:13 +0000 (16:06 -0700)]
anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER
From the Vulkan 1.1 spec:
"Vulkan 1.0 implementations were required to return
VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0.
Implementations that support Vulkan 1.1 or later must not return
VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion."
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 12 Oct 2017 01:09:32 +0000 (18:09 -0700)]
anv: Implement vkEnumerateInstanceVersion
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Iago Toral Quiroga [Tue, 6 Feb 2018 09:37:16 +0000 (10:37 +0100)]
anv/device: fail to initialize device if we have queues with unsupported flags
This is not strictly necessary since users should not be requesting any
flags that are not valid for the list of enabled features requested and
we already fail if they attempt to use an unsupported feature, however
it is an easy to implement sanity check that would help developers realize
that they are doing things wrong, so we might as well do it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Iago Toral Quiroga [Tue, 6 Feb 2018 09:06:30 +0000 (10:06 +0100)]
anv/device: GetDeviceQueue2 should only return queues with matching flags
From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure:
"The queue returned by vkGetDeviceQueue2 must have the same flags value
from this structure as that used at device creation time in a
VkDeviceQueueCreateInfo instance. If no matching flags were specified
at device creation time then pQueue will return VK_NULL_HANDLE."
For us this means no flags at all since we don't support any.
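The quoted rule can be modelled in a few lines of Python. This is only a sketch of the matching logic, not the driver code: the dictionary of created queues, the function name, and the use of None as a stand-in for VK_NULL_HANDLE are all assumptions, and the queue is assumed to have been created.

```python
VK_NULL_HANDLE = None  # stand-in for the real null handle
VK_DEVICE_QUEUE_CREATE_PROTECTED_BIT = 0x00000001


def get_device_queue2(created_queues, family_index, queue_index, flags):
    """Return the queue handle only when the requested flags match creation time.

    created_queues maps (family_index, queue_index) to (flags, handle).
    """
    created_flags, handle = created_queues[(family_index, queue_index)]
    if created_flags != flags:
        return VK_NULL_HANDLE
    return handle
```

A driver that supports no queue creation flags therefore only ever returns a queue when the caller passes flags equal to zero.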
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Fri, 22 Sep 2017 17:03:18 +0000 (10:03 -0700)]
anv: Support querying for protected memory
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 6 Oct 2017 02:29:27 +0000 (19:29 -0700)]
anv: Implement GetDeviceQueue2
This belongs to the protected memory feature but there's nothing about
it that's specific to protected memory.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Thu, 21 Sep 2017 20:54:55 +0000 (13:54 -0700)]
anv: Trivially implement VK_KHR_device_group
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Tue, 3 Oct 2017 22:23:07 +0000 (15:23 -0700)]
anv: Implement vkCmdDispatchBase
This is part of the device groups extension/feature but it's a decent
chunk of work in its own right so it's worth breaking into its own
patch. The mechanism we use is fairly straightforward: we just push the
base work group id into the shader and add it to the work group id we
get from dispatch.
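The mechanism amounts to a per-component addition. A tiny Python sketch, with a hypothetical function name and tuples standing in for the three-component IDs:

```python
def global_work_group_ids(base, count):
    """Work group IDs seen by the shader: the pushed base plus the dispatch-local ID."""
    bx, by, bz = base
    cx, cy, cz = count
    return [
        (bx + x, by + y, bz + z)
        for z in range(cz)
        for y in range(cy)
        for x in range(cx)
    ]
```

With a base of (0, 0, 0) this degenerates to an ordinary dispatch.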
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 21 Sep 2017 22:51:55 +0000 (15:51 -0700)]
nir/spirv: Add support for device groups
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 5 Oct 2017 23:03:29 +0000 (16:03 -0700)]
anv: Implement VK_KHR_maintenance3
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 13 Oct 2017 18:03:07 +0000 (11:03 -0700)]
anv: Support VkPhysicalDeviceShaderDrawParameterFeatures
This advertises the VK_KHR_shader_draw_parameters functionality as a
"core optional feature" in Vulkan 1.1.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Tue, 17 Oct 2017 04:48:11 +0000 (21:48 -0700)]
anv/entrypoints: Drop support for protect attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 20 Sep 2017 20:16:26 +0000 (13:16 -0700)]
Get rid of a bunch of KHR suffixes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 20 Sep 2017 19:18:10 +0000 (12:18 -0700)]
anv: Add version 1.1.0 but leave it disabled
This requires us to rename any Vulkan API entrypoints which became core
in 1.1 to no longer have the KHR suffix.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Mon, 21 Aug 2017 23:15:36 +0000 (16:15 -0700)]
spirv: Update the SPIR-V headers and json to 1.3.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Tue, 19 Sep 2017 20:04:13 +0000 (13:04 -0700)]
vulkan: Update the XML and headers to 1.1.70
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 21 Sep 2017 15:26:06 +0000 (08:26 -0700)]
vulkan/enum_to_str: Add support for aliases and new Vulkan versions
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 24 Jan 2018 03:43:00 +0000 (19:43 -0800)]
vulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Tue, 17 Oct 2017 04:46:55 +0000 (21:46 -0700)]
anv/entrypoints: Generate #ifdef guards from platform attributes
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 22 Sep 2017 14:36:39 +0000 (07:36 -0700)]
anv/extensions: Add support for multiple API versions
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Tue, 19 Sep 2017 21:44:26 +0000 (14:44 -0700)]
anv/entrypoints_gen: Add support for aliases in the XML
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 24 Jan 2018 03:18:08 +0000 (19:18 -0800)]
anv/entrypoints: Allow an entrypoint to require multiple extensions
In this case, we say an entrypoint is supported if ANY of the extensions
is supported. This is because, in the XML, entrypoints don't require
extensions so much as extensions require entrypoints.
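A rough Python sketch of that rule, under stated assumptions: the extension and entrypoint names below are made up, the helper names are hypothetical, and core entrypoints (which need no extension) are left out.

```python
def entrypoints_to_extensions(extension_commands):
    """Invert an extension -> entrypoints map, as the XML lists it, into
    an entrypoint -> extensions map."""
    required_by = {}
    for ext, commands in extension_commands.items():
        for cmd in commands:
            required_by.setdefault(cmd, []).append(ext)
    return required_by


def entrypoint_supported(entrypoint, required_by, enabled_extensions):
    """An entrypoint is supported if ANY extension that lists it is enabled."""
    return any(ext in enabled_extensions for ext in required_by.get(entrypoint, []))
```

Because the XML is organised by extension, the inversion step is what produces the "multiple extensions per entrypoint" case in the first place.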
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 24 Jan 2018 03:15:27 +0000 (19:15 -0800)]
anv/entrypoints: Add an is_device_entrypoint helper
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 20 Sep 2017 19:38:12 +0000 (12:38 -0700)]
anv/entrypoints_gen: Allow the string map to grow
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 20 Sep 2017 16:41:50 +0000 (09:41 -0700)]
anv/entrypoints_gen: A bit of refactoring
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Wed, 20 Sep 2017 15:25:05 +0000 (08:25 -0700)]
anv/entrypoints: Generalize the string map a bit
The original string map assumed that the mapping from strings to
entrypoints was a bijection. This will not be true the moment we
add entrypoint aliasing. This reworks things to be an arbitrary map
from strings to non-negative signed integers. The old one also had a
potential bug if we ever had a hash collision because it didn't do the
strcmp inside the lookup loop. While we're at it, we break things out
into a helpful class.
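The two points above (a non-bijective map, and a string comparison inside the probe loop) can be sketched in Python as follows. This is an illustrative open-addressing table, not the generated C code; the class name, the hash function, and the entrypoint names in the usage are assumptions.

```python
class StringIntMap:
    """Open-addressing map from strings to non-negative integers."""

    def __init__(self, size=16, hash_fn=None):
        self.size = size
        self.slots = [None] * size
        self.count = 0
        self.hash_fn = hash_fn if hash_fn is not None else self._default_hash

    @staticmethod
    def _default_hash(key):
        # Hypothetical multiplicative hash, truncated to 32 bits.
        h = 0
        for byte in key.encode("ascii"):
            h = (h * 31 + byte) & 0xFFFFFFFF
        return h

    def add(self, key, value):
        assert value >= 0
        # Keep one slot empty so that probing always terminates.
        assert self.count < self.size - 1
        i = self.hash_fn(key) % self.size
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.size
        if self.slots[i] is None:
            self.count += 1
        self.slots[i] = (key, value)

    def lookup(self, key):
        i = self.hash_fn(key) % self.size
        while self.slots[i] is not None:
            stored_key, value = self.slots[i]
            # Compare the string itself: a matching hash alone is not enough.
            if stored_key == key:
                return value
            i = (i + 1) % self.size
        return -1
```

Two different strings may map to the same integer, which is what an alias needs, and a colliding hash is resolved by the string comparison instead of returning the wrong entry.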
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Tue, 19 Sep 2017 20:49:05 +0000 (13:49 -0700)]
vulkan: Rename multiview from KHX to KHR
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Wed, 23 Aug 2017 05:01:42 +0000 (22:01 -0700)]
spirv: Rework barriers
Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier. This commit completely reworks the way we emit barriers to
make things both more precise and more correct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Jason Ekstrand [Wed, 23 Aug 2017 05:16:01 +0000 (22:16 -0700)]
spirv: Add a vtn_constant_value helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Marek Olšák [Sat, 24 Feb 2018 23:39:44 +0000 (00:39 +0100)]
radeonsi: remove si_llvm_add_attribute
Marek Olšák [Sat, 24 Feb 2018 23:33:28 +0000 (00:33 +0100)]
radeonsi: fix passing address32_hi to LLVM for high values
The old function treats high values as negative, which LLVM interprets as 0.
Marek Olšák [Wed, 21 Feb 2018 22:07:05 +0000 (23:07 +0100)]
radeonsi: assume has_virtual_memory == true
Marek Olšák [Thu, 22 Feb 2018 16:13:51 +0000 (17:13 +0100)]
radeonsi: add/update assertions for 32-bit address space
Marek Olšák [Thu, 22 Feb 2018 19:21:42 +0000 (20:21 +0100)]
radeonsi: prevent a negative buffer offset in si_upload_descriptors
Marek Olšák [Wed, 21 Feb 2018 23:28:39 +0000 (00:28 +0100)]
radeonsi: properly extract a buffer address from a descriptor
Marek Olšák [Wed, 21 Feb 2018 22:33:38 +0000 (23:33 +0100)]
radeonsi: fix vertex buffer address computation with full 64-bit addresses
Marek Olšák [Wed, 21 Feb 2018 22:30:41 +0000 (23:30 +0100)]
radeonsi: mask out high VM address bits in registers where needed
Bas Nieuwenhuizen [Tue, 6 Mar 2018 00:08:43 +0000 (01:08 +0100)]
radv: Add entrypoints generation with the new vk.xml
A lot of it is based on intel again.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Simon Hausmann [Wed, 14 Feb 2018 11:51:11 +0000 (12:51 +0100)]
glsl: Fix memory leak with known glsl_type instances
When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.
v2: remove lambda usage (Tapani)
(+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Mon, 5 Mar 2018 21:58:11 +0000 (13:58 -0800)]
spirv: Add SpvCapabilityShaderViewportIndexLayerEXT
This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tessellation shaders.
v2: Make conditional to the capability, add gl_Layer, add tessellation
shaders. (Iago)
v3: Don't export to tessellation control shader.
v4: Add Reviewed-by tag.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Mauro Rossi [Tue, 6 Mar 2018 23:15:15 +0000 (00:15 +0100)]
android: anv: add libmesa_intel_dev static dependency
Fixes the following building errors:
external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override'
external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name'
external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: 272bef0601a "intel: Split gen_device_info out into libintel_dev"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Timothy Arceri [Tue, 6 Mar 2018 03:48:07 +0000 (14:48 +1100)]
Revert "nir: bump loop unroll limit to 96."
This reverts commit 2d36efdb7f18f061c519dbb93f6058bf161aad33.
This raised limit turns out to be harmful for more complex shaders;
it causes excessive spilling in some Bioshock Infinite shaders.
The fps for the ssao demo on radv remains unchanged when reverting
this.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Dave Airlie [Wed, 7 Mar 2018 03:24:25 +0000 (03:24 +0000)]
ac/nir: don't put lod into args if it's zero.
If it's zero but we put it in args, we still end up consuming a
register for it.
This fixes some spilling in the NIR paths in Dirt Rally that
isn't seen with TGSI.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Christian Gmeiner [Tue, 6 Mar 2018 09:34:08 +0000 (10:34 +0100)]
freedreno: bump required libdrm version
Fixes: 26a9321d0a "freedreno: add global_bindings state"
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Ian Romanick [Tue, 13 Feb 2018 02:58:53 +0000 (18:58 -0800)]
nir: Simplify some comparisons like a+b < a
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14514555 -> 14514547 (<.01%)
instructions in affected programs: 1972 -> 1964 (-0.41%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.41% -0.40%
Instructions are helped.
total cycles in shared programs: 533141444 -> 533136780 (<.01%)
cycles in affected programs: 164728 -> 160064 (-2.83%)
helped: 181
HURT: 3
helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30
helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80%
HURT stats (abs) min: 4 max: 54 x̄: 24.00 x̃: 14
HURT stats (rel) min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68%
95% mean confidence interval for cycles value: -27.12 -23.58
95% mean confidence interval for cycles %-change: -3.54% -3.16%
Cycles are helped.
Sandy Bridge
total instructions in shared programs: 10533667 -> 10533539 (<.01%)
instructions in affected programs: 10148 -> 10020 (-1.26%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1
helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04%
95% mean confidence interval for instructions value: -1.06 -1.00
95% mean confidence interval for instructions %-change: -2.46% -1.95%
Instructions are helped.
total cycles in shared programs: 146136887 -> 146132122 (<.01%)
cycles in affected programs: 206382 -> 201617 (-2.31%)
helped: 171
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30
helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67%
95% mean confidence interval for cycles value: -29.19 -26.54
95% mean confidence interval for cycles %-change: -3.20% -2.76%
Cycles are helped.
Iron Lake
total instructions in shared programs: 7886515 -> 7886507 (<.01%)
instructions in affected programs: 3016 -> 3008 (-0.27%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.26%
Instructions are helped.
total cycles in shared programs: 178100396 -> 178100388 (<.01%)
cycles in affected programs: 156128 -> 156120 (<.01%)
helped: 4
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -3.68 1.68
95% mean confidence interval for cycles %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).
GM45
total instructions in shared programs: 4857872 -> 4857868 (<.01%)
instructions in affected programs: 1544 -> 1540 (-0.26%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.28% -0.24%
Instructions are helped.
total cycles in shared programs: 122167654 -> 122167662 (<.01%)
cycles in affected programs: 96248 -> 96256 (<.01%)
helped: 0
HURT: 4
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: <.01% 0.02%
Cycles are HURT.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Wed, 7 Feb 2018 01:27:53 +0000 (17:27 -0800)]
nir: Use De Morgan's Law on logic compounded comparisons
The replacement of the comparison operators must happen during this
step. If it does not, the next pass of nir_opt_algebraic will reapply
De Morgan's Law in the "opposite direction" before performing dead code
elimination. The resulting infinite loop will eventually get OOM
killed.
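A minimal sketch of the idea in Python, not the actual nir_opt_algebraic
rules: the tuple IR, the NEGATED table, and the de_morgan and evaluate
helpers are all hypothetical. It uses integer comparisons so that negating
a comparison is exact; float comparisons behave differently when an operand
is NaN. The inot is folded into each comparison in the same rewrite, so no
inot is left for a later pass to push back outward.

```python
from itertools import product

# Hypothetical mini-IR: an expression is a tuple (opcode, operand, ...),
# and a bare string is a variable name.
NEGATED = {"ilt": "ige", "ige": "ilt", "ieq": "ine", "ine": "ieq"}

OPS = {
    "ilt": lambda a, b: a < b,
    "ige": lambda a, b: a >= b,
    "ieq": lambda a, b: a == b,
    "ine": lambda a, b: a != b,
    "ior": lambda a, b: a or b,
    "iand": lambda a, b: a and b,
    "inot": lambda a: not a,
}


def de_morgan(expr):
    """Rewrite inot(ior(x, y)) or inot(iand(x, y)) when x and y are comparisons.

    Each comparison is replaced by its negation in the same step, so the
    result contains no inot at all.
    """
    if expr[0] != "inot":
        return expr
    inner = expr[1]
    if inner[0] not in ("ior", "iand"):
        return expr
    if any(operand[0] not in NEGATED for operand in inner[1:]):
        return expr
    flipped = "iand" if inner[0] == "ior" else "ior"
    return (flipped,) + tuple((NEGATED[op[0]],) + op[1:] for op in inner[1:])


def evaluate(expr, env):
    """Evaluate an expression tree against a dict of variable values."""
    if isinstance(expr, str):
        return env[expr]
    return OPS[expr[0]](*(evaluate(e, env) for e in expr[1:]))


before = ("inot", ("ior", ("ilt", "a", "b"), ("ige", "c", "d")))
after = de_morgan(before)

# Exhaustively compare both forms over a small integer domain.
equivalent = all(
    evaluate(before, dict(zip("abcd", values)))
    == evaluate(after, dict(zip("abcd", values)))
    for values in product(range(3), repeat=4)
)
```

Here `after` is an iand of an ige and an ilt, and applying de_morgan to it
again leaves it unchanged, which is the property that avoids the loop.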
Haswell, Broadwell, and Skylake had similar results. (Broadwell shown)
total instructions in shared programs: 14808185 -> 14808036 (<.01%)
instructions in affected programs: 13758 -> 13609 (-1.08%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3
helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01%
95% mean confidence interval for instructions value: -4.67 -2.97
95% mean confidence interval for instructions %-change: -1.09% -0.88%
Instructions are helped.
total cycles in shared programs: 559438333 -> 559435832 (<.01%)
cycles in affected programs: 199160 -> 196659 (-1.26%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51
helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40%
HURT stats (abs) min: 2 max: 40 x̄: 27.33 x̃: 40
HURT stats (rel) min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74%
95% mean confidence interval for cycles value: -71.47 -39.69
95% mean confidence interval for cycles %-change: -1.64% -0.93%
Cycles are helped.
Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11811776 -> 11811553 (<.01%)
instructions in affected programs: 15201 -> 14978 (-1.47%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6
helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26%
95% mean confidence interval for instructions value: -7.21 -4.23
95% mean confidence interval for instructions %-change: -1.48% -1.12%
Instructions are helped.
total cycles in shared programs: 257617270 -> 257614589 (<.01%)
cycles in affected programs: 212107 -> 209426 (-1.26%)
helped: 45
HURT: 0
helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54
helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32%
95% mean confidence interval for cycles value: -74.02 -45.14
95% mean confidence interval for cycles %-change: -1.59% -1.01%
Cycles are helped.
Iron Lake
total instructions in shared programs: 7886648 -> 7886515 (<.01%)
instructions in affected programs: 14106 -> 13973 (-0.94%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4
helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81%
95% mean confidence interval for instructions value: -5.65 -3.52
95% mean confidence interval for instructions %-change: -1.03% -0.76%
Instructions are helped.
total cycles in shared programs: 178100812 -> 178100396 (<.01%)
cycles in affected programs: 67970 -> 67554 (-0.61%)
helped: 29
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54%
95% mean confidence interval for cycles value: -18.30 -10.39
95% mean confidence interval for cycles %-change: -0.71% -0.45%
Cycles are helped.
GM45
total instructions in shared programs: 4857939 -> 4857872 (<.01%)
instructions in affected programs: 7426 -> 7359 (-0.90%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4
helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77%
95% mean confidence interval for instructions value: -6.06 -2.87
95% mean confidence interval for instructions %-change: -1.06% -0.67%
Instructions are helped.
total cycles in shared programs: 122167930 -> 122167654 (<.01%)
cycles in affected programs: 43118 -> 42842 (-0.64%)
helped: 15
HURT: 0
helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54%
95% mean confidence interval for cycles value: -25.03 -11.77
95% mean confidence interval for cycles %-change: -0.82% -0.41%
Cycles are helped.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Sat, 3 Feb 2018 01:39:54 +0000 (17:39 -0800)]
nir: Replace fmin(b2f(a), b) with a bcsel
All of the affected shaders are HDR mappers from Serious Sam 3.
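A rough Python model of why such a replacement is valid, assuming the bcsel
picks between fmin(b, 1.0) and fmin(b, 0.0); that exact shape is an
assumption, not quoted from the patch, and b2f, fmin and bcsel here are
hypothetical stand-ins for the NIR opcodes. Because b2f(a) can only be 0.0
or 1.0, the fmin against it collapses to an fmin against a constant once
the boolean is known.

```python
def b2f(a: bool) -> float:
    """Boolean-to-float conversion: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0


def fmin(x: float, y: float) -> float:
    """Minimum of two floats (NaN inputs are not modelled here)."""
    return min(x, y)


def bcsel(cond: bool, x: float, y: float) -> float:
    """Select x when cond is true, otherwise y."""
    return x if cond else y


def original(a: bool, b: float) -> float:
    return fmin(b2f(a), b)


def replacement(a: bool, b: float) -> float:
    # Assumed form of the rewrite: each arm is an fmin against a constant.
    return bcsel(a, fmin(1.0, b), fmin(0.0, b))


samples = [-2.0, -0.5, 0.0, 0.25, 1.0, 1.5, 100.0]
agrees = all(
    original(a, b) == replacement(a, b)
    for a in (False, True)
    for b in samples
)
```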
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516285 -> 14516273 (<.01%)
instructions in affected programs: 348 -> 336 (-3.45%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -5.55% -3.06%
Instructions are helped.
total cycles in shared programs: 533163876 -> 533163808 (<.01%)
cycles in affected programs: 1144 -> 1076 (-5.94%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for cycles value: -18.84 -15.16
95% mean confidence interval for cycles %-change: -6.20% -5.68%
Cycles are helped.
Sandy Bridge
total instructions in shared programs: 10533321 -> 10533309 (<.01%)
instructions in affected programs: 372 -> 360 (-3.23%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -4.96% -2.86%
Instructions are helped.
total cycles in shared programs: 146136632 -> 146136428 (<.01%)
cycles in affected programs: 11668 -> 11464 (-1.75%)
helped: 12
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29%
95% mean confidence interval for cycles value: -17.66 -16.34
95% mean confidence interval for cycles %-change: -2.82% -1.58%
Cycles are helped.
Iron Lake
total instructions in shared programs: 7886301 -> 7886277 (<.01%)
instructions in affected programs: 576 -> 552 (-4.17%)
helped: 12
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.30% -3.72%
Instructions are helped.
total cycles in shared programs: 178113176 -> 178113176 (0.00%)
cycles in affected programs: 2116 -> 2116 (0.00%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for cycles value: -3.25 3.25
95% mean confidence interval for cycles %-change: -0.93% 0.94%
Inconclusive result (value mean confidence interval includes 0).
GM45
total instructions in shared programs: 4857756 -> 4857744 (<.01%)
instructions in affected programs: 294 -> 282 (-4.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.71% -3.09%
Instructions are helped.
total cycles in shared programs: 122178730 -> 122178722 (<.01%)
cycles in affected programs: 700 -> 692 (-1.14%)
helped: 2
HURT: 0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Thu, 1 Feb 2018 23:33:04 +0000 (15:33 -0800)]
nir: Pull b2f out of bcsel
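As an illustration only, reading the title as rewriting
bcsel(a, b2f(b), b2f(c)) into b2f(bcsel(a, b, c)) (an assumption about the
pattern, with hypothetical Python stand-ins for the opcodes), the two forms
can be checked exhaustively because every input is a boolean. The rewritten
form needs one conversion instead of two.

```python
from itertools import product


def b2f(a: bool) -> float:
    """Boolean-to-float conversion: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0


def bcsel(cond, x, y):
    """Select x when cond is true, otherwise y."""
    return x if cond else y


def original(a: bool, b: bool, c: bool) -> float:
    # Two conversions, one per arm of the select.
    return bcsel(a, b2f(b), b2f(c))


def replacement(a: bool, b: bool, c: bool) -> float:
    # Select on the booleans first, then convert once.
    return b2f(bcsel(a, b, c))


# All eight combinations of the three boolean inputs.
agrees = all(
    original(a, b, c) == replacement(a, b, c)
    for a, b, c in product((False, True), repeat=3)
)
```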
All platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516592 -> 14516586 (<.01%)
instructions in affected programs: 500 -> 494 (-1.20%)
helped: 2
HURT: 0
total cycles in shared programs: 533167044 -> 533166998 (<.01%)
cycles in affected programs: 6988 -> 6942 (-0.66%)
helped: 2
HURT: 0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Wed, 31 Jan 2018 19:11:02 +0000 (11:11 -0800)]
nir: Replace an odd comparison involving fmin of -b2f
I noticed the fge version while looking at a shader for an unrelated
reason. The feq version prevents a regression in a later change that
performs strength reduction of some compares.
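A sketch of the kind of comparison this could cover, under the assumption
that the pattern is fmin(-b2f(a), b) compared against 0.0 with fge or feq,
and that it reduces to "a is false and b >= 0.0". The helper names are
hypothetical and NaN inputs are deliberately left out. Since -b2f(a) is
either 0.0 or -1.0, the fmin can never exceed 0.0, so the fge and feq
versions end up testing the same condition.

```python
def b2f(a: bool) -> float:
    """Boolean-to-float conversion: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0


def fmin(x: float, y: float) -> float:
    """Minimum of two floats (NaN inputs are not modelled here)."""
    return min(x, y)


def original_fge(a: bool, b: float) -> bool:
    return fmin(-b2f(a), b) >= 0.0


def original_feq(a: bool, b: float) -> bool:
    return fmin(-b2f(a), b) == 0.0


def replacement(a: bool, b: float) -> bool:
    # Assumed simplified form: no fmin, no b2f, no negation.
    return (not a) and b >= 0.0


samples = [-2.0, -0.5, -0.0, 0.0, 0.5, 3.0]
agrees = all(
    original_fge(a, b) == replacement(a, b)
    and original_feq(a, b) == replacement(a, b)
    for a in (False, True)
    for b in samples
)
```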
Broadwell and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514808 -> 14514796 (<.01%)
instructions in affected programs: 750 -> 738 (-1.60%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.43% -0.36%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 533144939 -> 533144853 (<.01%)
cycles in affected programs: 8911 -> 8825 (-0.97%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19
helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31%
95% mean confidence interval for cycles value: -32.94 -10.06
95% mean confidence interval for cycles %-change: -2.30% -0.26%
Cycles are helped.
Haswell
total instructions in shared programs: 13093785 -> 13093775 (<.01%)
instructions in affected programs: 924 -> 914 (-1.08%)
helped: 4
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.53 1.20
95% mean confidence interval for instructions %-change: -2.02% 0.97%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 409580553 -> 409580118 (<.01%)
cycles in affected programs: 10909 -> 10474 (-3.99%)
helped: 5
HURT: 1
helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18
helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78%
HURT stats (abs) min: 13 max: 13 x̄: 13.00 x̃: 13
HURT stats (rel) min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39%
95% mean confidence interval for cycles value: -180.68 35.68
95% mean confidence interval for cycles %-change: -19.55% 3.79%
Inconclusive result (value mean confidence interval includes 0).
Ivy Bridge
total instructions in shared programs: 11811851 -> 11811840 (<.01%)
instructions in affected programs: 1032 -> 1021 (-1.07%)
helped: 5
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1
helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19%
95% mean confidence interval for instructions value: -4.17 0.51
95% mean confidence interval for instructions %-change: -1.86% 0.36%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 257618403 -> 257618168 (<.01%)
cycles in affected programs: 10784 -> 10549 (-2.18%)
helped: 4
HURT: 2
helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17
helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72%
HURT stats (abs) min: 9 max: 14 x̄: 11.50 x̃: 11
HURT stats (rel) min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for cycles value: -133.11 54.78
95% mean confidence interval for cycles %-change: -14.79% 5.59%
Inconclusive result (value mean confidence interval includes 0).
GM45, Iron Lake, and Sandy Bridge had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10533871 -> 10533859 (<.01%)
instructions in affected programs: 865 -> 853 (-1.39%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21%
95% mean confidence interval for instructions value: -6.67 0.67
95% mean confidence interval for instructions %-change: -2.16% -0.29%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 146139904 -> 146139852 (<.01%)
cycles in affected programs: 15213 -> 15161 (-0.34%)
helped: 4
HURT: 0
helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15
helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29%
95% mean confidence interval for cycles value: -23.79 -2.21
95% mean confidence interval for cycles %-change: -0.88% 0.09%
Inconclusive result (%-change mean confidence interval includes 0).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Mon, 26 Feb 2018 22:49:47 +0000 (14:49 -0800)]
nir: Mark bcsel-to-fmin (or fmax) transformations as inexact
These transformations are inexact because section 4.7.1 (Range and
Precision) of the GLSL specification says:
    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.
The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.
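To make the NaN case concrete, here is a small Python model. It assumes the
original expression has the shape bcsel(a < b, a, b), and it models just
one of the fmin behaviors the quoted text allows, namely returning the
non-NaN operand; both function names are hypothetical. With b = NaN the
comparison is false, so the select must return b, which is NaN, while this
fmin returns a.

```python
import math


def select_min(a: float, b: float) -> float:
    """The assumed original form: bcsel(a < b, a, b)."""
    return a if a < b else b


def fmin_drop_nan(a: float, b: float) -> float:
    """One permitted fmin behavior: prefer the operand that is not NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


nan = float("nan")

# The two forms agree on ordinary numbers...
same_on_numbers = all(
    select_min(a, b) == fmin_drop_nan(a, b)
    for a in (-1.0, 0.0, 2.5)
    for b in (-3.0, 0.0, 7.0)
)

# ...but differ when the second operand is NaN.
select_result = select_min(1.0, nan)   # comparison is false, so b (NaN)
fmin_result = fmin_drop_nan(1.0, nan)  # the non-NaN operand
```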
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>