Eric Anholt [Mon, 26 Oct 2020 19:11:40 +0000 (12:11 -0700)]
st/nir: Fix the st->pbo.use_gs case.
This case hadn't been ported to NIR before, and I missed that when
removing the TGSI path and replacing it with NIR -> NTT for TGSI drivers.
This caused breakage in nv50 on piglit's pbo-teximage.
In the process, the !use_gs gets its layer output fixed to be an int
instead of a vec4, which I suspect would fix validation in that path.
Fixes: 57effa342b75 ("st/mesa: Drop the TGSI paths for PBOs and use nir-to-tgsi if needed.")
Closes: #3680
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7320>
Alyssa Rosenzweig [Wed, 4 Nov 2020 16:13:55 +0000 (11:13 -0500)]
pan/bi: Correctly calculate render target index
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 14:05:57 +0000 (09:05 -0500)]
pan/bi: Lower depth/stencil stores
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 14:05:39 +0000 (09:05 -0500)]
pan/bi: Emit +ZS_EMIT as needed
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:52:48 +0000 (08:52 -0500)]
pan/bi: Stub handling for nir_intrinsic_store_combined_output_pan
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:46:32 +0000 (08:46 -0500)]
pan/bi: Factor out bi_emit_blend
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:42:51 +0000 (08:42 -0500)]
pan/bi: Factor out bi_emit_atest
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:22:53 +0000 (08:22 -0500)]
pan/bi: Infer z/stencil flags from sources passed
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:18:22 +0000 (08:18 -0500)]
pan/bi: Add +ZS_EMIT instruction to IR
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 16:17:43 +0000 (11:17 -0500)]
panfrost: Deduplicate shader properties
Between Midgard and Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:59:35 +0000 (08:59 -0500)]
panfrost: Pass through src_type
Needed since Bifrost blends are typed well.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:37:55 +0000 (08:37 -0500)]
pan/mdg: Move writeout lowering to common panfrost
These will be used in the Bifrost compiler, albeit for a slightly
different purpose.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:32:16 +0000 (08:32 -0500)]
pan/mdg: Deduplicate nir_find_variable_with_driver_location
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Alyssa Rosenzweig [Wed, 4 Nov 2020 13:57:03 +0000 (08:57 -0500)]
nir: Add SRC_TYPE to store_combined_output_pan
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7446>
Iago Toral Quiroga [Wed, 4 Nov 2020 09:39:12 +0000 (10:39 +0100)]
v3dv: add a v3dv_bo_init helper
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7440>
Tony Wasserka [Tue, 3 Nov 2020 13:40:20 +0000 (14:40 +0100)]
aco/ra: Fix counting of subdword variables in get_reg_create_vector
The loop variable "k" shadowed another variable in the outer scope, so
this loop had no actual effect.
Fixes: 52cc1f8237d ("aco: improve p_create_vector RA for sub-dword operands")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7427>
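The shadowing problem described in this commit is easy to reproduce in isolation. The following is a minimal sketch, not the ACO code itself: the function names and the byte-count array are hypothetical, and the only point is that an inner declaration of "k" leaves the outer counter untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy shape: the for-loop declares its own 'k', which shadows the
 * outer accumulator, so the outer 'k' is never modified. */
static unsigned count_bytes_shadowed(const unsigned *bytes, size_t n)
{
    assert(bytes != NULL || n == 0);
    unsigned k = 0;                      /* intended accumulator */
    for (size_t i = 0; i < n; i++) {
        for (unsigned k = 0; k < bytes[i]; k++) {
            /* only the inner 'k' changes here */
        }
    }
    return k;                            /* still 0 */
}

/* Fixed shape: the loop index has a distinct name. */
static unsigned count_bytes_fixed(const unsigned *bytes, size_t n)
{
    assert(bytes != NULL || n == 0);
    unsigned k = 0;
    for (size_t i = 0; i < n; i++) {
        for (unsigned j = 0; j < bytes[i]; j++)
            k++;
    }
    return k;
}
```

Building with -Wshadow would typically flag the first variant.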
Rhys Perry [Wed, 8 Jul 2020 18:19:43 +0000 (19:19 +0100)]
aco: implement 8/16-bit instructions which can be trivially widened
When nir_lower_bit_size becomes more capable, we might want to revert some
of this.
fossil-db (parallel-rdp, Navi):
Totals from 217 (31.77% of 683) affected shaders:
SGPRs: 11320 -> 10200 (-9.89%)
VGPRs: 7156 -> 7364 (+2.91%)
CodeSize: 1453948 -> 1430136 (-1.64%); split: -1.66%, +0.02%
Instrs: 258530 -> 254840 (-1.43%); split: -1.44%, +0.01%
Cycles: 37334360 -> 37247936 (-0.23%); split: -0.26%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
Rhys Perry [Mon, 27 Apr 2020 20:17:26 +0000 (21:17 +0100)]
aco: implement some 16-bit arithmetic instead of lowering
fossil-db (parallel-rdp, Navi):
Totals from 210 (30.75% of 683) affected shaders:
SGPRs: 9704 -> 10248 (+5.61%)
VGPRs: 5884 -> 5368 (-8.77%)
CodeSize: 1155564 -> 1098752 (-4.92%)
Instrs: 199927 -> 189940 (-5.00%)
Cycles: 20438392 -> 19860124 (-2.83%)
v2: use divergence analysis to determine which instructions to lower.
Co-Authored-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
Rhys Perry [Wed, 8 Jul 2020 16:56:41 +0000 (17:56 +0100)]
radv: rework nir_lower_bit_size callback and run DA on GFX8+
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
Rhys Perry [Fri, 30 Oct 2020 15:54:12 +0000 (15:54 +0000)]
radv: do nir_lower_bit_size after algebraic optimizations
There are too many algebraic optimizations to be certain that one of them
couldn't create instructions which need lowering. It also creates better
code for some reason.
fossil-db (parallel-rdp, Navi):
Totals from 217 (31.77% of 683) affected shaders:
VGPRs: 7716 -> 7672 (-0.57%)
CodeSize: 1516152 -> 1510688 (-0.36%); split: -0.38%, +0.02%
MaxWaves: 3964 -> 3982 (+0.45%)
Instrs: 269445 -> 268508 (-0.35%); split: -0.36%, +0.02%
Cycles: 37963416 -> 37912592 (-0.13%); split: -0.15%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
Rhys Perry [Thu, 29 Oct 2020 11:21:42 +0000 (11:21 +0000)]
radv: move a few passes to after load/store vectorization
load/store vectorization can create 8/16-bit alu to do packing/unpacking,
which would make shader_info::bit_sizes_used out of date.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
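As a rough illustration of why the ordering matters, assume bit_sizes_used is a bitmask built by OR-ing together the bit sizes of the ALU instructions in the shader. The toy types and function names below are hypothetical, not NIR's. A pass that adds narrower instructions after the mask was gathered leaves it stale until it is gathered again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a shader: only the bit size of each ALU op. */
struct toy_shader {
    uint8_t alu_bit_size[16];
    size_t  num_alu;
    uint8_t bit_sizes_used;   /* cached mask, e.g. 32 | 16 */
};

/* Recompute the cached mask from the current instruction list. */
static void toy_gather_bit_sizes(struct toy_shader *s)
{
    s->bit_sizes_used = 0;
    for (size_t i = 0; i < s->num_alu; i++)
        s->bit_sizes_used |= s->alu_bit_size[i];
}

/* Models a vectorizer that emits one 16-bit pack/unpack op
 * without touching the cached mask. */
static void toy_vectorize(struct toy_shader *s)
{
    assert(s->num_alu < 16);
    s->alu_bit_size[s->num_alu++] = 16;
}
```

Any pass that keys off the mask has to run after the gather that follows the vectorizer, which is the reordering this commit describes.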
Rhys Perry [Fri, 30 Oct 2020 15:18:25 +0000 (15:18 +0000)]
nir/lower_bit_size: optimize upcast of b2i8/b2i16
This also seems to be done by nir_opt_algebraic, but RADV will be moving
nir_lower_bit_size() to after that (so it doesn't create unsupported
8/16-bit instructions) and it doesn't seem worth creating a new pass just
for this simple optimization.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
Rhys Perry [Thu, 29 Oct 2020 10:52:25 +0000 (10:52 +0000)]
nir: add shader_info::bit_sizes_used
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
Pierre-Eric Pelloux-Prayer [Tue, 27 Oct 2020 20:33:44 +0000 (21:33 +0100)]
va: support VA_RT_FORMAT_PROTECTED
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Pierre-Eric Pelloux-Prayer [Tue, 27 Oct 2020 20:13:40 +0000 (21:13 +0100)]
va/picture: make sure destination buffer is protected if needed
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Veerabadhran Gopalakrishnan [Mon, 5 Oct 2020 02:13:43 +0000 (22:13 -0400)]
frontends/va: Added protected playback support for VP9
Add VP9 header handling in slice data buffer.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Boyuan Zhang [Thu, 14 May 2020 00:08:55 +0000 (20:08 -0400)]
radeon/vcn: program drm message buffer
Add a function to handle drm message buffer using input decryption parameters.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Pierre-Eric Pelloux-Prayer [Wed, 28 Oct 2020 15:52:47 +0000 (16:52 +0100)]
radeon/vcn: delay dec->ctx and dec->dpb allocation
This will allow allocating them as encrypted if needed.
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Pierre-Eric Pelloux-Prayer [Tue, 27 Oct 2020 08:33:27 +0000 (09:33 +0100)]
radeon: add si_vid_create_tmz_buffer helper
Same code as si_vid_create_buffer except that the buffer is using TMZ.
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Boyuan Zhang [Fri, 2 Oct 2020 19:00:28 +0000 (15:00 -0400)]
radeon/vcn: add defines for drm message buffer
Add defines and structure for drm message buffer.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Boyuan Zhang [Mon, 10 Feb 2020 19:55:54 +0000 (14:55 -0500)]
radeon: add decryption params definition header
Add a header file for decryption parameters.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Boyuan Zhang [Thu, 14 May 2020 01:27:45 +0000 (21:27 -0400)]
frontends/va: handle protected slice data buffer
Add a function to handle VaProtectedSliceDataBuffer, which is used for
sending decryption parameters. Also, for protected playback, there is
no need to check start code since data is encrypted.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Boyuan Zhang [Mon, 10 Feb 2020 19:37:36 +0000 (14:37 -0500)]
vl: add flag and definition for protected playback
Add a flag to indicate if playback is protected/encrypted.
Add a pointer to decryption key for later decryption use.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7006>
Jason Ekstrand [Wed, 4 Nov 2020 04:46:14 +0000 (22:46 -0600)]
nir/find_array_copies: Don't assume all children exist
Fixes: 9f3c595dfc4cd "nir/find_array_copies: Handle cast derefs"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7437>
Vinson Lee [Wed, 21 Oct 2020 23:32:55 +0000 (16:32 -0700)]
radeonsi: Remove unnecessary shader->selector NULL check.
shader->selector->info.stage == MESA_SHADER_COMPUTE at this case statement.
Fix defect reported by Coverity Scan.
Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking shader->selector suggests that it may
be null, but it has already been dereferenced on all paths leading to
the check.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7266>
Dave Airlie [Tue, 3 Nov 2020 06:38:09 +0000 (16:38 +1000)]
lavapipe: request correct sample mask behaviour
Fixes: dEQP-VK.rasterization.frag_side_effect*
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7435>
Dave Airlie [Tue, 3 Nov 2020 06:37:10 +0000 (16:37 +1000)]
llvmpipe: respect the sample mask in non-multisample flag
This partly reverts 50987644 ("llvmpipe: don't use sample mask with 0 samples"),
since Vulkan wants this behaviour.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7435>
Dave Airlie [Tue, 3 Nov 2020 06:33:16 +0000 (16:33 +1000)]
gallium: add a non-multisample sample mask out behaviour flag.
Vulkan/DX want to use the output sample mask even when not multisampling;
GL wants it ignored.
Add a rasterizer flag so lavapipe can get the correct behaviour.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7435>
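The behaviour difference can be sketched as a single predicate. This is an illustrative model only: the function and parameter names are made up for the example and do not correspond to the gallium interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns the coverage that survives the shader's output sample mask.
 * When not multisampling, 'coverage' only ever uses bit 0. */
static uint32_t toy_apply_sample_mask_out(uint32_t coverage,
                                          uint32_t shader_mask,
                                          bool multisample,
                                          bool use_mask_without_ms)
{
    assert(multisample || coverage <= 1u);
    if (multisample || use_mask_without_ms)
        return coverage & shader_mask;   /* Vulkan/DX-style */
    return coverage;                     /* GL-style: mask ignored */
}
```

With the flag set, a fragment shader that writes a zero mask discards the pixel even on a single-sample target.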
Jason Ekstrand [Fri, 23 Oct 2020 21:48:38 +0000 (16:48 -0500)]
nir/opt_intrinsic: Optimize bcsel(b, shuffle(x, i), shuffle(x, j))
The shuffles provided by the SPV_INTEL_subgroups extension generate
bcsel(b, shuffle(x, i), shuffle(y, j))
In the case where x and y are the same, we can turn this into a shuffle
with the bcsel on the index which lets us drop a whole shuffle.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
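The rewrite in this commit can be checked on a small CPU-side model. The sketch below treats a subgroup as a plain array and uses hypothetical toy_* names; it is not NIR code, only a demonstration that selecting the index first gives the same result as selecting between two shuffles of the same value.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SUBGROUP_SIZE 8

/* shuffle(x, i): read the value of x held by invocation i. */
static int toy_shuffle(const int x[TOY_SUBGROUP_SIZE], unsigned i)
{
    assert(i < TOY_SUBGROUP_SIZE);
    return x[i];
}

/* Before: two shuffles, then a select on the results. */
static int toy_select_of_shuffles(const int x[TOY_SUBGROUP_SIZE],
                                  bool b, unsigned i, unsigned j)
{
    int from_i = toy_shuffle(x, i);
    int from_j = toy_shuffle(x, j);
    return b ? from_i : from_j;
}

/* After: one shuffle whose index is the select. */
static int toy_shuffle_of_select(const int x[TOY_SUBGROUP_SIZE],
                                 bool b, unsigned i, unsigned j)
{
    return toy_shuffle(x, b ? i : j);
}
```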
Jason Ekstrand [Fri, 23 Oct 2020 21:05:29 +0000 (16:05 -0500)]
nir/opt_intrinsics: Refactor a bit
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
Jason Ekstrand [Thu, 29 Oct 2020 15:10:35 +0000 (10:10 -0500)]
nir/constant_folding: Fold subgroup shuffle intrinsics
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
Jason Ekstrand [Thu, 29 Oct 2020 15:08:15 +0000 (10:08 -0500)]
nir: Move constant folding of vote to opt_constant_folding
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
Jason Ekstrand [Thu, 29 Oct 2020 15:05:21 +0000 (10:05 -0500)]
nir/constant_folding: Use the standard variable naming convention
Typically, if we have one alu instruction, we call it "alu" and if we
have one intrinsic we call it "intrin".
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
Jason Ekstrand [Thu, 29 Oct 2020 14:58:57 +0000 (09:58 -0500)]
nir/constant_folding: Use a switch in try_fold_intrinsic
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
Jason Ekstrand [Thu, 29 Oct 2020 15:15:46 +0000 (10:15 -0500)]
nir/opt_intrinsics: Report progress for the gl_SampleMask optimization
Fixes: d3ce8a7f6b93 "nir: optimize gl_SampleMaskIn to gl_HelperInvocation..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7366>
Rhys Perry [Tue, 3 Nov 2020 13:18:56 +0000 (13:18 +0000)]
nir: use nir_alu_src_is_trivial_ssa() in nir_ssa_for_alu_src()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7426>
Rhys Perry [Tue, 3 Nov 2020 13:21:37 +0000 (13:21 +0000)]
nir: skip bcsel with non-trivial swizzle in opt_simplify_bcsel_of_phi()
Fixes validation error in a Dota 2 shader.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: b031c643491 ("nir: Convert a bcsel with only phi node sources to a phi node")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7426>
Rhys Perry [Tue, 3 Nov 2020 13:17:22 +0000 (13:17 +0000)]
nir: add nir_alu_src_is_trivial_ssa()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7426>
Jason Ekstrand [Sat, 15 Aug 2020 05:57:14 +0000 (00:57 -0500)]
nir/lower_io: Add a new 62bit_generic address format
Unlike most address formats, this address format is capable of handling
all of the fancy generic pointers stuff like is_global and friends.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
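One way to picture such a format is a 64-bit value whose top two bits tag the memory kind and whose low 62 bits hold the address. The sketch below assumes that layout; the specific tag values and all names are chosen for illustration and should not be read as NIR's actual encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag values for the top two bits. */
enum toy_mem_kind { TOY_MEM_GLOBAL = 0, TOY_MEM_SHARED = 1, TOY_MEM_TEMP = 2 };

#define TOY_ADDR_BITS 62
#define TOY_ADDR_MASK ((UINT64_C(1) << TOY_ADDR_BITS) - 1)

/* Combine a memory kind and a 62-bit offset into one generic address. */
static uint64_t toy_pack(enum toy_mem_kind kind, uint64_t offset)
{
    assert(offset <= TOY_ADDR_MASK);
    return ((uint64_t)kind << TOY_ADDR_BITS) | offset;
}

static enum toy_mem_kind toy_kind(uint64_t addr)
{
    return (enum toy_mem_kind)(addr >> TOY_ADDR_BITS);
}

static uint64_t toy_offset(uint64_t addr)
{
    return addr & TOY_ADDR_MASK;
}

/* A run-time "is_global"-style query on a generic address. */
static bool toy_is_global(uint64_t addr)
{
    return toy_kind(addr) == TOY_MEM_GLOBAL;
}
```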
Jason Ekstrand [Sat, 15 Aug 2020 06:54:45 +0000 (01:54 -0500)]
nir/lower_io: Support generic pointer access
If the pointer is generic and we haven't yet figured out what kind of
pointer it is, we emit an if-ladder based on a mode check.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
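The if-ladder can be pictured with a small stand-alone model. Everything below is hypothetical (the toy_* types, three fixed memory arrays); it only shows the shape of the generated control flow when the mode is not known until run time.

```c
#include <assert.h>
#include <stdint.h>

enum toy_mode { TOY_MODE_GLOBAL, TOY_MODE_SHARED, TOY_MODE_TEMP };

struct toy_memory {
    uint32_t global_mem[4];
    uint32_t shared_mem[4];
    uint32_t temp_mem[4];
};

struct toy_generic_ptr {
    enum toy_mode mode;   /* only known at run time */
    unsigned index;
};

/* Load through a generic pointer: check each candidate mode in turn
 * and perform the mode-specific load in the matching branch. */
static uint32_t toy_load_generic(const struct toy_memory *m,
                                 struct toy_generic_ptr p)
{
    assert(p.index < 4);
    if (p.mode == TOY_MODE_GLOBAL)
        return m->global_mem[p.index];
    else if (p.mode == TOY_MODE_SHARED)
        return m->shared_mem[p.index];
    else
        return m->temp_mem[p.index];
}
```

If an earlier pass narrows the pointer to a single mode, the ladder collapses to one branch.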
Jason Ekstrand [Sat, 15 Aug 2020 05:32:46 +0000 (00:32 -0500)]
nir/lower_io: Add support for lowering deref_mode_is
The guts are still missing so it will blow up if it sees any
deref_mode_is intrinsic that it can't constant-fold from the mode.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Tue, 18 Aug 2020 15:27:41 +0000 (10:27 -0500)]
nir/lower_io: Add support for 32/64bit_global for shared
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sat, 15 Aug 2020 05:39:00 +0000 (00:39 -0500)]
nir/lower_io: Add a mode parameter to addr_format_is_*
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Fri, 14 Aug 2020 23:20:12 +0000 (18:20 -0500)]
nir/lower_io: Add a mode parameter to build_addr_iadd
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sat, 15 Aug 2020 16:14:20 +0000 (11:14 -0500)]
nir/opt_deref: Add an optimization for deref_mode_is
If opt_restrict_deref_modes makes progress, we may be able to figure out
the mode well enough to turn a deref_mode_is intrinsic into a constant.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sat, 15 Aug 2020 05:29:59 +0000 (00:29 -0500)]
nir/opt_deref: Add a deref mode specialization optimization
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sat, 15 Aug 2020 04:41:14 +0000 (23:41 -0500)]
spirv: Add generic pointer support
Most of this is fairly straightforward; we just set all the modes on any
derefs which are generic. The one tricky bit is OpGenericCastToPtrExplicit.
Instead of adding NIR intrinsics to do the cast, we add NIR intrinsics
to do a storage class check and then bcsel based on that.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
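The check-then-select approach for OpGenericCastToPtrExplicit can be sketched as follows. This is a toy model with hypothetical names, and it assumes the cast yields a null pointer when the storage class does not match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_storage { TOY_STORAGE_GLOBAL, TOY_STORAGE_LOCAL, TOY_STORAGE_PRIVATE };

struct toy_generic_ptr {
    enum toy_storage storage;
    int *addr;
};

/* Stand-in for the storage class check intrinsic. */
static bool toy_storage_is(struct toy_generic_ptr p, enum toy_storage s)
{
    return p.storage == s;
}

/* No dedicated cast operation: select(check, pointer, null). */
static int *toy_cast_to_ptr_explicit(struct toy_generic_ptr p,
                                     enum toy_storage wanted)
{
    assert(p.addr != NULL);
    return toy_storage_is(p, wanted) ? p.addr : NULL;
}
```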
Jason Ekstrand [Sat, 15 Aug 2020 05:28:55 +0000 (00:28 -0500)]
nir: Add support for generic pointers
The way they're handled is that deref->modes is treated as a bitfield of
possible modes. Variables are required to have a specific mode and
derefs with deref_type_var are as well.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Fri, 30 Oct 2020 17:14:05 +0000 (12:14 -0500)]
nir: Make nir_deref_instr::mode a bitfield
We rename it to "modes" to make it clear that it may contain more than
one mode and adjust all the uses of nir_deref_instr::modes to attempt to
handle multiple modes.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:49:36 +0000 (17:49 -0600)]
nir/split_*_vars: Prepare for generic pointers
All three passes check the variables for complex uses and don't split
them if they have any complex uses. Most of these checks are just early
returns to avoid chasing the deref to the variable and a hash table
lookup if we can quickly determine it has the wrong mode. In a couple
of cases, we need to re-arrange or add other checks to ensure that it's
safe for generic pointers.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:48:49 +0000 (17:48 -0600)]
nir/find_array_copies: Prepare for generic pointers
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:46:56 +0000 (17:46 -0600)]
nir: Use nir_deref_mode_may_be in deref optimizations
All the checks being replaced are for potential aliasing so we want to
flush stores whenever the mode might be something that aliases.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:45:26 +0000 (17:45 -0600)]
nir/vec3_to_vec4: Use nir_deref_must_be
We use the same nir_deref_mode_is_in_set helper that we use in
nir_lower_vars_to_explicit_types for the same reason. If there are any
generic pointers in play, we have to lower all generic pointer modes at
the same time or else we risk types getting out-of-sync.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:40:34 +0000 (17:40 -0600)]
nir/vars_to_ssa: Use nir_deref_must_be
We can only lower a deref to SSA in this pass if it's guaranteed to be
nir_var_function_temp. We already flag any variables with complex uses
(i.e. casts) as not being lowerable and refuse to lower any derefs to
them so we don't have to worry about false negatives.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:22:19 +0000 (17:22 -0600)]
nir: Only force loop unrolling if we know it's a in/out/temp
If we don't know the actual mode then we can't get to the variable so
it's going to be a scratch or other indirect load anyway and we aren't
saving ourselves anything by unrolling the loop.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 23:15:28 +0000 (17:15 -0600)]
nir/phis_to_scalar,gcm: Use nir_deref_mode_may_be
In both cases, we're trying to determine if a load is scalarizable. We
don't want to scalarize if it's a function_temp or shader_temp because
it might turn into something we can't scalarize.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 22:58:27 +0000 (16:58 -0600)]
nir/lower_io: Use nir_deref_mode_* helpers
For non-explicit nir_lower_io, we use nir_deref_mode_is because there's
no way it works for generic pointers. For nir_lower_vars_to_explicit_types,
and nir_lower_explicit_io, we use nir_deref_mode_is_in_set to ensure we
never get type confusion. For generic pointers, this means that they
must be called with the full set of generic pointer modes.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sun, 1 Nov 2020 22:55:25 +0000 (16:55 -0600)]
nir/lower_array_deref_of_vec: Use nir_deref_mode_must_be
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Fri, 30 Oct 2020 17:19:25 +0000 (12:19 -0500)]
nir: Add and use some deref mode helpers
NIR derefs currently have exactly one variable mode. This is about to
change so we can handle OpenCL generic pointers. In order to transition
safely, we need to audit every deref->mode check. This commit adds a
set of helpers that provide more nuanced mode checks and converts most
of NIR to use them.
For simple cases, we add nir_deref_mode_is and nir_deref_mode_is_one_of
helpers. These can be used in passes which don't have to bother with
generic pointers and just want to know what mode a thing is. If the
pass ever encounters generic pointers in a way that this check would be
unsafe, it will assert-fail to alert developers that they need to think
harder about things and fix the pass.
For more complex passes which require a more nuanced understanding of
modes, we add nir_deref_mode_may_be and nir_deref_mode_must_be helpers
which accurately describe the compiler's best knowledge about the given
deref. Unfortunately, we may not be able to exactly identify the mode
in a generic pointers scenario so we have to be very careful when we use
these. Conversion of these passes is left to later commits.
For the case of mass lowering of a particular mode (nir_lower_explicit_io
is one good example), we add nir_deref_mode_is_in_set. This is also
pretty assert-happy like nir_deref_mode_is but is for a set containment
comparison on deref modes where you expect the deref to either be all-in
or all-out.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
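The difference between the helper families can be modelled with plain bit operations. The sketch below is a simplified stand-in with hypothetical toy_* names and only three modes; the real helpers may assert on more conditions than shown here.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned toy_modes;

enum {
    TOY_VAR_FUNCTION_TEMP = 1u << 0,
    TOY_VAR_SHARED        = 1u << 1,
    TOY_VAR_GLOBAL        = 1u << 2,
};

struct toy_deref {
    toy_modes modes;   /* every mode the deref could have */
};

static bool toy_is_single_mode(toy_modes m)
{
    return m != 0 && (m & (m - 1)) == 0;
}

/* True if the deref could possibly be in any of 'modes'. */
static bool toy_mode_may_be(const struct toy_deref *d, toy_modes modes)
{
    return (d->modes & modes) != 0;
}

/* True only if every possibility lies inside 'modes'. */
static bool toy_mode_must_be(const struct toy_deref *d, toy_modes modes)
{
    return (d->modes & ~modes) == 0;
}

/* Simple check: asserts instead of guessing when the answer is ambiguous. */
static bool toy_mode_is(const struct toy_deref *d, toy_modes mode)
{
    assert(toy_is_single_mode(mode));
    if (!toy_mode_may_be(d, mode))
        return false;
    assert(d->modes == mode);
    return true;
}

/* Set containment: the deref is expected to be all-in or all-out. */
static bool toy_mode_is_in_set(const struct toy_deref *d, toy_modes set)
{
    if (!toy_mode_may_be(d, set))
        return false;
    assert(toy_mode_must_be(d, set));
    return true;
}
```

Calling toy_mode_is on a deref with several possible modes trips the assert, which mirrors the "assert-fail to alert developers" behaviour described above.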
Jason Ekstrand [Fri, 30 Oct 2020 20:07:47 +0000 (15:07 -0500)]
nir/opt_find_array_copies: Allow copies from mem_constant
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Fri, 30 Oct 2020 20:03:23 +0000 (15:03 -0500)]
nir: Disallow writes to system values and mem_constant
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Fri, 30 Oct 2020 17:30:09 +0000 (12:30 -0500)]
nir: Use var->data.mode instead of deref->mode in a few cases
We already have the variable so we know the mode exactly. Just use that
instead of the deref mode. If these paths ever have to handle variable
pointers (not likely since they're OpenGL-specific), we can fix them to
handle crazy deref modes then.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Tue, 29 Sep 2020 15:30:52 +0000 (10:30 -0500)]
nir: Handle incomplete derefs in split_struct_vars
In split_var_list_structs where we initialize the splitting, we already
use get_complex_used_vars to avoid splitting any variables that have a
complex use. However, we weren't actually handling the complex uses
properly in the case where we can't actually find the variable.
Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Mon, 17 Aug 2020 17:06:56 +0000 (12:06 -0500)]
nir/phis_to_scalar: Use a deny-list for load_deref modes
I can't think of any reason why shared and output aren't in this list.
The real thing we're trying to do is avoid premature scalarization
because of a shader or function temporary variable because we might
lower it to something we don't want scalarized later. Also fix the
version we copy+pasted into GCM.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Jason Ekstrand [Sat, 15 Aug 2020 05:11:27 +0000 (00:11 -0500)]
nir/builder: Add a nir_ieq_imm helper
This shows up surprisingly often.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
Dave Airlie [Tue, 3 Nov 2020 03:57:32 +0000 (13:57 +1000)]
lavapipe: don't advertise linear filtering on integer textures.
The backend doesn't support this.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Tue, 3 Nov 2020 01:42:01 +0000 (11:42 +1000)]
lavapipe: use clear_buffer callback
llvmpipe needs the clear buffer callback for CL, make lavapipe
use it as well.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Mon, 2 Nov 2020 07:13:48 +0000 (17:13 +1000)]
llvmpipe: add clear_buffer callback. (v2)
This fixes CL CTS thread dimensions test
v2: optimise for 1 and 4. (ajax)
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Mon, 2 Nov 2020 01:33:36 +0000 (11:33 +1000)]
lavapipe: stop crashes with 3D z blits
This code just didn't handle 3D Z blits properly, rewrite
to handle change in direction here.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Sun, 1 Nov 2020 23:42:59 +0000 (09:42 +1000)]
lavapipe: fix 3d compressed texture copies.
The img stride was being calculated incorrectly.
Fixes crashes in:
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.3d.bc1_rgb_srgb_block.bc1_rgb_srgb_block.general_general
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Fri, 30 Oct 2020 03:48:55 +0000 (13:48 +1000)]
gallivm/nir: fix vulkan vertex inputs
the tgsi file max is used to determine the number of input slots,
the old code was pretty bogus for the lavapipe cases (it worked
by accident).
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Fri, 30 Oct 2020 04:19:18 +0000 (14:19 +1000)]
gallivm/nir: handle dvec3/4 inputs properly.
This code works but isn't entirely correct: for a dvec3 it would
fetch loc 0,1 2,3 4,5, but really each loc only has 4 entries.
Instead, catch this and read loc 0,1 2,3 and loc+1 0,1.
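The slot arithmetic can be sketched as follows, assuming each location holds
four 32-bit channels and each double takes two of them. The struct and
function names are hypothetical and do not come from gallivm.

```c
#include <assert.h>

/* Where one 64-bit component of a dvec input lives (illustrative only). */
struct dvec_slot {
   unsigned location; /* input location holding this double */
   unsigned lo_chan;  /* 32-bit channel with the low half */
   unsigned hi_chan;  /* 32-bit channel with the high half */
};

/* Map double component `comp` (0..3) of a dvec starting at `base_location`
 * to its location and 32-bit channel pair. */
static struct dvec_slot
dvec_component_slot(unsigned base_location, unsigned comp)
{
   assert(comp < 4);

   struct dvec_slot s;
   s.location = base_location + comp / 2; /* two doubles per location */
   s.lo_chan = (comp % 2) * 2;            /* channel 0 or 2 */
   s.hi_chan = s.lo_chan + 1;             /* channel 1 or 3 */
   return s;
}
```

For the third component of a dvec3 this gives loc+1 with channels 0,1 rather
than the out-of-range channels 4,5 of the first location.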
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Wed, 30 Sep 2020 00:59:14 +0000 (10:59 +1000)]
lavapipe: fix dEQP-VK.info.device_properties
Fix bounds; wide lines aren't supported for now.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Fri, 30 Oct 2020 03:33:45 +0000 (13:33 +1000)]
lavapipe: constify state pointers into command buffers.
For render pass information, pointers into the command buffer are stored,
but command buffer contents are immutable, so make sure to use const ptrs
to avoid problems like the one seen with clears.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Fri, 30 Oct 2020 03:31:26 +0000 (13:31 +1000)]
lavapipe: don't write to pending clear aspects in cmd buffer
When the cmd buffer is recorded, we record the attachments state
into it. However, we use the pending clear aspects to keep track
of clears, and that should be kept in the state, not overwritten
in the cmd buffer.
Allocate some memory to store this hanging off the state.
This fixes gears and radialblur demos.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Dave Airlie [Fri, 23 Oct 2020 06:04:04 +0000 (16:04 +1000)]
gallivm: fix f16 quantize.
Add the correct flush to 0 behaviour.
Fixes:
dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero
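A rough sketch of the flush step only, not the gallivm code: any input whose
magnitude is below the smallest normal f16 value (2^-14) becomes zero. The
helper name is hypothetical, rounding of larger values to f16 precision is
left out, and always returning +0 is a choice made for this sketch.

```c
/* Smallest normalized 16-bit float, 2^-14, as a C99 hex float literal. */
#define F16_MIN_NORMAL 0x1p-14f

/* Hypothetical helper: flush values that would be f16 denormals to zero.
 * NaN and infinity fall through unchanged since the comparison is false. */
static float
flush_f16_denorm_to_zero(float x)
{
   float mag = x < 0.0f ? -x : x;

   if (mag < F16_MIN_NORMAL)
      return 0.0f;

   return x;
}
```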
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7416>
Igor V. Kovalenko [Tue, 3 Nov 2020 06:19:06 +0000 (09:19 +0300)]
r600: amend space check for chips older than EVERGREEN
evergreen_emit_atomic_buffer_setup_count is only called if chip >= EVERGREEN;
otherwise atomic_used_mask is left uninitialized while being used
unconditionally by r600_need_cs_space, so it might want more space than needed.
Fix this by always initializing atomic_used_mask.
Fixes:
32529e60849 ("r600/eg: rework atomic counter emission with flushes")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7417>
Mike Blumenkrantz [Thu, 10 Sep 2020 14:21:42 +0000 (10:21 -0400)]
zink: break up dynamic access lowering
this is gross and slow, but to handle dominance properly, now we just load all
the ubos and bcsel away
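The idea can be illustrated outside of NIR with a plain C sketch: every block
is loaded with a constant index and a chain of selects keeps the one that
matches the dynamic index. Names here are hypothetical, and the real pass
operates on NIR instructions, not on arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: emulate ubos[block_index][offset] without ever indexing
 * the block array with a non-constant value. */
static uint32_t
load_ubo_dynamic(const uint32_t *const *ubos, unsigned num_ubos,
                 unsigned block_index, unsigned offset)
{
   assert(block_index < num_ubos);

   uint32_t result = 0;
   for (unsigned i = 0; i < num_ubos; i++) {
      /* After unrolling, i is a constant block index. */
      uint32_t value = ubos[i][offset];
      /* The select plays the role of bcsel. */
      result = (block_index == i) ? value : result;
   }
   return result;
}
```

Every block gets loaded regardless of the index, which is where the "gross
and slow" part comes from.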
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7199>
Mike Blumenkrantz [Wed, 29 Jul 2020 19:28:26 +0000 (15:28 -0400)]
zink: add pass for lowering dynamic ubo/ssbo block indexing to constants
spirv can't handle this, so instead we have to convert this into constant values
for any driver passing this sort of instruction through to vtn.
Eventually this will get removed in favor of using direct bo derefs.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7199>
Eric Anholt [Mon, 2 Nov 2020 18:11:11 +0000 (10:11 -0800)]
mesa/st: Fix a use-after-free of the NIR shader stage.
We just freed the NIR after turning it into TGSI, so it must not be used in
that last switch statement.
Closes: #3725
Fixes:
57effa342b75 ("st/mesa: Drop the TGSI paths for PBOs and use nir-to-tgsi if needed.")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7407>
Jason Ekstrand [Fri, 23 Oct 2020 19:00:49 +0000 (14:00 -0500)]
mesa/spirv: Lower variable initializers for global variables
We lower variable initializers for local variables higher up in the
function but we never called nir_lower_variable_initializers for
anything else.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7296>
Duncan Hopkins [Fri, 30 Oct 2020 10:01:42 +0000 (10:01 +0000)]
zink: Added support for MacOS MoltenVK APIs.
Detects the MoltenVK layer and extension.
If present, get the ext function pointers and use them to enable full swizzling support.
Fixes swizzling issues: full swizzle behaviour in MoltenVK is disabled by default and needs to be enabled via this API.
This also supplies the groundwork to allow IOSurfaces to be used later for surface passing.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7383>
Duncan Hopkins [Fri, 30 Oct 2020 09:35:35 +0000 (09:35 +0000)]
zink: Basic framework to check for optional instance layers and instance extensions.
Needed for later optional instance level features.
Possible layer checks are:
* device groups
* MoltenVK
* Debug validation without external hooks.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
WIP
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7383>
James Park [Sun, 1 Nov 2020 07:04:42 +0000 (00:04 -0700)]
radv,radv/winsys: Move RADV_MAX_IBS_PER_SUBMIT
RADV_MAX_IBS_PER_SUBMIT needs to be defined even for the null driver.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7394>
Anthoine Bourgeois [Tue, 28 Apr 2020 23:28:24 +0000 (01:28 +0200)]
docs/features: add some extensions we missed
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4266>
Anthoine Bourgeois [Tue, 28 Apr 2020 23:28:15 +0000 (01:28 +0200)]
docs/features: VK_KHR_mir_surface is disabled, remove it
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4266>
Anthoine Bourgeois [Tue, 28 Apr 2020 23:28:02 +0000 (01:28 +0200)]
docs/features: Minor update extensions support
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4266>
Rhys Perry [Fri, 19 Jun 2020 10:30:27 +0000 (11:30 +0100)]
nir/algebraic: better propagate constants up fadd chains
Make the optimization create more mad-friendly code if the order of the
fadd's operands is unlucky.
fossil-db (Navi):
Totals from 9259 (8.07% of 114665) affected shaders:
SGPRs: 615991 -> 616191 (+0.03%); split: -0.05%, +0.08%
VGPRs: 442184 -> 443568 (+0.31%); split: -0.10%, +0.41%
CodeSize: 32674876 -> 32625572 (-0.15%); split: -0.17%, +0.02%
MaxWaves: 108560 -> 108152 (-0.38%); split: +0.07%, -0.44%
Instrs: 6126473 -> 6120463 (-0.10%); split: -0.13%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>
Rhys Perry [Wed, 17 Jun 2020 12:44:40 +0000 (13:44 +0100)]
spirv: reverse order in matrix multiplication
This will create code that is easier to combine into MADs/FMA when the
last component of the vector is 1.0.
nir_opt_algebraic_late has an optimization to do something similar but it
only works for inexact code, if the multiplication-by-1 optimization is
done before it and if the backend enables fuse_ffma.
fossil-db (Navi):
Totals from 4296 (3.75% of 114665) affected shaders:
SGPRs: 283468 -> 283764 (+0.10%); split: -0.02%, +0.12%
VGPRs: 172868 -> 172904 (+0.02%); split: -0.09%, +0.11%
CodeSize: 14045312 -> 14027128 (-0.13%); split: -0.15%, +0.02%
MaxWaves: 59285 -> 59282 (-0.01%); split: +0.04%, -0.05%
Instrs: 2703507 -> 2683187 (-0.75%); split: -0.76%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>
Rhys Perry [Tue, 16 Jun 2020 15:04:09 +0000 (16:04 +0100)]
nir: scalarize fdot in reverse
This will create code that is easier to combine into MADs/FMA when the
last component is 1.0.
nir_opt_algebraic_late has an optimization to do something similar but it
only works for inexact code, if the multiplication-by-1 optimization is
done before it and if the backend enables fuse_ffma.
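A small sketch of why the order matters, assuming the second operand's last
component is 1.0 so that a[3] * 1.0 has already been simplified to a[3]. The
function names are made up, and the mul/mad/add comments describe what a
fusing backend could emit, not what any particular driver does.

```c
/* dot(a, b) with b.w == 1.0, scalarized first component first. */
static float
dot4_w1_forward(const float a[4], const float b[3])
{
   float r = a[0] * b[0]; /* mul: nothing to fuse with yet */
   r = r + a[1] * b[1];   /* mad */
   r = r + a[2] * b[2];   /* mad */
   return r + a[3];       /* add: left over, 4 instructions total */
}

/* Same dot product, scalarized last component first. */
static float
dot4_w1_reverse(const float a[4], const float b[3])
{
   float r = a[3];        /* the 1.0 term seeds the accumulator */
   r = r + a[2] * b[2];   /* mad */
   r = r + a[1] * b[1];   /* mad */
   return r + a[0] * b[0]; /* mad: 3 instructions total */
}
```

Both orders compute the same sum up to rounding; the reverse order simply
leaves no stray mul or add when the last component is 1.0.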
fossil-db (Navi):
Totals from 85583 (74.64% of 114665) affected shaders:
SGPRs: 4556060 -> 4558596 (+0.06%); split: -0.07%, +0.12%
VGPRs: 3315060 -> 3312984 (-0.06%); split: -0.23%, +0.17%
SpillSGPRs: 13552 -> 13553 (+0.01%)
CodeSize: 184962756 -> 184431388 (-0.29%); split: -0.32%, +0.03%
MaxWaves: 1208693 -> 1209361 (+0.06%); split: +0.17%, -0.11%
Instrs: 35678819 -> 35361617 (-0.89%); split: -0.91%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>