Alyssa Rosenzweig [Fri, 29 Jan 2021 01:25:55 +0000 (20:25 -0500)]
panfrost: Add sample positions sysval
For Midgard. On Bifrost, the hardware pushes this directly to FAU.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>
Alyssa Rosenzweig [Fri, 29 Jan 2021 17:43:08 +0000 (12:43 -0500)]
panfrost: Preload sample mask if needed
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>
Alyssa Rosenzweig [Fri, 12 Feb 2021 21:28:52 +0000 (16:28 -0500)]
pan/decode: Only print local storage for vertex jobs
It's convenient to group this with the framebuffer, but the other fields
are unused by the hardware for vertex jobs. They _are_ used for tiler
jobs.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>
Alyssa Rosenzweig [Fri, 29 Jan 2021 21:52:45 +0000 (16:52 -0500)]
pan/decode: Cleanup sample locations decode
We know what this is now. I opted to leave it in ~fixed-point format to
avoid bikeshedding over precision.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>
Alyssa Rosenzweig [Wed, 27 Jan 2021 21:18:01 +0000 (16:18 -0500)]
nir: Add sample_positions_pan intrinsic
Facilitates the gl_SamplePosition lowering on Bifrost, where the sample
positions are accessed directly in a packed in-memory format.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8774>
Kenneth Graunke [Fri, 12 Feb 2021 19:39:45 +0000 (11:39 -0800)]
iris: Make a pin_scratch_space() helper
We need to (re-)pin the scratch buffer in four different places, and
it's going to get slightly more complicated on future platforms. So,
make a helper function, allowing us to add the complexity in one spot.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9023>
Hoe Hao Cheng [Fri, 12 Feb 2021 20:56:23 +0000 (04:56 +0800)]
zink: enable KHR_shader_draw_parameters on Vulkan <1.2
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Fri, 12 Feb 2021 19:14:26 +0000 (03:14 +0800)]
zink/codegen: do not enable extensions that are now core
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Fri, 12 Feb 2021 13:50:55 +0000 (21:50 +0800)]
zink/codegen: fix type annotations
mypy complains about this
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Thu, 11 Feb 2021 19:35:32 +0000 (03:35 +0800)]
zink/codegen: validate has_properties and has_features
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Thu, 11 Feb 2021 18:41:18 +0000 (02:41 +0800)]
zink/codegen: perform basic validation in zink_device_info
Check for existence of extension and its type
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Thu, 11 Feb 2021 18:10:40 +0000 (02:10 +0800)]
zink/codegen: make zink_device_info accept vk.xml
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Thu, 11 Feb 2021 18:16:06 +0000 (02:16 +0800)]
zink/codegen: introduce notion of non-standard extensions
this is for the MoltenVK extensions, especially "VK_MVK_moltenvk", which
right now is reserved in the registry. Making it non-standard skips all
the validations.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Hoe Hao Cheng [Thu, 11 Feb 2021 18:14:14 +0000 (02:14 +0800)]
zink/codegen: more validation in zink_instance
the MVK check is a workaround, since VK_MVK_moltenvk is not an official
VK extension per se - the next patch will introduce nonstandardness to
Extension.
Two new validations are added by this patch:
1. extension type (non-instance extensions are rejected)
2. existence of specified instance functions
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
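The two validations listed in this commit can be sketched roughly as follows. This is an illustrative model, not the actual zink codegen: RegistryEntry, validate_instance_extension and the dict-based registry are hypothetical names, and non-standard extensions are simply assumed to bypass every check.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RegistryEntry:
    # Hypothetical stand-in for what a parser would extract from vk.xml.
    ext_type: str                                   # "instance" or "device"
    commands: List[str] = field(default_factory=list)


def validate_instance_extension(registry: Dict[str, RegistryEntry],
                                name: str,
                                instance_funcs: List[str],
                                nonstandard: bool = False) -> None:
    """Raise ValueError if the extension fails either validation."""
    if nonstandard:
        # Assumption: non-standard extensions skip all validation.
        return
    entry = registry.get(name)
    if entry is None:
        raise ValueError(f"{name} does not exist in the registry")
    # Validation 1: non-instance extensions are rejected.
    if entry.ext_type != "instance":
        raise ValueError(f"{name} is not an instance extension")
    # Validation 2: every specified instance function must exist.
    for func in instance_funcs:
        if func not in entry.commands:
            raise ValueError(f"{name} has no command {func}")
```

With such a registry, a device extension like VK_KHR_draw_indirect_count would be rejected by the first check, while VK_MVK_moltenvk would pass only when flagged as non-standard.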
Hoe Hao Cheng [Thu, 11 Feb 2021 16:17:04 +0000 (00:17 +0800)]
zink/codegen: introduce ExtensionRegistry
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9021>
Samuel Pitoiset [Fri, 12 Feb 2021 09:08:14 +0000 (10:08 +0100)]
radv/winsys: set use_global_list inside the critical section
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9006>
Samuel Pitoiset [Fri, 12 Feb 2021 09:06:48 +0000 (10:06 +0100)]
radv: only make the WSI images resident if the global BO list is used
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4270
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4286
Fixes: 96b03aaa175 ("radv: use the global BO list from the winsys")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9006>
Daniel Schürmann [Tue, 2 Feb 2021 16:46:35 +0000 (17:46 +0100)]
aco: use VCC as regular SGPR pair on GFX10
There is no need to reserve it only for special purposes.
Totals from 139391 (100.00% of 139391) affected shaders (Navi10):
VGPRs: 4738296 -> 4738156 (-0.00%); split: -0.01%, +0.00%
SpillSGPRs: 16188 -> 14968 (-7.54%); split: -7.60%, +0.06%
CodeSize: 294204472 -> 294118048 (-0.03%); split: -0.04%, +0.01%
MaxWaves: 2119584 -> 2119619 (+0.00%); split: +0.00%, -0.00%
Instrs: 56075079 -> 56056235 (-0.03%); split: -0.05%, +0.01%
Cycles: 1757781564 -> 1755354032 (-0.14%); split: -0.16%, +0.02%
VMEM: 52995887 -> 52996319 (+0.00%); split: +0.07%, -0.07%
SMEM: 9005338 -> 9004858 (-0.01%); split: +0.16%, -0.17%
VClause: 1178436 -> 1178331 (-0.01%); split: -0.02%, +0.01%
SClause: 2403649 -> 2404542 (+0.04%); split: -0.14%, +0.18%
Copies: 3447073 -> 3432417 (-0.43%); split: -0.66%, +0.23%
Branches: 1166542 -> 1166422 (-0.01%); split: -0.11%, +0.10%
PreSGPRs: 4229322 -> 4235538 (+0.15%)
PreVGPRs: 3817111 -> 3817040 (-0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Daniel Schürmann [Mon, 8 Feb 2021 16:30:26 +0000 (17:30 +0100)]
aco: don't abort() if disassembly fails
We used that to catch assembly errors in the past,
but ACO now uses so many hardware features that are
not supported by the LLVM disassembler that it is
no longer really suited as a debugging tool.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Daniel Schürmann [Sat, 6 Feb 2021 17:40:21 +0000 (18:40 +0100)]
aco: check get_reg_specified() on register hints
This ensures that max_used_sgpr is adjusted accordingly.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Daniel Schürmann [Mon, 8 Feb 2021 13:38:43 +0000 (14:38 +0100)]
aco: also consider VCC in get_reg_specified()
This allows split_vector and others to keep their VCC position.
Totals from 4573 (3.28% of 139391) affected shaders (Navi10):
CodeSize: 54292268 -> 54289324 (-0.01%); split: -0.03%, +0.03%
Instrs: 10327645 -> 10326941 (-0.01%); split: -0.04%, +0.04%
Cycles: 744410748 -> 744034732 (-0.05%); split: -0.07%, +0.02%
VMEM: 749093 -> 749092 (-0.00%); split: +0.00%, -0.00%
SMEM: 269306 -> 269322 (+0.01%)
SClause: 358746 -> 358744 (-0.00%)
Copies: 826051 -> 823910 (-0.26%); split: -0.55%, +0.29%
Branches: 355074 -> 356493 (+0.40%); split: -0.01%, +0.41%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Daniel Schürmann [Fri, 5 Feb 2021 13:38:08 +0000 (14:38 +0100)]
aco: don't decrease the vgpr_limit when encountering bpermute
Instead we recalculate vgpr_limit on demand, depending on
the number of needed shared VGPRs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Daniel Schürmann [Fri, 5 Feb 2021 13:36:39 +0000 (14:36 +0100)]
aco: refactor GPR limit calculation
This patch delays the calculation of GPR limits in order to
precisely incorporate extra registers (VCC etc.) and shared VGPRs.
Additionally, the allocation granularity is used to set the config.
This has some effect on the reported SGPR stats.
Totals (Navi10):
SGPRs: 6971787 -> 17753642 (+154.65%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
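A jump in reported register counts is what one would expect once each shader's usage is rounded up to the allocation granule before being reported. A minimal sketch of that rounding, assuming a plain round-up-to-multiple rule; align_up is a hypothetical helper, and the granule of 16 is borrowed from the RDNA change in this series purely for illustration (which register file it applies to is not stated here):

```python
def align_up(count: int, granule: int) -> int:
    """Round count up to the next multiple of granule."""
    return (count + granule - 1) // granule * granule


# Illustrative granule only; see the note above.
GRANULE = 16

used = [5, 17, 32]                        # registers actually needed
reported = [align_up(n, GRANULE) for n in used]

print(sum(used), "->", sum(reported))     # prints "54 -> 80"
```

The shaders themselves are unchanged in this model; only the reported totals grow.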
Daniel Schürmann [Tue, 2 Feb 2021 16:33:09 +0000 (17:33 +0100)]
aco: change gpr_alloc_granule to full alignment
This also switches the alloc_granule of Tonga and Iceland
to 96, so that the calculation is consistent.
Also changes the granularity for RDNA to 16 to keep
better stats with the upcoming patch.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Daniel Schürmann [Fri, 5 Feb 2021 17:25:18 +0000 (18:25 +0100)]
aco: fix shared VGPR allocation on RDNA2
VGPRs are now allocated in blocks of 8 normal
or 16 shared VGPRs, respectively.
Fixes: 14a5021aff661a26d76f330fec55d400d35443a8 ('aco/gfx10: Refactor of GFX10 wave64 bpermute.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
Hoe Hao Cheng [Thu, 11 Feb 2021 16:35:15 +0000 (00:35 +0800)]
zink: VK_KHR_draw_indirect_count is a device extension
this fixes some testcases on CI.
Fixes: 1c01ad1b804 ("zink: add KHR_draw_indirect_count detection")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8991>
Samuel Pitoiset [Thu, 11 Feb 2021 18:27:24 +0000 (19:27 +0100)]
radv: emit pipeline bind markers for SQTT
I suspect this marker to be useful for correlating pipeline shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8995>
Mike Blumenkrantz [Fri, 12 Feb 2021 15:22:43 +0000 (10:22 -0500)]
zink: fix streamout for tess stage
the tess shader needs to actually emit xfb stuff in order for it to work
Fixes: 2891e0b74e6 ("zink: pull xfb info from tess shader when applicable")
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9013>
Jesse Natalie [Thu, 4 Feb 2021 16:43:25 +0000 (08:43 -0800)]
wgl: Disable automatic use of layered drivers with LIBGL_ALWAYS_SOFTWARE
Fixes: 8955980f ("gallium/targets/libgl-gdi: prefer d3d12 driver")
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8865>
Jesse Natalie [Thu, 4 Feb 2021 16:30:02 +0000 (08:30 -0800)]
d3d12: Fail screen creation if a shader validator is needed and can't be created
Also fail screen creation if experimental shader models are requested, but can't be enabled
Fixes: 2ea15cd6 ("d3d12: introduce d3d12 gallium driver")
Reviewed-by: Bill Kristiansen <billkris@microsoft.com>
Tested-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4022
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8865>
Jesse Natalie [Thu, 4 Feb 2021 16:08:12 +0000 (08:08 -0800)]
wgl: Add a loop for screen creation with an ordered list of fallbacks
Fixes: 8955980f ("gallium/targets/libgl-gdi: prefer d3d12 driver")
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4022
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8865>
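The ordered-fallback loop can be pictured as below. This is a sketch of the general pattern only, not the wgl code: create_screen, the creators mapping and the driver order in the usage note are all hypothetical.

```python
from typing import Callable, Dict, List, Optional


def create_screen(order: List[str],
                  creators: Dict[str, Callable[[], Optional[object]]]
                  ) -> Optional[object]:
    """Return the first screen that a driver in `order` manages to create."""
    for name in order:
        creator = creators.get(name)
        if creator is None:
            continue              # driver not available in this build
        screen = creator()
        if screen is not None:
            return screen         # first success wins
    return None                   # every fallback failed
```

For example, with an order of d3d12, llvmpipe, softpipe, a d3d12 creator that returns nothing (say, because its shader validator could not be created) simply lets the loop move on to the next entry.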
Jesse Natalie [Thu, 4 Feb 2021 15:58:06 +0000 (07:58 -0800)]
wgl: Refactor screen creation to a function
Fixes: 8955980f ("gallium/targets/libgl-gdi: prefer d3d12 driver")
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4022
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8865>
Alyssa Rosenzweig [Fri, 12 Feb 2021 13:21:45 +0000 (08:21 -0500)]
pan/bi: Fix empty shader handling
Fixes INSTR_INVALID_ENC fault on dEQP-GLES31.functional.compute.basic.empty
Fixes: bfcdc8f1747 ("pan/bi: Add some zero bytes after shaders on Bifrost")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9011>
Alyssa Rosenzweig [Thu, 11 Feb 2021 20:23:01 +0000 (15:23 -0500)]
pan/bi: Fix jumps to terminal block again
New scheduler broke this. We need to shuffle some code around so we do
the lower pre-schedule instead of post-schedule (no clauses to work
with).
Fixes: 77933d16d8c ("pan/bi: Switch to new scheduler")
Reported-by: Icecream95 <ixn@disroot.org>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9011>
Alyssa Rosenzweig [Wed, 27 Jan 2021 17:17:36 +0000 (12:17 -0500)]
panfrost: Fake shader images for bifrost+deqp
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9011>
Michel Dänzer [Thu, 11 Feb 2021 08:56:09 +0000 (09:56 +0100)]
ci: Disable scons-win64 job
It's failed for almost a month, so right now it's mostly noise and a
waste of CI resources.
It can easily be re-enabled by an MR which makes it pass again.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8976>
Bas Nieuwenhuizen [Thu, 11 Feb 2021 19:32:00 +0000 (20:32 +0100)]
radv: Ignore WC flags for VRAM.
Otherwise there might be buffers for which we don't have a type.
Fixes: 7262c743dc8 ("radv: Determine memory type for import based on fd.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4280
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8996>
Mike Blumenkrantz [Thu, 20 Aug 2020 17:36:46 +0000 (13:36 -0400)]
zink: support SO_OVERFLOW pipe query types
this is really just not what we want to be doing. vulkan has no method for
doing on-gpu checks for xfb overflow, instead providing the info as two values
for the user to do with as they will, forcing us to gpu stall any time we
need to interact with these queries
for the ANY variant of the query, we need to create even more xfb query pools,
since we now need to be monitoring all available vertex streams for overflows
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8992>
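The CPU-side check this commit describes amounts to comparing the two values of each stream's query result. A rough sketch, assuming the two values are the primitives written and primitives needed counts that a Vulkan transform feedback stream query returns; the function names are hypothetical and this is not the zink implementation:

```python
from typing import List, Tuple

# One query result per vertex stream: (primitives_written, primitives_needed).
StreamResult = Tuple[int, int]


def stream_overflowed(result: StreamResult) -> bool:
    """A stream overflowed if fewer primitives were written than needed."""
    written, needed = result
    return written != needed


def any_stream_overflowed(results: List[StreamResult]) -> bool:
    """The ANY variant has to look at every monitored stream."""
    return any(stream_overflowed(r) for r in results)
```

Because the comparison happens on the CPU, the results must be read back first, which is where the stall mentioned in the commit message comes from.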
Mike Blumenkrantz [Sat, 6 Feb 2021 00:31:05 +0000 (19:31 -0500)]
zink: put SO_OVERFLOW queries on the primgen list
these need to know if xfb was active during the query
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8992>
Mike Blumenkrantz [Thu, 20 Aug 2020 17:35:46 +0000 (13:35 -0400)]
zink: break out cpu query reading for qbos into separate function
we're going to need this more than once
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8992>
Mike Blumenkrantz [Thu, 20 Aug 2020 14:27:38 +0000 (10:27 -0400)]
zink: make the xfb_query_pool into an array
we'll need to potentially be observing all streams to handle the
query types from ARB_transform_feedback_overflow_query
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8992>
Mike Blumenkrantz [Thu, 20 Aug 2020 14:26:57 +0000 (10:26 -0400)]
zink: always use query->type for starting/stopping xfb queries
we're going to be seeing some overlap here
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8992>
Alyssa Rosenzweig [Fri, 12 Feb 2021 02:33:02 +0000 (21:33 -0500)]
pan/bi: Skip ATEST for colour blit shaders
Small win.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9002>
Alyssa Rosenzweig [Fri, 12 Feb 2021 02:32:39 +0000 (21:32 -0500)]
panfrost: Pass is_blit flag around
There are blit shader specific optimizations available.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9002>
Erik Faye-Lund [Thu, 4 Feb 2021 10:21:35 +0000 (11:21 +0100)]
zink: use gallium api to copy to display-target
This allows us to avoid forcing linear and host-visible
display-targets.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858>
Erik Faye-Lund [Thu, 4 Feb 2021 12:58:12 +0000 (13:58 +0100)]
zink: ignore irrelevant bind-flags
We don't need to create display-targets for shared or scanout, because
we never even see those in the sw-winsys case.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858>
Erik Faye-Lund [Thu, 4 Feb 2021 09:46:56 +0000 (10:46 +0100)]
zink: limit host-visible bind-flags
The only type that should really require to be host-visible is the
display-target, and that's just because of our silly flush_frontbuffer
implementation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858>
Erik Faye-Lund [Thu, 4 Feb 2021 09:42:22 +0000 (10:42 +0100)]
zink: don't always require linear display-targets
We only need these display-targets to be linear in the case of a
software winsys. In the DRM case, they can be tiled without issues.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858>
Erik Faye-Lund [Wed, 3 Feb 2021 16:22:29 +0000 (17:22 +0100)]
zink: do not use extra staging resource unless needed
The reason we check for staging-resources here is really because they
are the only images guaranteed to be host-visible.
But on UMA architectures, it's quite likely to have memory that is
*both* host-visible *and* device-local, so let's see what we found
instead.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858>
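The idea of checking what memory was actually found, rather than whether the resource is a staging resource, can be sketched with the Vulkan memory property bits. The two constants use the bit values from the Vulkan headers; needs_staging is a hypothetical helper, not the zink function:

```python
# Bit values of VkMemoryPropertyFlagBits from the Vulkan headers.
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT = 0x1
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT = 0x2


def needs_staging(memory_flags: int) -> bool:
    """An extra staging copy is only needed when the memory cannot be mapped."""
    return not (memory_flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)


# UMA-style memory: both device-local and host-visible, so no staging copy.
uma = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
# Device-local only: the host cannot map it, so a staging resource is needed.
vram = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
```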
Erik Faye-Lund [Wed, 3 Feb 2021 16:11:56 +0000 (17:11 +0100)]
zink: drop extra set of parens
We don't need to be doubly sure here, we can just use a single set of
parens instead.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858>
Erik Faye-Lund [Thu, 11 Feb 2021 10:12:55 +0000 (11:12 +0100)]
ci: disable sporadically failing test
spec@arb_timer_query@timestamp-get seems to fail on D3D12 / Windows
every now and then. Until that's been figured out, let's disable the
test in CI.
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8978>
Erik Faye-Lund [Thu, 11 Feb 2021 12:34:45 +0000 (13:34 +0100)]
lavapipe: handle null-buffers for xfb
The Vulkan spec says the following for vkCmdBeginTransformFeedbackEXT:
"For each element of pCounterBuffers that is VK_NULL_HANDLE, transform
feedback will start capturing vertex data to byte zero in the
corresponding bound transform feedback buffer."
While not quite as explicit, similar wording exists for
vkCmdEndTransformFeedbackEXT in "Valid Usage" section.
So, this means that we should handle NULL in this case, and simply
ignore the corresponding reads and writes.
This fixes a whole lot of crashes when using transform-feedback with
Zink.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8982>
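The behaviour the spec text asks for can be modelled as below. This is a toy model rather than the lavapipe code: a counter buffer is represented by a one-element list, None stands in for VK_NULL_HANDLE, and both function names are hypothetical.

```python
from typing import List, Optional

CounterBuffer = Optional[List[int]]   # None plays the role of VK_NULL_HANDLE


def begin_offsets(counter_buffers: List[CounterBuffer]) -> List[int]:
    """Starting byte offset for each bound transform feedback buffer."""
    offsets = []
    for counter in counter_buffers:
        if counter is None:
            offsets.append(0)            # null handle: capture from byte zero
        else:
            offsets.append(counter[0])   # resume from the stored offset
    return offsets


def end_store(counter_buffers: List[CounterBuffer],
              offsets: List[int]) -> None:
    """Write the final offsets back, ignoring null handles."""
    for counter, offset in zip(counter_buffers, offsets):
        if counter is not None:
            counter[0] = offset
```

Skipping the read and the write for null entries is what avoids dereferencing a buffer that was never provided.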
Giovanni Mascellani [Fri, 12 Feb 2021 07:36:58 +0000 (08:36 +0100)]
anv: Allow null handle in DestroyDescriptorUpdateTemplate.
By the Vulkan specification, and similarly to many other Vulkan calls,
it is allowed to destroy a null descriptor update template.
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Fixes: af5f13e58c9dfe ("anv: add VK_KHR_descriptor_update_template support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9005>
Iago Toral Quiroga [Thu, 11 Feb 2021 11:28:52 +0000 (12:28 +0100)]
broadcom/compiler: use unifa for UBO loads from uniform addresses
This basically processes UBO loads as uniform loads by writing
the load address to the unifa register and reading sequential
values with ldunifa.
This process is faster than going through the TMU, but we can only
use it when the address we are reading from is uniform across all
channels, since we are basically reading from the UBO address
as if it was a uniform stream.
This leads to better performance in the UE4 Shooter demo.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 11:24:57 +0000 (12:24 +0100)]
broadcom/compiler: emit ldunifarf when needed
Just like ldunif and ldunifrf, ldunifa writes to the r5 accumulator
and ldunifarf writes to the register file.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 11:18:38 +0000 (12:18 +0100)]
broadcom/compiler: do not DCE ldunifa
ldunifa reads a uniform from the unifa address and updates the unifa
address implicitly, so if we dead-code-eliminate one a follow-up
ldunifa will not read from the appropriate address.
We could avoid this if the compiler ensures that every ldunifa is
paired with an explicit unifa, so for example if we are reading a
vec4, we could emit:
unifa (addr)
ldunifa
unifa (addr+4)
ldunifa
unifa (addr+8)
ldunifa
unifa (addr+12)
ldunifa
instead of:
unifa (addr)
ldunifa
ldunifa
ldunifa
ldunifa
But since each unifa needs a 3-slot delay before we can do ldunifa,
that would end up being quite expensive.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
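Why eliminating a single ldunifa is unsafe can be seen in a toy model of the stream. The sketch assumes only what the message states: unifa sets the address, and each ldunifa reads one 32-bit value and then advances the address by 4 bytes. The run function and the tuple encoding of instructions are hypothetical.

```python
from typing import Dict, List, Tuple


def run(program: List[Tuple], memory: Dict[int, int]) -> List[int]:
    """Execute ("unifa", addr) / ("ldunifa",) ops and return loaded values."""
    unifa = 0
    loaded = []
    for op in program:
        if op[0] == "unifa":
            unifa = op[1]                 # explicit address write
        elif op[0] == "ldunifa":
            loaded.append(memory[unifa])  # read from the current address
            unifa += 4                    # implicit 32-bit increment
    return loaded


memory = {0: 10, 4: 11, 8: 12, 12: 13}    # a vec4 stored at address 0
full = [("unifa", 0)] + [("ldunifa",)] * 4
# Suppose the second component is unused and its ldunifa is eliminated.
dce = [("unifa", 0), ("ldunifa",), ("ldunifa",), ("ldunifa",)]

print(run(full, memory))   # [10, 11, 12, 13]
print(run(dce, memory))    # [10, 11, 12]
```

After the elimination, the loads meant to fetch the third and fourth components (12 and 13) return 11 and 12 instead, because the implicit increment of the removed load never happened.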
Iago Toral Quiroga [Thu, 11 Feb 2021 11:16:10 +0000 (12:16 +0100)]
broadcom/compiler: disallow reading two uniforms in the same instruction
The simulator asserts on this, which can happen if we merge a ldunif
(or any other instruction that reads a uniform implicitly) and
ldunifa in the same instruction.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 11:15:17 +0000 (12:15 +0100)]
broadcom/compiler: ensure 3-slot delay between unifa and ldunifa
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 11:13:02 +0000 (12:13 +0100)]
broadcom/compiler: preserve ordering of unifa/ldunifa sequences
unifa writes the address from which follow-up ldunifa loads,
and each ldunifa increments the unifa address by 32 bits, so the
loads need to be ordered too.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 11:09:28 +0000 (12:09 +0100)]
broadcom/compiler: disallow unifa overlap with thread switch/end
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 12:22:28 +0000 (13:22 +0100)]
broadcom/compiler: add a helper to check if an instruction writes unifa
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 10:52:13 +0000 (11:52 +0100)]
broadcom/compiler: don't check for GFXH-1633 on V3D 4.2.x
This has been fixed since V3D 4.2.14 (Rpi4), which is the hardware
we are targeting. Our version resolution doesn't allow us to check
for 4.2 versions lower than .14, but that is okay because the
simulator would still validate this in any case.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 10:32:35 +0000 (11:32 +0100)]
broadcom/compiler: name registers correctly based on V3D version
So we can differentiate between TMU for V3D 3.x and UNIFA for V3D 4.x,
which are aliased.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Iago Toral Quiroga [Thu, 11 Feb 2021 10:29:00 +0000 (11:29 +0100)]
broadcom/compiler: pass a devinfo to check if an instruction writes to TMU
V3D 3.x has V3D_QPU_WADDR_TMU which in V3D 4.x is V3D_QPU_WADDR_UNIFA
(which isn't a TMU write address). This change passes a devinfo to
any functions that need to do these checks so we can account for the
target V3D version correctly.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
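The need for a devinfo argument can be illustrated with a small model in which one write-address encoding means TMU on V3D 3.x and UNIFA on V3D 4.x. Everything here is hypothetical: the numeric encoding is made up, the version is assumed to be expressed as major * 10 + minor (e.g. 33, 42), and the other TMU write addresses are left out.

```python
# Made-up encoding value shared by the two aliased registers.
WADDR_TMU_OR_UNIFA = 9


def writes_tmu(ver: int, waddr: int) -> bool:
    """True if this write address targets the TMU on the given version."""
    if waddr == WADDR_TMU_OR_UNIFA:
        return ver < 40        # TMU on 3.x, UNIFA on 4.x
    return False


def writes_unifa(ver: int, waddr: int) -> bool:
    """True if this write address targets UNIFA on the given version."""
    return waddr == WADDR_TMU_OR_UNIFA and ver >= 40
```

Without the version, a check on the raw encoding alone would treat a unifa write on 4.x as a TMU write.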
Iago Toral Quiroga [Thu, 11 Feb 2021 09:29:12 +0000 (10:29 +0100)]
broadcom/compiler: add V3D_QPU_WADDR_UNIFA
This only exists in V3D 4.x and aliases V3D_QPU_WADDR_TMU from V3D 3.x.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
Giovanni Mascellani [Thu, 11 Feb 2021 13:27:39 +0000 (14:27 +0100)]
disk_cache: Fail creation when the queue cannot be initialized.
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: e2c4435b078a ("util/disk_cache: add thread queue to disk cache")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8983>
Arcady Goldmints-Orlov [Mon, 8 Feb 2021 22:03:20 +0000 (17:03 -0500)]
broadcom/compiler: Skip bool_to_cond where possible
This change keeps track of when a boolean temp is loaded into the flags
by a comparison instruction and uses that information to skip emitting
instructions to set the flags in ntq_emit_bool_to_cond when the flags
already have the right contents.
total instructions in shared programs: 11116502 -> 11112225 (-0.04%)
instructions in affected programs: 631691 -> 627414 (-0.68%)
helped: 1591
HURT: 754
helped stats (abs) min: 1 max: 94 x̄: 4.14 x̃: 3
helped stats (rel) min: 0.11% max: 13.46% x̄: 2.10% x̃: 1.58%
HURT stats (abs) min: 1 max: 19 x̄: 3.07 x̃: 2
HURT stats (rel) min: 0.13% max: 19.67% x̄: 1.88% x̃: 1.15%
95% mean confidence interval for instructions value: -2.02 -1.63
95% mean confidence interval for instructions %-change: -0.94% -0.71%
Instructions are helped.
total uniforms in shared programs: 3281555 -> 3281513 (<.01%)
uniforms in affected programs: 1754 -> 1712 (-2.39%)
helped: 10
HURT: 5
helped stats (abs) min: 1 max: 19 x̄: 7.90 x̃: 5
helped stats (rel) min: 0.56% max: 11.11% x̄: 7.37% x̃: 11.05%
HURT stats (abs) min: 1 max: 15 x̄: 7.40 x̃: 3
HURT stats (rel) min: 0.64% max: 9.55% x̄: 5.31% x̃: 3.41%
95% mean confidence interval for uniforms value: -8.57 2.97
95% mean confidence interval for uniforms %-change: -7.35% 1.07%
Inconclusive result (value mean confidence interval includes 0).
total max-temps in shared programs: 1758419 -> 1758174 (-0.01%)
max-temps in affected programs: 7006 -> 6761 (-3.50%)
helped: 290
HURT: 14
helped stats (abs) min: 1 max: 8 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.79% max: 22.86% x̄: 6.61% x̃: 4.88%
HURT stats (abs) min: 1 max: 13 x̄: 6.00 x̃: 3
HURT stats (rel) min: 1.54% max: 54.17% x̄: 23.99% x̃: 9.12%
95% mean confidence interval for max-temps value: -1.03 -0.58
95% mean confidence interval for max-temps %-change: -6.24% -4.16%
Max-temps are helped.
total sfu-stalls in shared programs: 23676 -> 23610 (-0.28%)
sfu-stalls in affected programs: 1578 -> 1512 (-4.18%)
helped: 257
HURT: 252
helped stats (abs) min: 1 max: 3 x̄: 1.37 x̃: 1
helped stats (rel) min: 11.11% max: 100.00% x̄: 46.70% x̃: 40.00%
HURT stats (abs) min: 1 max: 2 x̄: 1.14 x̃: 1
HURT stats (rel) min: 0.00% max: 200.00% x̄: 41.65% x̃: 25.00%
95% mean confidence interval for sfu-stalls value: -0.25 -0.01
95% mean confidence interval for sfu-stalls %-change: -8.24% 2.33%
Inconclusive result (%-change mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11140178 -> 11135835 (-0.04%)
inst-and-stalls in affected programs: 633972 -> 629629 (-0.69%)
helped: 1581
HURT: 755
helped stats (abs) min: 1 max: 94 x̄: 4.26 x̃: 3
helped stats (rel) min: 0.11% max: 13.46% x̄: 2.12% x̃: 1.59%
HURT stats (abs) min: 1 max: 17 x̄: 3.17 x̃: 2
HURT stats (rel) min: 0.05% max: 19.67% x̄: 1.93% x̃: 1.20%
95% mean confidence interval for inst-and-stalls value: -2.06 -1.66
95% mean confidence interval for inst-and-stalls %-change: -0.93% -0.70%
Inst-and-stalls are helped.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8933>
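The tracking described in this commit can be modelled as remembering which boolean temp the flags currently reflect. This is a simplified sketch, not the v3d compiler code: FlagTracker, its methods and the textual instruction strings are hypothetical.

```python
from typing import List, Optional


class FlagTracker:
    def __init__(self) -> None:
        self.flags_temp: Optional[str] = None   # temp whose value is in flags
        self.emitted: List[str] = []

    def emit_compare(self, dest: str) -> None:
        """A comparison writes dest and leaves its result in the flags."""
        self.emitted.append(f"cmp -> {dest} (sets flags)")
        self.flags_temp = dest

    def emit_other(self, text: str, clobbers_flags: bool = False) -> None:
        """Any other instruction; it may overwrite the flags."""
        self.emitted.append(text)
        if clobbers_flags:
            self.flags_temp = None

    def bool_to_cond(self, src: str) -> None:
        """Make the flags reflect src, emitting nothing if they already do."""
        if self.flags_temp == src:
            return
        self.emitted.append(f"set flags from {src}")
        self.flags_temp = src
```

A compare immediately followed by bool_to_cond on its result emits no extra instruction in this model; once something else overwrites the flags, the flag-setting instruction is emitted again.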
Arcady Goldmints-Orlov [Mon, 8 Feb 2021 21:41:35 +0000 (16:41 -0500)]
broadcom/compiler: Add a v3d_compile argument to vir_set_[pu]f
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8933>
Bas Nieuwenhuizen [Sun, 7 Feb 2021 15:01:50 +0000 (16:01 +0100)]
radv: Define supported extensions in C.
One python generator less.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8900>
Bas Nieuwenhuizen [Sun, 7 Feb 2021 14:25:14 +0000 (15:25 +0100)]
radv: Remove custom icd json generation.
No Android.mk changes as the radv provided json file isn't used.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8900>
Alyssa Rosenzweig [Thu, 13 Aug 2020 22:41:11 +0000 (18:41 -0400)]
panfrost: Set barriers flag for compute shaders
Pipe in the info from NIR. Fix incorrect handling of helper invocations,
which also use the barrier flag.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6312>
Alyssa Rosenzweig [Thu, 13 Aug 2020 22:38:25 +0000 (18:38 -0400)]
compiler, nir: Add and set barrier metadata
Useful for determining whether certain optimizations are legal for a
compute shader (e.g. optimizing workgroup size in the driver).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6312>
Alyssa Rosenzweig [Thu, 11 Feb 2021 17:37:17 +0000 (12:37 -0500)]
panfrost: Enable ES3 conformant floating-point
Don't suppress inf/nan. Triggers bugs in broken apps like glmark2 (fixed
upstream but traces don't have the fix yet), so update the trace
expectations.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7550>
Kenneth Graunke [Tue, 9 Feb 2021 02:38:22 +0000 (18:38 -0800)]
iris: Remove context from iris_disk_cache_retrieve
We don't use the context other than getting the screen and uploader.
Fixes: 84a38ec1336 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
Kenneth Graunke [Tue, 9 Feb 2021 02:26:57 +0000 (18:26 -0800)]
iris: Remove context from iris_create_uncompiled_shader
Nothing uses the context here, just the screen.
Fixes: 84a38ec1336 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
Kenneth Graunke [Tue, 9 Feb 2021 02:25:01 +0000 (18:25 -0800)]
iris: Remove context from iris_compile_vs and friends
Instead, we pass the screen, an uploader, and a debug callback.
Fixes: 84a38ec1336 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
Kenneth Graunke [Tue, 9 Feb 2021 02:01:31 +0000 (18:01 -0800)]
iris: Remove context from iris_upload_shader()
Shaders are now shared across contexts, so we'd like to avoid requiring
access to a full context. Instead, we pass the screen and an uploader
to use.
Fixes: 84a38ec1336 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
Kenneth Graunke [Tue, 9 Feb 2021 01:50:41 +0000 (17:50 -0800)]
iris: Remove context from iris_debug_recompile
This doesn't and shouldn't use the context. It just wants a debug
callback to print things on.
Fixes: 84a38ec1336 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
Kenneth Graunke [Wed, 3 Feb 2021 09:41:42 +0000 (01:41 -0800)]
iris: Fill out scratch base address dynamically
Now that shaders are shared between contexts, we can't pre-bake the
shader scratch address into the derived 3DSTATE_XS packets. Scratch
buffers are and must be per-context, as multiple contexts could be
executing shaders using scratch at the same time.
So instead, we leave that field blank when pre-filling those packets
up-front, and merge in the actual address when emitting them. It's
a little more overhead, but only in the case where scratch is used.
Fixes: 84a38ec1336 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922>
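A minimal sketch of the "leave the field blank, merge the address at emit
time" idea from the commit above. This is not the iris code: the packet
layout, the dword index, the alignment and every sketch_* name are
hypothetical, chosen only to show one pre-filled packet being shared while
each context supplies its own scratch address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a 4-dword packet whose dword 2 carries a small size
 * field in its low bits and a 1 KiB-aligned scratch address in the rest. */
#define SKETCH_PACKET_DWORDS 4
#define SKETCH_SCRATCH_DWORD 2
#define SKETCH_SCRATCH_ALIGN 1024u

/* Copy the shared, pre-filled packet and OR in this context's scratch
 * address. The shared copy is never modified, so several contexts can emit
 * it with different addresses. */
static void
sketch_emit_packet(uint32_t *out, const uint32_t *prepacked,
                   uint32_t scratch_addr)
{
    /* The address must not spill into the low bits used by other fields. */
    assert((scratch_addr & (SKETCH_SCRATCH_ALIGN - 1)) == 0);
    /* The pre-filled packet must have left the address bits blank. */
    assert((prepacked[SKETCH_SCRATCH_DWORD] & ~(SKETCH_SCRATCH_ALIGN - 1)) == 0);

    memcpy(out, prepacked, SKETCH_PACKET_DWORDS * sizeof(uint32_t));
    out[SKETCH_SCRATCH_DWORD] |= scratch_addr;
}
```

The extra copy and OR only need to happen for shaders that actually use
scratch, which matches the overhead described in the message.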
Mike Blumenkrantz [Thu, 11 Feb 2021 15:33:25 +0000 (10:33 -0500)]
zink: lower flrp64 and ffma64 when in softfp64 mode
fixes a bunch of crashes
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8986>
Mike Blumenkrantz [Fri, 11 Dec 2020 23:56:26 +0000 (18:56 -0500)]
zink: add spirv interfaces for bo and image/sampler/push variables
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8985>
Jordan Justen [Mon, 25 Jan 2021 20:33:19 +0000 (12:33 -0800)]
anv: Add ANV_QUEUE_OVERRIDE env-var to override advertised queues
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8771>
Jason Ekstrand [Tue, 26 Jan 2021 16:38:47 +0000 (10:38 -0600)]
anv: Add fake graphics-only and compute-only queue families
Rework:
* Jordan: Add graphics-only queue
* Jordan: Bump ANV_MAX_QUEUE_FAMILIES and add related asserts
* Jordan: Fix queueCount on compute-only family
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8771>
Erik Faye-Lund [Wed, 10 Feb 2021 17:22:03 +0000 (18:22 +0100)]
ci: enable max texture size tests for zink
I cargo-culted these from the llvmpipe tests, but they seem to pass and
not take much time or memory at all, so let's try enabling them. If this
works fine, we might want to try the same for llvmpipe as well...
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8979>
Michel Zou [Thu, 11 Feb 2021 08:11:42 +0000 (09:11 +0100)]
vulkan: Fix windows api conflict
It must be undefined in the header too
Fixes: e487ae1b
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8975>
Alyssa Rosenzweig [Tue, 9 Feb 2021 23:23:06 +0000 (18:23 -0500)]
pan/bi: Push UBOs on Bifrost
Based on the Midgard pass. Results look better than they did on Midgard,
since Midgard already had a basic UBO pushing pass to begin with.
Particularly nice to see the dramatic reduction in spilling.
total instructions in shared programs: 169141 -> 161215 (-4.69%)
instructions in affected programs: 164102 -> 156176 (-4.83%)
helped: 1269
HURT: 90
helped stats (abs) min: 1 max: 61 x̄: 6.50 x̃: 4
helped stats (rel) min: 0.15% max: 17.58% x̄: 6.31% x̃: 5.88%
HURT stats (abs) min: 1 max: 170 x̄: 3.58 x̃: 1
HURT stats (rel) min: 0.08% max: 133.33% x̄: 16.65% x̃: 5.26%
95% mean confidence interval for instructions value: -6.28 -5.38
95% mean confidence interval for instructions %-change: -5.39% -4.18%
Instructions are helped.
total nops in shared programs: 121049 -> 120997 (-0.04%)
nops in affected programs: 110024 -> 109972 (-0.05%)
helped: 501
HURT: 758
helped stats (abs) min: 1 max: 45 x̄: 5.54 x̃: 2
helped stats (rel) min: 0.25% max: 47.06% x̄: 6.81% x̃: 4.55%
HURT stats (abs) min: 1 max: 102 x̄: 3.59 x̃: 3
HURT stats (rel) min: 0.32% max: 50.00% x̄: 7.13% x̃: 6.06%
95% mean confidence interval for nops value: -0.45 0.37
95% mean confidence interval for nops %-change: 1.07% 2.09%
Inconclusive result (value mean confidence interval includes 0).
total clauses in shared programs: 40388 -> 31610 (-21.73%)
clauses in affected programs: 38825 -> 30047 (-22.61%)
helped: 1367
HURT: 2
helped stats (abs) min: 1 max: 58 x̄: 6.43 x̃: 5
helped stats (rel) min: 1.34% max: 55.56% x̄: 24.97% x̃: 25.00%
HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 7
HURT stats (rel) min: 5.08% max: 6.67% x̄: 5.88% x̃: 5.88%
95% mean confidence interval for clauses value: -6.74 -6.08
95% mean confidence interval for clauses %-change: -25.50% -24.35%
Clauses are helped.
total quadwords in shared programs: 144937 -> 130686 (-9.83%)
quadwords in affected programs: 140419 -> 126168 (-10.15%)
helped: 1369
HURT: 13
helped stats (abs) min: 1 max: 112 x̄: 10.50 x̃: 7
helped stats (rel) min: 0.23% max: 31.82% x̄: 11.36% x̃: 10.78%
HURT stats (abs) min: 1 max: 106 x̄: 10.00 x̃: 1
HURT stats (rel) min: 5.88% max: 10.24% x̄: 9.26% x̃: 10.00%
95% mean confidence interval for quadwords value: -10.96 -9.66
95% mean confidence interval for quadwords %-change: -11.52% -10.82%
Quadwords are helped.
total spills in shared programs: 1106 -> 705 (-36.26%)
spills in affected programs: 1058 -> 657 (-37.90%)
helped: 41
HURT: 0
total fills in shared programs: 2241 -> 1645 (-26.60%)
fills in affected programs: 2219 -> 1623 (-26.86%)
helped: 43
HURT: 2
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 16:47:24 +0000 (11:47 -0500)]
pan/bi: Add SSA-based scalar copy propagation
This is a very simple (and slow...) copyprop pass. It's good enough to
get rid of redundant moves from FAU, but it doesn't help for vector
combines.
total instructions in shared programs: 175219 -> 169141 (-3.47%)
instructions in affected programs: 91439 -> 85361 (-6.65%)
helped: 599
HURT: 0
helped stats (abs) min: 1 max: 112 x̄: 10.15 x̃: 6
helped stats (rel) min: 0.30% max: 33.33% x̄: 8.61% x̃: 8.04%
95% mean confidence interval for instructions value: -11.06 -9.24
95% mean confidence interval for instructions %-change: -9.07% -8.16%
Instructions are helped.
total nops in shared programs: 120011 -> 121049 (0.86%)
nops in affected programs: 47355 -> 48393 (2.19%)
helped: 110
HURT: 309
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.44% max: 16.67% x̄: 3.59% x̃: 3.16%
HURT stats (abs) min: 1 max: 56 x̄: 4.10 x̃: 2
HURT stats (rel) min: 0.32% max: 80.85% x̄: 6.85% x̃: 3.12%
95% mean confidence interval for nops value: 1.86 3.09
95% mean confidence interval for nops %-change: 3.08% 5.14%
Nops are HURT.
total clauses in shared programs: 40576 -> 40388 (-0.46%)
clauses in affected programs: 3074 -> 2886 (-6.12%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.77 x̃: 2
helped stats (rel) min: 0.42% max: 22.22% x̄: 7.17% x̃: 6.90%
95% mean confidence interval for clauses value: -1.91 -1.63
95% mean confidence interval for clauses %-change: -7.80% -6.53%
Clauses are helped.
total quadwords in shared programs: 146590 -> 144937 (-1.13%)
quadwords in affected programs: 59475 -> 57822 (-2.78%)
helped: 493
HURT: 1
helped stats (abs) min: 1 max: 28 x̄: 3.35 x̃: 2
helped stats (rel) min: 0.28% max: 15.38% x̄: 4.08% x̃: 3.85%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 2.38% max: 2.38% x̄: 2.38% x̃: 2.38%
95% mean confidence interval for quadwords value: -3.61 -3.08
95% mean confidence interval for quadwords %-change: -4.33% -3.81%
Quadwords are helped.
total spills in shared programs: 1106 -> 1106 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0
total fills in shared programs: 2241 -> 2241 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
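A rough sketch of SSA-based scalar copy propagation on a toy IR, in the
spirit of the commit above. It is not the Bifrost pass: the instruction
struct, the opcodes and all sketch_* names are hypothetical, and it assumes
a straight-line list in which every SSA value is defined before it is used.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SSA 64
#define SKETCH_NO_COPY (-1)

enum sketch_op { SKETCH_MOV, SKETCH_ADD };

/* Hypothetical toy instruction: one SSA destination, up to two SSA sources.
 * SKETCH_MOV reads src[0] only. */
struct sketch_instr {
    enum sketch_op op;
    int dest;
    int src[2];
};

/* Rewrite every source that reads the result of a MOV so that it reads the
 * MOV's own source instead. Returns the number of sources rewritten. The
 * MOVs themselves are left in place for a later dead-code pass. */
static unsigned
sketch_copyprop(struct sketch_instr *instrs, size_t count)
{
    int copy_of[SKETCH_MAX_SSA];
    unsigned rewritten = 0;

    for (size_t i = 0; i < SKETCH_MAX_SSA; ++i)
        copy_of[i] = SKETCH_NO_COPY;

    for (size_t i = 0; i < count; ++i) {
        struct sketch_instr *ins = &instrs[i];
        unsigned nr_src = (ins->op == SKETCH_MOV) ? 1 : 2;

        for (unsigned s = 0; s < nr_src; ++s) {
            assert(ins->src[s] >= 0 && ins->src[s] < SKETCH_MAX_SSA);

            if (copy_of[ins->src[s]] != SKETCH_NO_COPY) {
                ins->src[s] = copy_of[ins->src[s]];
                rewritten++;
            }
        }

        /* Sources were rewritten first, so chains of moves collapse to the
         * original value. */
        assert(ins->dest >= 0 && ins->dest < SKETCH_MAX_SSA);
        if (ins->op == SKETCH_MOV)
            copy_of[ins->dest] = ins->src[0];
    }

    return rewritten;
}
```

Removing a move does not by itself remove the slot it occupied, which is
one plausible reading of why instruction counts drop while nops rise in the
stats above.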
Alyssa Rosenzweig [Wed, 10 Feb 2021 16:44:37 +0000 (11:44 -0500)]
pan/bi: Simplify derivative lowering
Now that we lower FAU correctly, we don't need to write the extra move
explicitly; it will be inserted later by the lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 16:43:18 +0000 (11:43 -0500)]
pan/bi: Rework FAU lowering
Move and reshape bi_lower_fau to bi_schedule.c. This generalizes the
pass for FAU reads, allowing copyprop to work with FAU without problems.
The pass must run immediately before scheduling. Its post-conditions are
directly specified as the scheduler's pre-conditions. It will shortly
depend on internal scheduler predicates. It is, for all intents and
purposes, part of the scheduler. Keep it all together.
Finally, adjust the 0 handling to avoid a move at the expense of
constrained scheduling of something like `FADD.v2f16.clamp_0_1 u0, #0`.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 17:38:09 +0000 (12:38 -0500)]
pan/bi: Handle modifiers in rewrite_fau_to_pass
Will prevent failures when we start using FAU together with modifiers in
a few commits.
Fixes: fc7770b1dda ("pan/bi: Add trivial rewrite helpers")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 17:39:09 +0000 (12:39 -0500)]
pan/bi: Generalize bi_update_fau with fast zero
Ensure we don't fall over if we have an instruction like
FADD.f32 u0, #0
In this case, the tuple's FAU requirement implies the instruction can be
scheduled to the FMA slot without lowering, but not to the ADD slot.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Tue, 9 Feb 2021 23:34:08 +0000 (18:34 -0500)]
pan/bi: Print FAU uniforms in IR
Uses "u3, u3[1]" syntax which is close enough to the assembly syntax
"u3.w0, u3.w1".
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 16:47:03 +0000 (11:47 -0500)]
pan/bi: Add bi_is_ssa helper
Convenient for SSA-based opt passes.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 16:46:01 +0000 (11:46 -0500)]
pan/bi: Add bi_replace_index helper
I keep open-coding this, incorrectly... Since bi_index contains both
"position" and "modifier" data, it's common to want to swap the position
while preserving modifiers.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
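A small sketch of the "swap the position, keep the modifiers" operation the
commit above describes. The struct below is a simplified, hypothetical
stand-in for bi_index, and the field names and sketch_* identifiers are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an index: "position" fields say which value is
 * read, "modifier" fields say how it is read. */
struct sketch_index {
    /* position */
    uint32_t value;
    unsigned type;
    /* modifiers */
    bool abs;
    bool neg;
    unsigned swizzle;
};

/* Return an index that points at replacement's position but keeps the
 * modifiers that were applied to old. */
static struct sketch_index
sketch_replace_index(struct sketch_index old, struct sketch_index replacement)
{
    struct sketch_index out = old;   /* start from old to keep its modifiers */

    out.value = replacement.value;
    out.type = replacement.type;
    return out;
}
```

Open-coding this as a plain assignment of the replacement would silently
drop abs, neg and swizzle, which is the kind of mistake a helper avoids.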
Alyssa Rosenzweig [Thu, 11 Feb 2021 00:40:38 +0000 (19:40 -0500)]
pan/bi: Fix multithreaded shader-db
Clobbered names.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Sun, 7 Feb 2021 16:09:06 +0000 (11:09 -0500)]
pan/mdg: Push uniforms based on UBO analysis
Skips over "holes" in UBO ranges and allows pushing things other than
UBO #0 (GL uniforms) and sysvals. shader-db results relative to
beginning of series (so includes the hurt from lowering UBO to
uniforms):
total instructions in shared programs: 96611 -> 95018 (-1.65%)
instructions in affected programs: 22356 -> 20763 (-7.13%)
helped: 204
HURT: 13
helped stats (abs) min: 1 max: 27 x̄: 8.18 x̃: 7
helped stats (rel) min: 0.42% max: 26.09% x̄: 8.60% x̃: 8.07%
HURT stats (abs) min: 1 max: 33 x̄: 5.77 x̃: 2
HURT stats (rel) min: 0.47% max: 15.64% x̄: 3.56% x̃: 1.72%
95% mean confidence interval for instructions value: -8.29 -6.39
95% mean confidence interval for instructions %-change: -8.74% -7.00%
Instructions are helped.
total bundles in shared programs: 44886 -> 44790 (-0.21%)
bundles in affected programs: 9640 -> 9544 (-1.00%)
helped: 131
HURT: 70
helped stats (abs) min: 1 max: 11 x̄: 4.34 x̃: 4
helped stats (rel) min: 1.04% max: 42.31% x̄: 10.39% x̃: 9.84%
HURT stats (abs) min: 1 max: 16 x̄: 6.76 x̃: 6
HURT stats (rel) min: 2.22% max: 37.50% x̄: 13.78% x̃: 10.00%
95% mean confidence interval for bundles value: -1.37 0.42
95% mean confidence interval for bundles %-change: -3.99% 0.04%
Inconclusive result (value mean confidence interval includes 0).
total quadwords in shared programs: 76320 -> 75140 (-1.55%)
quadwords in affected programs: 16691 -> 15511 (-7.07%)
helped: 206
HURT: 5
helped stats (abs) min: 1 max: 18 x̄: 5.91 x̃: 6
helped stats (rel) min: 0.36% max: 27.78% x̄: 7.93% x̃: 8.33%
HURT stats (abs) min: 1 max: 19 x̄: 7.40 x̃: 1
HURT stats (rel) min: 0.55% max: 15.79% x̄: 7.39% x̃: 3.57%
95% mean confidence interval for quadwords value: -6.19 -5.00
95% mean confidence interval for quadwords %-change: -8.32% -6.82%
Quadwords are helped.
total registers in shared programs: 6958 -> 6827 (-1.88%)
registers in affected programs: 1083 -> 952 (-12.10%)
helped: 112
HURT: 16
helped stats (abs) min: 1 max: 3 x̄: 1.32 x̃: 1
helped stats (rel) min: 6.25% max: 50.00% x̄: 17.13% x̃: 12.50%
HURT stats (abs) min: 1 max: 2 x̄: 1.06 x̃: 1
HURT stats (rel) min: 9.09% max: 20.00% x̄: 11.97% x̃: 11.81%
95% mean confidence interval for registers value: -1.19 -0.86
95% mean confidence interval for registers %-change: -15.78% -11.21%
Registers are helped.
total threads in shared programs: 5109 -> 5153 (0.86%)
threads in affected programs: 62 -> 106 (70.97%)
helped: 42
HURT: 6
helped stats (abs) min: 1 max: 2 x̄: 1.19 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.68 1.16
95% mean confidence interval for threads %-change: 66.69% 95.81%
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
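A hedged sketch of choosing which uniform words to push from a per-UBO usage
analysis, as the commit above outlines. It is not the Midgard pass: the
32-word bitmask per UBO, the push limit and all sketch_* names are
hypothetical. It only shows the two properties the message mentions:
unused words ("holes") are skipped, and any UBO may contribute, not just
UBO #0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit on how many words can be pushed. */
#define SKETCH_MAX_PUSH 8

struct sketch_push_word {
    unsigned ubo;    /* which UBO the word comes from */
    unsigned word;   /* word offset within that UBO */
};

struct sketch_push {
    unsigned count;
    struct sketch_push_word words[SKETCH_MAX_PUSH];
};

/* used_words[u] has bit w set if the shader reads word w of UBO u. Collect
 * the used words in order until the push limit is reached. */
static struct sketch_push
sketch_pick_push(const uint32_t *used_words, unsigned nr_ubos)
{
    struct sketch_push push = { 0 };

    for (unsigned ubo = 0; ubo < nr_ubos; ++ubo) {
        for (unsigned word = 0; word < 32; ++word) {
            if (!(used_words[ubo] & (UINT32_C(1) << word)))
                continue;   /* hole: never read, so not worth pushing */

            if (push.count == SKETCH_MAX_PUSH)
                return push;   /* out of space, the rest stay as UBO reads */

            push.words[push.count].ubo = ubo;
            push.words[push.count].word = word;
            push.count++;
        }
    }

    return push;
}
```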
Alyssa Rosenzweig [Sat, 6 Feb 2021 14:00:13 +0000 (09:00 -0500)]
pan/mdg: Update UBO promotion comment
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 00:09:43 +0000 (19:09 -0500)]
panfrost: Don't store uniform_count on Midgard
We weren't reading it anywhere outside this function, no need to keep
the extra copy of the data around. Avoids a footgun since this field
isn't even used on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Wed, 10 Feb 2021 00:08:54 +0000 (19:08 -0500)]
panfrost: Set FAU count based on program->push
There's no "cutoff" to worry about on Bifrost, just do the simple thing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>
Alyssa Rosenzweig [Sun, 7 Feb 2021 15:09:21 +0000 (10:09 -0500)]
panfrost: Push uniforms required by the program
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973>