Alyssa Rosenzweig [Tue, 7 Mar 2023 04:09:38 +0000 (23:09 -0500)]
agx: Don't overallocate registers
We need to account for the full vector lengths. Especially important once we
start restricting the reg file.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Tue, 28 Feb 2023 21:44:12 +0000 (16:44 -0500)]
agx: Coalesce more collects
Try harder to coalesce collects, by trying to allocate collects only to regions
of the register file where we actually have a full vector worth of registers
free. If we already know that the vector will be blocked later, it's not a good
base register to pick since we'd be force to shuffle later. So, this tweak to
the collect coalescing heuristic lets us eliminate a pile of pointless copying.
shader-db results are excellent. Note that, although we use more registers,
none of the shaders tested had their thread count affected, likely because the
max HURT isn't too high and most of the scary % here is from using a few more
registers when the register pressure is already low. In the near future, that
property will become guaranteed thanks to live range splitting, too.
total instructions in shared programs: 1507337 -> 1500562 (-0.45%)
instructions in affected programs: 428137 -> 421362 (-1.58%)
helped: 2658
HURT: 167
helped stats (abs) min: 1.0 max: 34.0 x̄: 2.63 x̃: 2
helped stats (rel) min: 0.10% max: 25.00% x̄: 3.04% x̃: 2.14%
HURT stats (abs) min: 1.0 max: 10.0 x̄: 1.24 x̃: 1
HURT stats (rel) min: 0.20% max: 23.81% x̄: 3.90% x̃: 3.57%
95% mean confidence interval for instructions value: -2.49 -2.31
95% mean confidence interval for instructions %-change: -2.76% -2.51%
Instructions are helped.
total bytes in shared programs:
10333670 ->
10293172 (-0.39%)
bytes in affected programs: 2996682 -> 2956184 (-1.35%)
helped: 2660
HURT: 175
helped stats (abs) min: 2.0 max: 204.0 x̄: 15.70 x̃: 12
helped stats (rel) min: 0.08% max: 23.08% x̄: 2.64% x̃: 1.83%
HURT stats (abs) min: 2.0 max: 60.0 x̄: 7.26 x̃: 6
HURT stats (rel) min: 0.12% max: 22.39% x̄: 3.19% x̃: 2.78%
95% mean confidence interval for bytes value: -14.81 -13.76
95% mean confidence interval for bytes %-change: -2.39% -2.18%
Bytes are helped.
total halfregs in shared programs: 417284 -> 427363 (2.42%)
halfregs in affected programs: 49814 -> 59893 (20.23%)
helped: 95
HURT: 3018
helped stats (abs) min: 1.0 max: 8.0 x̄: 2.29 x̃: 2
helped stats (rel) min: 2.44% max: 28.57% x̄: 9.20% x̃: 6.06%
HURT stats (abs) min: 1.0 max: 14.0 x̄: 3.41 x̃: 4
HURT stats (rel) min: 2.08% max: 150.00% x̄: 36.54% x̃: 27.27%
95% mean confidence interval for halfregs value: 3.17 3.31
95% mean confidence interval for halfregs %-change: 34.05% 36.23%
Halfregs are HURT.
total threads in shared programs:
16465280 ->
16465280 (0.00%)
threads in affected programs: 0 -> 0
helped: 0
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 18 Mar 2023 18:45:08 +0000 (14:45 -0400)]
asahi: Set PIPE_CAP_LOAD_CONSTBUF
The CAP is a bit of a misnomer, what it really does is relax the alignment
requirements for UBO packing. It should work fine and save us some memory.
Noticed while debugging piglit fails.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 18 Mar 2023 18:33:10 +0000 (14:33 -0400)]
asahi/decode: Print VDM barriers
Instead of just decoding silently.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 18 Mar 2023 18:30:31 +0000 (14:30 -0400)]
asahi/decode: Remove agxdecode_dump_bo
Now that we have proper parsing this is more of a nuissance than not.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 11 Mar 2023 20:39:09 +0000 (15:39 -0500)]
agx: Add helper for calculating occupancy
Add information about the relationship between program register usage and
program occupancy (the maximum number of threads that may execute concurrently
on a single shader core). This table is derived from studying the
maxTotalThreadsPerThreadgroup property in Metal while varying the register
usage, something I blogged about a few years back. It's probably not 100%
accurate and it hasn't been tested against hardware, but it matters "only" for
performance (not correctness) so I'm not super stressed about the details.
In the (near) future, RA will be able to make use of this information to know
exactly when it can use more registers without hurting performance. In the
present, it's just used for better shader-db statistics.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Tue, 7 Mar 2023 04:33:11 +0000 (23:33 -0500)]
agx: Set loads_varying accurately
Instead of just always mashing to true. Should be better for depth-only passes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 11 Mar 2023 22:48:56 +0000 (17:48 -0500)]
asahi: Add perf debug for shader variants
Compiling this can cause jank. This is still an issue in Quake3. There is a way
to solve it but it's rather involved and certainly not this weekend's project.
Better perf debugging on the other hand apparently is ^_^
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 11 Mar 2023 21:46:45 +0000 (16:46 -0500)]
asahi: Add perf debug for generate_mipmap
The current implementation leaves a lot of perf on the table, so call it out on
ASAHI_MESA_DEBUG=perf to help debugging perf problems, especially if this
ever happens in a real application (i.e. not a benchmark).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sun, 12 Mar 2023 03:16:52 +0000 (22:16 -0500)]
agx: Don't destroy usub_sat with constant
Fixes KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sun, 12 Mar 2023 03:07:18 +0000 (22:07 -0500)]
agx: Don't allow uniform source to local_atomic
Fixes KHR-GLES31.core.compute_shader.atomic-case3
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sun, 5 Mar 2023 03:17:29 +0000 (22:17 -0500)]
agx: Constify agx_{read,write}_registers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Tue, 7 Mar 2023 03:59:23 +0000 (22:59 -0500)]
agx: Assert that we don't overflow registers
This will become particularly important when we bound to smaller register files.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sat, 4 Mar 2023 22:13:30 +0000 (17:13 -0500)]
agx: DCE even with noopt
To simplify live range splitting, RA will soon assume that DCE has run (removing
extraneous vectors). So run DCE even when otherwise disabling backend
optimizations. AGX_MESA_DEBUG=noopt is still useful for disabling instruction
combining, which is the more-likely-to-be-buggy pass anyway.
This also fixes IR not being printed with noopt.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Alyssa Rosenzweig [Sun, 12 Mar 2023 00:47:27 +0000 (19:47 -0500)]
asahi: Support more renderable formats
Fixes KHR-GLES3.copy_tex_image_conversions.forbidden.*
Arguably working around a mesa/st issue but more format support is good for
compatibility and performance anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22353>
Yiwei Zhang [Fri, 31 Mar 2023 06:53:58 +0000 (06:53 +0000)]
venus/docs: sync to latest venus supported extensions
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22243>
Yiwei Zhang [Fri, 31 Mar 2023 06:26:01 +0000 (06:26 +0000)]
venus: add VK_EXT_rasterization_order_attachment_access support
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22243>
Yiwei Zhang [Fri, 31 Mar 2023 08:28:04 +0000 (08:28 +0000)]
venus: add VK_EXT_load_store_op_none support
There's no feature/properties structs associated with this extension.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22243>
Yiwei Zhang [Fri, 31 Mar 2023 06:16:36 +0000 (06:16 +0000)]
venus: sync latest protocol for layering extensions
- VK_EXT_load_store_op_none
- VK_EXT_rasterization_order_attachment_access
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22243>
Sajeesh Sidharthan [Wed, 5 Apr 2023 08:58:59 +0000 (14:28 +0530)]
radeonsi/vcn: optimize bitstream buffer resize logic
bitstream buffer is unmapped, resized and mapped again if new size
is greater than the current bitstream buffer size. This will be done
for each input buffer. This patch will avoid that and do resize
only once irrespective of number of input buffers. With the new logic,
total size is calculated first and call unmap, resize and map only once.
Signed-off-by: Sajeesh Sidharthan <sajeesh.sidharthan@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22308>
Alyssa Rosenzweig [Thu, 30 Mar 2023 18:08:38 +0000 (14:08 -0400)]
nir/print: Don't print sampler_index for txf
NIR's docs for sampler_index say
The following operations do not require a sampler and, as such, this
field should be ignored:
- nir_texop_txf
- nir_texop_txf_ms
- nir_texop_txs
- nir_texop_query_levels
- nir_texop_texture_samples
- nir_texop_samples_identical
Contrary to this documentation, we were still printing the sampler_index anyway,
even though the value is formally undefined. This was helpful for
PIPE_CAP_TEXTURE_BUFFER_SAMPLER drivers that (despite the NIR docs) respected the
sampler_index anyway. There are no longer any such drivers, so we should stop
printing sampler_index for txf to avoid confusion (and noise).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>
Alyssa Rosenzweig [Fri, 31 Mar 2023 01:02:45 +0000 (21:02 -0400)]
docs/gallium: Note samplers are not used for txf
Now that PIPE_CAP_TEXTURE_BUFFER_SAMPLER is gone, txf does not require samplers
for any texture on any Gallium driver. NIR already requires drivers to ignore
sampler_index for non-sampler operation (mainly txf), and nowadays all Gallium
drivers ingest NIR. So, document that samplers aren't bound for txf (etc) as
part of the Gallium frontend-driver contract.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>
Alyssa Rosenzweig [Thu, 30 Mar 2023 17:24:16 +0000 (13:24 -0400)]
gallium: Remove PIPE_CAP_TEXTURE_BUFFER_SAMPLER
No more users. It was already not respected by rusticl so you couldn't set it if
you wanted OpenCL support. I regret introducing the CAP in the first place, and
no more drivers should use it.
Reverts
d5d3f77e4ac ("gallium: Add new cap PIPE_CAP_TEXTURE_BUFFER_SAMPLER").
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>
Alyssa Rosenzweig [Thu, 30 Mar 2023 17:21:14 +0000 (13:21 -0400)]
panfrost: Unset TEXTURE_BUFFER_SAMPLERS
We no longer need this CAP, as we can easily synthesize our own internal sampler
for this case. Gallium doesn't need to know about this quirk of our hardware.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>
Alyssa Rosenzweig [Thu, 30 Mar 2023 18:00:46 +0000 (14:00 -0400)]
pan/{mdg,bi}: Always use sampler 0 for txf
Now that we upload workaround samplers for txf, sampler 0 is guaranteed to be
valid but other samplers are not. So ignore whatever the current sampler_index
value is (it's formally undefined in NIR) and use 0, which we know is valid. We
already do this on Valhall for OpenCL, just need to generalize for Midgard and
Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>
Alyssa Rosenzweig [Thu, 30 Mar 2023 17:55:14 +0000 (13:55 -0400)]
panfrost: Always upload a workaround sampler
The hardware requires a valid sampler even for texelFetch (txf), even though its
contents are ignored. We'd rather not pass on this requirement to the frontends,
so we should handle it by uploading our own workaround sampler in the case when
no sampler is already present. We already do this on Valhall (for rusticl), so
we just need to port the same workaround back to Midgard/Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>
Mike Blumenkrantz [Thu, 6 Apr 2023 20:12:46 +0000 (16:12 -0400)]
zink: don't try copying multiple results for conditional render copy
conditional render is only a single result, so multiple results need
to first be aggregated
fixes #8798
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22345>
Ian Romanick [Fri, 17 Feb 2023 20:20:09 +0000 (12:20 -0800)]
nir/tests: Port almost all loop_analyze tests to new macro-based infastructure
The one test that remains would have an automatically generated name
that would conflict with another test. This test is also a little
special (per the comment in the test), so it's probably best to leave it
separate anyway.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Yevhenii Kolesnikov [Fri, 17 Jan 2020 11:01:01 +0000 (13:01 +0200)]
nir/loop_analyze: Determine iteration counts for more kinds of loops
If loop iterator is incremented with something other than regular
addition, it would be more error prone to calculate the number of
iterations theoretically. What we can do instead, is try to emulate the
loop, and determine the number of iterations empirically.
These operations are covered:
- imul
- fmul
- ishl
- ishr
- ushr
Also add unit tests for loop unrollment.
Improves performance of Aztec Ruins (sixonix
gfxbench5.aztec_ruins_vk_high) by -1.28042% +/- 0.498555% (N=5) on Intel
Arc A770.
v2 (idr): Rebase on 3 years. :( Use nir_phi_instr_add_src in the test
cases.
v3 (idr): Use try_eval_const_alu in to evaluate loop termination
condition in get_iteration_empirical. Also restructure the loop
slightly. This fixed off by one iteration errors in "inverted" loop
tests (e.g., nir_loop_analyze_test.ushr_ieq_known_count_invert_31).
v4 (idr): Use try_eval_const_alu in to evaluate induction variable
update in get_iteration_empirical. This fixes non-commutative update
operations (e.g., shifts) when the induction varible is not the first
source. This fixes the unit test
nir_loop_analyze_test.ishl_rev_ieq_infinite_loop_unknown_count.
v5 (idr): Fix _type parameter for fadd and fadd_rev loop unroll
tests. Hopefully that fixes the failure on s390x. Temporarily disable
fmul. This works-around the revealed problem in
glsl-fs-loop-unroll-mul-fp64, and there were no shader-db or fossil-db
changes.
v6 (idr): Plumb max_unroll_iterations into get_iteration_empirical. I
was going to do this, but I forgot. Suggested by Tim.
v7 (idr): Disable fadd tests on s390x. They fail because S390 is weird.
Almost all of the shaders affected (OpenGL or Vulkan) are from gfxbench
or geekbench. A couple shaders in Deus Ex (OpenGL), Dirt Rally (OpenGL),
Octopath Traveler (Vulkan), and Rise of the Tomb Raider (Vulkan) are
helped.
The lost / gained shaders in OpenGL are an Aztec Ruins shader that goes
from SIMD16 to SIMD8. The spills / fills affected are in a single Aztec
Ruins (Vulkan) compute shader.
shader-db results:
Skylake, Ice Lake, and Tiger Lake had similar results. (Tiger Lake shown)
total loops in shared programs: 5514 -> 5470 (-0.80%)
loops in affected programs: 62 -> 18 (-70.97%)
helped: 37 / HURT: 0
LOST: 2
GAINED: 2
Haswell and Broadwell had similar results. (Broadwell shown)
total loops in shared programs: 5346 -> 5298 (-0.90%)
loops in affected programs: 66 -> 18 (-72.73%)
helped: 39 / HURT: 0
fossil-db results:
Skylake, Ice Lake, and Tiger Lake had similar results. (Tiger Lake shown)
Instructions in all programs:
157374679 ->
157397421 (+0.0%)
Instructions hurt: 28
SENDs in all programs: 7463800 -> 7467639 (+0.1%)
SENDs hurt: 28
Loops in all programs: 38980 -> 38950 (-0.1%)
Loops helped: 28
Cycles in all programs:
7559486451 ->
7557455384 (-0.0%)
Cycles helped: 28
Spills in all programs: 11405 -> 11403 (-0.0%)
Spills helped: 1
Fills in all programs: 19578 -> 19588 (+0.1%)
Fills hurt: 1
Lost: 1
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Yevhenii Kolesnikov [Fri, 17 Jan 2020 11:01:01 +0000 (13:01 +0200)]
nir/loop_analyze: Track induction variables incremented by more operations
These operations are covered:
- imul
- fmul
- ishl
- ishr
- ushr
The only cases that can be currently affected are those where the
calculated loop-trip count would be zero.
v2 (idr): Split out from original commit. Rebase on lots of other work.
v3 (idr): Move operand size assertion. This code only cares that the
operands have the same size for the iadd and fadd cases. In other
cases, such as shifts, the sizes may not match. Fixes assertion
failures in
tests/spec/arb_gpu_shader_int64/glsl-fs-loop-unroll-ishl-int64.shader_test.
No shader-db or fossil-db changes on any Intel platform.
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Tue, 14 Feb 2023 01:33:29 +0000 (17:33 -0800)]
nir/loop_analyze: Use try_eval_const_alu and induction variable basis info
This dramatically simplifies will_break_on_first_iteration, and, much
more importantly, makes it significantly more flexible. It is now
possible to handle loops with more complex exit condition and other
kinds of increment operations.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Wed, 15 Feb 2023 00:12:23 +0000 (16:12 -0800)]
nir/loop_analyze: Change invert_cond instead of changing the condition
This ensures that scenarios like
nir_loop_analyze_test.iadd_inot_ilt_rev_known_count_5 don't regress in
the next commit. It also means we don't change float comparisons. These
are probably fine... but it still made me a little uneasy.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Tue, 14 Feb 2023 01:30:16 +0000 (17:30 -0800)]
nir/loop_analyze: Track induction variable basis information
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Tue, 14 Feb 2023 01:23:14 +0000 (17:23 -0800)]
nir/loop_analyze: Add a function to evaluate an ALU as constant
...with a substitution. This function is largely a copy-and-paste of
try_fold_alu (nir_opt_constant_folding.c), and an argument could be made
that this function belongs in that file.
v2: Some changes were mistakenly squashed in to "nir/loop_analyze: Use
try_eval_const_alu and induction variable basis info" that should have
been here.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Thu, 2 Feb 2023 00:18:04 +0000 (16:18 -0800)]
nir/tests: Add many loop analysis tests for induction variables modified by imul
Loop analysis doesn't currently treat values updated by multiplication
as induction variables. Future patches will change this.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Thu, 2 Feb 2023 20:44:52 +0000 (12:44 -0800)]
nir/tests: Add more loop analysis tests for induction vars updated by shifts
These reverse the order of the comparison (e.g., -2 >= i vs i >= -2). I
split this into a separate commit because the previous commit was so
large.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Ian Romanick [Wed, 1 Feb 2023 23:18:05 +0000 (15:18 -0800)]
nir/tests: Add many loop analysis tests for induction vars updated by shifts
Loop analysis doesn't currently treat values updated by shifts as
induction variables. Future patches will change this.
v2: Don't use the contradiction ilt(x, INT_MIN).
v3: Delete some errant code in UNKNOWN_COUNT_TEST. Noticed by Tim.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3445>
Sajeesh Sidharthan [Wed, 15 Mar 2023 04:24:58 +0000 (09:54 +0530)]
radeonsi/vcn: set bitstream buffer size to encoded bitstream size
initial bitstream size was set to width * height * 2 which is
larger than yuv size. set initial bitstream size to encoded
bitstream size approximately to optimize memory consumption.
This is just an initial size setting, it will get resized later
if it's not big enough. As a result of this change, we don't need to
allocate super big size at the every beginning. Only allocate
big size when needed in order to save some memory
Signed-off-by: Sajeesh Sidharthan <sajeesh.sidharthan@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21918>
Jesse Natalie [Wed, 5 Apr 2023 16:53:47 +0000 (09:53 -0700)]
dzn: Fix bindless descriptor sets with multiple dynamic buffers that need custom descriptors
Fixes:
5d2b4ee4 ("dzn: Allocate descriptor sets in buffers for bindless mode")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Fri, 31 Mar 2023 17:18:19 +0000 (10:18 -0700)]
dzn: Batch command lists together
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Fri, 31 Mar 2023 20:43:48 +0000 (13:43 -0700)]
dzn: Don't do initial-layout barriers for simultaneous-access resources
Fixes:
4daeac01 ("dzn: Enhanced barriers fixes/workarounds")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 30 Mar 2023 20:39:41 +0000 (13:39 -0700)]
dzn: Attempt to force depth write states for depth access in LAYOUT_GENERIC
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 30 Mar 2023 20:38:37 +0000 (13:38 -0700)]
dzn: Ensure buffer offsets are aligned
If the app passes us unaligned buffer offsets, we need to align them
down to the nearest aligned offset, and then put the difference into
the descriptor set buffer.
Fixes:
8bd5fbf8 ("dzn: Bind buffers for bindless descriptor sets")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Mon, 27 Mar 2023 18:09:57 +0000 (11:09 -0700)]
dzn: Don't use write-combine memory for cache-coherent UMA
Cache coherent UMA implies that the GPU is reading data through the
CPU caches. Using write-combined CPU pages for such a system would
be bad, since the GPU would then be reading uncached data. One
example of such a system is WARP. This significantly improves WARP's
performance for some apps (including the CTS).
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Mon, 27 Mar 2023 18:09:38 +0000 (11:09 -0700)]
dzn: Ensure pipeline variants are used for dynamic stencil masks
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Mon, 27 Mar 2023 18:08:50 +0000 (11:08 -0700)]
dzn: Align descriptor sets in the bindless buffer
Fixes:
5d2b4ee4 ("dzn: Allocate descriptor sets in buffers for bindless mode")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 23 Mar 2023 15:41:22 +0000 (08:41 -0700)]
dzn: Report some more caps correctly that are supported
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 23 Mar 2023 15:35:41 +0000 (08:35 -0700)]
dzn: Raise max number of descriptor sets to 8
DOOM Eternal just assumes you support at least 5, which caused corruption
due to overrunning arrays. We can just bump this up. 8 should work with
and without bindless.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 23 Mar 2023 15:34:39 +0000 (08:34 -0700)]
dzn: Fix SRV barrier state on compute command lists
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 30 Mar 2023 21:44:11 +0000 (14:44 -0700)]
dzn: Add a driconf option for enabling subgroup ops in VS/GS
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Fri, 27 Jan 2023 21:05:33 +0000 (13:05 -0800)]
dzn: Add a driconf entry for enabling 8bit loads and stores
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 23 Mar 2023 15:36:22 +0000 (08:36 -0700)]
spirv2dxil: Add some more supported caps
8-bit loads and stores work via lowering, but they do work
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Tue, 4 Apr 2023 20:41:10 +0000 (13:41 -0700)]
microsoft/compiler: Fix large shifts
Unlike DXBC, DXIL's shift instructions don't have the implicit behavior
that they only take the 5 bits. This is observable if you try to have
DXC do a shift of a dynamic value, e.g. a constant buffer value, where
the compiler inserts the appropriate 'and' op. We need to do the same.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Fri, 31 Mar 2023 17:25:00 +0000 (10:25 -0700)]
microsoft/compiler: Assign 1D wave IDs based on local thread ID
Fixes corruption/flickering seen in DOOM Eternal's decals/lighting.
It seems the shader has an implicit assumption about this property.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 30 Mar 2023 20:24:14 +0000 (13:24 -0700)]
microsoft/compiler: Fix barrier for wave ID computation
Fixes:
2f8a8b59 ("microsoft/compiler: Add lowering passes for basic subgroup vars")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Jesse Natalie [Thu, 23 Mar 2023 15:30:58 +0000 (08:30 -0700)]
microsoft/compiler: Fix 8-bit loads and stores when supporting 16-bit DXIL
Shifts should always use 32bit shift values, and when lowering to
masked, we need to use 32-bit atomics. That means that we should also
treat 24bit stores as a single masked op rather than one 16bit unmasked
and one 8bit masked.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Adam Jackson [Wed, 21 Jul 2021 16:02:35 +0000 (12:02 -0400)]
glx: Fix error handling yet again in CreateContextAttribs
Unlike the legacy CreateContext path, we would try to send the
GLXCreateContextAttribs request regardless of whether we'd successfully
created the client context state. And there's not a lot on the server
side to go wrong besides BadAlloc, so if the request succeeded but
the client side didn't we'd need to destroy the server context and
synthesize an X error. Since that itself involves more X protocol it's
tricky to get the request number right in the error, and tests and apps
can notice when you get it wrong.
Since we have now fixed client-side validation to generate the right
errors at the right times, this patch does something simpler, we match
CreateContext and fail early if the client-side setup fails. Now there's
no question of what request number to use, because we haven't sent any
protocol, the error is for the request as if it'd been sent.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4763
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12006>
Adam Jackson [Fri, 23 Jul 2021 20:13:45 +0000 (16:13 -0400)]
glx: Disable the indirect fallback in CreateContextAttribs
If your app cares enough to use CreateContextAttribs it's probably not
going to be happy with the pre-GL-1.5 indirect experience.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12006>
Adam Jackson [Fri, 23 Jul 2021 22:14:33 +0000 (18:14 -0400)]
glx/dri: Fix error generation for invalid GLX_RENDER_TYPE
This needs to throw BadValue.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12006>
Adam Jackson [Thu, 22 Jul 2021 20:28:31 +0000 (16:28 -0400)]
dri: Validate more of the context version in validate_context_version
There's two kinds of "bad version" you might encounter here, either the
combination does not name a defined version (like 1.7) or it names
something the driver can't do (like asking r300 to do 4.0). EGL does not
distinguish these cases, but GLX calls them BadMatch and GLXBadFBConfig
respectively.
Since api_mask is the set of driver supported APIs, and we can only
support defined APIs, don't check it early in driCreateContextAttribs,
just let it fall out from validate_context_version.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12006>
Adam Jackson [Wed, 21 Jul 2021 22:05:12 +0000 (18:05 -0400)]
glx/dri: Use X/GLX error codes for our create_context_attribs
This has no functional change because everyone calling this is
discarding the error code, because we're relying on the server to
generate the right thing for us. But we create the direct context first
and the server isn't going to enforce everything we want it to
(supported GL versions for example). Convert out from DRI error codes to
X/GLX error codes so we can fail the right way on the client side. We're
still throwing the error away in all of the callers but that'll change
shortly.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12006>
Ian Romanick [Fri, 31 Mar 2023 22:47:48 +0000 (15:47 -0700)]
intel/fs: White space fixes
Trivial
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Tue, 21 Mar 2023 03:57:47 +0000 (20:57 -0700)]
intel/fs: Preserve meta data more often in brw_nir_move_interpolation_to_top
This pass rarely makes any changes, so work a little harder to preserve
more meta data.
On my Ice Lake laptop (using a locked CPU speed and other measures to
prevent thermal throttling, etc.) using a debugoptimized build, improves
performance of Vulkan CTS "deqp-vk --deqp-case='dEQP-VK.*spir*'" by
-0.2% ± 0.1% (n = 5, pooled s = 0.431885).
v2: Add some parenthesis. Suggested by Lionel.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Tue, 21 Mar 2023 03:57:47 +0000 (20:57 -0700)]
intel/fs: Linked list micro optimizations in brw_nir_move_interpolation_to_top
Two linked list management changes:
- Use the list head sentinel as the initial cursor. It is, after all, a
proper node in the list.
- Iterate the list of blocks starting with the second block instead of
skipping the first block in the loop.
On my Ice Lake laptop (using a locked CPU speed and other measures to
prevent thermal throttling, etc.) using a release build, improves
performance of compiling shaders from batman_arkham_city_goty.foz by
-0.24% ± 0.09% (n = 5, pooled s = 0.324106).
v2: Use nir_cursor instead of direct list manipultion. Suggested by
Lionel.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Tue, 14 Mar 2023 02:47:07 +0000 (19:47 -0700)]
intel/compiler: Micro optimize regions_overlap
On my Ice Lake laptop (using a locked CPU speed and other measures to
prevent thermal throttling, etc.) using a release build, improves
performance of compiling shaders from batman_arkham_city_goty.foz by
-1.09% ± 0.084% (n = 5, pooled s = 0.354471)
Reduces the size of a release build by 26k.
text data bss dec hex filename
23163641 400720 231360
23795721 16b1809 before/lib64/dri/iris_dri.so
23137264 400720 231360
23769344 16ab100 after/lib64/dri/iris_dri.so
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Thu, 23 Mar 2023 23:20:38 +0000 (16:20 -0700)]
intel/fs: Use specialized version of regions_overlap in opt_copy_propagation
Since one of the register must always be either VGRF or FIXED_GRF, much
of regions_overlap and reg_offset can be elided.
On my Ice Lake laptop (using a locked CPU speed and other measures to
prevent thermal throttling, etc.) using a debugoptimized build, improves
performance of Vulkan CTS "deqp-vk --deqp-case='dEQP-VK.*spir*'" by
-0.29% ± 0.097% (n = 5, pooled s = 0.361697).
Using a release build, improves performance of compiling shaders from
batman_arkham_city_goty.foz by -3.3% ± 0.04% (n = 5, pooled s =
0.178312).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Tue, 14 Mar 2023 02:46:46 +0000 (19:46 -0700)]
intel/compiler: Micro optimize inst_is_in_block
This function only exists in builds with assertions, so it only matters
there.
On my Ice Lake laptop (using a locked CPU speed and other measures to
prevent thermal throttling, etc.) using a debugoptimized build, improves
performance of Vulkan CTS "deqp-vk --deqp-case='dEQP-VK.*spir*'" by
-5.2% ± 0.16% (n = 5, pooled s = 0.657887).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Wed, 11 Jan 2023 19:15:27 +0000 (11:15 -0800)]
intel/compiler: Use NIR_PASS instead of NIR_PASS_V
Reduce debug log spam by only logging the shader if a pass made some
changes. This can also elide some nir_validate calls in debug builds.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Ian Romanick [Wed, 22 Mar 2023 02:13:03 +0000 (19:13 -0700)]
intel/compiler: Remove one overload of backend_instruction::insert_before
The version that takes a list of instructions is not used. I did not do
any archaeology to find out when the last user was removed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>
Tomeu Vizoso [Mon, 20 Mar 2023 11:48:50 +0000 (12:48 +0100)]
etnaviv: don't read too much from uniform arrays
Fixes:
77af1ca690f ("etnaviv: add disk cache")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Italo Nicola [Thu, 15 Sep 2022 19:13:43 +0000 (19:13 +0000)]
etnaviv: implement nir_op_uclz and lower find_{msb,lsb} to uclz
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Italo Nicola [Fri, 14 Oct 2022 18:12:52 +0000 (18:12 +0000)]
etnaviv: lower (un)pack_{2x16,2x32}_split and extract_{byte,word}
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Tomeu Vizoso [Wed, 6 Jul 2022 07:06:49 +0000 (09:06 +0200)]
etnaviv: print writemask of store operations
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Tomeu Vizoso [Wed, 6 Jul 2022 07:07:41 +0000 (09:07 +0200)]
etnaviv: handle missing alu conversion opcodes
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Italo Nicola [Thu, 6 Oct 2022 12:29:31 +0000 (12:29 +0000)]
etnaviv: add default clear_buffer and clear_texture APIS
These are required to support rusticl.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Italo Nicola [Thu, 24 Nov 2022 03:35:18 +0000 (03:35 +0000)]
etnaviv: use stderr for compiler error logging
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Italo Nicola [Thu, 23 Mar 2023 00:47:36 +0000 (00:47 +0000)]
etnaviv: abort() instead of assert(0) on compiler error
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22210>
Marek Olšák [Sat, 1 Apr 2023 03:06:47 +0000 (23:06 -0400)]
amd/registers: use gfx9 packet definitions for gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Sat, 1 Apr 2023 03:06:11 +0000 (23:06 -0400)]
amd/registers: update gfx940.json
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Sat, 1 Apr 2023 03:04:34 +0000 (23:04 -0400)]
amd/registers: fix the parser to include CP_COHER registers for gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Sat, 1 Apr 2023 03:03:19 +0000 (23:03 -0400)]
amd/registers: simplify integer division by 0x1000 in the parser
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Sat, 1 Apr 2023 03:00:50 +0000 (23:00 -0400)]
radeonsi: don't set registers that don't exist on gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Mon, 27 Feb 2023 10:31:50 +0000 (04:31 -0600)]
radeonsi/vcn: enable RGBA/ARGB formats on gfx940 jpeg
enable RGBA/ARGB format on gfx940 to aid RGBA/ARGB conversion after decode
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Thu, 2 Mar 2023 07:07:32 +0000 (01:07 -0600)]
frontends/va: support crop region in jpeg decode
propogate region of interest co-ordinates for crop region decode
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Wed, 29 Mar 2023 21:19:10 +0000 (02:49 +0530)]
radeonsi/vcn: reset to default value when ROI/FC is not used
when decoding without ROI/FC feature reset the registers to default value.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sonny Jiang [Thu, 2 Feb 2023 20:44:06 +0000 (15:44 -0500)]
radeonsi/vcn: Add decode support for gfx940
Add VCN decode for gfx940
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sonny Jiang [Thu, 2 Feb 2023 20:29:48 +0000 (15:29 -0500)]
radeonsi/vcn: Add video capabilities support for gfx940
Add VCN codec caps support for gfx940
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sonny Jiang [Thu, 2 Feb 2023 20:17:37 +0000 (15:17 -0500)]
amd/common: Add gfx940 codec query support
Add support for GFX940 VCN query
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Mon, 7 Nov 2022 11:05:27 +0000 (16:35 +0530)]
radeonsi/vcn: set jpeg reg version for gfx940
select appropriate jpeg register version for gfx940
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Wed, 2 Nov 2022 18:26:21 +0000 (23:56 +0530)]
radeonsi/vcn: support ARGB/RGBA conversion on JPEG 4.0.3
enable ARGB/RGBA conversion feature on JPEG 4.0.3
v2: fix regression caused due to uninitialized variable
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Fri, 21 Oct 2022 14:26:07 +0000 (19:56 +0530)]
radeonsi/vcn: add support for picture crop on JPEG 4.0.3
set the crop region and enable the feature if requested
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Fri, 21 Oct 2022 13:40:39 +0000 (19:10 +0530)]
radeonsi/vcn: use register versions for jpeg
update the register version and select appropriate registers during decoder create
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Sathishkumar S [Thu, 3 Nov 2022 11:27:27 +0000 (16:57 +0530)]
radeonsi/vcn: add register definitions for JPEG 4.0.3
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Ganesh Belgur Ramachandra [Mon, 27 Feb 2023 10:30:33 +0000 (04:30 -0600)]
ac/nir: fix CDNA image lowering for array textures
The x,y coordinates were not added.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Fri, 28 Oct 2022 21:21:07 +0000 (17:21 -0400)]
ac/nir: implement image opcode emulation for CDNA, enable it in radeonsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Thu, 27 Oct 2022 17:23:23 +0000 (13:23 -0400)]
radeonsi: add an emulated image descriptor for gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Fri, 28 Oct 2022 12:29:47 +0000 (08:29 -0400)]
ac/surface: force linear image layout for chips not supporting image opcodes
Image opcodes will be emulated.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Fri, 21 Oct 2022 20:10:01 +0000 (16:10 -0400)]
radeonsi: always use ffma32 on gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Fri, 21 Oct 2022 19:59:36 +0000 (15:59 -0400)]
radeonsi: use COMPUTE_DISPATCH_SCRATCH_BASE on gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>
Marek Olšák [Fri, 21 Oct 2022 19:10:40 +0000 (15:10 -0400)]
amd: add initial code for gfx940
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22158>