Marek Olšák [Sun, 1 Apr 2018 19:37:11 +0000 (15:37 -0400)]
radeonsi: use r600_common_context less pt6
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 20:49:48 +0000 (16:49 -0400)]
radeonsi: update copyrights
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 20:40:30 +0000 (16:40 -0400)]
radeonsi: switch radeon_add_to_buffer_list parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:37:11 +0000 (15:37 -0400)]
radeonsi: use r600_common_context less pt5
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:37:11 +0000 (15:37 -0400)]
radeonsi: use r600_common_context less pt4
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:37:11 +0000 (15:37 -0400)]
radeonsi: use r600_common_context less pt3
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:37:11 +0000 (15:37 -0400)]
radeonsi: use r600_common_context less pt2
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:37:11 +0000 (15:37 -0400)]
radeonsi: use r600_common_context less pt1
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:24:07 +0000 (15:24 -0400)]
radeonsi: don't use r600_common_context in si_emit_cache_flush
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:16:26 +0000 (15:16 -0400)]
radeonsi: switch r600_atom::emit parameter to si_context
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:07:58 +0000 (15:07 -0400)]
radeonsi: flatten / remove struct r600_ring
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 19:03:23 +0000 (15:03 -0400)]
radeonsi: remove r600_ring::flush callback
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:59:44 +0000 (14:59 -0400)]
radeonsi: make radeon_add_to_buffer_list_check_mem be gfx-only
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:52:42 +0000 (14:52 -0400)]
radeonsi: add_to_buffer_list functions can return void
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:46:05 +0000 (14:46 -0400)]
radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:40:34 +0000 (14:40 -0400)]
radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:31:02 +0000 (14:31 -0400)]
radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:24:53 +0000 (14:24 -0400)]
radeonsi: rename si_hw_context.c -> si_gfx_cs.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:22:54 +0000 (14:22 -0400)]
radeonsi: move si_destroy_saved_cs to si_debug.c
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:18:17 +0000 (14:18 -0400)]
radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:17:23 +0000 (14:17 -0400)]
radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:15:34 +0000 (14:15 -0400)]
radeonsi: remove r600_pipe_common::blit_decompress_depth
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:13:22 +0000 (14:13 -0400)]
radeonsi: remove r600_pipe_common::decompress_dcc
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:06:06 +0000 (14:06 -0400)]
radeonsi: remove r600_pipe_common::invalidate_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:04:04 +0000 (14:04 -0400)]
radeonsi: remove r600_pipe_common::rebind_buffer
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 18:00:14 +0000 (14:00 -0400)]
radeonsi: remove r600_common_context::set_occlusion_query_state
and remove unused old_enable parameter.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:55:34 +0000 (13:55 -0400)]
radeonsi: remove r600_pipe_common::save_qbo_state
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:51:09 +0000 (13:51 -0400)]
radeonsi: remove unused query code
The get_size perf counter callback is also inlined and removed.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:40:41 +0000 (13:40 -0400)]
radeonsi: use num_cs_dw_queries_suspend
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:32:47 +0000 (13:32 -0400)]
radeonsi: remove r600_pipe_common::need_gfx_cs_space
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:30:02 +0000 (13:30 -0400)]
radeonsi: remove r600_pipe_common::set_atom_dirty
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:29:04 +0000 (13:29 -0400)]
radeonsi: remove r600_pipe_common::check_vm_faults
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sun, 1 Apr 2018 17:24:43 +0000 (13:24 -0400)]
radeonsi: call CS flush functions directly whenever possible
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sat, 31 Mar 2018 02:15:52 +0000 (22:15 -0400)]
radeonsi: skip DCC render feedback checking if color writes are disabled
Dylan Baker [Wed, 4 Apr 2018 17:23:02 +0000 (10:23 -0700)]
meson: fix megadriver symlinking
Which should be relative instead of absolute.
Fixes: f7f1b30f81e842db6057591470ce3cb6d4fb2795 ("meson: extend install_megadrivers script to handle symmlinking")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105567
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Dylan Baker [Wed, 4 Apr 2018 17:53:16 +0000 (10:53 -0700)]
meson: Set .so version for xa like autotools does
Fixes: 0ba909f0f111824223bc38563d1a6bc73e69c2cc ("meson: build gallium xa state tracker")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Rafael Antognolli [Tue, 6 Mar 2018 17:21:40 +0000 (09:21 -0800)]
anv: Make blorp update the clear color.
Instead of updating the clear color in anv before a resolve, just let
blorp handle that for us during fast clears.
v5: Update comment about HiZ clear color (Jordan).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Fri, 19 Jan 2018 01:19:30 +0000 (17:19 -0800)]
anv: Use clear address for HiZ fast clears too.
Store the default clear address for HiZ fast clears on a global bo, and
point to it when needed.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 18 Jan 2018 17:50:48 +0000 (09:50 -0800)]
anv: Emit the fast clear color address, instead of value.
On Gen10+, instead of copying the clear color from the state buffer to
the surface state, just use the address of the state buffer in the
surface state directly. This way we can avoid the copy from state buffer
to surface state.
v4:
- Remove use_clear_address from anv code. (Jason)
- Use the helper to extract clear color from attachment (Jason)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Wed, 28 Feb 2018 01:06:13 +0000 (17:06 -0800)]
anv: Add a helper to extract clear color from the attachment.
Extract the code from color_attachment_compute_aux_usage, so we can
later reuse it to update the clear color state buffer.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Tue, 29 Aug 2017 23:30:26 +0000 (16:30 -0700)]
i965/surface_state: Emit the clear color address instead of value.
On Gen10, when emitting the surface state, use the value stored in the
clear color entry buffer by using a clear color address in the surface
state.
v4: Use the clear color offset from the clear_color_bo, when available.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Tue, 29 Aug 2017 23:25:30 +0000 (16:25 -0700)]
i965/blorp: Update the fast clear value buffer.
On Gen10, whenever we do a fast clear, blorp will update the clear color
state buffer for us, as long as we set the clear color address
correctly.
However, on a hiz clear, if the surface is already on the fast clear
state we skip the actual fast clear operation and, before gen10, only
updated the miptree. On gen10+ we need to update the clear value state
buffer too, since blorp will not be doing a fast clear and updating it
for us.
v4:
- do not use clear_value_size in the for loop
- Get the address of the clear color from the aux buffer or the
clear_color_bo, depending on which one is available.
- let core blorp update the clear color, but also update it when we
skip a fast clear depth.
v5: Better subject (Jordan).
v6: Remove outdated comment (Jason).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Mon, 5 Mar 2018 19:25:12 +0000 (11:25 -0800)]
i965: Add aux_buf variable to simplify code.
In a follow up patch, we make use of clear_color_bo, which is in
mt->mcs_buf or mt->hiz_buf. To avoid duplicating more code that does the
same thing on both aux buffers, just use aux_buf already.
v5: Add aux_buf to brw_wm_surface_state too.
v6: Drop aux_surf and use aux_buf->surf instead (Jason).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Thu, 1 Mar 2018 00:11:34 +0000 (16:11 -0800)]
i965/miptree: Add new clear color BO for winsys aux buffers
Add an extra BO to store clear color when we receive the aux buffer from
the window system. Since we have no control over the aux buffer size in
this case, we need the new BO to store only the clear color.
v5:
- Better subject (Jordan).
- Drop alignment from brw_bo_alloc().
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Thu, 10 Aug 2017 16:36:28 +0000 (09:36 -0700)]
i965/miptree: Add space to store the clear value in the aux surface.
Similarly to vulkan where we store the clear value in the aux surface,
we can do the same in GL.
v2: Remove unneeded extra function.
v3: Use clear_value_state_size instead of clear_value_size.
v4:
- rename to clear_color_state_size
- store clear_color_bo and clear_color_offset in the aux buf struct
v5: Unreference clear color bo (Jordan)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Mon, 5 Mar 2018 16:52:35 +0000 (08:52 -0800)]
intel/blorp: Update clear color state buffer during fast clears.
We always want to update the fast clear color during a fast clear on
i965. On anv, we are doing that before a resolve, but by adding support
to blorp, we can do a similar thing and update it during a fast clear
instead.
The goal is to remove some code from anv that does such update, and
centralize everything in blorp, hopefully removing a lot of code
duplication. It also allows us to have a similar behavior on gen < 9 and
gen >= 10.
v5: s/we/we are/ (Jordan)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Wed, 7 Mar 2018 18:49:03 +0000 (10:49 -0800)]
intel/blorp: Only copy clear color when doing a resolve.
We only need to copy the clear color from the state buffer to the
inlined surface state when doing a resolve.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 7 Dec 2017 16:47:38 +0000 (08:47 -0800)]
intel/blorp: Add support for fast clear address.
On gen10+, if surface->clear_color_addr is present, use it directly
instead of copying it to the surface state.
v4: Remove redundant #if clause for GEN <= 10 (Jason)
v5: Move flush after the reloc, and keep lower bits (Topi).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 10 Aug 2017 16:29:51 +0000 (09:29 -0700)]
intel/isl: Add support to emit clear value address.
gen10 can emit the clear color by setting it on a buffer somewhere, and
then adding only the address to the surface state.
This commit adds support for that on isl_surf_fill_state, and if that is
requested, skip setting the clear value itself.
v2: Add assert to make sure we are at least on gen10.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Tue, 27 Mar 2018 22:51:21 +0000 (15:51 -0700)]
intel: Use Clear Color struct size.
The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.
v4:
- Add struct to gen11 too (Jason, Jordan)
- Add field for Converted Clear Color to gen11 (Jason)
- Add clear_color_state_offset to differentiate from
clear_value_offset.
- Fix all the places where clear_value_size was used.
v5 (Jason):
- Split genxml changes to another commit.
- Remove unnecessary gen checks.
- Bring back missing offset increment to init_fast_clear_color().
v6 (Jason):
- On init_fast_clear_color, change:
addr.offset += 4 => sdi.Address.offset += i * 4
- Use GEN_GEN instead of GEN_VERSIONx10.
[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Tue, 27 Mar 2018 22:48:44 +0000 (15:48 -0700)]
intel/genxml: Add Clear Color struct to gen10+.
v5: Split genxml changes into its own commit (Jason).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Mon, 7 Aug 2017 19:14:04 +0000 (12:14 -0700)]
intel/genxml: Use a single field for clear color address on gen10.
genxml does not support having two address fields with different names
but same position in the state struct. Both "Clear Color Address"
and "Clear Depth Address Low" mean the same thing, only for different
surface types.
To work around this genxml limitation, rename "Clear Color Address"
to "Clear Value Address" and use it for both color and depth. Do the
same for the high bits.
TODO: add support for multiple addresses at the same position in the
xml.
v2: Combine high and low order bits into a single address field.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rafael Antognolli [Thu, 18 Jan 2018 00:19:41 +0000 (16:19 -0800)]
genxml: Preserve fields that share dword space with addresses.
Some instructions contain fields that are either an address or a value
of some type based on the content of other fields, such as clear color
values vs address. That works fine if these fields are in the less
significant dword, the lower 32 bits of the address, because they get
OR'ed with the address. But if they are in the higher 32 bits, they get
discarded.
On Gen10 we have fields that share space with the higher 16 bits of the
address too. This commit makes sure those fields don't get discarded.
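The packing problem described above can be sketched in C as follows. This is only an illustration under stated assumptions: `pack_qword_lossy` and `pack_qword_preserving` are hypothetical names, not the functions genxml generates, and the caller is assumed to guarantee that the extra fields do not overlap the address bits.

```c
#include <stdint.h>

/* Hypothetical illustration, not genxml's generated code.
 * "other_fields" holds every non-address field that lives in the same
 * two dwords as the address, already shifted into place. */

/* Behavior described as broken: only the fields in the low dword are
 * OR'ed with the address; anything in the high dword is discarded. */
static uint64_t
pack_qword_lossy(uint64_t address, uint64_t other_fields)
{
   return address | (uint32_t)other_fields;
}

/* Behavior described as the fix: fields in both dwords are OR'ed in,
 * so bits sharing the high dword with the address survive. */
static uint64_t
pack_qword_preserving(uint64_t address, uint64_t other_fields)
{
   return address | other_fields;
}
```

With a 48-bit address of `0x123456789000` and extra fields of `0xABCD000000000003`, the lossy variant keeps only the low `0x3`, while the preserving variant also keeps the `0xABCD` stored in the upper 16 bits.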
v5: Remove spurious whitespace (Jason).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rafael Antognolli [Thu, 18 Jan 2018 22:12:08 +0000 (14:12 -0800)]
anv/image: Do not override lower bits of dword.
The lower bits seem to have extra fields in every platform but gen8
(even though we don't use them in gen9). So just go ahead and avoid
using them for the address.
v4: Use Jason's suggestion for comment explaining the change.
v5: Fix aux_address comment in anv_private.h (Jason)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Samuel Pitoiset [Wed, 4 Apr 2018 10:12:04 +0000 (12:12 +0200)]
radv: implement a fast prefetch path for the vertex stage
This allows draws to start as soon as possible.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Samuel Pitoiset [Wed, 4 Apr 2018 10:12:03 +0000 (12:12 +0200)]
radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Samuel Pitoiset [Wed, 4 Apr 2018 10:12:02 +0000 (12:12 +0200)]
radv: use a mask for VBOs and shaders prefetching
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Wed, 4 Apr 2018 20:11:03 +0000 (16:11 -0400)]
gallium/pp: fix MLAA shaders
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549
Marek Olšák [Wed, 4 Apr 2018 20:04:30 +0000 (16:04 -0400)]
gallium/pp: use user constant buffers
This fixes a radeonsi crash.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026
Marek Olšák [Mon, 2 Apr 2018 21:58:30 +0000 (17:58 -0400)]
st/mesa: set stencil border color the same as intensity
This fixes some stencil border color tests on Vega and Raven chips.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jon Turney [Tue, 3 Apr 2018 16:52:56 +0000 (17:52 +0100)]
Fix use of alloca() without #include <c99_alloca.h>
Fix use of alloca() without #include <c99_alloca.h> in 1da345e5
vbo/vbo_context.c: In function '_vbo_draw_indirect':
vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration]
struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim));
^~~~~~
vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Samuel Pitoiset [Wed, 28 Mar 2018 17:03:00 +0000 (19:03 +0200)]
radv: implement out-of-order rasterization when it's safe on VI+
Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.
No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.
Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.
Loosely based on RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 08:54:29 +0000 (10:54 +0200)]
radv: change blend_enable field to use four bits per CB
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 08:49:33 +0000 (10:49 +0200)]
radv: scan which color blend attachments are enabled
With cb_target_enabled_4bit in order to have four bits per CB.
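As a rough sketch of what "four bits per CB" means, the C helper below expands a mask with one bit per color buffer into a mask with four bits per color buffer. The function name `expand_cb_mask_4bit` and the limit of eight color buffers are assumptions made for the example; this is not the radv code.

```c
#include <stdint.h>

/* Hypothetical helper: expand one enable bit per color buffer (CB)
 * into a nibble (four bits) per CB. Assumes at most 8 CBs. */
static uint32_t
expand_cb_mask_4bit(uint8_t cb_mask)
{
   uint32_t result = 0;

   for (unsigned i = 0; i < 8; i++) {
      if (cb_mask & (1u << i))
         result |= 0xfu << (4 * i);   /* set the whole nibble for CB i */
   }
   return result;
}
```

For example, enabling CB 0 and CB 2 (`0x05`) yields `0x00000f0f`.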
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 08:01:41 +0000 (10:01 +0200)]
radv: put more fields in radv_blend_state
Some will be used for further optimizations (i.e. out-of-order rast).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 29 Mar 2018 12:51:20 +0000 (14:51 +0200)]
radv: do not always disable dual quad mode when chip has RbPlus
For GFX9+ only, RadeonSI does this too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 4 Apr 2018 08:55:43 +0000 (10:55 +0200)]
radv: don't use the SPI barrier management bug workaround
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 4 Apr 2018 08:55:42 +0000 (10:55 +0200)]
radv: mask out high VM address bits in registers where needed
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Lionel Landwerlin [Tue, 3 Apr 2018 13:41:18 +0000 (14:41 +0100)]
intel: compiler: silence compiler warning
../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg*, const brw_reg*)’:
../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type]
Introduced by 8f83eea71e233 ("i965: Add negative_equals methods").
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Iago Toral Quiroga [Mon, 2 Apr 2018 09:39:41 +0000 (11:39 +0200)]
compiler/spirv: set is_shadow for depth comparator sampling opcodes
From the SPIR-V spec, OpTypeImage:
"Depth is whether or not this image is a depth image. (Note that
whether or not depth comparisons are actually done is a property of
the sampling opcode, not of this type declaration.)"
The sampling opcodes that specify depth comparisons are
OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set
is_shadow only for these (we were using the depth property of the
image until now).
v2:
- Do the same for OpImageDrefGather.
- Set is_shadow to false if the sampling opcode is not one of these (Jason)
- Reuse an existing switch statement instead of adding a new one (Jason)
Fixes crashes in:
dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
Sergii Romantsov [Mon, 2 Apr 2018 06:59:06 +0000 (09:59 +0300)]
i965: Extend the negative 32-bit deltas to 64-bits
Gen8+ uses 48-bit address relocations, so the sign of the delta needs
to be extended to the 64-bit return value. Without that, the higher bits
are zeroed and negative values are lost.
Haswell and older use 32-bit deltas so are unaffected by this issue.
v2:
used int32_t function parameter instead of explicit type conversion.
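The effect of the parameter type can be illustrated with a small C sketch. The two function names are hypothetical and this is not the i965 relocation code; it only shows how an unsigned 32-bit delta is zero-extended while a signed one is sign-extended when added to a 64-bit address.

```c
#include <stdint.h>

/* Hypothetical illustration of the bug: a uint32_t delta is
 * zero-extended, so a negative delta turns into a large positive one. */
static uint64_t
apply_delta_zero_extended(uint64_t base, uint32_t delta)
{
   return base + delta;
}

/* Hypothetical illustration of the fix: an int32_t delta is
 * sign-extended to 64 bits before the addition. */
static uint64_t
apply_delta_sign_extended(uint64_t base, int32_t delta)
{
   return base + (uint64_t)(int64_t)delta;
}
```

With a base of `0x100001000` and a delta of -16, the sign-extended version gives `0x100000ff0`, whereas the zero-extended version gives `0x200000ff0`.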
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Tested-by: Andriy Khulap <andriy.khulap@globallogic.com>
Tested-by: Stuart Young <cefiar@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Fri, 23 Mar 2018 18:05:04 +0000 (11:05 -0700)]
nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination
Otherwise we may end up trying to coalesce in a case such as
ssa_1 = fadd r1, r2
r3.x = fneg(r2);
r3 = vec4(ssa_1, ssa_1.y, ...)
and that would cause us to move the writes to r3 from the vec to the
fadd which would re-order them with respect to the write from the fneg.
In order to solve this, we just don't coalesce if the destination of the
vec is not SSA. We could try to get clever and still coalesce if there
are no writes to the destination of the vec between the vec and the ALU
source. However, since registers only come from phi webs and indirects,
the chances of having a vec with a register destination that is actually
coalescable into its source is very slim.
Shader-db results on Haswell:
total instructions in shared programs: 13657906 -> 13659101 (<.01%)
instructions in affected programs: 149291 -> 150486 (0.80%)
helped: 0
HURT: 592
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kevin Strasser [Tue, 3 Apr 2018 21:21:34 +0000 (14:21 -0700)]
anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL
If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel
will hit a -EBADF error. Move the close(fd) call to the end of
anv_CreateDmaBufImageINTEL().
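The ordering issue can be reproduced in plain C without any DRM calls. In the sketch below, `fd_status` stands in for the import step (it simply asks the kernel whether the descriptor is valid) and `import_then_close` is a hypothetical wrapper showing the corrected order; neither is part of anv.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Stand-in for the import ioctl: returns 0 if fd is a valid open
 * descriptor, or the errno reported by the kernel (EBADF once closed). */
static int
fd_status(int fd)
{
   return fcntl(fd, F_GETFD) == -1 ? errno : 0;
}

/* Hypothetical import callback type, for illustration only. */
typedef int (*import_fn)(int fd);

/* Corrected order: use the descriptor first, close it last. */
static int
import_then_close(int fd, import_fn import)
{
   int ret = import(fd);

   close(fd);
   return ret;
}
```

Calling the import step after `close(fd)` instead would make the stand-in report `EBADF`, which is the failure the commit describes.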
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Tue, 3 Apr 2018 01:38:13 +0000 (11:38 +1000)]
glsl: always call do_lower_jumps() after loop unrolling
This fixes a bug in radeonsi where LLVM cannot handle the case where
a break exists but it's not the last instruction in the block.
LLVM would fail with:
Terminator found in the middle of a basic block!
LLVM ERROR: Broken function found, compilation aborted!
Fixes: 96fe8834f539 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
James Legg [Fri, 30 Mar 2018 15:45:01 +0000 (16:45 +0100)]
vulkan/wsi/wayland: fix leaks
Fixes: bfa22266cd ("vulkan/wsi/wayland: Add support for zwp_dmabuf")
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: Jason Ekstrand <jason@jlekstrand.net>
Juan A. Suarez Romero [Tue, 3 Apr 2018 17:38:36 +0000 (17:38 +0000)]
docs: update calendar, add news and link release notes to 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Tue, 3 Apr 2018 17:33:23 +0000 (17:33 +0000)]
docs: add sha256 checksums for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit ba371c7262a484391cace9d5e17635ed14c58692)
Juan A. Suarez Romero [Tue, 3 Apr 2018 16:39:48 +0000 (16:39 +0000)]
docs: add release notes for 17.3.8
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 3bf5c10c5c0e9fac6eb0b2c201bcf44755ecfaec)
Jakob Bornecrantz [Tue, 3 Apr 2018 15:58:10 +0000 (16:58 +0100)]
st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB.
When running virgl on a GLES host the only sRGB formats that support
rendering are RGBA and RGBX. That pipe format is in the sRGB default
lists that the state tracker uses when mapping mesa formats.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
Lionel Landwerlin [Tue, 3 Apr 2018 10:45:24 +0000 (11:45 +0100)]
intel: gen-decoder: print all dword a field belongs to
Prior to printing a decoded field, print out all dwords that field
belongs to. In particular with address fields spanning multiple
dwords, we want to have all the dwords presented before the field is
decoded to make it easier to read.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 3 Apr 2018 10:21:31 +0000 (11:21 +0100)]
intel: genxml: decode variable length MI_LRI
MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one
command. In our drivers we only use one tuple at a time, but the
kernel might load more than one at a time.
Instead of making all the tuples part of a group, we leave out the
first tuple (the one we use in the generated packing structures).
This is particularly useful for looking at error stats generated by
the kernel.
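A variable-length MI_LOAD_REGISTER_IMM can be walked as in the C sketch below. It assumes the DWord Length field sits in bits 7:0 of the header and that the total command size is that value plus 2; the struct and function names are hypothetical and this is not the gen-decoder implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one (register, value) tuple. */
struct lri_tuple {
   uint32_t reg;
   uint32_t value;
};

/* Decode up to max_out tuples from an MI_LOAD_REGISTER_IMM command.
 * Assumption: total dwords = (header & 0xff) + 2, i.e. one header
 * dword followed by two dwords per tuple. Returns the tuple count. */
static size_t
decode_lri(const uint32_t *dw, struct lri_tuple *out, size_t max_out)
{
   size_t total_dwords = (dw[0] & 0xff) + 2;
   size_t count = (total_dwords - 1) / 2;

   if (count > max_out)
      count = max_out;

   for (size_t i = 0; i < count; i++) {
      out[i].reg = dw[1 + 2 * i];
      out[i].value = dw[2 + 2 * i];
   }
   return count;
}
```

A header with DWord Length 1 therefore carries one tuple, and DWord Length 3 carries two.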
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 3 Apr 2018 10:01:56 +0000 (11:01 +0100)]
intel: gen-decoder: don't decode fields beyond a dword length
For example, a PIPE_CONTROL with DWordLength = 2 should look like
this :
0xffffe374: 0x7a000002: PIPE_CONTROL
0xffffe374: 0x7a000002 : Dword 0
DWord Length: 2
0xffffe378: 0x00800000 : Dword 1
Depth Cache Flush Enable: false
Stall At Pixel Scoreboard: false
State Cache Invalidation Enable: false
Constant Cache Invalidation Enable: false
VF Cache Invalidation Enable: false
DC Flush Enable: false
Pipe Control Flush Enable: false
Notify Enable: false
Indirect State Pointers Disable: false
Texture Cache Invalidation Enable: false
Instruction Cache Invalidate Enable: false
Render Target Cache Flush Enable: false
Depth Stall Enable: false
Post Sync Operation: 0 (No Write)
Generic Media State Clear: false
TLB Invalidate: false
Global Snapshot Count Reset: false
Command Streamer Stall Enable: false
Store Data Index: 0
LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
Destination Address Type: 0 (PPGTT)
Flush LLC: false
0xffffe37c: 0x00000000 : Dword 2
Address: 0x00000000
0xffffe384: 0x05000000: MI_BATCH_BUFFER_END
Prior to this change, fields beyond the length of the command would be
decoded (notice the MI_BATCH_BUFFER_END decoded as part of the
previous PIPE_CONTROL) :
0xffffe374: 0x7a000002: PIPE_CONTROL
0xffffe374: 0x7a000002 : Dword 0
DWord Length: 2
0xffffe378: 0x00800000 : Dword 1
Depth Cache Flush Enable: false
Stall At Pixel Scoreboard: false
State Cache Invalidation Enable: false
Constant Cache Invalidation Enable: false
VF Cache Invalidation Enable: false
DC Flush Enable: false
Pipe Control Flush Enable: false
Notify Enable: false
Indirect State Pointers Disable: false
Texture Cache Invalidation Enable: false
Instruction Cache Invalidate Enable: false
Render Target Cache Flush Enable: false
Depth Stall Enable: false
Post Sync Operation: 0 (No Write)
Generic Media State Clear: false
TLB Invalidate: false
Global Snapshot Count Reset: false
Command Streamer Stall Enable: false
Store Data Index: 0
LRI Post Sync Operation: 1 (MMIO Write Immediate Data)
Destination Address Type: 0 (PPGTT)
Flush LLC: false
0xffffe37c: 0x00000000 : Dword 2
Address: 0x00000000
0xffffe380: 0x00000000 : Dword 3
0xffffe384: 0x05000000 : Dword 4
Immediate Data: 83886080
0xffffe384: 0x05000000: MI_BATCH_BUFFER_END
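The bound check implied by the two dumps above can be sketched in C. The helper names are hypothetical, and the length bias of 2 is an assumption that matches the dump (DWord Length 2, next command four dwords later); this is not the gen-decoder source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: DWord Length is in bits 7:0 and the command occupies
 * that value plus 2 dwords, as in the PIPE_CONTROL dump above. */
static unsigned
command_dwords(uint32_t header)
{
   return (header & 0xff) + 2;
}

/* A field is decoded only if the dword it starts in lies inside the
 * command; fields starting past the end are skipped. */
static bool
field_in_command(uint32_t header, unsigned field_start_bit)
{
   return field_start_bit / 32 < command_dwords(header);
}
```

For the header `0x7a000002`, the command spans four dwords, so a field starting in Dword 2 is decoded while one starting in Dword 4 (the "Immediate Data" of the second dump) is not.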
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 27 Mar 2018 17:10:45 +0000 (18:10 +0100)]
intel: error_decode: add an option to decode all buffers
The kernel reports workaround batch buffers, but we're not presenting
them currently. Although they might not be useful for debugging purely
userspace driver issues, when problems arise because of interactions
between kernel & userspace drivers, it's nice to be able to decode
them.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Lionel Landwerlin [Tue, 27 Mar 2018 16:56:44 +0000 (17:56 +0100)]
intel: genxml: add preemption control instructions
Helpful to debug kernel workaround batchbuffers.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Dylan Baker [Mon, 2 Apr 2018 22:29:45 +0000 (15:29 -0700)]
mesa: ensure that variable is initialized
This variable controls whether we link using the glsl code path or the
spirv path. It's set when we validate that all shaders are glsl or
spirv, but if there are no shaders attached to the program it will
remain unset, resulting in undefined behavior. We want to go down the
glsl path in that case, so initialize to false.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820
Fixes: 16f6634e7fb5ada308e55b852cd49251e7f3f8b1 ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
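A minimal sketch of the failure mode described in this commit, not the actual Mesa code: the enum, the function name and the loop are hypothetical stand-ins. It only illustrates why a flag that is assigned inside a per-shader loop needs a defined value for the zero-shader case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shader kinds, for illustration only. */
enum sketch_shader_kind { SKETCH_GLSL, SKETCH_SPIRV };

/* Returns true when the program should take the SPIR-V link path.
 * The flag is only assigned while walking attached shaders, so it
 * must start out as false to cover a program with no shaders. */
static bool
sketch_use_spirv_path(const enum sketch_shader_kind *shaders, size_t count)
{
   bool spirv = false; /* without this initializer, count == 0 reads garbage */

   for (size_t i = 0; i < count; i++)
      spirv = (shaders[i] == SKETCH_SPIRV);

   return spirv;
}
```

With no shaders attached the loop body never runs, and the initializer alone selects the GLSL path.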
Marek Olšák [Mon, 2 Apr 2018 19:06:42 +0000 (15:06 -0400)]
radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Samuel Pitoiset [Mon, 2 Apr 2018 16:17:55 +0000 (18:17 +0200)]
radv: enable VK_EXT_shader_viewport_index_layer
The driver already supports exporting the Layer and ViewportIndex
built-ins from vertex or tessellation shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rob Clark [Wed, 28 Mar 2018 12:32:10 +0000 (08:32 -0400)]
nir+drivers: add helpers to get # of src/dest components
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
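The kind of helper this commit describes can be sketched as follows. The structs and function names here are simplified, hypothetical stand-ins rather than the real NIR types; the assumption is that a per-opcode table entry of 0 means "use the instruction's own component count".

```c
#include <assert.h>

#define SKETCH_MAX_SRCS 4

/* Hypothetical per-opcode table: 0 means "taken from the instruction". */
struct sketch_intrinsic_info {
   unsigned num_srcs;
   unsigned src_components[SKETCH_MAX_SRCS];
   unsigned dest_components;
};

/* Hypothetical instruction carrying its own component count. */
struct sketch_intrinsic {
   const struct sketch_intrinsic_info *info;
   unsigned num_components;
};

/* Number of components of source srcn. */
static unsigned
sketch_src_components(const struct sketch_intrinsic *intr, unsigned srcn)
{
   assert(srcn < intr->info->num_srcs);
   unsigned fixed = intr->info->src_components[srcn];
   return fixed ? fixed : intr->num_components;
}

/* Number of components of the destination. */
static unsigned
sketch_dest_components(const struct sketch_intrinsic *intr)
{
   unsigned fixed = intr->info->dest_components;
   return fixed ? fixed : intr->num_components;
}
```

Callers that previously open-coded the "fixed size or per-instruction size" check can then call one helper instead.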
Rob Clark [Mon, 2 Apr 2018 14:47:23 +0000 (10:47 -0400)]
freedreno/ir3: fix fallout of unused false-depth elimination
Since we were using the MARK flag both for preventing loops and for
tracking whether instructions were used, we could end up in an infinite
loop due to bd2ca2bcdd. Instead, invert the logic: mark all instructions
UNUSED up front and clear the flag as we visit them.
Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps
Signed-off-by: Rob Clark <robdclark@gmail.com>
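The inverted scheme from this commit can be sketched roughly as below. This is an illustration under assumptions, not the ir3 code: the struct, the flag value and the function names are hypothetical. Every instruction starts out flagged UNUSED, and a visit clears the flag, so one bit both records use and stops a traversal from re-entering a cycle.

```c
#include <stddef.h>

#define SKETCH_FLAG_UNUSED 0x1u

/* Hypothetical instruction with up to two sources. */
struct sketch_instr {
   unsigned flags;
   struct sketch_instr *srcs[2];
};

/* Clear UNUSED on everything reachable from in. An instruction whose
 * flag is already clear has been visited, which terminates cycles. */
static void
sketch_visit(struct sketch_instr *in)
{
   if (!in || !(in->flags & SKETCH_FLAG_UNUSED))
      return;

   in->flags &= ~SKETCH_FLAG_UNUSED;
   for (size_t i = 0; i < 2; i++)
      sketch_visit(in->srcs[i]);
}

/* Flag everything UNUSED up front, walk from root, and return how
 * many instructions are still flagged afterwards. */
static size_t
sketch_count_unused(struct sketch_instr *all, size_t count,
                    struct sketch_instr *root)
{
   size_t unused = 0;

   for (size_t i = 0; i < count; i++)
      all[i].flags |= SKETCH_FLAG_UNUSED;

   sketch_visit(root);

   for (size_t i = 0; i < count; i++)
      if (all[i].flags & SKETCH_FLAG_UNUSED)
         unused++;

   return unused;
}
```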
Timothy Arceri [Sat, 31 Mar 2018 23:32:28 +0000 (09:32 +1000)]
gallium/pipebuffer: fix parenthesis location
Without this the return value will never get set to -1. This
was first added in 49866c8f3457 and copied in 2b396eeed983.
Fixes: 2b396eeed983 "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342
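A minimal sketch of this class of bug, assuming the misplaced parenthesis sat around an assignment inside a comparison; the function names are hypothetical and this is not the pipebuffer code itself. Because `>` binds tighter than `=`, the buggy form stores the result of the comparison (0 or 1) instead of the callee's return value.

```c
/* Hypothetical three-state check: 1 = usable, 0 = not compatible,
 * -1 = compatible but currently busy. */
static int
sketch_check(int state)
{
   return state;
}

/* Buggy form: ret receives (sketch_check(state) > 0), i.e. 0 or 1,
 * so a -1 result is silently turned into 0. */
static int
sketch_reclaim_buggy(int state)
{
   int ret;

   if ((ret = sketch_check(state) > 0))
      return ret;
   return ret;
}

/* Fixed form: the assignment is parenthesized first, so ret keeps
 * the real return value, including -1. */
static int
sketch_reclaim_fixed(int state)
{
   int ret;

   if ((ret = sketch_check(state)) > 0)
      return ret;
   return ret;
}
```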
Tapani Pälli [Tue, 3 Apr 2018 05:43:18 +0000 (08:43 +0300)]
Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels"
This reverts commit 41cf30b8bc55fdf36adac3311002dc32b6715949.
Commit caused regressions with KHR-GLES3.packed_pixels.* tests.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eric Anholt <eric@anholt.net>
Mike Lothian [Sun, 1 Apr 2018 00:32:22 +0000 (01:32 +0100)]
gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Mike Lothian [Sun, 1 Apr 2018 00:32:21 +0000 (01:32 +0100)]
radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Mike Lothian [Sun, 1 Apr 2018 00:32:20 +0000 (01:32 +0100)]
ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass
Include llvm-c/Transforms/Utils.h with the newest LLVM 7
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Daniel Stone [Mon, 2 Apr 2018 12:20:34 +0000 (13:20 +0100)]
st/dri: Initialise modifier to INVALID for DRI2
When allocating a buffer for DRI2, set the modifier to INVALID to inform
the backend that we have no supplied modifiers and it should do its own
thing. The missed initialisation forced linear, even if the
implementation had made other decisions.
This resulted in VC4 DRI2 clients failing with:
Modifier 0x0 vs. tiling (0x700000000000001) mismatch
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Andreas Müller <schnitzeltony@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 3f8513172ff6 ("gallium/winsys/drm: introduce modifier field to winsys_handle")
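A rough sketch of why the missed initialisation forced linear, under stated assumptions: the struct and functions are hypothetical, while the two constants mirror the values of DRM_FORMAT_MOD_LINEAR (0) and DRM_FORMAT_MOD_INVALID (0x00ffffffffffffff). A zero-filled handle carries modifier 0, which reads as an explicit request for linear rather than "no modifier supplied".

```c
#include <stdint.h>
#include <string.h>

/* Values mirror DRM_FORMAT_MOD_LINEAR and DRM_FORMAT_MOD_INVALID. */
#define SKETCH_MOD_LINEAR  0x0ULL
#define SKETCH_MOD_INVALID 0x00ffffffffffffffULL

/* Hypothetical handle passed from the frontend to the backend. */
struct sketch_handle {
   uint64_t modifier;
};

/* Hypothetical backend policy: honour an explicit modifier, otherwise
 * fall back to the layout the implementation prefers. */
static uint64_t
sketch_backend_layout(const struct sketch_handle *h, uint64_t preferred)
{
   return h->modifier == SKETCH_MOD_INVALID ? preferred : h->modifier;
}

/* DRI2 path after the fix: zero the handle, then state explicitly
 * that no modifier was supplied. */
static void
sketch_init_handle_dri2(struct sketch_handle *h)
{
   memset(h, 0, sizeof(*h));
   h->modifier = SKETCH_MOD_INVALID;
}
```

With only the memset, the backend in this sketch would see modifier 0x0 and pick linear even when it prefers a tiled layout.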
Marek Olšák [Sun, 7 Jan 2018 20:05:52 +0000 (21:05 +0100)]
radeonsi: implement GL_KHR_blend_equation_advanced
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Fri, 23 Mar 2018 03:40:55 +0000 (23:40 -0400)]
radeonsi: rename unpack_param -> si_unpack_param
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 20 Mar 2018 23:14:57 +0000 (19:14 -0400)]
radeonsi: move FMASK shader logic to shared code
We'll need it for FBFETCH in both TGSI and NIR paths.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 20 Mar 2018 21:32:56 +0000 (17:32 -0400)]
radeonsi: add R600_DEBUG=nofmask to disable MSAA compression
For testing.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 20 Mar 2018 20:45:03 +0000 (16:45 -0400)]
gallium/u_tests: test FBFETCH and shader-based blending with MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>