Marek Olšák [Thu, 28 Sep 2017 19:46:33 +0000 (21:46 +0200)]
gallium/u_tests: test sync_file fences
This should be sufficient for testing all kernel/libdrm/radeonsi codepaths
that are used by radeonsi.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Plamena Manolova [Mon, 2 Oct 2017 20:58:27 +0000 (23:58 +0300)]
i965: Implement ARB_indirect_parameters.
We can implement ARB_indirect_parameters for i965 by
taking advantage of the conditional rendering mechanism.
This works by issuing maxdrawcount draw calls and using
conditional rendering to predicate each of them with
"drawcount > gl_DrawID"
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Plamena Manolova [Mon, 2 Oct 2017 20:58:26 +0000 (23:58 +0300)]
i965: Refactor brw_try_draw_prims.
In order to add our ARB_indirect_parameters implementation we
need to refactor brw_try_draw_prims so that it operates on a
per primitive basis and move the loop into brw_draw_prims.
This commit refactors the brw_try_draw_prims function and
renames it to brw_draw_single_prim.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Plamena Manolova [Mon, 2 Oct 2017 20:58:25 +0000 (23:58 +0300)]
i965: Indroduce brw_finish_drawing.
In order to add our ARB_indirect_parameters implementation we
need to refactor brw_try_draw_prims so that it operates on a
per primitive basis and move the loop into brw_draw_prims.
This commit introduces the brw_finish_drawing function where
we move the code that executes once after the loop.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Plamena Manolova [Mon, 2 Oct 2017 20:58:24 +0000 (23:58 +0300)]
i965: Introduce brw_prepare_drawing.
In order to add our ARB_indirect_parameters implementation we
need to refactor brw_try_draw_prims so that it operates on a
per primitive basis and move the loop into brw_draw_prims.
This commit introduces the brw_prepare_drawing function where
we move the code that executes once before the loop.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Mon, 18 Sep 2017 23:02:14 +0000 (18:02 -0500)]
glsl: Remove spurious assertions
It's inside an if-statement that already checks that the variables are
not NULL.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Thu, 21 Sep 2017 00:51:39 +0000 (19:51 -0500)]
glsl: Move 'foo = foo;' optimization to opt_dead_code_local
The optimization as done in opt_copy_propagation would have to be
removed in the next patch. If we just eliminate that optimization
altogether, shader-db results, even on platforms that use NIR, are hurt
quite substantially. I have not investigated why NIR isn't picking up
the slack here.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Ian Romanick [Mon, 18 Sep 2017 21:20:38 +0000 (16:20 -0500)]
glsl/ast: Use logical-or instead of conditional assignment to set fallthru_var
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Tue, 19 Sep 2017 20:59:00 +0000 (15:59 -0500)]
glsl/ast: Generate a more compact expression to disable execution of default case
Instead of generating a sequence like:
run_default = true;
if (i == 3) // some label that appears after default
run_default = false;
if (i == 4) // some label that appears after default
run_default = false;
...
if (run_default) {
...
}
generate something like:
run_default = !((i == 3) || (i == 4) || ...);
if (run_default) {
...
}
This eliminates one use of conditional assignment, and it enables the
elimination of another.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Tue, 19 Sep 2017 20:47:52 +0000 (15:47 -0500)]
glsl/ast: Explicitly track the set of case labels that occur after default
Previously the instruction stream was walked looking for comparisons
with case-label values. This should generate nearly identical code.
For at least fs-default-notlast-fallthrough.shader_test, the code is
identical.
This change will make later changes possible.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Mon, 18 Sep 2017 21:15:14 +0000 (16:15 -0500)]
glsl/ast: Convert ast_case_label::hir to ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Mon, 18 Sep 2017 20:56:22 +0000 (15:56 -0500)]
glsl/ast: Use ir_binop_equal instead of ir_binop_all_equal
The values being compared are scalars, so these are the same. While
I'm here, simplify the run_default condition to just deref the flag
(instead of comparing a scalar bool with true).
There is a bit of extra change in this patch. When constructing an
ir_binop_equal ir_expression, there is an assertion that the types are
the same. There is no such assertion for ir_binop_all_equal, so
passing glsl_type::uint_type with glsl_type::int_type was previously
fine. A bunch of the code motion is to deal with that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Wed, 27 Sep 2017 00:54:48 +0000 (17:54 -0700)]
glsl/ast: Stop processing a switch-statement after an error in the init-expression
This happens to work now because ir_binop_all_equal is used. This
causes vector typed init-expressions to produce scalar Boolean values
after comparison.
The next commit changes ir_binop_all_equal to ir_binop_equal. Vector
typed init-expressions will then produce vector Boolean values, and, in
debug builds, the ir_assignment constructor will fail an assertion.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Mon, 18 Sep 2017 20:30:51 +0000 (15:30 -0500)]
glsl: Don't pass NULL to ir_assignment constructor when not necessary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Mon, 18 Sep 2017 20:04:03 +0000 (15:04 -0500)]
glsl: Convert lower_variable_index_to_cond_assign to ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Ian Romanick [Mon, 18 Sep 2017 19:17:57 +0000 (14:17 -0500)]
glsl: Fix coding standards issues in lower_variable_index_to_cond_assign
Mostly tabs-before-spaces, but there was some other trivium too.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Ian Romanick [Fri, 15 Sep 2017 00:28:35 +0000 (17:28 -0700)]
glsl: Convert lower_vec_index_to_cond_assign to using ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Ian Romanick [Fri, 15 Sep 2017 01:20:14 +0000 (18:20 -0700)]
glsl: Return ir_variable from compare_index_block
This is basically a wash now, but it simplifies later patches that
convert to using ir_builder.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Ian Romanick [Fri, 15 Sep 2017 00:11:37 +0000 (17:11 -0700)]
glsl: Fix coding standards issues in lower_vec_index_to_cond_assign
Mostly tabs-before-spaces, but there was some other trivium too.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Ian Romanick [Thu, 14 Sep 2017 23:58:31 +0000 (16:58 -0700)]
glsl: Fix coding standards issues in lower_if_to_cond_assign
Mostly tabs-before-spaces issues.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Bas Nieuwenhuizen [Sun, 1 Oct 2017 22:23:42 +0000 (00:23 +0200)]
nir/spirv: Allow loop breaks in a switch body.
Per the SPIR-V spec 2.11 Structured Control Flow:
"The only blocks in a construct that can branch outside the construct are
...
- a break block for the innermost loop it is inside of.
..."
With
"Break block: A block containing a branch to the Merge Block of a loop header's merge instruction."
Note that it puts no restriction on not being in an if or switch within the innermost loop.
This passes the loop_break block to the switch body so it can properly detect loop breaks.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Rob Clark [Mon, 2 Oct 2017 16:22:11 +0000 (12:22 -0400)]
freedreno/a5xx: fix missing restore state
RB_CLEAR_CNTL seems to be in a funny state after boot (at least on
8x96/a530).
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Samuel Pitoiset [Mon, 2 Oct 2017 10:25:42 +0000 (12:25 +0200)]
radv: make radv_dynamic_state_copy() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dylan Baker [Thu, 28 Sep 2017 21:02:51 +0000 (14:02 -0700)]
meson: change vulkan icd config to - instead of _
Just to be consistent.
v2: - update meson.build too
v3: - remove unrelated whitespace change
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Dylan Baker [Thu, 28 Sep 2017 17:48:30 +0000 (10:48 -0700)]
meson: check for python2 mako
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Juan A. Suarez Romero [Mon, 2 Oct 2017 15:47:41 +0000 (17:47 +0200)]
docs: update calendar, add news item and link release notes for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Mon, 2 Oct 2017 16:10:02 +0000 (18:10 +0200)]
docs: add sha256 checksums for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit
5a71ed6fa5b78f04b29e972e0759fa15cf0247b2)
Juan A. Suarez Romero [Mon, 2 Oct 2017 15:26:10 +0000 (17:26 +0200)]
docs: add release notes for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit
bc12538a8e79113b733381ffdc0f6c89d59d0a50)
Emil Velikov [Thu, 28 Sep 2017 17:38:13 +0000 (18:38 +0100)]
wayland-egl: rework and simplify wl_egl_window initialization
Use calloc instead of malloc + explicitly zeroing the different fields.
We need special handling for the version field which is of type
const intptr_t.
As we're here document why keeping the constness is a good idea.
The wl_egl_window_resize() call is replaced with an explicit set of the
width/height.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Thu, 28 Sep 2017 17:25:56 +0000 (18:25 +0100)]
wayland-egl: move WL_EGL_EXPORT declaration to where it's used
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Thu, 28 Sep 2017 17:22:22 +0000 (18:22 +0100)]
wayland-egl: use C99 comments
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Thu, 28 Sep 2017 17:13:18 +0000 (18:13 +0100)]
wayland-egl: remove no longer needed wayland-client dependency
Was required for wl_surface, which is opaque and forward declared with
earlier patch.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Thu, 28 Sep 2017 17:11:25 +0000 (18:11 +0100)]
wayland-egl: add stdint.h include for intptr_t
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Thu, 28 Sep 2017 17:09:48 +0000 (18:09 +0100)]
wayland-egl: forward declare struct wl_surface
It makes the header self-contained and with later commit we'll remove
the unnecessary wayland-client.h include.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Thu, 28 Sep 2017 17:03:58 +0000 (18:03 +0100)]
wayland-egl: rename wayland-egl-{priv,backend}.h
In preparation to lifting the whole thing out as a separate library.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
Emil Velikov [Wed, 27 Sep 2017 16:36:29 +0000 (17:36 +0100)]
egl/dri: link directly to libglapi.so
Shared glapi (libglapi.so) has been a requirement for years, in order
to build EGL.
Remove the no longer necessary dlopen/dlsym dance and link to the
library directly.
This allows us to remove a handful of platform specific workarounds, due
to the different name of the library.
v2:
- Android: export the include dir (RobH)
- Drop unused local variable (Eric)
Cc: Jonathan Gray <jsg@jsg.id.au>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Cc: Julien Isorce <julien.isorce@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Tomasz Figa <tfiga@chromium.org> (v1)
Tested-by: Rob Herring <robh@kernel.org>
Emil Velikov [Mon, 18 Sep 2017 10:29:21 +0000 (11:29 +0100)]
swr/rast: do not crash on NULL strings returned by getenv
The current convenience function GetEnv feeds the results of getenv
directly into std::string(). That is a bad idea, since the variable
may be unset, thus we feed NULL into the C++ construct.
The latter of which is not allowed and leads to a crash.
v2: Better variable name, implicit char* -> std::string conversion (Eric)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101832
Fixes:
a25093de718 ("swr/rast: Implement JIT shader caching to disk")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Bernhard Rosenkraenzer <bero@lindev.ch>
[Emil Velikov: make an actual commit from the misc diff]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Laurent Carlier <lordheavym@gmail.com> (v1)
Rob Clark [Thu, 14 Sep 2017 19:03:11 +0000 (15:03 -0400)]
freedreno/a5xx: align height to GMEM
Similar to the way width/pitch alignment works, it seems like we need to
do similar for height. Otherwise the BLIT from system memory to GMEM
can over-fetch beyond the end of the buffer, triggering a fault.
I'm not sure if there is a better solution yet. Possibly we could fall
back to pre-a5xx style DRAW packets for cases where BLIT might over-
fetch. (We in theory have that problem already with rendering to higher
mipmap levels, although fortunately those tend to use GMEM bypass.)
This fixes issues reported with glamor.
Reported-by: don.harbin@linaro.org
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Nicolai Hähnle [Tue, 26 Sep 2017 18:36:10 +0000 (20:36 +0200)]
radeonsi: adjust clip discard based on line width / point size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 26 Sep 2017 16:10:58 +0000 (18:10 +0200)]
radeonsi: remove si_context::{scissor_enabled,clip_halfz}
They are just copies of the rasterizer state.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 26 Sep 2017 16:00:21 +0000 (18:00 +0200)]
radeonsi: simplify the signature of si_update_vs_writes_viewport_index
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 26 Sep 2017 15:57:59 +0000 (17:57 +0200)]
radeonsi: move current_rast_prim into si_context
v2: rebase fixes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 26 Sep 2017 15:56:15 +0000 (17:56 +0200)]
radeonsi: move and rename scissor and viewport state and functions
v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 26 Sep 2017 15:24:19 +0000 (17:24 +0200)]
radeonsi: remove si_apply_scissor_bug_workaround
It only affects pre-SI chips.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 26 Sep 2017 15:17:55 +0000 (17:17 +0200)]
radeonsi: move r600_viewport.c to si_viewport.c
This is purely a file-move + #include fixup + build system changes.
Other cleanups will follow in subsequent commits.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 17 Sep 2017 09:59:37 +0000 (11:59 +0200)]
radeonsi: fix maximum advertised point size / line width
The hardware registers store the half-size/width in 12.4 fixed point
format, so 8192 is the maximum.
Fixes dEQP-GLES3.functional.rasterization.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 17 Sep 2017 09:28:21 +0000 (11:28 +0200)]
radeonsi: deduce rast_prim correctly for tessellation point mode
Together with the previous patches, this fixes
dEQP-GLES31.functional.primitive_bounding_box.wide_points.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 17 Sep 2017 09:26:53 +0000 (11:26 +0200)]
radeonsi: don't discard points and lines
This is a bit conservative, but a more precise solution requires access
to the rasterizer state. This is something to tackle after the fork between
r600 and radeonsi.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sun, 17 Sep 2017 09:10:04 +0000 (11:10 +0200)]
radeonsi: move current_rast_prim to r600_common_context
We'll use it in the scissors / clip / guardband state.
v2: avoid a performance regression on r600 when applied to
(pre-fork) stable branches
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 27 Sep 2017 15:05:07 +0000 (17:05 +0200)]
st/mesa: use R10G10B10X2 format where applicable
This is the last step of fixing
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev
for radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 27 Sep 2017 13:25:35 +0000 (15:25 +0200)]
gallium: add PIPE_FORMAT_R10G10B10X2_UNORM
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 27 Sep 2017 13:25:10 +0000 (15:25 +0200)]
mesa/main: R10G10B10_(A2) formats are not color renderable in ES
The EXT_texture_type_2_10_10_10_REV (ES only) states the following issue:
"1. Should textures specified with this type be renderable?
UNRESOLVED: No. A separate extension could provide this functionality."
This partially fixes
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.{rgb,rgba}_unsigned_int_2_10_10_10_rev
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 27 Sep 2017 13:24:31 +0000 (15:24 +0200)]
mesa/main: select the R10G10B10X2_UNORM internal format based on data type
ES requires it. This is a partial fix for
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 11 Sep 2017 14:40:37 +0000 (16:40 +0200)]
glsl: do not set the 'smooth' qualifier by default on ES shaders
It leads to surprising states with integer inputs and outputs on
vertex processing stages (e.g. geometry stages). Instead, rely on the
driver to choose smooth interpolation by default.
We still allow varyings to match when one stage declares it as smooth
and the other declares it without interpolation qualifiers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Rob Clark [Sun, 1 Oct 2017 19:29:24 +0000 (15:29 -0400)]
freedreno: fix PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE
Fixes an assert in fd_acc_query_register_provider() about query provider
not already registered.
Fixes:
3f6b3d9d ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Eric Engestrom [Mon, 2 Oct 2017 10:34:52 +0000 (11:34 +0100)]
egl/wayland: simplify LIBGL_ALWAYS_SOFTWARE logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Nicolai Hähnle [Fri, 29 Sep 2017 09:21:01 +0000 (11:21 +0200)]
radeonsi: fix a regression in integer cube map handling
A recent commit fixed the case of 8888 integer cube maps, which need the
workaround of replacing the data format with USCALED/SSCALED. However,
this broke the case of non-8888 integer cube maps; those still need the
fix of shifting the texture coordinates.
Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar.
Fixes:
6fb0c1013b35 ("radeonsi: workaround for gather4 on integer cube maps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 29 Sep 2017 09:17:03 +0000 (11:17 +0200)]
amd/common: move ac_build_phi from radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Thu, 28 Sep 2017 11:55:31 +0000 (13:55 +0200)]
radv: remove unused radv_meta_state::btoi::render_pass handle
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 29 Sep 2017 15:11:12 +0000 (17:11 +0200)]
radv: do not check the number of levels when doing fast htile
We shouldn't reach this point because HTILE is only enabled
when the number of levels is 1.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 28 Sep 2017 11:50:56 +0000 (13:50 +0200)]
radv: cleanup radv_device_finish_meta_XXX() helpers
Unnecessary to double check that handles are not NULL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 29 Sep 2017 12:13:48 +0000 (14:13 +0200)]
radv: select the pipeline outside of emit_fast_clear_flush()
It can't change during the decompression pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 29 Sep 2017 09:07:22 +0000 (11:07 +0200)]
radv: drop useless param in emit_depth_decomp()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 28 Sep 2017 09:18:01 +0000 (11:18 +0200)]
radv: drop useless check in depth_view_can_fast_clear()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 29 Sep 2017 07:43:29 +0000 (09:43 +0200)]
radv: add radv_subpass_clear_attachment() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 28 Sep 2017 19:47:14 +0000 (21:47 +0200)]
radv: add radv_attachment_needs_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 28 Sep 2017 19:28:18 +0000 (21:28 +0200)]
radv: remove unused param in radv_handle_{cmask,dcc}_image_transition()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 28 Sep 2017 08:24:09 +0000 (10:24 +0200)]
radv: add radv_vi_dcc_enabled() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 27 Sep 2017 19:56:20 +0000 (21:56 +0200)]
radv: do not need to double zero-init the meta state structures
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 27 Sep 2017 19:49:53 +0000 (21:49 +0200)]
radv: inline destroy_render_pass()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 27 Sep 2017 19:47:55 +0000 (21:47 +0200)]
radv: use pipeline handles instead of objects for meta clear operations
To be consistent with other meta operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 27 Sep 2017 19:21:57 +0000 (21:21 +0200)]
radv: inline blit2d_unbind_dst()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 29 Sep 2017 14:48:07 +0000 (16:48 +0200)]
radv: rework DCC/CMASK/FMASK/HTILE allocations
Add helpers and some comments to make the thing more readable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Engestrom [Fri, 29 Sep 2017 14:25:18 +0000 (15:25 +0100)]
meson: fix version typo + grammar
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Iago Toral Quiroga [Wed, 20 Sep 2017 07:22:51 +0000 (09:22 +0200)]
i965: skip reading unused slots at the begining of the URB for the FS
We can start reading the URB at the first offset that contains varyings
that are actually read in the URB. We still need to make sure that we
read at least one varying to honor hardware requirements.
This helps alleviate a problem introduced with
99df02ca26f61 for
separate shader objects: without separate shader objects we assign
locations sequentially, however, since that commit we have changed the
method for SSO so that the VUE slot assigned depends on the number of
builtin slots plus the location assigned to the varying. This fixed
layout is intended to help SSO programs by avoiding on-the-fly recompiles
when swapping out shaders, however, it also means that if a varying uses
a large location number close to the maximum allowed by the SF/FS units
(31), then the offset introduced by the number of builtin slots can push
the location outside the range and trigger an assertion.
This problem is affecting at least the following CTS tests for
enhanced layouts:
KHR-GL45.enhanced_layouts.varying_array_components
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_components
KHR-GL45.enhanced_layouts.varying_locations
which use SSO and the the location layout qualifier to select such
location numbers explicitly.
This change helps these tests because for SSO we always have to include
things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is
very unlikely to read them, so by doing this we free builtin slots from
the fixed VUE layout and we avoid the tests to crash in this scenario.
Of course, this is not a proper fix, we'd still run into problems if someone
tries to use an explicit max location and read gl_ViewportIndex, gl_LayerID or
gl_CullDistancein in the FS, but that would be a much less common bug and we
can probably wait to see if anyone actually runs into that situation in a real
world scenario before making the decision that more aggresive changes are
required to support this without reverting
99df02ca26f61.
v2:
- Add a debug message when we skip clip distances (Ilia)
- we also need to account for this when we compute the urb setup
for the fragment shader stage, so add a compiler util to compute
the first slot that we need to read from the URB instead of
replicating the logic in both places.
v3:
- Make the util more generic so it can account for all unused slots
at the beginning of the URB, that will make it more useful (Ken).
- Drop the debug message, it was not what Ilia was asking for.
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Fri, 30 Jun 2017 22:10:17 +0000 (15:10 -0700)]
i965: Normalize types for FBL, FBH, etc
Allows the instructions to be compacted. The documentation claims that
some of these only accept UD types, even though the type doesn't change
the operation performed. Just normalize the types to ensure we get
instruction compaction.
The only functional changes are for FBL and CBIT (always use UD types)
and FBH (always use the same types).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]
radeonsi: don't use the template keyword
for C++ editors
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]
glx: don't use the template keyword
for C++ editors
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]
gallium/vl: don't use the template keyword
for C++ editors
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]
egl/dri2: don't use the template keyword
for C++ editors
Reviewed-by: Brian Paul <brianp@vmware.com>
Benedikt Schemmer [Fri, 29 Sep 2017 19:02:13 +0000 (21:02 +0200)]
radeonsi/uvd: clean up si_video_buffer_create
V2: remove code duplication and one unnessecary variable, minor whitespace fix
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Fri, 29 Sep 2017 14:05:49 +0000 (16:05 +0200)]
radeonsi/uvd: fix planar formats broken since
f70f6baaa3bb0f8b280ac2eaea69bb
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Roland Scheidegger [Thu, 28 Sep 2017 01:45:04 +0000 (03:45 +0200)]
gallium: add new LOD opcode
The operation performed is all the same as LODQ, but with the usual
differences between dx10 and GL texture opcodes, that is separate resource
and sampler indices (plus result swizzling, and setting z/w channels
to zero).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Kamil Páral [Fri, 22 Sep 2017 21:31:08 +0000 (23:31 +0200)]
drirc: whitelist glthread for Outlast
FPS increase 10-20% in starting locations on Core i5-4570 +
Radeon R9 270.
Jan Vesely [Sat, 16 Sep 2017 22:55:48 +0000 (18:55 -0400)]
travis: Add clover build using llvm-5.0
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jan Vesely [Sat, 16 Sep 2017 22:54:24 +0000 (18:54 -0400)]
travis: Add clover build using llvm-4.0
llvm-4 needs gcc 4.8:
http://releases.llvm.org/4.0.1/docs/ReleaseNotes.html#non-comprehensive-list-of-changes-in-this-release
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Jan Vesely [Sat, 16 Sep 2017 22:52:52 +0000 (18:52 -0400)]
travis: Add clover build using llvm-3.9
Use r600,radeonsi instead of i915
Update binutils, new linker is required for llvm-3.9:
https://www.ubuntuupdates.org/package/core/trusty/universe/updates/binutils-2.26
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Leo Liu [Fri, 29 Sep 2017 01:41:29 +0000 (21:41 -0400)]
st/va: add dst rect to avoid scale on deint
For 1080p video transcode, the height will be scaled to 1088 when deint
to progressive buffer. Set dst rect to make sure no scale.
Fixes: 3ad8687 "st/va: use new vl_compositor_yuv_deint_full() to deint"
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andy Furniss <adf.lists@gmail.com>
Nicolai Hähnle [Sat, 16 Sep 2017 10:52:21 +0000 (12:52 +0200)]
radeonsi: emit DLDEXP and DFRACEXP TGSI opcodes
Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 14:59:09 +0000 (16:59 +0200)]
radeonsi: emit LDEXP opcode
The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 14:52:23 +0000 (16:52 +0200)]
st/glsl_to_tgsi: use LDEXP when available
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 14:51:14 +0000 (16:51 +0200)]
gallium: add LDEXP TGSI instruction and corresponding cap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 16:47:52 +0000 (18:47 +0200)]
tgsi: infer that dst[1] of DFRACEXP is an integer
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Sat, 16 Sep 2017 10:50:42 +0000 (12:50 +0200)]
gallivm: add support for TGSI instructions with two outputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 16:45:32 +0000 (18:45 +0200)]
gallivm: add dst register index to lp_build_tgsi_context::emit_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 16:34:48 +0000 (18:34 +0200)]
tgsi: clarify the semantics of DFRACEXP
The status quo is quite the mess:
1. tgsi_exec will do a per-channel computation, and store the dst[0]
result (significand) correctly for each channel. The dst[1] result
(exponent) will be written to the first bit set in the writemask.
So per-component calculation only works partially.
2. r600 will only do a single computation. It will replicate the
exponent but not the significand.
3. The docs pretend that there's per-component calculation, but even
get dst[0] and dst[1] confused.
4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
and kind-of assumes that everything is replicated, generating this for
the dvec4 case:
DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw
Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 15:47:27 +0000 (17:47 +0200)]
tgsi: fix the documentation of DLDEXP
Sourcing the exponent for the zw destination pair from Z is consistent
with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always
generates per-channel instructions anyway.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 15:40:05 +0000 (17:40 +0200)]
tgsi: infer that DLDEXP's second source has an integer type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 15 Sep 2017 14:39:31 +0000 (16:39 +0200)]
glsl/lower_instruction: handle denorms and overflow in ldexp correctly
GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.
Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Thu, 28 Sep 2017 15:52:42 +0000 (17:52 +0200)]
util/queue: fix a race condition in the fence code
A tempting alternative fix would be adding a lock/unlock pair in
util_queue_fence_is_signalled. However, that wouldn't actually
improve anything in the semantics of util_queue_fence_is_signalled,
while making that test much more heavy-weight. So this lock/unlock
pair in util_queue_fence_destroy for "flushing out" other threads
that may still be in util_queue_fence_signal looks like the better
fix.
v2: rephrase the comment
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>