platform/upstream/mesa.git
5 years agoradv: Do not use extra descriptor space for the 3rd plane.
Bas Nieuwenhuizen [Wed, 8 May 2019 23:57:28 +0000 (01:57 +0200)]
radv: Do not use extra descriptor space for the 3rd plane.

While ImageFormatProperties returns the number of internal descriptors,
it turns out that applications do not need to actually allocate more
descriptors in the descriptor pool.

So if we make descriptors with more planes larger we have to be
convervative and always allocate space for the larger descriptors
which is a waste given the low usage of this ext.

So let us make use of the fact that 3plane formats all have the
same formats & dimensions for the last two planes. This way we
only need the first half of the descriptor of the 3rd plane and
can share the second half of the second plane.

This allows us to use 16 bytes for the descriptor which nicely
fits into the 16 bytes that are unused right next to the sampler.

Fixes: 5564c38212a "radv: Update descriptor sets for multiple planes."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: Add support for icd loader interface v4.
Bas Nieuwenhuizen [Sat, 5 Jan 2019 12:46:53 +0000 (13:46 +0100)]
radv: Add support for icd loader interface v4.

Adds support for physical device functions unknown to the loader.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agopanfrost/midgard: Handle csel correctly
Alyssa Rosenzweig [Tue, 7 May 2019 02:52:08 +0000 (02:52 +0000)]
panfrost/midgard: Handle csel correctly

We use an algebraic pass for the csel optimizations, and use proper
vectorized csel ops (i/fcsel_v) for mixed, rather lowering.

To avoid regressions along the way, we fix an issue with the copy
propagation pass (it should not attempt to propagate constants).
Similarly, we take care to break bundles when using csel to fix some
scheduler corner cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agoiris: Implement ARB_indirect_parameters
Illia Iorin [Thu, 9 May 2019 21:44:39 +0000 (00:44 +0300)]
iris: Implement ARB_indirect_parameters

iris_draw_vbo is divided into two functions to remove unnecessary
operations from the loop. This implementation of ARB_indirect_parameters
takes into account NV_conditional_render by saving MI_PREDICATE_RESULT
at the start of a draw call and restoring it at the end also the result
of NV_conditional_render is taken into account when computing predicates
that limit draw calls for ARB_indirect_parameters in a similar way
to 1952fd8d in ANV.

v2: Optimize indirect draws (suggested by Kenneth Graunke)
v3: (by Kenneth Graunke)
 - Fix an issue where indirect draws wouldn't set patch information
   before updating the compiled TCS.
 - Move some code back to iris_draw_vbo to avoid duplicating it.
 - Fix minor indentation issues.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Split iris_update_draw_info into two functions.
Kenneth Graunke [Sun, 12 May 2019 06:43:17 +0000 (23:43 -0700)]
iris: Split iris_update_draw_info into two functions.

Shader draw parameters need updating on each iteration of a multidraw
loop, but the primitive based information only needs to be updated once.

Also, patch information needs to be recorded before filling out the TCS
program key, as it determines the number of HS instances.

5 years agonir: Fix wrong sign in lower_rcp
Ruslan Kabatsayev [Sat, 11 May 2019 11:04:36 +0000 (14:04 +0300)]
nir: Fix wrong sign in lower_rcp

The nested fma calls were supposed to implement

x_new = x + x * (1 - x*src),

but instead current code is equivalent to

x_new = x - x * (1 - x*src).

The result is that Newton-Raphson steps don't improve precision at all.
This patch fixes this problem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110435
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel: drop misleading driver name from gen_get_device_info()
Mike Blumenkrantz [Mon, 15 Apr 2019 16:20:54 +0000 (12:20 -0400)]
intel: drop misleading driver name from gen_get_device_info()

5 years agoradv: clear vertex bindings while resetting command buffer
Józef Kucia [Fri, 10 May 2019 19:38:22 +0000 (21:38 +0200)]
radv: clear vertex bindings while resetting command buffer

Only vertex inputs accessed by vertex shader must have valid buffers
bound.

Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 5010436e09f "radv: bail out when binding the same vertex buffers"

5 years agost/mesa: fix 2 crashes in st_tgsi_lower_yuv
Marek Olšák [Thu, 25 Apr 2019 22:44:51 +0000 (18:44 -0400)]
st/mesa: fix 2 crashes in st_tgsi_lower_yuv

src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct
 tgsi_full_dst_register *, const struct tgsi_full_dst_register *, unsigned
 int): assertion "dst->Register.WriteMask" failed

The second crash was due to insufficient allocated size for TGSI
instructions.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agoiris: Use full ways for L3 cache setup on Icelake.
Kenneth Graunke [Fri, 10 May 2019 21:15:53 +0000 (14:15 -0700)]
iris: Use full ways for L3 cache setup on Icelake.

Anuj fixed this in i965 and anv, but the fix never landed in iris.
Fixes tessellation corruption on Icelake.  Thanks to Rafael for
bisecting this and tracking it down.

Fixes: d0996d5fab6 iris: Emit default L3 config for the render pipeline
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agoanv: Fix limits when VK_EXT_descriptor_indexing is used
Caio Marcelo de Oliveira Filho [Thu, 9 May 2019 08:01:19 +0000 (01:01 -0700)]
anv: Fix limits when VK_EXT_descriptor_indexing is used

Update various limits in
VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously
zero to their values from VkPhysicalDeviceLimits.  When using
VK_EXT_descriptor_indexing, the former limits will apply to all the
descriptor layout sets -- not only those using the new feature bits.

For the reference, VK_EXT_descriptor_indexing says

    "There are new descriptor set layout and descriptor pool creation
    flags that are required to opt in to the update-after-bind
    functionality, and there are separate maxPerStage* and
    maxDescriptorSet* limits that apply to these descriptor set
    layouts which may be much higher than the pre-existing limits. The
    old limits only count descriptors in non-updateAfterBind
    descriptor set layouts, and the new limits count descriptors in
    all descriptor set layouts in the pipeline layout."

Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovulkan/overlay: keep allocating draw data until it can be reused
Lionel Landwerlin [Thu, 9 May 2019 16:52:11 +0000 (17:52 +0100)]
vulkan/overlay: keep allocating draw data until it can be reused

The original implementation assumed that we could allocate the same
amount of command buffers as the number of images in the swapchain.
But the application could potentially render much faster and rerender
into images that have been submitted for presentation but not yet
presented.

This change keeps on allocating command buffers, vertex buffer, vertex
indices as well as a semaphore and a fence for as long as we can't
reuse a previously submitted one.

This fixes rendering issues in the overlay at high frame rates.

v2: Don't recreate semaphores constantly (Józef)

v3: Drop useless surface & FreeCommandBuffers (Józef)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>
5 years agovulkan/overlay: fix truncating error on 32bit platforms
Lionel Landwerlin [Thu, 9 May 2019 14:22:40 +0000 (15:22 +0100)]
vulkan/overlay: fix truncating error on 32bit platforms

Non dispatchable handles can be uint64_t. When compiling the layer on
a 32bit platform, this will lead to casting uint64_t into (void *)
which is 32bit, leading to incorrect handles being mapped internally
in the layer.

v2: Use more HKEY() (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Józef Kucia <joseph.kucia@gmail.com>
Fixes: 2d2927938f074f ("vulkan/overlay-layer: fix cast errors")
Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>
5 years agoi965: Fix memory leaks in brw_upload_cs_work_groups_surface().
Kenneth Graunke [Thu, 9 May 2019 22:40:13 +0000 (15:40 -0700)]
i965: Fix memory leaks in brw_upload_cs_work_groups_surface().

This was taking a reference to the 64kB upload buffer and never
returning it, leaking a reference each time this atom triggered.

This leaked lots of 64kB upload BOs, eventually running us out of
of VMA space.  This would usually happen when using mpv to watch a
movie, after 20-40 minutes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134
Fixes: 63d7b33f516 i965/cs: Setup surface binding for gl_NumWorkGroups
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agost/va: set the visible image dimensions in vlVaDeriveImage
Julien Isorce [Wed, 8 May 2019 19:40:50 +0000 (12:40 -0700)]
st/va: set the visible image dimensions in vlVaDeriveImage

This fixes video being rendered incorrectly.

User wants height of 360 but internally pipe_video_buffer 's height
is 368 in the test below.

Test:
  GST_GL_PLATFORM=egl gst-launch-1.0 videotestsrc ! video/x-raw, width=868, height=360, format=NV12 ! vaapipostproc ! glimagesink

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoswrast: Rename blend_func->swrast_blend_func
Alyssa Rosenzweig [Fri, 10 May 2019 16:22:05 +0000 (16:22 +0000)]
swrast: Rename blend_func->swrast_blend_func

This avoids a conflict with the new (driver-agnostic) blend_func enum in
shader_enum.h, which broke the build of swrast (and i965 by extension).

My apologies :(

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Fixes: f41be53a ("compiler: Add enums for blend state")
Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agotravis: fix syntax, and drop unused stuff
Eric Engestrom [Thu, 9 May 2019 12:28:41 +0000 (13:28 +0100)]
travis: fix syntax, and drop unused stuff

Fixes: a988d953899c099719f3 "ci: Delete autotools build jobs"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agonir: Add blend_const_color_rgba sysval
Alyssa Rosenzweig [Mon, 6 May 2019 02:00:37 +0000 (02:00 +0000)]
nir: Add blend_const_color_rgba sysval

This represents a float vec4 constant color, as passed to glBlendColor.
While the existing 4 shader sysvals are retained to minimize code churn,
a single vectorized intrinsic is required for efficient blending on
vector architectures. (This may also apply to archictectures like
Bifrost where ALU is scalar but load/store is vector; it largely depends
on how blending is implemented per-driver.)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogallium: Add helper to convert PIPE blending to shader_enum style
Alyssa Rosenzweig [Mon, 6 May 2019 02:03:48 +0000 (02:03 +0000)]
gallium: Add helper to convert PIPE blending to shader_enum style

Complementing the new API-agnostic shader_enum blending style, we add
helpers to translate between the two forms. Ideally, we could just use
PIPE blending directly, but that makes Vulkan support challenging.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agocompiler: Add enums for blend state
Alyssa Rosenzweig [Mon, 6 May 2019 01:58:56 +0000 (01:58 +0000)]
compiler: Add enums for blend state

We add enums corresponding to (GLES) blend state to shader_enums.h,
complementing the existing advanced blending enums in the file. This
allows us to represent blending state in a driver-agnostic, API-agnostic
way to permit lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agonir: allow specifying a set of opcodes in lower_alu_to_scalar
Jonathan Marek [Wed, 8 May 2019 16:45:48 +0000 (12:45 -0400)]
nir: allow specifying a set of opcodes in lower_alu_to_scalar

This can be used by both etnaviv and freedreno/a2xx as they are both vec4
architectures with some instructions being scalar-only.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agointel/fs/copy-prop: Don't walk all the ACPs for each instruction
Jason Ekstrand [Sun, 5 May 2019 05:13:20 +0000 (00:13 -0500)]
intel/fs/copy-prop: Don't walk all the ACPs for each instruction

In order to set up KILL sets, the dataflow code was walking the entire
array of ACPs for every instruction.  If you assume the number of ACPs
increases roughly with the number of instructions, this is O(n^2).  As
it turns out, regions_overlap() is not nearly as cheap as one would like
and shows up as a significant chunk on perf traces.

This commit changes things around and instead first builds an array of
exec_lists which it uses like a hash table (keyed off ACP source or
destination) similar to what's done in the rest of the copy-prop code.
By first walking the list of ACPs and populating the table and then
walking instructions and only looking at ACPs which probably have the
same VGRF number, we can reduce the complexity to O(n).  This takes the
execution time of the piglit vs-isnan-dvec test from about 56.4 seconds
on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to
about 38.7 seconds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs/copy-prop: Purge unused ACPs
Jason Ekstrand [Sun, 5 May 2019 04:51:23 +0000 (23:51 -0500)]
intel/fs/copy-prop: Purge unused ACPs

If the destination of an ACP entry exists only within this block, then
there's no need to keep it for dataflow analysis.  We can delete it from
the out_acp table and avoid growing the bitsets any bigger than we
absolutely have to.  This reduces the maximum number of global ACP
entries in the vs-isnan-dvec with software fp64 on Kaby Lake from 8630
to 3942 and takes the execution time of the piglit vs-isnan-dvec test
from about 1:16.2 on an unoptimized debug build (what we run in CI) with
NIR_VALIDATE=0 to about 56.4 seconds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs/copy-prop: Bump the hash table size to 64
Jason Ekstrand [Sun, 5 May 2019 03:47:59 +0000 (22:47 -0500)]
intel/fs/copy-prop: Bump the hash table size to 64

While the number of ACPs is generally not huge compared to the number of
blocks, 16 does seem a bit small.  Bumping it to 64 takes the execution
time of the piglit vs-isnan-dvec test from about 1:18.1 on an unoptimized
debug build (what we run in CI) with NIR_VALIDATE=0 to about 1:16.2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agowinsys/amdgpu: add VCN JPEG to no user fence group
Leo Liu [Wed, 8 May 2019 12:13:52 +0000 (08:13 -0400)]
winsys/amdgpu: add VCN JPEG to no user fence group

There is no user fence for JPEG, the bug triggering
kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT)

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
5 years agolima: fix width 4096 resolution GP fail
Qiang Yu [Thu, 9 May 2019 12:40:24 +0000 (20:40 +0800)]
lima: fix width 4096 resolution GP fail

When width=4096 and shift_w=0, block_w=0x100 which overflow
the PLBU_CMD 8 bits for it.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agopanfrost: Add CAPFs for conservative rasterization
Tomeu Vizoso [Thu, 9 May 2019 12:07:45 +0000 (14:07 +0200)]
panfrost: Add CAPFs for conservative rasterization

Just do what everybody else but Nouveau does and return 0.0f.

This prevents the repeated logging of these messages on startup:

Unexpected PIPE_CAPF 6 query
Unexpected PIPE_CAPF 7 query
Unexpected PIPE_CAPF 8 query

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: Only take the fast paths on buffers aligned to block size
Tomeu Vizoso [Wed, 8 May 2019 06:59:30 +0000 (08:59 +0200)]
panfrost: Only take the fast paths on buffers aligned to block size

As the functions operate on 16-byte blocks.

Fixes this Valgrind error:

Invalid read of size 4
   at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85)
   by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171)
   by 0x584F587: panfrost_tile_texture (pan_resource.c:489)
   by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525)
   by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516)
   by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515)
   by 0x5875F13: u_default_texture_subdata (u_transfer.c:80)
   by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
   by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
   by 0x5391353: teximage (teximage.c:3105)
   by 0x5391353: teximage_err (teximage.c:3132)
   by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
   by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
 Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd
   at 0x483F5C8: malloc (vg_replace_malloc.c:299)
   by 0x584F47D: panfrost_transfer_map (pan_resource.c:467)
   by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
   by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59)
   by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
   by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
   by 0x5391353: teximage (teximage.c:3105)
   by 0x5391353: teximage_err (teximage.c:3132)
   by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
   by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
   by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
5 years agopanfrost: Fix two uninitialized accesses in compiler
Tomeu Vizoso [Tue, 7 May 2019 15:28:36 +0000 (17:28 +0200)]
panfrost: Fix two uninitialized accesses in compiler

Valgrind was complaining of those.

NIR_PASS only sets progress to TRUE if there was progress.

nir_const_load_to_arr() only sets as many constants as components has
the instruction.

This was causing some dEQP tests to flip-flop, such as:

dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Fixes: 14531d676b11 ("nir: make nir_const_value scalar")

5 years agopanfrost: ci: Skip running some tests
Tomeu Vizoso [Tue, 7 May 2019 11:43:14 +0000 (13:43 +0200)]
panfrost: ci: Skip running some tests

These tests add too much time to the total run time, and some of them
even hang the DUTs, even if I haven't been able to reproduce it locally.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Don't restart Weston
Tomeu Vizoso [Tue, 7 May 2019 10:57:58 +0000 (12:57 +0200)]
panfrost: ci: Don't restart Weston

There doesn't seem to actually be any noticeably memory leaks on Weston
when running dEQP. We do seem to leak quiet a bit in the client, so we
still have to run the dEQP runner in batches.

This removes the risk of Weston not restarting properly and introducing
spurious failures.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Update list of expected failures
Tomeu Vizoso [Tue, 7 May 2019 08:58:36 +0000 (10:58 +0200)]
panfrost: ci: Update list of expected failures

This matches the current state of things on both RK3288 and RK3399.
Hopefully, from now on we'll only remove stuff from this list.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Tweak dEQP to improve throughput
Tomeu Vizoso [Tue, 7 May 2019 08:00:23 +0000 (10:00 +0200)]
panfrost: ci: Tweak dEQP to improve throughput

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Fix list of tests to run
Tomeu Vizoso [Tue, 7 May 2019 06:45:19 +0000 (08:45 +0200)]
panfrost: ci: Fix list of tests to run

Make sure we have only test case names in the list, excluding names of
test groups.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Check for incomplete runs
Tomeu Vizoso [Tue, 7 May 2019 06:44:03 +0000 (08:44 +0200)]
panfrost: ci: Check for incomplete runs

To improve robustness, check that we got the expected number of results.
Right now we hard-code the expected number of tests run, but with some
effort we may be able to infer it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Add tests to flip-flop list
Tomeu Vizoso [Mon, 6 May 2019 05:39:42 +0000 (07:39 +0200)]
panfrost: ci: Add tests to flip-flop list

These tests aren't giving reliable results. Mask them for now.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Add support for running the tests on RK3288
Tomeu Vizoso [Fri, 3 May 2019 15:48:48 +0000 (17:48 +0200)]
panfrost: ci: Add support for running the tests on RK3288

Build artifacts for armhf and schedule them on a Veyron Chromebook with
RK3288.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agolima: fix tile buffer reloading
Vasily Khoruzhick [Wed, 8 May 2019 02:03:34 +0000 (19:03 -0700)]
lima: fix tile buffer reloading

Buffer needs to be reloaded every time unless explicit clear() was
called.

Fixes rendering issues with wayland compositors.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoanv: Remove special allocation for anv_push_constants
Caio Marcelo de Oliveira Filho [Tue, 7 May 2019 06:46:42 +0000 (23:46 -0700)]
anv: Remove special allocation for anv_push_constants

The key reason for that mechanism is gone: all the extra optional data
that could be in the anv_push_constants was moved elsewhere.  At this
point, just put anv_push_constants directly in anv_cmd_state (part of
anv_cmd_buffer).

v2: Remove a NULL check we don't need anymore in
    anv_cmd_buffer_push_constants().  (Lionel)
    Fix size we consider for valid push params.  (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY
Kenneth Graunke [Wed, 8 May 2019 18:33:50 +0000 (11:33 -0700)]
iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY

This provides a way for the application to query whether any resets have
happened, which lets us expose "robust" contexts.  This also enables the
KHR_robust_buffer_access_behavior tests.

5 years agoiris: Hook up device reset callbacks
Kenneth Graunke [Wed, 8 May 2019 05:26:22 +0000 (22:26 -0700)]
iris: Hook up device reset callbacks

This mechanism lets the driver inform the state tracker about GPU
resets, say for destroying a robust API context and reporting a "device
lost" error to the application, making it take action to deal with this.

5 years agoiris: Try to recover from GPU hangs.
Kenneth Graunke [Wed, 8 May 2019 06:19:30 +0000 (23:19 -0700)]
iris: Try to recover from GPU hangs.

The iris batch module now tries to detect that the kernel has banned
our GEM context, creates a new non-banned context, and informs the
iris context module that all assumptions about state are now invalid
and it needs to reinitialize the relevant state.

Based on Chris Wilson's work, but significantly rewritten by me.

5 years agoiris: Add helpers to clone a hardware context.
Chris Wilson [Wed, 8 May 2019 06:23:13 +0000 (23:23 -0700)]
iris: Add helpers to clone a hardware context.

(Chris Wilson wrote this code in a patch titled "i965: Be resilient in
the face of GPU hangs"; Ken fixed a bug and copied it to iris.)

5 years agoiris: Mark render batches as non-recoverable.
Kenneth Graunke [Wed, 8 May 2019 06:03:46 +0000 (23:03 -0700)]
iris: Mark render batches as non-recoverable.

Adapted from Chris Wilson's patch.  The comment is largely his.

Currently, when iris hangs the GPU, it will continue sending batches
which incrementally update the state, assuming it's preserved across
batches.  However, the kernel's GPU reset support reinitializes the
guilty context to the default GPU state (reasonably not wanting to
trust the current state).  This ends up resetting critical things
like STATE_BASE_ADDRESS, causing memory accesses in all subsequent
batches to be garbage, and almost certainly result in more hangs
until we're banned or we kill the machine.

We now ask the kernel to ban our render context immediately, so we
notice we've gone off the rails as fast as possible.  Eventually, we'll
attempt to recover and continue.  For now, we just avoid torching the
GPU over and over.

5 years agofreedreno/ir3: fix rasterflat/glxgears
Rob Clark [Thu, 9 May 2019 22:58:16 +0000 (15:58 -0700)]
freedreno/ir3: fix rasterflat/glxgears

Ofc legacy gl features that are broken don't trigger fails in deqp.  I
should remember to test glxgears more often.

Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agoanv: Use corresponding type from the vector allocation
Lionel Landwerlin [Thu, 9 May 2019 12:33:43 +0000 (13:33 +0100)]
anv: Use corresponding type from the vector allocation

We didn't notice this issue much because the 2 struct share a similar
layout, expect for the additional fields...

We run into that issue in Anv :

==15236== Invalid write of size 8
==15236==    at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211)
==15236==    by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264)
==15236==    by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312)
==15236==    by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167)
==15236==    by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190)
==15236==    by 0x8CF60871: alloc_surface_state (anv_image.c:1122)
==15236==    by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519)
==15236==    by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358)
==15236==  Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd
==15236==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15236==    by 0x8D2578E6: u_vector_init (u_vector.c:47)
==15236==    by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168)
==15236==    by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921)
==15236==    by 0x8CF56517: anv_CreateDevice (anv_device.c:1909)
==15236==    by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073)
==15236==    by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so)
==15236==    by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so)
==15236==    by 0x8BCB35C6: loader_create_device_chain (loader.c:5449)
==15236==    by 0x8BCBC230: vkCreateDevice (trampoline.c:838)

v2: Rename mmap_cleanups to avoid confusion (Caio)

v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agodocs: update calendar, and news item and link release notes for 19.0.4
Dylan Baker [Thu, 9 May 2019 20:48:47 +0000 (13:48 -0700)]
docs: update calendar, and news item and link release notes for 19.0.4

5 years agodocs: Add SHA256 sums for mesa 19.0.4
Dylan Baker [Thu, 9 May 2019 20:45:19 +0000 (13:45 -0700)]
docs: Add SHA256 sums for mesa 19.0.4

5 years agoDocs: add 19.0.4 release notes
Dylan Baker [Thu, 9 May 2019 20:21:34 +0000 (13:21 -0700)]
Docs: add 19.0.4 release notes

5 years agomesa: fix GL_PROGRAM_BINARY_RETRIEVABLE_HINT handling
Pierre-Eric Pelloux-Prayer [Thu, 9 May 2019 07:52:40 +0000 (09:52 +0200)]
mesa: fix GL_PROGRAM_BINARY_RETRIEVABLE_HINT handling

When first implemented in fefd03e16c16 Mesa's behavior was aligned on behavior
of Nvidia's driver. This caused a failing test in piglit but was ok since the
specification is unclear on this subject.

Nvidia's driver behavior has been modified because using version 410.104, the
problematic test (program_binary_retrievable_hint) now passes.

This commit defers BinaryRetrievableHint update until the next linking so the
test passes on Mesa as well.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
5 years agonir: Initialize lower_flrp_progress everywhere
Ian Romanick [Wed, 8 May 2019 14:32:43 +0000 (07:32 -0700)]
nir: Initialize lower_flrp_progress everywhere

I don't know why I thought NIR_PASS always set the progress variable.
Derp.

Fixes: d41cdef2a59 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic")
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Coverity CID: 1444996
Coverity CID: 1444995
Coverity CID: 1444994
Coverity CID: 1444993
Coverity CID: 1444991
Coverity CID: 1444989

5 years agogallium: fix typo in comment
Eric Engestrom [Fri, 8 Mar 2019 11:03:48 +0000 (11:03 +0000)]
gallium: fix typo in comment

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomeson: fix a couple typos in comments
Eric Engestrom [Thu, 7 Mar 2019 14:45:26 +0000 (14:45 +0000)]
meson: fix a couple typos in comments

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoi965_asm: avoid free()ing uninitialized pointers
Eric Engestrom [Wed, 8 May 2019 15:19:23 +0000 (16:19 +0100)]
i965_asm: avoid free()ing uninitialized pointers

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoi965_asm: fix memleak
Eric Engestrom [Wed, 8 May 2019 15:18:51 +0000 (16:18 +0100)]
i965_asm: fix memleak

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoradv: fix setting the number of rectangles when it's dyanmic
Samuel Pitoiset [Thu, 9 May 2019 08:15:20 +0000 (10:15 +0200)]
radv: fix setting the number of rectangles when it's dyanmic

We need to know the number of rectangles.

This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*.

Fixes: 5db0bf99944 ("radv: Implement VK_EXT_discard_rectangles.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoiris: Reorganise execbuf to have a single point of failure
Chris Wilson [Mon, 25 Mar 2019 22:32:12 +0000 (22:32 +0000)]
iris: Reorganise execbuf to have a single point of failure

Propagate the failure from GEM_EXECBUFFER2, cleanup then report failure
if need be. We retain the current behaviour to abort() at the first sign
of trouble -- for a non-robustness context, arguably this is the right
thing to do as the client cannot recover, and the system state is lost.
How to properly integrate with KHR_robustness and reset-strategy is
left as a future exercise.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agodrm-uapi: Update i915_drm.h for I915_CONTEXT_PARAM_RECOVERABLE
Chris Wilson [Sun, 17 Feb 2019 13:53:55 +0000 (13:53 +0000)]
drm-uapi: Update i915_drm.h for I915_CONTEXT_PARAM_RECOVERABLE

Pull i915_drm.h to include

kernel commit ba4fda620a5f7db521aa9e0262cf49854c1b1d9c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 18 10:58:21 2019 +0000

    drm/i915: Optionally disable automatic recovery after a GPU reset

for improved resilience in handling GPU hangs.

5 years agokmsro: add _dri.so to two of the kmsro drivers.
Dave Airlie [Wed, 8 May 2019 02:38:18 +0000 (12:38 +1000)]
kmsro: add _dri.so to two of the kmsro drivers.

Fixes: 8cfc17bdda3 (kmsro: Add the rest of the current set of tinydrm drivers.)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoiris: Report the same video memory settings as i965.
Kenneth Graunke [Wed, 8 May 2019 19:37:32 +0000 (12:37 -0700)]
iris: Report the same video memory settings as i965.

This just copy and pastes Ian's code from i965.

5 years agogitlab-ci: add the vulkan overlay layer to the vulkan build
Eric Engestrom [Wed, 8 May 2019 16:17:23 +0000 (18:17 +0200)]
gitlab-ci: add the vulkan overlay layer to the vulkan build

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agogitlab-ci: add the vulkan overlay layer to the vulkan build
Eric Engestrom [Wed, 8 May 2019 16:08:11 +0000 (18:08 +0200)]
gitlab-ci: add the vulkan overlay layer to the vulkan build

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
[ Michel Dänzer: Take changes affecting the docker image from !299,
  plus remove the unzip package again before generating the image ]

5 years agogitlab-ci: Don't install WINE packages
Michel Dänzer [Fri, 3 May 2019 09:56:36 +0000 (11:56 +0200)]
gitlab-ci: Don't install WINE packages

They were just making the docker image larger for no benefit at this
point.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogitlab-ci: Reorder jobs a bit to be generally ordered longer => shorter
Michel Dänzer [Fri, 3 May 2019 16:19:25 +0000 (18:19 +0200)]
gitlab-ci: Reorder jobs a bit to be generally ordered longer => shorter

This makes the longer jobs likely to run earlier, which can help the
overall pipeline duration.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogitlab-ci: Build clover against all supported versions of LLVM
Michel Dänzer [Fri, 3 May 2019 08:58:48 +0000 (10:58 +0200)]
gitlab-ci: Build clover against all supported versions of LLVM

And consolidate it all into a single job.

It doesn't take much longer than a single version, thanks to ccache.
Overall, this single job might be faster or at least use fewer CPU
cycles than the two jobs before, while covering thrice as many versions
of LLVM.

v2:
* Move "rm -rf _build" to meson-build.sh.
* Set GALLIUM_DRIVERS the same way both times in the meson-clover job,
  for symmetry.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
5 years agogitlab-ci: Move meson job script to separate file
Michel Dänzer [Fri, 3 May 2019 08:49:43 +0000 (10:49 +0200)]
gitlab-ci: Move meson job script to separate file

No functional change intended (except for no longer running meson
--version separately, as the version appears early in meson's output
anyway).

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogitlab-ci: Remove superfluous comment about image tag counter suffix
Michel Dänzer [Fri, 3 May 2019 08:41:45 +0000 (10:41 +0200)]
gitlab-ci: Remove superfluous comment about image tag counter suffix

We really shouldn't ever need a suffix, otherwise it indicates a failure
in coordination. :) In which case, it doesn't really matter how the tag
is disambiguated.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomeson: Force the use of config-tool for llvm
Dylan Baker [Tue, 7 May 2019 16:55:38 +0000 (09:55 -0700)]
meson: Force the use of config-tool for llvm

meson git now has a cmake find method for llvm, but it lacks a couple of
features that we use from the config tool version. Until that reaches
parity we need to use the config-tool version.

CC: 19.0 19.1 <<mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogallium/util: fix two MSVC compiler warnings
Brian Paul [Sat, 4 May 2019 16:04:57 +0000 (10:04 -0600)]
gallium/util: fix two MSVC compiler warnings

Remove stray const qualifier.
s/unsigned/enum tgsi_semantic/

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agogallium/pp: s/uint/enum tgsi_semantic/ to fix MSVC warning
Brian Paul [Sat, 4 May 2019 16:04:07 +0000 (10:04 -0600)]
gallium/pp: s/uint/enum tgsi_semantic/ to fix MSVC warning

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agonoop: s/enum pipe_transfer_usage/unsigned/ to fix MSVC warning
Brian Paul [Sat, 4 May 2019 16:02:48 +0000 (10:02 -0600)]
noop: s/enum pipe_transfer_usage/unsigned/ to fix MSVC warning

The function pointer declaration in pipe_context uses unsigned
for the bitmask.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoddebug: fix a few MSVC compiler warnings
Brian Paul [Sat, 4 May 2019 16:01:31 +0000 (10:01 -0600)]
ddebug: fix a few MSVC compiler warnings

Don't return an expression in void functions.
Replace an unsigned int with proper enum.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoglsl: s/GLboolean/bool/ to silence MSVC compiler warning
Brian Paul [Sat, 4 May 2019 16:00:08 +0000 (10:00 -0600)]
glsl: s/GLboolean/bool/ to silence MSVC compiler warning

It complains about mixing GLboolean and bool in the |= expression.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agonir/flrp: Reassociate add in flrp(±1, b, c) lowering path
Ian Romanick [Tue, 7 May 2019 17:27:11 +0000 (10:27 -0700)]
nir/flrp: Reassociate add in flrp(±1, b, c) lowering path

With this reassociation, this lowering path is still beneficial.

Ice Lake
total instructions in shared programs: 17220191 -> 17207181 (-0.08%)
instructions in affected programs: 999871 -> 986861 (-1.30%)
helped: 3703
HURT: 17
helped stats (abs) min: 1 max: 686 x̄: 3.52 x̃: 3
helped stats (rel) min: 0.09% max: 51.97% x̄: 2.21% x̃: 1.35%
HURT stats (abs)   min: 1 max: 9 x̄: 1.47 x̃: 1
HURT stats (rel)   min: 0.08% max: 4.55% x̄: 0.78% x̃: 0.55%
95% mean confidence interval for instructions value: -4.01 -2.99
95% mean confidence interval for instructions %-change: -2.29% -2.11%
Instructions are helped.

total cycles in shared programs: 360871298 -> 360755040 (-0.03%)
cycles in affected programs: 9931334 -> 9815076 (-1.17%)
helped: 2388
HURT: 1569
helped stats (abs) min: 1 max: 10228 x̄: 93.54 x̃: 18
helped stats (rel) min: <.01% max: 74.11% x̄: 3.36% x̃: 1.07%
HURT stats (abs)   min: 1 max: 1917 x̄: 68.27 x̃: 22
HURT stats (rel)   min: <.01% max: 44.90% x̄: 3.44% x̃: 1.72%
95% mean confidence interval for cycles value: -39.48 -19.28
95% mean confidence interval for cycles %-change: -0.86% -0.46%
Cycles are helped.

total spills in shared programs: 12355 -> 12159 (-1.59%)
spills in affected programs: 295 -> 99 (-66.44%)
helped: 2
HURT: 1

total fills in shared programs: 25398 -> 25207 (-0.75%)
fills in affected programs: 288 -> 97 (-66.32%)
helped: 2
HURT: 1

LOST:   3
GAINED: 44

Iron Lake
total instructions in shared programs: 8169225 -> 8159729 (-0.12%)
instructions in affected programs: 1025712 -> 1016216 (-0.93%)
helped: 3352
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 2.83 x̃: 3
helped stats (rel) min: 0.15% max: 12.00% x̄: 1.51% x̃: 1.05%
95% mean confidence interval for instructions value: -2.86 -2.80
95% mean confidence interval for instructions %-change: -1.56% -1.46%
Instructions are helped.

total cycles in shared programs: 188656796 -> 188612280 (-0.02%)
cycles in affected programs: 18633584 -> 18589068 (-0.24%)
helped: 3085
HURT: 14
helped stats (abs) min: 2 max: 72 x̄: 14.45 x̃: 12
helped stats (rel) min: 0.02% max: 5.73% x̄: 0.73% x̃: 0.31%
HURT stats (abs)   min: 2 max: 4 x̄: 3.71 x̃: 4
HURT stats (rel)   min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -14.55 -14.18
95% mean confidence interval for cycles %-change: -0.76% -0.69%
Cycles are helped.

GM45
total instructions in shared programs: 5026905 -> 5021856 (-0.10%)
instructions in affected programs: 584169 -> 579120 (-0.86%)
helped: 1776
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 2.84 x̃: 3
helped stats (rel) min: 0.15% max: 11.11% x̄: 1.43% x̃: 0.98%
95% mean confidence interval for instructions value: -2.88 -2.80
95% mean confidence interval for instructions %-change: -1.50% -1.37%
Instructions are helped.

total cycles in shared programs: 129047376 -> 129018918 (-0.02%)
cycles in affected programs: 12941924 -> 12913466 (-0.22%)
helped: 1722
HURT: 14
helped stats (abs) min: 4 max: 72 x̄: 16.56 x̃: 18
helped stats (rel) min: 0.02% max: 5.73% x̄: 0.72% x̃: 0.30%
HURT stats (abs)   min: 2 max: 4 x̄: 3.71 x̃: 4
HURT stats (rel)   min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -16.65 -16.13
95% mean confidence interval for cycles %-change: -0.76% -0.66%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agonir/flrp: Fix typo on the flrp(±1, b, c) path
Ian Romanick [Tue, 7 May 2019 16:22:27 +0000 (09:22 -0700)]
nir/flrp: Fix typo on the flrp(±1, b, c) path

After Samuel reported the bisect, I was able to find the bug by
inspection.  Good thing for well-named varibles. :)

Unfortunately, this undoes almost all of the benefit of the original
patch.

Ice Lake
total instructions in shared programs: 17183159 -> 17218166 (0.20%)
instructions in affected programs: 1308722 -> 1343729 (2.67%)
helped: 98
HURT: 4746
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.47% max: 2.70% x̄: 0.60% x̃: 0.57%
HURT stats (abs)   min: 1 max: 691 x̄: 7.40 x̃: 8
HURT stats (rel)   min: 0.10% max: 700.00% x̄: 5.82% x̃: 2.83%
95% mean confidence interval for instructions value: 6.82 7.64
95% mean confidence interval for instructions %-change: 5.22% 6.15%
Instructions are HURT.

total cycles in shared programs: 360705959 -> 360853522 (0.04%)
cycles in affected programs: 10754380 -> 10901943 (1.37%)
helped: 1594
HURT: 3331
helped stats (abs) min: 1 max: 1896 x̄: 119.81 x̃: 60
helped stats (rel) min: <.01% max: 35.48% x̄: 5.06% x̃: 3.64%
HURT stats (abs)   min: 1 max: 10208 x̄: 101.63 x̃: 38
HURT stats (rel)   min: 0.01% max: 878.95% x̄: 9.01% x̃: 2.78%
95% mean confidence interval for cycles value: 21.11 38.81
95% mean confidence interval for cycles %-change: 3.76% 5.15%
Cycles are HURT.

total spills in shared programs: 12158 -> 12355 (1.62%)
spills in affected programs: 98 -> 295 (201.02%)
helped: 1
HURT: 2

total fills in shared programs: 25204 -> 25398 (0.77%)
fills in affected programs: 94 -> 288 (206.38%)
helped: 0
HURT: 3

LOST:   15
GAINED: 8

Iron Lake
total instructions in shared programs: 8121430 -> 8166733 (0.56%)
instructions in affected programs: 1148353 -> 1193656 (3.95%)
helped: 2
HURT: 4046
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.85% max: 1.92% x̄: 1.89% x̃: 1.89%
HURT stats (abs)   min: 1 max: 43 x̄: 11.20 x̃: 11
HURT stats (rel)   min: 0.20% max: 716.67% x̄: 7.40% x̃: 3.87%
95% mean confidence interval for instructions value: 11.02 11.37
95% mean confidence interval for instructions %-change: 6.84% 7.94%
Instructions are HURT.

total cycles in shared programs: 188376326 -> 188601568 (0.12%)
cycles in affected programs: 27416674 -> 27641916 (0.82%)
helped: 68
HURT: 3947
helped stats (abs) min: 2 max: 222 x̄: 13.88 x̃: 6
helped stats (rel) min: <.01% max: 1.28% x̄: 0.15% x̃: 0.01%
HURT stats (abs)   min: 2 max: 670 x̄: 57.31 x̃: 64
HURT stats (rel)   min: <.01% max: 1811.11% x̄: 4.11% x̃: 1.09%
95% mean confidence interval for cycles value: 55.01 57.20
95% mean confidence interval for cycles %-change: 2.88% 5.19%
Cycles are HURT.

LOST:   35
GAINED: 3

GM45
total instructions in shared programs: 4979794 -> 5003551 (0.48%)
instructions in affected programs: 635174 -> 658931 (3.74%)
helped: 1
HURT: 2142
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.85% max: 1.85% x̄: 1.85% x̃: 1.85%
HURT stats (abs)   min: 1 max: 43 x̄: 11.09 x̃: 11
HURT stats (rel)   min: 0.20% max: 716.67% x̄: 7.00% x̃: 3.53%
95% mean confidence interval for instructions value: 10.85 11.33
95% mean confidence interval for instructions %-change: 6.25% 7.74%
Instructions are HURT.

total cycles in shared programs: 128519586 -> 128654990 (0.11%)
cycles in affected programs: 17635304 -> 17770708 (0.77%)
helped: 46
HURT: 2088
helped stats (abs) min: 4 max: 220 x̄: 18.13 x̃: 6
helped stats (rel) min: <.01% max: 1.28% x̄: 0.15% x̃: 0.01%
HURT stats (abs)   min: 2 max: 670 x̄: 65.25 x̃: 66
HURT stats (rel)   min: <.01% max: 1464.29% x̄: 4.05% x̃: 0.99%
95% mean confidence interval for cycles value: 61.75 65.15
95% mean confidence interval for cycles %-change: 2.58% 5.34%
Cycles are HURT.

LOST:   38
GAINED: 38

Fixes: 5b908db604b ("nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently")
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoanv: fix use after free
Lionel Landwerlin [Wed, 8 May 2019 10:39:09 +0000 (11:39 +0100)]
anv: fix use after free

Once mem->bo is removed from the cache, it is likely to be freed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b80930a6fea075 ("anv: add support for VK_EXT_memory_budget")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv: rework queries writes to ensure ordering memory writes
Lionel Landwerlin [Wed, 1 May 2019 11:30:41 +0000 (12:30 +0100)]
anv: rework queries writes to ensure ordering memory writes

We use a mix of MI & PIPE_CONTROL commands to write our queries' data
(results & availability). Those commands' memory write order is not
guaranteed with regard to their order in the command stream, unless CS
stalls are inserted between them. This is problematic for 2 reasons :

   1. We copy results from the device using MI commands even though
      the values are generated from PIPE_CONTROL, meaning we could
      copy unlanded values into the results and then copy the
      availability that is inconsistent with the values.

   2. We allow the user to poll on the availability values of the
      query pool from the CPU. If the availability lands in memory
      before the values then we could return invalid values.

This change does 2 things to address this problem :

      - We use either PIPE_CONTROL or MI commands to write both
        queries values and availability, so that the ordering of the
        memory writes guarantees that if availability is visible,
        results are also visible.

      - For the occlusion & timestamp queries we apply a CS stall
        before copying the results on the device, to ensure copying
        with MI commands see the correct values of previous
        PIPE_CONTROL writes of availability (required by the Vulkan
        spec).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: call constant folding before opt algebraic
Timothy Arceri [Thu, 2 May 2019 03:38:52 +0000 (13:38 +1000)]
radv: call constant folding before opt algebraic

The pattern of calling opt algebraic first seems to have originated
in i965. The order in OpenGL drivers generally doesn't matter
because the GLSL IR optimisations do constant folding before
opt algebraic.

However in Vulkan drivers calling opt algebraic first can result
in missed constant folding opportunities.

vkpipeline-db results (VEGA64):

Totals from affected shaders:
SGPRS: 3160 -> 3176 (0.51 %)
VGPRS: 3588 -> 3580 (-0.22 %)
Spilled SGPRs: 52 -> 44 (-15.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 12 -> 12 (0.00 %) dwords per thread
Code Size: 261812 -> 261036 (-0.30 %) bytes
LDS: 7 -> 7 (0.00 %) blocks
Max Waves: 346 -> 348 (0.58 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agodocs: drop h1 in header
Erik Faye-Lund [Mon, 6 May 2019 11:26:47 +0000 (13:26 +0200)]
docs: drop h1 in header

It's generally frowned upon to have more than one H1 per document in
HTML4. So let's put the text directly inside the header. This means we
can drop the flex-based centering, which makes things a bit easier. We
also need to change the padding to rem instead of em, because the em has
now changed.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: harmonize headings and titles
Erik Faye-Lund [Mon, 6 May 2019 11:13:11 +0000 (13:13 +0200)]
docs: harmonize headings and titles

We're pretty insonsistent in what the headings and titles are, especially
compared to what the articles are listed as in the sidebar. Let's
harmonize this.

There's a notable exception for meson.html, where the sidebar uses a
short-hand form that makes sense in the sidebar, but not in the article
due to the visible context being different.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: renumber headings
Erik Faye-Lund [Mon, 6 May 2019 10:57:15 +0000 (12:57 +0200)]
docs: renumber headings

It's generally frowned upon to have multiple H1 headings in HTML4. So
let's make sure each article has a primary heading for the article, and
that that heading is the title that is used in the sidebar.

While we're at it, let's update the title in the articles to match the
title from the sidebar as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: give download-article a primary heading
Erik Faye-Lund [Mon, 6 May 2019 10:50:34 +0000 (12:50 +0200)]
docs: give download-article a primary heading

It's generally frowned upon to have multiple H1 headings in HTML4. So
let's add a primary heading for the article, and source that from the
title used in the sidebar.

While we're at it, let's update the title in the article to match the
title from the sidebar as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: use title-casing for all headings in sidebar
Erik Faye-Lund [Mon, 6 May 2019 10:36:26 +0000 (12:36 +0200)]
docs: use title-casing for all headings in sidebar

We generally use title-casing for headings in the sidebar. But not
all headings was constently cased like that. Let's make sure this
is consistent.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: spell out "and" in sidebar
Erik Faye-Lund [Mon, 6 May 2019 10:32:59 +0000 (12:32 +0200)]
docs: spell out "and" in sidebar

There's no need to keep this short, we can just spell out "and" here.
Besides, a slash kind of implies "or", but these articles are about
both of these, not either.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: remove pointless list-entry
Erik Faye-Lund [Thu, 2 May 2019 18:47:55 +0000 (20:47 +0200)]
docs: remove pointless list-entry

It's quite visible that there's more docs below, we don't need to spell
it out for the reader.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: spell out faq in sidebar
Erik Faye-Lund [Mon, 6 May 2019 10:28:54 +0000 (12:28 +0200)]
docs: spell out faq in sidebar

We're not short on space here, so there's little point in abbreviating
this. This also matches the heading in the article.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agodocs: spell out "and" in sidebar
Erik Faye-Lund [Thu, 2 May 2019 18:43:25 +0000 (20:43 +0200)]
docs: spell out "and" in sidebar

We're not short on space here, so let's just spell out "and" instead of
using the ampersand. This is more consistent with the entry above in the
sidebar.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoglsl_to_nir: remove unused type_is_int()
Timothy Arceri [Wed, 8 May 2019 03:55:53 +0000 (13:55 +1000)]
glsl_to_nir: remove unused type_is_int()

This was missed in e00fa99b08b3.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoRevert "glx: Fix synthetic error generation in __glXSendError"
Timothy Arceri [Tue, 7 May 2019 03:55:32 +0000 (13:55 +1000)]
Revert "glx: Fix synthetic error generation in __glXSendError"

This reverts commit e91ee763c378d03883eb88cf0eadd8aa916f7878.

This seems to have broken a number of wine games. Lets revert
everything for now and try again later.

Acked-by: Adam Jackson <ajax@redhat.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590

5 years agoradeonsi: add an AMD_TEX_ANISO environment variable
Timothy Arceri [Tue, 7 May 2019 00:18:54 +0000 (10:18 +1000)]
radeonsi: add an AMD_TEX_ANISO environment variable

This brings it inline with the recently added AMD_DEBUG.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109619

5 years agoi965: leave the top 4Gb of the high heap VMA unused
Kenneth Graunke [Fri, 3 May 2019 19:02:41 +0000 (12:02 -0700)]
i965: leave the top 4Gb of the high heap VMA unused

This ports commit 9e7b0988d6e98690eb8902e477b51713a6ef9cae from anv
to i965.  Thanks to Lionel for noticing that it was missing!

Fixes: 01058a55229 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965: Force VMA alignment to be a multiple of the page size.
Kenneth Graunke [Sat, 27 Apr 2019 01:52:45 +0000 (18:52 -0700)]
i965: Force VMA alignment to be a multiple of the page size.

This should happen regardless, but let's be paranoid.

Fixes: 01058a55229 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965: Fix BRW_MEMZONE_LOW_4G heap size.
Kenneth Graunke [Sat, 27 Apr 2019 00:09:11 +0000 (17:09 -0700)]
i965: Fix BRW_MEMZONE_LOW_4G heap size.

The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages,
and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB.

So we can't use the top page.

Fixes: 01058a55229 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/compiler: Unset flag reg when FB write is not predicated
Matt Turner [Mon, 29 Apr 2019 23:01:08 +0000 (16:01 -0700)]
intel/compiler: Unset flag reg when FB write is not predicated

In the FS IR we pretend that the instruction is predicated with (+f0.1)
just for flag dependency tracking purposes. Since the instruction
doesn't support predication before Haswell, we unset the predicate so we
should also unset the flag register so that we can round-trip the
disassembly.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel/disasm: Disassemble immediate value properly for dim
Sagar Ghuge [Fri, 29 Mar 2019 21:04:03 +0000 (14:04 -0700)]
intel/disasm: Disassemble immediate value properly for dim

On haswell, for dim instruction we encode immediate float value operand
into double float,

v2: Fix comment (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/disasm: Disassemble JIP offset for while
Sagar Ghuge [Thu, 28 Mar 2019 00:07:01 +0000 (17:07 -0700)]
intel/disasm: Disassemble JIP offset for while

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/compiler: Replicate 16 bit immediate value correctly
Sagar Ghuge [Tue, 26 Mar 2019 04:17:08 +0000 (21:17 -0700)]
intel/compiler: Replicate 16 bit immediate value correctly

For the W or UW (signed or unsigned word) source types, the 16-bit value
must be replicated in both the low and high words of the 32-bit
immediate value.

v2: Fix replication in other places as well
V3: fix a few nits (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/compiler: Print quad value in hex format
Sagar Ghuge [Sun, 24 Mar 2019 03:02:54 +0000 (20:02 -0700)]
intel/compiler: Print quad value in hex format

Print quad value same as unsigned quad so that we can distinguish in
between quater control disassembled values for e.g 1/2/3[Q] and
immediate quad value for e.g 1Q. This allows round-tripping through the
assembler/disassembler.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/tools: Add unit tests for assembler
Sagar Ghuge [Sat, 23 Mar 2019 02:13:54 +0000 (19:13 -0700)]
intel/tools: Add unit tests for assembler

v1: Pass executable object from meson to test(Dylan Baker)
v2: Ignore generated output files from git status(Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agointel/tools: Initialize offset correctly for i965_asm
Mika Kuoppala [Thu, 21 Feb 2019 00:47:01 +0000 (16:47 -0800)]
intel/tools: Initialize offset correctly for i965_asm

If we leave offset uninitialized, access to store
will be random depending on stack value and can
segfault.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>