Jason Ekstrand [Wed, 16 Aug 2017 23:50:46 +0000 (16:50 -0700)]
ralloc: Allow reparenting to a NULL context
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 18:18:04 +0000 (11:18 -0700)]
anv/pipeline: Refactor setup of the prog_data::param array
Now that the only thing we put in the array up-front are client push
constants, we can simplify anv_pipeline_compile a bit.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 18:10:22 +0000 (11:10 -0700)]
anv/pipeline: Grow the param array for images
Before, we were calculating up-front and then filling in later. Now we
just grow as needed in anv_nir_apply_pipeline_layout.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 18:09:04 +0000 (11:09 -0700)]
anv/pipeline: Whack nir->num_uniforms to MAX_PUSH_CONSTANT_SIZE
This way any image uniforms end up having locations higher than
MAX_PUSH_CONSTANT_SIZE. There's no bug here at the moment, but this
consistency will make the next commit easier. Also, because
nir_apply_pipeline_layout properly increments nir->num_uniforms when
it expands the param array, we no longer need to stomp it to match
prog_data::nr_params because it already does.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 17:50:07 +0000 (10:50 -0700)]
intel/vs: Grow the param array for clip planes
Instead of requiring the caller of brw_compile_vs to figure it out, just
grow the param array on-demand.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 17:37:40 +0000 (10:37 -0700)]
intel/cs: Grow prog_data::param on-demand for thread_local_id_index
Instead of making the caller of brw_compile_cs add something to the
param array for thread_local_id_index, just add it on-demand in
brw_nir_intrinsics and grow the array. This is now safe to do because
everyone is now using ralloc for prog_data::param.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 6 Oct 2017 17:08:11 +0000 (10:08 -0700)]
intel/compiler: Make brw_nir_lower_intrinsics compute-specific
It's already only ever called from brw_compile_cs and only handles
compute intrinsics. Let's just make it CS-specific. We can always
make it handle other stages again later if we want.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 18:05:55 +0000 (11:05 -0700)]
intel/compiler: Add a helper for growing the prog_data::param array
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 17:12:12 +0000 (10:12 -0700)]
intel/compiler: Stop adding params for texture sizes
We haven't needed this ever since we started using NIR for lowering
rectangle textures.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 17:16:43 +0000 (10:16 -0700)]
i965: Only add the wpos state reference if we lowered something
Otherwise, in the ARB program case _mesa_add_state_reference may grow
the parameter array which will cause brw_nir_setup_arb_uniforms to write
past the end of the param array because it only looks at the parameter
list length but the parma array is allocated based on nir->num_uniforms.
The only reason this hasn't caused us problems is because we are padding
out the param array for fragment programs unnecessarily.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 05:16:55 +0000 (22:16 -0700)]
intel/compiler: Add a flag for pull constant support
The Vulkan driver does not support pull constants. It simply limits
things such that we can always push everything. Previously, we were
determining whether or not to push things based on whether or not the
prog_data::pull_param array is non-null. This is rather hackish and
about to stop working.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 17:06:17 +0000 (10:06 -0700)]
anv/pipeline: Ralloc prog_data::param of the compile mem_ctx
This way we stop leaking it. This is completely safe because, when we
hand it off to anv_shader_bin_create or anv_pipeline_cache_upload_kernel,
they make a copy of the entire param array.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 04:51:48 +0000 (21:51 -0700)]
anv/pipeline: Add a mem_ctx parameter to anv_pipeline_compile
This lets us avoid some of the manual ralloc stealing and prepares for
future commits in which we will want to ralloc prog_data::param.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 02:04:31 +0000 (19:04 -0700)]
i965: Store image_param in brw_context instead of prog_data
This burns an extra 10k of memory or so in the case where you don't have
any images. However, if you have several shaders which use images, this
should be much less memory. It also gets rid of a part of prog_data
that really has nothing to do with the compiler.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 29 Sep 2017 02:03:38 +0000 (19:03 -0700)]
i965: Use prog->info.num_images for needs_dc computation
This should be just as good as looking in prog_data but removes our one
state setup dependency on brw_stage_prog_data::nr_image_param.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 28 Sep 2017 23:25:31 +0000 (16:25 -0700)]
intel: Rewrite the world of push/pull params
This moves us away to the array of pointers model and onto a model where
each param is represented by a generic uint32_t handle. We reserve 2^16
of these handles for builtins that get generated by somewhere inside the
compiler and have well-defined meanings. Generic params have handles
whose meanings are defined by the driver.
The primary downside to this new approach is that it moves a little bit
of the work that we would normally do at compile time to draw time. On
my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to
have any measurable affect on OglBatch7. So, while this may come back
to bite us, it doesn't look too bad.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 28 Sep 2017 22:27:46 +0000 (15:27 -0700)]
i965: Get rid of gen7_cs_state.c
The only thing it was handling was push constants. We pull the actual
constant upload code into gen6_constant_state.c and the atoms into
genX_state_upload.c.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 28 Sep 2017 22:05:51 +0000 (15:05 -0700)]
i965: Add a helper for populating constant buffers
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 28 Sep 2017 21:39:00 +0000 (14:39 -0700)]
i965: Move brw_upload_pull_constants to gen6_constant_state.c
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 29 Aug 2017 00:06:24 +0000 (17:06 -0700)]
nir: Get rid of the variable on vote intrinsics
This looks like a copy+paste error. They don't actually write into that
variable as would be implied by putting the return there.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Mon, 28 Aug 2017 22:05:11 +0000 (15:05 -0700)]
nir/opcodes: Fix constant-folding of ufind_msb
We didn't fold correctly in the case of 0x1 because we never let the
loop counter hit 0. Switching it to bit >= 0 solves this problem.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Fri, 13 Oct 2017 05:36:48 +0000 (22:36 -0700)]
meta: Delete the PBO texsubimage path for real
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 11 Oct 2017 19:13:35 +0000 (12:13 -0700)]
anv/pipeline_cache: Rework to use multialloc and blob
This gets rid of all of our hand-rolled size calculation and
serialization code and replaces it with safe "standards" that are used
elsewhere in anv and mesa. This should be significantly safer than
rolling our own.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 19:28:20 +0000 (12:28 -0700)]
anv/pipeline: Declare bind maps closer to their use
This is just a trivial cleanup.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 19:13:19 +0000 (12:13 -0700)]
anv/multialloc: Add new add_size helper
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 19:10:08 +0000 (12:10 -0700)]
compiler/blob: Make some parameters void instead of uint8_t
There are certain advantages to using uint8_t internally such as
well-defined arithmetic on all platforms. However, interfaces that
work in terms of raw data should use a void* type.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 19:09:02 +0000 (12:09 -0700)]
compiler/blob: Constify the reader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Fri, 13 Oct 2017 03:58:43 +0000 (20:58 -0700)]
compiler/blob: Add (reserve|overwrite)_(uint32|intptr) helpers
These helpers not only call blob_reserve_bytes but also make sure that
the blob is properly aligned as if blob_write_* were called.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Connor Abbott [Fri, 15 Sep 2017 04:29:46 +0000 (00:29 -0400)]
compiler/blob: make blob_reserve_bytes() more useful
Despite the name, it could only be used if you immediately wrote to the
pointer. Noboby was using it outside of one test, so clearly this
behavior wasn't that useful. Instead, make it return an offset into the
data buffer so that the result isn't invalidated if you later write to
the blob. In conjunction with blob_overwrite_bytes(), this will be
useful for leaving a placeholder and then filling it in later, which
we'll need to do for handling phi nodes when serializing NIR.
v2 (Jason Ekstrand):
- Detect overflow in the offset + to_write computation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Fri, 13 Oct 2017 04:02:48 +0000 (21:02 -0700)]
compiler/blob: Allow for fixed-size blobs with a NULL data pointer
These can be used to easily count up the number of bytes that will be
required by "writing" it into the NULL blob.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 16:52:07 +0000 (09:52 -0700)]
compiler/blob: Add a concept of a fixed-allocation blob
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 16:44:33 +0000 (09:44 -0700)]
compiler/blob: Switch to init/finish instead of create/destroy
There's no reason why that tiny bit of memory needs to be on the heap.
We always put blob_reader on the stack, so why not do the same with the
writable blob.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Wed, 11 Oct 2017 16:54:55 +0000 (09:54 -0700)]
compiler: Move blob up a level
We're going to want to use the blob for Vulkan pipeline caching so it
makes sense to have it in libcompiler not libglsl.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Jason Ekstrand [Fri, 13 Oct 2017 04:19:32 +0000 (21:19 -0700)]
meson: Add inc_compiler to the libglsl includes
Jason Ekstrand [Wed, 11 Oct 2017 20:32:45 +0000 (13:32 -0700)]
glsl/blob: Return false from grow_to_fit if we've ever failed
Otherwise we could have a failure followed by a smaller write that
succeeds and get a corrupted blob. If we ever OOM, we should stop.
v2 (Jason Ekstrand):
- Initialize the new boolean member in create_blob
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Wed, 11 Oct 2017 17:56:48 +0000 (10:56 -0700)]
glsl/blob: Return false from ensure_can_read on overrun
Otherwise, if you have a large read fail and then try to do a small
read, the small read may succeed even though it's at the wrong offset.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Chris Wilson [Wed, 11 Oct 2017 20:43:45 +0000 (21:43 +0100)]
i965: Share the flush for brw_blorp_miptree_download into a pbo
As all users of brw_blorp_miptree_download() must emit a full pipeline
and cache flush when targetting a user PBO (as that PBO may then be
subsequently bound or *be* bound anywhere and outside of the driver
dirty tracking) move that flush into brw_blorp_miptree_download()
itself.
v2 (Ken): Rebase without userptr stuff so it can land sooner.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Fri, 2 Jun 2017 03:38:19 +0000 (20:38 -0700)]
meta: Delete the PBO texture upload/download path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 6 Jun 2017 16:59:01 +0000 (09:59 -0700)]
i965: Use blorp instead of meta for PBO pixel reads
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 6 Jun 2017 16:58:23 +0000 (09:58 -0700)]
i965: Use blorp instead of meta for PBO texture downloads
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 1 Jun 2017 00:53:34 +0000 (17:53 -0700)]
i965/tex: Use blorp texture upload for all CCS_E textures
This improves the FillTex benchmark in GLBench 2.7 by 30% on my Broxton.
On Ken's Broxton which only has single-channel ram, it improves by 210%.
v2 (Ken): Check mt->aux_usage == ISL_AUX_USAGE_CCS_E rather than using
intel_miptree_is_lossless_compressed().
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 6 Jun 2017 16:58:07 +0000 (09:58 -0700)]
i965: Use blorp instead of meta for PBO texture uploads
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sun, 27 Nov 2016 15:48:05 +0000 (17:48 +0200)]
i965: Add blorp-based texture upload and download paths
v1 (Topi Pohjolainen): original patch.
v2 (Topi Pohjolainen):
- Fix return value (s/MESA_FORMAT_NONE/false/) (Anuj)
- Move _mesa_tex_format_from_format_and_type() just
in the end avoiding additional if-block (Anuj)
- Explain better the array alignment restriction (Anuj)
- Do not bail out in case of gl_pixelstore_attrib::ImageHeight,
it is handled by _mesa_image_offset() automatically (Ken).
- Support 1D_ARRAY by flipping depth, width and y, z (Ken).
v3 (Topi Pohjolainen):
- Contrary to v2, do not try to handle
gl_pixelstore_attrib::ImageHeight. Currently there are no
tests in piglit or cts for it. One could possibly copy or
modify tests/texturing/texsubimage.c. There, however, seems
to be number of corner cases to consider. Moreover, current
meta path applies the packing height for both source and
targets when determining the offset. This would probably
require re-visiting also.
v4 (Topi Pohjolainen): Rebased on top of merged drm-bacon
v5 (Jason Ekstrand):
- Move to brw_blorp.c
- Significant refactoring
- Fixed 1-D array textures
- Simplified handling of PBOs vs. CPU data.
- Handle gl_pixelstore_attrib::ImageHeight. It turns out there are
piglit tests that cover this. The original version was failing them
because of an error in the way it handled 1-D array textures.
- Add support for texture download
v6 (Kenneth Graunke): Rebase fixes:
- Use intel_miptree_check_level_layer instead of deleted fields
- Update for mesa_format_supports_render[] rename.
- Pass 'false' (read-only) to intel_bufferobj_buffer
v7 (Kenneth Graunke):
- Fix brw_blorp_download_miptree to pass 'false' (not read only) for
the destination buffer (caught by Chris Wilson).
- Fix blorp_get_client_bo to pass intel_bufferobj_buffer !read_only
for the 'writable' parameter instead of 'false' (caught by Jason).
- Support GL_BGR, GL_BGRA, GL_BGRA_INTEGER, GL_BGR_INTEGER, allowing
us to use this for ReadPixels on the window system buffer (caught
by Chris Wilson).
- Fix y-flipping bugs in download path (exposed by BGRA support).
- Fix false vs. NULL return value in blorp_get_client_bo.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Kenneth Graunke [Thu, 12 Oct 2017 03:33:25 +0000 (20:33 -0700)]
i965: Refactor y-flipping coordinate transform.
I want to reuse it for the BLORP download path.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 1 Jun 2017 00:04:13 +0000 (17:04 -0700)]
i965/tex: Check if there is data to upload up-front
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 15 Jun 2017 05:28:25 +0000 (22:28 -0700)]
i965/barrier: Do the correct flushes for framebuffer access
Framebuffer access includes framebuffer reads so we need to invalidate
the texture cache. We do not, however, need to flush the depth cache
because you cannot do bind a depth texture as an image.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Thu, 15 Jun 2017 05:27:20 +0000 (22:27 -0700)]
i965/barrier: Do the correct flushes for texture updates
Texture uploads and downloads may go through the render pipe which may
result in texturing from or rendering to the texture or the PBO. We
need to flush accordingly.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 12 Oct 2017 22:45:11 +0000 (15:45 -0700)]
include: Revert out the update of the Khronos GLX extension header.
They made a mistake in the MESA_swap_control XML, which I'm pursuing in
their github. Until then, we can just back this piece out.
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Kenneth Graunke [Sat, 9 Sep 2017 07:20:26 +0000 (00:20 -0700)]
i965: Ignore GL_SKIP_DECODE_EXT for textures accessed via texelFetch().
The GL_EXT_texture_sRGB_decode spec says:
"The conversion of sRGB color space components to linear color space is
always performed if the texel lookup function is one of the texelFetch
builtin functions.
Otherwise, if the texel lookup function is one of the texture builtin
functions or one of the texture gather functions, the conversion of sRGB
color space components to linear color space is controlled by the
TEXTURE_SRGB_DECODE_EXT parameter.
If the TEXTURE_SRGB_DECODE_EXT parameter is DECODE_EXT, the conversion
of sRGB color space components to linear color space is performed.
If the TEXTURE_SRGB_DECODE_EXT parameter is SKIP_DECODE_EXT, the value
is returned without decoding. However, if the texture is also accessed
with a texelFetch function, then the result of texture builtin functions
and/or texture gather functions may be returned with decoding or without
decoding."
This patch makes i965 force sRGB decoding for any textures accessed via
texelFetch(). If textures are accessed via texelFetch() and a regular
texture access function, this will affect the other ones too - which is
fine - it's undefined according to the last paragraph quoted.
We could make both work, but we'd have to emit multiple SURFACE_STATEs,
and have two binding table sections, like we do for texture gather hacks
on older platforms.
Fixes the following Android O CTS test:
dEQP-GLES31.functional.srgb_texture_decode.skip_decode.srgba8.texel_fetch
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Wed, 11 Oct 2017 07:18:38 +0000 (00:18 -0700)]
meta: Unset the textures_used_by_txf bitfield.
Drivers that use Meta are happily using blitting data using texelFetch
and GL_SKIP_DECODE_EXT, but the GL_EXT_texture_sRGB spec unfortunately
makes GL_SKIP_DECODE_EXT not necessarily work with texelFetch.
As a hack, just unset the texture_used_by_txf bitfield so we can
continue with the old desired behavior.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Sat, 9 Sep 2017 07:19:57 +0000 (00:19 -0700)]
nir: Make nir_shader_gather_info() track texelFetch texture accesses.
For TGSI-based drivers, st_glsl_to_tgsi records this information.
For NIR-based drivers, nir_shader_gather_info() will do so.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Sat, 9 Sep 2017 07:19:57 +0000 (00:19 -0700)]
compiler: Move gl_program::TexelFetchSamplers to shader_info.
I'd like to put this sort of metadata in the shader_info structure,
rather than adding more things to gl_program.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Dave Airlie [Thu, 12 Oct 2017 04:10:53 +0000 (05:10 +0100)]
radv: take unsafe_math and sisched into account when hashing shaders.
We want to generate different variants for sisched and unsafe_math
shader variants, so add them to the hash key.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 11 Oct 2017 23:32:14 +0000 (09:32 +1000)]
mesa/bufferobj: fix atomic offset/size get
When I realigned the bufferobj code, I didn't see the getters
were different, realign the getters to work the same as ssbo.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103214
Fixes:
65d3ef7cd (mesa: align atomic buffer handling code with ubo/ssbo (v1.1))
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Thu, 12 Oct 2017 20:27:20 +0000 (22:27 +0200)]
relnotes: document EGL_ANDROID_native_fence_sync on radeonsi
Eric Anholt [Wed, 11 Oct 2017 16:54:47 +0000 (09:54 -0700)]
include: Update GL headers from khronos opengl registry.
Taken from their
c6a99aff31874697741a08cbc8a3488606ce59c7, keeping the
BUILDING_MESA hunk in place.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 11 Oct 2017 16:50:13 +0000 (09:50 -0700)]
mapi: Update extension number of MESA_tile_raster_order.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 10 Oct 2017 20:56:55 +0000 (13:56 -0700)]
broadcom/vc5: Remove the u_resource_vtbl usage.
Like for vc4, this was just a wasted indirection.
Eric Anholt [Wed, 11 Oct 2017 17:32:25 +0000 (10:32 -0700)]
mesa: Disallow GL_RED/GL_RG with half-floats on GLES2.
Sure, you'd think that the combination of GL_OES_texture_half_float and
GL_EXT_texture_rg would mean that GL_RG16F exists, but it doesn't.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227
Fixes:
c16a7443e999 ("mesa: Expose GL_OES_required_internalformat on GLES contexts.")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 12 Sep 2017 18:17:31 +0000 (20:17 +0200)]
radeonsi: implement sync_file import/export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 12 Sep 2017 18:13:06 +0000 (20:13 +0200)]
winsys/amdgpu: implement sync_file import/export
syncobj is used internally for interactions with command submission.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 12 Sep 2017 18:06:37 +0000 (20:06 +0200)]
ac: add radeon_info::has_sync_file
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Eric Anholt [Thu, 12 Oct 2017 00:40:35 +0000 (17:40 -0700)]
broadcom/vc5: Don't pair VPMSETUP with other peripheral access.
The specs don't say you can't, but pairing it with an SFU write on the
7268 breaks all our simple shader tests using gl_MVP * gl_Vertex.
Eric Anholt [Thu, 12 Oct 2017 00:15:10 +0000 (17:15 -0700)]
broadcom/vc5: Fix inclusion of FS flag bits in dumping the FS address.
Marek Olšák [Mon, 9 Oct 2017 16:56:22 +0000 (18:56 +0200)]
st/dri: implement __DRIimageExtension::validateUsage properly
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 9 Oct 2017 16:44:50 +0000 (18:44 +0200)]
gallium: add pipe_screen::check_resource_capability
This is optional (and no CAP).
Implemented by radeonsi, ddebug, rbug, trace.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 9 Oct 2017 16:42:48 +0000 (18:42 +0200)]
ac/surface: add ac_surface::is_displayable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 9 Oct 2017 16:31:12 +0000 (18:31 +0200)]
amd/addrlib: add Addr2IsValidDisplaySwizzleMode
Some "standard" (_S) swizzle modes are displayable on Raven,
even though the micro tile mode says it's not displayable.
Expose the addrlib function to the driver.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
tournier.elie [Thu, 12 Oct 2017 11:24:10 +0000 (12:24 +0100)]
meson: fix typo in isl
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Rob Herring [Wed, 11 Oct 2017 22:46:10 +0000 (17:46 -0500)]
Android: disable i9x5 drivers on non-x86 builds
The i965 driver has become dependent on x86 specific compiler builtin
functions, so ensure it's disabled for non-x86 builds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Wladimir J. van der Laan [Sat, 30 Sep 2017 08:11:32 +0000 (10:11 +0200)]
etnaviv: Do GC3000 resolve-in-place when possible
If an RS blit is done with source exactly the same as destination, and
the hardware supports this, do an in-place resolve. This only fills in
tiles that have not been rendered to using information from the TS.
This is the same as the blob does and potentially saves significant
bandwidth when doing i.MX6qp scanout using PRE, and when rendering to
textures (though here using sampler TS would be even better).
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Eric Engestrom [Mon, 25 Sep 2017 21:04:24 +0000 (22:04 +0100)]
egl_haiku: drop haiku_egl_driver struct
The struct only contained the one field we're interested in.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Thu, 12 Oct 2017 13:26:47 +0000 (14:26 +0100)]
egl: remove left over _EGLMain_t
Fixes:
b174a1ae720cb404738c "egl: Simplify the "driver" interface"
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Mon, 25 Sep 2017 21:39:24 +0000 (22:39 +0100)]
egl: drop memset(0) of calloc'ed memory
`_EGLDriver *drv` is a freshly calloc()'ed object, memset(0)'ing some of
it is a no-op.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Eric Engestrom [Tue, 26 Sep 2017 12:13:39 +0000 (13:13 +0100)]
egl: replace _egl_driver->Unload() callback with a simple free()
Bonus: fixes a memleak on haiku when unloading the driver
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Dave Airlie [Thu, 12 Oct 2017 04:24:41 +0000 (14:24 +1000)]
radv: don't crash if cache is disabled.
If you set MESA_GLSL_CACHE_DISABLE, radv crashed.
Fixes:
fd24be134f (radv: make use of on-disk cache)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Samuel Pitoiset [Thu, 5 Oct 2017 13:51:20 +0000 (15:51 +0200)]
radv: use CLEAR_STATE for initializing some registers
Based on RadeonSI.
This improves some Vulkan demos by +1% to +3%.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 5 Oct 2017 12:55:24 +0000 (14:55 +0200)]
radv: add has_clear_state and enable it on CIK+ only
This will allow us to emit the CLEAR_STATE packet instead
of a bunch of useless packets when doing CS initialization.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 5 Oct 2017 13:13:19 +0000 (15:13 +0200)]
radv: do not set registers for merged ES-GS on GFX9
Based on RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Thu, 5 Oct 2017 12:43:05 +0000 (14:43 +0200)]
radv: move the raster config emission in si_set_raster_config()
Similar to RadeonSI, also only call this function for <= VI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Tue, 3 Oct 2017 13:02:22 +0000 (15:02 +0200)]
radeonsi: add support for PIPE_FORMAT_{X1,A1}R5G5B5_UNORM
Fixes dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 3 Oct 2017 13:00:24 +0000 (15:00 +0200)]
gallium: add tests for PIPE_FORMAT_{X1,A1}B5G5R5_UNORM formats
This is a left-over from my version of adding the new format
after rebasing on Eric's version.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Wed, 11 Oct 2017 00:45:50 +0000 (10:45 +1000)]
include/drm-uapi: clarify when headers can be updated.
Clarify when headers can be updated here.
Reviewed-by: Gurchetan Singh<gurchetansingh@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Wed, 11 Oct 2017 04:10:37 +0000 (15:10 +1100)]
radv: remove duplicate line of code
The same line of code is a few lines above.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Timothy Arceri [Wed, 11 Oct 2017 01:10:31 +0000 (12:10 +1100)]
radv: make use of on-disk cache
If the app provided in-memory pipeline cache doesn't yet contain
what we are looking for, or it doesn't provide one at all then we
fallback to the on-disk cache.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Wed, 11 Oct 2017 01:00:27 +0000 (12:00 +1100)]
radv: create on-disk shader cache
This is the drivers on-disk cache intended to be used as a
fallback as opposed to the pipeline cache provided by apps.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Wed, 11 Oct 2017 00:59:20 +0000 (11:59 +1100)]
radv: remove duplicate debug_flags field
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Lionel Landwerlin [Wed, 11 Oct 2017 16:24:37 +0000 (17:24 +0100)]
anv: intel: use anv_image's computed size for importing a BO
Rather than relying on size = stride * height, we can rely on
anv_image's total size.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Lionel Landwerlin [Wed, 11 Oct 2017 16:21:53 +0000 (17:21 +0100)]
anv: bo_cache: allow importing a BO larger than needed
It's not a problem if a BO has been allocated larger than we need it
to be.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102940
Fixes:
818b857914 ("anv: Use the BO cache for DeviceMemory allocations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
Nicolai Hähnle [Fri, 6 Oct 2017 18:28:43 +0000 (20:28 +0200)]
st/glsl_to_tgsi: the second destination doesn't support relative addressing
It's not used -- DFRACEXP gets array indexes of its exponent out-parameter
lowered earlier -- and it wouldn't have worked correctly anyway when both
dst and dst1 use relative addressing.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 6 Oct 2017 18:27:40 +0000 (20:27 +0200)]
st/glsl_to_tgsi: fix DFRACEXP with only one destination
Replace the undefined destination by a new temporary register.
Cleanup merge_two_dsts while we're at it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 6 Oct 2017 15:14:46 +0000 (17:14 +0200)]
st/glsl_to_tgsi: fix indirect access to 64-bit integer
Make sure we actually allocate two adjacent TGSI temporaries. The
current code fails e.g. when an arithmetic operation has two
operands with indirect accesses.
I will send out a new piglit test
(arb_gpu_shader_int64/execution/indirect-array-two-accesses.shader_test)
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 5 Oct 2017 17:25:48 +0000 (19:25 +0200)]
st/mesa: don't assign prog->ShadowSamplers
It's not used, and the assignment for the TGSI case was incorrect
for sampler arrays.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 5 Oct 2017 17:39:33 +0000 (19:39 +0200)]
st/glsl_to_tgsi: ignore GL_TEXTURE_SRGB_DECODE_EXT for samplers used with texelFetch*()
See the comment for the relevant spec quote.
Fixes dEQP-GLES31.functional.srgb_texture_decode.skip_decode.srgba8.texel_fetch
v2: note the interaction between ARB_bindless_texture and EXT_texture_sRGB_decode
as a TODO
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 5 Oct 2017 12:08:04 +0000 (14:08 +0200)]
st/mesa: store state that affects sampler views per context
This fixes sequences like:
1. Context 1 samples from texture with sRGB decode enabled
2. Context 2 samples from texture with sRGB decode disabled
3. Context 1 samples from texture with sRGB decode disabled
Previously, step 3 would see the prev_sRGBDecode value from context 2
and would incorrectly use the old sampler view with sRGB decode enabled.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tim Rowley [Tue, 10 Oct 2017 16:08:29 +0000 (11:08 -0500)]
swr: simd16 shaders work in progress
Start building vertex shaders as simd16.
Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 10 Oct 2017 16:07:11 +0000 (11:07 -0500)]
gallium: allow 512-bit vectors
Increase the max allowed vector size from 256 to 512.
No piglit llvmpipe regressions running on avx2.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Kenneth Graunke [Tue, 10 Oct 2017 17:19:21 +0000 (10:19 -0700)]
i965: Drop brw_bo_alloc in ARB_indirect_parameters implementation.
The original implementation allocated a new BO here, but we decided to
switch to intel_upload_space, which returns a reference to the current
upload BO. We accidentally kept the brw_bo_alloc, even though it's no
longer necessary - intel_upload_space will immediately unreference it,
causing us to allocate and immediately free a buffer.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Kenneth Graunke [Sun, 30 Jul 2017 23:15:56 +0000 (16:15 -0700)]
i965: Allow mapped VBOs during drawing in non-debug contexts.
Section 6.3.2 of the GL 4.5 spec says:
"Any GL command which attempts to read from, write to, or change
the state of a buffer object may generate an INVALID_OPERATION error
if all or part of the buffer object is mapped ... However, only
commands which explicitly describe this error are required to do so.
If an error is not generated, such commands will have undefined
results and may result in GL interruption or termination."
Setting this flag allows us to skip walking over the buffer bindings
for every enabled vertex attribute (_mesa_all_buffers_are_unmapped).
Improves performance in GFXBench4's gl_driver2_off microbenchmark by
3.05797% +/- 0.709031% (n=33) on Apollolake.
This breaks KHR-*.draw_elements_base_vertex_tests.invalid_mapped_bos,
but that test is invalid and has been removed from the upstream CTS.
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Tue, 10 Oct 2017 21:50:53 +0000 (14:50 -0700)]
meson: fix glx test
That requires a generated header that was rolled into a loop.
fixes:
a47c525f3281a27 ("meson: build glx")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>