Samuel Pitoiset [Wed, 24 Oct 2018 06:50:25 +0000 (08:50 +0200)]
radv: fix a comment in radv_meta_buffer_to_image_cs_r32g32b32()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:24 +0000 (08:50 +0200)]
radv: add get_image_stride_for_r32g32b32() helper
For the special R32G32B32 paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:23 +0000 (08:50 +0200)]
radv: add create_bview_for_r32g32b32() helper
For the special R32G32B32 paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:22 +0000 (08:50 +0200)]
radv: add create_buffer_from_image() helper
For the special R32G32B32 paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Sagar Ghuge [Wed, 24 Oct 2018 23:25:53 +0000 (16:25 -0700)]
intel/compiler: Print message descriptor as immediate source
While disassembling a send(c) instruction, print the message descriptor as
an immediate source operand in addition to the decoded message descriptor.
This allows the assembler to read the immediate source operand and set the
bits accordingly.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Sagar Ghuge [Wed, 24 Oct 2018 20:27:27 +0000 (13:27 -0700)]
intel/compiler: Print hex representation along with floating point value
When encoding immediate floating-point values in an instruction we use
values with up to 9 digits of precision, but when disassembling we print
only 6 decimal places, which rounds the value and gives a wrong
interpretation of the encoded immediate constant.
To avoid misinterpreting encoded immediate values between the instruction
and the disassembled output, print the hex representation along with the
floating-point value, which the assembler can use in the future.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
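The rounding problem described in the commit above can be sketched in a few lines of C. This is only an illustration, not the Mesa disassembler code: format_float_imm(), float_from_bits() and the output format are hypothetical names chosen for the example. Two adjacent float encodings print identically at six decimal places, while the hex form keeps them apart.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret raw IEEE-754 bits as a float without violating aliasing rules. */
static float float_from_bits(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Hypothetical formatter (not the Mesa disassembler): emit the exact bit
 * pattern first, then the rounded decimal value for human readers. */
static int format_float_imm(char *buf, size_t size, float value)
{
   uint32_t bits;

   assert(buf != NULL && size > 0);
   memcpy(&bits, &value, sizeof(bits));
   return snprintf(buf, size, "0x%08" PRIx32 " (%f)", bits, (double)value);
}
```

Formatting the bit patterns 0x3dcccccd and 0x3dccccce with this helper gives the same decimal text but different hex text, so only the hex form can be read back without loss.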
David McFarland [Wed, 24 Oct 2018 00:51:09 +0000 (21:51 -0300)]
util: Change remaining uint32 cache ids to sha1
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id. Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id. Instead of writing a uint32_t it now writes to a
mesa_sha1. All drivers using disk_cache_get_function_identifier are
updated accordingly.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes:
83ea8dd99bb1 ("util: add disk_cache_get_function_identifier()")
Hyunjun Ko [Wed, 24 Oct 2018 01:57:15 +0000 (10:57 +0900)]
freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Following the commit
2385d7b066 and
8e798e28f7, for resource dependency
tracking.
Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
with FD_MESA_DEBUG=inorder
Signed-off-by: Rob Clark <robdclark@gmail.com>
Hyunjun Ko [Thu, 25 Oct 2018 08:26:19 +0000 (17:26 +0900)]
freedreno/ir3: take reg->num out of union in ir3_register
To avoid wrong result when identifying the type of register.
I.e., if the reg is an array, it might be identified as an address or
predicate register.
Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6
Signed-off-by: Rob Clark <robdclark@gmail.com>
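A minimal sketch of the kind of aliasing the ir3_register commit above describes, under stated assumptions: the struct layouts, DEMO_REG_ARRAY and DEMO_ADDR_REG_NUM are hypothetical and do not match the real ir3_register definition. When num shares storage with the array fields, an array id can be mistaken for the address register number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_REG_ARRAY    0x1u   /* hypothetical "this is an array" flag */
#define DEMO_ADDR_REG_NUM 61     /* hypothetical address-register number */

/* Before: num overlaps the array fields, so it is garbage for arrays. */
struct demo_reg_before {
   unsigned flags;
   union {
      int num;
      struct { int id; int offset; } array;
   };
};

/* After: num has its own storage and stays meaningful for arrays too. */
struct demo_reg_after {
   unsigned flags;
   int num;
   struct { int id; int offset; } array;
};

static bool is_addr_before(const struct demo_reg_before *reg)
{
   assert(reg != NULL);
   return reg->num == DEMO_ADDR_REG_NUM;   /* reads array.id for arrays */
}

static bool is_addr_after(const struct demo_reg_after *reg)
{
   assert(reg != NULL);
   return reg->num == DEMO_ADDR_REG_NUM;
}
```

With the "before" layout, an array whose id happens to equal DEMO_ADDR_REG_NUM is reported as the address register; with the "after" layout the same array is not.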
Rob Clark [Thu, 25 Oct 2018 19:27:10 +0000 (15:27 -0400)]
freedreno/a6xx: disable unused groups
Don't leave vsconst/fsconst group enabled if we switch to shader with no
uniforms.
Fixes:
abcdf5627a2 freedreno/a6xx: move const emit to state group
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 18 Oct 2018 13:05:52 +0000 (09:05 -0400)]
freedreno: add useful assert
Would have been useful to catch the problem fixed in
8e798e28f736e22e9e1e4534ab42a36cde14b142
Signed-off-by: Rob Clark <robdclark@gmail.com>
Alok Hota [Tue, 16 Oct 2018 23:15:29 +0000 (18:15 -0500)]
swr/rast: ignore CreateElementUnorderedAtomicMemCpy
This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball.
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107865
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Alok Hota [Wed, 19 Sep 2018 17:42:57 +0000 (12:42 -0500)]
swr/rast: fix intrinsic/function for LLVM 7 compatibility
Converted from x86 VFMADDPS intrinsic to generic LLVM intrinsic, and
removed createInstructionSimplifierPass, which were both removed in LLVM
7.0.0
These changes combine patches we received from the community and our own
internal patches
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Rhys Perry [Thu, 4 Oct 2018 12:40:43 +0000 (13:40 +0100)]
nvc0: increase NOUVEAU_TRANSFER_PUSHBUF_THRESHOLD to 1024 on Kepler+
Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82%
FPS improvement with Dirt Rally on my GTX 1060.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bas Nieuwenhuizen [Tue, 23 Oct 2018 08:54:24 +0000 (10:54 +0200)]
radv: Emit enqueued pipeline barriers on event write.
Since the CPU can read them we need to execute any GPU->CPU
flushes before the event is written.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108524
Fixes:
f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Sun, 30 Sep 2018 18:02:04 +0000 (20:02 +0200)]
radv: Add support for VK_KHR_driver_properties.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Eric Engestrom [Sat, 20 Oct 2018 17:00:09 +0000 (18:00 +0100)]
util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Eric Engestrom [Sat, 20 Oct 2018 17:00:08 +0000 (18:00 +0100)]
util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Dylan Baker [Tue, 23 Oct 2018 17:02:05 +0000 (10:02 -0700)]
gen: Add AMD_gpu_shader_int64.xml to tarball
CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes:
b3c17330e631695b5e5dc209ba9ea1a528618c97
("mesa: expose AMD_gpu_shader_int64")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Tue, 23 Oct 2018 17:00:01 +0000 (10:00 -0700)]
gen: Add EXT_vertex_attrib_64bit.xml to dependency lists
Which is also required to put it in the tarball, a requirement for
building with meson from the tarball.
CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes:
263c962cfdee6b43578ee5f28601309ea77d1434
("mesa: expose EXT_vertex_attrib_64bit")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Eric Engestrom [Tue, 23 Oct 2018 14:37:21 +0000 (15:37 +0100)]
anv: move variable to proper scope and mark as MAYBE_UNUSED
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 23 Oct 2018 14:27:51 +0000 (15:27 +0100)]
anv: use snprintf() instead of memset()+strcpy()
snprintf() guarantees that it will not write more chars than allowed,
and that the string will be null-terminated, without the need to fill
the whole thing with zeroes to begin with.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
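The difference between the two patterns in the snprintf() commit above can be sketched as follows. The helper names and the 8-byte DEMO_NAME_SIZE are assumptions made for the example and are not taken from the anv source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_SIZE 8   /* hypothetical, deliberately small buffer */

/* Old pattern: zero the whole buffer, then copy with no bound at all.
 * Only safe when strlen(src) < DEMO_NAME_SIZE. */
static void set_name_old(char dst[DEMO_NAME_SIZE], const char *src)
{
   assert(strlen(src) < DEMO_NAME_SIZE);
   memset(dst, 0, DEMO_NAME_SIZE);
   strcpy(dst, src);
}

/* New pattern: snprintf() writes at most DEMO_NAME_SIZE bytes and always
 * null-terminates, truncating the string if it does not fit. */
static void set_name_new(char dst[DEMO_NAME_SIZE], const char *src)
{
   assert(dst != NULL && src != NULL);
   snprintf(dst, DEMO_NAME_SIZE, "%s", src);
}
```

Passing a string longer than the buffer to set_name_new() yields a truncated, terminated result instead of an overflow, and the bytes after the terminator no longer need to be cleared first.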
Eric Engestrom [Tue, 23 Oct 2018 14:25:45 +0000 (15:25 +0100)]
anv: drop unused includes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Dylan Baker [Tue, 23 Oct 2018 18:01:12 +0000 (11:01 -0700)]
autotools: include intel_tiled_memcpy.c
There are two problems with the fixed patch. First, it fails to create a
dependency on the sourced .c file, so changes to intel_tiled_memcpy.c
won't trigger a rebuild. It also doesn't get included in the dist
tarball.
Fixes:
11b1afdc92db98e93f2ca50beeb7fc481a11e708
("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Tue, 23 Oct 2018 17:40:15 +0000 (10:40 -0700)]
meson: fix formatting and add extra_files to i965
extra_files is just a nice way to tell certain IDEs (and those
reading the file) that this file is also a dependency. Meson will use
the .d file generated by the compiler to figure out what the target
actually depends on.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Eduardo Lima Mitev [Tue, 23 Oct 2018 19:24:11 +0000 (21:24 +0200)]
ir3_compiler/nir: fix imageSize() for buffer-backed images
GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.
Because of how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.
This patch divides the imageSize result by the bytes-per-pixel of the
image format when the image is buffer-backed.
Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*
v2: Pre-compute and submit the log2 of the image format's bpp as shader
constant instead of emitting the LOG2 instruction in code. (Rob Clark)
v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)
Reviewed-by: Rob Clark <robdclark@gmail.com>
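The v2/v3 notes of the imageSize() commit above (precompute log2 of the bpp with ffs, then divide) can be sketched on the CPU side like this. The function names are hypothetical, the POSIX ffs() from <strings.h> stands in for Mesa's own helper, and the sketch assumes the bytes-per-pixel value is a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* POSIX ffs() */

/* log2 of a power-of-two bytes-per-pixel value: ffs() returns the 1-based
 * index of the lowest set bit, so subtract one. */
static unsigned bpp_log2(unsigned bytes_per_pixel)
{
   assert(bytes_per_pixel != 0 &&
          (bytes_per_pixel & (bytes_per_pixel - 1)) == 0);
   return (unsigned)ffs((int)bytes_per_pixel) - 1;
}

/* Convert a buffer size in bytes to a size in texels with a shift, which
 * is what a precomputed log2 constant allows in place of a division. */
static uint32_t buffer_image_size_texels(uint32_t size_bytes, unsigned log2_bpp)
{
   return size_bytes >> log2_bpp;
}
```

For a 1024-byte buffer with a 4-byte format this gives 256 texels, and with a 16-byte format 64 texels.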
Jose Fonseca [Wed, 24 Oct 2018 10:33:09 +0000 (11:33 +0100)]
nir: Fix array initializer.
Empty initializer is not standard C. This fixes MSVC build.
Trivial.
Liviu Prodea [Wed, 24 Oct 2018 10:08:35 +0000 (11:08 +0100)]
scons: Put to rest zombie texture_float build option.
I found a remnant of texture_float build option that wasn't removed in
commit
66673bef941af344314fe9c91cad8cd330b245eb
This patch removes it.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Alex Smith [Thu, 18 Oct 2018 16:29:37 +0000 (17:29 +0100)]
anv: Allow presenting via a different GPU
anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for
this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not.
Apps which check for presentation support via the latter (all Feral
Vulkan games at least) will therefore fail.
This allows me to render on an Intel GPU and present to a display
connected to an AMD card (tested HD 530 + Vega 64).
v2: Rebase on current master.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Juan A. Suarez Romero [Tue, 23 Oct 2018 13:55:11 +0000 (15:55 +0200)]
nir: fix nir_copy_propagation test
Use nir_src_comp_as_uint() to read the proper second component, as
nir_src_as_uint() returns the first one.
v2: Use nir_src_comp_as_uint() [Jason]
Fixes:
16870de8a0a ("nir: Use nir_src_is_const and nir_src_as_* in core
code")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Tue, 23 Oct 2018 10:56:31 +0000 (21:56 +1100)]
radv: call nir_link_xfb_varyings()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Timothy Arceri [Tue, 23 Oct 2018 10:56:30 +0000 (21:56 +1100)]
radv: move nir_lower_io_to_scalar_early() to radv_link_shaders()
nir_lower_io_to_scalar_early() is really part of the link time
optimisations. Moving it here allows the code to be simplified
and also keeps the code easy to follow in the next patch.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Tue, 23 Oct 2018 10:56:29 +0000 (21:56 +1100)]
nir: add linking helper nir_link_xfb_varyings()
The linking opts shouldn't try removing or compacting XFB varyings
in the consumer. To avoid this we copy the always_active_io flag
from the producer.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Sagar Ghuge [Sat, 20 Oct 2018 01:25:23 +0000 (18:25 -0700)]
intel/compiler: Change src1 reg type to unsigned doubleword
To have uniform behavior while disassembling send(c) instruction use
register type of unsigned doubleword for src1 when message descriptor is
immediate value. Bspec does not specify anything for src1 immediate
default type.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Eduardo Lima Mitev [Tue, 23 Oct 2018 05:56:58 +0000 (07:56 +0200)]
mesa/glformats: Remove redundant helper _mesa_base_format_component_count
There exists _mesa_components_in_format() which already includes
all cases handled in _mesa_base_format_component_count().
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Jason Ekstrand [Mon, 22 Oct 2018 23:29:52 +0000 (18:29 -0500)]
nir/algebraic: Fix a typo in the bit size validation code
The conon_bit_class and canon_var_class variables got switched.
Fixes:
932c650e0b "nir/algebraic: Loosen a restriction on variables"
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Leo Liu [Tue, 23 Oct 2018 16:57:31 +0000 (12:57 -0400)]
amd/common: check DRM version 3.27 for JPEG decode
JPEG was added after DRM version 3.26
Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes:
4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Juan A. Suarez Romero [Fri, 5 Oct 2018 09:14:59 +0000 (11:14 +0200)]
docs: update calendar
I'll take care of the 18.2 release series on Andres' behalf.
CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Lionel Landwerlin [Tue, 23 Oct 2018 00:39:39 +0000 (01:39 +0100)]
intel/decoders: fix end of batch limit
Pointer arithmetic...
v2: s/4/sizeof(uint32_t)/ (Eric)
v3: Give bytes to print_batch() in error_decode (Lionel)
Make clear what values we're dealing with in error_decode (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
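The "pointer arithmetic" remark in the end-of-batch commit above most likely refers to mixing byte counts with uint32_t pointer offsets. A minimal sketch of that pitfall, with hypothetical helper names that are not part of the Intel decoder:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One-past-the-end pointer of a batch whose size is given in bytes.
 * The buggy form would be `batch + size_bytes`: adding to a uint32_t
 * pointer advances in dwords, so that overshoots the end by 4x. */
static const uint32_t *batch_end(const uint32_t *batch, size_t size_bytes)
{
   assert(size_bytes % sizeof(uint32_t) == 0);
   return batch + size_bytes / sizeof(uint32_t);
}

/* Walk the batch one dword at a time and count how many were visited. */
static size_t count_dwords(const uint32_t *batch, size_t size_bytes)
{
   size_t n = 0;

   for (const uint32_t *p = batch; p < batch_end(batch, size_bytes); p++)
      n++;
   return n;
}
```

For an 8-dword array, passing sizeof(array) (32 bytes) visits exactly 8 entries.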
Boyuan Zhang [Wed, 17 Oct 2018 19:03:30 +0000 (15:03 -0400)]
radeonsi: enable vcn jpeg decode for raven
Enable vcn jpeg decode for raven.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:29 +0000 (15:03 -0400)]
winsys/amdgpu: add vcn jpeg cs support
Add vcn jpeg cs support, align cs by no-op.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:28 +0000 (15:03 -0400)]
amd/common: add vcn jpeg ip info query
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:27 +0000 (15:03 -0400)]
radeon/vcn: implement jpeg target buffer cmd
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:26 +0000 (15:03 -0400)]
radeon/vcn: implement jpeg bitstream buffer cmd
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:25 +0000 (15:03 -0400)]
radeon/uvd: remove get mjpeg slice header
Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:24 +0000 (15:03 -0400)]
st/va: get mjpeg slice header
Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:23 +0000 (15:03 -0400)]
radeon/vcn: add jpeg decode implementation
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:22 +0000 (15:03 -0400)]
radeon/vcn: separate send cmd call from end frame
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logics for Jpeg decode later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
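The function-pointer dispatch described in the commit above can be sketched as follows. Every name here (demo_decoder, send_cmd_dec, send_cmd_jpeg, the DEMO_RING_* values) is hypothetical and only illustrates the pattern; the real radeon_decoder has different fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_ring { DEMO_RING_NONE, DEMO_RING_DEC, DEMO_RING_JPEG };

struct demo_decoder {
   /* Chosen once at creation time; end_frame just calls through it. */
   void (*send_cmd)(struct demo_decoder *dec);
   enum demo_ring last_ring;   /* records which path ran, for the demo */
};

static void send_cmd_dec(struct demo_decoder *dec)
{
   dec->last_ring = DEMO_RING_DEC;
}

static void send_cmd_jpeg(struct demo_decoder *dec)
{
   dec->last_ring = DEMO_RING_JPEG;
}

static void demo_decoder_init(struct demo_decoder *dec, bool is_jpeg)
{
   dec->send_cmd = is_jpeg ? send_cmd_jpeg : send_cmd_dec;
   dec->last_ring = DEMO_RING_NONE;
}

/* end_frame no longer knows which codec it serves. */
static void demo_end_frame(struct demo_decoder *dec)
{
   assert(dec != NULL && dec->send_cmd != NULL);
   dec->send_cmd(dec);
}
```

The benefit of this shape is that adding a JPEG path only touches the initialization code, not the end_frame logic.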
Boyuan Zhang [Wed, 17 Oct 2018 19:03:21 +0000 (15:03 -0400)]
radeon/vcn: create cs based on ring type
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:20 +0000 (15:03 -0400)]
radeon/winsys: add vcn jpeg ring type
Add a new ring type for vcn jpeg.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:19 +0000 (15:03 -0400)]
radeon/vcn: add vcn jpeg decode interface
Add VCN Jpeg decode interfaces and register defines.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:18 +0000 (15:03 -0400)]
radeon/vcn: move radeon decoder define to header file
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:17 +0000 (15:03 -0400)]
meson: update required amdgpu version to 2.4.95
VCN jpeg requires new hw ip
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:16 +0000 (15:03 -0400)]
configure.ac: update libdrm amdgpu version to 2.4.95
VCN jpeg requires new hw ip
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Samuel Pitoiset [Mon, 22 Oct 2018 13:42:31 +0000 (15:42 +0200)]
radv: fix btoi for R32G32B32 when the dest offset is not 0
Fixes:
593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Scott D Phillips [Mon, 24 Sep 2018 08:39:33 +0000 (11:39 +0300)]
i965/miptree: Use cpu tiling/detiling when mapping
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.
Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.
v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
v3: Add units to parameter names of tile_extents (Nanley Chery)
Use _mesa_align_malloc for the shadow copy (Nanley)
Continue using gtt maps on gen4 (Nanley)
v4: Use streaming_load_memcpy when detiling
v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
takes precedence. Add intel_miptree_access_raw, needed after
rebasing on commit
b499b85b0f2cc0c82b7c9af91502c2814fdc8e67.
v6: refactor to changes done for sse41 separation (Tapani)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Scott D Phillips [Mon, 24 Sep 2018 05:33:06 +0000 (08:33 +0300)]
i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
The reference for MOVNTDQA says:
For WC memory type, the nontemporal hint may be implemented by
loading a temporary internal buffer with the equivalent of an
aligned cache line without filling this data to the cache.
[...] Subsequent MOVNTDQA reads to unread portions of the WC
cache line will receive data from the temporary internal
buffer if data is available.
This hidden cache line sized temporary buffer can improve the
read performance from wc maps.
v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
v3: add Android build support (Tapani)
v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy'
separate sse41 to own static library (Tapani)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tapani Pälli [Wed, 19 Sep 2018 07:16:58 +0000 (10:16 +0300)]
i965: expose type of memcpy instead of memcpy function itself
There is currently no use of returned memcpy functions outside
intel_tiled_memcpy. Patch changes intel_get_memcpy to return memcpy
type instead of actual function. This makes it easier later to separate
streaming load copy into its own static library.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Engestrom [Tue, 16 Oct 2018 08:43:07 +0000 (09:43 +0100)]
util: use *unsigned* ints for bit operations
Fixes errors thrown by GCC's Undefined Behaviour sanitizer (ubsan) every
time this macro is used.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
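Why signed constants trip ubsan in bit macros, as a sketch: with a 32-bit int, `1 << 31` shifts into the sign bit, which is undefined behaviour in C, whereas the unsigned form is well defined. DEMO_BIT and demo_mask_range() are hypothetical stand-ins; the commit above does not name the macro it changed.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned constant: shifting by 0..31 is well defined.
 * The signed form `(1 << (b))` is undefined for b == 31. */
#define DEMO_BIT(b) (UINT32_C(1) << (b))

/* Mask of `count` consecutive bits starting at bit `lo`. A shift by 32
 * would itself be undefined, so the full-width case is handled apart. */
static uint32_t demo_mask_range(unsigned lo, unsigned count)
{
   assert(count > 0 && lo + count <= 32);
   uint32_t mask = (count == 32) ? UINT32_MAX : DEMO_BIT(count) - 1;
   return mask << lo;
}
```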
Eric Engestrom [Thu, 18 Oct 2018 14:51:47 +0000 (15:51 +0100)]
radv: s/abs/fabsf/ for floats
Fixes:
a4c4efad89eceb26cf82 "radv: Rework guard band calculation"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
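What goes wrong with abs() on floats, as a sketch with hypothetical helper names: abs() takes an int, so a float argument is converted first and its fractional part is lost. The explicit cast below spells out that implicit conversion.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Equivalent to calling abs() on a float: the value is truncated to int
 * before the absolute value is taken. */
static float magnitude_with_abs(float x)
{
   return (float)abs((int)x);
}

/* fabsf() operates on the float directly and keeps the fraction. */
static float magnitude_with_fabsf(float x)
{
   return fabsf(x);
}

/* Usage: a scale factor below 1.0 collapses to zero with abs(). */
static void demo_abs_vs_fabsf(void)
{
   assert(magnitude_with_abs(-0.75f) == 0.0f);
   assert(magnitude_with_fabsf(-0.75f) == 0.75f);
}
```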
Eric Engestrom [Thu, 11 Oct 2018 15:38:24 +0000 (16:38 +0100)]
meson: drop option description relic
`platforms` is no longer a comma-separated string, and some of our
option descriptions are way too long already. Just drop the incorrect
bit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Jason Ekstrand [Tue, 2 Oct 2018 03:16:59 +0000 (22:16 -0500)]
st/mesa: Record shader access qualifiers for images
They're not required to be the same as the access flag on the image
unit. For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.
v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
- Reduce both access and shader_access to uint16_t to avoid making
the pipe_image_view structure larger.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:33:36 +0000 (14:33 -0500)]
nir/algebraic: Provide descriptive asserts for bit size checks
This will hopefully make debugging opt_algebraic bit-size compile
failures easier.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:31:19 +0000 (14:31 -0500)]
nir/algebraic: Loosen a restriction on variables
Previously, we would fail if a variable had an assigned but unknown bit
size X and we tried to assign it an actual bit size. However, this is
ok because, at the time we do the search, the variable does have an
actual bit size and it will match X because of the NIR rules.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:03:24 +0000 (14:03 -0500)]
nir/algebraic: A bit of validation refactoring
We rename some local variables in validate() to be more readable and
plumb the var through to get/set_var_bit_class instead of the var index.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:01:31 +0000 (14:01 -0500)]
nir/algebraic: Make internal classes str-able
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 17:43:43 +0000 (12:43 -0500)]
nir/algebraic: Generalize an optimization
There's nothing boolean about (a | ~a) ~> -1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
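The identity behind the commit above holds for integers of any width, which is why the rule need not be restricted to booleans. A quick check in C, using unsigned types where the all-ones value corresponds to -1 in two's complement (the function names are only for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* a | ~a sets every bit, whatever a is. */
static uint8_t  or_not_u8(uint8_t a)   { return (uint8_t)(a | ~a); }
static uint32_t or_not_u32(uint32_t a) { return a | ~a; }
static uint64_t or_not_u64(uint64_t a) { return a | ~a; }

static void demo_or_not(void)
{
   assert(or_not_u8(0x5a) == UINT8_MAX);
   assert(or_not_u32(0) == UINT32_MAX);
   assert(or_not_u64(UINT64_C(0x123456789abcdef0)) == UINT64_MAX);
}
```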
Jason Ekstrand [Fri, 19 Oct 2018 03:31:08 +0000 (22:31 -0500)]
nir/algebraic: Use bool internally instead of bool32
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Kenneth Graunke [Mon, 22 Oct 2018 04:41:39 +0000 (21:41 -0700)]
intel: Fix decoding for partial STATE_BASE_ADDRESS updates.
STATE_BASE_ADDRESS only modifies various bases if the "modify" bit is
set. Otherwise, we want to keep the existing base address.
Iris uses this for updating Surface State Base Address while leaving the
others as-is.
v2: Also update aubinator_viewer_decoder (caught by Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
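The decoding rule from the STATE_BASE_ADDRESS commit above, sketched with hypothetical types (demo_bases, demo_base_field and apply_base are not the decoder's real structures): a base is only overwritten when its "modify" bit is set in the packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state; the real decoder tracks more bases. */
struct demo_bases {
   uint64_t surface_base;
   uint64_t dynamic_base;
   uint64_t instruction_base;
};

/* One decoded packet field: an address plus its modify-enable bit. */
struct demo_base_field {
   bool modify_enable;
   uint64_t address;
};

/* Keep the existing base unless the packet asks to modify it. */
static void apply_base(uint64_t *current, const struct demo_base_field *field)
{
   assert(current != NULL && field != NULL);
   if (field->modify_enable)
      *current = field->address;
}
```

A packet that sets only the surface base's modify bit then leaves dynamic_base and instruction_base at their previous values, which is the partial update the commit describes.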
Jason Ekstrand [Sat, 20 Oct 2018 14:10:02 +0000 (09:10 -0500)]
nir: Use nir_src_is_const and nir_src_as_* in core code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Jason Ekstrand [Sat, 20 Oct 2018 17:07:41 +0000 (12:07 -0500)]
nir/search_helpers: Use nir_src_is_const and friends
This not only makes them safe for more bit sizes but it also fixes a bug
in is_zero_to_one where it would return true for constant NaN.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
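One plausible way a range check ends up accepting NaN, as mentioned in the commit above: every ordered comparison with NaN is false, so a check written as "reject if below 0 or above 1" lets NaN through. The two functions below are hypothetical illustrations; the actual is_zero_to_one code may have looked different.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Rejection-style check: NaN fails both comparisons and is accepted. */
static bool in_zero_to_one_buggy(double v)
{
   if (v < 0.0 || v > 1.0)
      return false;
   return true;
}

/* Acceptance-style check: NaN fails both comparisons and is rejected. */
static bool in_zero_to_one(double v)
{
   return v >= 0.0 && v <= 1.0;
}

/* Usage: the two forms agree on ordinary values and differ only on NaN. */
static void demo_nan_check(void)
{
   assert(in_zero_to_one_buggy(NAN));
   assert(!in_zero_to_one(NAN));
}
```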
Jason Ekstrand [Sat, 20 Oct 2018 17:17:30 +0000 (12:17 -0500)]
nir/search: Use nir_src_is_const and friends
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Jason Ekstrand [Sat, 20 Oct 2018 13:36:21 +0000 (08:36 -0500)]
nir: Add some new helpers for working with const sources
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Alyssa Rosenzweig [Sun, 21 Oct 2018 18:29:37 +0000 (11:29 -0700)]
mesa/st: Only call nir_lower_io_to_scalar_early on scalar ISAs
On scalar ISAs, nir_lower_io_to_scalar_early enables significant
optimizations. However, on vector ISAs, it is counterproductive and
impedes optimal codegen. This patch only calls
nir_lower_io_to_scalar_early for scalar ISAs. It appears that at present
there are no upstreamed drivers using Gallium, NIR, and a vector ISA, so
for existing code, this should be a no-op. However, this patch is
necessary for the upcoming Panfrost (Midgard) and Lima (Utgard)
compilers, which are vector.
With this patch, Panfrost is able to consume NIR directly, rather than
TGSI with the TGSI->NIR conversion.
For how this affects Lima, see
https://www.mail-archive.com/mesa-dev@lists.freedesktop.org/msg189216.html
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Dylan Baker [Mon, 22 Oct 2018 14:26:44 +0000 (07:26 -0700)]
meson: don't require libelf for r600 without LLVM
r600 doesn't have a hard requirement on LLVM, and therefore doesn't have
a hard requirement on libelf. Currently the logic doesn't allow that
however.
Distro-bug: https://bugs.gentoo.org/669058
Fixes:
5060c51b6f4dfb0d5358bde6523285163d3faaad
("meson: build r600 driver")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Sat, 13 Oct 2018 13:46:20 +0000 (08:46 -0500)]
anv,radv: Trivially expose two new VK_GOOGLE extensions
This patch exposes support for the following two extensions:
* VK_GOOGLE_decorate_string
* VK_GOOGLE_hlsl_functionality1
There's nothing for the driver to do; it's all handled in spirv_to_nir.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Jason Ekstrand [Sat, 13 Oct 2018 13:41:36 +0000 (08:41 -0500)]
spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1
This extension adds two new decorations which carry meaning only for
HLSL shaders. They are expected to be handled by higher level layers
and can be ignored by implementations. However, it does save the client
a bit of work if the implementation safely ignores them instead of the
client having to strip them out of the SPIR-V in order for it to be
valid.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Jason Ekstrand [Sat, 13 Oct 2018 13:33:22 +0000 (08:33 -0500)]
spirv: Add support for SPV_GOOGLE_decorate_string
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rob Herring [Tue, 24 Jul 2018 09:09:39 +0000 (11:09 +0200)]
android: Build kms_swrast for the Android platform
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Connor Abbott [Thu, 18 Oct 2018 13:39:13 +0000 (15:39 +0200)]
ac: Fix loading a dvec3 from an SSBO
The comment was wrong, since the loop above casts to a type with the
correct bitsize already.
Fixes:
7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Connor Abbott [Thu, 18 Oct 2018 13:30:11 +0000 (15:30 +0200)]
ac: Introduce ac_build_expand()
And implement ac_build_expand_to_vec4() on top of it.
Fixes:
7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eduardo Lima Mitev [Sun, 21 Oct 2018 18:48:41 +0000 (20:48 +0200)]
ir3/nir: Set up image_dims consts for image_deref_size intrinsic too
`nir_intrinsic_image_deref_size` is not being considered during scan for
driver constants, so image constants are not emitted if a shader
only ever queries the size of an image (no load, store, atomic op, etc).
This is unlikely, but possible.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Karol Herbst [Fri, 19 Oct 2018 17:26:39 +0000 (19:26 +0200)]
nv50/ir: fix ConstantFolding::createMul for 64 bit muls
Fixes:
2f52925f5c60c72c9389bfdc122c3d5f8e15b25f
"nv50/ir: move a * b -> a << log2(b) code into createMul()"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Sonny Jiang [Fri, 19 Oct 2018 20:16:41 +0000 (16:16 -0400)]
radeonsi: Disable clear_state with radeon kernel driver
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Kenneth Graunke [Tue, 30 Jan 2018 09:32:07 +0000 (01:32 -0800)]
meson: Add -Werror=return-type when supported.
This warning detects non-void functions with a missing return statement,
return statements with a value in void functions, and functions with a
bogus return type that ends up defaulting to int. It's already enabled
by default with -Wall. Generally, these are fairly serious bugs in the
code, which developers would like to notice and fix immediately. This
patch promotes it from a warning to an error, to help developers catch
such mistakes early.
I would not expect this warning to change much based on the compiler
version, so hopefully it won't become a problem for packagers/builders.
See the GCC documentation or 'man gcc' for more details:
https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/Warning-Options.html#index-Wreturn-type
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Mon, 15 Oct 2018 03:20:17 +0000 (22:20 -0500)]
anv: Define trampolines as the weak functions
Instead of having weak references to the anv functions and separate
trampoline functions with their own dispatch table, just make the
trampoline functions weak. This gets rid of a dispatch table and
potentially lets the compiler delete the unused weak function. The
end result is a reduction in the .text section of 5.7K and a reduction
in the .data section of 1.4K.
Before:
text data bss dec hex filename
3190329 282232 8960 3481521 351fb1 _install/lib64/libvulkan_intel.so
After:
text data bss dec hex filename
3184548 280792 8960 3474300 35037c _install/lib64/libvulkan_intel.so
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:47:45 +0000 (18:47 +0200)]
docs: fix typo in 18.2.3 release notes link
Fixes:
86b4bd52dc ("docs: update calendar, add news item and link
release notes for 18.2.3")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:45:41 +0000 (18:45 +0200)]
docs: update calendar, add news item and link release notes for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:43:26 +0000 (18:43 +0200)]
docs: add sha256 checksums for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit
27fd12857b53ec22c0e918eee6c4c009643fccbc)
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:02:51 +0000 (18:02 +0200)]
docs: add release notes for 18.2.3
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit
d219361b4226944835959676d1721b2a9d29da72)
Jose Fonseca [Thu, 18 Oct 2018 14:04:49 +0000 (15:04 +0100)]
scons: Remove gles option.
It's broken, and WGL state tracker is always built with GLES support
nowadays.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Bas Nieuwenhuizen [Fri, 19 Oct 2018 09:51:47 +0000 (11:51 +0200)]
radv: Fix WSI & PCI bus info initialization order.
Trying to access the bus info before it is initialized is not going
to work.
Fixes:
baa38c144f6 "vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108491
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Marek Olšák [Thu, 18 Oct 2018 22:01:00 +0000 (18:01 -0400)]
radeonsi: fix a typo in a comment in emit_guardband
Marek Olšák [Thu, 18 Oct 2018 21:54:24 +0000 (17:54 -0400)]
radeonsi: fix gnome-shell crash
I wasn't expecting to get viewports with the center having
negative coordinates.
Broken by:
6cc79e4411f
Jason Ekstrand [Mon, 15 Oct 2018 02:56:47 +0000 (21:56 -0500)]
Revert "anv: Stop generating weak references for instance entrypoints"
This reverts commit
00bb42105d6edf6e432c0e3712ffb9d3eb0aece4. It was
not as well thought out as I had intended and broke the build when
VK_KHR_display is disabled in the build.
Marek Olšák [Wed, 17 Oct 2018 16:26:54 +0000 (12:26 -0400)]
radeonsi: clamp point size to the limit
This fixes dEQP-GLES2.functional.rasterization.limits.points.
Broken by:
ea039f789d9b54e1bd1d644b6a29863ca3500314
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
Marek Olšák [Tue, 16 Oct 2018 19:10:01 +0000 (15:10 -0400)]
radeonsi: fix a VGT hang with primitive restart on Polaris10 and later
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
Marek Olšák [Wed, 17 Oct 2018 16:41:38 +0000 (12:41 -0400)]
radeonsi: fix a deadlock due to partially-initialized context on CI
Jan Vesely [Thu, 18 Oct 2018 19:15:06 +0000 (15:15 -0400)]
radeonsi: Bump number of allowed global buffers to 32
Fixes assertion failure/crash when running luxmark/luxball on clover.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108272
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Andres Rodriguez [Thu, 18 Oct 2018 19:32:31 +0000 (15:32 -0400)]
radv: fix check for perftest options size
It was using the debug options array size.
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>