Tim Rowley [Mon, 7 Aug 2017 23:13:54 +0000 (18:13 -0500)]
swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble
For consistency and to support overloading.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 4 Aug 2017 23:07:01 +0000 (18:07 -0500)]
swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 1 Aug 2017 20:21:04 +0000 (15:21 -0500)]
swr/rast: Removed some trailing whitespace caught during review
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 18 Aug 2017 16:51:59 +0000 (11:51 -0500)]
swr: set caps for VB 4-byte alignment
Needed to compensate for change to fetch jit requiring
alignment.
Fixes regressions in piglit: vertex-buffer-offsets and about
another hundred of the vs-input*byte* tests.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 9 Aug 2017 22:32:28 +0000 (17:32 -0500)]
swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Samuel Pitoiset [Wed, 6 Sep 2017 13:24:49 +0000 (15:24 +0200)]
radv: fix error code when resizing the upload BO
malloc() failures are unrelated to the device memory.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Gert Wollny [Wed, 6 Sep 2017 12:21:25 +0000 (14:21 +0200)]
mesa/st/st_glsl_to_tgsi_temprename.cpp: Fix compilation with MSVC
If <windows.h> is included then max is a macro that clashes
with std::numeric_limits::max, hence undefine it.
For some reason the struct access_record is not recognizes
outside the anonymouse namespace, make it a class.
The patch successfully was tested on AppVeyor.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gert Wollny [Fri, 30 Jun 2017 06:55:17 +0000 (08:55 +0200)]
mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
This patch replaces the old register lifetime estiamtion and
rename mapping evaluation with the new one.
Performance to compare between the current and the new implementation
were measured by running the shader-db in one thread.
-----------------------------------------------------------
old new(std::sort)
---------------- time ./run -j1 shaders --------------------
real 5.80s 5.75s
user 5.75s 5.70s
sys 0.05s 0.05s
---- valgrind --tool=callgrind --dump-instr=yes------------
merge 0.08% 0.18%
estimate lifetime 0.02% 0.11%
evaluate mapping (incl=0.3%) 0.04%
apply mapping 0.03% 0.02%
--- perf (approximate because of statistic sampling) ----
merge (total) 0.09% 0.16%
estimate lifetime 0.03% 0.10%
evaluate mapping (incl=0.02%) 0.04%
apply mapping 0.04% 0.04%
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gert Wollny [Fri, 30 Jun 2017 06:49:41 +0000 (08:49 +0200)]
mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
The patch adds tests for the register rename mapping evaluation and
combined life time estimation and renaming.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gert Wollny [Fri, 30 Jun 2017 06:45:48 +0000 (08:45 +0200)]
mesa/st: glsl_to_tgsi: add register rename mapping evaluator
The remapping evaluator first sorts the temporary registers ascending
based on their first life time instruction, and then uses a binary search
to find merge canidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.
Registers that are not written to are not considered for renaming since in
glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gert Wollny [Fri, 30 Jun 2017 06:37:36 +0000 (08:37 +0200)]
mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
This patch adds a set of unit tests for the new lifetime tracker.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gert Wollny [Fri, 30 Jun 2017 06:35:06 +0000 (08:35 +0200)]
mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations the
program is scanned to evaluate the number of scopes.
Then, the program is scanned second time to record the important register
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step for each register the actual minimal life time is
evaluated.
In addition, when compiled in debug mode (i.e. NDEBUG is not defined)
the shaders and estimated temporary life times can be logged to stderr
by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gert Wollny [Wed, 21 Jun 2017 08:05:23 +0000 (10:05 +0200)]
mesa/st: glsl_to_tgsi move some helper classes to extra files
To prepare the implementation of a temp register lifetime tracker
some of the classes are moved into seperate header/implementation
files to make them accessible from other files.
Specifically these are:
class st_src_reg;
class st_dst_reg;
class glsl_to_tgsi_instruction;
struct rename_reg_pair;
int swizzle_for_type(const glsl_type *type, int component);
as inline:
bool is_resource_instruction(unsigned opcode);
unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Dave Airlie [Tue, 30 May 2017 05:52:13 +0000 (15:52 +1000)]
st_glsl_to_tgsi: rewrite rename registers to use array fully.
Instead of having to search the whole array, just use the whole
thing and store a valid bit in there with the rename.
Removes this from the profile on some of the fp64 tests
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Tue, 29 Aug 2017 21:11:38 +0000 (23:11 +0200)]
radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
When the HS wave is empty, the hardware writes the LS VGPRs starting at
v0 instead of v2. Workaround by shifting them back into place when
necessary. For simplicity, this is always done in the LS prolog.
According to the hardware team, this will be fixed in future chips,
so take that into account already.
Note that this is not a bug fix, as the bug was already worked
around by commit
166823bfd26 ("radeonsi/gfx9: add a temporary workaround
for a tessellation driver bug"). This change merely replaces the
workaround by one that should be better.
v2: add workaround code to shader only when necessary
v3: clarify the prefer_mono comment
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 4 Sep 2017 09:05:13 +0000 (11:05 +0200)]
ac/debug: take ASIC generation into account when printing registers
There were some overlapping changes in gfx9 especially in the CB/DB
blocks which made register dumps rather misleading.
The split is along the lines of the header files, so we'll print VI-only
fields on SI and CI, for example, but we won't print GFX9 fields on
SI/CI/VI, and we won't print SI/CI/VI fields on GFX9.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 4 Sep 2017 08:35:06 +0000 (10:35 +0200)]
amd/common: pass chip_class to ac_dump_reg
Acked-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 4 Sep 2017 08:02:36 +0000 (10:02 +0200)]
ac/sid_tables: add FieldTable object
Automatically re-use table entries like StringTable and IntTable do.
This allows us to get rid of the "fields_owner" logic, and simplifies
the next change.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 4 Sep 2017 07:24:04 +0000 (09:24 +0200)]
ac/sid_tables: remove unused variable varname_values
Acked-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 4 Sep 2017 09:09:46 +0000 (11:09 +0200)]
radeonsi/gfx9: always flush DB metadata on framebuffer changes
This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sat, 26 Aug 2017 01:06:09 +0000 (03:06 +0200)]
util/ralloc: set prev-pointers correctly in ralloc_adopt
Found by inspection.
I'm not aware of any actual failures caused by this, but a precise
sequence of ralloc_adopt and ralloc_free should be able to cause
problems.
v2: make the code slightly clearer (Eric)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Iago Toral Quiroga [Tue, 5 Sep 2017 11:06:37 +0000 (13:06 +0200)]
mesa/main: Fix GetTextureImage error reporting
GetTex*Image should return INVALID_ENUM if target is not valid, however,
GetTextureImage does not receive a target, and instead should return
INVALID_OPERATION if the effective target is not valid. From the
OpenGL 4.6 core profile spec, section 8.11 Texture Queries:
"An INVALID_OPERATION error is generated by GetTextureImage if the effective
target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D, TEXTURE_1D_ARRAY,
TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, TEXTURE_RECTANGLE, or
TEXTURE_CUBE_MAP (for GetTextureImage only)."
Fixes:
KHR-GL45.direct_state_access.textures_image_query_errors
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tapani Pälli [Tue, 5 Sep 2017 11:48:56 +0000 (14:48 +0300)]
egl: remove unused 'Screens' array from _egl_display
This was used by EGL_MESA_screen_surface that has been removed
in commit
7a58262e58d8edac3308777def0950032628edee.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <eml.velikov@collabora.com>
Dave Airlie [Mon, 21 Aug 2017 03:25:59 +0000 (04:25 +0100)]
Revert "radv: disable support for VEGA for now."
This reverts commit
611076a41aac3095a82dff2432943d7f8d429822.
With the two previous commits, vega shouldn't be unstable,
doesn't pass CTS, but can do a complete run, and games shouldn't
hang anymore, so bring it back online.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 21 Aug 2017 20:08:10 +0000 (21:08 +0100)]
radv/gfx9: set descriptor up for base_mip to level range.
This is required on GFX9, fixes a bug in Talos where all the
mipmaps overlay each other.
Just pushing this as well as it fixes Talos.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 22 Aug 2017 02:47:09 +0000 (12:47 +1000)]
radv: disable 1d/2d linear optimisation on gfx9.
This causes hangs in some of the CTS tests with a 2d
1536x2 texture.
This fixes hangs with:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.iew_type.1d_aray.format.r4g4b4a4_unorm_pack16.count_1.size.512x1_array_of_3
if we reenable it, make sure these don't regress.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 25 Aug 2017 00:15:32 +0000 (01:15 +0100)]
radv/gfx9: fix buffer size on gfx9.
The VI sizing only applies to VI.
This fixes:
dEQP-VK.image.image_size.buffer.*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Tue, 5 Sep 2017 22:28:22 +0000 (00:28 +0200)]
radv: Fix vkCopyImage with both depth and stencil aspects.
Fixes:
f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 11:20:35 +0000 (21:20 +1000)]
mesa/mtypes: repack gl_sampler_object.
160->152.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 11:17:47 +0000 (21:17 +1000)]
mesa/mtypes: repack gl_texture_object.
reduces size from 1144 to 1128.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 11:12:17 +0000 (21:12 +1000)]
mesa/mtypes: repack gl_shader_program_data.
This reduces the size from 144 bytes to 128 bytes.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 11:08:34 +0000 (21:08 +1000)]
mesa/mtypes: reorganise gl_shader
This reduces this from 200->182 bytes.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 10:37:01 +0000 (20:37 +1000)]
mesa/mtypes: repack display list structs.
This reduces each of these by 8 bytes.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 10:35:19 +0000 (20:35 +1000)]
mesa/mtypes: reduce size of gl_sync_object.
Drops from 40->32 bytes.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 10:08:18 +0000 (20:08 +1000)]
mesa/mtypes: reorg vertex/fragment program state.
reduces both of these by 8 bytes.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 09:53:07 +0000 (19:53 +1000)]
mesa/bindless: reorder gl_bindless_image gl_bindless_sampler.
This makes these use 16-bytes instead of 24-bytes.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 1 Sep 2017 12:07:43 +0000 (14:07 +0200)]
radv: fix a memleak when compiling the GS copy shader
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Charmaine Lee [Fri, 1 Sep 2017 19:34:01 +0000 (12:34 -0700)]
svga: move index buffer bind flag assertion
The buffer bind flags can be promoted in svga_buffer_handle(), so
move the assertion after it. This has already been done for
vertex buffer in commit
6b4bf7e8be, but it misses the one for
index buffer.
Fixes assertion running WarThunder.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Charmaine Lee [Tue, 1 Aug 2017 22:58:50 +0000 (15:58 -0700)]
svga: avoid emitting redundant SetShaderResources and SetVertexBuffers
Minor performance improvement in avoiding binding the same shader resource
or the same vertex buffer for the same slot.
Tested with MTT glretrace.
v2: Per Brian's suggestion, add a helper function to do vertex buffer
comparision.
v3: Change the helper function to vertex_buffers_equal().
Reviewed-by: Brian Paul <brianp@vmware.com>
Jason Ekstrand [Tue, 22 Aug 2017 05:11:49 +0000 (22:11 -0700)]
spirv: Add support for the HelperInvocation builtin
I have no idea how this got missed but it's been missing since forever.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Thomas Hellstrom [Mon, 4 Sep 2017 12:05:25 +0000 (14:05 +0200)]
loader/dri3: Use client local back to front blit in copySubBuffer if available
The copySubBuffer functionality always attempted a server side blit from
back to fake front if a fake front was present, and we weren't displaying
on a remote GPU.
Now that we always have local blit capability on modern drivers, first
attempt a local blit, and only if that fails, try the server blit.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Axel Davy <axel.davy@normalesup.org>
Marek Olšák [Tue, 29 Aug 2017 01:58:22 +0000 (03:58 +0200)]
radeonsi/gfx9: implement primitive binning
This increases performance, but it was tuned for Raven, not Vega.
We don't know yet how Vega will perform, hopefully not worse.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 29 Aug 2017 18:55:11 +0000 (20:55 +0200)]
radeonsi: add more state flags into si_state_dsa
3 flags for primitive binning, 2 flags for out-of-order rasterization
(but that will be done some other time)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 28 Aug 2017 21:44:16 +0000 (23:44 +0200)]
radeonsi/gfx9: don't use BREAK_BATCH and FLUSH_DFSM if DFSM is disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tapani Pälli [Mon, 4 Sep 2017 05:02:13 +0000 (08:02 +0300)]
vbo: fix build errors on android
incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int')
from 'const char *' [-Werror,-Wint-conversion]
offset = indices;
^ ~~~~~~~
Fixes:
2d93b462b4d ("vbo: fix offset in minmax cache key")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 4 Sep 2017 17:26:34 +0000 (18:26 +0100)]
docs: add news item and link release notes for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 4 Sep 2017 17:24:29 +0000 (18:24 +0100)]
docs: add sha256 checksums for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
b4473dd5191878249ccb53f40407206f1e57fa6f)
Emil Velikov [Mon, 4 Sep 2017 17:16:16 +0000 (18:16 +0100)]
docs: Update 17.2.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
f5925b2897308530c64e1abf44ebc1ee0e017ada)
Marek Olšák [Tue, 29 Aug 2017 01:41:18 +0000 (03:41 +0200)]
radeonsi: eliminate PS color outputs when colormask kills them
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 29 Aug 2017 11:14:34 +0000 (13:14 +0200)]
gallium/radeon: sort DBG shader flags according to pipe_shader_type
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Nicolai Hähnle [Fri, 25 Aug 2017 07:04:40 +0000 (09:04 +0200)]
radeonsi: ensure cache flushes happen before SET_PREDICATION packets
The data is read when the render_cond_atom is emitted, so we must
delay emitting the atom until after the flush.
Fixes:
0fe0320dc074 ("radeonsi: use optimal packet order when doing a pipeline sync")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 25 Aug 2017 14:19:56 +0000 (16:19 +0200)]
radeonsi: fix ARB_transform_feedback_overflow_query on <= VI
The result written by the shader workaround needs to be written back, or
the CP may read stale data.
Fixes:
78476cfe071a ("radeonsi: enable ARB_transform_feedback_overflow_query")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 25 Aug 2017 14:08:50 +0000 (16:08 +0200)]
radeonsi: fix compute shader state dumping
Fixes:
420c438589c8 ("radeonsi: log draw and compute state into log context")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:14:32 +0000 (18:14 +0200)]
radeonsi: add an assertion that only two-dimensional constant references are used
v2: remove some redundant checks
Acked-by: Roland Scheidegger <sroland@vmware.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:13:57 +0000 (18:13 +0200)]
gallium/radeon: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:13:40 +0000 (18:13 +0200)]
gallium/tests: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:13:30 +0000 (18:13 +0200)]
pp: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:13:15 +0000 (18:13 +0200)]
gallium/hud: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:04:05 +0000 (18:04 +0200)]
nine: always generate two-dimensional constant file accesses
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 16:07:58 +0000 (18:07 +0200)]
st/glsl_to_tgsi: inline src_register into translate_src
src_register has no meaningful standalone use, it only makes sense when
called from translate_src.
v2: fix input array handling
Acked-by: Roland Scheidegger <sroland@vmware.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 15:09:09 +0000 (17:09 +0200)]
st/glsl_to_tgsi: ir_load_ubo always has a second index
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Sat, 26 Aug 2017 22:42:11 +0000 (00:42 +0200)]
st/drawpixels: always use two-dimensional constant references
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Sat, 26 Aug 2017 22:42:53 +0000 (00:42 +0200)]
tgsi/build: always generate two-dimensional constant file accesses
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 15:48:27 +0000 (17:48 +0200)]
tgsi/ureg: always emit constants (and their decls) as 2D
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Nicolai Hähnle [Wed, 23 Aug 2017 15:41:27 +0000 (17:41 +0200)]
gallium: all drivers should accept two-dimensional constant buffer indexing
Most older drivers seem to just ignore the Dimension setting, so virtually
no changes should be needed.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Eric Engestrom [Sun, 3 Sep 2017 18:22:06 +0000 (19:22 +0100)]
anv: fix off by one in array check
`anv_formats[ARRAY_SIZE(anv_formats)]` is already one too far.
Spotted by Coverity.
CovID: 1417259
Fixes:
242211933a0682696170 "anv/formats: Nicely handle unknown VkFormat enums"
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Dave Airlie [Sun, 3 Sep 2017 08:57:49 +0000 (18:57 +1000)]
ac: reorg ac_shader_binary struct to take less space.
This reduces the size from 96 to 80 bytes but putting all the
32-bit sizes at the start.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 3 Sep 2017 08:37:38 +0000 (18:37 +1000)]
radv: drop emit2d_dst_type.
This is completely unused now.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuien.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Xavier Bouchoux [Thu, 31 Aug 2017 08:12:52 +0000 (10:12 +0200)]
radv/meta: missing initialisations in create_pass().
Otherwise radv_cmd_state_setup_attachments() will complain it has no clearvalues,
when called via radv_process_depth_image_inplace().
v2: use LOAD/STORE instead of DONT_CARE, to preserve stencil values.
Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Sat, 2 Sep 2017 19:47:11 +0000 (21:47 +0200)]
radv: Enable command buffer chaining by default.
For approx 5-10% performance improvement in dota2.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sat, 2 Sep 2017 10:59:55 +0000 (12:59 +0200)]
radv: Put semaphore waits in preamble cs.
The separate flush cs gets in the way of batchchain.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 3 Sep 2017 21:35:37 +0000 (23:35 +0200)]
radv: Actually set the cmd_buffer usage_flags.
Otherwise, the simultaneous uage bit doesn't get set from the begin
info, which we need for batchchaining.
Fixes:
f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Eric Engestrom [Thu, 31 Aug 2017 16:55:56 +0000 (16:55 +0000)]
util: improve compiler guard
Glibc 2.26 has dropped xlocale.h, but the functions needed (strtod_l()
and strdof_l()) can be found in stdlib.h.
Improve the detection method to allow newer builds to still make use of
the locale-setting.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102454
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Leo Liu [Fri, 25 Aug 2017 17:17:41 +0000 (13:17 -0400)]
radeon/uvd: add Define Restart Interval to MJPEG bitstream reconstruction
It adds the capacity to decode MJPEG stream with DRI marker
Signed-off-by: Leo Liu <leo.liu@amd.com>
Leo Liu [Fri, 25 Aug 2017 17:17:40 +0000 (13:17 -0400)]
radeon/uvd: fix MJPEG quantization table index
Fixes:
130d1f456b8 ("radeon/uvd: reconstruct MJPEG bitstream")
Signed-off-by: Leo Liu <leo.liu@amd.com>
Roland Scheidegger [Thu, 31 Aug 2017 23:48:42 +0000 (01:48 +0200)]
st/mesa: fix view template initialization in try_pbo_readpixels
I think this is what the code was meant to do, albeit as far as I can tell
the redundant initialization some analyzers complain about should work as
well just fine (only the first layer will be used, if the view contains one
or more layers doesn't really matter).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102467
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Kenneth Graunke [Thu, 31 Aug 2017 22:45:57 +0000 (15:45 -0700)]
genxml: Make Border Color Pointer an address on Gen4-5, not an offset.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Thu, 31 Aug 2017 20:12:44 +0000 (13:12 -0700)]
i965: Inline emit_reloc in __genx_combine_address
One less layer of baklava.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Fri, 1 Sep 2017 06:54:20 +0000 (23:54 -0700)]
i965: Fix crash in fallback GTT mapping.
We can't perf_debug without a context.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Kenneth Graunke [Tue, 22 Aug 2017 22:12:55 +0000 (15:12 -0700)]
i965: Fix state flagging of Gen6 SOL programs.
It doesn't seem like the old code could possibly work.
1. brw_gs_state_dirty made us bail unless one of these flags were set:
_NEW_TEXTURE, BRW_NEW_GEOMETRY_PROGRAM, BRW_NEW_TRANSFORM_FEEDBACK
2. If there was no geometry program, we called brw_upload_ff_gs_prog()3
3. That checked brw_ff_gs_state_dirty and bailed unless these were set:
_NEW_LIGHT, BRW_NEW_PRIMITIVE, BRW_NEW_TRANSFORM_FEEDBACK,
BRW_NEW_VS_PROG_DATA.
4. brw_ff_gs_prog_key pv_first and attr fields were set based on data
depending on _NEW_LIGHT and BRW_NEW_VS_PROG_DATA.
This means that if we needed a FF GS program, and changed the VS
outputs or provoking vertex mode, we'd fail to notice that we needed
to emit a new program.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Tue, 22 Aug 2017 22:18:32 +0000 (15:18 -0700)]
i965: Drop useless gen6_brw_upload_ff_gs_prog() wrapper.
gen6...brw? Drop some baklava layers.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Rob Clark [Sat, 26 Aug 2017 15:55:05 +0000 (11:55 -0400)]
freedreno: skip batch-cache for compute shaders
It is kind of pointless for compute, and avoids issues with apps kicking
off more than 32 compute shaders at once.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Vinson Lee [Thu, 31 Aug 2017 22:38:57 +0000 (15:38 -0700)]
m4: Use older autoconf 2.63 compatible ax_check_compile_flag.
CentOS 6 and RHEL 6 have autoconf 2.63.
Fixes:
e4b2b69e828c ("configure: Add and use AX_CHECK_COMPILE_FLAG")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Wed, 30 Aug 2017 08:16:23 +0000 (01:16 -0700)]
i965: Move BATCH_SZ define into intel_batchbuffer.c.
It's only used in one file.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Wed, 30 Aug 2017 08:13:38 +0000 (01:13 -0700)]
i965: Drop batch_size argument from brw_bufmgr_init().
This is dead code and hasn't been used in a long time.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Wed, 30 Aug 2017 07:02:10 +0000 (00:02 -0700)]
i965: Rename brw_bo::offset64 to gtt_offset.
We can drop the meaningless "64" suffix - libdrm_intel originally had
an "offset" field that was an "unsigned long" which was the wrong size,
and we couldn't remove/alter that field without breaking ABI, so we had
to add a uint64_t "offset64" field.
"gtt_offset" is also more descriptive than "offset".
(Patch originally written by Ken, but Chris suggested a better name and
supplied the giant comment making up the bulk of the patch, so I changed
the authorship to him.)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Wed, 30 Aug 2017 08:43:34 +0000 (01:43 -0700)]
i965: Drop the BRW_BATCH_STRUCT macro.
It's used in exactly one place these days, and not much simpler than
just calling intel_batchbuffer_data directly.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Wed, 30 Aug 2017 08:02:30 +0000 (01:02 -0700)]
i965: Don't double count the batch in aperture_space.
intel_batchbuffer_reset calls add_exec_bo on the batch right away,
which adds in the batch BO size.
Fixes:
29ba502a4e28 ("i965: Use I915_EXEC_BATCH_FIRST when available.")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cherniak, Bruce [Fri, 25 Aug 2017 19:59:13 +0000 (14:59 -0500)]
swr: Report format max_samples=1 to maintain support for "fake" msaa.
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.
This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.
Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.
This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Eric Engestrom [Fri, 1 Sep 2017 15:55:55 +0000 (16:55 +0100)]
aubinator: remove duplicate initialisation
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Samuel Pitoiset [Thu, 31 Aug 2017 09:44:00 +0000 (11:44 +0200)]
radv: report VM faults if detected
It's fairly simple for now, but this might be quite useful.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 31 Aug 2017 09:43:59 +0000 (11:43 +0200)]
radeonsi: move si_vm_fault_occured() to AMD common code
For radv, in order to report VM faults when detected.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 1 Sep 2017 07:44:45 +0000 (09:44 +0200)]
radv: add radv_check_gpu_hangs() helper function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 30 Aug 2017 13:12:21 +0000 (15:12 +0200)]
radv: disassemble SPIR-V binaries with RADV_DEBUG=spirv
This introduces a new separate option because the output can
be quite verbose. If spirv-dis is not found in the path, this
debug option is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 30 Aug 2017 13:12:20 +0000 (15:12 +0200)]
radv: move RADV_TRACE_FILE functions to radv_debug.c
At the moment, debugging radv is not really easy because the
driver doesn't report enough information when it hangs. This
new file will be the main location for all debug tools.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 28 Aug 2017 10:17:11 +0000 (12:17 +0200)]
radv: silent a compiler warning in radv_emit_framebuffer_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 30 Aug 2017 13:12:03 +0000 (15:12 +0200)]
radv: compute correct maximum wave count per SIMD
Ported from RadeonSI (original patch by Marek).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Brian Paul [Tue, 22 Aug 2017 02:53:25 +0000 (20:53 -0600)]
st/mesa: only try to create 1x msaa surfaces for "fake" msaa drivers
For software drivers where we want "fake" msaa support for GL 3.x, we
treat 1 sample as being msaa.
For drivers with real msaa support, start format probing at 2x msaa.
For drivers with fake msaa support, start format probing at 1x msaa.
This also tweaks the MaxSamples code in st_init_extensions() so that
we use MaxSamples=1 for fake msaa. This allows the format proble loops
to run at least one iteration.
This fixes a llvmpipe/VTK regression from commit
6839d3369905eb02151.
And for drivers with fake msaa support, calls such as
glTexImage2DMultisample(samples=1) will now succeed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102125
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tobias Klausmann [Sun, 13 Aug 2017 01:02:39 +0000 (03:02 +0200)]
nvc0/ir: propagate immediates to CALL input MOVs
On using builtin functions we have to move the input to registers $0 and $1, if
one of the input value is an immediate, we fail to propagate the immediate:
...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd remove in that case:
...
mov u32 $r0 %r473 (0)
mov u32 $r1 0x00000003 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...
Shaderdb stats:
total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
total gprs used in shared programs : 582972 -> 582881 (-0.02%)
total local used in shared programs : 17960 -> 17960 (0.00%)
local gpr inst bytes
helped 0 91 112 112
hurt 0 0 0 0
v2:
implement some changes proposed by imirkin, the manual deletion of the dead
mov is necessary after
ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call
to division function") as the potentially dead mov is unlinked properly,
causing later passes to not notice the mov op at all and thus not cleaning it
up. That makes up a big chunk of the regression the above commit caused.
Keep the deletion of the op where it is, deleting it later unnecessarily blows
up size of the change.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sun, 27 Aug 2017 16:00:52 +0000 (18:00 +0200)]
nvc0: write 0 to pipeline_statistics.cs_invocations
cs_invocations are currently unsupported, but leaving the field uninitialized
is even worse.
fixes on nvc0:
* KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
* KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org