Ilia Mirkin [Wed, 2 Jul 2014 16:12:51 +0000 (12:12 -0400)]
glsl: add support for AMD_vertex_shader_viewport_index
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Tested-by: Tobias Droste <tdroste@gmx.de>
Ilia Mirkin [Wed, 2 Jul 2014 16:12:28 +0000 (12:12 -0400)]
mesa: add support for AMD_vertex_shader_viewport_index
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Tested-by: Tobias Droste <tdroste@gmx.de>
Ilia Mirkin [Mon, 23 Jun 2014 13:32:59 +0000 (09:32 -0400)]
mesa/st: enable ARB_fragment_layer_viewport
If multiple viewports are supported, that implies the presence of a GS
and layered rendering, so we can enable ARB_fragment_layer_viewport as
well.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Eric Anholt [Wed, 21 May 2014 21:31:31 +0000 (14:31 -0700)]
i965/gen6: Add a spec citation about push constant packet requirements.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 21 May 2014 21:09:25 +0000 (14:09 -0700)]
i965: Add a comment about null renderbuffer surfaces and why they exist.
I noticed this when trying to find comments about pull constant buffers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 15:51:12 +0000 (08:51 -0700)]
i965: Update a ton of comments about constant buffers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 21 May 2014 21:22:47 +0000 (14:22 -0700)]
i965: Merge VS/GS and WM pull constant buffer upload paths.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 17:06:15 +0000 (10:06 -0700)]
i965/gen6+: Merge VS/GS and WM push constant buffer upload paths.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 17:10:01 +0000 (10:10 -0700)]
i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data.
I wanted to access this value from stage-generic code, so stop storing it
under two different names.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 16:37:00 +0000 (09:37 -0700)]
i965: Fix state flags for gen4/5 CURBE.
If we had some NOS affecting VS compilation that resulted in optimization
changing the set of constants to be uploaded, we might not have reuploaded
the constants.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 04:47:29 +0000 (21:47 -0700)]
i965: Remove a dead define.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 04:24:47 +0000 (21:24 -0700)]
i965: Reuse libdrm's header for AUB definitions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 03:48:16 +0000 (20:48 -0700)]
i965: Fix stale comments about the state cache.
This changed in the state streaming work years ago.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Mon, 19 May 2014 03:44:02 +0000 (20:44 -0700)]
i965: Fix stale binding table comment.
I recently moved the code from the mentioned location right into this
file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 20 May 2014 18:54:26 +0000 (11:54 -0700)]
i965: Drop the memcmp for finding duplicated CURBE uploads.
At this point, the extra copy of the data and memcmp are as expensive as
just re-uploading.
Note: now that we'll always upload, and brw_constant_buffer watches
BRW_NEW_BATCH anyway, we don't need to explicitly unref the old curbe_bo
at batch reset time.
No significant performance difference on glamor copywinwin10 (n=55),
despite that test having a 98% hit rate on the cache.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Tue, 20 May 2014 18:09:57 +0000 (11:09 -0700)]
i965: Reuse intel_upload.c for gen4/5 constant buffers.
No performance difference on glamor with copywinwin10 (n=40) on my gm45.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tom Stellard [Tue, 17 Jun 2014 15:52:34 +0000 (08:52 -0700)]
gallium: Add PIPE_SHADER_CAP_DOUBLES
This is for reporting whether or not double precision floating-point
operations are supported.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Matt Arsenault [Tue, 10 Jun 2014 05:21:52 +0000 (22:21 -0700)]
clover: Fix not setting build log if the build succeeds v2
If there were only warnings, they would not be added to the log.
v2:
- Use compat::string.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Francisco Jerez [Sat, 21 Jun 2014 16:06:07 +0000 (18:06 +0200)]
clover: Have compat::string allocate its own memory.
Tom Stellard [Wed, 18 Jun 2014 20:58:33 +0000 (16:58 -0400)]
gallium/radeon: Only print a message for LLVM diagnostic errors
We were printing messages for all diagnostic types, which was
spamming the console for some OpenCL programs.
Tom Stellard [Tue, 24 Jun 2014 23:35:08 +0000 (19:35 -0400)]
radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>
https://bugs.freedesktop.org/show_bug.cgi?id=80015
CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Tue, 24 Jun 2014 23:23:20 +0000 (19:23 -0400)]
r600g: allow viewport index/layer to be sent to ps
In order to support ARB_fragment_layer_viewport, we need to explicitly
send these along to the pixel shader, since it has no other way to
retrieve them.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Emil Velikov [Wed, 2 Jul 2014 11:06:41 +0000 (12:06 +0100)]
targets/dri: allow duplicated symbols
With the inclusion of xmlconfig in the loader we're providing dri* symbols
which are already available in libdricommon.la. This leads to a build
break due to the multiple definitions.
Temporarily allow multiple definitions until we come up with a better solution.
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 16 Jun 2014 14:17:40 +0000 (15:17 +0100)]
st/dri: Remove the old libdridrm library
With all the hw drivers converted, we can go back to having
a single libdridrm provider.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Mon, 16 Jun 2014 12:50:01 +0000 (13:50 +0100)]
targets/dri-vmwgfx: Convert to static/shared pipe-drivers
Convert the final hardware driver to a single dri provider which
includes all the pipe-drivers.
Update the scons build and drop the unused vmw_powf.c.
Cc: José Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Cc: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Mon, 16 Jun 2014 12:31:10 +0000 (13:31 +0100)]
targets/dri-ilo: Convert to static/shared pipe-driver
Cc: Chia-I Wu <olv@lunarg.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Sat, 21 Jun 2014 11:45:54 +0000 (12:45 +0100)]
targets/dri-i915: Convert to static/shared pipe-drivers
v2:
- Drop inclusion of the winsys wrapper and softpipe/llvmpipe.
- Remove old Makefile.am, target.c.
- Correctly append i915 to the megadrivers list.
Cc: Stephane Marchesin <stephane.marchesin@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Mon, 16 Jun 2014 01:54:56 +0000 (02:54 +0100)]
targets/dri-freedreno: Convert to static/shared pipe-drivers
Now we don't need a second dri module when using kgsl :)
Cc: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Mon, 16 Jun 2014 13:25:50 +0000 (14:25 +0100)]
targets/(r300|r600|radeonsi)/dri: Convert to static/shared pipe-drivers
Related to previous commit, merge the separate dri targets to a single
one.
This is essentially all the buildsystem mayhem required for megaradeon.
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Emil Velikov [Mon, 16 Jun 2014 13:23:50 +0000 (14:23 +0100)]
targets/dri-nouveau: Convert to static/shared pipe-drivers
Similar to other targets, we'd like to convert all the separate
targets into a single one, thus minimizing the duplication and
overall size of mesa. The conversion is done on a per-API basis, with
the drivers available either statically or shared. Currently the former
is the default.
v2: Correctly append the version script to the linker flags.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Mon, 16 Jun 2014 01:05:05 +0000 (02:05 +0100)]
st/dri/drm: Add a second libdridrm library
Will be used to create the single dri target library as we
convert all the dri targets to static/shared pipe-drivers.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Sun, 15 Jun 2014 23:20:07 +0000 (00:20 +0100)]
st/dri: Allow separate dri-targets
With this commit we add a couple of DEFINES making the ST code
conditional, in a way that we can use it to gradually convert
the dri-targets from separate libraries into a single one.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Mon, 16 Jun 2014 13:21:13 +0000 (14:21 +0100)]
targets/dri-swrast: use drm aware dricommon when building more than swrast
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Thomas Helland <thomashelland90 at gmail.com>
Acked-by: Tom Stellard <thomas.stellard@amd.com>
Ilia Mirkin [Tue, 1 Jul 2014 14:27:38 +0000 (10:27 -0400)]
docs: update hw-dependent bits of ARB_gpu_shader5
Some of the features are completely implemented by core, while others
have hardware dependencies. Create a list of drivers supporting each
sub-feature that must have hw support.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Ilia Mirkin [Wed, 2 Jul 2014 00:05:09 +0000 (20:05 -0400)]
nvc0: add missed PIPE_CAP_DRAW_INDIRECT
Real support will be forthcoming. For now, avoid the unknown cap error
and compiler warning.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Roland Scheidegger [Tue, 1 Jul 2014 13:37:56 +0000 (15:37 +0200)]
llvmpipe: get rid of llvmpipe_get_texture_tile_linear
Because the layout is always linear this didn't really do much any longer -
at some point this triggered per-tile swizzled->linear conversion. The x/y
coords were ignored too.
Apart from triggering conversion, this also invoked alloc_image_data(), which
could only actually trigger mapping of display target resources. So, instead
just call resource_map in the callers (which also gives the ability to unmap
again). Note that mapping/unmapping of display target resources still isn't
really all that clean (map/unmap may be unmatched, and all such mappings use
the same pointer thus usage flags are a lie).
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Tue, 1 Jul 2014 01:38:41 +0000 (03:38 +0200)]
llvmpipe: get rid of llvmpipe_get_texture_image
The only caller left used it only for non display target textures,
hence it was really the same as llvmpipe_get_texture_image_address - it
also had a usage flag but this was ignored anyway.
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Tue, 1 Jul 2014 01:09:44 +0000 (03:09 +0200)]
llvmpipe: get rid of llvmpipe_get_texture_image_all
Once used for invoking swizzled->linear conversion for all needed images.
But we now have a single allocation for all images in a resource, thus looping
through all slices is rather pointless, and conversion doesn't happen either.
Also simplify the sampling setup code to use the mip_offsets array in the
resource directly - if the (non display target) resource exists its memory
will already be allocated as well.
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Tue, 1 Jul 2014 15:06:48 +0000 (17:06 +0200)]
llvmpipe: allocate regular texture memory upfront
The deferred allocation doesn't really make much sense anymore, since we no
longer allocate swizzled/linear memory in chunks, nor per level / slice.
This means resource creation could fail a bit more often (it could already
fail in theory anyway) but maps should not fail later (right now, callers
can't really deal with either).
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Tue, 1 Jul 2014 00:18:56 +0000 (02:18 +0200)]
llvmpipe: get rid of linear_img struct
Just use a tex_data pointer directly - the description was no longer correct
either.
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Mon, 30 Jun 2014 23:54:30 +0000 (01:54 +0200)]
llvmpipe: (trivial) rename linear_mip_offsets to mip_offsets
Since switching to non-swizzled rendering we only have "normal", aka linear,
offsets.
Reviewed-by: Brian Paul <brianp@vmware.com>
Roland Scheidegger [Tue, 1 Jul 2014 19:16:00 +0000 (21:16 +0200)]
target-helpers: don't use designated initializers
It looks like since ce1a1372280d737a1b85279995529206586ae480 they are now
included in more places, in particular even for things buildable with msvc,
and hence they break the build.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Christoph Bumiller [Fri, 5 Apr 2013 12:29:37 +0000 (14:29 +0200)]
st/mesa: add support for indirect drawing
Marek Olšák [Thu, 24 Apr 2014 23:27:34 +0000 (01:27 +0200)]
gallium/u_vbuf: get draw info from an indirect buffer if there's any
This is required for fallbacks to work with ARB_draw_indirect.
Christoph Bumiller [Fri, 5 Apr 2013 12:29:36 +0000 (14:29 +0200)]
gallium: add facilities for indirect drawing
v2:
Added comments to util_draw_indirect, clarified and fixed map size.
Removed unlikely().
Christoph Bumiller [Fri, 5 Apr 2013 12:29:35 +0000 (14:29 +0200)]
gallium: add PIPE_BIND_COMMAND_ARGS_BUFFER
Intended for use with GL_ARB_draw_indirect's DRAW_INDIRECT_BUFFER
target or for D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.
Dave Airlie [Tue, 1 Jul 2014 22:24:05 +0000 (08:24 +1000)]
xmlconfig/dri: bool -> unsigned char
Drop stdbool, due to the X server being a pain and having
struct members called bool. Although I've sent a patch to fix
that, we should retain the stupidity here. Use unsigned char,
which is what GLboolean is anyway.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cody Northrop [Tue, 1 Jul 2014 18:43:54 +0000 (12:43 -0600)]
i965/fs: Update discard jump to preserve uniform loads via sampler.
Commit 17c7ead7 exposed a bug in how uniform loading happens in the
presence of discard. It manifested itself in an application as
randomly incorrect pixels on the borders of conditional areas.
This is due to how discards jump to the end of the shader incorrectly
for some channels. The current implementation checks each 2x2
subspan to preserve derivatives. When uniform loading via samplers
was turned on, it uses a full execution mask, as stated in
lower_uniform_pull_constant_loads(), and only populates four channels
of the destination (see generate_uniform_pull_constant_load_gen7()).
It happens incorrectly when the first subspan has been jumped over.
The series that implemented this optimization was done before the
changes to use samplers for uniform loads. Uniform sampler loads
use special execution masks and only populate four channels, so we
can't jump over those or corruption ensues.
This fix only jumps to the end of the shader if all relevant channels
are disabled, i.e. all 8 or 16, depending on dispatch. This
preserves the original GLbenchmark 2.7 speedup noted in commit beafced2.
It changes the shader assembly accordingly:
before : (-f0.1.any4h) halt(8) 17 2 null { align1 WE_all 1Q };
after(8) : (-f0.1.any8h) halt(8) 17 2 null { align1 WE_all 1Q };
after(16): (-f0.1.any16h) halt(16) 17 2 null { align1 WE_all 1H };
v2: Cleaned up comments and conditional ordering.
v3: Fix typo.
Signed-off-by: Cody Northrop <cody@lunarg.com>
Reviewed-by: Mike Stroyan <mike@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948
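A minimal Python sketch of the predicate change described in the commit above. It models the flag register as a list of per-channel "still live" bits and assumes the dispatch width is a multiple of four; both function names are hypothetical, and this is an illustration of any4h versus any8h/any16h, not the i965 generator code.

```python
def halt_mask_any4h(live):
    """Old rule: a channel jumps to the end of the shader when every
    channel in its own 2x2 subspan (group of 4) has been discarded."""
    mask = []
    for start in range(0, len(live), 4):
        subspan_dead = not any(live[start:start + 4])
        mask.extend([subspan_dead] * 4)
    return mask


def halt_mask_all(live):
    """New rule: jump only when every channel of the dispatch
    (8 or 16 wide) has been discarded."""
    all_dead = not any(live)
    return [all_dead] * len(live)
```

With the first subspan fully discarded and the second still live, the old rule halts the first four channels while the new rule halts none, which is the case the commit message says corrupted uniform loads issued with a full execution mask.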
Matt Turner [Mon, 30 Jun 2014 01:24:18 +0000 (18:24 -0700)]
i965/fs: Mark case unreachable to silence warning.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sun, 29 Jun 2014 21:54:01 +0000 (14:54 -0700)]
i965: Use unreachable() instead of unconditional assert().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Mon, 30 Jun 2014 02:12:04 +0000 (19:12 -0700)]
mesa: Make unreachable macro take a string argument.
To aid in debugging.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sun, 29 Jun 2014 03:33:17 +0000 (20:33 -0700)]
i965/vec4: Remove useless conditionals.
Setting a couple of bits is the same cost or less as conditionally
setting a couple of bits.
Matt Turner [Sun, 29 Jun 2014 03:02:51 +0000 (20:02 -0700)]
i965/fs: Pass cfg to calculate_live_intervals().
We've often created the CFG immediately before, so use it when
available.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Sun, 29 Jun 2014 02:52:04 +0000 (19:52 -0700)]
i965: Mark fields in the live interval classes protected.
cfg, for instance, is a pointer to a local variable in
calculate_live_intervals, certainly not valid after that function has
returned.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 05:43:21 +0000 (22:43 -0700)]
glsl: Remove now unused foreach_list* macros.
foreach_list_typed_const was never used as far as I can tell.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 24 Jun 2014 23:31:38 +0000 (16:31 -0700)]
i965: Use typed foreach_in_list_safe instead of foreach_list_safe.
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 24 Jun 2014 22:53:19 +0000 (15:53 -0700)]
i965: Use typed foreach_in_list instead of foreach_list.
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 24 Jun 2014 19:42:00 +0000 (12:42 -0700)]
i965: Add and use foreach_inst_in_block macros.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 24 Jun 2014 19:41:35 +0000 (12:41 -0700)]
i965/fs: Use is_head_sentinel() instead of ->prev == NULL.
Makes it more clear what we're doing and requires less knowledge of
exec_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 17:51:30 +0000 (10:51 -0700)]
mesa: Add and use foreach_list_typed_safe.
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 17:32:38 +0000 (10:32 -0700)]
mesa: Add and use foreach_in_list_use_after.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 05:29:08 +0000 (22:29 -0700)]
glsl: Replace uses of foreach_list_const.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 05:11:04 +0000 (22:11 -0700)]
glsl: Replace another couple uses of foreach_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 05:02:24 +0000 (22:02 -0700)]
glsl: Use foreach_list_typed when possible.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 04:58:50 +0000 (21:58 -0700)]
mesa: Use typed foreach_in_list_safe instead of foreach_list_safe.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 04:58:35 +0000 (21:58 -0700)]
glsl: Use typed foreach_in_list_safe instead of foreach_list_safe.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 04:43:45 +0000 (21:43 -0700)]
mesa: Use typed foreach_in_list instead of foreach_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Wed, 25 Jun 2014 04:34:05 +0000 (21:34 -0700)]
glsl: Use typed foreach_in_list instead of foreach_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 24 Jun 2014 23:32:29 +0000 (16:32 -0700)]
glsl: Add typed foreach_in_list_safe macro.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Tue, 24 Jun 2014 23:36:04 +0000 (16:36 -0700)]
glsl: Add typed foreach_in_list/_reverse macros.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Axel Davy [Tue, 1 Jul 2014 15:15:41 +0000 (11:15 -0400)]
mesa: fix the condition in src/loader/Makefile.am
We want to have the dri common files compiled to define USE_DRICONF.
We need to check both NEED_OPENGL_COMMON and HAVE_DRICOMMON
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Tested-by: Brian Paul <brianp@vmware.com>
Brian Paul [Tue, 1 Jul 2014 14:19:26 +0000 (08:19 -0600)]
mesa: update comment for UniformBufferSize to indicate size is in bytes
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Tue, 1 Jul 2014 14:17:09 +0000 (08:17 -0600)]
st/mesa: fix incorrect size of UBO declarations
UniformBufferSize is in bytes so we need to divide by 16 to get the
number of constant buffer slots. Also, the ureg_DECL_constant2D()
function takes first..last parameters so we need to subtract one
for the last value.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
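The arithmetic in the commit above can be sketched as follows. The function name is hypothetical, and rounding up for sizes that are not a multiple of 16 is an assumption of this sketch; the commit message only says to divide by 16 and subtract one for the last index.

```python
def ubo_decl_range(size_bytes):
    """Return (first, last) constant-slot indices for a UBO of size_bytes.

    Each slot is one vec4 of 16 bytes. Rounding up for sizes that are not
    a multiple of 16 is an assumption of this sketch.
    """
    if size_bytes <= 0:
        raise ValueError("UBO size must be positive")
    num_slots = (size_bytes + 15) // 16
    # The declaration takes an inclusive first..last range.
    return 0, num_slots - 1
```

For example, a 64-byte UBO occupies four slots, so the declared range would be 0..3 rather than 0..64.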
Brian Paul [Tue, 1 Jul 2014 13:57:43 +0000 (07:57 -0600)]
st/mesa: don't use address register for constant-indexed ir_binop_ubo_load
Before, we were always using the address register and indirect addressing
to index into a UBO constant buffer. With this change we only do that
when necessary.
Using the piglit bin/arb_uniform_buffer_object-rendering test as an
example:
Shader code:
uniform ub_rot {float rotation; };
...
m[1][1] = cos(rotation);
Before:
IMM[1] INT32 {0, 1, 0, 0}
1: UARL ADDR[0].x, IMM[1].xxxx
2: MOV TEMP[0].x, CONST[3][ADDR[0].x].xxxx
3: COS TEMP[1].x, TEMP[0].xxxx
After:
0: COS TEMP[0].x, CONST[3][0].xxxx
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
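A small Python sketch of the decision described in the commit above: use the address register only when the index is not known at compile time. The function and its string output are hypothetical and merely imitate the TGSI listing in the message; it is not the st/mesa translator.

```python
def emit_ubo_load(dst, buf, index):
    """Return TGSI-like lines loading CONST[buf][index] into dst.

    index is an int for a compile-time constant, or a str naming the
    register that holds a run-time index.
    """
    if isinstance(index, int):
        # Constant index: address the constant buffer directly.
        return ["MOV %s, CONST[%d][%d].xxxx" % (dst, buf, index)]
    # Run-time index: load the address register, then index through it.
    return [
        "UARL ADDR[0].x, %s" % index,
        "MOV %s, CONST[%d][ADDR[0].x].xxxx" % (dst, buf),
    ]
```

In the "After" listing above the constant operand is folded straight into the COS instruction; this sketch keeps a separate MOV only to make the two paths easy to compare.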
Brian Paul [Tue, 1 Jul 2014 13:55:00 +0000 (07:55 -0600)]
st/mesa: allow 2D indexing for all shader types in translate_src()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Tue, 1 Jul 2014 13:53:16 +0000 (07:53 -0600)]
st/mesa: don't ignore const buf index in src_register()
Otherwise, if we were creating a const buffer src register for a UBO
the index into the UBO was always zero.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Sun, 15 Jun 2014 20:39:14 +0000 (16:39 -0400)]
nvc0: expose 4 vertex streams, use stream ids in xfb
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 28 Jun 2014 02:00:57 +0000 (22:00 -0400)]
nvc0/ir: only merge emit/restart for identical streams
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 28 Jun 2014 01:55:16 +0000 (21:55 -0400)]
nvc0/ir: avoid creating restarts with non-0 stream
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 27 Jun 2014 04:27:07 +0000 (00:27 -0400)]
nvc0/ir: fix emitting vertex stream
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 15 Jun 2014 22:49:50 +0000 (18:49 -0400)]
mesa/st: add vertex stream support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Fri, 27 Jun 2014 00:01:50 +0000 (20:01 -0400)]
gallium: add a cap for max vertex streams
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Thu, 26 Jun 2014 23:33:07 +0000 (19:33 -0400)]
gallium: add an index argument to create_query
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Sun, 15 Jun 2014 20:38:35 +0000 (16:38 -0400)]
gallium: add support for stream in so info
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Wed, 11 Jun 2014 19:33:41 +0000 (15:33 -0400)]
gallium: add vertex stream argument to EMIT/ENDPRIM
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Matt Turner [Sun, 29 Jun 2014 06:32:05 +0000 (23:32 -0700)]
i965/fs: Mark predicated PLN instructions with dependency hints.
To implement the unlit_centroid_workaround, previously we emitted
(+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 1Q };
(-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 1Q };
where the flag register contains the channel enable bits from g0.
Since the predicates are complementary, the pair of pln instructions
write to non-overlapping components of the destination, which is the
case that the dependency control hints are designed for.
Typically setting dependency control hints on predicated instructions
isn't safe (if an instruction doesn't execute due to the predicate, it
won't update the scoreboard, leaving it in a bad state) but since we
must have at least one channel executing (i.e., +f0 is true for some
channel) by virtue of the fact that the thread is running, we can put
the +f0 pln instruction last and set the hints:
(-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 NoDDClr 1Q };
(+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 NoDDChk 1Q };
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Sun, 29 Jun 2014 01:38:03 +0000 (18:38 -0700)]
i965/fs: Predicate PLN instructions used in unlit centroid WA.
Maybe lets us skip some PLN instructions if whole subspans are disabled?
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Sun, 29 Jun 2014 06:31:04 +0000 (23:31 -0700)]
i965/fs: Add no_dd_{clear,check} fields to fs_inst.
And plumb them through. Also make the assert in the generator look like
the vec4 one.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Matt Turner [Sun, 29 Jun 2014 01:00:27 +0000 (18:00 -0700)]
i965/fs: Let sat-prop ignore live ranges if producer already has sat.
This sequence (where both x and w are used afterwards) wasn't handled.
mul.sat x, y, z
...
mov.sat w, x
We assumed that if x was used after the mov.sat, that we couldn't
propagate the saturate modifier, but in fact x was already saturated.
So ignore the live range check if the producing instruction already
saturates its result. Cuts one instruction from hundreds of TF2 shaders.
total instructions in shared programs: 1995631 -> 1994951 (-0.03%)
instructions in affected programs: 155248 -> 154568 (-0.44%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
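The rule described in the commit above can be sketched as a single predicate. The function name and its two boolean inputs are hypothetical; the real pass works on fs_inst lists and live intervals.

```python
def can_drop_mov_saturate(producer_has_sat, x_live_after_mov):
    """Decide whether the .sat on `mov.sat w, x` can be handled by the
    instruction that produced x.

    producer_has_sat: the producer already saturates its result.
    x_live_after_mov: x is read again after the mov.sat.
    """
    if producer_has_sat:
        # x is already saturated, so later readers of x are unaffected.
        return True
    # Otherwise saturating the producer would change what later readers see.
    return not x_live_after_mov
```

Before this change, the first branch did not exist, so the `mul.sat` / `mov.sat` sequence above was rejected whenever x stayed live.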
Matt Turner [Sun, 29 Jun 2014 06:11:22 +0000 (23:11 -0700)]
i965/fs: Pass const references to emit functions.
Cuts 10k of .text and saves a bunch of useless struct copies.
Matt Turner [Sat, 28 Jun 2014 20:53:55 +0000 (13:53 -0700)]
i965/vec4: Pass const references to instruction functions.
text data bss dec hex filename
4231165 123200 39648 4394013 430c1d i965_dri.so
4186277 123200 39648 4349125 425cc5 i965_dri.so
Cuts 43k of .text and saves a bunch of useless struct copies.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Sat, 28 Jun 2014 20:46:29 +0000 (13:46 -0700)]
i965/vec4: Pass const references to vec4_instruction().
text data bss dec hex filename
4244821 123200 39648 4407669 434175 i965_dri.so
4231165 123200 39648 4394013 430c1d i965_dri.so
Cuts 13k of .text and saves a bunch of useless struct copies.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Sat, 28 Jun 2014 20:40:52 +0000 (13:40 -0700)]
i965/fs: Pass const references to instruction functions.
text data bss dec hex filename
4270747 123200 39648 4433595 43a6bb i965_dri.so
4244821 123200 39648 4407669 434175 i965_dri.so
Cuts 25k of .text and saves a bunch of useless struct copies.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
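As a cross-check of the three size tables above, this Python sketch recomputes the .text savings from the quoted byte counts. The dictionary keys are informal labels chosen for this sketch.

```python
# (before, after) .text sizes in bytes, copied from the three tables above.
TEXT_SIZES = {
    "i965/vec4: instruction functions": (4231165, 4186277),
    "i965/vec4: vec4_instruction()": (4244821, 4231165),
    "i965/fs: instruction functions": (4270747, 4244821),
}


def text_savings(sizes):
    """Return the number of .text bytes saved per commit."""
    return {name: before - after for name, (before, after) in sizes.items()}


SAVINGS = text_savings(TEXT_SIZES)
TOTAL_SAVED = sum(SAVINGS.values())
```

The per-commit differences are 44888, 13656 and 25926 bytes, in line with the "43k", "13k" and "25k" figures when read as KiB, and they sum to 84470 bytes across the three commits.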
Axel Davy [Wed, 28 May 2014 00:04:08 +0000 (20:04 -0400)]
radeonsi: Use dma_copy when possible for si_blit.
This improves GLX DRI3 GPU offloading significantly on CPU
bound benchmarks particularly.
No performance impact for DRI2 GPU offloading.
v2: Add missing tests
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Axel Davy [Sat, 17 May 2014 16:12:11 +0000 (12:12 -0400)]
glx/dri3: add GPU offloading support.
The differences with DRI2 GPU offloading are:
a) There's no logic for GPU offloading needed in the Xserver
b) for DRI2, the card would render to a back buffer, and
the content would be copied to the front buffer (the same buffers
every time). Here we can potentially use several back buffers and copy
to buffers with no tiling to share with X. We send them with the
Present extension.
That means that the DRI2 solution is forced to have tearing with GPU
offloading. In the ideal scenario, this DRI3 solution doesn't have this
problem.
However, without dma-buf fences, a race can appear (if the card is slow
and the rendering hasn't finished before the server card reads the buffer),
and then old content is displayed. A user who hits this should probably
revert to the DRI2 solution (LIBGL_DRI3_DISABLE). Users with fast enough
cards do not seem to hit this in practice (I have an AMD HD 7730M, and I
don't hit this, except if I force a low dpm mode)
c) for non-fullscreen apps, the DRI2 GPU offloading solution requires
compositing. This DRI3 solution doesn't have this requirement. Rendering
to a pixmap also works.
d) There is no need to have a DDX loaded for the secondary card.
V4: Fixes some piglit tests
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Axel Davy [Sun, 8 Jun 2014 23:47:48 +0000 (19:47 -0400)]
loader: Use drirc device_id parameter in complement to DRI_PRIME
DRI_PRIME is not very handy, because you have to launch the executable
with it set, which is not always easy to do.
By using drirc, the user specifies the target executable
and the device to use. After that the program will be launched every time
on the target device.
For example if .drirc contains:
<driconf>
<device driver="loader">
<application name="Glmark2" executable="glmark2">
<option name="device_id" value="pci-0000_01_00_0" />
</application>
</device>
</driconf>
Then glmark2 will, if possible, use the render-node with
ID_PATH_TAG pci-0000_01_00_0.
v2: Fix compilation issue
v3: Add "-lm" and rebase.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
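A minimal Python sketch of how the drirc fragment in the commit above maps an executable to a device_id. It uses the standard-library ElementTree parser purely for illustration; the function name is hypothetical and this is not how Mesa's xmlconfig code is written.

```python
import xml.etree.ElementTree as ET

# The example .drirc content from the commit message above.
DRIRC = """<driconf>
<device driver="loader">
<application name="Glmark2" executable="glmark2">
<option name="device_id" value="pci-0000_01_00_0" />
</application>
</device>
</driconf>"""


def device_id_for(drirc_text, executable):
    """Return the device_id configured for executable, or None."""
    root = ET.fromstring(drirc_text)
    for device in root.iter("device"):
        if device.get("driver") != "loader":
            continue
        for app in device.iter("application"):
            if app.get("executable") != executable:
                continue
            for option in app.iter("option"):
                if option.get("name") == "device_id":
                    return option.get("value")
    return None
```

Calling `device_id_for(DRIRC, "glmark2")` yields the ID_PATH_TAG from the example, while an executable with no matching entry yields `None`.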
Axel Davy [Sun, 8 Jun 2014 23:42:15 +0000 (19:42 -0400)]
loader: add gpu selection code via DRI_PRIME.
v2: Fix the leak of device_name
v3: Rebased
This enables using the DRI_PRIME env var to specify
which gpu to use.
Two syntaxes are supported:
If DRI_PRIME is 1 it means: take any other gpu than the default one.
If DRI_PRIME is the ID_PATH_TAG of a device: choose this device if
possible.
The ID_PATH_TAG is a tag filled by udev.
You can check it with 'udevadm info' on the device node.
For example it can be "pci-0000_01_00_0".
Render-nodes need to be enabled to choose another gpu,
and they need to have the ID_PATH_TAG advertised.
With older versions of udev the tag may not be
advertised for render-nodes; in that case
one needs to add a file containing:
SUBSYSTEM=="drm", IMPORT{builtin}="path_id"
in /etc/udev/rules.d/
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
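The two DRI_PRIME syntaxes described in the commit above can be sketched in Python as follows. The function name and arguments are hypothetical, and falling back to the default gpu when the request cannot be met is an assumption drawn from the "if possible" wording; the real loader works on render-nodes and udev properties.

```python
def select_gpu(dri_prime, default_tag, available_tags):
    """Return the ID_PATH_TAG of the gpu to use.

    dri_prime: value of the DRI_PRIME env var, or None if unset.
    default_tag: ID_PATH_TAG of the default gpu.
    available_tags: ID_PATH_TAGs of all render-nodes found.
    """
    if not dri_prime:
        return default_tag
    if dri_prime == "1":
        # Take any gpu other than the default one.
        for tag in available_tags:
            if tag != default_tag:
                return tag
        return default_tag
    # Otherwise DRI_PRIME names a specific ID_PATH_TAG.
    if dri_prime in available_tags:
        return dri_prime
    return default_tag
```

For instance, with a default of `pci-0000_00_02_0` and a second device `pci-0000_01_00_0`, both `DRI_PRIME=1` and `DRI_PRIME=pci-0000_01_00_0` would select the second device in this sketch.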
Axel Davy [Thu, 6 Mar 2014 11:02:44 +0000 (12:02 +0100)]
drirc: Add string support
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Dave Airlie [Fri, 27 Jun 2014 03:23:24 +0000 (13:23 +1000)]
dri: remove GL types from config queries
This in theory changes ABI for the boolean->bool I think,
but nothing in the tree uses configQueryb AFAICS.
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 27 Jun 2014 03:11:44 +0000 (13:11 +1000)]
dri/xmlconfig: remove GL types.
This just drops all the GL types from the xmlconfig and uses
std C types from stdint and stdbool.
v2: drop further double and header include.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>