contrib/beignet.git
10 years agoAdd the support for vector type in printf.
Junyan He [Tue, 24 Jun 2014 08:35:58 +0000 (16:35 +0800)]
Add the support for vector type in printf.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: Further optimize exp().
Ruiling Song [Tue, 24 Jun 2014 06:23:31 +0000 (14:23 +0800)]
GBE: Further optimize exp().

Use native_exp() as much as possible.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoadd cpu copy for 1Darray and 2darray related copy APIs.
Luo [Tue, 24 Jun 2014 02:09:12 +0000 (10:09 +0800)]
add cpu copy for 1Darray and 2darray related copy APIs.

Cases covered: 1Darray, 2Darray, 2Darrayto2D, 2Darrayto3D, 2Dto2Darray, 3Dto2Darray.

1D copies use the GPU copy.

v2:
Fixed the 1D array to 1D array copy; there is no need to swap depth and height.

Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoadd BEIGNET_INSTALL_DIR to clean code
Guo Yejun [Mon, 23 Jun 2014 22:22:07 +0000 (06:22 +0800)]
add BEIGNET_INSTALL_DIR to clean code

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoset LD_LIBRARY_PATH of libgbe.so for gbe_bin_generater
Guo Yejun [Mon, 23 Jun 2014 21:36:50 +0000 (05:36 +0800)]
set LD_LIBRARY_PATH of libgbe.so for gbe_bin_generater

This is needed for cross-compiling.

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoimplement API clEnqueueFillImage.
Luo [Sun, 22 Jun 2014 22:03:30 +0000 (06:03 +0800)]
implement API clEnqueueFillImage.

enqueues a command to fill an image object with a specified color.

fix typo cl_context_get_static_kernel_from_bin.

v2:
fix image 1d array bug.

Signed-off-by: Luo <xionghu.luo@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agofix crash when OCL_STRICT_CONFORMANCE is unset
Guo Yejun [Mon, 23 Jun 2014 20:14:21 +0000 (04:14 +0800)]
fix crash when OCL_STRICT_CONFORMANCE is unset

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the format and flag support for printf.
Junyan He [Mon, 23 Jun 2014 08:38:56 +0000 (16:38 +0800)]
Add the format and flag support for printf.

Formats and flags such as -, + and #, as well as precision requests,
are now applied to the output.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoupdate docs on environment variables.
Ruiling Song [Thu, 19 Jun 2014 07:20:54 +0000 (15:20 +0800)]
update docs on environment variables.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: switch to non strict conformance mode by default.
Zhigang Gong [Mon, 23 Jun 2014 08:59:56 +0000 (16:59 +0800)]
GBE: switch to non strict conformance mode by default.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoutest_generator.py: add OCL_STRICT_CONFORMANCE environment condition.
Yi Sun [Mon, 23 Jun 2014 00:56:33 +0000 (08:56 +0800)]
utest_generator.py: add OCL_STRICT_CONFORMANCE environment condition.

For auto-generated math cases, when OCL_STRICT_CONFORMANCE is not set,
the expected diff increases to 1000x.

Signed-off-by: Yi Sun <yi.sun@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
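
Illustrative note on the commit above: a minimal Python sketch of the tolerance rule the message describes. The function name `allowed_diff` and the `env` parameter are hypothetical and are not taken from utest_generator.py; only the 1000x factor and the variable name OCL_STRICT_CONFORMANCE come from the commit message.

```python
import os


def allowed_diff(base_diff, env=None):
    """Return the tolerated difference for an auto-generated math case.

    Assumption: only "variable not set" is treated as non-strict,
    which is the one case the commit message spells out.
    """
    env = os.environ if env is None else env
    if "OCL_STRICT_CONFORMANCE" not in env:
        return base_diff * 1000  # relaxed mode: 1000x the strict tolerance
    return base_diff
```

Passing a plain dict as `env` makes the rule easy to exercise without touching the real process environment.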
10 years agoGBE: declare correct prototype for fastpath_rootn
Ruiling Song [Mon, 23 Jun 2014 08:34:55 +0000 (16:34 +0800)]
GBE: declare correct prototype for fastpath_rootn

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: fix some builtin math function
Ruiling Song [Mon, 23 Jun 2014 08:34:54 +0000 (16:34 +0800)]
GBE: fix some builtin math function

__gen_ocl_exp stands for 2^x. So, use __gen_ocl_pow to implement native_exp().
Fix atanh implementation.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
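
Illustrative note on the commit above: the message says the hardware primitive computes 2^x, so it cannot stand in for exp() directly. The Python sketch below shows the two mathematically equivalent ways to get e^x from that starting point. `exp2_hw` is a hypothetical stand-in for the base-2 primitive, not the real builtin.

```python
import math


def exp2_hw(x):
    # Hypothetical stand-in for a base-2 exponential primitive (2^x).
    return 2.0 ** x


def exp_via_pow(x):
    # The approach named in the commit message: pow(e, x).
    return math.pow(math.e, x)


def exp_via_exp2(x):
    # Equivalent identity: e^x == 2^(x * log2(e)).
    return exp2_hw(x * math.log2(math.e))
```

Calling `exp2_hw(1.0)` gives 2.0, not e, which is the mismatch the fix addresses.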
10 years agoAdd some OpenCL1.2 parameters of function clGetDeviceInfo.
Yang Rong [Mon, 23 Jun 2014 14:38:36 +0000 (22:38 +0800)]
Add some OpenCL1.2 parameters of function clGetDeviceInfo.

Include CL_DEVICE_LINKER_AVAILABLE, CL_DEVICE_PRINTF_BUFFER_SIZE, CL_DEVICE_PREFERRED_INTEROP_USER_SYNC.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoFix a CL_MEM_HOST_PTR bug.
Yang Rong [Mon, 23 Jun 2014 14:38:35 +0000 (22:38 +0800)]
Fix a CL_MEM_HOST_PTR bug.

Can't add sub_offset if mem is image.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: replace OwningPtr with std::unique_ptr
Ruiling Song [Mon, 23 Jun 2014 06:39:26 +0000 (14:39 +0800)]
GBE: replace OwningPtr with std::unique_ptr

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: improve builtin exp.
Ruiling Song [Mon, 23 Jun 2014 02:33:17 +0000 (10:33 +0800)]
GBE: improve builtin exp.

Put some variables into registers.
This could improve LuxMark sala by about 10% under strict conformance.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the test cases for 1D Image Array
Junyan He [Fri, 20 Jun 2014 10:07:40 +0000 (18:07 +0800)]
Add the test cases for 1D Image Array

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoUpdate the printf test case.
Junyan He [Fri, 20 Jun 2014 09:41:31 +0000 (17:41 +0800)]
Update the printf test case.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
10 years agoAdd the support for %s in printf
Junyan He [Fri, 20 Jun 2014 09:41:26 +0000 (17:41 +0800)]
Add the support for %s in printf

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
10 years agoFix a crash bug when no %d appears in the printf fmt
Junyan He [Fri, 20 Jun 2014 09:41:19 +0000 (17:41 +0800)]
Fix a crash bug when no %d appears in the printf fmt

If no printf statement contains a %d, the curbe
ignores the content buffer ptr because nothing uses it.
Binding the buffer ptr at run time then crashes.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
10 years agoAdd %f and %c support for printf.
Junyan He [Fri, 20 Jun 2014 09:41:13 +0000 (17:41 +0800)]
Add %f and %c support for printf.

Add the %c and %f support for printf.
Also add the int to float and int to char conversion.
Some minor errors such as wrong index flags have been fixed.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
10 years agoGBE: fix some get kernel arg info bugs.
Zhigang Gong [Fri, 20 Jun 2014 11:09:35 +0000 (19:09 +0800)]
GBE: fix some get kernel arg info bugs.

Still can't handle the sampler_t which is not used actually.
Access qualifier seems broken with llvm 3.3.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
10 years agoruntime: choose the actual EU number as the max compute units.
Zhigang Gong [Fri, 20 Jun 2014 10:07:23 +0000 (18:07 +0800)]
runtime: choose the actual EU number as the max compute units.

Using the EU number as the compute unit count makes more sense.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
10 years agoGBE: Handle empty basicblock in Instruction selection
Ruiling Song [Fri, 20 Jun 2014 08:13:13 +0000 (16:13 +0800)]
GBE: Handle empty basicblock in Instruction selection

I met a corner case which leads to an empty bb.

Label $12
add %3, %2, 1

What's more, %3 is not used anymore later, so we will not select
an instruction for this line of code. Then only the Label instruction is left
in the bb, which leads to the wrong endifLabel being used. The fix simply
generates the endif instruction, if needed, first in matchBasicBlock().

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: tweak register expire frequency on simd16 mode.
Zhigang Gong [Tue, 17 Jun 2014 04:56:31 +0000 (12:56 +0800)]
GBE: tweak register expire frequency on simd16 mode.

According to Yongjia's test report, it's better to keep
the same frequency of expiration for both simd8 and simd16
modes.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Tested-by: Yongjia Zhang <yongjia.zhang@intel.com>
10 years agoAdd some API's OpenCL 1.2 parameter support.
Yang Rong [Fri, 20 Jun 2014 16:15:44 +0000 (00:15 +0800)]
Add some API's OpenCL 1.2 parameter support.

Support CL_PROGRAM_KERNEL_NAMES and CL_PROGRAM_NUM_KERNELS in API clGetProgramInfo,
and CL_DOUBLE_FP_CONFIG in API clGetDeviceInfo.
Also fix a bug of CL_MEM_HOST_PTR in API clGetMemObjectInfo.

v2:
also fix the utest get_mem_info.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd some OpenCL1.2 new buffer flags handle.
Yang Rong [Fri, 20 Jun 2014 16:15:43 +0000 (00:15 +0800)]
Add some OpenCL1.2 new buffer flags handle.

Also, mem_base_addr_align is in bits while origin is in bytes; convert before comparing them.

v2:
fix sub_buffer_check test case.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
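
Illustrative note on the commit above: a minimal Python sketch of the unit mismatch the message mentions. The alignment value is reported in bits, while a sub-buffer origin is given in bytes, so one of them has to be converted before the check. The function name is hypothetical.

```python
def sub_buffer_origin_is_aligned(origin_bytes, mem_base_addr_align_bits):
    """Check a sub-buffer origin (bytes) against an alignment given in bits."""
    align_bytes = mem_base_addr_align_bits // 8  # bits -> bytes
    return origin_bytes % align_bytes == 0
```

With a 1024-bit alignment the byte alignment is 128, so an origin of 128 passes and an origin of 64 does not.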
10 years agoFix sub buffer bug in clEnqueueReadBufferRect, clEnqueueWriteBufferRect, clEnqueueMap...
Yang Rong [Fri, 20 Jun 2014 16:15:42 +0000 (00:15 +0800)]
Fix sub buffer bug in clEnqueueReadBufferRect, clEnqueueWriteBufferRect, clEnqueueMapBuffer.

Should add sub_offset in these functions.

V2: clEnqueueMapBuffer's return ptr should not add sub offset. It will add sub offset in _cl_map_mem
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoruntime: fix image1d buffer allocation.
Zhigang Gong [Fri, 20 Jun 2014 07:45:34 +0000 (15:45 +0800)]
runtime: fix image1d buffer allocation.

Per the bspec, an image must have a vertical alignment of at least 2 lines,
so we can't simply attach a buffer to a 1D image surface of the same size.
We have to create a new image and copy the buffer data into it,
then redirect all references to the buffer object to this image.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
10 years agoruntime: fix a slice pitch calculation bug.
Zhigang Gong [Fri, 20 Jun 2014 04:24:22 +0000 (12:24 +0800)]
runtime: fix a slice pitch calculation bug.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
10 years agoutest: decrease the accuracy of tanpi.
Yi Sun [Fri, 20 Jun 2014 02:09:58 +0000 (10:09 +0800)]
utest: decrease the accuracy of tanpi.

Because of some issues in tanpi, relax the required accuracy by a factor of 100.

Signed-off-by: Yi Sun <yi.sun@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoruntime: fix some get image info bugs.
Zhigang Gong [Thu, 19 Jun 2014 06:09:46 +0000 (14:09 +0800)]
runtime: fix some get image info bugs.

According to ocl spec:

Return height of the image in pixels. For a
1D image, 1D image buffer and 1D image
array object, height = 0.

Return depth of the image in pixels. For a
1D image, 1D image buffer, 2D image or
1D and 2D image array object, depth = 0.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
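
Illustrative note on the commit above: the spec text quoted in the message can be written as two small lookups. This is a Python sketch only; the image-type strings are hypothetical labels, not identifiers from the runtime.

```python
# Image kinds for which the quoted spec text requires height == 0.
HEIGHT_ZERO = {"image1d", "image1d_buffer", "image1d_array"}

# Image kinds for which the quoted spec text requires depth == 0.
DEPTH_ZERO = HEIGHT_ZERO | {"image2d", "image2d_array"}


def reported_height(image_type, height):
    return 0 if image_type in HEIGHT_ZERO else height


def reported_depth(image_type, depth):
    return 0 if image_type in DEPTH_ZERO else depth
```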
10 years agoGBE/runtime: fixup broken 1d array image support.
Zhigang Gong [Wed, 18 Jun 2014 02:10:07 +0000 (10:10 +0800)]
GBE/runtime: fixup broken 1d array image support.

As the sample LD message doesn't support an array index, we have
to create a 2D array surface backed by the same buffer object.
Thus one 1D array image has two surfaces bound to it:
one at index and the second at 128 + index.

Then, on the kernel side, we access the corresponding
2D array surface when the LD message is required, and
the original 1D array surface otherwise.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
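
Illustrative note on the commit above: a Python sketch of the surface selection the message describes. The offset of 128 comes from the commit message; the function and parameter names are hypothetical.

```python
LD_SURFACE_OFFSET = 128  # second surface lives at 128 + index, per the message


def surface_index(image_index, needs_ld_message):
    """Pick the surface a kernel access should use for a 1D array image."""
    if needs_ld_message:
        # LD cannot take an array index, so use the 2D array alias.
        return LD_SURFACE_OFFSET + image_index
    return image_index  # original 1D array surface
```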
10 years agocl/runtime: fixup 1D array image region and origins.
Zhigang Gong [Wed, 18 Jun 2014 06:53:06 +0000 (14:53 +0800)]
cl/runtime: fixup 1D array image region and origins.

As we treat 1D array image as a 2d array image with height 1
internally, we need to fixup region and origins passed in
from external APIs.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
10 years agocl/driver: fix the incorrect handling of 1D array.
Zhigang Gong [Wed, 18 Jun 2014 02:01:15 +0000 (10:01 +0800)]
cl/driver: fix the incorrect handling of 1D array.

According to the bspec, the 1D array should be treated as a 3D like
surface which has height 1. So we need to make sure the depth is
the array_size. Thus the rt_view_extent's value should be always
the same as the depth.

According to the ocl spec, the 1D array firstly should be a 1D image rather
than a 2D image. Thus we should access different lines according to the
slice_pitch rather than the image_row_pitch.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
10 years agoEnable the 1D and 2D image support in run time.
Junyan He [Tue, 17 Jun 2014 04:06:54 +0000 (12:06 +0800)]
Enable the 1D and 2D image support in run time.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the image1d_array_t and image2d_array_t defines.
Junyan He [Tue, 17 Jun 2014 04:06:47 +0000 (12:06 +0800)]
Add the image1d_array_t and image2d_array_t defines.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd a lock in the place of printf output
Junyan He [Wed, 18 Jun 2014 06:42:15 +0000 (14:42 +0800)]
Add a lock in the place of printf output

If multiple threads run the kernel simultaneously, their output
may interleave. Add a lock to avoid this.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
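
Illustrative note on the commit above: a minimal Python sketch of serializing output with a lock so that one thread's lines are not interleaved with another's. This shows the general technique only; the names are hypothetical and the real change is in the C/C++ runtime.

```python
import io
import threading

_output_lock = threading.Lock()


def flush_kernel_output(lines, stream):
    """Write one kernel's buffered printf lines as a single uninterrupted run."""
    with _output_lock:
        for line in lines:
            stream.write(line + "\n")
```

Holding the lock for the whole batch, rather than per line, is what keeps each kernel's output contiguous.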
10 years agoRefine the code in llvm_printf_parser.cpp
Junyan He [Wed, 18 Jun 2014 06:42:07 +0000 (14:42 +0800)]
Refine the code in llvm_printf_parser.cpp

Fix some typos and use macros to simplify the code.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: pass compile against LLVM 3.5
Ruiling Song [Wed, 18 Jun 2014 07:09:44 +0000 (15:09 +0800)]
GBE: pass compile against LLVM 3.5

backward compatible with LLVM 3.3

merged a bug fix patch into this one.
  1. use_iterator point to 'Use' now instead of 'User'.
  2. all C strings are in the constant address space now, which follows the OCL spec.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoFix an event status bug.
Yang Rong [Thu, 19 Jun 2014 14:37:42 +0000 (22:37 +0800)]
Fix an event status bug.

If an event's status is an error code, the status of the events waiting on it should also be set to that error code.

V2: do not execute enqueued commands that wait on an event whose status is an error.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
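
Illustrative note on the commit above: a small Python sketch of propagating an error status to dependent events. The class is hypothetical and much simpler than the runtime's event code. It relies only on the OpenCL convention that execution statuses are non-negative (CL_QUEUED is 3, CL_COMPLETE is 0) and that error codes are negative.

```python
CL_COMPLETE = 0
CL_QUEUED = 3


class Event:
    def __init__(self, wait_list=()):
        self.status = CL_QUEUED
        self.waiters = []
        for dep in wait_list:
            dep.waiters.append(self)  # remember who depends on us

    def set_status(self, status):
        self.status = status
        if status < 0:
            # Error: every event waiting on this one inherits the error,
            # so its command must not be executed.
            for waiter in self.waiters:
                waiter.set_status(status)
```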
10 years agoTry to use drm render nodes.
Abrahm Scully [Thu, 19 Jun 2014 02:28:42 +0000 (22:28 -0400)]
Try to use drm render nodes.

Allows non-root user to run without X.
Works on Fedora 20 with render nodes enabled.

Signed-off-by: Abrahm Scully <abrahm.scully@gmail.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoFix build with mesa 10.1.
Abrahm Scully [Thu, 19 Jun 2014 02:28:08 +0000 (22:28 -0400)]
Fix build with mesa 10.1.

Mesa renamed some constants and a directory.

Signed-off-by: Abrahm Scully <abrahm.scully@gmail.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoFix linking to X11 libraries.
Abrahm Scully [Thu, 19 Jun 2014 02:26:53 +0000 (22:26 -0400)]
Fix linking to X11 libraries.

After FindXLib.cmake was removed, XLIB_LIBARY should have been
replaced with X11_LIBRARIES.

Signed-off-by: Abrahm Scully <abrahm.scully@gmail.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: Correctly process constant for phi instruction
Ruiling Song [Wed, 18 Jun 2014 07:59:53 +0000 (15:59 +0800)]
GBE: Correctly process constant for phi instruction

Simply use getRegister which deals with various ConstantExpr.
Thanks to Abrahm Scully, who reported the bug.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoadd binary type support for compiled object and library.
Luo [Wed, 18 Jun 2014 00:17:34 +0000 (08:17 +0800)]
add binary type support for compiled object and library.

Save the LLVM bitcode to program->binary, prefixed with one byte that
gives the binary type (0 means GEN binary, 1 means COMPILED_OBJECT, 2 means LIBRARY);
load the binary into a module with ParseIR.

Create a random directory to save compile header files.
Use strncpy and strncat to replace strcpy and strcat.

v6: fix enqueue_copy_fill bug, use '\0' instead of 0 in the header.
v7  binary header format issue: fix test_load_program_from_bin bug of standalone kernel generated by gbe_bin_generater.

Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
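
Illustrative note on the commit above: a Python sketch of a one-byte type tag in front of a payload, using the three values listed in the message. The helper names are hypothetical, and the real binary header may carry more than this single byte.

```python
GEN_BINARY = 0
COMPILED_OBJECT = 1
LIBRARY = 2


def tag_binary(binary_type, payload):
    """Prefix the payload with a single byte giving the binary type."""
    if binary_type not in (GEN_BINARY, COMPILED_OBJECT, LIBRARY):
        raise ValueError("unknown binary type")
    return bytes([binary_type]) + payload


def split_binary(blob):
    """Return (binary_type, payload) from a tagged blob."""
    return blob[0], blob[1:]
```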
10 years agofix clEnqueueMarkerWithWaitList bug when input event is null.
Luo [Tue, 17 Jun 2014 02:59:05 +0000 (10:59 +0800)]
fix clEnqueueMarkerWithWaitList bug when input event is null.

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agodriver: fix a potential Null reference.
Zhigang Gong [Tue, 17 Jun 2014 03:16:55 +0000 (11:16 +0800)]
driver: fix a potential Null reference.

cl_gpgpu_flush may be called when the batch buffer has been
released. We need to check whether there is a valid buffer
before we really take the following actions.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
10 years agoFix a clEnqueueBarrierWithWaitList event status bug.
Yang Rong [Mon, 16 Jun 2014 08:20:08 +0000 (16:20 +0800)]
Fix a clEnqueueBarrierWithWaitList event status bug.

Event's status should be CL_COMPLETE if all wait events are complete in the wait list, in function
clEnqueueBarrierWithWaitList and clEnqueueMarkerWithWaitList.

v2: revert delete the event change in v1.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoBump beignet version to 0.8.99.
Zhigang Gong [Fri, 13 Jun 2014 09:50:31 +0000 (17:50 +0800)]
Bump beignet version to 0.8.99.

We are approaching the release of version 0.9, so we bump
the version to 0.8.99 now.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoBump OpenCL version to 1.2.
Zhigang Gong [Fri, 13 Jun 2014 09:44:28 +0000 (17:44 +0800)]
Bump OpenCL version to 1.2.

Now all opencl 1.2 functions in the opencl 1.2 branch have been
merged into master branch. Let's bump master's ocl version to 1.2.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoutests: use OpenCL 1.2 API for image related test cases.
Zhigang Gong [Fri, 13 Jun 2014 09:44:00 +0000 (17:44 +0800)]
utests: use OpenCL 1.2 API for image related test cases.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agouse LLVM_INSTALL_DIR as the path to clang/llvm-as/llvm-link
Guo Yejun [Thu, 12 Jun 2014 22:06:50 +0000 (06:06 +0800)]
use LLVM_INSTALL_DIR as the path to clang/llvm-as/llvm-link

I invented CMAKE_BINARY_PATH as the path to clang/llvm-as/llvm-link
in last patch, it is not elegant. Actually, LLVM_INSTALL_DIR is
already used in CMake file and is a better choice.

So, for cross compile case, cmake can find the binaries such as clang,
llvm-as, llvm-link and llvm-config with the help of LLVM_INSTALL_DIR.

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
10 years agoclean code to remove gbe_kernel_set_const_buffer_size
Guo Yejun [Thu, 12 Jun 2014 18:14:10 +0000 (02:14 +0800)]
clean code to remove gbe_kernel_set_const_buffer_size

this function is no longer needed.

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
10 years agoadd [opencl-1.2] clUnloadPlatformCompiler implementation
Luo [Thu, 5 Jun 2014 21:00:40 +0000 (05:00 +0800)]
add [opencl-1.2] clUnloadPlatformCompiler implementation

Just an empty hook.

Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoImplement the clEnqueueMigrateMemObjects API
Junyan He [Wed, 11 Jun 2014 01:33:36 +0000 (09:33 +0800)]
Implement the clEnqueueMigrateMemObjects API

So far, we just support 1 device and no subdevices.
So all the command queues should belong to the small context.
There is no need to migrate the mem objects from one subcontext
to another by now. We just do the checks and fill the event.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: Enable some implemented Opencl 1.2 functions in icd table.
Zhigang Gong [Mon, 9 Jun 2014 10:37:46 +0000 (18:37 +0800)]
GBE: Enable some implemented Opencl 1.2 functions in icd table.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
10 years agoAdd the utest case for clGetKernelArgInfo
Junyan He [Fri, 13 Jun 2014 09:05:06 +0000 (17:05 +0800)]
Add the utest case for clGetKernelArgInfo

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the clGetKernelArgInfo api and misc help functions
Junyan He [Fri, 13 Jun 2014 09:04:58 +0000 (17:04 +0800)]
Add the clGetKernelArgInfo api and misc help functions

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the llvm info to the function for later usage.
Junyan He [Fri, 13 Jun 2014 09:04:49 +0000 (17:04 +0800)]
Add the llvm info to the function for later usage.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the -cl-kernel-arg-info into the clang build options
Junyan He [Fri, 13 Jun 2014 09:04:39 +0000 (17:04 +0800)]
Add the -cl-kernel-arg-info into the clang build options

We always add -cl-kernel-arg-info to the options. This option only generates
the arg information for the backend; it has no other side effects and no
performance cost, so we always add it here.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoadd [opencl-1.2] test case runtime_compile_link.
Luo [Fri, 13 Jun 2014 03:17:39 +0000 (11:17 +0800)]
add [opencl-1.2] test case runtime_compile_link.

Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
10 years agoadd [opencl-1.2] API clLinkProgram.
Luo [Fri, 13 Jun 2014 03:17:38 +0000 (11:17 +0800)]
add [opencl-1.2] API clLinkProgram.

this API links a set of compiled program objects and libraries for all
the devices or a specific device(s) in the OpenCL context and creates
an executable.
the llvm bitcode in the compiled program objects are linked together and
built to Gen binary.

Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
Conflicts:
src/cl_gbe_loader.h

10 years agoadd [opencl-1.2] API clCompileProgram.
Luo [Fri, 13 Jun 2014 03:17:37 +0000 (11:17 +0800)]
add [opencl-1.2] API clCompileProgram.

This API compiles a program's source for all the devices or a specific
device in the OpenCL context associated with program.
The pre-processor runs before the program sources are compiled.

Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
10 years agoadd [opencl-1.2] API clCreateSubDevice.
Luo [Fri, 13 Jun 2014 03:17:36 +0000 (11:17 +0800)]
add [opencl-1.2] API clCreateSubDevice.

creates an array of sub-devices that each reference a non-intersecting
set of compute units within in_device, according to a partition scheme
given by properties.

Reviewed-by: He Junyan <junyan.he@inbox.com>
Signed-off-by: Luo <xionghu.luo@intel.com>
10 years agoadd test case runtime_barrier_list and runtime_marker_list.
Luo [Fri, 13 Jun 2014 03:17:35 +0000 (11:17 +0800)]
add test case runtime_barrier_list and runtime_marker_list.

Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Signed-off-by: Luo <xionghu.luo@intel.com>
Conflicts:
utests/CMakeLists.txt

10 years agoadd [opencl-1.2] API clEnqueueBarrierWithWaitList.
Luo [Fri, 13 Jun 2014 03:17:34 +0000 (11:17 +0800)]
add [opencl-1.2] API clEnqueueBarrierWithWaitList.

This command blocks command execution, that is, any following commands
enqueued after it do not execute until it completes;
The clEnqueueMarkerWithWaitList patch that was pushed was not the latest version; it is updated in
 this patch.

Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Signed-off-by: Luo <xionghu.luo@intel.com>
Conflicts:
src/cl_event.c

10 years agoutests: fix the image desc initialization for get_image_info.
Junyan He [Fri, 13 Jun 2014 07:08:10 +0000 (15:08 +0800)]
utests: fix the image desc initialization for get_image_info.

As clCreateImage now performs more checks, we need to
set more elements to pass all the argument checks.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoAdd the test case for 1D image from buffer
Junyan He [Fri, 13 Jun 2014 07:08:01 +0000 (15:08 +0800)]
Add the test case for 1D image from buffer

v2:
should not release the buffer, which is handled by the utest helper.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoAdd the support for 1D image from buffer.
Junyan He [Fri, 13 Jun 2014 07:07:52 +0000 (15:07 +0800)]
Add the support for 1D image from buffer.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd test cases for 1d image fill and copy
Junyan He [Fri, 13 Jun 2014 07:07:44 +0000 (15:07 +0800)]
Add test cases for 1d image fill and copy

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the support for 1D image in backend
Junyan He [Fri, 13 Jun 2014 07:07:31 +0000 (15:07 +0800)]
Add the support for 1D image in backend

1. Delete the is3D member in instruction class. Because we need more
than 1 bit to represent 1D 2D and 3D. We now add an invalid register
in ir profile, and comparing the coords to it to judge the dimension.
2. Rename all the xxx_image to xxx_image2D to make its meaning clear.
3. Update the corresponding Sampler and Typed_Write instructions in selection
and Gen IR generation.

v2:
fix the use of InvalidRegister. Use ir::ocl::invalid only.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoAdd checks for clCreateImage and add 1d image creating logic
Junyan He [Fri, 13 Jun 2014 07:07:10 +0000 (15:07 +0800)]
Add checks for clCreateImage and add 1d image creating logic

Add more checks for image creation according to the spec.
Update the corresponding image utest cases to pass them.
1D image creation is also added.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoadd[opencl-1.2] test case for API clCreateProgramWithBuiltInKernels.
Luo [Fri, 13 Jun 2014 00:58:17 +0000 (08:58 +0800)]
add[opencl-1.2] test case for API clCreateProgramWithBuiltInKernels.

Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoadd [opencl-1.2] API clCreateProgramWithBuiltInKernels.
Luo [Fri, 13 Jun 2014 00:58:16 +0000 (08:58 +0800)]
add [opencl-1.2] API clCreateProgramWithBuiltInKernels.

This API creates a built-in program object for a context, and loads the
built-in kernels into this program object.

v2:
fix the image base index handling issue.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoadd [opencl 1.2] API clEnqueueMarkerWithWaitList.
Luo [Fri, 13 Jun 2014 00:58:15 +0000 (08:58 +0800)]
add [opencl 1.2] API clEnqueueMarkerWithWaitList.

Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
10 years agoAdd the test case for clEnqueueFillBuffer
Junyan He [Fri, 13 Jun 2014 05:30:49 +0000 (13:30 +0800)]
Add the test case for clEnqueueFillBuffer

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoImplement the clEnqueueFillBuffer API.
Junyan He [Fri, 13 Jun 2014 05:30:42 +0000 (13:30 +0800)]
Implement the clEnqueueFillBuffer API.

We use floatn assignment to do the copy.
The 128-byte pattern size corresponds to double16; because of
the double problem on our platform, we use float16
to handle it.
Unaligned cases are not optimized yet; they just use char
assignment.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
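
Illustrative note on the commit above: a host-side Python sketch of what a pattern fill does, independent of how the kernels implement it. It follows the OpenCL 1.2 rules that the pattern size is a power of two up to 128 bytes and that offset and size are multiples of the pattern size. The function name is hypothetical.

```python
VALID_PATTERN_SIZES = (1, 2, 4, 8, 16, 32, 64, 128)


def fill_buffer(buf, pattern, offset, size):
    """Fill buf[offset:offset+size] with repeated copies of pattern."""
    n = len(pattern)
    if n not in VALID_PATTERN_SIZES:
        raise ValueError("invalid pattern size")
    if offset % n or size % n:
        raise ValueError("offset and size must be multiples of the pattern size")
    if offset + size > len(buf):
        raise ValueError("fill region exceeds the buffer")
    buf[offset:offset + size] = pattern * (size // n)
```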
10 years agoAdd the kernels used by clEnqueueFillBuffer API
Junyan He [Fri, 13 Jun 2014 05:30:30 +0000 (13:30 +0800)]
Add the kernels used by clEnqueueFillBuffer API

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: switch to ocl-1.2 header files.
Zhigang Gong [Thu, 12 Jun 2014 06:31:00 +0000 (14:31 +0800)]
GBE: switch to ocl-1.2 header files.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agorelax the build dependency on Gen GPU
Guo Yejun [Mon, 26 May 2014 22:13:12 +0000 (06:13 +0800)]
relax the build dependency on Gen GPU

currently, the Gen GPU pciid of the underlying system is queried
and then passed to gbe_bin_generater as the target option.

This does not work when building the driver on another system with
non-intel GPUs, this patch relaxes the dependency by exporting the
pciid setting at CMake level, therefore, the pciid could be given
as a CMake option besides the current real time query method.

This patch also removes the redundant code in utest/CMake by setting
PARENT_SCOPE in src/CMake.

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoFix the same kernel name issue of OCL_OUTPUT_KERNEL_PERF
Yongjia Zhang [Mon, 23 Jun 2014 15:09:33 +0000 (23:09 +0800)]
Fix the same kernel name issue of OCL_OUTPUT_KERNEL_PERF

Now it treats kernels with same kernel name and different build
options separately. When OCL_OUTPUT_KERNEL_PERF==1, it outputs the
time summary as before, but if OCL_OUTPUT_KERNEL_PERF==2, it will
output the time details including the kernel build options and
kernels with same kernel name but different build options will
output separately.

v2: use strncmp and strncpy instead of strcmp and strcpy.

Signed-off-by: Yongjia Zhang <yongjia.zhang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
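
Illustrative note on the commit above: a Python sketch of keeping timing totals per (kernel name, build options) pair, so that two kernels with the same name but different options are reported separately. The class and method names are hypothetical.

```python
from collections import defaultdict


class KernelPerf:
    def __init__(self):
        # (kernel name, build options) -> [call count, total time]
        self.totals = defaultdict(lambda: [0, 0.0])

    def record(self, kernel_name, build_options, elapsed):
        entry = self.totals[(kernel_name, build_options)]
        entry[0] += 1
        entry[1] += elapsed
```

Using the pair as the dictionary key is what separates the entries; keying on the name alone would merge them.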
10 years agoutest: reduce group size to fit into baytrail platform.
Zhigang Gong [Thu, 12 Jun 2014 08:45:19 +0000 (16:45 +0800)]
utest: reduce group size to fit into baytrail platform.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
10 years agoHSW: Remove the jmpi distance limit of HSW.
Yang Rong [Thu, 12 Jun 2014 15:22:15 +0000 (23:22 +0800)]
HSW: Remove the jmpi distance limit of HSW.

Because HSW's jmpi distance is measured in bytes, the distance in the
JMPI instruction should be S31, so remove the S16 restriction.
This fixes the luxmark failure when OCL_STRICT_CONFORMANCE=1.
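
A rough illustration of why a narrow signed field is too small once the distance is counted in bytes. This is only a sketch under assumptions: the 16-byte instruction size, the reading of the narrow field as a 16-bit two's-complement value and of the wide field as a 32-bit one, and both function names are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one uncompacted instruction occupies 16 bytes. */
#define INSN_BYTES 16

/* Convert a jump over insn_count instructions into a byte distance. */
int64_t jump_distance_bytes(int32_t insn_count) {
  return (int64_t)insn_count * INSN_BYTES;
}

/* True if v fits in a two's-complement field that is `bits` wide. */
bool fits_signed(int64_t v, unsigned bits) {
  assert(bits >= 1 && bits <= 63);
  int64_t max = (INT64_C(1) << (bits - 1)) - 1;
  int64_t min = -max - 1;
  return v >= min && v <= max;
}
```

Under these assumptions a jump over 4096 instructions is 65536 bytes, which no longer fits in a 16-bit signed field but fits easily in a 32-bit one.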

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Li, Peng <peng.li@intel.com>
10 years agoGBE: fix some bugs in 64bit bitcast.
Ruiling Song [Thu, 12 Jun 2014 07:11:52 +0000 (15:11 +0800)]
GBE: fix some bugs in 64bit bitcast.

1. Set the correct vstride when doing an int64 bitcast.
2. The condition to offset to the next half should be (i%multiple) >= multiple/2.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoHSW: Fix potential issue of GT3 when calc stack address.
Yang Rong [Thu, 12 Jun 2014 11:42:12 +0000 (19:42 +0800)]
HSW: Fix potential issue of GT3 when calc stack address.

GT3 has 4 half slices, so shift left by 2 bits. The stack buffer size should also be enlarged;
otherwise, if thread generation is unbalanced, accesses may go out of bounds.
Per bspec, the scratch size needs to be set to 2X of the desired size.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoHandle the difference timestamp count, got from drm_intel_reg_read.
Yang Rong [Thu, 12 Jun 2014 11:04:27 +0000 (19:04 +0800)]
Handle the difference timestamp count, got from drm_intel_reg_read.

On HSW and IVB, on an x86_64 system, the low 32 bits of the timestamp count are stored in the high 32 bits of the result
returned by drm_intel_reg_read, and bits 32-35 are lost; on an i386 system, the timestamp count matches bspec.
This seems to be a kernel readq bug. So shift by 32 bits on x86_64, and keep only 32 bits of data on i386.

V2: Baytrail does not have this issue, but bits 32-35 need to be cleared.
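
The handling described above could be sketched roughly as follows. This is an illustration only: the function name and the two boolean parameters are hypothetical, the raw value is assumed to have already been read (the real code obtains it from drm_intel_reg_read, which is not called here), and the actual patch may select the platform differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduce a raw timestamp register value to a 32-bit count.
 * is_baytrail and is_64bit_host are hypothetical flags for this sketch. */
uint64_t normalize_timestamp(uint64_t raw, bool is_baytrail, bool is_64bit_host) {
  if (!is_baytrail && is_64bit_host) {
    /* HSW/IVB on x86_64: the low 32 bits arrive in the high half. */
    return raw >> 32;
  }
  /* i386, or Baytrail: keep bits 0-31, which clears bits 32-35. */
  return raw & UINT64_C(0xFFFFFFFF);
}
```

Either way the result carries 32 bits of timestamp data, so callers can treat all platforms uniformly.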
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoremove RTLD_DEEPBIND to avoid stdc++ issues
Guo Yejun [Wed, 11 Jun 2014 18:38:22 +0000 (02:38 +0800)]
remove RTLD_DEEPBIND to avoid stdc++ issues

There are weird issues with stdc++ when dlopen()ing a .so file with the
RTLD_DEEPBIND flag. Remove the flag by renaming the function pointers.
The new names in the runtime begin with interp_*, meaning that they finally
go into libgbeinterp.so to interpret the metadata of the binary kernel.

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Junyan He <junyan.he@linux.intel.com>
10 years agofix utest simd_any for simd width 8 and 16
Guo Yejun [Tue, 10 Jun 2014 21:27:26 +0000 (05:27 +0800)]
fix utest simd_any for simd width 8 and 16

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: ignoring some debug related intrinsics.
Zhigang Gong [Fri, 6 Jun 2014 07:34:18 +0000 (15:34 +0800)]
GBE: ignoring some debug related intrinsics.

We don't need to assert on the kernel when we meet some
debug related intrinsics. Just ignore them.

This patch makes beignet work well with Debug
mode clBLAS.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
10 years agoGBE: output compact flag when output asm.
Ruiling Song [Wed, 11 Jun 2014 03:14:52 +0000 (11:14 +0800)]
GBE: output compact flag when output asm.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agofix issue when create cl image from libva with offset
Guo Yejun [Mon, 9 Jun 2014 00:39:33 +0000 (08:39 +0800)]
fix issue when create cl image from libva with offset

To share data between libva and ocl (at the drm level), it is acceptable
to create a cl image from libva with an offset (into the drm object). Correct
the bo offset, whose value finally goes to ss1.base_addr.

Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the utest case for printf
Junyan He [Tue, 10 Jun 2014 04:53:22 +0000 (12:53 +0800)]
Add the utest case for printf

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the printf logic into the run time.
Junyan He [Tue, 10 Jun 2014 04:53:12 +0000 (12:53 +0800)]
Add the printf logic into the run time.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the printfSet into the kernel Class and add misc helper functions
Junyan He [Tue, 10 Jun 2014 04:53:04 +0000 (12:53 +0800)]
Add the printfSet into the kernel Class and add misc helper functions

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the PrintfParser llvm parser into the llvm backend.
Junyan He [Tue, 10 Jun 2014 04:52:54 +0000 (12:52 +0800)]
Add the PrintfParser llvm parser into the llvm backend.

The PrintfParser runs before the llvm gen backend.
It filters out all printf function calls. When a
printf call is found, it analyses the print format
and % placeholders, and replaces the printf call with
STORE or CONV+STORE instructions if needed.
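
To give a feel for the format analysis step, here is a small standalone sketch that scans a format string and counts its % placeholders. It is not the PrintfParser itself: the function name is hypothetical, it works on a plain C string rather than on LLVM IR, and it does not validate conversion characters or flags.

```c
#include <assert.h>
#include <stddef.h>

/* Count the placeholders in a printf-style format string.
 * "%%" is a literal percent sign and is not counted; a '%' at the very
 * end of the string has nothing after it and is ignored. */
size_t count_placeholders(const char *fmt) {
  assert(fmt != NULL);
  size_t n = 0;
  for (size_t i = 0; fmt[i] != '\0'; i++) {
    if (fmt[i] != '%')
      continue;
    if (fmt[i + 1] == '%') {   /* escaped percent: skip both characters */
      i++;
      continue;
    }
    if (fmt[i + 1] == '\0')    /* dangling '%' at the end */
      break;
    n++;
  }
  return n;
}
```

Each counted placeholder corresponds to one argument whose value would have to be written out, which is where a store (possibly preceded by a conversion) would come in.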

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd the PrintfSet class into the ir
Junyan He [Tue, 10 Jun 2014 04:52:45 +0000 (12:52 +0800)]
Add the PrintfSet class into the ir

The PrintfSet is used to collect all the printf information in
the kernel. After the kernel has executed, it is used
to generate the corresponding printf output.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoAdd two special register for printf output buffer usage
Junyan He [Tue, 10 Jun 2014 04:52:37 +0000 (12:52 +0800)]
Add two special register for printf output buffer usage

printfiptr for printf index buffer pointer in curbe
and printfbptr for printf output buffer pointer in curbe.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
10 years agoGBE: support SLM bool load and store.
Zhigang Gong [Tue, 10 Jun 2014 02:45:56 +0000 (10:45 +0800)]
GBE: support SLM bool load and store.

The OCL spec does allow the use of an i1/BOOL SLM
variable, so we have to support loading and storing
it. To keep things simple, I chose to use S16 to represent
the i1 value.

Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>