Zhigang Gong [Fri, 6 Sep 2013 05:43:06 +0000 (13:43 +0800)]
Runtime: enable border color state support.
Also fix the wrong clamp mode for CL_ADDRESS_CLAMP.
According to Gen Bspec, when the surface format is
int/uint, it doesn't support clamp border. We need
to work around it on the kernel side.
v2: move compiler_copy_image1 to the known-issue utest set,
as this patch really enables clamp-to-border
mode for an int/uint surface. We have issues with this
combination, which need to be fixed.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Zhigang Gong [Fri, 6 Sep 2013 05:43:05 +0000 (13:43 +0800)]
Runtime: fix a bug when setting the sampler value.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Zhigang Gong [Fri, 6 Sep 2013 05:43:04 +0000 (13:43 +0800)]
Runtime: disable some unnecessary image formats.
Per OpenCL, the minimum list of supported formats is as below:
CL_RGBA:
CL_UNORM_INT8
CL_UNORM_INT16
CL_SIGNED_INT8
CL_SIGNED_INT16
CL_SIGNED_INT32
CL_UNSIGNED_INT8
CL_UNSIGNED_INT16
CL_UNSIGNED_INT32
CL_HALF_FLOAT
CL_FLOAT
CL_BGRA:
CL_UNORM_INT8
Let's only support these formats and CL_R for now.
Also remove an unnecessary assertion and fix CL_Rx's type size.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Ruiling Song [Wed, 18 Sep 2013 02:18:43 +0000 (10:18 +0800)]
utests: add more constant test cases for composite type.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Wed, 18 Sep 2013 02:18:42 +0000 (10:18 +0800)]
GBE: Support composite type constant.
struct/vector/array of vector/struct of array/array of struct.
Also fix a bug 'constant index into constant array get wrong result'
brought in by patch 'Fix non-4byte program global constant issue'.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: "Sun, Yi" <yi.sun@intel.com>
Yang Rong [Tue, 17 Sep 2013 08:10:01 +0000 (16:10 +0800)]
Implement clEnqueueMarker and clEnqueueBarrier.
Add some event info to cl_command_queue.
One is the list of non-complete user events, used to block the marker event and the barrier.
After these events become CL_COMPLETE, the events blocked by them also
become CL_COMPLETE, so the marker event is also set to CL_COMPLETE. If there are no
user events, we need to wait for the last event to complete and then set the marker event to complete.
Add barrier_index for clEnqueueBarrier. It points into the user events and indicates
how many user events the enqueue APIs that follow clEnqueueBarrier should wait on.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Tue, 17 Sep 2013 08:10:00 +0000 (16:10 +0800)]
Refine and fix some event bugs.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Tue, 17 Sep 2013 08:09:59 +0000 (16:09 +0800)]
Remove unused data in clEnqueueMapImage, and fix a clGetEventInfo bug.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Tue, 17 Sep 2013 08:09:58 +0000 (16:09 +0800)]
Fix cl_mem_kernel_copy_image typo.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Wed, 11 Sep 2013 07:22:04 +0000 (15:22 +0800)]
change constant test case to cover short/long type.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Wed, 11 Sep 2013 07:22:03 +0000 (15:22 +0800)]
Fix non-4byte program global constant issue.
We simply place array elements one after another, that is, packed.
So the constant memory address should be calculated using the real type size.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
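The packed layout described above can be illustrated with a minimal sketch. The size table and the function name below are hypothetical and are not taken from the GBE sources; they only show why the element offset has to use the real type size rather than a fixed 4 bytes.

```python
# Hypothetical element sizes in bytes for a few OpenCL scalar types.
TYPE_SIZE = {"char": 1, "short": 2, "int": 4, "long": 8}


def packed_offset(base, elem_type, index):
    """Byte address of element `index` in a packed array that starts at `base`."""
    # Elements sit one after another, so the stride is the real type size.
    return base + index * TYPE_SIZE[elem_type]
```

With a fixed 4-byte stride, element 3 of a short array would be looked up at byte 12 instead of byte 6.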
Boqun Feng [Tue, 17 Sep 2013 03:41:50 +0000 (11:41 +0800)]
GBE: define python interpreter by cmake variable
In some distros, python is linked to python3 not
python2, and GBE can't be built on such distros
without modification.
CMake provides a variable PYTHON_EXECUTABLE.
By default, this variable is the same as
`/usr/bin/env python`, and if another python2
interpreter is needed, just add this definition to the
`cmake` command:
-DPYTHON_EXECUTABLE:FILEPATH=/path/to/python2
And this will change PYTHON_EXECUTABLE to
/path/to/python2
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Wed, 11 Sep 2013 03:21:37 +0000 (11:21 +0800)]
add 64-bit version of "rhadd"
v2:
keep highest carry bit
tested by piglit test cases:
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-ulong-rhadd-1.0.generated.cl
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-long-rhadd-1.0.generated.cl
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
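As an illustration of why the carry bit matters for 64-bit "rhadd", here is a small Python model. It is only a sketch of the arithmetic, assuming unsigned operands; it is not the code from this patch, and the names are made up for the example.

```python
MASK64 = (1 << 64) - 1


def rhadd_u64(a, b):
    """Model of rhadd for unsigned 64-bit inputs: (a + b + 1) >> 1."""
    total = (a & MASK64) + (b & MASK64) + 1
    low = total & MASK64   # what a 64-bit add leaves in the register
    carry = total >> 64    # the bit that overflowed out of bit 63 (0 or 1)
    # Dropping the carry would lose the top bit of the average.
    return (carry << 63) | (low >> 1)
```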
Homer Hsing [Fri, 13 Sep 2013 01:41:02 +0000 (09:41 +0800)]
support converting 64-bit integer to 32-bit float
version 2:
improve algorithm to convert signed integer
fix source operand type in llvm_gen_backend
enable predicate in addWithCarry
change test case to test signed integer
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Yang Rong [Mon, 9 Sep 2013 08:10:23 +0000 (16:10 +0800)]
Implement api clEnqueueCopyBufferToImage.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Mon, 9 Sep 2013 08:10:22 +0000 (16:10 +0800)]
Implement api clEnqueueCopyImageToBuffer.
Also fix the 3D image error in function cl_mem_kernel_copy_image.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Fri, 13 Sep 2013 06:06:59 +0000 (14:06 +0800)]
Implement api clEnqueueTask and clEnqueueNativeKernel.
Also refine the whole memcpy's condition in function
cl_enqueue_read_buffer_rect and cl_enqueue_write_buffer_rect.
V2: Add a mem_list to enqueue_data to fix utest error.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Fri, 13 Sep 2013 02:22:48 +0000 (10:22 +0800)]
add built-in function "atan2pi"
version 2: fix a typo. and add corner cases
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Junyan He [Thu, 12 Sep 2013 02:52:47 +0000 (10:52 +0800)]
Add the virtual destructor of Serialization to kill a warning.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Junyan He [Thu, 12 Sep 2013 06:06:18 +0000 (14:06 +0800)]
Add a test case for binary load.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Junyan He [Wed, 11 Sep 2013 10:07:51 +0000 (18:07 +0800)]
Implement the clCreateProgramWithBinary to deserialize the binary.
We do not check the format of the binary yet.
We need to check the binary file format to handle the internal binary,
the LLVM binary, or an invalid format differently.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Junyan He [Wed, 11 Sep 2013 10:07:44 +0000 (18:07 +0800)]
Add a tool program to build and serialize the program.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Junyan He [Wed, 11 Sep 2013 10:07:39 +0000 (18:07 +0800)]
Add the serialization support for backend
The Serializable class defines the interface of the serialize_to/deserialize_from
functions for internal binary and llvm binary, and also a print status
function for debugging.
Classes that need serialization support must derive from it.
These classes include: Program, Kernel, ConstantSet, ImageSet and SamplerSet.
This patch just adds serialize_to/deserialize_from internal binary support for
all these classes.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Homer Hsing [Wed, 11 Sep 2013 03:04:56 +0000 (11:04 +0800)]
add 64-bit version of "hadd"
v2:
keep top carry bit
passed piglit test cases:
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-long-hadd-1.0.generated.cl
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-ulong-hadd-1.0.generated.cl
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
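For reference, the 64-bit "hadd" result can also be modelled without any wider intermediate. The sketch below uses the well-known bitwise identity for the floor average; it is an assumption for illustration, not necessarily how this patch computes it.

```python
MASK64 = (1 << 64) - 1


def hadd_u64(a, b):
    """Model of hadd for unsigned 64-bit inputs: (a + b) >> 1 without overflow."""
    a &= MASK64
    b &= MASK64
    # Shared bits count once, differing bits count half, so nothing overflows.
    return (a & b) + ((a ^ b) >> 1)
```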
Homer Hsing [Mon, 2 Sep 2013 01:25:10 +0000 (09:25 +0800)]
support converting 64-bit integer to shorter integer
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Homer Hsing [Thu, 29 Aug 2013 05:41:24 +0000 (13:41 +0800)]
add built-in function "atan2"
also improve the accuracy of built-in function "atan"
also add a test case
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Yi Sun [Mon, 9 Sep 2013 08:54:12 +0000 (16:54 +0800)]
utest.cpp: run the cases with issue separately.
We should run both passed cases and failed cases via option '-c'.
Signed-off-by: Yi Sun <yi.sun@intel.com>
Reviewed-by: "Lu, Guanqun" <guanqun.lu@intel.com>
Yang Rong [Mon, 9 Sep 2013 08:10:09 +0000 (16:10 +0800)]
Add api clEnqueueCopyImage.
Also do some minor changes:
1. Add an image var name to macro CHECK_IMAGE.
2. Fix local size error in cl_mem_copy_buffer_rect.
3. Fix cl_enqueue_write_image typo.
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Mon, 9 Sep 2013 08:10:08 +0000 (16:10 +0800)]
Add clEnqueueCopyBufferRect api.
Use an enqueued ND range to copy between two buffers. For now we compile the kernel string;
once binary loading is ready, we should use a static binary.
V2: Add a comment for function check_copy_overlap and rename CL_INVALID to CL_INTERNAL_KERNEL_MAX.
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Wed, 4 Sep 2013 08:58:08 +0000 (16:58 +0800)]
Add clEnqueueWriteBufferRect api.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Wed, 4 Sep 2013 08:58:07 +0000 (16:58 +0800)]
Add clEnqueueReadBufferRect api.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Mon, 26 Aug 2013 14:45:48 +0000 (22:45 +0800)]
CL: Enable gl sharing with new egl extension.
The previous implementation is only for 2d/3d texture sharing and
is implemented in a hacky fashion. We need to replace it with a
clean and complete one. We introduce a new egl extension to export
low level layout information of a buffer object/texture/render buffer
from the mesa dri driver to the cl driver layer. As the extension is
not accepted by mesa, we have to implement this new extension in
beignet internally.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: He Junyan <junyan.he@inbox.com>
Zhigang Gong [Wed, 4 Sep 2013 07:04:03 +0000 (15:04 +0800)]
Runtime: Only return the format allowed in the spec.
For CL_INTENSITY and CL_LUMINANCE, the spec only allows
CL_UNORM_INT8, CL_UNORM_INT16, CL_SNORM_INT8, CL_SNORM_INT16,
CL_HALF_FLOAT or CL_FLOAT.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
Zhigang Gong [Tue, 3 Sep 2013 10:01:32 +0000 (18:01 +0800)]
GBE: silence the compilation warning when generating the pch file.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: "Sun, Yi" <yi.sun@intel.com>
Homer Hsing [Wed, 4 Sep 2013 02:33:39 +0000 (10:33 +0800)]
fix 64-bit "clz" if parameter is "long4" or "ulong4"
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Wed, 4 Sep 2013 06:24:54 +0000 (14:24 +0800)]
Implement constant buffer based on constant cache.
Currently, simply allocate enough graphics memory as constant memory space
and bind it to bti 2. Constant cache reads are backed by dword scatter reads.
Unlike other data port messages, the addresses need to be dword aligned
and are in units of dwords.
The constant address space data are placed in order: first the global constants,
then the constant buffer kernel arguments.
v2: change function and variable naming to distinguish 'curbe' from 'constant buffer'
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
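The addressing rule and the placement order above can be sketched as follows. All names are hypothetical, and aligning each kernel argument to a dword boundary is an assumption made for the example; the log only states that addresses must be dword aligned and expressed in dwords.

```python
DWORD = 4  # bytes


def align_up(value, alignment=DWORD):
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def dword_index(byte_offset):
    """Convert a byte offset into the dword-unit address used by the read."""
    if byte_offset % DWORD != 0:
        raise ValueError("constant cache addresses must be dword aligned")
    return byte_offset // DWORD


def layout_constant_space(global_const_size, arg_sizes):
    """Byte offsets of constant-buffer kernel arguments, placed after the globals."""
    offsets = []
    cursor = align_up(global_const_size)  # global constants come first
    for size in arg_sizes:
        offsets.append(cursor)
        cursor = align_up(cursor + size)  # assumed: each argument starts on a dword
    return offsets
```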
Yang Rong [Wed, 4 Sep 2013 05:55:23 +0000 (13:55 +0800)]
Fix atomic_xchg float type error.
Also refine the "\" of some atomic macros.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Wed, 4 Sep 2013 06:35:18 +0000 (14:35 +0800)]
Utests: Enable bool_cross_basic_block.
And put it in the category with known issues. It will be run
when invoking the utests as below:
./utest_run -a
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yi Sun [Wed, 4 Sep 2013 02:48:48 +0000 (10:48 +0800)]
Utests_run: Add known issue cases support.
Add some arguments:
-c <casename>: run sub-case named 'casename'
-l : list all the available case name
-a : run all test cases
-n : run all test cases without known issue
-h : display this usage
Add an alternate macro named MAKE_UTEST_FROM_FUNCTION_WITH_ISSUE to register a new test case that has some known issue still to be fixed.
When utest_run runs, only cases registered with MAKE_UTEST_FROM_FUNCTION are run by default.
If you want to run all the test cases, including those with known issues, you should use argument '-a'.
Besides, you can use option '-c' to run any test case.
Signed-off-by: Yi Sun <yi.sun@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
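The selection rules described in this commit can be modelled with a short sketch. This is a hypothetical Python stand-in for the two registration macros and the option handling, not the utest_run implementation; the two case names are borrowed from elsewhere in this log purely as labels.

```python
REGISTRY = []  # list of (case name, has_known_issue)


def make_utest(name):
    """Stands in for MAKE_UTEST_FROM_FUNCTION."""
    REGISTRY.append((name, False))


def make_utest_with_issue(name):
    """Stands in for MAKE_UTEST_FROM_FUNCTION_WITH_ISSUE."""
    REGISTRY.append((name, True))


def select_cases(option=None, casename=None):
    """Return the case names that a given command-line option would run."""
    if option == "-c":
        return [n for n, _ in REGISTRY if n == casename]
    if option == "-a":
        return [n for n, _ in REGISTRY]
    # Default and "-n": skip cases registered with a known issue.
    return [n for n, issue in REGISTRY if not issue]


make_utest("compiler_abs_diff")
make_utest_with_issue("bool_cross_basic_block")
```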
Zhigang Gong [Fri, 30 Aug 2013 05:23:06 +0000 (13:23 +0800)]
Runtime: fix the incorrect global mem size.
The max_mem_alloc_size is 128M, so we should set the global mem size
less than or equal to it. Maybe we can set both of them much
larger than 128M in the future. For now, just set it to 128MB.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Tue, 3 Sep 2013 07:42:37 +0000 (15:42 +0800)]
Change constant unit test to cover 4 byte data type.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Tue, 3 Sep 2013 07:42:35 +0000 (15:42 +0800)]
GBE: Enable DWord scatter gather message for constant cache read.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Wed, 4 Sep 2013 01:18:20 +0000 (09:18 +0800)]
fix GPU data type for 16-bit moving
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Tue, 3 Sep 2013 07:39:56 +0000 (15:39 +0800)]
utest: memset the output buffer to fix random fail.
The inactive lanes will not modify the corresponding output.
So the output buffer needs to be initialized to 0.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Tue, 3 Sep 2013 06:30:46 +0000 (14:30 +0800)]
GBE: Support builtin vector functions for select() autogeneration.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
Homer Hsing [Tue, 3 Sep 2013 03:13:18 +0000 (11:13 +0800)]
add same type "convert_*(*)"
add some versions of "convert_*(*)" converting same-type parameter
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Tue, 3 Sep 2013 00:41:28 +0000 (08:41 +0800)]
fix 32-bit signed version of "sub_sat"
This patch makes following piglit test case pass.
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-int-sub_sat-1.0.generated.cl
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
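For readers unfamiliar with the built-in, the expected behaviour of 32-bit signed "sub_sat" can be written as a tiny reference model. This is a sketch of the semantics only, with made-up names, and says nothing about how the fix is implemented on Gen.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def sub_sat_i32(a, b):
    """Reference model: a - b, clamped to the signed 32-bit range."""
    # Python integers do not wrap, so the exact difference can be clamped directly.
    return max(INT32_MIN, min(INT32_MAX, a - b))
```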
Homer Hsing [Tue, 3 Sep 2013 00:20:35 +0000 (08:20 +0800)]
add 64-bit version of "rotate"
tested by piglit. following test cases pass.
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-long-rotate-1.0.generated.cl
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-ulong-rotate-1.0.generated.cl
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
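A reference model of 64-bit "rotate" (left rotation, with the shift count taken modulo the bit width) might look like the sketch below. The function name is hypothetical and the code only models the expected result.

```python
MASK64 = (1 << 64) - 1


def rotate_u64(value, count):
    """Rotate a 64-bit value left by `count` bits (count is reduced modulo 64)."""
    value &= MASK64
    count %= 64
    # Bits shifted out on the left re-enter on the right.
    return ((value << count) | (value >> (64 - count))) & MASK64
```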
Homer Hsing [Mon, 2 Sep 2013 08:33:23 +0000 (16:33 +0800)]
add 64-bit version of "clz"
this patch passes following piglit test cases:
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-ulong-clz-1.0.generated.cl
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-long-clz-1.0.generated.cl
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
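One way to build a 64-bit "clz" from a 32-bit one is to inspect the high half first. The sketch below assumes that approach and assumes clz(0) returns the full bit width; neither detail is confirmed by this log.

```python
def clz32(x):
    """Leading zero count of a 32-bit value (32 for zero)."""
    return 32 - (x & 0xFFFFFFFF).bit_length()


def clz64(x):
    """Leading zero count of a 64-bit value, composed from two 32-bit halves."""
    hi = (x >> 32) & 0xFFFFFFFF
    lo = x & 0xFFFFFFFF
    # If the high half is non-zero it decides the answer alone.
    return clz32(hi) if hi else 32 + clz32(lo)
```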
Homer Hsing [Mon, 2 Sep 2013 08:21:25 +0000 (16:21 +0800)]
fix 8-bit version of "clz"
fix a typo in ocl_stdlib.tmpl.h
fix instruction type of 8-bit moving
this patch is tested by piglit
following two test cases have passed:
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-char-clz-1.0.generated.cl
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-uchar-clz-1.0.generated.cl
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 2 Sep 2013 05:42:35 +0000 (13:42 +0800)]
add 64-bit version of "shuffle", "shuffle2"
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 2 Sep 2013 05:21:47 +0000 (13:21 +0800)]
Add scalar version of "convert_*(*)"
Scalar version of "convert_*(*)" was missing.
This patch adds scalar version.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 2 Sep 2013 04:32:40 +0000 (12:32 +0800)]
fix scalar type built-in function "select"
add some missing scalar type version
v2: third parameter of "select" cannot be "float"
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 2 Sep 2013 02:59:51 +0000 (10:59 +0800)]
add 64-bit version of "bitselect"
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Fri, 30 Aug 2013 03:16:24 +0000 (11:16 +0800)]
Runtime: fix the max group size for GT2.
We should keep the max group size and the CL_KERNEL_WORK_GROUP_SIZE
consistent with each other. Otherwise, the conformance test will trigger
an error.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Lu, Guanqun" <guanqun.lu@intel.com>
Zhigang Gong [Thu, 29 Aug 2013 06:35:01 +0000 (14:35 +0800)]
GBE: We should set no predication/mask for EOT preparation.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
Zhigang Gong [Thu, 29 Aug 2013 02:47:35 +0000 (10:47 +0800)]
Runtime: initialize single fp mode correctly.
According to opencl spec,
The mandated minimum single precision floating-point capability given by
CL_DEVICE_SINGLE_FP_CONFIG is CL_FP_ROUND_TO_ZERO or CL_FP_ROUND_TO_NEAREST.
We set the single float mode to IEEE 754 and set the rounding mode
to RTN.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
Zhigang Gong [Thu, 29 Aug 2013 02:47:34 +0000 (10:47 +0800)]
Runtime: vendor specified information is required for CL_DEVICE_VERSION/OPENCL_C_VERSION.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
Zhigang Gong [Fri, 30 Aug 2013 09:19:40 +0000 (17:19 +0800)]
Runtime: clEnqueueMapImage also needs to maintain the mapped images.
v3: Use cl_mem_unmap_gtt rather than cl_mem_unmap_auto in function _cl_map_mem.
v2: merge with:
commit
0237652c579123436e5f48514f733e36c8b5264a
Author: Yang Rong <rong.r.yang@intel.com>
Date: Fri Aug 23 11:04:21 2013 +0800
Add clEnqueueMapBuffer and clEnqueueMapImage non-blocking map support.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
Zhigang Gong [Mon, 2 Sep 2013 04:48:04 +0000 (12:48 +0800)]
GBE: null register could be used as src1.
We should not assert if null register is used as src1.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Fri, 30 Aug 2013 03:16:23 +0000 (11:16 +0800)]
GBE: add some macros for atom_xxx builtin functions.
The atom_xxx APIs are from OpenCL spec 1.0, but the conformance test suite
will test them anyway.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Lu, Guanqun" <guanqun.lu@intel.com>
Zhigang Gong [Thu, 29 Aug 2013 06:27:05 +0000 (14:27 +0800)]
GBE: don't use flag register as src 1 for xor instruction.
Gen doesn't support using ARF as src1. This bug is reported by
Edward Ching <edward.k.ching@gmail.com>.
v2: add an assert at setSrc1 to check whether we encode an instruction which
is using ARF as SRC1.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Edward Ching <dward.k.ching@gmail.com>
Yang Rong [Thu, 29 Aug 2013 05:07:38 +0000 (13:07 +0800)]
Correct a typo in the event type.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Lu Guanqun [Wed, 28 Aug 2013 02:16:57 +0000 (10:16 +0800)]
add a space to make the error more readable
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Wed, 28 Aug 2013 08:53:36 +0000 (16:53 +0800)]
Runtime: fix the incorrect platform info size (conformance).
As sizeof(str) already includes the '\0', we should not add 1
on the return size. Conformance case computeinfo could pass with
this patch.
(28-Aug 16:51:00) BEGIN Compute Info :
==> CL_DEVICE_ERROR_CORRECTION_SUPPORT == 0
==> CL_DEVICE_ERROR_CORRECTION_SUPPORT == 0
==> CL_DEVICE_ERROR_CORRECTION_SUPPORT == 0
PASSED computeinfo.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
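The off-by-one described here is easy to reproduce outside the driver. The sketch below uses Python's ctypes to mimic a C char array holding a string literal; the platform string is a made-up placeholder, not the one Beignet reports.

```python
import ctypes

# Hypothetical info string; create_string_buffer appends the terminating NUL,
# just like a C array initialised from a string literal.
PLATFORM_NAME = b"Example Platform"
buf = ctypes.create_string_buffer(PLATFORM_NAME)

# sizeof already counts the trailing '\0', so adding 1 again would overreport.
size_with_nul = ctypes.sizeof(buf)
```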
Ruiling Song [Fri, 30 Aug 2013 08:29:32 +0000 (16:29 +0800)]
Fix utest compiler_group_size4 error.
Per opencl spec, bitfield is not supported.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Fri, 23 Aug 2013 03:04:22 +0000 (11:04 +0800)]
Change event test case to cover clEnqueueMapBuffer.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Fri, 23 Aug 2013 03:04:21 +0000 (11:04 +0800)]
Add clEnqueueMapBuffer and clEnqueueMapImage non-blocking map support.
There is an unsynchronized map function, drm_intel_gem_bo_map_unsynchronized, in drm that can
be used to do a non-blocking map. But this function only maps through the GTT, so force a GTT
map for all clEnqueueMapBuffer and clEnqueueMapImage calls.
V2: refine the comment, and use map_gtt_unsync in clEnqueueMapBuffer/Image
instead of map_auto to avoid confusion.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Mon, 26 Aug 2013 07:44:55 +0000 (15:44 +0800)]
Add pfn_notify support in clCreateContext.
Remove the assert in cl_create_context when pfn_notify is not NULL,
and save it, although it is not used yet.
Per the spec, the driver should call it when a device becomes unavailable.
For now the driver doesn't check the device status.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 26 Aug 2013 04:51:53 +0000 (12:51 +0800)]
add built-in function "lgamma", "lgamma_r"
also include test cases
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 26 Aug 2013 02:20:33 +0000 (10:20 +0800)]
add built-in function "tgamma"
also include a test case
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Fri, 30 Aug 2013 07:24:42 +0000 (15:24 +0800)]
improve built-in function "sinpi"
"sinpi" was calculated as "sin(pi * x)".
But that was not a very good approach.
This patch improves the function and also includes a test case.
v2: fix compiling warning
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
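The log does not say how the function was improved. One common technique, shown here only as an assumed illustration, is to reduce the argument exactly before multiplying by pi, so that integer inputs give exactly zero instead of a small rounding residue.

```python
import math


def sinpi(x):
    """Sketch of sin(pi * x) with exact argument reduction (assumed approach)."""
    r = math.fmod(x, 2.0)  # exact; r is in (-2, 2) with the sign of x
    if r > 1.0:
        r -= 2.0
    elif r < -1.0:
        r += 2.0           # now r is in [-1, 1]
    # Fold into [-0.5, 0.5] using sin(pi * (1 - r)) == sin(pi * r).
    if r > 0.5:
        r = 1.0 - r
    elif r < -0.5:
        r = -1.0 - r
    return math.sin(math.pi * r)
```

By contrast, evaluating `math.sin(math.pi * 1.0)` directly returns a tiny non-zero value, because the double nearest to pi is not pi itself.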
Zhigang Gong [Mon, 26 Aug 2013 14:45:47 +0000 (22:45 +0800)]
CL: Refactor cl_mem's implementation.
The buffer object is much simpler than the image object.
We had better not use the same big data structure for
both objects.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Lu, Guanqun" <guanqun.lu@intel.com>
Chuanbo Weng [Thu, 22 Aug 2013 11:23:38 +0000 (19:23 +0800)]
Add a test case that triggers a known bug.
This unit test case triggers a known bug:
ASSERTION FAILED: TODO Boolean values cannot escape their definition
basic block.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Thu, 22 Aug 2013 08:52:05 +0000 (16:52 +0800)]
utests: Add a unit test for non-aligned group size.
To hit the predication logic.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Ruiling Song [Thu, 22 Aug 2013 08:52:04 +0000 (16:52 +0800)]
GBE: Clear Flag register to fix a gpu hang.
When the group size is not aligned to simdWidth, any8/16h predication
also calculates pmask using flag register bits mapped to non-active
lanes. As the flag register is not cleared by default, any8/16h used
for a jmpi instruction may cause a wrong jump and possibly an infinite loop.
So we clear the flag register to 0 to make any8/16h predication work correctly.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Wed, 21 Aug 2013 09:18:04 +0000 (17:18 +0800)]
GBE: disable cl_khr_fp64.
As the double support is incomplete currently, we disable it.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Lu Guanqun [Tue, 20 Aug 2013 07:01:22 +0000 (15:01 +0800)]
list all available utests' names
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Lu Guanqun [Tue, 20 Aug 2013 06:45:15 +0000 (14:45 +0800)]
rename ulong to ulong64 to avoid the conflicts in <sys/types.h>
[ 31%] Building CXX object utests/CMakeFiles/utests.dir/compiler_abs_diff.cpp.o
/home/q/beignet.git/utests/compiler_abs_diff.cpp:201:18: error: conflicting declaration ‘typedef uint64_t ulong’
/usr/include/i386-linux-gnu/sys/types.h:151:27: error: ‘ulong’ has a previous declaration as ‘typedef long unsigned int ulong’
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Lu Guanqun [Mon, 19 Aug 2013 06:23:56 +0000 (14:23 +0800)]
fix warning when egl is not there
[ 32%] Building CXX object utests/CMakeFiles/utests.dir/utest_helper.cpp.o
/home/q/beignet.git/utests/utest_helper.cpp: In function ‘int cl_ocl_init()’:
/home/q/beignet.git/utests/utest_helper.cpp:314:8: warning: variable ‘hasGLExt’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Lu Guanqun [Mon, 19 Aug 2013 06:23:55 +0000 (14:23 +0800)]
fix left shift warning
/home/q/beignet.git/utests/compiler_long.cpp: In function ‘void compiler_long()’:
/home/q/beignet.git/utests/compiler_long.cpp:33:32: warning: left shift count >= width of type [enabled by default]
/home/q/beignet.git/utests/compiler_long.cpp:34:32: warning: left shift count >= width of type [enabled by default]
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Lu Guanqun [Mon, 19 Aug 2013 04:29:17 +0000 (12:29 +0800)]
fix left shift warnings in utests
We should use the explicit 64 bit types. Otherwise we would have warnings.
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Fri, 16 Aug 2013 07:28:52 +0000 (15:28 +0800)]
Utests: enable long/ulong for abs_diff test case.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 19 Aug 2013 06:55:29 +0000 (14:55 +0800)]
enable signed 64-bit version of "abs_diff"
fixed operand type in IR instruction "move".
used one less flag register in 64-bit integer comparing.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 19 Aug 2013 02:38:00 +0000 (10:38 +0800)]
enable unsigned 64bit version of "abs_diff"
tested by piglit,
piglit/framework/../bin/cl-program-tester generated_tests/cl/builtin/int/builtin-ulong-abs_diff-1.0.generated.cl
piglit test case passed.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 19 Aug 2013 01:43:32 +0000 (09:43 +0800)]
GBE: skip instruction pattern match for 64 bit sel_cmp.
CPU instruction "sel_cmp" doesn't support 64bit int.
Do not emit SelectModifierInstructionPattern in that case.
tested by piglit. piglit test cases "long(ulong)-max(min,clamp)" all passed.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Mon, 19 Aug 2013 01:41:17 +0000 (09:41 +0800)]
fix a typo
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Fri, 16 Aug 2013 08:24:09 +0000 (16:24 +0800)]
Add async copy and async stride copy test case.
Just hard code the int2 and char4 types. Other types have been tested using
the conformance test.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Fri, 16 Aug 2013 08:24:08 +0000 (16:24 +0800)]
Implement async and prefetch built-in.
Use normal load & store to implement async copy,
so wait_group_events uses a barrier.
Prefetch is just defined as an empty function.
V2: fix llvm build error.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Fri, 16 Aug 2013 07:54:24 +0000 (15:54 +0800)]
test 64bit version of "upsample"
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Yang Rong [Thu, 15 Aug 2013 09:10:15 +0000 (17:10 +0800)]
Fix unit test compiler_load_bool_imm error.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Fri, 16 Aug 2013 01:45:17 +0000 (09:45 +0800)]
add 64bit version of "upsample"
since simple 64bit integers are supported,
add the 64bit version of "upsample".
to test this patch, in piglit, run
bin/cl-program-tester generated_tests/cl/builtin/int/builtin-int-upsample-1.0.generated.cl
bin/cl-program-tester generated_tests/cl/builtin/int/builtin-uint-upsample-1.0.generated.cl
piglit test cases all pass.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
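The 64-bit result of "upsample" is the high operand shifted up by 32 bits, combined with the unsigned low operand. The following Python model is a sketch of that definition with hypothetical function names, not the code added by this patch.

```python
MASK32 = 0xFFFFFFFF


def upsample_u64(hi, lo):
    """Model of upsample(uint hi, uint lo) -> ulong."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def upsample_i64(hi, lo):
    """Model of upsample(int hi, uint lo) -> long, as a signed Python int."""
    result = upsample_u64(hi, lo)
    # Reinterpret the 64-bit pattern as two's complement.
    return result - (1 << 64) if result >= (1 << 63) else result
```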
Homer Hsing [Thu, 15 Aug 2013 06:53:27 +0000 (14:53 +0800)]
add empty 64bit-integer version built-in functions
also change the vector built-in generator to auto generate
64bit-integer versions of built-in functions.
Function bodies are empty for now; details will be added in the future.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Wed, 14 Aug 2013 08:23:18 +0000 (16:23 +0800)]
support built-in function mad_sat(int) and mad_sat(uint)
this patch has been tested by piglit.
piglit test cases "int_mad_sat" and "uint_mad_sat" passed.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
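As a reference for what the piglit cases check, "mad_sat" computes a * b + c and saturates the result to the operand type. The model below is a sketch of those semantics only; the names are invented for the example.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1
UINT32_MAX = (1 << 32) - 1


def mad_sat_i32(a, b, c):
    """Reference model of mad_sat(int): a * b + c clamped to the int range."""
    return max(INT32_MIN, min(INT32_MAX, a * b + c))


def mad_sat_u32(a, b, c):
    """Reference model of mad_sat(uint): a * b + c clamped to the uint range."""
    # Unsigned inputs cannot produce a negative result, so only the top clamps.
    return min(UINT32_MAX, a * b + c)
```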
Zou Nan hai [Thu, 15 Aug 2013 23:56:08 +0000 (07:56 +0800)]
use r112 as source of EOT message
Fix random hang cases.
use r112 as source of EOT message.
Bspec requires r112-r127 as EOT message source.
Signed-off-by: Zou Nanhai <nanhai.zou@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zhigang Gong [Thu, 15 Aug 2013 02:51:33 +0000 (10:51 +0800)]
GBE: fix an illegal instruction.
Per Gen ISA spec:
When ExecSize = Width, VertStride must be set to Width * HorzStride.
For horizontal stride 2 in bottom_half, we always use it in simd8 mode,
so we need to set the VertStride to 16 according to the above restriction.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
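The arithmetic behind the fix can be checked with a short sketch of the quoted rule. The helper is hypothetical and only encodes the single restriction cited above; it is not a full model of Gen region rules.

```python
def required_vert_stride(exec_size, width, horz_stride):
    """Return the VertStride forced by the rule, or None if the rule does not apply."""
    if exec_size == width:
        # When ExecSize = Width, VertStride must be Width * HorzStride.
        return width * horz_stride
    return None
```

For the simd8 case with horizontal stride 2, the rule gives 8 * 2 = 16.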
Zhigang Gong [Wed, 14 Aug 2013 08:07:15 +0000 (16:07 +0800)]
GBE: I64CMP should be treated as CMP in reg allocation and insn scheduling.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Wed, 14 Aug 2013 06:23:51 +0000 (14:23 +0800)]
test 64bit-integer comparing
only works when OCL_POST_ALLOC_INSN_SCHEDULE=0
because the post alloc scheduler puts CMP after SEL, but in IR,
CMP is before SEL, like this
GT.int64 %34 %31 %33
LOADI.int64 %38 3
LOADI.int64 %39 4
SEL.int64 %35 %34 %38 %39
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Homer Hsing [Wed, 14 Aug 2013 01:40:33 +0000 (09:40 +0800)]
support 64bit-integer comparing
support 64bit-integer comparing,
including EQ(==), NEQ(!=), G(>), GE(>=), L(<), LE(<=)
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Zou Nan hai [Tue, 13 Aug 2013 23:29:18 +0000 (07:29 +0800)]
Flush the queue after enqueue.
Flush the queue after enqueue.
This can fix some random fails in unit tests.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Yi Sun <yi.sun@intel.com>