contrib/beignet.git
11 years agoFixed a bug when expire registers.
Zhigang Gong [Tue, 19 Feb 2013 11:52:03 +0000 (19:52 +0800)]
Fixed a bug when expire registers.

The previous implementation forgot to change the head when
the to expired register is at the left side of the current
head. Thus the algorithm will be broken, as the algorithm need
the head has the smallest offset.

Without this patch, the register expireing doesn't work. Thus
any kernel function need more than 44 DWORD registers or 11
DWORD vec4 will fail to get registers.

The calculation is:

(Register file size / (type size * simdwidth) - special registers)
(4K / (4 * 16)) - 20 = 44

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoAdd utest case for movforphi's undef case.
Zhigang Gong [Tue, 19 Feb 2013 03:26:35 +0000 (11:26 +0800)]
Add utest case for movforphi's undef case.

This case will trigger MovForPhi to handle a undef vector
element.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoDelete temp files if compiled successfully
Homer Hsing [Thu, 21 Feb 2013 02:45:42 +0000 (10:45 +0800)]
Delete temp files if compiled successfully

If IR was compiled successfully, delete temp files.
My '/tmp' was full of '*.cl', '*.ll' files.
Now the temp files are gone, world is clean :)

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoOutput meaning of special registers in dumped IR
Homer Hsing [Sun, 17 Feb 2013 02:59:44 +0000 (10:59 +0800)]
Output meaning of special registers in dumped IR

Help debug IR. Before we change this, we feel hard to know
what "%3, %4" stuff mean in IR. Now we output their meaning.

Before we change this, dumped IR is:
    .decl.dword %0
    .decl.dword %1
    .decl.dword %2

After we change this, dumped IR is:
    .decl.dword %0 local_id_0
    .decl.dword %1 local_id_1
    .decl.dword %2 local_id_2

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoDisplay function argument name in IR
Homer Hsing [Sun, 17 Feb 2013 01:59:18 +0000 (09:59 +0800)]
Display function argument name in IR

Help debug IR. If we see "LT.int32 %31 %30 %20" and we know
"%20" is a input argument with detailed name, we can debug
IR better.

Before we change it, dumped IR is:
   decl_input.value %20

After we change it, dumped IR is:
   decl_input.value %20 argument_name

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix the possible overflow in slm_sz
Lu Guanqun [Wed, 30 Jan 2013 01:11:10 +0000 (09:11 +0800)]
fix the possible overflow in slm_sz

slm_sz in this structure has 16 bits, but we might specify 64KB which has 17
bits, it would thus cause overflow and undesired truncation.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFix uninitialize value warning
Zou, Nanhai [Mon, 4 Feb 2013 07:10:40 +0000 (15:10 +0800)]
Fix uninitialize value warning

Give a inital value to shut up compiler
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAlso make "arithmetic shift right" work
Homer Hsing [Fri, 1 Feb 2013 07:52:44 +0000 (15:52 +0800)]
Also make "arithmetic shift right" work

Only add a line of code ...
Looks like Mr. Ben has forgotten that line ...
Also add a test case.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoMake "logical shift right" work
Homer Hsing [Fri, 1 Feb 2013 05:48:59 +0000 (13:48 +0800)]
Make "logical shift right" work

Before we fix the bug, the "logical shift right" is wrong:
  (0xFF000000U >> 24) == 0xFFFFFFFF

After we fix the bug, it is right.
  (0xFF000000U >> 24) == 0xFF

Also add a test case, test by 128 random input.

This patch fixes "component_transfer_linear" filter.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd convert_uchar_sat and test case
Homer Hsing [Wed, 30 Jan 2013 06:05:31 +0000 (14:05 +0800)]
Add convert_uchar_sat and test case

'convert_uchar_sat' converts float to uchar saturately.
'convert_uchar_sat' simply calls add_sat.
by 'convert_uchar_sat' function we don't need to clamp(value, 0, 255).
we also add a test case.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoFixed a bug on 64bit system.
Zhigang Gong [Mon, 28 Jan 2013 09:06:57 +0000 (17:06 +0800)]
Fixed a bug on 64bit system.

Previous implementation only considers the 32bit system.
This commit fix it. And use bsf rather than bsr, then we
don't really need to fixup the slot number according to
the bit width. And remove the non-intel platform code.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Feng, Boqun <boqun.feng@intel.com>
11 years agofix disassembler: horizontal stride of dest operand
Homer Hsing [Wed, 30 Jan 2013 02:48:06 +0000 (10:48 +0800)]
fix disassembler: horizontal stride of dest operand

If horizontal stride of a dest operand is four,
before we fix it, the disassembler outputs
(+f0) mov(8) g20<3>UB g18<8,8,1>F { align1 WE_normal 1Q };

after we fix it,
(+f0) mov(8) g20<4>UB g18<8,8,1>F { align1 WE_normal 1Q };

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoTest case for fabs
Homer Hsing [Fri, 25 Jan 2013 02:31:59 +0000 (10:31 +0800)]
Test case for fabs

Test 128 times fabs(). Input data is random

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agofix the wrong zero extend instruction handling
Lu Guanqun [Thu, 24 Jan 2013 04:22:08 +0000 (12:22 +0800)]
fix the wrong zero extend instruction handling

This helps to fix the problem facing Xing Homer. When we encounter 'zext'
instruction, we should take the soruce type as an unsigned type, otherwise
we're just sign extend the number, which is not what we want.

For the implementation, I'm reusing getUnsignedType() and this expects an
integer to be accepted, and this conforms to the spec of LLVM.

This patch doesn't trigger regression. I'll provide the corresponding test case
later.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoTest case for integer division arithmetic ~
Homer Hsing [Tue, 22 Jan 2013 04:18:29 +0000 (12:18 +0800)]
Test case for integer division arithmetic ~

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd a test case for integer remainder arithmetic
Homer Hsing [Tue, 22 Jan 2013 04:05:51 +0000 (12:05 +0800)]
Add a test case for integer remainder arithmetic

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoMake integer remainder & division arithmetic work ~ ~
Homer Hsing [Tue, 22 Jan 2013 02:59:37 +0000 (10:59 +0800)]
Make integer remainder & division arithmetic work ~ ~

Intel Gen7 GPU cannot do integer remainder (or division) if exec_width is 16.
But we are generating such kind of cmds, so the GPU hangs.
We need to split such command ~ ~

before:
(+f0) math intmod(16) g18<1>D g12<8,8,1>D g2.1<0,1,0>D {align1 WE_normal 1Q};

after:
(+f0) math intmod(8) g18<1>D g12<8,8,1>D g2.1<0,1,0>D {align1 WE_normal 1Q};
(+f0) math intmod(8) g19<1>D g13<8,8,1>D g2.1<0,1,0>D {align1 WE_normal 1Q};

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoSplit the multiple test cases to individual cases.
Zhigang Gong [Tue, 22 Jan 2013 08:46:45 +0000 (16:46 +0800)]
Split the multiple test cases to individual cases.

As the original test framework will do some resource releasing
job at the end of each test runing. If we call multiple tests
in one instance, then we will miss the chance to call those releasing
functions thus may cause some leaking.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoadd sub_sat test case
Lu Guanqun [Fri, 18 Jan 2013 07:45:58 +0000 (15:45 +0800)]
add sub_sat test case

Our hardware can't handle the following case properly:

  add.sat g3 g1 -g2

when g1 and g2 are INT_MIN. So let's disable the INT's test case right now.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd anonymous namespace to avoid name collision with the next patch
Lu Guanqun [Fri, 18 Jan 2013 07:45:58 +0000 (15:45 +0800)]
add anonymous namespace to avoid name collision with the next patch

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd sub_sat operation
Lu Guanqun [Fri, 18 Jan 2013 07:45:58 +0000 (15:45 +0800)]
add sub_sat operation

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix two unused variables
Lu Guanqun [Thu, 17 Jan 2013 06:50:14 +0000 (14:50 +0800)]
fix two unused variables

I don't find these two warnings until now, and here's the simple fix.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd add_sat test case
Lu Guanqun [Thu, 17 Jan 2013 06:37:08 +0000 (14:37 +0800)]
add add_sat test case

Due to our limiation of not supporting 64 bit yet, I'll just omit the long case
in this patch, when 64 bit is ready, we can re-enable this.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd add_sat operation
Lu Guanqun [Thu, 17 Jan 2013 06:37:08 +0000 (14:37 +0800)]
add add_sat operation

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agokeep track of saturate flag in GenInstructionState
Lu Guanqun [Thu, 17 Jan 2013 06:37:08 +0000 (14:37 +0800)]
keep track of saturate flag in GenInstructionState

As a side effect, the constraint about GenInstructionState is removed, because
this structure is merely used for book keeping and it doesn't directly maps
onto hardware, it doesn't need to be strictly uint32_t.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFixed a bug in write_imagef.
Zhigang Gong [Thu, 17 Jan 2013 11:28:34 +0000 (19:28 +0800)]
Fixed a bug in write_imagef.

In write_imagef, the color is a float vector, and the TypedWrite send
message need to use float too. Previous implementation convert the
float to a UD register which is a bug.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agoImplement cl_khr_gl_sharing by using upstream technology.
Zhigang Gong [Tue, 26 Feb 2013 09:36:53 +0000 (17:36 +0800)]
Implement cl_khr_gl_sharing by using upstream technology.

The previous implementation use a modified mesa library
and can't work with upstream mesa.

Now I managed to use existing egl extension and gbm to
import a gl texture to a cl image. Actually, the gbm
can't fully support our purpose, as it can't lookup
image for egl x11 platform. We have to touch gbm's
internal data structure to manually initialize its image
extension.

Furthermore, gbm only provide the API to get the image's
handle, and doesn't provide the one to get image's name.
As we are using different fd from the existing GL loader.
The handle is useless for us. I use the DRI2 image extension
function directly to get the name rather than the handle.
And it works well.

Now, after this patch applied. The cl_khr_gl_sharing could
work with upstream mesa. I recommend you use the latest git
master version.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agoutest: Added one test case for the int4 constant vector.
Zhigang Gong [Wed, 16 Jan 2013 09:35:05 +0000 (17:35 +0800)]
utest: Added one test case for the int4 constant vector.

This test case will initialize a int4 vector according to
a constant expression. And will hit a bug as current compiler
doesn't handle the ConstantDataSequential type constant correctly.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoFixed TYPED_WRITE instruction bug for SRC register allocation.
Zhigang Gong [Wed, 16 Jan 2013 07:43:29 +0000 (15:43 +0800)]
Fixed TYPED_WRITE instruction bug for SRC register allocation.

The previous implementation incorrectly treat the BTI/U/V as
destination register. And this breaks the DAG analysis when
do the instrunction selection. This patch fixes this bug.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoutest: Added one test case to fill a image2d.
Zhigang Gong [Wed, 16 Jan 2013 09:32:06 +0000 (17:32 +0800)]
utest: Added one test case to fill a image2d.

This test case fill a image2d according to a input color value.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoutest: Added one image2d test case copy_image.
Zhigang Gong [Wed, 16 Jan 2013 09:29:52 +0000 (17:29 +0800)]
utest: Added one image2d test case copy_image.

This case create two images, and initialize one then copy
the initilaized on to the other one via OCL kernel.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoutest: Added some new helper macros for image2d test cases.
Zhigang Gong [Wed, 16 Jan 2013 09:28:40 +0000 (17:28 +0800)]
utest: Added some new helper macros for image2d test cases.

This is the first commit to prepare to add some image 2d
test cases.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoDon't always set build type to DEBUGO0.
Zhigang Gong [Wed, 16 Jan 2013 07:42:27 +0000 (15:42 +0800)]
Don't always set build type to DEBUGO0.

This line was committed by accident. Now remove it.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoFix the assertion condition check.
Zhigang Gong [Wed, 16 Jan 2013 07:54:05 +0000 (15:54 +0800)]
Fix the assertion condition check.

We have a new sampler arg type, so we have to add it to this
assertion condition check. Otherwise, it triggers an assertion
when met a sampler parameter.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guqanqun <guanqun.lu@intel.com>
11 years agouse TupleDstPolicy for SampleInstruction and TypeWriteInstruction
Lu Guanqun [Tue, 15 Jan 2013 06:22:52 +0000 (14:22 +0800)]
use TupleDstPolicy for SampleInstruction and TypeWriteInstruction

This change also fixes compiler errors where srcNum can't be found in
TupleSrcPolicy.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd TupleDstPolicy for instructions
Lu Guanqun [Tue, 15 Jan 2013 06:22:52 +0000 (14:22 +0800)]
add TupleDstPolicy for instructions

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoimplement OCL 1.2 new APIs.
Zhigang Gong [Thu, 21 Feb 2013 08:44:40 +0000 (16:44 +0800)]
implement OCL 1.2 new APIs.

clCreateImage2D and clCreateFromGLTexture2D have been
deprecated from OCL1.2, we need to implement the new
API clCreateImage/clCreateFromGLTexture to replace them.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoImplement TYPED_WRITE instruction.
Zhigang Gong [Wed, 9 Jan 2013 08:05:25 +0000 (16:05 +0800)]
Implement TYPED_WRITE instruction.

This commit implement TypedWrite instruction, covers the IR layer,
LLVM to GEN and the backend including the code gen.

The write_imagei should work now. As TYPED WRITE message only support
SIMD8, we need to call it twice when the execution size is 16 and we
need more data movements.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoAdded missed macros/structs for typed write message.
Zhigang Gong [Wed, 9 Jan 2013 08:04:08 +0000 (16:04 +0800)]
Added missed macros/structs for typed write message.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoImplement SAMPLE instruction.
Zhigang Gong [Wed, 19 Dec 2012 04:51:50 +0000 (12:51 +0800)]
Implement SAMPLE instruction.

This commit includes the following changes:

0. Handle the image2d properly as kernel input arument.

Mark image2d as an address space 4 pointer, then we can easily
get its type in runtime library.
Mark sampler as an address space 5 pointer.

1. Add one instruction in IR for SAMPLE as below:
Instruction SAMPLE(Tuple dst, Tuple src);
dst contains 4 registers for the return vector value.
src contains 4 registers, for surface_id, sampler_id, u_coord and v_coord.

2. Handle read_imagexx intrinsics.
Allocate the above 2 tuples and initialize it.

3. In instruction selection phase:
Implement SampleInstruction. Extract needed registers and allocate a vector
for the destination vector. Allocate one register as message payload.

4. In instruction emit phase:
Imlement emitSampleInstruction to handle the send instruction generation.
We use a0.0 to contain the desc rather than use the imm. And currently, we
use SIMD16 by default. We may need to check current execute size in the
future.

After this commit, we can generate proper code for read_imagei().

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoImplement sampler support.
Zhigang Gong [Sun, 6 Jan 2013 05:16:49 +0000 (13:16 +0800)]
Implement sampler support.

Accept sampler kernel argument and pass the sampler to driver side.
At driver side, gen7 driver will allocate a sampler slot and initlaize
one sampler state according to the cl sampler state. and then pass
back the slot index to the curbe array.

After this commit, read_imagei(image, sampler, coord) could work as
expected.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoFinish the incomplete 2d image support in runtime library.
Zhigang Gong [Thu, 3 Jan 2013 15:50:27 +0000 (23:50 +0800)]
Finish the incomplete 2d image support in runtime library.

We allocate a free binding table index to a 2d surface, and then
pass that bti to kernel through the curbe.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoInsert ocl_common_defines to the cl source file.
Zhigang Gong [Sat, 29 Dec 2012 12:42:07 +0000 (20:42 +0800)]
Insert ocl_common_defines to the cl source file.

Remove the duplicate definitions in the ocl_stdlib.h. As we
may need to refer these two macros, we put it to the ocl_common_defins.h
which could be included by host application.
The two macros are as below:

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoAdd one function generate ARF register.
Zhigang Gong [Sat, 29 Dec 2012 12:39:04 +0000 (20:39 +0800)]
Add one function generate ARF register.

ud1arf to generate ud1 ARF which is for the sampler desc usage.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoAdded a new common header file for both kernel and host.
Zhigang Gong [Sat, 29 Dec 2012 12:33:31 +0000 (20:33 +0800)]
Added a new common header file for both kernel and host.

This new common header file is for the image support and will
be shared between kernel and host.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agobackend: Remove argID in function arguments iteration
Feng, Boqun [Mon, 17 Dec 2012 07:49:04 +0000 (15:49 +0800)]
backend: Remove argID in function arguments iteration

argID is used for checking whether the argument has a byvalue attribute,
and now llvm can do this with the hasByValueAttr function of the
argument, so there is no need to use the old api.

llvm svn revision: r169719

Signed-off-by: Feng, Boqun <boqun.feng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong>
11 years agoremove all Makefiles
Lu Guanqun [Thu, 10 Jan 2013 01:24:28 +0000 (09:24 +0800)]
remove all Makefiles

There are two motivations for this change:

 - As we're using CMake, we don't need to maintain another build system. More
   code is more burden.
 - We want to delete `ocl_stdlib_str.cpp` completely as it can be
   auto-generated from `ocl_stdlib.h`. But in the current Makefile build system,
   we used it directly in `all_in_one` directory. So the removal of Makefile
   would help us on this.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix the sign-compare warning
Lu Guanqun [Thu, 10 Jan 2013 01:23:52 +0000 (09:23 +0800)]
fix the sign-compare warning

/root/cl-intel/src/cl_api.c:1051:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd check for memory allocation size
Lu Guanqun [Thu, 10 Jan 2013 01:19:09 +0000 (09:19 +0800)]
add check for memory allocation size

When it exceeds the max allocation size, it should fail and return
CL_INVALID_BUFFER_SIZE.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agocleanup .gitignore files
Lu Guanqun [Thu, 27 Dec 2012 03:10:51 +0000 (11:10 +0800)]
cleanup .gitignore files

The files listed in these .gitignore are never seen.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agosupport OpenCL conversions & type casting function "as_uchar4(float f)"
Homer Hsing [Fri, 4 Jan 2013 01:45:00 +0000 (09:45 +0800)]
support OpenCL conversions & type casting function "as_uchar4(float f)"

11 years agosupport OpenCL conversions & type casting function "convert_type_4"
Homer Hsing [Fri, 4 Jan 2013 01:40:58 +0000 (09:40 +0800)]
support OpenCL conversions & type casting function "convert_type_4"

11 years agodefine macro CLK_{LOCAL,GLOBAL}_MEM_FENCE in ocl_stdlib.h
Homer Hsing [Fri, 4 Jan 2013 01:46:43 +0000 (09:46 +0800)]
define macro CLK_{LOCAL,GLOBAL}_MEM_FENCE in ocl_stdlib.h

11 years agomore test case for vector load/store function
Homer Hsing [Fri, 4 Jan 2013 01:25:01 +0000 (09:25 +0800)]
more test case for vector load/store function

11 years agomore test case for OpenCL 1.1 integer built-in macros
Homer Hsing [Fri, 4 Jan 2013 01:23:18 +0000 (09:23 +0800)]
more test case for OpenCL 1.1 integer built-in macros

11 years agosupport OpenCL 1.1 integer built-in macros
Homer Hsing [Fri, 4 Jan 2013 01:22:28 +0000 (09:22 +0800)]
support OpenCL 1.1 integer built-in macros

11 years agoimplement blocking mode of clEnqueueUnmapMemObject
Homer Hsing [Thu, 27 Dec 2012 06:47:13 +0000 (14:47 +0800)]
implement blocking mode of clEnqueueUnmapMemObject

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoimplement blocking mode of clEnqueueMapBuffer
Homer Hsing [Thu, 27 Dec 2012 06:44:57 +0000 (14:44 +0800)]
implement blocking mode of clEnqueueMapBuffer

  it is the first step only ...

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoImplement clEnqueueWriteBuffer
Homer Hsing [Thu, 27 Dec 2012 06:43:00 +0000 (14:43 +0800)]
Implement clEnqueueWriteBuffer

It is only the first step. support only blocking write.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFix extended math function selection logic for int div.
Homer Hsing [Thu, 27 Dec 2012 06:39:44 +0000 (14:39 +0800)]
Fix extended math function selection logic for int div.

If both operand are integer, we should call
  GEN_MATH_FUNCTION_INT_DIV_QUOTIENT,
  not GEN_MATH_FUNCTION_FDIV.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoinstall header files
Lu Guanqun [Tue, 25 Dec 2012 07:15:18 +0000 (15:15 +0800)]
install header files

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoImport gbm internal header files.
Zhigang Gong [Tue, 26 Feb 2013 09:30:25 +0000 (17:30 +0800)]
Import gbm internal header files.

Prepartion to support cl_khr_gl_sharing extension. I
decide to use egl/gbm/dri2 image extension to implement
it.

The reason why we need to import gbm internal header files
is that gbm doesn't initialize the image extension by default
with x11 platform. As it doesn't know the default display and
the context when create a gbm device. We have to use its internal
structure to initialize the image extension properly after the
gbm device's creation. We are not the only one to do that way,
the EGL drm platform does the same thing.

To avoid dependencies to the whole mesa source package, I have
to copy the related header files here.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agoImplement OCL extension initizliation.
Zhigang Gong [Fri, 22 Feb 2013 09:21:07 +0000 (17:21 +0800)]
Implement OCL extension initizliation.

We don't have an extension checking and initialization implemenation.
Now add it. For the mandatory extensions for OCL1.2 as below:
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_fp64 (for backward compatibility if
double precision is supported)

It seems that we only support the byte addressable store extension.
We still need to write new test case for it to prove whether we really
support it.

For all the other mandatory extensions, we need to implement them if we
want to comply with OCL1.2 specification.

For the optional extensions, currently we only support cl_khr_gl_sharing.
Actually, we are not fully support it. Current implementation is a hack
fashion. I'll change to use upstream mesa to implement it. For now, just
enable this extension.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agoFind GBM/EGL library at build time.
Zhigang Gong [Thu, 21 Feb 2013 11:33:55 +0000 (19:33 +0800)]
Find GBM/EGL library at build time.

We will change to use EGL_KHR_gl_texture_2D_image and GBM library
to implement OCL and OGL interoperation. We need to check GBM.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agobackend: Add LLVM stable version support
Feng, Boqun [Fri, 1 Mar 2013 05:30:19 +0000 (13:30 +0800)]
backend: Add LLVM stable version support

support current llvm stable version 3.2
modify cmake file to check the version of llvm provided by system or
configured by LLVM_INSTALL_DIR, and add a macro define in the format
LLVM_<MAJOR><MINOR> according the version, this macro can be used for
llvm version-specific code.

Signed-off-by: Feng, Boqun <boqun.feng@intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoKeep consistent naming rule for LLVM_XXX Cmake variables.
Zhigang Gong [Wed, 19 Dec 2012 06:18:23 +0000 (14:18 +0800)]
Keep consistent naming rule for LLVM_XXX Cmake variables.

Also fixed inconsistent indentation in FindLLVM.cmake and
remove some unecessary blank lines. Based on patch from
guanqun.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix compilation errors when it can't find correct library dirs
Lu Guanqun [Wed, 19 Dec 2012 03:31:42 +0000 (11:31 +0800)]
fix compilation errors when it can't find correct library dirs

The reason is that we specify `LLVM_LIBRARY_DIRS` in
`backend/src/CMakeLists.txt`, while in `FindLLVM.cmake`, we name it wrong.

11 years agoRefine CMake to check llvm version.
Zhigang Gong [Tue, 18 Dec 2012 08:45:36 +0000 (16:45 +0800)]
Refine CMake to check llvm version.

This branch need llvm 3.3 or newer version. We need to modify the cmake to
check the version. This commit also fixed some minor bugs in the find_packages,
and removed those useless files.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoDon't use display :0.0 manually.
Zhigang Gong [Fri, 22 Feb 2013 10:04:18 +0000 (18:04 +0800)]
Don't use display :0.0 manually.

Pass a NULL DISPLAY, the library will use the DISPLAY environment
which should be what we want it to do.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoCMake fixup.
Zhigang Gong [Tue, 11 Dec 2012 06:29:10 +0000 (14:29 +0800)]
CMake fixup.

Add a FindLLVM to handle the llvm package finding function.
Fixed the broken CMake files in backend.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoupdate headers to OpenCL 1.2 standards
Lu Guanqun [Wed, 26 Dec 2012 08:38:41 +0000 (16:38 +0800)]
update headers to OpenCL 1.2 standards

The header files are downloaded from this link:

    http://www.khronos.org/registry/cl/

And there are several other fixes due to this header update:

 - change cl_mem_type to cl_mem_object_type
 - change CL_INVALID_MEM to CL_INVALID_MEM_OBJECT
 - change CL_INVALID_TEXTURE to CL_INVALID_IMAGE_DESCRIPTOR
 - change CL_MEM_ALLOCATION_FAILURE to CL_MEM_OBJECT_ALLOCATION_FAILURE

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix assertion when two kernels exist in cl file
Lu Guanqun [Tue, 25 Dec 2012 05:26:55 +0000 (13:26 +0800)]
fix assertion when two kernels exist in cl file

The root cause is that it fails to increment `currID` when traversing hash map.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agooutput file name and kernel name when cl_kernel_init() fails
Lu Guanqun [Tue, 25 Dec 2012 03:43:44 +0000 (11:43 +0800)]
output file name and kernel name when cl_kernel_init() fails

This would help ease the debugging process a lot.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agosupport vector data load/store for char,short,long
Homer Hsing [Tue, 25 Dec 2012 00:53:39 +0000 (08:53 +0800)]
support vector data load/store for char,short,long

Signed-off-by: Homer Hsing <homer.xing@intel.com>
11 years agoChange to Clang/LLVM ToT(Top of the Tree)
Feng, Boqun [Wed, 28 Nov 2012 08:21:10 +0000 (16:21 +0800)]
Change to Clang/LLVM ToT(Top of the Tree)

Two minor changed in this patch

1.files that use llvm new api.
backend/src/llvm/llvm_{gen_backend,passes}.cpp

2.change backend IR from ptx32 to nvptx.
all test cases in utest passed [SUSSECC], but may give wrong output.
for example, app_mandelbrot makes a wrong picture.
silk's cl_gl_fill passed.

revision of Clang/LLVM components:
llvm r168773
clang r168774
compiler-rt r168700

Signed-off-by: Feng, Boqun <boqun.feng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong>
11 years agoFixed compilation warnings.
Zhigang Gong [Mon, 17 Dec 2012 07:21:11 +0000 (15:21 +0800)]
Fixed compilation warnings.

__STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are in cflags,
we don't need to put them in source file. Otherwise, it causes
redefine warnings. Reported by guanqun.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoIgnore OpenCL kernel copied from Intel OpenCL SDK
Xing, Homer [Wed, 12 Dec 2012 08:37:28 +0000 (16:37 +0800)]
Ignore OpenCL kernel copied from Intel OpenCL SDK

just modify .gitignore, to ignore OpenCL kernel copied from Intel OpenCL SDK

11 years agosupport OpenCL 1.1 relational builtin functions "all","any"
Homer Hsing [Mon, 10 Dec 2012 01:02:02 +0000 (09:02 +0800)]
support OpenCL 1.1 relational builtin functions "all","any"

11 years agosupport OpenCL 1.1 relational builtin function "signbit"
Homer Hsing [Mon, 10 Dec 2012 00:41:01 +0000 (08:41 +0800)]
support OpenCL 1.1 relational builtin function "signbit"

11 years agosupport some OpenCL 1.1 relational built-in functions
Homer Hsing [Fri, 7 Dec 2012 01:56:21 +0000 (09:56 +0800)]
support some OpenCL 1.1 relational built-in functions

support isfinite, isinf, isnan, isnormal, islessgreater, isordered, isunordered

11 years agoadd missing #define in ocl_stdlib.h: INLINE_OVERLOADABLE and OVERLOADABLE
Homer Hsing [Fri, 7 Dec 2012 01:17:07 +0000 (09:17 +0800)]
add missing #define in ocl_stdlib.h: INLINE_OVERLOADABLE and OVERLOADABLE

use INLINE_OVERLOADABLE in DECL

11 years agosupport some of OpenCL 1.1 relational built-in functions
Homer Hsing [Fri, 7 Dec 2012 01:10:05 +0000 (09:10 +0800)]
support some of OpenCL 1.1 relational built-in functions

support isequal, isnotequal, isgreater, isgreaterequal, isless, islessequal
also fix typo (these typo does not cause bug)

11 years agosupport OpenCL 1.1 kernel_exec preprocessor macro
Homer Hsing [Thu, 6 Dec 2012 08:24:55 +0000 (16:24 +0800)]
support OpenCL 1.1 kernel_exec preprocessor macro

11 years agotest case for __kernel_exec
Homer Hsing [Thu, 6 Dec 2012 08:20:04 +0000 (16:20 +0800)]
test case for __kernel_exec

11 years agosupport OpenCL 1.1 __kernel_exec preprocessor macro
Homer Hsing [Thu, 6 Dec 2012 08:19:32 +0000 (16:19 +0800)]
support OpenCL 1.1 __kernel_exec preprocessor macro

11 years agopass build-options of clBuildProgram to clang compiler
Homer Hsing [Thu, 6 Dec 2012 08:00:13 +0000 (16:00 +0800)]
pass build-options of clBuildProgram to clang compiler

11 years agoadd OpenCL 1.1 preprocessor macros ENDIAN_LITTLE
Homer Hsing [Thu, 6 Dec 2012 06:42:59 +0000 (14:42 +0800)]
add OpenCL 1.1 preprocessor macros ENDIAN_LITTLE

11 years agosupport OpenCL 1.1 other built-in data types
Homer Hsing [Thu, 6 Dec 2012 03:18:51 +0000 (11:18 +0800)]
support OpenCL 1.1 other built-in data types

11 years agoa more general typedef for size_t and ptrdiff_t
Homer Hsing [Thu, 6 Dec 2012 03:14:58 +0000 (11:14 +0800)]
a more general typedef for size_t and ptrdiff_t

11 years agosupport OpenCL 1.1 floating-point macros
Homer Hsing [Thu, 6 Dec 2012 01:33:55 +0000 (09:33 +0800)]
support OpenCL 1.1 floating-point macros

11 years agosupport some OpenCL 1.1 preprocessor directives & macros
Homer Hsing [Thu, 6 Dec 2012 01:18:27 +0000 (09:18 +0800)]
support some OpenCL 1.1 preprocessor directives & macros

11 years agoadd test case for bool const, and vector component addressing
Homer Hsing [Thu, 6 Dec 2012 01:05:44 +0000 (09:05 +0800)]
add test case for bool const, and vector component addressing

11 years agosupport OpenCL 1.1 built-in scalar data types, built-in vector data types
Homer Hsing [Thu, 6 Dec 2012 00:53:31 +0000 (08:53 +0800)]
support OpenCL 1.1 built-in scalar data types, built-in vector data types

11 years agotest OpenCL 1.1 Vector Data Load/Store Functions
Homer Hsing [Thu, 29 Nov 2012 01:41:46 +0000 (09:41 +0800)]
test OpenCL 1.1 Vector Data Load/Store Functions

11 years agotest OpenCL 1.1 Async Copies and Prefetch Functions
Homer Hsing [Thu, 29 Nov 2012 01:40:50 +0000 (09:40 +0800)]
test OpenCL 1.1 Async Copies and Prefetch Functions

11 years agoTest OpenCL 1.1 Geometric Builtin Functions
Homer Hsing [Mon, 26 Nov 2012 07:27:53 +0000 (15:27 +0800)]
Test OpenCL 1.1 Geometric Builtin Functions

11 years agotest OpenCL 1.1 Relational Built-in Functions
Homer Hsing [Mon, 26 Nov 2012 07:20:58 +0000 (15:20 +0800)]
test OpenCL 1.1 Relational Built-in Functions

11 years agoEnable build-in vector data types test in kernels/compiler_data_types.cl
Homer Hsing [Mon, 26 Nov 2012 07:13:07 +0000 (15:13 +0800)]
Enable build-in vector data types test in kernels/compiler_data_types.cl

11 years agotest OpenCL 1.1 Synchronization, explicit memory fence
Homer Hsing [Fri, 23 Nov 2012 00:51:48 +0000 (08:51 +0800)]
test OpenCL 1.1 Synchronization, explicit memory fence

11 years agotest OpenCL 1.1 sampler declaration fields
Homer Hsing [Fri, 23 Nov 2012 00:45:59 +0000 (08:45 +0800)]
test OpenCL 1.1 sampler declaration fields