contrib/beignet.git
11 years agoir::unit can return its constantSet
Homer Hsing [Thu, 2 May 2013 01:00:30 +0000 (09:00 +0800)]
ir::unit can return its constantSet

Version 3.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd special register constoffst expressing curbe offset
Homer Hsing [Thu, 2 May 2013 01:00:29 +0000 (09:00 +0800)]
add special register constoffst expressing curbe offset

Version 3.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agodisable buggy old code doing global constant
Homer Hsing [Thu, 2 May 2013 01:00:28 +0000 (09:00 +0800)]
disable buggy old code doing global constant

Version 3.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd helper functions in gbe::Program
Homer Hsing [Thu, 2 May 2013 01:00:27 +0000 (09:00 +0800)]
add helper functions in gbe::Program

Version 3.
for get global constant related information

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd helper functions in ir::Constant and ir::ConstantSet
Homer Hsing [Thu, 2 May 2013 01:00:26 +0000 (09:00 +0800)]
add helper functions in ir::Constant and ir::ConstantSet

Version 3.
Deleted empty destructor of ConstantSet.
Updated ConstantSet::getConstant()

And I am afraid I cannot delete the empty constructor of ConstantSet.
That will break the constructor of ir::Unit. See unit.cpp, line 30.

add ir::Constant.reg field
add helper functions in ir::ConstantSet to get specified ir::Constant
and binary packed constant array

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoNew test case of ceil()
Homer Hsing [Sat, 27 Apr 2013 06:13:32 +0000 (14:13 +0800)]
New test case of ceil()

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoMake ceil() work
Homer Hsing [Sat, 27 Apr 2013 06:13:26 +0000 (14:13 +0800)]
Make ceil() work

Delete an extra comma in ocl_stdlib.h, then ceil() works. Fun

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd the INCLUDE_DIRECTORIES directive to all the FindXXX modules.
Junyan He [Fri, 26 Apr 2013 23:56:55 +0000 (07:56 +0800)]
Add the INCLUDE_DIRECTORIES directive to all the FindXXX modules.

If the header files are not at the standard location, we will FAIL
to compile because FIND_PATH directive will not add the Include Dir
to CXX flags. Add the INCLUDE_DIRECTORIES to handle this if we find
the headers we neeeds.

Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoImplement clGetDeviceInfo(..., CL_DRIVER_VERSION, ...)
Simon Richter [Fri, 19 Apr 2013 11:39:12 +0000 (13:39 +0200)]
Implement clGetDeviceInfo(..., CL_DRIVER_VERSION, ...)

This returns the library major/minor version. As it does not follow the
usual naming scheme, the output code is duplicated.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoHandle size queries in clGetDeviceInfo better
Simon Richter [Fri, 19 Apr 2013 11:39:11 +0000 (13:39 +0200)]
Handle size queries in clGetDeviceInfo better

This allows size queries on all fields.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd missing fmin() and fmax() overloads
Simon Richter [Mon, 22 Apr 2013 17:02:52 +0000 (19:02 +0200)]
Add missing fmin() and fmax() overloads

These functions must also exist for vector types.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
11 years agoAdd missing select() overloads
Simon Richter [Mon, 22 Apr 2013 16:29:57 +0000 (18:29 +0200)]
Add missing select() overloads

Both signed and unsigned types are allowed as the condition type,
regardless of the true/false branch types.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
11 years agoFix typo
Simon Richter [Mon, 22 Apr 2013 16:29:56 +0000 (18:29 +0200)]
Fix typo

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
11 years agoFix crash when output IR
Homer Hsing [Tue, 23 Apr 2013 07:34:21 +0000 (15:34 +0800)]
Fix crash when output IR

When output IR, program crashes because two register names are missing.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoAdd typedefs for extension functions
Simon Richter [Mon, 22 Apr 2013 09:26:27 +0000 (11:26 +0200)]
Add typedefs for extension functions

This allows application programs to cast the pointer returned from
clGetExtensionFunctionAddress to the appropriate function pointer type.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd cl*Intel to extension function lookup
Simon Richter [Mon, 22 Apr 2013 09:26:26 +0000 (11:26 +0200)]
Add cl*Intel to extension function lookup

This allows applications to look up the extension functions when going
through the ICD loader.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoRename Intel specific extension functions for ICD
Simon Richter [Mon, 22 Apr 2013 09:26:25 +0000 (11:26 +0200)]
Rename Intel specific extension functions for ICD

The ICD extension specification requires that extension function names end
in the vendor name to allow clGetExtensionFunctionAddress to switch between
implementations.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd clIntelMapBufferGTT, clIntelUnmapBufferGTT, cl_mem_map_gtt and cl_mem_unmap_gtt
Homer Hsing [Mon, 22 Apr 2013 01:35:00 +0000 (09:35 +0800)]
Add clIntelMapBufferGTT, clIntelUnmapBufferGTT, cl_mem_map_gtt and cl_mem_unmap_gtt

cl_mem_map_gtt calls cl_buffer_map_gtt.
cl_mem_unmap_gtt calls cl_buffer_unmap_gtt.
clIntelMapBufferGTT is a one to one mapping of drm_intel_gem_bo_map_gtt.
clIntelUnmapBufferGTT is a one to one mapping of drm_intel_gem_bo_unmap_gtt.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd constant ptr argument test case.
Yang Rong [Mon, 22 Apr 2013 05:11:52 +0000 (13:11 +0800)]
Add constant ptr argument test case.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd constant pointer as argument support in runtime.
Yang Rong [Mon, 22 Apr 2013 05:11:51 +0000 (13:11 +0800)]
Add constant pointer as argument support in runtime.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd constant pointer as argument support in kernel.
Yang Rong [Mon, 22 Apr 2013 05:11:50 +0000 (13:11 +0800)]
Add constant pointer as argument support in kernel.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd register allocate from tail support for constant buffer.
Yang Rong [Mon, 22 Apr 2013 05:11:49 +0000 (13:11 +0800)]
Add register allocate from tail support for constant buffer.

By default curbe alloc from head, grf alloc from tail.

Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoutests: should set pitch to zero if host_ptr is NULL.
Zhigang Gong [Fri, 19 Apr 2013 08:31:09 +0000 (16:31 +0800)]
utests: should set pitch to zero if host_ptr is NULL.

Per OCL spec, we should set pitch to zero if the host_ptr is
NULL.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Simon Richter <Simon.Richter@hogyros.de>

11 years agoImplement clGetMemObjectInfo
Simon Richter [Fri, 19 Apr 2013 06:12:59 +0000 (08:12 +0200)]
Implement clGetMemObjectInfo

Currently, only the properties

 - CL_MEM_TYPE
 - CL_MEM_FLAGS
 - CL_MEM_SIZE

are implemented.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFix pitch parameter in clCreateImage2D
Simon Richter [Fri, 19 Apr 2013 06:12:58 +0000 (08:12 +0200)]
Fix pitch parameter in clCreateImage2D

If the host_ptr is NULL, the pitch parameter must be 0.

The pitch parameter gives the number of bytes per scanline. Specifying 0
selects the minimum possible.

Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoUse "clang" command from PATH
Simon Richter [Tue, 2 Apr 2013 13:01:52 +0000 (15:01 +0200)]
Use "clang" command from PATH

This assumes that LLVM is installed in the system path, but avoids
compiling the path of binaries into the library.

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoMake EGL optional
Simon Richter [Tue, 16 Apr 2013 18:04:42 +0000 (20:04 +0200)]
Make EGL optional

This fixes builds if EGL is unavailable. The OpenGL sharing extension will
be disabled then.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoPrefer versioned llvm-config
Simon Richter [Tue, 16 Apr 2013 18:02:08 +0000 (20:02 +0200)]
Prefer versioned llvm-config

If multiple versions are installed, prefer version 3.2 before falling back
to the default version.

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAccept glibc's implementation of memalign()
Simon Richter [Fri, 12 Apr 2013 09:21:19 +0000 (11:21 +0200)]
Accept glibc's implementation of memalign()

If the platform is not Linux, but glibc based, we assume that the
memalign() function is working satisfactorily.

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoImplement KHR ICD extension
Simon Richter [Wed, 3 Apr 2013 18:32:45 +0000 (20:32 +0200)]
Implement KHR ICD extension

This adds a pointer to the dispatch table at the beginning of every object
of type

 - cl_command_queue
 - cl_context
 - cl_device_id
 - cl_event
 - cl_kernel
 - cl_mem
 - cl_platform_id
 - cl_program
 - cl_sampler

as required by the ICD specification. The layout of the dispatch table
comes from the OpenCL ICD loader by Brice Videau <brice.videau@imag.fr> and
Vincent Danjean <Vincent.Danjean@ens-lyon.org>.

To avoid dispatch table entries being overwritten with the ICD loader's
implementations of the CL functions (as would be the proper behaviour for
the ELF loader), the -Bsymbolic option is given to the linker.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years ago"Implement" clGetExtensionFunctionAddress()
Simon Richter [Tue, 2 Apr 2013 13:11:01 +0000 (15:11 +0200)]
"Implement" clGetExtensionFunctionAddress()

This function should not fail if a function entry point cannot be found --
instead we return NULL.

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAvoid extension names as preprocessor tokens
Simon Richter [Tue, 16 Apr 2013 18:05:54 +0000 (20:05 +0200)]
Avoid extension names as preprocessor tokens

The Khronos Group headers define constants with the names of extensions if
the header defines the extension API. When the preprocessor sees one of
these names, it performs macro substitution, leading to compilation errors.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFix typo in cl_get_platform_info function name
Simon Richter [Tue, 2 Apr 2013 12:51:52 +0000 (14:51 +0200)]
Fix typo in cl_get_platform_info function name

Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agowrite_image: Fixed a bug when use scalar data as color source.
Zhigang Gong [Wed, 17 Apr 2013 03:15:15 +0000 (11:15 +0800)]
write_image: Fixed a bug when use scalar data as color source.

If the color source is scalar data type for example from the
kernel input parameters, then we should use QnPhysical to get
the source, rather than hard coded the subnr to zero.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Meng Lv <meng.lv@intel.com>
11 years agoGenRegister src[] is not allowed in clang, change this style.
Lu Guanqun [Tue, 16 Apr 2013 06:39:03 +0000 (14:39 +0800)]
GenRegister src[] is not allowed in clang, change this style.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoGenRegInterval should be a struct instead of a class
Lu Guanqun [Tue, 16 Apr 2013 06:38:58 +0000 (14:38 +0800)]
GenRegInterval should be a struct instead of a class

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix error in clang: variable array length is not support for non-POD element
Lu Guanqun [Tue, 16 Apr 2013 06:38:54 +0000 (14:38 +0800)]
fix error in clang: variable array length is not support for non-POD element

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years ago--no-rtti should be changed to -fno-rtti in clang++
Lu Guanqun [Tue, 16 Apr 2013 06:38:49 +0000 (14:38 +0800)]
--no-rtti should be changed to -fno-rtti in clang++

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoremove dollar sign in identifier
Lu Guanqun [Tue, 16 Apr 2013 06:38:45 +0000 (14:38 +0800)]
remove dollar sign in identifier

Therefore it removes compiler warnings.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoremove -ffast-math comiler flag
Lu Guanqun [Tue, 16 Apr 2013 06:32:45 +0000 (14:32 +0800)]
remove -ffast-math comiler flag

FC16 has the bug https://bugzilla.redhat.com/show_bug.cgi?id=795995
which prevents us from using our library on FC16.

Even though it's not our compiler's bug, we don't need fast-math flag here as
our compiler is not a computation oriented program, so we can remove this flag
safely.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agomake raw_fd_ostream not close stdout
Homer Hsing [Tue, 16 Apr 2013 03:12:01 +0000 (11:12 +0800)]
make raw_fd_ostream not close stdout

stdout was closed by llvm::raw_fd_ostream.
Now let llvm::raw_fd_ostream not close stdout.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@gmail.com>
11 years agoutests: refine the helper macros.
Zhigang Gong [Thu, 11 Apr 2013 07:08:59 +0000 (15:08 +0800)]
utests: refine the helper macros.

The previous implementation has many duplicate code, and the FN
is used incorrectly. Now fix it and the code looks a little bit
clean and when it fails the reported function name should be
correct now.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoutests: add a simple test case for cl_khr_gl_sharing.
Zhigang Gong [Wed, 10 Apr 2013 11:44:19 +0000 (19:44 +0800)]
utests: add a simple test case for cl_khr_gl_sharing.

This test case creates a OCL image from a OGL texture.
Then use a OCL kernel to fill the image. Then it back
to OGL to read the pixels back and verify the color.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Lu, Guanqun <guanqun.lu@intel.com>
11 years agoutests: added cl_khr_gl_sharing related helper functions.
Zhigang Gong [Wed, 10 Apr 2013 11:39:40 +0000 (19:39 +0800)]
utests: added cl_khr_gl_sharing related helper functions.

This is the preparation to add clgl interoperate test cases.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Lu, Guanqun <guanqun.lu@intel.com>
11 years agoFixed the extension string for both platform and device.
Zhigang Gong [Wed, 10 Apr 2013 11:23:41 +0000 (19:23 +0800)]
Fixed the extension string for both platform and device.

I forgot to calculate the extensions string size. Now fix it.
And also forgot to handle the device's extension case, now add
it and duplicate the extensions from the platform directly.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Lu, Guanqun <guanqun.lu@intel.com>
11 years agoEnable the clFlush.
Zhigang Gong [Wed, 10 Apr 2013 09:38:46 +0000 (17:38 +0800)]
Enable the clFlush.

We don't need to do anything now, as current it alwasy flushs
all the commands each time. We may need to revisit here after we
optmize the clEnqueueNDRangeKernel's behaviour and don't flush
pipeline every time.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Lu, Guanqun <guanqun.lu@intel.com>
11 years agoFix brw instruction field "flag"
Homer Hsing [Wed, 10 Apr 2013 08:39:34 +0000 (16:39 +0800)]
Fix brw instruction field "flag"

bits2.da1.flag_subreg_nr is missing in brw_instruction.
The location of bits2.da1.flag_reg_nr is wrong. See IVB spec.

This patch fixes bugs above, also
make disassembler output correct flag_subreg_nr for conditional modifier
and prediction.

Before we change it:
(+f0.1) cmp.l(8)      null g12<8,8,1>D g2.2<0,1,0>D {align1 WE_normal 1Q};

After we change it:
(+f1.1) cmp.l.f1.1(8) null g12<8,8,1>D g2.2<0,1,0>D {align1 WE_normal 1Q};

Although flag_reg_nr has moved position, other code is still right,
because if we use f0.1 before, now we use f1.0

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu, Guanqun <guanqun.lu@intel.com>
11 years agothrow exception instead of just assert
Lu Guanqun [Wed, 10 Apr 2013 08:12:08 +0000 (16:12 +0800)]
throw exception instead of just assert

So that we know the reason of why we're failing.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agorelease the contraint of volatile pointer
Lu Guanqun [Wed, 10 Apr 2013 08:11:59 +0000 (16:11 +0800)]
release the contraint of volatile pointer

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd disassembler support for message gateway
Lu Guanqun [Wed, 10 Apr 2013 08:11:53 +0000 (16:11 +0800)]
add disassembler support for message gateway

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agobackend: Use alignof keyword when supported
Feng, Boqun [Wed, 10 Apr 2013 06:29:33 +0000 (14:29 +0800)]
backend: Use alignof keyword when supported

the keyword alignof of C++11 is supported after gcc 4.8, other than use
old template way to calculate the align of a class, the keyword is used.

Signed-off-by: Feng, Boqun <boqun.feng@intel.com>
Reviewed-by: Zhigang, Gong <zhigang.gong@linux.intel.com>
11 years agoUpdate documents.
Zhigang Gong [Wed, 27 Mar 2013 12:22:29 +0000 (20:22 +0800)]
Update documents.

Update documents according to our latest progress, which is for
releasing version 0.1. Also add those html files back.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoenable CL_DEVICE_IMAGE_SUPPORT check
Lv, Meng [Tue, 26 Mar 2013 05:25:30 +0000 (05:25 +0000)]
enable CL_DEVICE_IMAGE_SUPPORT check

Signed-off-by: lv meng <meng.lv@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agochange the way clGetDeviceInfo() is called in cl_ocl_init()
Lu Guanqun [Mon, 25 Mar 2013 03:33:40 +0000 (11:33 +0800)]
change the way clGetDeviceInfo() is called in cl_ocl_init()

This works as a test case for the newly implemented clGetDeviceInfo()
behaviour: pass NULL to param_value and it should return the string size.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoenhance clGetDeviceInfo() API to return the length of string fields
Lu Guanqun [Mon, 25 Mar 2013 03:29:31 +0000 (11:29 +0800)]
enhance clGetDeviceInfo() API to return the length of string fields

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agochange the way clGetPlatformInfo() is called in cl_ocl_init()
Lu Guanqun [Mon, 25 Mar 2013 03:19:05 +0000 (11:19 +0800)]
change the way clGetPlatformInfo() is called in cl_ocl_init()

This works as a test case for the newly implemented clGetPlatformInfo()
behaviour: pass NULL to param_value and it should return the string size.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoenhance clGetPlatformInfo() API to return the string length
Lu Guanqun [Mon, 25 Mar 2013 03:04:19 +0000 (11:04 +0800)]
enhance clGetPlatformInfo() API to return the string length

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd test case for clCreateContextFromType()
Lu Guanqun [Fri, 22 Mar 2013 08:07:08 +0000 (16:07 +0800)]
add test case for clCreateContextFromType()

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoimplement clCreateContextFromType()
Lu Guanqun [Fri, 22 Mar 2013 08:07:43 +0000 (16:07 +0800)]
implement clCreateContextFromType()

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix one typo for clCreateContextFromType()
Lu Guanqun [Fri, 22 Mar 2013 03:17:23 +0000 (11:17 +0800)]
fix one typo for clCreateContextFromType()

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix typo in FindLLVM.cmake
Lu Guanqun [Fri, 22 Mar 2013 03:40:55 +0000 (11:40 +0800)]
fix typo in FindLLVM.cmake

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFixed a potential null pointer reference bug.
Zhigang Gong [Tue, 19 Mar 2013 12:41:49 +0000 (20:41 +0800)]
Fixed a potential null pointer reference bug.

When we failed to create a dri2 connection, we jump to the error out
path, and the driver_name may be null pointer, before set it to null,
we need to check it first.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoSet the initial library versions to 0.1.
Zhigang Gong [Tue, 19 Mar 2013 11:39:13 +0000 (19:39 +0800)]
Set the initial library versions to 0.1.

We have two libraries here, one is the gen backend and the other
is libcl runtime library. We set both initial versions to 0.1.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoChange the cl version to 1.0.
Zhigang Gong [Tue, 19 Mar 2013 11:37:41 +0000 (19:37 +0800)]
Change the cl version to 1.0.

The first stage goal is to deliver a OCL 1.0 implementation,
so let's change the version here.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoAdd the interface of cl_buffer_map_gtt
Homer Hsing [Thu, 21 Feb 2013 03:03:08 +0000 (11:03 +0800)]
Add the interface of cl_buffer_map_gtt

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix unused-result warning
Homer Hsing [Mon, 25 Feb 2013 05:16:46 +0000 (13:16 +0800)]
fix unused-result warning

fix unused-result warning, because we didn't use the return value of fread

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoOutput the map from IR reg to ASM reg
Homer Hsing [Sun, 17 Feb 2013 05:23:41 +0000 (13:23 +0800)]
Output the map from IR reg to ASM reg

It is hard to guess the meaning of
  "mul(8) g10<1>d g0.6<0,1,0>d g3.2<0,1,0>d"
if you don't know the IR reg num of "g10" etc.

Now we can output the map from IR reg to ASM reg, such as
  "%0 g10.0D"
  "%1 g0.6D"
  "%2 g3.2D"
So you know the meaning is
  "mul %0 %1 %2"

By default, not output those message.
You can turn on by BVAR "OCL_OUTPUT_REG_ALLOC".

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoUse new OCL1.2 API rather than those deprecated API.
Zhigang Gong [Thu, 21 Feb 2013 09:09:51 +0000 (17:09 +0800)]
Use new OCL1.2 API rather than those deprecated API.

Use clCreateImage to replace the old API clCreateImage2D.
It will silent the compiler warnings.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agodo not use the advanced C++ feature
Lu Guanqun [Tue, 9 Apr 2013 06:41:08 +0000 (14:41 +0800)]
do not use the advanced C++ feature

so that it can work on gcc 4.6 happily, otherwise, I have the following issues:

    backend/src/./ir/instruction.hpp:112:7: error: type ‘gbe::ir::Instruction’ is not a direct base of ‘gbe::ir::Instruction’

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd a case for MEM_INVALID to fix a warning
Lu Guanqun [Tue, 9 Apr 2013 06:44:07 +0000 (14:44 +0800)]
add a case for MEM_INVALID to fix a warning

    warning: enumeration value ‘MEM_INVALID’ not handled in switch [-Wswitch]

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd linking library for gcc compiler.
Lu Guanqun [Tue, 9 Apr 2013 06:22:58 +0000 (14:22 +0800)]
add linking library for gcc compiler.

Otherwise, there are lots of linking errors such as:

    undefined reference to `llvm::ConstantDataSequential::getElementType() const'

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFixed a bug when expire registers.
Zhigang Gong [Tue, 19 Feb 2013 11:52:03 +0000 (19:52 +0800)]
Fixed a bug when expire registers.

The previous implementation forgot to change the head when
the to expired register is at the left side of the current
head. Thus the algorithm will be broken, as the algorithm need
the head has the smallest offset.

Without this patch, the register expireing doesn't work. Thus
any kernel function need more than 44 DWORD registers or 11
DWORD vec4 will fail to get registers.

The calculation is:

(Register file size / (type size * simdwidth) - special registers)
(4K / (4 * 16)) - 20 = 44

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoAdd utest case for movforphi's undef case.
Zhigang Gong [Tue, 19 Feb 2013 03:26:35 +0000 (11:26 +0800)]
Add utest case for movforphi's undef case.

This case will trigger MovForPhi to handle a undef vector
element.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoDelete temp files if compiled successfully
Homer Hsing [Thu, 21 Feb 2013 02:45:42 +0000 (10:45 +0800)]
Delete temp files if compiled successfully

If IR was compiled successfully, delete temp files.
My '/tmp' was full of '*.cl', '*.ll' files.
Now the temp files are gone, world is clean :)

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoOutput meaning of special registers in dumped IR
Homer Hsing [Sun, 17 Feb 2013 02:59:44 +0000 (10:59 +0800)]
Output meaning of special registers in dumped IR

Help debug IR. Before we change this, we feel hard to know
what "%3, %4" stuff mean in IR. Now we output their meaning.

Before we change this, dumped IR is:
    .decl.dword %0
    .decl.dword %1
    .decl.dword %2

After we change this, dumped IR is:
    .decl.dword %0 local_id_0
    .decl.dword %1 local_id_1
    .decl.dword %2 local_id_2

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoDisplay function argument name in IR
Homer Hsing [Sun, 17 Feb 2013 01:59:18 +0000 (09:59 +0800)]
Display function argument name in IR

Help debug IR. If we see "LT.int32 %31 %30 %20" and we know
"%20" is a input argument with detailed name, we can debug
IR better.

Before we change it, dumped IR is:
   decl_input.value %20

After we change it, dumped IR is:
   decl_input.value %20 argument_name

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix the possible overflow in slm_sz
Lu Guanqun [Wed, 30 Jan 2013 01:11:10 +0000 (09:11 +0800)]
fix the possible overflow in slm_sz

slm_sz in this structure has 16 bits, but we might specify 64KB which has 17
bits, it would thus cause overflow and undesired truncation.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFix uninitialize value warning
Zou, Nanhai [Mon, 4 Feb 2013 07:10:40 +0000 (15:10 +0800)]
Fix uninitialize value warning

Give a inital value to shut up compiler
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAlso make "arithmetic shift right" work
Homer Hsing [Fri, 1 Feb 2013 07:52:44 +0000 (15:52 +0800)]
Also make "arithmetic shift right" work

Only add a line of code ...
Looks like Mr. Ben has forgotten that line ...
Also add a test case.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoMake "logical shift right" work
Homer Hsing [Fri, 1 Feb 2013 05:48:59 +0000 (13:48 +0800)]
Make "logical shift right" work

Before we fix the bug, the "logical shift right" is wrong:
  (0xFF000000U >> 24) == 0xFFFFFFFF

After we fix the bug, it is right.
  (0xFF000000U >> 24) == 0xFF

Also add a test case, test by 128 random input.

This patch fixes "component_transfer_linear" filter.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd convert_uchar_sat and test case
Homer Hsing [Wed, 30 Jan 2013 06:05:31 +0000 (14:05 +0800)]
Add convert_uchar_sat and test case

'convert_uchar_sat' converts float to uchar saturately.
'convert_uchar_sat' simply calls add_sat.
by 'convert_uchar_sat' function we don't need to clamp(value, 0, 255).
we also add a test case.

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoFixed a bug on 64bit system.
Zhigang Gong [Mon, 28 Jan 2013 09:06:57 +0000 (17:06 +0800)]
Fixed a bug on 64bit system.

Previous implementation only considers the 32bit system.
This commit fix it. And use bsf rather than bsr, then we
don't really need to fixup the slot number according to
the bit width. And remove the non-intel platform code.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Feng, Boqun <boqun.feng@intel.com>
11 years agofix disassembler: horizontal stride of dest operand
Homer Hsing [Wed, 30 Jan 2013 02:48:06 +0000 (10:48 +0800)]
fix disassembler: horizontal stride of dest operand

If horizontal stride of a dest operand is four,
before we fix it, the disassembler outputs
(+f0) mov(8) g20<3>UB g18<8,8,1>F { align1 WE_normal 1Q };

after we fix it,
(+f0) mov(8) g20<4>UB g18<8,8,1>F { align1 WE_normal 1Q };

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoTest case for fabs
Homer Hsing [Fri, 25 Jan 2013 02:31:59 +0000 (10:31 +0800)]
Test case for fabs

Test 128 times fabs(). Input data is random

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agofix the wrong zero extend instruction handling
Lu Guanqun [Thu, 24 Jan 2013 04:22:08 +0000 (12:22 +0800)]
fix the wrong zero extend instruction handling

This helps to fix the problem facing Xing Homer. When we encounter 'zext'
instruction, we should take the soruce type as an unsigned type, otherwise
we're just sign extend the number, which is not what we want.

For the implementation, I'm reusing getUnsignedType() and this expects an
integer to be accepted, and this conforms to the spec of LLVM.

This patch doesn't trigger regression. I'll provide the corresponding test case
later.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoTest case for integer division arithmetic ~
Homer Hsing [Tue, 22 Jan 2013 04:18:29 +0000 (12:18 +0800)]
Test case for integer division arithmetic ~

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoAdd a test case for integer remainder arithmetic
Homer Hsing [Tue, 22 Jan 2013 04:05:51 +0000 (12:05 +0800)]
Add a test case for integer remainder arithmetic

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoMake integer remainder & division arithmetic work ~ ~
Homer Hsing [Tue, 22 Jan 2013 02:59:37 +0000 (10:59 +0800)]
Make integer remainder & division arithmetic work ~ ~

Intel Gen7 GPU cannot do integer remainder (or division) if exec_width is 16.
But we are generating such kind of cmds, so the GPU hangs.
We need to split such command ~ ~

before:
(+f0) math intmod(16) g18<1>D g12<8,8,1>D g2.1<0,1,0>D {align1 WE_normal 1Q};

after:
(+f0) math intmod(8) g18<1>D g12<8,8,1>D g2.1<0,1,0>D {align1 WE_normal 1Q};
(+f0) math intmod(8) g19<1>D g13<8,8,1>D g2.1<0,1,0>D {align1 WE_normal 1Q};

Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoSplit the multiple test cases to individual cases.
Zhigang Gong [Tue, 22 Jan 2013 08:46:45 +0000 (16:46 +0800)]
Split the multiple test cases to individual cases.

As the original test framework will do some resource releasing
job at the end of each test runing. If we call multiple tests
in one instance, then we will miss the chance to call those releasing
functions thus may cause some leaking.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Homer Hsing <homer.xing@intel.com>
11 years agoadd sub_sat test case
Lu Guanqun [Fri, 18 Jan 2013 07:45:58 +0000 (15:45 +0800)]
add sub_sat test case

Our hardware can't handle the following case properly:

  add.sat g3 g1 -g2

when g1 and g2 are INT_MIN. So let's disable the INT's test case right now.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd anonymous namespace to avoid name collision with the next patch
Lu Guanqun [Fri, 18 Jan 2013 07:45:58 +0000 (15:45 +0800)]
add anonymous namespace to avoid name collision with the next patch

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd sub_sat operation
Lu Guanqun [Fri, 18 Jan 2013 07:45:58 +0000 (15:45 +0800)]
add sub_sat operation

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agofix two unused variables
Lu Guanqun [Thu, 17 Jan 2013 06:50:14 +0000 (14:50 +0800)]
fix two unused variables

I don't find these two warnings until now, and here's the simple fix.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd add_sat test case
Lu Guanqun [Thu, 17 Jan 2013 06:37:08 +0000 (14:37 +0800)]
add add_sat test case

Due to our limiation of not supporting 64 bit yet, I'll just omit the long case
in this patch, when 64 bit is ready, we can re-enable this.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoadd add_sat operation
Lu Guanqun [Thu, 17 Jan 2013 06:37:08 +0000 (14:37 +0800)]
add add_sat operation

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agokeep track of saturate flag in GenInstructionState
Lu Guanqun [Thu, 17 Jan 2013 06:37:08 +0000 (14:37 +0800)]
keep track of saturate flag in GenInstructionState

As a side effect, the constraint about GenInstructionState is removed, because
this structure is merely used for book keeping and it doesn't directly maps
onto hardware, it doesn't need to be strictly uint32_t.

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
11 years agoFixed a bug in write_imagef.
Zhigang Gong [Thu, 17 Jan 2013 11:28:34 +0000 (19:28 +0800)]
Fixed a bug in write_imagef.

In write_imagef, the color is a float vector, and the TypedWrite send
message need to use float too. Previous implementation convert the
float to a UD register which is a bug.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agoImplement cl_khr_gl_sharing by using upstream technology.
Zhigang Gong [Tue, 26 Feb 2013 09:36:53 +0000 (17:36 +0800)]
Implement cl_khr_gl_sharing by using upstream technology.

The previous implementation use a modified mesa library
and can't work with upstream mesa.

Now I managed to use existing egl extension and gbm to
import a gl texture to a cl image. Actually, the gbm
can't fully support our purpose, as it can't lookup
image for egl x11 platform. We have to touch gbm's
internal data structure to manually initialize its image
extension.

Furthermore, gbm only provide the API to get the image's
handle, and doesn't provide the one to get image's name.
As we are using different fd from the existing GL loader.
The handle is useless for us. I use the DRI2 image extension
function directly to get the name rather than the handle.
And it works well.

Now, after this patch applied. The cl_khr_gl_sharing could
work with upstream mesa. I recommend you use the latest git
master version.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
11 years agoutest: Added one test case for the int4 constant vector.
Zhigang Gong [Wed, 16 Jan 2013 09:35:05 +0000 (17:35 +0800)]
utest: Added one test case for the int4 constant vector.

This test case will initialize a int4 vector according to
a constant expression. And will hit a bug as current compiler
doesn't handle the ConstantDataSequential type constant correctly.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>
11 years agoFixed TYPED_WRITE instruction bug for SRC register allocation.
Zhigang Gong [Wed, 16 Jan 2013 07:43:29 +0000 (15:43 +0800)]
Fixed TYPED_WRITE instruction bug for SRC register allocation.

The previous implementation incorrectly treat the BTI/U/V as
destination register. And this breaks the DAG analysis when
do the instrunction selection. This patch fixes this bug.

Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Lu Guanqun <guanqun.lu@intel.com>