review.tizen.org Git - contrib/beignet.git/commit

author	Ruiling Song <ruiling.song@intel.com>
	Wed, 19 Mar 2014 03:41:54 +0000 (11:41 +0800)
committer	Zhigang Gong <zhigang.gong@intel.com>
	Tue, 25 Mar 2014 05:20:47 +0000 (13:20 +0800)
commit	eeefb77c77920d66834bbced01c002604e5d4f66
tree	76d5ed7d2cc5de1046cd07edec96ebe5b2eeb6f1
parent	c8830424f2ae811a1fbc490c4752e156928b02c5

GBE: make byte/short vload/vstore process one element each time.

Per the OpenCL spec, the address computed by vloadn and vstoren (p + offset * n)
is only 8-bit aligned for char and 16-bit aligned for short. That is, we cannot
assume that vload4 with a char pointer is 4-byte aligned. The previous implementation
made Clang generate a load or store with alignment 4 when the access is in fact
only guaranteed alignment 1.
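
As an illustration only (plain C, not the actual Beignet library code), the
element-at-a-time approach can be sketched as below. The names char4_t,
vload4_char, vstore4_char and demo_unaligned_roundtrip are hypothetical
stand-ins for OpenCL's char4, vload4 and vstore4. Each element is accessed
through the char pointer itself, so the compiler can only assume 1-byte
alignment.

```c
#include <stddef.h>

/* Hypothetical 4-wide char vector; stands in for OpenCL's char4. */
typedef struct { signed char s[4]; } char4_t;

/* Element-wise vload4: reads p[4*offset] .. p[4*offset + 3] one byte at a
 * time, so nothing stronger than 1-byte alignment of p is assumed. */
static char4_t vload4_char(size_t offset, const signed char *p) {
    char4_t v;
    for (int i = 0; i < 4; ++i)
        v.s[i] = p[4 * offset + i];
    return v;
}

/* Element-wise vstore4: writes the four elements one byte at a time. */
static void vstore4_char(char4_t v, size_t offset, signed char *p) {
    for (int i = 0; i < 4; ++i)
        p[4 * offset + i] = v.s[i];
}

/* Round trip through pointers that are deliberately not 4-byte aligned
 * relative to their buffers. Returns 1 if the data survives intact. */
static int demo_unaligned_roundtrip(void) {
    signed char buf[16];
    signed char out[16] = {0};

    for (int i = 0; i < 16; ++i)
        buf[i] = (signed char)i;

    /* buf + 1 is one byte past the start of buf; with offset 1 the load
     * covers buf[5] .. buf[8]. */
    char4_t v = vload4_char(1, buf + 1);
    if (v.s[0] != 5 || v.s[1] != 6 || v.s[2] != 7 || v.s[3] != 8)
        return 0;

    /* Store at out + 3 and check that the neighbouring bytes are untouched. */
    vstore4_char(v, 0, out + 3);
    return out[2] == 0 && out[3] == 5 && out[6] == 8 && out[7] == 0;
}
```

A version that cast p to a 4-byte vector pointer and dereferenced it would
promise the compiler 4-byte alignment, which is the assumption this commit
removes.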

We need to find another way to optimize vloadn.
Until then, let's keep vloadn and vstoren working correctly.
This should fix the regression caused by the byte/short optimization.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>