author Ruiling Song <ruiling.song@intel.com> 2014-03-19 11:41:54 +0800
committer Zhigang Gong <zhigang.gong@intel.com> 2014-03-25 13:20:47 +0800
commit eeefb77c77920d66834bbced01c002604e5d4f66
tree 76d5ed7d2cc5de1046cd07edec96ebe5b2eeb6f1 /utests/profiling_exec.cpp
parent c8830424f2ae811a1fbc490c4752e156928b02c5
GBE: make byte/short vload/vstore process one element each time.
Per the OCL spec, the computed address (p + offset*n) is 8-bit aligned for char and 16-bit aligned for short in vloadn and vstoren. That is, we cannot assume that vload4 with a char pointer is 4-byte aligned. The previous implementation made Clang generate a load or store with alignment 4 when the actual alignment is only 1.

We need to find another way to optimize vloadn. But before that, let's keep vloadn and vstoren working correctly. This could fix the regression caused by the byte/short optimization.

Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
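
To illustrate the alignment point in the message above, here is a minimal host-side C++ sketch of a per-element vload4/vstore4 for char data. It is not the Beignet implementation: `Char4`, `vload4_bytewise`, and `vstore4_bytewise` are hypothetical names introduced only for this example, and the real change lives in the OpenCL built-in library rather than in host code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for OpenCL's char4 (assumption; not a Beignet type).
struct Char4 {
    std::int8_t s[4];
};

// Sketch of vload4(offset, p) for a char pointer. The address p + offset * 4
// is only guaranteed to be 1-byte aligned, so each element is read on its own
// instead of through a single 4-byte-aligned access.
inline Char4 vload4_bytewise(std::size_t offset, const std::int8_t* p) {
    const std::int8_t* base = p + offset * 4;
    Char4 v;
    for (int i = 0; i < 4; ++i) {
        v.s[i] = base[i];  // one byte-sized load per element
    }
    return v;
}

// Sketch of vstore4(v, offset, p): one byte-sized store per element.
inline void vstore4_bytewise(const Char4& v, std::size_t offset, std::int8_t* p) {
    std::int8_t* base = p + offset * 4;
    for (int i = 0; i < 4; ++i) {
        base[i] = v.s[i];
    }
}
```

Calling `vload4_bytewise(1, buf + 1)` on a byte buffer reads from `buf + 5`, an address that need not be a multiple of 4. This is the case where a load emitted with alignment 4 would be incorrect.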
Diffstat (limited to 'utests/profiling_exec.cpp')
0 files changed, 0 insertions, 0 deletions