author | Ruiling Song <ruiling.song@intel.com> | 2014-03-19 11:41:54 +0800 |
---|---|---|
committer | Zhigang Gong <zhigang.gong@intel.com> | 2014-03-25 13:20:47 +0800 |
commit | eeefb77c77920d66834bbced01c002604e5d4f66 (patch) | |
tree | 76d5ed7d2cc5de1046cd07edec96ebe5b2eeb6f1 /utests/profiling_exec.cpp | |
parent | c8830424f2ae811a1fbc490c4752e156928b02c5 (diff) | |
download | beignet-eeefb77c77920d66834bbced01c002604e5d4f66.tar.gz |
GBE: make byte/short vload/vstore process one element each time.
Per the OpenCL spec, the computed address (p + offset*n) in vloadn and vstoren is
only 8-bit aligned for char and 16-bit aligned for short. That is, we cannot assume
that a vload4 through a char pointer is 4-byte aligned. The previous implementation
made Clang generate a load or store with alignment 4 when the actual alignment is only 1.
We need to find another way to optimize vloadn.
Until then, let's keep vloadn and vstoren working correctly.
This should fix the regression caused by the byte/short optimization.
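
As a rough illustration of the element-at-a-time approach described above, here is a minimal sketch in plain C. It is not the Beignet implementation: the names char4_sketch, vload4_sketch and vstore4_sketch are hypothetical stand-ins for the OpenCL char4 type and the vload4/vstore4 built-ins. Each element is accessed as a single byte, so nothing stronger than 1-byte alignment is assumed for the computed address.

```c
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's char4 (assumption, not Beignet's type). */
typedef struct { signed char s[4]; } char4_sketch;

/* Load 4 chars from p + offset*4, one element at a time.
   Only byte accesses are issued, so the address may have any alignment. */
static char4_sketch vload4_sketch(size_t offset, const signed char *p)
{
    char4_sketch v;
    const signed char *src = p + offset * 4;
    for (int i = 0; i < 4; ++i)
        v.s[i] = src[i];
    return v;
}

/* Store 4 chars to p + offset*4, again one element at a time. */
static void vstore4_sketch(char4_sketch v, size_t offset, signed char *p)
{
    signed char *dst = p + offset * 4;
    for (int i = 0; i < 4; ++i)
        dst[i] = v.s[i];
}
```

The trade-off is four narrow accesses instead of one wide one, which is why the message above still calls for a separate way to optimize vloadn.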
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Diffstat (limited to 'utests/profiling_exec.cpp')
0 files changed, 0 insertions, 0 deletions