| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch mainly contains:
1. built-in function __gen_ocl_ime implementation.
2. Lots of built-in functions of cl_intel_device_side_avc_motion_estimation
are implemented.
3. This extension is required to run in simd16 mode.
v2: move the utests to seprate patches one by one;
as all the utests has extension function check, no need to put them
in stand alone utest;
uncomment the self test;
fix extension check logic issue, should be && instead of ||.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Signed-off-by: Xionghu Luo <xionghu.luo@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Add CL_DEVICE_SUB_GROUP_SIZES_INTEL for clGetDeviceInfo, add
CL_KERNEL_SPILL_MEM_SIZE_INTEL for clGetKernelWorkGroupInfo and add
CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL for clGetKernelSubGroupInfo.
We only have this extension for LLVM 40+ for frontend support.
V2: Add opencl-c define
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
|
|
| |
Found a missing macro that need change to support LLVM40+.
Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
| |
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
|
|
|
|
|
| |
Currently, cl_intel_motion_estimation is just implemented on IVB.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
| |
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to query low level layout of GL buffer object/texture/render
buffer, previous implementation introduced an egl extension and
implemented in Beignet side. This way is broken once mesa change its
related internal code. In this patch, we use an new egl extension
(EGL_MESA_image_dma_buf_export) to query related layout infomations
of gl texture. Since this egl extension is already accepted by Khronos,
so it's a stable method. This patch just implement GL texture 2d buffer
sharing, and we will implement other target type if necessary.
v2:
Add CMake build option to enable cl_khr_gl_sharing(default off).
Clean up related CMake code.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
|
|
| |
The extension is supported in fact and avoid misunderstanding.
Signed-off-by: Yan Wang <yan.wang@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
| |
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
define a MACRO to hold the value.
v2: use same MACRO in cl_extensions.h; add header file protection for
cl_extension.h.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
1. Just upload the first vme_state.
2. Remove duplicated code in check_opt1_extension.
3. Check image format before cl_gpgpu_bind_image_for_vme.
4. Fix error of getting mv. Because we suppose this kernel run in SIMD16
mode, so dword 0 of grf 1 should be
__gen_ocl_region(8,vme_result.s0), not
__gen_ocl_region(0,vme_result.s1).
v3:
Return CL_IMAGE_FORMAT_NOT_SUPPORTED if image format is not the required
one.
v4:
Fix two conflicts after code rebase and wordaround a curbe related bug.
v6:
Treat simd8 and simd16 differently when getting mv.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We enable fp64 extension just on BDW platform. The
platforms before Gen7 will not have fp64 support.
We will enable fp64 on gen8 later platforms after
this feature is stable.
V3:
Unify the extersion setting for FP16 and FP64.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this patch allows create 2d image with a cl buffer with zero copy.
v2: should use reference to manage the release the buffer and image.
After being created, the buffer reference count is 2, and image reference
count is 1.
if image is released first, decrease the image reference count and
buffer reference count both, release the bo when the buffer is released
at last;
if buffer is released first, decrease the buffer reference count only,
release the buffer when the image is released.
add CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT in cl_device_info.
v3: move is_image_from_buffer to _cl_mem_image; return
CL_INVALID_IMAGE_SIZE if image size is larger than the buffer.
v4: pitchalignment set to 2.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
| |
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
| |
The cl device may have different extensions from the
platform. We will add some items based on the platform
extensions.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The init order of the intel_platform and the intel_extension is
somehow not clear. When some API such as clGetDeviceIDs can pass
NULL as cl_platform_id, we just use the global value intel_platform
as the default but do not care about the init state of the extension.
The init of the extension may be done when the cl device is created.
This is OK if the paltform and the device have the same extensions.
But now because of the fp16, they are not always the same.
Use cl_get_platform_default to replace the global value to ensure that
when default platform is available, the extension is also inited.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
|
|
|
| |
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
|
|
|
|
|
| |
need minus one when fill '\0' to sizeof char type array.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
the clang 3.5 will call CallGraphSCCPass to add attribute "Attribute::ReadOnly"
for these parameters only reads memeory, but this attribute is not
supported in the VerifierPass of llvm 3.3. This is a bug of llvm 3.3.
v2: disable this extension in runtime for old llvm.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the SPIR are built by clang generating a standard llvm Module file,
beignet need insert one byte before the module repesents binary type
then parse the module to link.
enable cl_khr_spir extension output string;
enable the SPIR calling conversion of CallingConv::SPIR_KERNEL;
get_global_id shoud be OVERLOADABLE; fix some bugs in prinf parse
and backend.
v2: move OVERLOADABLE change to another patch to keep clean;
rename FROM_INTERMEDIATE to FROM_LLVM_SPIR.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous implementation is only for 2d/3d texture sharing and
is implemented in a hacky fashinon. We need to replace it with a
clean and complete one. We introduce a new egl extension to export
low level layout information of a buffer object/texture/render buffer
from the mesa dri driver to the cl driver layer. As the extension is
not accpepted by mesa, we have to implement this new extension in
beignet internally.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: He Junyan <junyan.he@inbox.com>
|
|
|
|
|
|
|
| |
As the double support is incomplete currently, we disable it.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
|
|
|
|
|
| |
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a pointer to the dispatch table at the beginning of every object
of type
- cl_command_queue
- cl_context
- cl_device_id
- cl_event
- cl_kernel
- cl_mem
- cl_platform_id
- cl_program
- cl_sampler
as required by the ICD specification. The layout of the dispatch table
comes from the OpenCL ICD loader by Brice Videau <brice.videau@imag.fr> and
Vincent Danjean <Vincent.Danjean@ens-lyon.org>.
To avoid dispatch table entries being overwritten with the ICD loader's
implementations of the CL functions (as would be the proper behaviour for
the ELF loader), the -Bsymbolic option is given to the linker.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
|
|
|
|
|
|
| |
The Khronos Group headers define constants with the names of extensions if
the header defines the extension API. When the preprocessor sees one of
these names, it performs macro substitution, leading to compilation errors.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
We don't have an extension checking and initialization implemenation.
Now add it. For the mandatory extensions for OCL1.2 as below:
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_khr_fp64 (for backward compatibility if
double precision is supported)
It seems that we only support the byte addressable store extension.
We still need to write new test case for it to prove whether we really
support it.
For all the other mandatory extensions, we need to implement them if we
want to comply with OCL1.2 specification.
For the optional extensions, currently we only support cl_khr_gl_sharing.
Actually, we are not fully support it. Current implementation is a hack
fashion. I'll change to use upstream mesa to implement it. For now, just
enable this extension.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Tested-by: Homer Hsing <homer.xing@intel.com>
|