delta/beignet.git - gitlab.freedesktop.org: beignet/beignet.git

	Commit message	Author	Age	Files	Lines
*	Implement extension cl_intel_device_side_avc_motion_estimation.	Chuanbo Weng	2017-07-12	1	-1/+1
	This patch mainly contains: 1. built-in function __gen_ocl_ime implementation. 2. Lots of built-in functions of cl_intel_device_side_avc_motion_estimation are implemented. 3. This extension is required to run in simd16 mode. v2: move the utests to separate patches one by one; as all the utests have an extension function check, no need to put them in a standalone utest; uncomment the self test; fix extension check logic issue, should be && instead of ||. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Signed-off-by: Xionghu Luo <xionghu.luo@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	Runtime: Add new API enums for cl_intel_required_subgroup_size extension	Pan Xiuli	2017-06-16	1	-0/+8
	Add CL_DEVICE_SUB_GROUP_SIZES_INTEL for clGetDeviceInfo, add CL_KERNEL_SPILL_MEM_SIZE_INTEL for clGetKernelWorkGroupInfo and add CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL for clGetKernelSubGroupInfo. We only have this extension for LLVM 40+ for frontend support. V2: Add opencl-c define Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	Runtime: Fix a missing llvm version macro for LLVM40+	Pan Xiuli	2017-06-09	1	-1/+1
	Found a missing macro that needs to change to support LLVM40+. Signed-off-by: Pan Xiuli <xiuli.pan@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	GBE/Runtime: eliminate release build warnings.	Yang Rong	2016-12-29	1	-0/+1
	Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
*	runtime: set cl_intel_motion_estimation as IVB specific device extension.	Chuanbo Weng	2016-10-20	1	-1/+2
	Currently, cl_intel_motion_estimation is just implemented on IVB. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	runtime: check all the extension id, not only BASE and OPT1.	Chuanbo Weng	2016-10-20	1	-10/+1
	Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	Runtime: re-enable cl_khr_gl_sharing with existing egl extension.	Chuanbo Weng	2016-09-12	1	-2/+2
	In order to query the low level layout of a GL buffer object/texture/render buffer, the previous implementation introduced an egl extension and implemented it on the Beignet side. This approach breaks once mesa changes its related internal code. In this patch, we use a new egl extension (EGL_MESA_image_dma_buf_export) to query the related layout information of a gl texture. Since this egl extension is already accepted by Khronos, it is a stable method. This patch just implements GL texture 2d buffer sharing, and we will implement other target types if necessary. v2: Add CMake build option to enable cl_khr_gl_sharing (default off). Clean up related CMake code. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	Add cl_khr_3d_image_writes into info string.	Yan Wang	2016-06-12	1	-0/+2
	The extension is in fact supported; listing it avoids misunderstanding. Signed-off-by: Yan Wang <yan.wang@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	runtime: error handling to avoid null pointer dereference.	Luo Xionghu	2016-05-23	1	-1/+1
	Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	runtime: extension size not enough.	Luo Xionghu	2015-11-11	1	-1/+1
	define a MACRO to hold the value. v2: use same MACRO in cl_extensions.h; add header file protection for cl_extension.h. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
*	Add extensions intel_accelerator and basic intel_motion_estimation.	Chuanbo Weng	2015-11-10	1	-1/+3
	v2: 1. Just upload the first vme_state. 2. Remove duplicated code in check_opt1_extension. 3. Check image format before cl_gpgpu_bind_image_for_vme. 4. Fix error of getting mv. Because we suppose this kernel runs in SIMD16 mode, dword 0 of grf 1 should be __gen_ocl_region(8,vme_result.s0), not __gen_ocl_region(0,vme_result.s1). v3: Return CL_IMAGE_FORMAT_NOT_SUPPORTED if image format is not the required one. v4: Fix two conflicts after code rebase and work around a curbe related bug. v6: Treat simd8 and simd16 differently when getting mv. Signed-off-by: Guo Yejun <yejun.guo@intel.com> Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
*	Runtime: Refine ext enable function for platform.	Junyan He	2015-10-27	1	-10/+34
	We enable fp64 extension just on BDW platform. The platforms before Gen7 will not have fp64 support. We will enable fp64 on gen8 later platforms after this feature is stable. V3: Unify the extension setting for FP16 and FP64. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
*	enable create image 2d from buffer in clCreateImage.	Luo Xionghu	2015-09-22	1	-0/+2
	this patch allows creating a 2d image from a cl buffer with zero copy. v2: should use reference counting to manage the release of the buffer and image. After being created, the buffer reference count is 2, and the image reference count is 1. If the image is released first, decrease both the image reference count and the buffer reference count, and release the bo when the buffer is released at last; if the buffer is released first, decrease the buffer reference count only, and release the buffer when the image is released. add CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT in cl_device_info. v3: move is_image_from_buffer to _cl_mem_image; return CL_INVALID_IMAGE_SIZE if image size is larger than the buffer. v4: pitchalignment set to 2. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
*	Runtime: Add default extension for platforms before BDW.	Junyan He	2015-07-14	1	-0/+8
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
*	runtime: Add cl device's standalone extension.	Junyan He	2015-07-06	1	-10/+14
	The cl device may have different extensions from the platform. We will add some items based on the platform extensions. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
*	runtime: Use cl_get_platform_default to replace global value.	Junyan He	2015-07-06	1	-21/+19
	The init order of the intel_platform and the intel_extension is somehow not clear. When some API such as clGetDeviceIDs can pass NULL as cl_platform_id, we just use the global value intel_platform as the default but do not care about the init state of the extension. The init of the extension may be done when the cl device is created. This is OK if the platform and the device have the same extensions. But now because of the fp16, they are not always the same. Use cl_get_platform_default to replace the global value to ensure that when default platform is available, the extension is also inited. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
*	runtime: Add fp16 extension to BDW later platform.	Junyan He	2015-07-02	1	-4/+25
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
*	fix global variable out of boundary writing in libocl.	Luo Xionghu	2015-06-18	1	-1/+1
	need to subtract one from the sizeof of the char array when writing the terminating '\0'. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
*	only support spir extension for beignet build with llvm 3.5 or later.	Luo Xionghu	2015-03-13	1	-0/+3
	clang 3.5 will call CallGraphSCCPass to add the attribute "Attribute::ReadOnly" for parameters that only read memory, but this attribute is not supported in the VerifierPass of llvm 3.3. This is a bug of llvm 3.3. v2: disable this extension in runtime for old llvm. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
*	enable cl_khr_spir extension to build and run from SPIR binary.	Luo Xionghu	2015-03-09	1	-0/+4
	SPIR binaries are built by clang, which generates a standard llvm Module file; beignet needs to insert one byte before the module that represents the binary type, then parse the module to link. enable cl_khr_spir extension output string; enable the SPIR calling convention CallingConv::SPIR_KERNEL; get_global_id should be OVERLOADABLE; fix some bugs in printf parse and backend. v2: move OVERLOADABLE change to another patch to keep clean; rename FROM_INTERMEDIATE to FROM_LLVM_SPIR. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
*	CL: Enable gl sharing with new egl extension.	Zhigang Gong	2013-09-06	1	-18/+4
	The previous implementation is only for 2d/3d texture sharing and is implemented in a hacky fashion. We need to replace it with a clean and complete one. We introduce a new egl extension to export low level layout information of a buffer object/texture/render buffer from the mesa dri driver to the cl driver layer. As the extension is not accepted by mesa, we have to implement this new extension in beignet internally. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Tested-by: He Junyan <junyan.he@inbox.com>
*	GBE: disable cl_khr_fp64.	Zhigang Gong	2013-08-22	1	-0/+1
	As the double support is incomplete currently, we disable it. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com>
*	Enable int32 atomic and fp64 extensions.	Yang Rong	2013-06-28	1	-2/+0
	Signed-off-by: Yang Rong <rong.r.yang@intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
*	Implement KHR ICD extension	Simon Richter	2013-04-18	1	-0/+9
	This adds a pointer to the dispatch table at the beginning of every object of type cl_command_queue, cl_context, cl_device_id, cl_event, cl_kernel, cl_mem, cl_platform_id, cl_program and cl_sampler, as required by the ICD specification. The layout of the dispatch table comes from the OpenCL ICD loader by Brice Videau <brice.videau@imag.fr> and Vincent Danjean <Vincent.Danjean@ens-lyon.org>. To avoid dispatch table entries being overwritten with the ICD loader's implementations of the CL functions (as would be the proper behaviour for the ELF loader), the -Bsymbolic option is given to the linker. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
*	Avoid extension names as preprocessor tokens	Simon Richter	2013-04-18	1	-4/+4
	The Khronos Group headers define constants with the names of extensions if the header defines the extension API. When the preprocessor sees one of these names, it performs macro substitution, leading to compilation errors. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
*	Implement OCL extension initialization.	Zhigang Gong	2013-04-10	1	-0/+113
	We don't have an extension checking and initialization implementation. Now add it. The mandatory extensions for OCL1.2 are as below: cl_khr_global_int32_base_atomics, cl_khr_global_int32_extended_atomics, cl_khr_local_int32_base_atomics, cl_khr_local_int32_extended_atomics, cl_khr_byte_addressable_store, cl_khr_fp64 (for backward compatibility if double precision is supported). It seems that we only support the byte addressable store extension. We still need to write a new test case for it to prove whether we really support it. For all the other mandatory extensions, we need to implement them if we want to comply with the OCL1.2 specification. For the optional extensions, currently we only support cl_khr_gl_sharing. Actually, we do not fully support it. The current implementation is a hack. I'll change to use upstream mesa to implement it. For now, just enable this extension. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Tested-by: Homer Hsing <homer.xing@intel.com>