path: root/src/cl_event.c
Commit message | Author | Age | Files | Lines
* Don't leak memory on long chains of events | Rebecca N. Palmer | 2018-08-20 | 1 | -7/+24
  Delete event->depend_events when it is no longer needed, to allow the event objects it refers to to be freed. This avoids out-of-memory hangs in large dependency trees (e.g. long iterative calculations): https://launchpad.net/bugs/1354086
  Signed-off-by: Rebecca N. Palmer <rebecca_palmer@zoho.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
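The idea in this commit can be sketched as follows. This is a minimal illustration, not Beignet's code: the `event` struct, `event_new`, `event_drop_depends`, `event_release` and the `live_events` counter are all hypothetical names; only the `depend_events` field name comes from the commit message.

```c
#include <stdlib.h>

/* Hypothetical, simplified event; not Beignet's real cl_event layout. */
typedef struct event {
  int ref;                       /* reference count */
  struct event **depend_events;  /* events this one waits on */
  size_t depend_count;
} event;

static int live_events = 0;      /* allocated events, for illustration only */

/* Create an event that holds one reference on each dependency. */
event *event_new(event **deps, size_t n) {
  event *e = calloc(1, sizeof(*e));
  if (!e) return NULL;
  e->ref = 1;
  if (n > 0) {
    e->depend_events = malloc(n * sizeof(*e->depend_events));
    if (!e->depend_events) { free(e); return NULL; }
    for (size_t i = 0; i < n; i++) {
      deps[i]->ref++;
      e->depend_events[i] = deps[i];
    }
    e->depend_count = n;
  }
  live_events++;
  return e;
}

void event_release(event *e);

/* Drop the dependency references as soon as they are no longer needed,
 * so that earlier events in a long chain can be freed. */
void event_drop_depends(event *e) {
  for (size_t i = 0; i < e->depend_count; i++)
    event_release(e->depend_events[i]);
  free(e->depend_events);
  e->depend_events = NULL;
  e->depend_count = 0;
}

void event_release(event *e) {
  if (--e->ref > 0) return;
  event_drop_depends(e);
  free(e);
  live_events--;
}
```

Calling `event_drop_depends` once an event's dependencies have completed means that holding the newest event of a long chain no longer keeps every earlier event alive.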
* Typo in error message | Giuseppe Bilotta | 2017-02-08 | 1 | -1/+1
  Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
  Reviewed-by: He Junyan <junyan.he@inbox.com>
* Fix an event notify bug. | Junyan He | 2017-01-06 | 1 | -34/+16
  When an event completes, we need to notify all the command queues within the same context. But sometimes a command_queue in the context is already invalid. Ensure that all the command queues to be notified are valid.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Add the NULL pointer check. | Yang Rong | 2016-12-29 | 1 | -1/+1
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Ruiling Song <ruiling.song@intel.com>
* Runtime: return CL_INVALID_EVENT_WAIT_LIST if not event in the wait list. | Meng Mengmeng | 2016-12-28 | 1 | -1/+1
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Junyan He <junyan.he@linux.intel.com>
* Runtime: fix fill image event assert and some SVM rebase error. | Yang, Rong R | 2016-12-28 | 1 | -1/+1
  Also remove the useless function cl_context_add_svm.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Ruiling Song <ruiling.song@intel.com>
* Improve event execute function. | Junyan He | 2016-12-28 | 1 | -42/+52
  Modify the event exec function to make it the uniform entry point for all event command execution. This will help the timestamp recording and profiling features a lot.
  V2:
  1. Set the event's initial state to a value bigger than CL_QUEUED. The event state should be set to CL_QUEUED exactly when the event is queued. The profiling feature makes this requirement clearer: we need to record the timestamp exactly when the event is queued, so we need an additional state beyond CL_QUEUED.
  2. Fix a cl_event_update_timestamp_gen bug with the CL_SUBMITTED time. The GPU may record the timestamp of CL_RUNNING before the CPU records the timestamp of CL_SUBMITTED. It is an async process and hard for us to control. According to the spec, we need to record the timestamp after a state is reached. For now we just set the CL_SUBMITTED timestamp to the CL_RUNNING timestamp if the CL_SUBMITTED timestamp is the bigger one.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
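Point 2 of V2 amounts to a simple clamp. The sketch below assumes a hypothetical array of four timestamps indexed by the `TS_*` enum; these names are illustrative and are not taken from cl_event.c.

```c
#include <stdint.h>

/* Hypothetical indices standing in for the four profiling timestamps. */
enum { TS_QUEUED, TS_SUBMITTED, TS_RUNNING, TS_COMPLETE, TS_COUNT };

/* The CPU-side SUBMITTED stamp can end up later than the GPU-side RUNNING
 * stamp because the two are recorded asynchronously. Clamp it so that the
 * reported timestamps never go backwards. */
void clamp_submitted(uint64_t ts[TS_COUNT]) {
  if (ts[TS_SUBMITTED] > ts[TS_RUNNING])
    ts[TS_SUBMITTED] = ts[TS_RUNNING];
}
```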
* Add profiling feature based on new event implementation. | Junyan He | 2016-12-28 | 1 | -28/+104
  TODO: In OpenCL 2.0, a new profiling item called CL_PROFILING_COMMAND_COMPLETE is introduced. It means that we need to record the timestamp at which all the child events created by the "kernel enqueuing kernels" feature finish. This should be done after the "kernel enqueuing kernels" feature is enabled.
  V2: Update the event timestamp before inserting into the queue thread, to avoid a multithreading issue.
  V3: Fix up an overflow problem.
  V4: Fix up the overflow to 0xfffffffffffffffff problem. Just take ownership and release the event lock when calling the update timestamp function. The update timestamp function may make a blocking system call, so we should not hold the lock while calling it.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Refine list related functions. | Junyan He | 2016-12-28 | 1 | -12/+11
  Make the list related functions clearer and more readable.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Fix a bug for event error status. | Junyan He | 2016-10-10 | 1 | -8/+10
  V2: Move the event list status check to clWaitForEvents API.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Modify all event related functions using new event handle. | Junyan He | 2016-09-28 | 1 | -601/+466
  Rewrite cl_event and modify all the event functions to use the new event handling. Events will cooperate with the command queue's thread.
  v2: Fix a logic problem when event creation fails.
  V3: Make the default enqueue action do nothing, to handle enqueues that have nothing to do.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Delete all the verbose locks and use list to store CL objects. | Junyan He | 2016-09-23 | 1 | -33/+14
  We use the context's lock when we add and delete CL objects. Every CL object should use its own lock to protect itself. We also add some helper functions to ease the adding and removing operations.
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Runtime: Apply base object to cl_event | Junyan He | 2016-09-02 | 1 | -5/+5
  Signed-off-by: Junyan He <junyan.he@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* runtime: error handling to avoid null pointer dereference. | Luo Xionghu | 2016-05-23 | 1 | -10/+18
  Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* fix gcc build error. | Luo Xionghu | 2015-12-09 | 1 | -1/+1
  This link failure appears on gcc 5.2.1.
  Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Runtime: return the correct error code in cl_event_check_waitlist. | Yang Rong | 2015-11-19 | 1 | -2/+4
  Return CL_INVALID_CONTEXT if the context associated with command_queue and events in event_wait_list are not the same.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Luo Xionghu <xionghu.luo@intel.com>
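A wait-list check of this kind might look roughly like the sketch below. The `context` and `event` structs and the `check_wait_list` function are hypothetical simplifications, not the body of cl_event_check_waitlist; the three error constants mirror the values defined in the standard CL/cl.h header.

```c
#include <stddef.h>

/* Values as defined in CL/cl.h, repeated so the sketch is self-contained. */
#define CL_SUCCESS                  0
#define CL_INVALID_CONTEXT          (-34)
#define CL_INVALID_EVENT_WAIT_LIST  (-57)

/* Hypothetical minimal stand-ins for the runtime objects. */
typedef struct { int id; } context;
typedef struct { context *ctx; } event;

int check_wait_list(const context *queue_ctx, size_t num, event *const *list) {
  /* The count and the pointer must agree: both empty or both set. */
  if ((num == 0) != (list == NULL))
    return CL_INVALID_EVENT_WAIT_LIST;
  for (size_t i = 0; i < num; i++) {
    if (list[i] == NULL)
      return CL_INVALID_EVENT_WAIT_LIST;
    /* Every event must belong to the same context as the queue. */
    if (list[i]->ctx != queue_ctx)
      return CL_INVALID_CONTEXT;
  }
  return CL_SUCCESS;
}
```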
* runtime: refine the last_event in queue to a list | Pan Xiuli | 2015-10-13 | 1 | -11/+41
  Refine the event struct so that last_event becomes a list storing all uncompleted events, and update them on every queue flush. This makes sure all events created in the runtime have a chance to update their status, run callback functions, and then be deleted. This also fixes the memory leak caused by uncompleted events.
  This is a bugfix for https://bugs.freedesktop.org/show_bug.cgi?id=91710
  The leaked events with gpu buffers will be unreferenced and cause other drm buffers to leak, resulting in a terrible memory leak.
  Signed-off-by: Pan Xiuli <xiuli.pan@intel.com>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
* Calculate appropriate timestamps for cl profile | Midhun Kodiyath | 2015-09-23 | 1 | -0/+55
  Calculate the current CPU monotonic raw timestamp in nanoseconds for enqueued, submitted, start and finished, and report it to the application based on the parameter queried.
  Signed-off-by: Midhun Kodiyath <midhunchandra.kodiyath@intel.com>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
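One plausible way to read a monotonic raw CPU timestamp in nanoseconds is shown below. This is a sketch assuming a Linux/glibc system where `CLOCK_MONOTONIC_RAW` is available; `now_ns` is a hypothetical helper name, and the commit's actual implementation may differ.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std modes */
#include <stdint.h>
#include <time.h>

/* Return the raw monotonic clock in nanoseconds, or 0 on failure. */
uint64_t now_ns(void) {
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC_RAW, &ts) != 0)
    return 0;
  return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}
```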
* Runtime: correct event and the wait events compare when check event. | Yang Rong | 2015-07-17 | 1 | -1/+1
  When the event parameter is not NULL, the event will point to a new event, so we need to check the address of the event against the wait events.
  V2: check the address of the event and the wait events.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
* Fixed a thread safe bug. | Zhigang Gong | 2015-07-15 | 1 | -9/+9
  last_event and current_event should be thread private data.
  Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
* runtime: Enhance the error handling when flush gpgpu command queue. | Zhigang Gong | 2015-04-10 | 1 | -2/+4
  Beignet uses drm_intel_gem_bo_context_exec() to flush command queue to linux drm driver layer. We need to check the return value of that function, as it may fail when the application uses very large array.
  Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
* Fix: (v3) Event callback that were not executed when command was already CL_COMPLETE + thread safety for callbacks | David Couturier | 2015-03-27 | 1 | -21/+58
  When trying to register a callback on the clEnqueueReadBuffer command, since it is always processed synchronously, the command was marked CL_COMPLETE every time. If the event returned by clEnqueueReadBuffer was then used to register a callback function, the registration did not check whether the callback needed to be executed.
  Modified the handling of the callback registration in cl_set_event_callback to call the callback being created only if its status is already reached.
  Added thread safety measures for pfn_notify calls since the status value can be changed while executing the callback.
  Grouped the pfn_notify calls into a unified function cl_event_call_callback that handles thread safety: it queues callbacks in a node list while under the protection of pthread_mutex and then calls the callbacks outside of the pthread_mutex (this is required because the callback can deadlock if it calls a cl_api function that uses the mutex).
  Signed-off-by: David Couturier <david.couturier@polymtl.ca>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
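The collect-under-lock, call-outside-lock pattern described here can be sketched as follows. The `callback` and `event` structs, `event_call_callbacks` and `count_cb` are hypothetical names for illustration, not the code of cl_event_call_callback. Status values follow the OpenCL convention that a smaller value means further progress (CL_COMPLETE is 0).

```c
#include <pthread.h>
#include <stddef.h>

typedef void (*notify_fn)(int status, void *user_data);

/* Hypothetical callback node; next_pending links the "ready" list. */
typedef struct callback {
  notify_fn fn;
  void *user_data;
  int trigger_status;            /* fire once event status <= this value */
  int executed;
  struct callback *next;
  struct callback *next_pending;
} callback;

typedef struct {
  pthread_mutex_t lock;
  int status;
  callback *callbacks;
} event;

void event_call_callbacks(event *e) {
  callback *pending = NULL;
  int status;

  /* Phase 1: under the lock, only collect the callbacks that are due. */
  pthread_mutex_lock(&e->lock);
  status = e->status;
  for (callback *cb = e->callbacks; cb; cb = cb->next) {
    if (!cb->executed && status <= cb->trigger_status) {
      cb->executed = 1;          /* mark first so re-entry cannot run it twice */
      cb->next_pending = pending;
      pending = cb;
    }
  }
  pthread_mutex_unlock(&e->lock);

  /* Phase 2: call them with the lock released, so a callback that takes
   * the same lock (for example through an API call) cannot deadlock. */
  while (pending) {
    callback *cb = pending;
    pending = cb->next_pending;
    cb->fn(status, cb->user_data);
  }
}

/* Example callback: bumps a counter passed through user_data. */
void count_cb(int status, void *user_data) {
  (void)status;
  ++*(int *)user_data;
}
```

Marking `executed` before the call also covers the re-entrancy case where a callback triggers another pass over the same list.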
* Fix the opencv_test_core/OCL_Arithm random segment fault. | Yang Rong | 2014-11-21 | 1 | -37/+36
  If cl_event_delete is called before the callback, the event will be deleted if the application releases the event in the callback. So cl_event_delete must be moved to the end.
  V2: V1 did not delete the event if it was not a user event; it also needs to be deleted.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* License: adjust all license version to LGPL v2.1+. | Zhigang Gong | 2014-11-11 | 1 | -1/+1
  To make the license statement consistent to each other, adjust all license versions to v2.1+. Thus beignet should have a pure LGPL v2.1+ license.
  Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
* Fix compile warnings for CLANG compiler | Lv Meng | 2014-08-19 | 1 | -2/+1
  1. fix data structure redefine warnings.
  2. fix 'data' with variable sized type 'union<*>' not at the end of a class warning (in immediate.hpp).
  3. fix implicitly conversion warning.
  4. fix explicitly assigning a variable type warning.
  5. fix comparison of unsigned expression < 0 is always false warning (in cl_api.c).
  Signed-off-by: Lv Meng <meng.lv@intel.com>
  Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
* runtime: fix some subtle event bugs. | Zhigang Gong | 2014-07-11 | 1 | -6/+6
  This patch fixes the following two bugs in event handling.
  1. When it's time to call an event's user callback function, we need to set executed to true before the call, as that callback function may call into clReleaseEvent(); if we don't set the executed status to true, it will enter an infinite recursive loop.
  2. After the user calls clEnqueueNDRangeKernel to get a valid event, the user sets a callback function on that event, and in that callback function releases the event. This scenario is totally correct. But our current event handling doesn't have a dedicated timer thread to update the status of those on-the-fly events, so those events never get a chance to be updated and those callback functions are never executed. Introducing a complete timer-style thread to maintain this type of event is too heavy for this fix release, so this patch chooses an easy workaround: it makes sure the last gpgpu event is finished before the current task is enqueued.
  After this patch, most of the OpenCV 3.0 cases can run smoothly without any serious issue.
  Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* runtime: fix a gpgpu event and thread local gpgpu handling bug. | Zhigang Gong | 2014-07-03 | 1 | -4/+22
  When pending a command queue, we need to record the whole gpgpu structure not just the batch buffer. For the following reason:
  1. We need to keep those private buffer, for example those printf buffers.
  2. We need to make sure this gpgpu will not be reused by other enqueuement.
  v2: Don't try to flush all user event attached to the queue. Just need to flush the current event when doing command queue flush.
  Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
* Refine some event code. | Yang Rong | 2014-07-03 | 1 | -4/+7
  1. Do not add user event to cb->wait_list to avoid ref this user event twice.
  2. Add assert when update status.
  3. Set the queue's last wait event and barrier event to NULL when remove last event.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Fix some event ref count error. | Yang Rong | 2014-07-02 | 1 | -16/+16
  Move the event add ref to function cl_event_new_enqueue_callback for clarity. Also need to add the wait user events' ref count.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: "Luo, Xionghu" <xionghu.luo@intel.com>
* Fix an event status bug. | Yang Rong | 2014-06-19 | 1 | -4/+14
  If an event's status is an error code, the status of the events waiting on this event should also be set to the error code.
  V2: do not execute an enqueue command that waits on an event whose status is an error.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
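The propagation rule can be illustrated with the sketch below. It is heavily simplified and hypothetical: each waiter is assumed to wait on a single event and to complete as soon as its command runs, and the `event` struct and `event_set_status` are not Beignet's.

```c
#include <stddef.h>

/* Hypothetical simplified event. */
typedef struct event {
  int status;              /* > 0 pending, 0 complete, < 0 error code */
  int executed;            /* set once the event's command has run */
  struct event **waiters;  /* events that wait on this one */
  size_t num_waiters;
} event;

void event_set_status(event *e, int status) {
  e->status = status;
  if (status > 0)
    return;                        /* still pending, nothing to propagate */
  for (size_t i = 0; i < e->num_waiters; i++) {
    event *w = e->waiters[i];
    if (status == 0)
      w->executed = 1;             /* dependency met: run the command */
    /* On error the command is skipped and the error code is inherited. */
    event_set_status(w, status);
  }
}
```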
* fix clEnqueueMarkerWithWaitList bug when input event is null. | Luo | 2014-06-17 | 1 | -3/+8
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Fix a clEnqueueBarrierWithWaitList event status bug. | Yang Rong | 2014-06-16 | 1 | -6/+10
  The event's status should be CL_COMPLETE if all events in the wait list are complete, in the functions clEnqueueBarrierWithWaitList and clEnqueueMarkerWithWaitList.
  v2: revert delete the event change in v1.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* add [opencl-1.2] API clEnqueueBarrierWithWaitList. | Luo | 2014-06-13 | 1 | -24/+60
  This command blocks command execution, that is, any following commands enqueued after it do not execute until it completes.
  The clEnqueueMarkerWithWaitList API patch that was pushed was not the latest version; it is updated in this patch.
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
  Signed-off-by: Luo <xionghu.luo@intel.com>
  Conflicts: src/cl_event.c
* add [opencl 1.2] API clEnqueueMarkerWithWaitList. | Luo | 2014-06-13 | 1 | -1/+19
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
* Implement the clEnqueueFillBuffer API. | Junyan He | 2014-06-13 | 1 | -0/+1
  We use floatn assignment to do the copy. The 128 pattern size corresponds to double16, and because of the double problem on our platform, we use float16 to handle this. Unaligned cases are not optimized now and just use char assignment.
  Signed-off-by: Junyan He <junyan.he@linux.intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Fix timestamp on HASWELL | Li Peng | 2014-05-30 | 1 | -2/+2
  The GPU timestamp should be the lower 36 bits on HASWELL.
  Signed-off-by: Li Peng <peng.li@intel.com>
  Reviewed-by: He Junyan <junyan.he@inbox.com>
* fix event related bugs. | Luo | 2014-05-22 | 1 | -41/+70
  1. remove repeated user events in list.
  2. missed braces in loops.
  3. fix barrier event reference not increased.
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
* Silent some compilation warnings. | Zhigang Gong | 2014-04-08 | 1 | -2/+2
  Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
  Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
  Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
* Complete the feature of clGetEventProfilingInfo API | Junyan He | 2013-11-29 | 1 | -10/+17
  The profiling feature is now fully supported. We use drm_intel_reg_read to get the current time of the GPU when the event is queued and submitted, and use the PIPE_CONTROL cmd to get the executing time of the GPU for kernel start and end.
  One trivial problem is that the GPU timer counter is 36 bits with a resolution of 80 ns, so it wraps after 2^36 * 80 ns, about 5500 s (roughly an hour and a half). Some tests last about 2~5 min, and if one starts shortly before the counter wraps, the wrap-around may cause the case to fail.
  Signed-off-by: Junyan He <junyan.he@linux.intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
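The arithmetic behind the wrap period, and one possible way to tolerate a single wrap when computing an interval, can be worked through as below. The 36-bit width and 80 ns tick come from the commit message; the macro and function names are hypothetical, and the modular subtraction is an illustration rather than what the commit implements.

```c
#include <stdint.h>

#define GPU_TS_BITS 36
#define GPU_TS_MASK ((UINT64_C(1) << GPU_TS_BITS) - 1)
#define GPU_TICK_NS UINT64_C(80)

/* Time until the 36-bit counter wraps: 2^36 ticks of 80 ns each. */
uint64_t gpu_wrap_period_ns(void) {
  return (GPU_TS_MASK + 1) * GPU_TICK_NS;   /* 5,497,558,138,880 ns */
}

/* Elapsed time between two raw readings. Subtracting modulo 2^36 gives
 * the right answer as long as the counter wrapped at most once. */
uint64_t gpu_elapsed_ns(uint64_t start, uint64_t end) {
  uint64_t ticks = ((end & GPU_TS_MASK) - (start & GPU_TS_MASK)) & GPU_TS_MASK;
  return ticks * GPU_TICK_NS;
}
```

The wrap period works out to about 5497.6 s, which is the "5500 s" figure quoted in the commit message.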
* Move the gpgpu struct from cl_command_queue to thread specific context | Junyan He | 2013-11-08 | 1 | -2/+4
  We find that some cases use multiple threads to run on the same queue, executing the same kernel. This causes the gpgpu struct, which is very important for GPU context setting, to be destroyed, because we do not implement any sync protection on it now. Moving the gpgpu struct into thread specific space fixes this problem because lib_drm will do the GPU command serialization for us.
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
  Reviewed-by: "Zou, Nanhai" <nanhai.zou@intel.com>
* runtime: Fix a dangling pointer issue | Ruiling Song | 2013-10-31 | 1 | -6/+9
  ctx->events points to the head of the 'event list' under the ctx. When deleting an event from the list, we should also update the head pointer besides updating its neighbours' next & prev.
  Signed-off-by: Ruiling Song <ruiling.song@intel.com>
  Reviewed-by: "Xing, Homer" <homer.xing@intel.com>
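A minimal sketch of the fix: removing a node from a doubly linked list whose head is stored in the context. The `events` field name follows the commit message; the struct layouts and `context_remove_event` are hypothetical.

```c
#include <stddef.h>

typedef struct event { struct event *prev, *next; } event;
typedef struct { event *events; } context;   /* events: head of the list */

void context_remove_event(context *ctx, event *e) {
  if (e->prev) e->prev->next = e->next;
  if (e->next) e->next->prev = e->prev;
  /* Without this, ctx->events would dangle after the head is removed. */
  if (ctx->events == e)
    ctx->events = e->next;
  e->prev = e->next = NULL;
}
```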
* Using the PIPE_CONTROL to implement get time stamp in gen backend | Junyan He | 2013-10-18 | 1 | -0/+22
  We use PIPE_CONTROL to get the timestamps from the GPU just after batch start and before batch flush, using the first one to calculate the CL_PROFILING_COMMAND_START time and the second one to calculate the CL_PROFILING_COMMAND_END time.
  There are 2 limitations here:
  1. The end timestamp is taken just before the flush, so the flush time is not included, which loses some accuracy. Because we do not know which event will be used for profiling when it is created, adding another flush for the end timestamp may add some overhead.
  2. The CPU and GPU times cannot be synchronized correctly now, so the CL_PROFILING_COMMAND_QUEUED and CL_PROFILING_COMMAND_SUBMIT times, which are taken on the CPU side, cannot be calculated correctly against the same base time as the GPU. So we simply set them to CL_PROFILING_COMMAND_START for now.
  For events not involving GPU operations, such as ReadBuffer, all the times are 0 for now.
  Signed-off-by: Junyan He <junyan.he@linux.intel.com>
  Reviewed-by: Yang Rong <rong.r.yang@intel.com>
* Implement clEnqueueMarker and clEnqueueBarrier. | Yang Rong | 2013-09-18 | 1 | -5/+67
  Add some event info to cl_command_queue. One is the non-complete user events, used to block marker events and barriers. After these events become CL_COMPLETE, the events blocked by them also become CL_COMPLETE, so the marker event is also set to CL_COMPLETE. If there are no user events, we need to wait for the last event to complete and then set the marker event to complete.
  Add barrier_index for clEnqueueBarrier; it points into the user events and indicates how many user events the enqueue APIs following clEnqueueBarrier should wait on.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Refine and fix some event bugs. | Yang Rong | 2013-09-18 | 1 | -6/+34
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Fix event pthread_mutex_lock dead lock. | Yang Rong | 2013-08-14 | 1 | -4/+17
  In function cl_event_set_status, cl_event_delete is called between pthread_mutex_lock and pthread_mutex_unlock; it also requires the same lock, causing a deadlock. Unlock before calling cl_event_delete.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Add openCL event support. | Yang Rong | 2013-08-12 | 1 | -3/+372
  Now use deferred execution to wait on events. If no user event is waited on, use wait rendering to wait for the GPU events to complete and call the enqueue API immediately. If user events are waited on, prepare the enqueue data and resume the enqueue when all the user events waited on complete.
  To achieve this, add the enqueue callback to the user event, and add all the user events and the other wait event list to the enqueue callback. When a user event is set to complete, check all enqueue callbacks waiting on this event.
  clEnqueueMarker/clEnqueueBarrier are still not implemented, and clEnqueueMapBuffer/clEnqueueMapImage are not consistent with the spec.
  Signed-off-by: Yang Rong <rong.r.yang@intel.com>
  Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
* Added support for llvm 3.1 | Benjamin Segovia | 2012-11-08 | 1 | -0/+1
  Cleaned up some warnings
  Forced pedantic option removal in the Makefile
* Added all miniCL files | bsegovia | 2012-08-10 | 1 | -0/+19