2014-04-14 41 views

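Timing execution of OpenCL kernels

Is this the correct way to time kernel execution in OpenCL? I would very much like to use the C++ wrapper (which, unfortunately, does not come with many timing examples).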

cl::CommandQueue queue(context, device, CL_QUEUE_PROFILING_ENABLE, &err); 
checkErr(err, "Cannot create the command queue"); 

/* Warm-up */ 
for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i) 
{ 
    err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(512), cl::NullRange, NULL, NULL); 
    checkErr(err, "Cannot enqueue the kernel"); 
} 
queue.finish(); 

/* Time kernels */ 
cl::Event start, stop; 
queue.enqueueMarker(&start); 
for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i) 
{ 
    err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(512), cl::NullRange, NULL, NULL); 
    checkErr(err, "Cannot enqueue the kernel"); 
} 
queue.enqueueMarker(&stop); 

stop.wait(); 
cl_ulong time_start, time_end; 
double total_time; 
start.getProfilingInfo(CL_PROFILING_COMMAND_END, &time_start); 
stop.getProfilingInfo(CL_PROFILING_COMMAND_START, &time_end); 
total_time = time_end - time_start; 

/* Results: profiling timestamps are in nanoseconds, so divide by 1e6 to get milliseconds (10e6 would be 1e7) */
cout << "Execution time in milliseconds " << total_time / 1e6 / NUMBER_OF_ITERATIONS << endl;
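OpenCL profiling timestamps are reported in nanoseconds, so converting to milliseconds means dividing by 1e6 (the literal 10e6 equals 1e7). A minimal sketch of that conversion follows. `average_ms` is a hypothetical helper that is not part of the OpenCL C++ wrapper, and `std::uint64_t` is assumed here as a stand-in for `cl_ulong` so that the snippet compiles without the OpenCL headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the OpenCL C++ wrapper).
// Takes two profiling timestamps in nanoseconds and returns the
// average time per iteration in milliseconds.
double average_ms(std::uint64_t start_ns, std::uint64_t end_ns, unsigned iterations)
{
    assert(iterations > 0 && end_ns >= start_ns);
    const double ns_per_ms = 1e6;  // 1 ms = 1,000,000 ns
    return static_cast<double>(end_ns - start_ns) / ns_per_ms / iterations;
}
```

In the code above it could be called as `average_ms(time_start, time_end, NUMBER_OF_ITERATIONS)`.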

Answer


I think your approach should work fine. Alternatively, if you want to time each call individually, you can pass an event to enqueueNDRangeKernel and call getProfilingInfo on that event:

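cl_ulong elapsed = 0; /* accumulated kernel time in nanoseconds; declare once, outside any timing loop */
cl::Event evt;
err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(512), cl::NullRange, NULL, &evt);
checkErr(err, "Cannot enqueue the kernel");
evt.wait(); /* profiling info is only valid once the command has completed */
elapsed += evt.getProfilingInfo<CL_PROFILING_COMMAND_END>() -
           evt.getProfilingInfo<CL_PROFILING_COMMAND_START>();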
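When this per-event approach is repeated over many launches, the accumulation into `elapsed` amounts to summing (end - start) for each event. The sketch below shows just that arithmetic on plain integers so it can be run without an OpenCL device. `ProfiledSpan` and `total_elapsed_ns` are hypothetical names introduced for illustration, and the timestamps are assumed to have already been read from each event with getProfilingInfo.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two timestamps read from one event (nanoseconds).
struct ProfiledSpan {
    std::uint64_t start_ns;  // would come from CL_PROFILING_COMMAND_START
    std::uint64_t end_ns;    // would come from CL_PROFILING_COMMAND_END
};

// Sums the device-side duration of every span, in nanoseconds.
std::uint64_t total_elapsed_ns(const std::vector<ProfiledSpan>& spans)
{
    std::uint64_t elapsed = 0;
    for (const ProfiledSpan& s : spans) {
        assert(s.end_ns >= s.start_ns);
        elapsed += s.end_ns - s.start_ns;
    }
    return elapsed;
}
```

Note that a sum of per-event durations leaves out any idle time between kernels, whereas the marker-based measurement in the question covers the whole span from the first marker to the last, so the two numbers need not match.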