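OpenCL barrier not working correctly
Why doesn't this barrier work? The kernel should leave the numbers 1 through length-1 in the output array, with the sum of all the values in output[0]. If I lower the loop scale, this works correctly. With a larger scale, the threads should wait at the barrier, but they don't, and the output is incorrect.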
__kernel void b_test1(__global int* a, int length) {
    int id = get_global_id(0);
    const int scale = 100;
    /* useless loops scaled up by id, just to waste time. note more time is wasted with bigger id */
    for (int i = 0; i < id * scale; i++) a[id] = 0;
    a[id] = id;
    barrier(CLK_GLOBAL_MEM_FENCE);
    if (id == 0) {
        int sum = 0;
        for (int i = 0; i < length; i++) {
            sum += a[i];
        }
        a[0] = sum;
    }
}
My Java code:
CLContext context = JavaCL.createBestContext();
CLQueue queue = context.createDefaultQueue();
CLProgram program = context.createProgram(ReadText.readText(new File("src/kernel1.c")));
CLKernel kernel = program.createKernel("b_test1");
int length=10;
CLIntBuffer input = context.createIntBuffer(CLMem.Usage.InputOutput, length);
kernel.setArgs(input, length);
CLEvent event = kernel.enqueueNDRange(queue, new int[]{length}, new int[]{1});
queue.finish();
IntBuffer output = input.read(queue, event);
String out="";
for (int i = 0; i < length; i++) {
    out += output.get() + "\t";
}
System.out.println(out);
Thanks.
Edit: I'm running this on an NVIDIA GTX 275 (driver v270.61, OpenCL 1.0) under Windows 7, and on an NVIDIA 8600M GS under Ubuntu.
Ah, thanks, so I need to create something like 'CLEvent event = kernel.enqueueNDRange(queue, new int[]{length}, new int[]{length});' – Stephen 2011-04-19 14:23:10
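For anyone wondering why the local size matters: an OpenCL barrier() only synchronizes work-items that belong to the same work-group. With a global size of length and a local size of 1, the launch in the question creates length separate one-item work-groups, so nothing waits for anything else. The sketch below is only a CPU analogy in plain Java, not OpenCL or JavaCL code: each thread stands in for one work-item, and a java.util.concurrent.CyclicBarrier sized to length plays the role of a single work-group containing all of them. The class and method names (WorkGroupBarrierDemo, runOneGroup) are made up for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class WorkGroupBarrierDemo {

    // Models ONE work-group of `length` work-items (hypothetical helper, not a JavaCL API).
    static int[] runOneGroup(final int length) {
        final int[] a = new int[length];
        // The barrier trips only when all `length` "work-items" have arrived.
        final CyclicBarrier barrier = new CyclicBarrier(length);
        Thread[] items = new Thread[length];

        for (int t = 0; t < length; t++) {
            final int id = t;
            items[t] = new Thread(() -> {
                try {
                    Thread.sleep(id * 2L); // waste time, more for bigger ids, like the kernel's loop
                    a[id] = id;
                    barrier.await();       // stands in for barrier(CLK_GLOBAL_MEM_FENCE)
                    if (id == 0) {
                        int sum = 0;
                        for (int i = 0; i < length; i++) {
                            sum += a[i];
                        }
                        a[0] = sum;
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            items[t].start();
        }

        try {
            for (Thread item : items) {
                item.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return a;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        for (int v : runOneGroup(10)) {
            out.append(v).append('\t');
        }
        System.out.println(out);
    }
}
```

Because every thread shares the one barrier, thread 0 cannot start summing until all ten values are written, so a[0] ends up as 45 and the remaining slots hold 1 through 9. The question's launch corresponds to giving each thread its own barrier of size 1, which returns immediately. Note that on a real device the local size is capped by the device's maximum work-group size, so this approach only scales to small arrays.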