Is there a good OpenCL wrapper for Ruby?

require 'opencl_ruby_ffi'
require 'narray_ffi' # provides NArray, used below for the host buffers

# select the first platform/device available 
# improve it if you have multiple GPUs on your machine
platform = OpenCL::platforms.first 
device = platform.devices.first 

# prepare the source of GPU kernel 
# this is not Ruby but OpenCL C 
source = <<EOF
__kernel void addition( float2 alpha, __global const float *x, __global float *y) {
    size_t ig = get_global_id(0);
    y[ig] = (alpha.s0 + alpha.s1 + x[ig])*0.3333333333333333333f;
}
EOF

# configure OpenCL environment, refer to OCL API if necessary 
context = OpenCL::create_context(device) 
queue = context.create_command_queue(device, :properties => OpenCL::CommandQueue::PROFILING_ENABLE) 

# create and compile the OpenCL C source code 
prog = context.create_program_with_source(source) 
prog.build 

# allocate CPU (=RAM) buffers and 
# fill the input one with random values 
a_in = NArray.sfloat(65536).random(1.0) 
a_out = NArray.sfloat(65536) 

# allocate GPU buffers matching the CPU ones 
b_in = context.create_buffer(a_in.size * a_in.element_size, :flags => OpenCL::Mem::COPY_HOST_PTR, :host_ptr => a_in) 
b_out = context.create_buffer(a_out.size * a_out.element_size) 

# create a constant pair of float 
f = OpenCL::Float2::new(3.0,2.0) 

# trigger the execution of kernel 'addition' on 65536 work-items,
# grouped into work-groups of 128 work-items each
event = prog.addition(queue, [65536], f, b_in, b_out, 
         :local_work_size => [128]) 
# #Or if you want to be more OpenCL like: 
# k = prog.create_kernel("addition") 
# k.set_arg(0, f) 
# k.set_arg(1, b_in) 
# k.set_arg(2, b_out) 
# event = queue.enqueue_NDrange_kernel(k, [65536],:local_work_size => [128]) 

# tell OCL to transfer the content of the GPU buffer b_out
# to the CPU memory (a_out), but only after `event` (= kernel execution) 
# has completed 
queue.enqueue_read_buffer(b_out, a_out, :event_wait_list => [event]) 

# wait for everything in the command queue to finish 
queue.finish 
# now a_out contains the result of the addition performed on the GPU 

# add some cleanup here ... 

# verify that the computation went well 
diff = (a_in - a_out*3.0) 
65536.times { |i| 
    raise "Computation error #{i} : #{diff[i]+f.s0+f.s1}" if (diff[i]+f.s0+f.s1).abs > 0.00001 
} 
puts "Success!"
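
If you want to see what values to expect in a_out without a GPU at hand, the kernel can be mirrored in plain Ruby. This is only a sketch: `addition_reference` is a hypothetical helper that is not part of opencl_ruby_ffi, plain Arrays stand in for the NArray buffers, and it works in double precision, so it should agree with the float32 kernel only up to rounding.

```ruby
# Plain-Ruby mirror of the OpenCL C kernel `addition` (no GPU required).
# `addition_reference` is a hypothetical helper, not an opencl_ruby_ffi call.
def addition_reference(alpha, x)
  # Each array index plays the role of get_global_id(0) in the kernel.
  x.map { |xi| (alpha[0] + alpha[1] + xi) / 3.0 }
end

alpha = [3.0, 2.0]        # same pair as OpenCL::Float2::new(3.0, 2.0)
x     = [0.0, 0.25, 1.0]  # stand-in for a few entries of a_in
y     = addition_reference(alpha, x)

# Same check as in the script: x[i] - 3*y[i] + alpha.s0 + alpha.s1 should be ~0.
x.each_with_index do |xi, i|
  residual = xi - 3.0 * y[i] + alpha[0] + alpha[1]
  raise "Computation error #{i} : #{residual}" if residual.abs > 0.00001
end

puts y.inspect
```

In the real script the corresponding values are simply the elements of a_out once queue.finish has returned.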

2014-11-21 15:16:25 Kevin

Could you elaborate on what is happening here? Where can I read out the actual values of the addition, etc.? – Automatico

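I have commented the source code; the result of the addition is fetched from GPU memory by the 'queue.enqueue_read_buffer' operation. There are plenty of (C) OpenCL tutorials online, and once you have grasped the essentials of the API, translating them to Ruby should be fairly easy. – Kevin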

Thanks a lot! :) – Automatico
