The following is my working code, for reference. How do I release GPU memory and use the same buffer for different arrays in PyOpenCL?

```python
import numpy
import pyopencl as cl
import pyopencl.array  # needed for cl.array.vec.float4

vector = numpy.array([1, 2, 4, 8], numpy.float32)  # cl.array.vec.float4
matrix = numpy.zeros((1, 4), cl.array.vec.float4)
matrix[0, 0] = (1, 2, 4, 8)
matrix[0, 1] = (16, 32, 64, 128)
matrix[0, 2] = (3, 6, 9, 12)
matrix[0, 3] = (5, 10, 15, 25)
# vector[0] = (1, 2, 4, 8)
platform = cl.get_platforms()  # all platforms that exist on this machine
device = platform[0].get_devices(device_type=cl.device_type.GPU)  # all GPUs on the first platform in the platform list
context = cl.Context(devices=[device[0]])  # context for the devices in the "device" list above; context.num_devices gives the number of devices in this context
print("everything good so far")
program = cl.Program(context, """
__kernel void matrix_dot_vector(__global const float4 *matrix, __global const float4 *vector, __global float *result)
{
    int gid = get_global_id(0);
    result[gid] = dot(matrix[gid], vector[0]);
}
""").build()
queue = cl.CommandQueue(context)
# queue = cl.CommandQueue(context, device[0])  # queue tied to a specific device, if we plan on using multiple GPUs in parallel
mem_flags = cl.mem_flags
matrix_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=matrix)
vector_buf = cl.Buffer(context, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=vector)
matrix_dot_vector = numpy.zeros(4, numpy.float32)
global_size_of_GPU = 0
destination_buf = cl.Buffer(context, mem_flags.WRITE_ONLY, matrix_dot_vector.nbytes)
# threads_size_buf = cl.Buffer(context, mem_flags.WRITE_ONLY, global_size_of_GPU.nbytes)
program.matrix_dot_vector(queue, matrix_dot_vector.shape, None, matrix_buf, vector_buf, destination_buf)
## Step #11. Move the kernel’s output data to host memory.
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf)
# cl.enqueue_copy(queue, global_size_of_GPU, threads_size_buf)
print(matrix_dot_vector)
# print(global_size_of_GPU)
# COPY SAME ARRAY FROM GPU AGAIN
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf)
print(matrix_dot_vector)
print('copied same array twice')
```

- How can I free the GPU memory held by matrix_buf and destination_buf? One is read-only and the other is write-only.
- How can I load a different matrix array into the same matrix_buf without having to create a new buffer in PyOpenCL? I have read that loading new data into an existing buffer is much faster than re-creating a buffer of the same size every time. (A rough sketch of what I mean follows this list.)
- Is it OK if the new array I load into the old buffer is smaller than the old array that was in it, or must the new array be exactly the same size as the buffer?
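
To make the second and third questions concrete, here is an untested sketch of what I am hoping is possible. It reuses the objects (`queue`, `matrix_buf`, `vector_buf`, `destination_buf`, `matrix_dot_vector`) created in the code above, and the values in `new_matrix` are made up purely for illustration:

```python
# Untested sketch: reuse the existing matrix_buf for a new matrix of the
# same size instead of creating another cl.Buffer.
new_matrix = numpy.zeros((1, 4), cl.array.vec.float4)
new_matrix[0, 0] = (2, 4, 6, 8)
new_matrix[0, 1] = (1, 3, 5, 7)
new_matrix[0, 2] = (9, 8, 7, 6)
new_matrix[0, 3] = (0, 1, 0, 1)

# enqueue_copy with the device buffer as the destination overwrites its
# contents with the new host data (host -> device copy):
cl.enqueue_copy(queue, matrix_buf, new_matrix)

# Run the kernel again with the refreshed buffer and read the result back:
program.matrix_dot_vector(queue, matrix_dot_vector.shape, None,
                          matrix_buf, vector_buf, destination_buf)
cl.enqueue_copy(queue, matrix_dot_vector, destination_buf)
print(matrix_dot_vector)
```

In other words: can `cl.enqueue_copy` with the buffer as the destination replace creating a new `cl.Buffer`, and does the same approach still work when `new_matrix` is smaller than the buffer?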
Could you please explain `release()` and `pyopencl.enqueue_map_buffer()` with an example? I tried reading the link you provided, but it is hard to follow. –
Take a look at the examples here: [pyopencl.buffer.release](http://nullege.com/codes/search?cq=pyopencl.buffer.release) and [pyopencl.enqueue_map_buffer](http://nullege.com/codes/search?cq=pyopencl.enqueue_map_buffer) – doqtor
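
For reference, a minimal untested sketch of how those two calls might be used with the buffers from the question, based on the PyOpenCL documentation (`mapped.base` is assumed here to be the memory-map object associated with the array returned by `enqueue_map_buffer`):

```python
# Map destination_buf into host memory instead of copying it with enqueue_copy.
# The call returns a numpy view onto the buffer plus an event.
mapped, event = cl.enqueue_map_buffer(
    queue, destination_buf, cl.map_flags.READ,
    0,                         # byte offset into the buffer
    matrix_dot_vector.shape,   # shape of the returned numpy array
    matrix_dot_vector.dtype)   # dtype of the returned numpy array
print(mapped)                  # read the results through the mapped view
mapped.base.release(queue)     # unmap before a kernel touches the buffer again

# release() hands the device memory back immediately, rather than waiting for
# the Python Buffer objects to be garbage collected:
matrix_buf.release()
destination_buf.release()
```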