I wrote the very simple kernel below to practice CUDA. Why does changing a kernel parameter exhaust my resources?
import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np
from pycuda.compiler import SourceModule
from pycuda import gpuarray
import cv2


def compile_kernel(kernel_code, kernel_name):
    # Compile the CUDA source and return a callable handle to the kernel.
    mod = SourceModule(kernel_code)
    func = mod.get_function(kernel_name)
    return func


input_file = np.array(cv2.imread('clouds.jpg'))
height, width, channels = np.int32(input_file.shape)

my_kernel_code = """
__global__ void my_kernel(int width, int height) {
    // This kernel trivially does nothing! Hurray!
}
"""
kernel = compile_kernel(my_kernel_code, 'my_kernel')

if __name__ == '__main__':
    for i in range(0, 2):
        print('o')
        kernel(width, height, block=(32, 32, 1), grid=(125, 71))
        # When I take this line away, the error goes bye bye.
        # What in the world?
        width -= 1
If I run the code above, the first iteration of the for loop executes fine. During the second iteration, however, I get the following error.
Traceback (most recent call last):
  File "outOfResources.py", line 27, in <module>
    kernel(width, height, block=(32, 32, 1), grid=(125, 71))
  File "/software/linux/x86_64/epd-7.3-1-pycuda/lib/python2.7/site-packages/pycuda-2012.1-py2.7-linux-x86_64.egg/pycuda/driver.py", line 374, in function_call
    func._launch_kernel(grid, block, arg_buf, shared, None)
pycuda._driver.LaunchError: cuLaunchKernel failed: launch out of resources
If I remove the line width -= 1, the error goes away. Why is that? Can't I change the kernel's parameters for a second launch? For reference, here is clouds.jpg.
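One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: PyCUDA appears to pack NumPy scalar arguments according to their dtype, and the result type of subtracting a Python int from an np.int32 depends on the NumPy version. If width stops being a 4-byte integer after width -= 1, the second launch would no longer match the kernel's int parameter. The sketch below needs no GPU; it only prints the types involved and shows an explicit cast back to np.int32. The helper name decrement_int32 is hypothetical.

```python
import numpy as np


def decrement_int32(value):
    """Subtract 1 and cast the result back to np.int32 (hypothetical helper)."""
    return np.int32(value - 1)


width = np.int32(1000)
print('%s: %d bytes' % (type(width).__name__, width.dtype.itemsize))

# The dtype of this result depends on the installed NumPy version.
promoted = width - 1
print('%s: %d bytes' % (type(promoted).__name__, promoted.dtype.itemsize))

# An explicit cast keeps the argument at 4 bytes regardless of the version.
fixed = decrement_int32(width)
print('%s: %d bytes' % (type(fixed).__name__, fixed.dtype.itemsize))
```

If the second printed line reports 8 bytes, replacing width -= 1 with an explicit cast such as width = np.int32(width - 1) would be a reasonable experiment.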
What is your GPU? My guess is that the block size is too large! – ahmad
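The block-size guess can be checked with simple arithmetic. The sketch below assumes the commonly documented per-block thread limits of 512 (compute capability 1.x) and 1024 (2.0 and later); the function names are hypothetical. Note that the first launch in the question already succeeds with the same block, which suggests the block size alone may not explain the error.

```python
def threads_per_block(block):
    """Total number of threads in an (x, y, z) block tuple."""
    x, y, z = block
    return x * y * z


def block_fits(block, max_threads_per_block):
    """True if the block does not exceed the given per-block thread limit."""
    return threads_per_block(block) <= max_threads_per_block


block = (32, 32, 1)

# Assumed limits: 512 for compute capability 1.x, 1024 for 2.0 and later.
# On a real device the limit can be read with
# pycuda.driver.Device(0).get_attribute(
#     pycuda.driver.device_attribute.MAX_THREADS_PER_BLOCK)
for limit in (512, 1024):
    print('%d threads, limit %d, fits: %s'
          % (threads_per_block(block), limit, block_fits(block, limit)))
```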