Linear interpolation (lerp) with PyCUDA

I am a recreational Pythonista who is just getting into PyCUDA. I am trying to figure out how to implement linear interpolation (lerp) with PyCUDA. The CUDA Cg function is: http://http.developer.nvidia.com/Cg/lerp.html

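My ultimate goal is to do bilinear interpolation in PyCUDA starting from a set of weighted random points. I have never written a program in C or CUDA, and I am learning as I go.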

This is how far I have gotten:

import pycuda.autoinit 
import pycuda.driver as drv 
import pycuda.compiler as comp 

lerpFunction = """__global__ float lerp(float a, float b, float w) 
{ 
    return a + w*(b-a); 
}""" 

mod = comp.SourceModule(lerpFunction) # This returns an error telling me a global must return a void. :(

Any help would be greatly appreciated!

2012-01-05 Austinstig

What is '__global__'? Why do you think you need it? – 2012-01-05 23:36:27

@MarkRansom: This is CUDA, and it is necessary. '__global__' tells the NVIDIA compiler driver that the function is GPU code. – talonmies 2012-01-06 00:28:05

If you want to explore CUDA from Python further, give this a try: http://www.accelereyes.com/afpy.html – 2012-01-06 22:48:05

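The error message is pretty explicit: CUDA kernels cannot return values. They must be declared void, and anything the kernel modifies has to be passed in as a pointer. It would make more sense for your lerp implementation to be declared as a device function, like this: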

__device__ float lerp(float a, float b, float w) 
{ 
    return a + w*(b-a); 
}

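and then called from inside a kernel for each value that needs to be interpolated. Your lerp function is missing a lot of the "infrastructure" it would need to be a useful CUDA kernel.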
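To make the "infrastructure" concrete, here is a minimal sketch of how the device function could sit inside a module together with a kernel that calls it, and how that kernel might be launched from PyCUDA. The length argument `n`, the bounds check, the helper names `lerp_reference` and `lerp_gpu`, and the block size of 256 are my own assumptions for illustration, not part of the original code. The GPU part needs PyCUDA, NumPy and a CUDA device, so the imports are kept inside the function.

```python
# Sketch only: `n`, the bounds check, and the helper names below are
# illustrative assumptions, not taken from the original post.

KERNEL_SOURCE = """
__device__ float lerp(float a, float b, float w)
{
    return a + w * (b - a);
}

__global__ void lerp_kernel(const float *a, const float *b, float w,
                            float *y, int n)
{
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    if (tid < n)                      // skip threads past the end of the arrays
        y[tid] = lerp(a[tid], b[tid], w);
}
"""


def lerp_reference(a, b, w):
    """Pure-Python reference, handy for checking the GPU result."""
    return [x + w * (y - x) for x, y in zip(a, b)]


def lerp_gpu(a, b, w, block_size=256):
    """Run lerp_kernel on the GPU (requires PyCUDA, NumPy and a CUDA device)."""
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context on import)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    y = np.empty_like(a)
    n = a.size

    kernel = SourceModule(KERNEL_SOURCE).get_function("lerp_kernel")
    grid_size = (n + block_size - 1) // block_size  # enough blocks to cover n
    kernel(drv.In(a), drv.In(b), np.float32(w), drv.Out(y), np.int32(n),
           block=(block_size, 1, 1), grid=(grid_size, 1))
    return y
```

On a machine with a GPU, `lerp_gpu([0.0, 2.0], [1.0, 4.0], 0.5)` should agree with `lerp_reference` on the same inputs, up to float32 rounding.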

EDIT: A really basic kernel along the same lines might look something like this:

__global__ void lerp_kernel(const float *a, const float *b, const float w, float *y) 
{ 
    int tid = threadIdx.x + blockIdx.x*blockDim.x; // unique thread number in the grid 
    y[tid] = a[tid] + w*(b[tid]-a[tid]); 
}
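One thing worth noting about the index calculation: the kernel above has no bounds check, so it is only safe when the launch creates exactly as many threads as there are elements. The following plain-Python sketch emulates the `tid` arithmetic for a 1D launch to show what happens otherwise. The function names and the sizes (10 elements, 4 threads per block) are hypothetical and chosen only for illustration.

```python
def blocks_needed(n, block_dim):
    """Smallest number of blocks whose threads cover n elements."""
    return (n + block_dim - 1) // block_dim


def global_thread_ids(grid_dim, block_dim):
    """Emulate tid = threadIdx.x + blockIdx.x*blockDim.x for a 1D launch."""
    return [thread_idx + block_idx * block_dim
            for block_idx in range(grid_dim)
            for thread_idx in range(block_dim)]


n = 10          # number of array elements (hypothetical)
block_dim = 4   # threads per block (hypothetical)

grid_dim = blocks_needed(n, block_dim)
tids = global_thread_ids(grid_dim, block_dim)

# Threads whose index falls past the end of the arrays.
out_of_range = [t for t in tids if t >= n]
print(grid_dim, out_of_range)
```

Here three blocks of four threads are launched for ten elements, so threads 10 and 11 would read and write past the end of the arrays. The usual remedy is to pass the length to the kernel and wrap the body in `if (tid < n)`.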

2012-01-06 00:35:35 talonmies

So, something more along these lines? pycuda.elementwise.ElementwiseKernel("float a, float b, float w", "return a + b + w", "lerp") – Austinstig 2012-01-06 14:12:48

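No, nothing like that. Kernel functions cannot return values. This is not like old-school shading languages: memory access is done through pointers passed as function arguments. It sounds like you should read some documentation or work through one of the many introductory tutorials, which your search engine of choice will find for you if you care to look. – talonmies 2012-01-06 14:22:16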
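For reference, an ElementwiseKernel version that follows the advice above might look roughly like the sketch below: the arrays are passed as pointers, the result is written into an output array indexed by `i`, and nothing is returned. The output argument `y` and the wrapper function `lerp_elementwise` are my own assumptions, and running it requires PyCUDA, NumPy and a CUDA device.

```python
# Sketch only: the output array `y` and the wrapper below are illustrative
# assumptions, not code from the thread.

LERP_ARGUMENTS = "float *a, float *b, float w, float *y"
LERP_OPERATION = "y[i] = a[i] + w * (b[i] - a[i])"  # `i` is the element index


def lerp_elementwise(a, b, w):
    """Interpolate between arrays a and b on the GPU with ElementwiseKernel."""
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.elementwise import ElementwiseKernel

    kernel = ElementwiseKernel(LERP_ARGUMENTS, LERP_OPERATION, "lerp")

    a_gpu = gpuarray.to_gpu(np.asarray(a, dtype=np.float32))
    b_gpu = gpuarray.to_gpu(np.asarray(b, dtype=np.float32))
    y_gpu = gpuarray.empty_like(a_gpu)

    kernel(a_gpu, b_gpu, np.float32(w), y_gpu)  # writes into y_gpu in place
    return y_gpu.get()                          # copy the result back to host
```

ElementwiseKernel generates the thread indexing and bounds handling itself, which is why the operation string only has to describe what happens to element `i`.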
