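To run programs on the GPU with Thrust, you need to write them in terms of Thrust algorithms such as `reduce`, `transform`, `sort`, and so on. In this case the computation can be written in terms of `transform`, since the loop just computes a function F(fi[i], fj[i]) and stores the result in df[i]. Note that we must first move the input arrays to the device before calling `transform`, because Thrust requires the input and output arrays to live in the same place.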
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/copy.h>
#include <cstdio>

// Binary functor applied to each pair (fi[i], fj[i]).
struct my_functor
{
    __host__ __device__
    float operator()(float fi, float fj) const
    {
        float d = fi - fj;
        if (d < 0)
            d = 0;        // negative differences map to 0
        else
            d = d * d;    // otherwise square the difference
        if (d > 255)
            d = 255;      // clamp to 255
        return d;
    }
};

int main(void)
{
    size_t N = 5;

    // allocate storage on host
    thrust::host_vector<float> cpu_fi(N);
    thrust::host_vector<float> cpu_fj(N);
    thrust::host_vector<float> cpu_df(N);

    // initialize fi and fj arrays
    cpu_fi[0] = 2.0; cpu_fj[0] = 0.0;
    cpu_fi[1] = 0.0; cpu_fj[1] = 2.0;
    cpu_fi[2] = 3.0; cpu_fj[2] = 1.0;
    cpu_fi[3] = 4.0; cpu_fj[3] = 5.0;
    cpu_fi[4] = 8.0; cpu_fj[4] = -8.0;

    // copy fi and fj to device
    thrust::device_vector<float> gpu_fi = cpu_fi;
    thrust::device_vector<float> gpu_fj = cpu_fj;

    // allocate storage for df
    thrust::device_vector<float> gpu_df(N);

    // perform transformation
    thrust::transform(gpu_fi.begin(), gpu_fi.end(), // first input range
                      gpu_fj.begin(),               // second input range
                      gpu_df.begin(),               // output range
                      my_functor());                // functor to apply

    // copy results back to host
    thrust::copy(gpu_df.begin(), gpu_df.end(), cpu_df.begin());

    // print results on host
    for (size_t i = 0; i < N; i++)
        printf("f(%2.0lf,%2.0lf) = %3.0lf\n", cpu_fi[i], cpu_fj[i], cpu_df[i]);

    return 0;
}
For reference, here is the program's output:
f( 2, 0) =   4
f( 0, 2) =   0
f( 3, 1) =   4
f( 4, 5) =   0
f( 8,-8) = 255
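For comparison, the two-input `thrust::transform` call computes the same thing as a plain element-wise loop. The host-only C++ sketch below (no Thrust, no GPU) mirrors it. The names `clamp_sq_diff` and `apply_serial` are made up for this illustration; `clamp_sq_diff` stands in for `my_functor`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for my_functor: same arithmetic, no CUDA qualifiers.
inline float clamp_sq_diff(float fi, float fj)
{
    float d = fi - fj;
    d = (d < 0.0f) ? 0.0f : d * d;      // negative differences map to 0, others are squared
    return (d > 255.0f) ? 255.0f : d;   // clamp to 255
}

// Serial equivalent of the transform call: df[i] = F(fi[i], fj[i]).
inline std::vector<float> apply_serial(const std::vector<float>& fi,
                                       const std::vector<float>& fj)
{
    assert(fi.size() == fj.size());     // both inputs must have the same length
    std::vector<float> df(fi.size());
    for (std::size_t i = 0; i < fi.size(); ++i)
        df[i] = clamp_sq_diff(fi[i], fj[i]);
    return df;
}
```

Each output element depends only on the matching pair of inputs, which is what makes the loop a natural fit for `transform`.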
Thanks wnbell. Can it also be used for 2D vectors? I want something like xi[i][0] - xj[i][0]; would that become something like xi[0] - xj[0]? – Madhu 2011-06-15 04:35:12
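Thrust only provides 1D vector containers, so you have to decide how to flatten your 2D vectors into 1D vectors. Assuming you flatten all of the 2D vectors the same way, you can usually ignore the 2D nature of the data when calling algorithms like `transform`. – wnbell 2011-06-15 14:49:23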
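A minimal sketch of the flattening idea, assuming a row-major layout (element (i, j) of a rows x cols array is stored at index i * cols + j). It is host-only C++ and uses `std::transform` as a stand-in so that it compiles without CUDA; `thrust::transform` has an overload with the same argument order (first input range, second input begin, output begin, binary op). The helper names `flat_index`, `flatten`, and `diff_flat` are hypothetical and not part of Thrust.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Row-major layout (an assumption): element (i, j) lives at i * cols + j.
inline std::size_t flat_index(std::size_t i, std::size_t j, std::size_t cols)
{
    return i * cols + j;
}

// Flatten a rectangular 2D array into one contiguous 1D vector, row by row.
inline std::vector<float> flatten(const std::vector<std::vector<float>>& a)
{
    std::vector<float> flat;
    for (const auto& row : a)
        flat.insert(flat.end(), row.begin(), row.end());
    return flat;
}

// Element-wise xi - xj on the flattened data; the 2D shape plays no role here.
inline std::vector<float> diff_flat(const std::vector<float>& xi,
                                    const std::vector<float>& xj)
{
    assert(xi.size() == xj.size());     // both arrays must be flattened the same way
    std::vector<float> out(xi.size());
    std::transform(xi.begin(), xi.end(), xj.begin(), out.begin(),
                   std::minus<float>());
    return out;
}
```

With this layout, the difference xi[i][j] - xj[i][j] ends up at `flat_index(i, j, cols)` in the result. For the GPU version, the flattened buffers could be copied into `thrust::device_vector<float>` objects and passed to `thrust::transform` in the same way.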