Thrust operations leave the host array empty


I am trying to perform some Thrust operations, but I am not sure how to do it correctly.

Right now, the array I get back (h_a) is all zeros.

My code:

#include <cstdio> 
#include <cstdlib> 
#include <cmath> 
#include <iostream> 

#include <cuda.h> 
#include <cuda_runtime_api.h> 

#include <thrust/device_ptr.h> 
#include <thrust/fill.h> 
#include <thrust/transform.h> 
#include <thrust/functional.h> 
#include <thrust/device_vector.h> 
#include <thrust/host_vector.h> 
#include <thrust/copy.h> 
#include <thrust/generate.h> 


template <typename T> 
struct square 
{ 
    __host__ __device__ 
    T operator()(const T& x) const 
    { 
     return x * x; 
    } 

}; 


int 
main(
      int argc, 
    const char * argv[]) 
{ 
    const size_t NbOfPoints = 256; 

    int BlocksPerGridX = 16; 
    int BlocksPerGridY = 16; 

    int ThreadsPerBlockX = 16; 
    int ThreadsPerBlockY = 16; 

    // generate random data on the host 
    thrust::host_vector<float> h_Kx (NbOfPoints); 
    thrust::generate(h_Kx.begin(), h_Kx.end(), rand); 

    thrust::host_vector<float> h_Ky (NbOfPoints); 
    thrust::generate(h_Ky.begin(), h_Ky.end(), rand); 

    // transfer to device 
    thrust::device_vector<float> dev_Kx = h_Kx; 
    thrust::device_vector<float> dev_Ky = h_Ky; 

    // create arrays for holding the number of threads per block in each dimension 
    int * X , * Y; 
    cudaMalloc((void **) &X, ThreadsPerBlockX * BlocksPerGridX * sizeof(*X)); 
    cudaMalloc((void **) &Y, ThreadsPerBlockY * BlocksPerGridY * sizeof(*Y)); 

    // wrap raw pointer with a device_ptr 
    thrust::device_ptr<int> dev_X (X); 
    thrust::device_ptr<int> dev_Y (Y); 

    // use device_ptr in Thrust algorithms 
    thrust::fill(dev_X, dev_X + (ThreadsPerBlockX * BlocksPerGridX) , (int) 0); 
    thrust::fill(dev_Y, dev_Y + (ThreadsPerBlockY * BlocksPerGridY) , (int) 0); 

    // setup arguments 
    square<float> square_op; 

    // create various vectors 
    thrust::device_vector<int> distX (NbOfPoints); 
    thrust::device_vector<int> distY (NbOfPoints); 
    thrust::device_vector<unsigned int> Tmp (NbOfPoints); 
    thrust::host_vector<unsigned int> h_a (NbOfPoints); 
    thrust::device_vector<unsigned int> distXSquared (NbOfPoints); 
    thrust::device_vector<unsigned int> distYSquared (NbOfPoints); 


    // compute distX = dev_Kx - dev_X and distY = dev_Ky - dev_Y 
    thrust::transform(dev_Kx.begin(), dev_Kx.begin(), dev_X , distX.begin() , thrust::minus<float>()); 
    thrust::transform(dev_Ky.begin(), dev_Ky.begin(), dev_Y , distY.begin() , thrust::minus<float>()); 

    //square distances 
    thrust::transform(distX.begin(), distX.end(), distXSquared.begin(), square_op); 
    thrust::transform(distY.begin(), distY.end(), distYSquared.begin(), square_op); 

    // compute Tmp = distX + distY 
    thrust::transform(distXSquared.begin() ,distXSquared.begin() , distYSquared.begin() , Tmp.begin() , thrust::plus<unsigned int>()); 
    thrust::copy(Tmp.begin(), Tmp.end(), h_a.begin()); 


    for (int i = 0; i < 5; i ++) 
     printf("\n temp = %u",h_a[ i ]); 


return 0; 
}

UPDATE:

In addition to Robert Crovella's fixes, you also have to switch the functors to int:

square<int> square_op; 
thrust::transform(dev_Kx.begin(), dev_Kx.end(), dev_X , distX.begin() , thrust::minus<int>()); 
thrust::transform(dev_Ky.begin(), dev_Ky.end(), dev_Y , distY.begin() , thrust::minus<int>());

Source

2014-12-09 George

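What *exactly* are those "various errors"? – 2014-12-09 12:26:28

@Park Young-Bae: I have updated the question. – George 2014-12-09 12:29:37

How hard is it to post an example that someone can compile and run themselves? I despair that, even after asking 200 questions on [SO], you still have not understood how this place works. – talonmies 2014-12-09 14:38:05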

You have several instances of zero-length transforms:

thrust::transform(dev_Kx.begin(), dev_Kx.begin(), dev_X , distX.begin() , thrust::minus<float>()); 
thrust::transform(dev_Ky.begin(), dev_Ky.begin(), dev_Y , distY.begin() , thrust::minus<float>());

and:

thrust::transform(distXSquared.begin() ,distXSquared.begin() , distYSquared.begin() , Tmp.begin() , thrust::plus<unsigned int>());

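Since the first two arguments in each of the transforms above are the same iterator, the input range is empty and no work is done. Presumably you want the corresponding .end() iterator in the second position, rather than .begin().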
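To make the empty-range point concrete, here is a minimal host-only sketch using std::transform, which follows the same [first, last) convention as thrust::transform. The helper name subtract and its use_end flag are hypothetical and exist only for illustration; they are not part of the code in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: element-wise a - b, written into a zero-filled output.
// use_end == false reproduces the bug: the input range is [begin, begin), i.e. empty.
std::vector<int> subtract(const std::vector<int>& a,
                          const std::vector<int>& b,
                          bool use_end)
{
    assert(a.size() == b.size());
    std::vector<int> out(a.size(), 0);

    // The second argument marks the end of the first input range.
    auto last = use_end ? a.end() : a.begin();
    std::transform(a.begin(), last, b.begin(), out.begin(), std::minus<int>());
    return out;
}
```

With a = {5, 7, 9} and b = {1, 2, 3}, calling subtract(a, b, false) leaves the output at its initial zeros, while subtract(a, b, true) yields {4, 5, 6}. The same thing happens with the device vectors in the question: they are zero-initialized and never written to.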

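When I make these changes, I get non-zero values printed out. They are quite large, but you appear to be squaring large values, so I am not sure what your intent is.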
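As a rough way to reason about those magnitudes: rand() returns values up to RAND_MAX, which is implementation-defined (the C standard only guarantees at least 32767), so the squares may not fit in a 32-bit unsigned int. The sketch below checks this on the host with 64-bit arithmetic; the function name square_fits_in_uint32 is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical check: does x * x still fit in a 32-bit unsigned integer?
bool square_fits_in_uint32(std::uint32_t x)
{
    // Widen before multiplying so the product is exact.
    const std::uint64_t sq = static_cast<std::uint64_t>(x) * x;
    return sq <= std::numeric_limits<std::uint32_t>::max();
}
```

Any input above 65535 fails the check, so if the intent is to keep exact squared distances, scaling the random values down or using a wider result type may be worth considering.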

Source

2014-12-09 15:18:29

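Hello, I just wanted to ask: if 'dev_X' above were a vector, I would use 'dev_X.begin()'. So when we use 'begin()', does that mean it takes the whole vector, from beginning to end? Thanks! – George 2014-12-10 09:16:05

You may want to [read up on](http://msdn.microsoft.com/en-us/library/9xd04bzs.aspx) 'std::vector'. '.begin()' is a member function of the 'vector' class that returns an iterator "pointing" to the beginning of the vector (i.e. the first element). It does not mean that the whole vector will be used. If this is still unclear, I suggest you post a new question. – 2014-12-10 09:23:13

OK, but in the transform line above I want to do the subtraction 'dev_Kx - dev_X'. How can I make sure it also goes through all the elements of dev_X? The transform function takes 5 arguments. – George 2014-12-10 09:28:14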
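On the five-argument form: in both std::transform and thrust::transform, the first two iterators delimit the first input, and the second input and the output are each given only by a start iterator. They are read and written for exactly as many elements as the first range contains, so the second sequence just needs to be at least that long. A minimal host-only sketch, with hypothetical helper names difference and first_element:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: computes kx[i] - x[i] for every element of kx.
std::vector<int> difference(const std::vector<int>& kx, const std::vector<int>& x)
{
    // The second input is read for exactly kx.size() elements,
    // so it must hold at least that many.
    assert(x.size() >= kx.size());

    std::vector<int> out(kx.size());
    std::transform(kx.begin(), kx.end(), x.begin(), out.begin(), std::minus<int>());
    return out;
}

// begin() refers only to the first element; it carries no length information.
int first_element(const std::vector<int>& v)
{
    assert(!v.empty());
    return *v.begin();
}
```

In the question's code, dev_X covers ThreadsPerBlockX * BlocksPerGridX = 256 ints and dev_Kx has NbOfPoints = 256 elements, so passing dev_Kx.begin(), dev_Kx.end() as the first range walks through all of dev_X as well.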
