Thrust: why is the host code always executed despite __CUDA_ARCH__?

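I am trying to define two execution branches in my code: one for CUDA execution and another without it (with OMP in mind for the future). But when I use the macro __CUDA_ARCH__, it looks as if the host code is always executed. I thought Thrust uses CUDA (and the device code branch) by default. What is wrong with my code? Here it is: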

#include <thrust/transform.h>
#include <thrust/functional.h>
#include <thrust/iterator/counting_iterator.h>
#include <iostream>

struct my_op
{
    my_op(int init_const) : constanta(init_const) {}
    __host__ __device__ int operator()(const int &x) const
    {
#if defined(__CUDA_ARCH__)
        return 2 * x * constanta; // never executed - why?
#else
        return x * constanta;     // always executed
#endif
    }

private:
    int constanta;
};

int main()
{
    int data[7] = { 0, 0, 0, 0, 0, 0, 0 };
    thrust::counting_iterator<int> first(10);
    thrust::counting_iterator<int> last = first + 7;

    int init_value = 1;
    my_op op(init_value);

    thrust::transform(first, last, data, op);
    for (int el : data)   // standard range-based for instead of the MSVC-only "for each"
        std::cout << el << " ";

    std::cout << std::endl;
}

I expected the transform to multiply each element by 2 * constanta, but the host code is used: the output is "10 11 12 13 14 15 16" instead of the expected "20 22 24 26 28 30 32".

Why?

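Source

2016-10-06 Ali58

Thrust is choosing the host path because one of the data items supplied to the thrust transform operation is in host memory:

thrust::transform(first, last, data, op);
                               ^^^^

If you want a thrust algorithm to operate on the device, generally all of the container data you pass to or from it must also reside in device memory.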

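Here is a modification of your code to demonstrate that thrust will follow the device path if we substitute a device-resident container for data: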

$ cat t13.cu 
#include <thrust/transform.h>
#include <thrust/functional.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/device_vector.h>
#include <iostream>

struct my_op
{
    my_op(int init_const) : constanta(init_const) {}
    __host__ __device__ int operator()(const int &x) const
    {
#if defined(__CUDA_ARCH__)
        return 2 * x * constanta; // device path
#else
        return x * constanta;     // host path
#endif
    }

private:
    int constanta;
};

int main()
{
    // int data[7] = { 0, 0, 0, 0, 0, 0, 0 };
    thrust::counting_iterator<int> first(10);
    thrust::counting_iterator<int> last = first + 7;
    thrust::device_vector<int> d_data(7);   // output now lives in device memory

    int init_value = 1;
    my_op op(init_value);

    thrust::transform(first, last, d_data.begin(), op);
    for (int el = 0; el < 7; el++) {
        int dat = d_data[el];   // each read copies one element back to the host
        std::cout << dat << " ";
    }

    std::cout << std::endl;
}
$ nvcc -arch=sm_61 -o t13 t13.cu 
$ ./t13 
20 22 24 26 28 30 32 
$
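If you still need the results in a plain host array or vector afterwards, one possible approach is sketched below. It is only an illustration, not part of the code above: the helper names transform_on_host and transform_on_device are made up for this example. It states the execution path explicitly with the thrust::host and thrust::device policies from <thrust/execution_policy.h>, and it brings the device results back with thrust::copy.

```cuda
#include <cassert>
#include <vector>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/transform.h>

// Same functor as in the question: the result reveals which path was taken.
struct my_op
{
    my_op(int init_const) : constanta(init_const) {}
    __host__ __device__ int operator()(const int &x) const
    {
#if defined(__CUDA_ARCH__)
        return 2 * x * constanta;   // compiled for the device
#else
        return x * constanta;       // compiled for the host
#endif
    }

private:
    int constanta;
};

// Hypothetical helper: runs the transform on the host, output in host memory.
std::vector<int> transform_on_host(int n)
{
    assert(n >= 0);
    thrust::counting_iterator<int> first(10);
    std::vector<int> out(n);
    thrust::transform(thrust::host, first, first + n, out.begin(), my_op(1));
    return out;
}

// Hypothetical helper: runs the transform on the device, then copies the
// results from device memory into an ordinary host vector.
std::vector<int> transform_on_device(int n)
{
    assert(n >= 0);
    thrust::counting_iterator<int> first(10);
    thrust::device_vector<int> d_out(n);   // output must be device-resident
    thrust::transform(thrust::device, first, first + n, d_out.begin(), my_op(1));

    std::vector<int> out(n);
    thrust::copy(d_out.begin(), d_out.end(), out.begin());   // device -> host
    return out;
}
```

Called from main on a machine with a CUDA-capable GPU, transform_on_host(7) should give 10 11 12 13 14 15 16 and transform_on_device(7) should give 20 22 24 26 28 30 32. Note that an explicit thrust::device policy does not make a host pointer usable on the device, so the output range still has to live in device memory.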

You may want to read the thrust quick start guide to learn about thrust algorithm dispatch.
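To give a rough feel for what "dispatch" means here, the following is a much simplified model in plain host C++ (no GPU or Thrust needed). It is not Thrust's actual implementation, and every name in it (host_tag, device_tag, host_buffer, device_buffer, model_transform, my_op_model) is invented for this sketch. The only point is that the backend is chosen at compile time from the type of the output container, not from the functor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tags standing in for "lives in host memory" / "lives in device memory".
struct host_tag {};
struct device_tag {};

// Hypothetical containers: each one advertises which system it belongs to.
template <typename T>
struct host_buffer {
    using system = host_tag;
    std::vector<T> items;
};

template <typename T>
struct device_buffer {   // stand-in only: in this model the data is still on the host
    using system = device_tag;
    std::vector<T> items;
};

// Mirrors my_op: the flag plays the role of __CUDA_ARCH__ being defined.
struct my_op_model {
    int constanta;
    int operator()(int x, bool on_device) const {
        return on_device ? 2 * x * constanta : x * constanta;
    }
};

// One overload per backend; the returned string names the path that ran.
template <typename Out, typename Op>
std::string transform_impl(host_tag, int first, std::size_t n, Out &out, Op op) {
    for (std::size_t i = 0; i < n; ++i)
        out.items[i] = op(first + static_cast<int>(i), false);
    return "host";
}

template <typename Out, typename Op>
std::string transform_impl(device_tag, int first, std::size_t n, Out &out, Op op) {
    for (std::size_t i = 0; i < n; ++i)
        out.items[i] = op(first + static_cast<int>(i), true);
    return "device";
}

// Entry point: overload resolution on Out::system selects the backend.
template <typename Out, typename Op>
std::string model_transform(int first, std::size_t n, Out &out, Op op) {
    assert(out.items.size() >= n);   // the output must already have room for n results
    return transform_impl(typename Out::system{}, first, n, out, op);
}
```

In this model, passing a host_buffer<int> sized to 7 with first = 10 and my_op_model{1} returns "host" and fills 10 through 16, while the same call with a device_buffer<int> returns "device" and fills 20 through 32. That matches what happens in the question: a raw int array counts as host memory, so the host overload is the one selected.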

Source

2016-10-07 00:40:10
