Separating .cu and .cpp files (using the C++11 library)

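I want to convert a C++ program in which I use the random library, which is a C++11 feature. After reading a few similar posts, I tried splitting the code into three files. I should say up front that I am not very familiar with C/C++ and mostly use R at work.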

The main file looks like this.

#ifndef _KERNEL_SUPPORT_ 
#define _KERNEL_SUPPORT_ 
#include <complex> 
#include <random> 
#include <iostream> 
#include "my_code_header.h" 
using namespace std; 
std::default_random_engine generator; 
std::normal_distribution<double> distribution(0.0,1.0); 
const int rand_mat_length = 24561; 
double rand_mat[rand_mat_length];// = {0}; 
void create_std_norm(){ 
    for(int i = 0 ; i < rand_mat_length ; i++) 
    ::rand_mat[i] = distribution(generator); 
} 
. 
. 
. 
int main(void) 
{ 
    ... 
    ... 
    call_global(); 
    return 0; 
} 
#endif

The header file looks like this.

#ifndef mykernel_h 
#define mykernel_h 
void call_global(); 
void two_d_example(double *a, double *b, double *my_result, size_t length, size_t width); 
#endif

And the .cu file looks like this.

#ifndef _MY_KERNEL_ 
#define _MY_KERNEL_ 
#include <iostream> 
#include "my_code_header.h" 
#define TILE_WIDTH 8 
using namespace std; 
__global__ void two_d_example(double *a, double *b, double *my_result, size_t length, size_t width) 
{ 
    unsigned int row = blockIdx.y*blockDim.y + threadIdx.y; 
    unsigned int col = blockIdx.x*blockDim.x + threadIdx.x; 
    if ((row>=length) || (col>=width)) { 
        return; 
    } 
    ... 
} 
void call_global() 
{ 
    const size_t imageLength = 528; 
    const size_t imageWidth = 528; 
    const dim3 threadsPerBlock(TILE_WIDTH,TILE_WIDTH); 
    const dim3 numBlocks(((imageLength)/threadsPerBlock.x), ((imageWidth)/threadsPerBlock.y)); 
    double *d_a, *d_b, *mys ; 

    ... 
    cudaMalloc((void**)&d_a, sizeof(double) * imageLength); 
    cudaMalloc((void**)&d_b, sizeof(double) * imageWidth); 
    cudaMalloc((void**)&mys, sizeof(double) * imageLength * imageWidth); 

    two_d_example<<<numBlocks,threadsPerBlock>>>(d_a, d_b, mys, imageLength, imageWidth); 
    ... 
    cudaFree(d_a); 
    cudaFree(d_b); 


} 

#endif

Note: __global__ was removed from the .h file because the header is also compiled by g++, which gave me the following error.

In file included from my_code_main.cpp:12:0: 
my_code_header.h:5:1: error: ‘__global__’ does not name a type

When I compile the .cu file with nvcc, everything is fine and it produces my_code_kernel.o. But since my .cpp uses C++11, I tried compiling it with g++ and got the following error.

/tmp/ccR2rXzf.o: In function `main': 
my_code_main.cpp:(.text+0x1c4): undefined reference to `call_global()' 
collect2: ld returned 1 exit status

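I understand this may have nothing to do with CUDA and may just be a matter of including the header incorrectly in both places. Also, what is the correct way to compile and, most importantly, link my_code_kernel.o and my_code_main.o (hopefully)? Sorry if this question is too trivial!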

Source

2013-08-16 Sandipan Bhattacharyya

Could you add the actual compile commands you are using to the question? – talonmies

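It looks like you are not linking my_code_kernel.o. You used -c in your nvcc command (which makes it compile without linking, i.e. produce a .o file), and I am guessing you did not use -c in your g++ command, in which case you need to add my_code_kernel.o to the list of inputs alongside the .cpp file.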

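The separation you are trying to achieve is entirely possible; it just looks like you are not linking correctly. If you still have problems, please add the compile commands to your question.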

FYI: you do not need to declare two_d_example() in the header file, since it is only used in the .cu file (from call_global()).
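
To make the layout concrete, here is a rough single-file sketch of the split, with the file boundaries marked in comments. It is not the asker's actual code: make_std_norm and launch_count are hypothetical names, the call_global() body is a host-only stand-in for the real definition in the .cu file, and the output name and CUDA library path in the build commands are assumptions about a typical Linux install.

```cpp
// Sketch of the three-file layout from the question, collapsed into one file.
// Names not present in the question (make_std_norm, launch_count) are hypothetical.
#include <cstddef>
#include <random>
#include <vector>

// ---- my_code_header.h: plain C++ only, so g++ can parse it ----
// Only the host-side wrapper is declared; the __global__ kernel
// two_d_example stays private to the .cu file.
void call_global();

// ---- my_code_main.cpp: compiled by g++ with -std=c++11 ----
// Fills a buffer with standard-normal samples using the C++11 <random> library.
std::vector<double> make_std_norm(std::size_t n) {
    std::default_random_engine generator;
    std::normal_distribution<double> distribution(0.0, 1.0);
    std::vector<double> out(n);
    for (double &x : out) {
        x = distribution(generator);
    }
    return out;
}

// ---- my_code_kernel.cu: compiled by nvcc -c ----
// Host-only stand-in so this sketch links on its own. In the real project
// this definition lives in the .cu file and launches two_d_example<<<...>>>.
static int launch_count = 0;  // hypothetical counter, for illustration only
void call_global() {
    ++launch_count;
}

// Possible build steps for the real three-file project
// (output name and CUDA library path are assumptions):
//   nvcc -c my_code_kernel.cu -o my_code_kernel.o
//   g++ -std=c++11 -c my_code_main.cpp -o my_code_main.o
//   g++ my_code_main.o my_code_kernel.o -o my_code -L/usr/local/cuda/lib64 -lcudart
```

In the real project, main() in my_code_main.cpp would fill its random buffer and then call call_global(). Leaving my_code_kernel.o out of the final g++ command is what produces the "undefined reference to `call_global()'" error shown in the question.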

Source

2013-08-16 08:12:31 Tom

Thank you very much! It was a linking problem.

In that case, could you mark the question as answered? – Tom
