2016-05-22 52 views
0

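What is the idiom for "sizeof type T on my CUDA device"?

The sizes of types on a CUDA device could, in theory, differ from their sizes on the host platform. So what is the idiomatic way to express "sizeof(T) on my CUDA device" in code, other than rolling your own map from type IDs to values you happen to know?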

+2

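The CUDA runtime has been specifically designed so that host and device type sizes match. So no, they cannot theoretically differ in size. The only corner case is structure alignment, and even that is consistent. – talonmies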

+0

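@talonmies, can you confirm that CUDA changed the size of bool? If so, do you remember when, and in which version? As for the rest, I fully agree: a basic sizeof(T) check should fit most needs. –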

+0

@talonmies: But what about [LP64 vs LLP64 compilers](https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models)? – einpoklum

Answer

2

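What you are asking for is not needed on any currently supported CUDA platform. One of the reasons the CUDA toolchain is so tightly integrated with the host compiler and the host C++ runtime library is to guarantee that the sizes of fundamental types on the host and the device always match. No idiomatic translation of sizes is required: the result of sizeof is always the same for the host and the device. Note that the sizes of fundamental types can vary from platform to platform (Windows is an LLP64/IL32P64 platform; Linux and OS X are LP64/I32LP64 platforms), but the GPU has no effect on that.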
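Because sizeof gives the same value in host and device code, one expression can size a buffer on either side. Below is a minimal sketch of that idea, not code from the answer: `HOST_DEVICE` and `bytes_for` are hypothetical names introduced here for illustration. The `__CUDACC__` guard lets the snippet build with nvcc or with a plain C++ compiler.

```cpp
#include <cstddef>

// HOST_DEVICE is a hypothetical convenience macro (not part of CUDA).
// Under nvcc it marks a function as callable from host and device code;
// under a plain C++ compiler it expands to nothing.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: number of bytes occupied by `count` objects of type T.
// The same sizeof(T) is used whether the call is made on the host
// (for example to size an allocation) or inside a kernel.
template <typename T>
HOST_DEVICE constexpr std::size_t bytes_for(std::size_t count)
{
    return count * sizeof(T);
}

// sizeof(char) is 1 by definition, so this holds on every platform.
static_assert(bytes_for<char>(16) == 16, "sizeof(char) must be 1");
```

On the host, a call such as `bytes_for<float>(n)` could be passed as the size argument of an allocation, and a kernel could use the same expression to compute offsets, with no separate "device sizeof".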

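Note also that the GPU can impose alignment requirements on composite types, which may mean that the compiled size differs from what you expect. The conditions under which this applies are discussed in detail in the documentation.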
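To illustrate how an alignment requirement can change the size of a composite type, here is a small sketch using standard C++11 `alignas` (CUDA also provides an `__align__` specifier for this purpose). The struct names are hypothetical, and the asserted sizes assume a 4-byte float, as on the platforms discussed here.

```cpp
#include <cstddef>

// Three floats with no alignment request: 12 bytes, no padding needed.
struct Packed3 {
    float x, y, z;
};

// The same members with a 16-byte alignment request. sizeof must be a
// multiple of the alignment, so 4 bytes of tail padding are added.
struct alignas(16) Aligned3 {
    float x, y, z;
};

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(sizeof(Packed3) == 12, "three floats, no padding");
static_assert(alignof(Aligned3) == 16, "alignment request is honoured");
static_assert(sizeof(Aligned3) == 16, "size is padded up to the alignment");
```

Compile-time checks like these catch a layout surprise before any data is copied between host and device.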

As an example, consider the following simple code:

#include <cstdio>

// Prints the sizes of several fundamental types. The function is compiled
// for both host and device, so each side reports what it actually sees.
__device__ __host__ __noinline__ void printsizes(const char* title)
{
    printf("%s\n", title);
    // sizeof yields a size_t; cast to unsigned long and print with %lu.
    printf("sizeof(void*) = %lu\n", (unsigned long)sizeof(void*));
    printf("sizeof(char) = %lu\n", (unsigned long)sizeof(char));
    printf("sizeof(bool) = %lu\n", (unsigned long)sizeof(bool));
    printf("sizeof(short) = %lu\n", (unsigned long)sizeof(short));
    printf("sizeof(int) = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long) = %lu\n", (unsigned long)sizeof(long));
    printf("sizeof(long long) = %lu\n", (unsigned long)sizeof(long long));
}

__global__ void printkernel() 
{ 
    printsizes("On the device:"); 
} 

int main() 
{ 
    printsizes("On the host:"); 

    printkernel<<<1,1>>>(); 
    cudaDeviceSynchronize(); 
    cudaDeviceReset(); 

    return 0; 
} 

Compiling and running this on a 64-bit Linux platform yields:

$ nvcc -arch=sm_52 -m64 -o sizeof64 sizeof.cu 
$ ./sizeof64 
On the host: 
sizeof(void*) = 8 
sizeof(char) = 1 
sizeof(bool) = 1 
sizeof(short) = 2 
sizeof(int) = 4 
sizeof(long) = 8 
sizeof(long long) = 8 
On the device: 
sizeof(void*) = 8 
sizeof(char) = 1 
sizeof(bool) = 1 
sizeof(short) = 2 
sizeof(int) = 4 
sizeof(long) = 8 
sizeof(long long) = 8 

Built on a 64-bit Windows platform, it yields:

>nvcc -arch=sm_21 -m64 sizes.cu 
sizes.cu 
    Creating library a.lib and object a.exp 
>a.exe 
On the host: 
sizeof(void*) = 8 
sizeof(char) = 1 
sizeof(bool) = 1 
sizeof(short) = 2 
sizeof(int) = 4 
sizeof(long) = 4 
sizeof(long long) = 8 
On the device: 
sizeof(void*) = 8 
sizeof(char) = 1 
sizeof(bool) = 1 
sizeof(short) = 2 
sizeof(int) = 4 
sizeof(long) = 4 
sizeof(long long) = 8 

Built on a 32-bit Windows platform, it yields:

>nvcc -arch=sm_21 -m32 sizes.cu 
sizes.cu 
    Creating library a.lib and object a.exp 

C:\Users\david\Documents>a.exe 
On the host: 
sizeof(void*) = 4 
sizeof(char) = 1 
sizeof(bool) = 1 
sizeof(short) = 2 
sizeof(int) = 4 
sizeof(long) = 4 
sizeof(long long) = 8 
On the device: 
sizeof(void*) = 4 
sizeof(char) = 1 
sizeof(bool) = 1 
sizeof(short) = 2 
sizeof(int) = 4 
sizeof(long) = 4 
sizeof(long long) = 8 

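Note that the sizes of void * and long can vary from platform to platform. In every case, however, the GPU sizes match the host sizes. This is a fundamental design principle of the CUDA driver and the GPU runtime.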
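If the same data layout must hold across LP64 and LLP64 builds, one common option (a suggestion added here, not something the answer prescribes) is to use the fixed-width integer types from `<cstdint>` instead of `long`. A minimal sketch with a hypothetical `Record` struct:

```cpp
#include <cstdint>

// Hypothetical record shared between host and device code. Fixed-width
// integer types keep the layout the same on LP64 (Linux, OS X) and
// LLP64 (Windows) builds, where a plain `long` is 8 or 4 bytes.
struct Record {
    std::int64_t id;     // 8 bytes on every platform that provides it
    std::int32_t count;  // 4 bytes
    std::int32_t flags;  // 4 bytes
};

static_assert(sizeof(std::int64_t) == 8, "sketch assumes 8-bit bytes");
static_assert(sizeof(std::int32_t) == 4, "sketch assumes 8-bit bytes");
static_assert(sizeof(Record) == 16, "8 + 4 + 4 bytes, no padding");
```

This addresses differences between platforms only; on any single platform, host and device already agree on sizeof.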
