Meaning of the xsize and ysize fields in the cudaPitchedPtr output by cudaMalloc3D

The documentation of cudaMalloc3D says:

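The returned cudaPitchedPtr contains additional fields xsize and ysize, the logical width and height of the allocation, which are equivalent to the width and height extent parameters provided by the programmer during allocation.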

However, if I run the following minimal example

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define Nrows 64
#define Ncols 64
#define Nslices 16

/********************/
/* CUDA ERROR CHECK */
/********************/
// --- Credit to http://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) { exit(code); }
    }
}

// --- A macro, so that __FILE__ and __LINE__ expand at the call site
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

/********/
/* MAIN */
/********/
int main() {

    // --- 3D pitched allocation; the extent width is given in bytes
    cudaExtent extent = make_cudaExtent(Ncols * sizeof(float), Nrows, Nslices);
    cudaPitchedPtr devPitchedPtr;
    gpuErrchk(cudaMalloc3D(&devPitchedPtr, extent));

    // --- The fields are size_t, hence %zu; the second value printed is devPitchedPtr.pitch
    printf("xsize = %zu; xsize in bytes = %zu; ysize = %zu\n", devPitchedPtr.xsize, devPitchedPtr.pitch, devPitchedPtr.ysize);

    gpuErrchk(cudaFree(devPitchedPtr.ptr));

    return 0;
}

I get:

xsize = 256; xsize in bytes = 512; ysize = 64

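So ysize is indeed equal to Nrows, but xsize differs from both Ncols and xsize in bytes/sizeof(float).

Could you help me understand the meaning of the xsize and ysize fields of the cudaPitchedPtr returned by cudaMalloc3D?

Thank you very much for your help.

My system: Windows 10, CUDA 8.0, GT 920M, cc 3.5.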


2017-05-08 JackOLantern

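xsize is the width you requested, in bytes. pitch is the actual pitched width in bytes. ysize is the number of rows you requested. – talonmies

Note the "Allocates *at least* width * height * depth bytes of linear memory" and "The function *may pad* the allocation..." in the documentation. – Shadow

@talonmies Thank you very much for your prompt comment. – JackOLantern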

xsize = Ncols * sizeof(float)

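xsize is the logical width of the allocation (in bytes), not the pitched width.

logical width = 256 bytes

pitched width = 512 bytes

xsize is equal to the width parameter you provided during allocation (i.e. the first argument you passed to make_cudaExtent).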
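The relationship between the three fields can be checked with plain host-side arithmetic. The following is only a sketch: PitchedLayout and the helper functions are hypothetical names introduced here, not part of the CUDA API, and the values are the ones printed in the question. The offset formula assumes the usual pitched layout, in which consecutive rows are pitch bytes apart and consecutive slices are pitch * ysize bytes apart.

```cpp
#include <cstddef>

// Hypothetical stand-in for the size fields of cudaPitchedPtr, so that the
// arithmetic can be checked on the host without a GPU.
struct PitchedLayout {
    std::size_t pitch;  // padded row width in bytes (chosen by the runtime)
    std::size_t xsize;  // logical row width in bytes (as requested)
    std::size_t ysize;  // number of rows per slice (as requested)
};

// Number of elements in one logical row.
constexpr std::size_t elementsPerRow(const PitchedLayout& p, std::size_t elemSize) {
    return p.xsize / elemSize;
}

// Padding bytes the runtime appended to each row.
constexpr std::size_t paddingPerRow(const PitchedLayout& p) {
    return p.pitch - p.xsize;
}

// Byte offset of element (x, y, z) from the base pointer: rows are stepped
// by pitch, slices by pitch * ysize, and xsize plays no role in addressing.
constexpr std::size_t byteOffset(const PitchedLayout& p, std::size_t elemSize,
                                 std::size_t x, std::size_t y, std::size_t z) {
    return z * p.pitch * p.ysize + y * p.pitch + x * elemSize;
}

// Values reported in the question: pitch = 512, xsize = 256, ysize = 64.
constexpr PitchedLayout layout{512, 256, 64};

// Assumes a 4-byte float, as on the platform in the question.
static_assert(elementsPerRow(layout, sizeof(float)) == 64, "xsize / sizeof(float) == Ncols");
static_assert(paddingPerRow(layout) == 256, "256 padding bytes per row");
static_assert(byteOffset(layout, sizeof(float), 1, 1, 0) == 516, "one row down, one element in");
```

So xsize / sizeof(float) recovers Ncols, while pitch is the value to use when stepping from one row or slice to the next.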


2017-05-08 19:20:31

Thank you, Robert, for your prompt answer. It is now clear to me that 'xsize' is the number of columns "measured" in 'bytes'. – JackOLantern
