CUDA: invalid device pointer error when reallocating memory

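In the code below, I simply call the function foo twice from main. The function just allocates device memory and then increments the pointer. It then exits and returns to main.

The first time foo is called, the memory is allocated correctly. But as you can see in the output, when I call foo again, the CUDA memory allocation fails with the error invalid device pointer.

I tried calling cudaThreadSynchronize() between the two calls to foo, but it did not help. Why does the memory allocation fail?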

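Actually, the error is caused by

matrixd += 3;

because if I do not do this increment, the error disappears.
But why does it happen even though I call cudaFree()?

Please help me understand this.

My output is here: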

Calling foo for the first time 
Allocation of matrixd passed: 
I came back to main safely :-) 
I am going back to foo again :-) 
Allocation of matrixd failed, the reason is: invalid device pointer

My main() is here:

#include<stdio.h> 
#include <cstdlib> // malloc(), free() 
#include <iostream> // cout, stream 
#include <math.h> 
#include <ctime> // time(), clock() 
#include <bitset> 
bool foo(); 

/*************************************** 
Main method. 

****************************************/ 
int main() 
{ 

    // Perform one warm-up pass and validate 
    std::cout << "Calling foo for the first time"<<std::endl; 
    foo(); 
    std::cout << "I came back to main safely :-) "<<std::endl; 
    std::cout << "I am going back to foo again :-) "<<std::endl; 
    foo();  
    getchar(); 
    return 0; 
}

The definition of foo() is in this file:

#include <cuda.h> 
#include <cuda_runtime_api.h> 
#include <device_launch_parameters.h> 
#include <iostream> 

bool foo() 
{ 
    // Error return value 
    cudaError_t status; 
    // Number of bytes in the matrix. 
    int bytes = 9 *sizeof(float); 
     // Pointers to the device arrays 
    float *matrixd=NULL; 

    // Allocate memory on the device to store matrix 
    cudaMalloc((void**) &matrixd, bytes); 
    status = cudaGetLastError();    //To check the error 
    if (status != cudaSuccess) {      
     std::cout << "Allocation of matrixd failed, the reason is: " << cudaGetErrorString(status) << 
     std::endl; 
     cudaFree(matrixd);      //Free call for memory 
     return false; 
    } 

    std::cout << "Allocation of matrixd passed: "<<std::endl; 


    ////// Increment address 
    for (int i=0; i<3; i++){ 
     matrixd += 3; 
    } 

     // Free device memory 
    cudaFree(matrixd);  

    return true; 
}

Update

I added better error checking. I also increment the device pointer only once now. This time I get the following output:

Calling foo for the first time 
Allocation of matrixd passed: 
Increamented the pointer and going to free cuda memory: 
GPUassert: invalid device pointer C:/Users/user/Desktop/Gauss/Gauss/GaussianEliminationGPU.cu 44

Line 44 is the cudaFree() call. Why does it still fail?

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); } 
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true) 
{ 
    if (code != cudaSuccess) 
    { 
     fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line); 
     if (abort) exit(code); 
    } 
} 

// GPU function for the direct Gauss-Jordan method. 

bool foo() 
{ 

    // Error return value 
    cudaError_t status; 
    // Number of bytes in the matrix. 
    int bytes = 9 *sizeof(float); 
     // Pointers to the device arrays 
    float *matrixd=NULL; 

    // Allocate memory on the device to store each matrix 
    gpuErrchk(cudaMalloc((void**) &matrixd, bytes)); 
    //cudaMemset(outputMatrixd, 0, bytes); 

    std::cout << "Allocation of matrixd passed: "<<std::endl; 


    ////// Incerament address 

     matrixd += 1; 

     std::cout << "Increamented the pointer and going to free cuda memory: "<<std::endl; 

     // Free device memory 
    gpuErrchk(cudaFree(matrixd));  

    return true; 
}

Source

2016-10-03 user3891236

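What happens if you check the return status of the cudaFree call? – talonmies

@talonmies You are right. I just checked with cudaGetLastError() right after cudaFree, and yes, it shows that the call is failing. But again, why? – user3891236

Exactly. So your problem is basically caused by incomplete error checking. You can see how to do it properly [here](http://stackoverflow.com/q/14038589/681865). The memory allocation is not failing. – talonmies

The real problem is in this code: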

for (int i=0; i<3; i++){ 
    matrixd += 3; 
} 

// Free device memory 
cudaFree(matrixd);

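You never allocated matrixd+9, so passing it to cudaFree is illegal and produces an invalid device pointer error. That error then surfaces the next time you perform error checking, which is after the subsequent call to cudaMalloc. If you read the documentation for any of these API calls, you will notice a warning that they can return errors from previous GPU operations. That is what is happening here.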
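A minimal sketch of the sequence described above, assuming a single host thread and a working CUDA device. The helper name lastErrorAfterBadFree is hypothetical and not part of the original code. It checks each call's return value directly, which shows that the failure belongs to cudaFree and that cudaGetLastError() only reports it later. The exact error name printed may differ between CUDA versions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): frees a pointer that
// cudaMalloc never returned, then reports what cudaGetLastError() holds
// after a later allocation.
cudaError_t lastErrorAfterBadFree()
{
    float *base = NULL;

    // Allocate room for a 3x3 float matrix and check the return value directly.
    cudaError_t err = cudaMalloc((void**)&base, 9 * sizeof(float));
    printf("first cudaMalloc:   %s\n", cudaGetErrorString(err));
    if (err != cudaSuccess) return err;

    // base + 9 was never returned by cudaMalloc, so this call is invalid.
    err = cudaFree(base + 9);
    printf("cudaFree(base + 9): %s\n", cudaGetErrorString(err));

    // The second allocation is checked on its own return value.
    float *second = NULL;
    err = cudaMalloc((void**)&second, 9 * sizeof(float));
    printf("second cudaMalloc:  %s\n", cudaGetErrorString(err));

    // cudaGetLastError() returns the most recent recorded error and resets it.
    cudaError_t last = cudaGetLastError();

    // Clean up using the addresses cudaMalloc actually returned.
    cudaFree(second);
    cudaFree(base);
    return last;
}

int main()
{
    cudaError_t last = lastErrorAfterBadFree();
    printf("cudaGetLastError:   %s\n", cudaGetErrorString(last));
    return 0;
}
```

Reading the status returned by each call, rather than calling cudaGetLastError() only after cudaMalloc, attributes the error to the call that caused it.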

Error checking in the CUDA runtime API can be done precisely. There is a robust, ready-made recipe for how to do it here. I recommend you use it.

Source

2016-10-03 05:56:28 talonmies

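Your way of checking errors is very neat. Please see my update. I think my mistake is that I am trying to increment the device pointer inside a host function. I guess that is not allowed, and cudaFree is not happy about it. In fact, matrix++ in a host function would point to some garbage on the host, not in device memory. – user3891236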

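@user3891236: I told you exactly what the problem is. You cannot free an address you did not allocate. "Incrementing" the pointer is perfectly fine (although completely pointless in this case). But asking the API to free the incremented pointer is illegal, because the API never allocated memory at that pointer value. – talonmies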
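One way to apply this, sketched under the assumption that the pointer arithmetic is meant to address a row of the 3x3 matrix: leave the pointer returned by cudaMalloc untouched and do the arithmetic on a copy. The names fooFixed and row1 are hypothetical, and the cudaMemset call is only a stand-in for whatever work would use the offset pointer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical rewrite of foo(): returns true when every runtime call succeeds.
bool fooFixed()
{
    const size_t bytes = 9 * sizeof(float);
    float *matrixd = NULL;   // base address, kept unchanged for cudaFree

    cudaError_t status = cudaMalloc((void**)&matrixd, bytes);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(status));
        return false;
    }

    // Pointer arithmetic on a copy: row1 addresses the second row of the matrix.
    float *row1 = matrixd + 3;

    // Stand-in for real work: zero the three floats of that row on the device.
    status = cudaMemset(row1, 0, 3 * sizeof(float));
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaMemset failed: %s\n", cudaGetErrorString(status));
        cudaFree(matrixd);
        return false;
    }

    // Free the exact address that cudaMalloc returned.
    status = cudaFree(matrixd);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaFree failed: %s\n", cudaGetErrorString(status));
        return false;
    }
    return true;
}

int main()
{
    // Call twice, as in the original main().
    bool first = fooFixed();
    bool second = fooFixed();
    printf("first call: %s, second call: %s\n",
           first ? "ok" : "failed", second ? "ok" : "failed");
    return (first && second) ? 0 : 1;
}
```

Keeping the base pointer separate from the working pointer means the arithmetic can change freely without affecting what is handed to cudaFree.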

Thank you very much for clearing up my doubts. I learned a lot today, including the importance of checking CUDA errors. – user3891236
