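CUDA: invalid device pointer error when reallocating memory
In the code below, I simply call the function foo twice from main. The function only performs a device memory allocation and then increments that pointer. It then exits and returns to main.
The first time foo is called, the memory is allocated correctly. However, as you can see in the output, when I call foo again the CUDA memory allocation fails with the error "invalid device pointer".
I tried calling cudaThreadSynchronize() between the two calls to foo, but it did not help. Why does the memory allocation fail?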
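Actually, the error is caused by
matrixd += 3;
because if I do not do this increment, the error disappears.
But why does it happen even though I call cudaFree()?
Please help me understand this.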
My output is here:
Calling foo for the first time
Allocation of matrixd passed:
I came back to main safely :-)
I am going back to foo again :-)
Allocation of matrixd failed, the reason is: invalid device pointer
My main() is here:
#include<stdio.h>
#include <cstdlib> // malloc(), free()
#include <iostream> // cout, stream
#include <math.h>
#include <ctime> // time(), clock()
#include <bitset>
bool foo();
/***************************************
Main method.
****************************************/
int main()
{
// Perform one warm-up pass and validate
std::cout << "Calling foo for the first time"<<std::endl;
foo();
std::cout << "I came back to main safely :-) "<<std::endl;
std::cout << "I am going back to foo again :-) "<<std::endl;
foo();
getchar();
return 0;
}
The definition of foo() is in this file:
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <device_launch_parameters.h>
#include <iostream>
bool foo()
{
// Error return value
cudaError_t status;
// Number of bytes in the matrix.
int bytes = 9 *sizeof(float);
// Pointers to the device arrays
float *matrixd=NULL;
// Allocate memory on the device to store matrix
cudaMalloc((void**) &matrixd, bytes);
status = cudaGetLastError(); //To check the error
if (status != cudaSuccess) {
std::cout << "Allocation of matrixd failed, the reason is: " << cudaGetErrorString(status) <<
std::endl;
cudaFree(matrixd); //Free call for memory
return false;
}
std::cout << "Allocation of matrixd passed: "<<std::endl;
////// Increment address
for (int i=0; i<3; i++){
matrixd += 3;
}
// Free device memory
cudaFree(matrixd);
return true;
}
Update
I added better error checking. Also, I now increment the device pointer only once. This time I get the following output:
Calling foo for the first time
Allocation of matrixd passed:
Increamented the pointer and going to free cuda memory:
GPUassert: invalid device pointer C:/Users/user/Desktop/Gauss/Gauss/GaussianEliminationGPU.cu 44
Line 44 is the cudaFree() call. Why does it still fail?
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
if (code != cudaSuccess)
{
fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
if (abort) exit(code);
}
}
// GPU function for the direct Gauss-Jordan method.
bool foo()
{
// Error return value
cudaError_t status;
// Number of bytes in the matrix.
int bytes = 9 *sizeof(float);
// Pointers to the device arrays
float *matrixd=NULL;
// Allocate memory on the device to store each matrix
gpuErrchk(cudaMalloc((void**) &matrixd, bytes));
//cudaMemset(outputMatrixd, 0, bytes);
std::cout << "Allocation of matrixd passed: "<<std::endl;
// Increment address
matrixd += 1;
std::cout << "Increamented the pointer and going to free cuda memory: "<<std::endl;
// Free device memory
gpuErrchk(cudaFree(matrixd));
return true;
}
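The failing cudaFree() is consistent with how the runtime API treats pointers: cudaFree() expects the exact address that cudaMalloc() returned, and after the increment matrixd no longer holds that address. A minimal sketch of one way around this, assuming a separate working pointer is acceptable, keeps the base pointer untouched and does the arithmetic on a copy. The names fooFixed and cursor are hypothetical and not part of the original code; the cudaMemset() call is only there to give the working pointer something to do.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

// Same style of checking macro as in the update above.
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort = true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}

// Hypothetical variant of foo(): the pointer returned by cudaMalloc()
// is never modified, so it can be handed back to cudaFree() unchanged.
bool fooFixed()
{
    const size_t rowBytes = 3 * sizeof(float);
    const size_t bytes = 3 * rowBytes;   // 9 floats, as in the question

    float *matrixd = NULL;
    gpuErrchk(cudaMalloc((void**) &matrixd, bytes));

    // Walk the allocation row by row with a working copy of the pointer.
    float *cursor = matrixd;
    for (int row = 0; row < 3; row++)
    {
        gpuErrchk(cudaMemset(cursor, 0, rowBytes)); // offset pointers are fine here
        cursor += 3;                                // only the copy moves
    }

    // Free with the original base address, not with cursor.
    gpuErrchk(cudaFree(matrixd));
    return true;
}

int main()
{
    fooFixed();
    fooFixed(); // the second call no longer inherits a failed cudaFree()
    printf("Both calls to fooFixed() completed\n");
    return 0;
}
```

An offset pointer can still be passed to calls such as cudaMemset() or cudaMemcpy() as long as it stays inside the allocation; only the free has to use the base address.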
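What happens if you check the return status of the `cudaFree` call? – talonmies
@talonmies You are right. I just checked by calling cudaGetLastError() right after cudaFree, and yes, it shows that cudaFree is failing. But why? – user3891236
Exactly. So your problem is basically caused by incomplete error checking. You can see how to do it properly [here](http://stackoverflow.com/q/14038589/681865). The memory allocation is not failing. – talonmies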
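The point made in these comments can be reproduced in isolation. The sketch below relies on the documented behavior of cudaGetLastError(): it returns the most recent error produced by any earlier runtime call in the same host thread and then resets it. A failed cudaFree() that goes unchecked is therefore reported by the next cudaGetLastError(), even when the cudaMalloc() just before it succeeded. The variable names shifted and second are hypothetical, and the exact error string may differ between CUDA versions.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    float *matrixd = NULL;

    // First allocation: inspect the value returned by the call itself.
    cudaError_t rc = cudaMalloc((void**) &matrixd, 9 * sizeof(float));
    printf("first cudaMalloc returned:     %s\n", cudaGetErrorString(rc));
    if (rc != cudaSuccess) return 1;

    // Free with an offset address, as the original foo() does.
    float *shifted = matrixd + 3;
    rc = cudaFree(shifted);
    printf("cudaFree(shifted) returned:    %s\n", cudaGetErrorString(rc));

    // Deliberately no cudaGetLastError() here, mirroring the original code,
    // so the error recorded by cudaFree() is left pending.

    float *second = NULL;
    rc = cudaMalloc((void**) &second, 9 * sizeof(float));
    printf("second cudaMalloc returned:    %s\n", cudaGetErrorString(rc));

    // This reports the pending error from cudaFree(), not a cudaMalloc() failure.
    printf("cudaGetLastError() afterwards: %s\n",
           cudaGetErrorString(cudaGetLastError()));

    // Clean up using the addresses that cudaMalloc() actually returned.
    cudaFree(second);
    cudaFree(matrixd);
    return 0;
}
```

On a setup like the one in the question, the second cudaMalloc() would be expected to return success while cudaGetLastError() reports the error left behind by cudaFree(). Checking the value returned by each call, rather than calling cudaGetLastError() afterwards, ties every error to the call that produced it.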