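I ran into a problem when running my program with the CUDA Memory Checker. In other Stack Overflow threads, the main problem with using malloc inside a kernel was that "compute_50,sm_50" was not set correctly. The code here compiles, so that is not the issue. See: CUDA - malloc inside a kernel (compute_50,sm_50)
The problem is now solved, but I don't understand why the new code fixed it. My question is: why does it work now?
Old code: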
    __device__ unsigned int draw_active_levels(curandState * localState, const int num_levels_max){
        unsigned int return_value = 0;
        float draw;
        draw = curand_uniform(localState);
        int num_active_levels = floorf(draw * (num_levels_max - 1)) + 1;
        double * arrLevelWeights = (double*) malloc((num_levels_max+1) * sizeof(double));
        arrLevelWeights[num_levels_max] = 0.0; //<--------Error on this line
        double level_weights = 1.0/num_levels_max;
        for(int i=0; i<num_levels_max; i++){
            arrLevelWeights[i] = level_weights;
        }
        //...
        //do some operations using arrLevelWeights
        //..
        free(arrLevelWeights);
        return return_value;
    }
Error with the old code:
Memory Checker detected 2 access violations.
error = access violation on store (global memory)
gridid = 198
blockIdx = {1,0,0}
threadIdx = {29,0,0}
address = 0x00000020
accessSize = 8
New code: I only added a few lines to check whether malloc returns a null pointer.
    __device__ unsigned int draw_active_levels(curandState * localState, const int num_levels_max){
        unsigned int return_value = 0;
        float draw;
        draw = curand_uniform(localState);
        int num_active_levels = floorf(draw * (num_levels_max - 1)) + 1;
        double * arrLevelWeights;
        arrLevelWeights = (double*) malloc((num_levels_max+1) * sizeof(double));
        if(arrLevelWeights == NULL){
            printf("Error while dynamically allocating memory on device.\n"); //<--- this line is never called (I put a breakpoint on it)
        }
        arrLevelWeights[num_levels_max] = 0.0; //<-------Error disappeared!
        double level_weights = 1.0/num_levels_max;
        for(int i=0; i<num_levels_max; i++){
            arrLevelWeights[i] = level_weights;
        }
        //...
        //do some operations using arrLevelWeights
        //..
        free(arrLevelWeights);
        return return_value;
    }
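Note that the check in the new code only prints a message; if malloc did return NULL, execution would still fall through to the store on the next line. The reported address 0x00000020 with an 8-byte access is also what a store to index 4 of a null double* would look like, so a failed allocation in some threads seems plausible, although that is an inference and not something the post confirms. Below is a minimal sketch of a check that actually stops on failure. The names fill_level_weights, weights_kernel and count_failed_allocations are hypothetical and are not part of the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: allocates and fills a per-thread weight table on the
// device heap. Returns false if the in-kernel malloc fails, without ever
// writing through the returned pointer.
__device__ bool fill_level_weights(int num_levels_max) {
    double *weights = (double *)malloc((num_levels_max + 1) * sizeof(double));
    if (weights == NULL) {
        printf("block %d thread %d: device malloc failed\n", blockIdx.x, threadIdx.x);
        return false;  // bail out before any store through the null pointer
    }
    weights[num_levels_max] = 0.0;
    double level_weight = 1.0 / num_levels_max;
    for (int i = 0; i < num_levels_max; i++) {
        weights[i] = level_weight;
    }
    free(weights);  // every device-side malloc needs a matching free
    return true;
}

// Each thread tries one allocation and counts failures in fail_count.
__global__ void weights_kernel(int num_levels_max, int *fail_count) {
    if (!fill_level_weights(num_levels_max)) {
        atomicAdd(fail_count, 1);
    }
}

// Hypothetical host wrapper: launches the kernel and returns the number of
// threads whose allocation failed, or -1 if a CUDA call reported an error.
int count_failed_allocations(int num_levels_max, int blocks, int threads) {
    int *d_fail = NULL;
    if (cudaMalloc(&d_fail, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMemset(d_fail, 0, sizeof(int)) != cudaSuccess) {
        cudaFree(d_fail);
        return -1;
    }
    weights_kernel<<<blocks, threads>>>(num_levels_max, d_fail);
    int h_fail = -1;
    if (cudaDeviceSynchronize() == cudaSuccess) {
        if (cudaMemcpy(&h_fail, d_fail, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) {
            h_fail = -1;
        }
    }
    cudaFree(d_fail);
    return h_fail;
}
```

Calling count_failed_allocations(4, 2, 32) from host code on a device with a free heap should report zero failures; a non-zero count would point to heap exhaustion rather than a compile-flag problem.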
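You may be allocating too much memory. The default size of the device heap is 8 MB. –
You're right, it was a memory space issue: I had missed a free() in a completely unrelated part of the code. Would you like to post that as an answer so I can accept it? – RemiDav
Clearly you made other changes to the code. If you add a line of code and it is never called, it obviously isn't the fix. Your question is confusing. I'm puzzled how an answer could actually answer this question, specifically how adding lines of code that are never called could "solve" the problem. –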
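For anyone hitting the same heap limit mentioned in the comments, here is a minimal sketch of how the device malloc heap size can be queried and raised from the host with cudaDeviceGetLimit and cudaDeviceSetLimit. The wrapper names device_heap_bytes and set_device_heap_bytes are hypothetical, and whether a larger heap is appropriate depends on the application; a leaked allocation, as in this case, is better fixed by adding the missing free().

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical wrapper: returns the current size in bytes of the heap used by
// in-kernel malloc/free, or 0 if the query fails.
size_t device_heap_bytes() {
    size_t bytes = 0;
    if (cudaDeviceGetLimit(&bytes, cudaLimitMallocHeapSize) != cudaSuccess) {
        return 0;
    }
    return bytes;
}

// Hypothetical wrapper: requests a new heap size. This has to happen before
// the first launch of any kernel that uses device-side malloc, since the heap
// cannot be resized afterwards.
bool set_device_heap_bytes(size_t bytes) {
    return cudaDeviceSetLimit(cudaLimitMallocHeapSize, bytes) == cudaSuccess;
}
```

A typical use would be to call set_device_heap_bytes with the desired size at program start, then confirm the value with device_heap_bytes() before launching kernels.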