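Generating random numbers between 0 and 1 in a CUDA kernel
I have a CUDA application in which I want to generate random numbers between 0 and 1. I wrote a dummy program in which a kernel fills an 8x256 matrix with the generated random numbers. My real matrix will be 8xBIG_NUMBER. I am probably missing something in the code, though, because I cannot produce the desired result. I am posting my code below.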
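#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// The kernels are defined after main(), so they are declared here first.
__global__ void setup_kernel(curandState *state, unsigned long seed);
__global__ void test_kernel(curandState *state, float *dev_test);

int main(int argc, char *argv[])
{
    float *test_var, *dev_test;
    curandState *state;

    // Host and device buffers for the 8x256 matrix, zero-initialised.
    test_var = (float *)malloc(8 * 256 * sizeof(float));
    memset(test_var, 0, 8 * 256 * sizeof(float));
    cudaMalloc((void **)&dev_test, 8 * 256 * sizeof(float));
    cudaMemcpy(dev_test, test_var, 8 * 256 * sizeof(float), cudaMemcpyHostToDevice);

    // 8 blocks of 8x32 threads: one thread per matrix cell.
    dim3 gridDim(1, 256 / 32, 1);
    dim3 blockDim(8, 32, 1);

    // One generator state per cell.
    cudaMalloc((void **)&state, 8 * 256 * sizeof(curandState));
    setup_kernel<<<gridDim, blockDim>>>(state, unsigned(time(NULL)));
    test_kernel<<<gridDim, blockDim>>>(state, dev_test);
    cudaMemcpy(test_var, dev_test, 8 * 256 * sizeof(float), cudaMemcpyDeviceToHost);

    system("PAUSE");
    for (int i = 0; i < 256; i++)
    {
        for (int j = 0; j < 8; j++)
        {
            printf("%f\t", test_var[i * 8 + j]);
        }
        printf("\n");
    }

    cudaFree(dev_test);
    cudaFree(state);
    free(test_var);
    return 0;
}

// Initialise one curandState per cell: same seed, distinct sequence number.
__global__ void setup_kernel(curandState *state, unsigned long seed)
{
    int id_col = threadIdx.x + blockDim.x * blockIdx.x;
    int id_row = threadIdx.y + blockDim.y * blockIdx.y;
    curand_init(seed, (id_row * 8 + id_col), 0, &state[id_row * 8 + id_col]);
}

// Draw one value per cell and write the updated state back.
__global__ void test_kernel(curandState *state, float *dev_test)
{
    int id_col = threadIdx.x + blockDim.x * blockIdx.x;
    int id_row = threadIdx.y + blockDim.y * blockIdx.y;
    curandState local_state = state[id_row * 8 + id_col];
    // curand() yields a 32-bit unsigned integer, converted to float on assignment.
    dev_test[id_row * 8 + id_col] = curand(&local_state);
    state[id_row * 8 + id_col] = local_state;
}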
I want to generate a random number between 0 and 1 for each cell of the matrix. I would appreciate anyone's help. Thanks.
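One likely cause, offered as a sketch and not a confirmed diagnosis: in the cuRAND device API, `curand()` returns a 32-bit unsigned integer, whereas `curand_uniform()` returns a `float` in the interval (0, 1]. The variant below swaps in `curand_uniform()` and adds bounds checks so that the matrix size can vary. The names `setup_states`, `fill_uniform_kernel` and `fill_uniform_device`, and the `rows`/`cols` parameters, are hypothetical and not part of the posted code.

```cuda
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Initialise one generator state per cell: same seed, distinct sequence number.
__global__ void setup_states(curandState *state, unsigned long long seed,
                             int rows, int cols)
{
    int col = threadIdx.x + blockDim.x * blockIdx.x;
    int row = threadIdx.y + blockDim.y * blockIdx.y;
    if (row < rows && col < cols) {
        int idx = row * cols + col;
        curand_init(seed, idx, 0, &state[idx]);
    }
}

// Write one uniform float per cell.
__global__ void fill_uniform_kernel(curandState *state, float *out,
                                    int rows, int cols)
{
    int col = threadIdx.x + blockDim.x * blockIdx.x;
    int row = threadIdx.y + blockDim.y * blockIdx.y;
    if (row < rows && col < cols) {
        int idx = row * cols + col;
        curandState local_state = state[idx];
        out[idx] = curand_uniform(&local_state);  // float in (0, 1]
        state[idx] = local_state;                 // save the advanced state
    }
}

// Hypothetical host helper: fills host_out (rows * cols floats).
// Returns true if every CUDA call succeeded.
bool fill_uniform_device(float *host_out, int rows, int cols,
                         unsigned long long seed)
{
    size_t n = (size_t)rows * (size_t)cols;
    float *dev_out = nullptr;
    curandState *dev_state = nullptr;

    if (cudaMalloc((void **)&dev_out, n * sizeof(float)) != cudaSuccess)
        return false;
    if (cudaMalloc((void **)&dev_state, n * sizeof(curandState)) != cudaSuccess) {
        cudaFree(dev_out);
        return false;
    }

    // Same 8x32 block shape as the question; the grid is rounded up to cover all cells.
    dim3 block(8, 32, 1);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y, 1);
    setup_states<<<grid, block>>>(dev_state, seed, rows, cols);
    fill_uniform_kernel<<<grid, block>>>(dev_state, dev_out, rows, cols);

    bool ok = cudaGetLastError() == cudaSuccess;  // catches launch errors
    // cudaMemcpy waits for the kernels to finish before copying.
    ok = cudaMemcpy(host_out, dev_out, n * sizeof(float),
                    cudaMemcpyDeviceToHost) == cudaSuccess && ok;

    cudaFree(dev_state);
    cudaFree(dev_out);
    return ok;
}
```

With the sizes from the question, a call such as `fill_uniform_device(test_var, 256, 8, time(NULL))` would fill the host buffer. Note that `curand_uniform()` excludes 0 and includes 1, so a strict [0, 1) range would need an extra adjustment.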
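Is there anything wrong with `curandGenerateUniform()`? –
I haven't tried it yet. But I am trying to generate random numbers that do not belong to any distribution. That is why I am using curand(). – duttasankha
A random number that belongs to "no distribution" does not make sense. That is like talking about a sentence with "no shape". A uniform distribution is exactly what it sounds like: every value between 0 and 1 is equally likely. –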
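For reference, `curandGenerateUniform()` belongs to cuRAND's host API, which fills a device buffer with uniform floats in (0, 1] without a user-written kernel. The following is a minimal sketch under assumptions: the helper name `fill_uniform_host` is hypothetical, the generator type `CURAND_RNG_PSEUDO_DEFAULT` is just one possible choice, and the program is assumed to be linked against the cuRAND library (for example with `-lcurand`).

```cuda
#include <cuda_runtime.h>
#include <curand.h>

// Hypothetical helper: fills host_out with n uniform floats via the cuRAND host API.
// Returns true if every CUDA and cuRAND call succeeded.
bool fill_uniform_host(float *host_out, size_t n, unsigned long long seed)
{
    float *dev_out = nullptr;
    if (cudaMalloc((void **)&dev_out, n * sizeof(float)) != cudaSuccess)
        return false;

    curandGenerator_t gen;
    bool ok = curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT)
              == CURAND_STATUS_SUCCESS;
    if (ok) {
        ok = curandSetPseudoRandomGeneratorSeed(gen, seed) == CURAND_STATUS_SUCCESS
          // Generates n floats directly into device memory.
          && curandGenerateUniform(gen, dev_out, n) == CURAND_STATUS_SUCCESS
          && cudaMemcpy(host_out, dev_out, n * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
        curandDestroyGenerator(gen);
    }

    cudaFree(dev_out);
    return ok;
}
```

This route suits the case where the random values are only needed as a filled buffer. When each thread must draw values in the middle of other work, the per-thread `curandState` approach from the question is the usual fit.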