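Making CURAND generate different random numbers from a uniform distribution

I am trying to use the CURAND library to generate random numbers from 0 to 100 that are completely independent of each other. To do that, I give every thread the current time as the seed and pass "id = threadIdx.x + blockDim.x * blockIdx.x" as both the sequence number and the offset. After getting the random number as a float, I multiply it by 100 and take its integer value.
The problem I am facing is that threads [0,0] and [0,1] get the same random number, which is 11, no matter how many times I run the code. I cannot work out what I am doing wrong. Please help.
I am pasting my code below: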
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>
#include <curand_kernel.h>
#include "util/cuPrintf.cu"

#define WA 2                // Matrix A width
#define HA 2                // Matrix A height
#define NE (WA * HA)        // Total number of random numbers
#define SAMPLE 100          // Sample number
#define BLOCK_SIZE 2        // Block size

// Initialise one RNG state per thread; id is passed as both sequence and offset.
__global__ void setup_kernel (curandState * state, unsigned long seed)
{
    int id = threadIdx.x + blockIdx.x + blockDim.x;
    curand_init (seed, id, id, &state[id]);
}

// Draw a uniform value scaled by SAMPLE, then store one normal sample per thread.
__global__ void generate(curandState* globalState, float* randomMatrix)
{
    int ind = threadIdx.x + blockIdx.x * blockDim.x;
    if (ind < NE) {
        curandState localState = globalState[ind];
        float stopId = curand_uniform(&localState) * SAMPLE;
        cuPrintf("Float random value is : %f", stopId);
        int stop = (int) stopId;   // truncate to an integer
        cuPrintf("Random number %d\n", stop);
        for (int i = 0; i < SAMPLE; i++) {
            if (i == stop) {
                float random = curand_normal(&localState);
                cuPrintf("Random Value %f\t", random);
                randomMatrix[ind] = random;
                break;
            }
        }
        globalState[ind] = localState;   // save the advanced state
    }
}

/////////////////////////////////////////////////////////
// Program main
/////////////////////////////////////////////////////////
int main(int argc, char** argv)
{
    // 1. allocate host memory for matrix A
    unsigned int size_A = WA * HA;
    unsigned int mem_size_A = sizeof(float) * size_A;
    float* h_A = (float*) malloc(mem_size_A);
    time_t t;

    // 2. allocate device memory
    float* d_A;
    cudaMalloc((void**) &d_A, mem_size_A);

    // 3. create random states
    curandState* devStates;
    cudaMalloc((void**) &devStates, size_A * sizeof(curandState));

    // 4. setup seeds
    int n_blocks = size_A / BLOCK_SIZE;
    time(&t);
    printf("\nTime is : %lu\n", (unsigned long) t);
    setup_kernel <<< n_blocks, BLOCK_SIZE >>> (devStates, (unsigned long) t);

    // 5. generate random numbers
    cudaPrintfInit();
    generate <<< n_blocks, BLOCK_SIZE >>> (devStates, d_A);
    cudaPrintfDisplay(stdout, true);
    cudaPrintfEnd();

    // 6. copy result from device to host
    cudaMemcpy(h_A, d_A, mem_size_A, cudaMemcpyDeviceToHost);

    // 7. print out the results
    printf("\n\nMatrix A (Results)\n");
    for (unsigned int i = 0; i < size_A; i++)
    {
        printf("%f ", h_A[i]);
        if (((i + 1) % WA) == 0)
            printf("\n");
    }
    printf("\n");

    // 8. clean up memory
    free(h_A);
    cudaFree(d_A);
    cudaFree(devStates);
    return 0;
}

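The output I get is:

Time is : 1347857063
[0, 0]: Float random value is : 11.675105
[0, 0]: Random number 11
[0, 0]: Random Value 0.358356
[0, 1]: Float random value is : 11.675105
[0, 1]: Random number 11
[0, 1]: Random Value 0.358356
[1, 0]: Float random value is : 63.840496
[1, 0]: Random number 63
[1, 0]: Random Value 0.696459
[1, 1]: Float random value is : 44.712799
[1, 1]: Random number 44
[1, 1]: Random Value 0.735049
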
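I can't reproduce this with CUDA 5.0. I ran your code and it produced four completely different random values. Could you try upgrading to CUDA 5.0 RC? Note that I also compiled with `-arch=sm_20` so that I could use `printf`, since I no longer have `cuPrintf`. – harrism
I am using CUDA 4.2... Could you please run the code with that version and check? I can't find where to download CUDA 5.0 RC on the Nvidia site. – user1439690
[Here it is](http://developer.nvidia.com/cuda/cuda-pre-production). Please try it. By the way, Google is quite good at finding these things. :) – harrism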
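
For anyone who wants to try the `printf` route mentioned in the comments, here is a minimal sketch of the same idea without `cuPrintf`. It is not the code harrism ran; it is a reduced version written under a few assumptions: `N_STATES` is a hypothetical name standing in for WA * HA, the index uses the multiplicative form described in the question text, a bounds check guards the state array, and the offset passed to `curand_init` is 0 rather than `id` (a distinct sequence number per thread is already enough to give each thread its own subsequence). Device-side `printf` needs compute capability 2.0 or higher, which is why the comment mentions `-arch=sm_20`.

```cuda
#include <cstdio>
#include <ctime>
#include <curand_kernel.h>

#define N_STATES 4     // hypothetical name; equals WA * HA in the question
#define BLOCK_SIZE 2

// One RNG state per thread: same seed, distinct sequence number, offset 0.
__global__ void setup_kernel(curandState *state, unsigned long long seed, int n)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (id < n)
        curand_init(seed, id, 0, &state[id]);
}

// Scale a uniform sample to an integer and print it from the device.
__global__ void generate(curandState *state, int n)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (id < n) {
        curandState local = state[id];
        // curand_uniform returns a value in (0, 1], so the result lies in 0..100.
        int value = (int)(curand_uniform(&local) * 100.0f);
        printf("[%d, %d]: %d\n", blockIdx.x, threadIdx.x, value);
        state[id] = local;   // save the advanced state
    }
}

int main()
{
    curandState *devStates;
    cudaMalloc(&devStates, N_STATES * sizeof(curandState));

    int n_blocks = (N_STATES + BLOCK_SIZE - 1) / BLOCK_SIZE;
    unsigned long long seed = (unsigned long long) time(NULL);

    setup_kernel<<<n_blocks, BLOCK_SIZE>>>(devStates, seed, N_STATES);
    generate<<<n_blocks, BLOCK_SIZE>>>(devStates, N_STATES);
    cudaDeviceSynchronize();   // wait for the kernels and flush device printf

    cudaFree(devStates);
    return 0;
}
```

The printed values depend on the seed, so they change from run to run. What matters is that every thread reads a state that setup_kernel actually initialised.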