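Finding a seg fault in a small piece of code

I can't find the source of the segmentation fault in this short CUDA code. I'm using it to compare the sorting speed of the Thrust library against the STL on an array of numbers (the code below fills it with random doubles). I pass the size of the array to be sorted as a command-line argument.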
inline void check_cuda_error(char *message)
{
    cudaError_t error = cudaGetLastError();
    if(error != cudaSuccess)
    {
        printf("CUDA error after %s: %s\n", message, cudaGetErrorString(error));
    }
}

int main(int argc, char *argv[])
{
    int N = atoi(argv[1]);
    double* h = new double[N];
    for (int i = 0; i < N; ++i)
    {
        h[i] = (double)rand()/RAND_MAX; //std::cout << h[i] << " " ;
    }

    clock_t start, stop;
    std::cout << std::endl;

    // Start timing
    start = clock();
    std::sort(h, h + N);
    stop = clock();
    std::cout << "Host sorting took " << (stop - start) / (double)CLOCKS_PER_SEC << std::endl;

    // Start the GPU work. Initialize to random numbers again.
    for (int i = 0; i < N; ++i)
    {
        h[i] = (double)rand()/RAND_MAX; //std::cout << h[i] << " " ;
    }

    double* d = 0;
    const size_t num_bytes = N * sizeof(double);
    cudaMalloc((void**)&d, num_bytes);
    check_cuda_error("Memory Allocation");
    cudaMemcpy(d, h, N * sizeof(double), cudaMemcpyHostToDevice); // Transfer data
    thrust::sort(d, d + N);
    return 0;
}
[BeamerLatex/Farber]$ nvcc -arch=sm_20 sortcompare.cu ; ./a.out 16777216
Host sorting took 3.77
[1] 4661 segmentation fault ./a.out 16777216
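
One likely cause, offered as a guess rather than a confirmed diagnosis: `thrust::sort(d, d + N)` is given raw `double*` pointers, and Thrust treats raw pointers as host iterators, so the sort runs on the CPU and dereferences device addresses. A minimal sketch of the usual workaround wraps the device pointer in `thrust::device_ptr` before sorting. The helper name `sort_on_device` and the use of `std::vector` are assumptions made for illustration and are not part of the code above.

```cuda
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/sort.h>
#include <vector>

// Hypothetical helper (not from the question): sorts `values` on the GPU
// and copies the sorted data back. Returns false if a CUDA runtime call fails.
bool sort_on_device(std::vector<double>& values)
{
    if (values.empty()) return true;

    const size_t num_bytes = values.size() * sizeof(double);
    double* d = 0;
    if (cudaMalloc((void**)&d, num_bytes) != cudaSuccess) return false;

    if (cudaMemcpy(d, &values[0], num_bytes, cudaMemcpyHostToDevice) != cudaSuccess)
    {
        cudaFree(d);
        return false;
    }

    // Wrapping the raw pointer tells Thrust that the data lives in device
    // memory, so the sort is dispatched to the GPU instead of the host.
    thrust::device_ptr<double> d_begin(d);
    thrust::sort(d_begin, d_begin + values.size());

    // Copy the sorted data back, then release the device buffer.
    cudaError_t err = cudaMemcpy(&values[0], d, num_bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return err == cudaSuccess;
}
```

In the question's `main`, the equivalent change would be to build a `thrust::device_ptr<double>` from `d` and pass that to `thrust::sort`. Note that this sketch does not catch exceptions Thrust may throw, so `d` would leak in that case.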
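Unrelated, but you never delete `h`... –
Can you add your includes? I cba to google them all, and it would help me to have a running version of the code – sji
Just a guess, but could you check the return value of cudaMalloc? If it is cudaErrorMemoryAllocation, the call to cudaMemcpy will most likely fail because the destination is still 0. That usually leads to a segfault. –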
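
The check suggested in the last comment could look roughly like the sketch below, which tests the status returned by each runtime call directly instead of relying on `cudaGetLastError()` afterwards. The helper names `cuda_ok` and `alloc_device_doubles` are hypothetical and only serve the illustration.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper: prints a message for a failed CUDA runtime call
// and returns true only when the call succeeded.
inline bool cuda_ok(cudaError_t status, const char* what)
{
    if (status != cudaSuccess)
    {
        std::fprintf(stderr, "CUDA error in %s: %s\n", what, cudaGetErrorString(status));
        return false;
    }
    return true;
}

// Hypothetical helper: allocates room for n doubles on the device.
// Returns a null pointer when the allocation fails, so the caller can
// stop before handing the pointer to cudaMemcpy or a sort.
double* alloc_device_doubles(size_t n)
{
    double* d = 0;
    if (!cuda_ok(cudaMalloc((void**)&d, n * sizeof(double)), "cudaMalloc"))
    {
        return 0;
    }
    return d;
}
```

A caller would bail out as soon as `alloc_device_doubles` returns a null pointer, and wrap `cudaMemcpy` in `cuda_ok` the same way.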