2012-09-07

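I'm doing GPGPU development on Arch Linux using the cuda-sdk and cuda-toolkit packages. When I run cuda-gdb on a simple program as a normal user, CUDA driver initialization fails (first session below). Run as root, the same program works as expected (second session below). The test program, the compile command, and the nvidia-smi output follow the two sessions. Does cuda-gdb require root privileges?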

$ cuda-gdb ./driver 
Reading symbols from /home/nwh/Dropbox/projects/G4CU/driver...done. 
(cuda-gdb) run 
Starting program: /home/nwh/Dropbox/projects/G4CU/driver 
warning: Could not load shared library symbols for linux-vdso.so.1. 
Do you need "set solib-search-path" or "set sysroot"? 
[Thread debugging using libthread_db enabled] 
fatal: The CUDA driver initialization failed. (error code = 1) 


# cuda-gdb ./driver 
Reading symbols from /home/nwh/Dropbox/work/2012-09-06-cuda_gdb/driver...done. 
(cuda-gdb) run 
Starting program: /home/nwh/Dropbox/work/2012-09-06-cuda_gdb/driver 
warning: Could not load shared library symbols for linux-vdso.so.1. 
Do you need "set solib-search-path" or "set sysroot"? 
[Thread debugging using libthread_db enabled] 
[New Thread 0x7ffff5ba8700 (LWP 11386)] 
[Context Create of context 0x6e8a30 on Device 0] 
[Launch of CUDA Kernel 0 (thrust::detail::backend::cuda::detail::launch_closure_by_value<thrust::detail::backend::cuda::for_each_n_closure<thrust::device_ptr<unsigned long long>, unsigned int, thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned long long> > > ><<<(1,1,1),(704,1,1)>>>) on Device 0] 
[Launch of CUDA Kernel 1 (set_vector<<<(1,1,1),(10,1,1)>>>) on Device 0] 
vd[0] = 0 
vd[1] = 1 
vd[2] = 2 
vd[3] = 3 
vd[4] = 4 
vd[5] = 5 
vd[6] = 6 
vd[7] = 7 
vd[8] = 8 
vd[9] = 9 
[Thread 0x7ffff5ba8700 (LWP 11386) exited] 

Program exited normally. 
[Termination of CUDA Kernel 1 (set_vector<<<(1,1,1),(10,1,1)>>>) on Device 0] 
[Termination of CUDA Kernel 0 (thrust::detail::backend::cuda::detail::launch_closure_by_value<thrust::detail::backend::cuda::for_each_n_closure<thrust::device_ptr<unsigned long long>, unsigned int, thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned long long> > > ><<<(1,1,1),(704,1,1)>>>) on Device 0] 


// needed for nvcc with gcc 4.7 and iostream
#undef _GLIBCXX_USE_INT128

#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>

// kernel: each thread writes its global index into the array
__global__ void set_vector(int *a)
{
    // get thread id
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    a[id] = id;
}

int main(void)
{
    // settings
    int len = 10; int trd = 10;

    // allocate vectors
    thrust::device_vector<int> vd(len);

    // get the raw pointer
    int *a = thrust::raw_pointer_cast(vd.data());

    // call the kernel
    // assumed launch configuration: one block of trd threads, matching the
    // <<<(1,1,1),(10,1,1)>>> launch shown in the debugger log
    set_vector<<<1, trd>>>(a);

    // print vector
    for (int i=0; i<len; i++)
        std::cout << "vd[" << i << "] = " << vd[i] << std::endl;

    return 0;
}

$ nvcc -g -G -gencode arch=compute_20,code=sm_20 driver.cu -o driver 



$ nvidia-smi 
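Mon Sep 10 07:16:32 2012
+------------------------------------------------------+
| NVIDIA-SMI 4.304.43   Driver Version: 304.43         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro FX 1700           | 0000:01:00.0     N/A |       N/A            |
| 60%   52C  N/A     N/A /  N/A |   4%   20MB /  511MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla C2070              | 0000:02:00.0     Off |         0            |
| 30%   82C   P8     N/A /  N/A |   0%   11MB / 5375MB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+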



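Could you provide more information? 1. Can you run CUDA applications as a normal user without the debugger? 2. If a CUDA application is first run as superuser, do subsequent CUDA applications then start as a normal user? 3. Which versions of cuda-gdb and the CUDA driver are you using? – Vyas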


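@Vyas, (1) Yes, I can run CUDA applications as a normal user. (2) Yes, if I first run the application as root (with 'su' or 'sudo'), I can then run it as a normal user. (3) 'cuda-gdb' reports itself as version 4.2, based on GNU gdb 7.2. I'm using nvidia driver version 304.43. Thanks for looking into this! – nwhsvc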



Updating to the latest Nvidia driver (304.60) and the latest version of CUDA (5.0.35) fixed the problem. cuda-gdb does not need root privileges to run.
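
To compare an installed driver version against the one reported as working in this answer, a version-aware comparison is safer than a plain string compare. The following is a minimal sketch, assuming GNU coreutils `sort -V` is available; `version_ge` is a hypothetical helper name, and 304.60 is simply the version reported as working above, not an established minimum.

```shell
#!/bin/sh
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Installed version as shown in the nvidia-smi output in the question.
installed="304.43"
# Version reported as working in the answer (an assumption, not a hard minimum).
fixed="304.60"

if version_ge "$installed" "$fixed"; then
    echo "driver $installed is at least $fixed"
else
    echo "driver $installed is older than $fixed"
fi
```

On a live system the installed version would have to be read from the driver itself, for example from the header of the `nvidia-smi` output.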



#!/bin/bash

# Load the NVIDIA kernel module.
/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  # Create one world-readable/writable device node per controller.
  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  # Create the control device node.
  mknod -m 666 /dev/nvidiactl c 195 255
else
  # The module failed to load.
  exit 1
fi




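Hi @Vyas, I have the same problem whether or not X is running. When X is running, I see the nodes 'nvidia0', 'nvidia1', and 'nvidiactl' in '/dev'. They have permissions 'crw-rw-rw-'. If I shut down X, the nodes are still present. – nwhsvc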
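
The device node check described in this comment can be scripted. This is a minimal sketch, assuming the nodes live under `/dev/nvidia*` as reported here; `check_nodes` is a hypothetical helper name, and it only tests whether the current user can read and write each node.

```shell
#!/bin/sh
# Print each given device node with a note on whether the current user
# can both read and write it. With no arguments, checks /dev/nvidia*.
check_nodes() {
    [ "$#" -gt 0 ] || set -- /dev/nvidia*
    for node in "$@"; do
        if [ ! -e "$node" ]; then
            echo "$node: missing"
        elif [ -r "$node" ] && [ -w "$node" ]; then
            echo "$node: ok"
        else
            echo "$node: no read/write access"
        fi
    done
}

check_nodes
```

Run it as the normal user that launches cuda-gdb. If no node exists, the unexpanded pattern is reported as missing.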


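Oops, I missed your first answer: you are able to run applications as a normal user. That rules out the device nodes as the problem. In that case, you can try/check the following: 1. Delete the cuda-gdb temporary directory: 'rm -rf /tmp/cuda-dbg' 2. Make sure Yama's ptrace restriction is not present and enabled. – Vyas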


1. There is no '/tmp/cuda-gdb' on my system. 2. There is no directory named 'yama' in '/proc/sys/kernel/'. Does that mean Yama is not active? – nwhsvc
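
The Yama check discussed in these comments can be sketched as follows. It assumes that, on kernels built with Yama, the restriction level is exposed at `/proc/sys/kernel/yama/ptrace_scope`, and that a missing file generally means Yama is not built into or enabled in the running kernel. `yama_status` is a hypothetical helper name, and its optional path argument exists only to make the function easy to try out.

```shell
#!/bin/sh
# Report the Yama ptrace restriction level.
# The optional argument overrides the sysctl path (assumption: for testing only).
yama_status() {
    scope_file="${1:-/proc/sys/kernel/yama/ptrace_scope}"
    if [ ! -r "$scope_file" ]; then
        # No sysctl file: Yama is most likely not present in this kernel.
        echo "yama: not present"
        return 0
    fi
    scope=$(cat "$scope_file")
    case "$scope" in
        0) echo "yama: ptrace_scope=0 (classic ptrace permissions)" ;;
        *) echo "yama: ptrace_scope=$scope (ptrace restricted)" ;;
    esac
}

yama_status
```

A value of 0 means classic ptrace permissions. Higher values progressively restrict which processes may be traced.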
