2012-09-07 134 views
2

Does cuda-gdb need root permissions?

I am doing GPGPU development on Arch Linux, using the cuda-sdk and cuda-toolkit packages. When I try to run cuda-gdb on a simple program as a normal user, I get:

$ cuda-gdb ./driver 
NVIDIA (R) CUDA Debugger 
4.2 release 
Portions Copyright (C) 2007-2012 NVIDIA Corporation 
GNU gdb (GDB) 7.2 
Copyright (C) 2010 Free Software Foundation, Inc. 
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> 
This is free software: you are free to change and redistribute it. 
There is NO WARRANTY, to the extent permitted by law. Type "show copying" 
and "show warranty" for details. 
This GDB was configured as "x86_64-unknown-linux-gnu". 
For bug reporting instructions, please see: 
<http://www.gnu.org/software/gdb/bugs/>... 
Reading symbols from /home/nwh/Dropbox/projects/G4CU/driver...done. 
(cuda-gdb) run 
Starting program: /home/nwh/Dropbox/projects/G4CU/driver 
warning: Could not load shared library symbols for linux-vdso.so.1. 
Do you need "set solib-search-path" or "set sysroot"? 
[Thread debugging using libthread_db enabled] 
fatal: The CUDA driver initialization failed. (error code = 1) 

If I run cuda-gdb as root, it works correctly:

# cuda-gdb ./driver 
NVIDIA (R) CUDA Debugger 
4.2 release 
Portions Copyright (C) 2007-2012 NVIDIA Corporation 
GNU gdb (GDB) 7.2 
Copyright (C) 2010 Free Software Foundation, Inc. 
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> 
This is free software: you are free to change and redistribute it. 
There is NO WARRANTY, to the extent permitted by law. Type "show copying" 
and "show warranty" for details. 
This GDB was configured as "x86_64-unknown-linux-gnu". 
For bug reporting instructions, please see: 
<http://www.gnu.org/software/gdb/bugs/>... 
Reading symbols from /home/nwh/Dropbox/work/2012-09-06-cuda_gdb/driver...done. 
(cuda-gdb) run 
Starting program: /home/nwh/Dropbox/work/2012-09-06-cuda_gdb/driver 
warning: Could not load shared library symbols for linux-vdso.so.1. 
Do you need "set solib-search-path" or "set sysroot"? 
[Thread debugging using libthread_db enabled] 
[New Thread 0x7ffff5ba8700 (LWP 11386)] 
[Context Create of context 0x6e8a30 on Device 0] 
[Launch of CUDA Kernel 0 (thrust::detail::backend::cuda::detail::launch_closure_by_value<thrust::detail::backend::cuda::for_each_n_closure<thrust::device_ptr<unsigned long long>, unsigned int, thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned long long> > > ><<<(1,1,1),(704,1,1)>>>) on Device 0] 
[Launch of CUDA Kernel 1 (set_vector<<<(1,1,1),(10,1,1)>>>) on Device 0] 
vd[0] = 0 
vd[1] = 1 
vd[2] = 2 
vd[3] = 3 
vd[4] = 4 
vd[5] = 5 
vd[6] = 6 
vd[7] = 7 
vd[8] = 8 
vd[9] = 9 
[Thread 0x7ffff5ba8700 (LWP 11386) exited] 

Program exited normally. 
[Termination of CUDA Kernel 1 (set_vector<<<(1,1,1),(10,1,1)>>>) on Device 0] 
[Termination of CUDA Kernel 0 (thrust::detail::backend::cuda::detail::launch_closure_by_value<thrust::detail::backend::cuda::for_each_n_closure<thrust::device_ptr<unsigned long long>, unsigned int, thrust::detail::device_generate_functor<thrust::detail::fill_functor<unsigned long long> > > ><<<(1,1,1),(704,1,1)>>>) on Device 0] 

The test program, driver.cu, is:

// needed for nvcc with gcc 4.7 and iostream 
#undef _GLIBCXX_ATOMIC_BUILTINS 
#undef _GLIBCXX_USE_INT128 

#include <iostream> 
#include <thrust/device_vector.h> 
#include <thrust/host_vector.h> 

__global__ 
void set_vector(int *a) 
{ 
    // get thread id 
    int id = threadIdx.x + blockIdx.x * blockDim.x; 
    a[id] = id; 
    __syncthreads(); 
} 

int main(void) 
{ 
    // settings 
    int len = 10; int trd = 10; 

    // allocate vectors 
    thrust::device_vector<int> vd(len); 

    // get the raw pointer 
    int *a = thrust::raw_pointer_cast(vd.data()); 

    // call the kernel 
    set_vector<<<1,trd>>>(a); 

    // print vector 
    for (int i=0; i<len; i++) 
        std::cout << "vd[" << i << "] = " << vd[i] << std::endl; 

    return 0; 
} 

driver.cu is compiled with the following command:

$ nvcc -g -G -gencode arch=compute_20,code=sm_20 driver.cu -o driver 

How can I get cuda-gdb to run without root permissions?

Some more information; the output of nvidia-smi is:

$ nvidia-smi 
Mon Sep 10 07:16:32 2012  
+------------------------------------------------------+
| NVIDIA-SMI 4.304.43   Driver Version: 304.43         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro FX 1700           | 0000:01:00.0     N/A |                  N/A |
| 60%   52C  N/A     N/A /  N/A |   4%   20MB /  511MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla C2070              | 0000:02:00.0     Off |                    0 |
| 30%   82C    P8    N/A /  N/A |   0%   11MB / 5375MB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

The display is connected to the Quadro, and I run CUDA applications on the Tesla.

+0

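Could you provide some more information? 1. Can you run CUDA applications as a normal user without the debugger? 2. If you first run a CUDA application as superuser, do subsequent CUDA applications then start as a normal user? 3. Which versions of cuda-gdb and the CUDA driver are you using? – Vyas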

+0

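@Vyas, (1) Yes, I can run CUDA applications as a normal user. (2) Yes, if I first run the application as root (using 'su' or 'sudo'), I can then run it as a normal user. (3) 'cuda-gdb' reports that it is the 4.2 release, based on GNU gdb 7.2. I am using NVIDIA driver version 304.43. Thanks for looking into this! – nwhsvc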

Answers

1

Updating to the latest NVIDIA driver (304.60) and the latest CUDA release (5.0.35) fixed this problem. cuda-gdb does not need root permissions to run.

3

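Thanks. From the sound of it, your problem is that the required device nodes are not being initialized. Normally, running X creates the device nodes that the CUDA software stack needs to communicate with the hardware. When X is not running, as seems to be the case here, running as root creates the nodes. A normal user cannot create the nodes because of insufficient permissions. On a Linux system without X running, the recommended approach (from the Getting Started Guide at http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_Getting_Started_Linux.pdf) is to run the following script as root: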

#!/bin/bash 
/sbin/modprobe nvidia 
if [ "$?" -eq 0 ]; then 
    # Count the number of NVIDIA controllers found. 
    NVDEVS=`lspci | grep -i NVIDIA` 
    N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l` 
    NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l` 
    N=`expr $N3D + $NVGA - 1` 
    for i in `seq 0 $N`; do 
        mknod -m 666 /dev/nvidia$i c 195 $i 
    done 
    mknod -m 666 /dev/nvidiactl c 195 255 
else 
    exit 1 
fi 

Note that the device nodes have to be recreated on every boot, so it is best to add this script (or a similar one) to your startup sequence.
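To see whether the device nodes are actually the problem, you can check that they exist and that your user can read and write them. The following is only a minimal sketch and not part of the NVIDIA guide: the function name `check_nvidia_nodes` is made up for illustration, and it assumes the nodes are named `nvidia0`, `nvidia1`, ... plus `nvidiactl`, as created by the script above.

```shell
#!/bin/sh
# Hypothetical helper (not from the Getting Started Guide): report which
# NVIDIA device nodes are missing or not readable/writable by the caller.
#   $1 = directory to inspect (normally /dev)
#   $2 = number of GPUs expected
# Returns 0 if every node is usable, 1 otherwise.
check_nvidia_nodes() {
    dev_dir=$1
    gpu_count=$2
    status=0

    # Build the list of expected node names: nvidiactl, nvidia0, nvidia1, ...
    nodes="nvidiactl"
    i=0
    while [ "$i" -lt "$gpu_count" ]; do
        nodes="$nodes nvidia$i"
        i=$((i + 1))
    done

    for node in $nodes; do
        if [ ! -c "$dev_dir/$node" ]; then
            # Not a character device (or not there at all).
            echo "missing: $node"
            status=1
        elif [ ! -r "$dev_dir/$node" ] || [ ! -w "$dev_dir/$node" ]; then
            echo "no read/write access: $node"
            status=1
        else
            echo "ok: $node"
        fi
    done
    return "$status"
}
```

On a machine with two GPUs, as in the question, you would call it as a normal user with `check_nvidia_nodes /dev 2`. If every line says `ok`, the device nodes are unlikely to be the cause.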

@Till: Apologies for posting a question as an answer :). I am new here and do not have enough reputation to post comments.

+0

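Hi @Vyas, I have the same problem whether or not X is running. When X is running, I see the nodes 'nvidia0', 'nvidia1', and 'nvidiactl' in '/dev'. Their permissions are 'crw-rw-rw-'. If I shut down X, the nodes are still there. – nwhsvc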

+0

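Oops, I missed your first answer: you are able to run applications as a normal user. That rules out the device nodes as the problem. In that case, could you try/check the following: 1. Delete the cuda-gdb temporary directory: 'rm -rf /tmp/cuda-dbg'. 2. Make sure that yama's ptrace restrictions are not present and enabled. – Vyas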

+0

1. There is no '/tmp/cuda-gdb' on my system. 2. There is no directory named 'yama' in '/proc/sys/kernel/'. Does that mean yama is not active? – nwhsvc
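For reference, on kernels where the Yama security module is active, its setting is exposed as the file '/proc/sys/kernel/yama/ptrace_scope'. If that file does not exist, Yama is not restricting ptrace on the running kernel. Below is a minimal sketch of how to check this. The function name `describe_ptrace_scope` is hypothetical. The descriptions of values 0 to 3 follow the Linux kernel's Yama documentation. Whether any particular level interferes with cuda-gdb is not established in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the Yama ptrace restriction level.
#   $1 = path to the ptrace_scope file (optional; defaults to the
#        location used by kernels with Yama enabled)
describe_ptrace_scope() {
    scope_file=${1:-/proc/sys/kernel/yama/ptrace_scope}

    if [ ! -e "$scope_file" ]; then
        # No such file: Yama is not active on this kernel.
        echo "absent: Yama ptrace restrictions are not active"
        return 0
    fi

    scope=$(cat "$scope_file")
    case "$scope" in
        0) echo "0: classic ptrace permissions" ;;
        1) echo "1: restricted ptrace (only descendants may be traced)" ;;
        2) echo "2: admin-only attach (CAP_SYS_PTRACE required)" ;;
        3) echo "3: no attach allowed" ;;
        *) echo "unexpected value: $scope" ;;
    esac
}
```

Calling `describe_ptrace_scope` with no argument inspects the running kernel. Passing a path is only there so the function can be tried against a test file.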
