2014-12-05 154 views
-1

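Cuda compile sample

I have an NVIDIA GeForce 8500 GT installed on Windows 7 Pro 32-bit, and I have a problem with my project in CUDA C. I have installed all the packages and VS2012 Pro. I created a new project from the CUDA 6.5 template, compiled it, and got "invalid device function". In the Getting Started guide for Windows (PDF) I read that I can check CUDA with deviceQuery.exe, so I did: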

deviceQuery.exe Starting... 

CUDA Device Query (Runtime API) version (CUDART static linking) 

Detected 1 CUDA Capable device(s) 

Device 0: "GeForce 8500 GT" 
    CUDA Driver Version/Runtime Version   6.5/6.5 
    CUDA Capability Major/Minor version number: 1.1 
    Total amount of global memory:     512 MBytes (536870912 bytes) 
    (2) Multiprocessors, ( 8) CUDA Cores/MP:  16 CUDA Cores 
    GPU Clock rate:        1570 MHz (1.57 GHz) 
    Memory Clock rate:        400 Mhz 
    Memory Bus Width:        128-bit 
    Maximum Texture Dimension Size (x,y,z)   1D=(8192), 2D=(65536, 32768), 3D=(2048, 2048, 2048) 
    Maximum Layered 1D Texture Size, (num) layers 1D=(8192), 512 layers 
    Maximum Layered 2D Texture Size, (num) layers 2D=(8192, 8192), 512 layers 
    Total amount of constant memory:    65536 bytes 
    Total amount of shared memory per block:  16384 bytes 
    Total number of registers available per block: 8192 
    Warp size:          32 
    Maximum number of threads per multiprocessor: 768 
    Maximum number of threads per block:   512 
    Max dimension size of a thread block (x,y,z): (512, 512, 64) 
    Max dimension size of a grid size (x,y,z): (65535, 65535, 1) 
    Maximum memory pitch:       2147483647 bytes 
    Texture alignment:        256 bytes 
    Concurrent copy and kernel execution:   Yes with 1 copy engine(s) 
    Run time limit on kernels:      Yes 
    Integrated GPU sharing Host Memory:   No 
    Support host page-locked memory mapping:  Yes 
    Alignment requirement for Surfaces:   Yes 
    Device has ECC support:      Disabled 
    CUDA Device Driver Mode (TCC or WDDM):   WDDM (Windows Display Driver Model) 
    Device supports Unified Addressing (UVA):  No 
    Device PCI Bus ID/PCI location ID:   1/0 
    Compute Mode: 
    < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > 

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GeForce 8500 GT 
Result = PASS 

So it says PASS! Then what is wrong? Next I ran bandwidthTest:

[CUDA Bandwidth Test] - Starting... 
Running on... 

Device 0: GeForce 8500 GT 
Quick Mode 

Host to Device Bandwidth, 1 Device(s) 
PINNED Memory Transfers 
    Transfer Size (Bytes) Bandwidth(MB/s) 
    33554432   1346.5 

Device to Host Bandwidth, 1 Device(s) 
PINNED Memory Transfers 
    Transfer Size (Bytes) Bandwidth(MB/s) 
    33554432   1556.9 

Device to Device Bandwidth, 1 Device(s) 
PINNED Memory Transfers 
    Transfer Size (Bytes) Bandwidth(MB/s) 
    33554432   5857.4 

Result = PASS 

So can anybody help me?

+1

The default compilation target for CUDA 6.5 is CC 2.0 (sm_20), but your GPU is CC 1.1 (sm_11). Try specifying the correct target architecture on the 'nvcc' command line: '-arch=sm_11'. – njuffa 2014-12-05 16:43:17

Answers

2

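"Invalid device function" usually means that the code was compiled for a higher architecture than that of the GPU you are trying to run it on.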
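To see where this error shows up in a program, here is a minimal sketch, not the code from the project template. The kernel name `fill` and the file name `check_launch.cu` are made up for illustration. It launches a trivial kernel and checks the launch status with `cudaGetLastError()`. If the binary contains no code the GPU can run, the launch check is where a message such as "invalid device function" would typically be reported.

```cuda
// Build for a compute capability 1.1 device with CUDA 6.5
// (the file name is hypothetical):
//   nvcc -arch=sm_11 check_launch.cu -o check_launch
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel: each thread writes one value.
__global__ void fill(int *out, int value)
{
    out[threadIdx.x] = value;
}

int main()
{
    const int n = 32;
    int *d_out = NULL;

    cudaError_t err = cudaMalloc((void **)&d_out, n * sizeof(int));
    if (err != cudaSuccess) {
        std::printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    fill<<<1, n>>>(d_out, 7);

    // Launch-time problems (for example, no kernel image built for
    // this GPU's architecture) are reported by cudaGetLastError().
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("Kernel launch failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    // Errors that occur while the kernel runs are reported on sync.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("Kernel execution failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    int h_out[n];
    err = cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::printf("cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    std::printf("h_out[0] = %d\n", h_out[0]);
    cudaFree(d_out);
    return 0;
}
```

Checking the status right after the launch matters because a kernel launch returns no value itself; without the check, a failed launch can go unnoticed until the results turn out to be wrong.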

The architecture of your GPU is included in the printout:

CUDA Capability Major/Minor version number: 1.1 

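By default, CUDA 6.5 compiles for the cc 2.0 architecture. If you want to compile for the cc 1.1 architecture, you need to pass a specific switch to your nvcc compile command.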

In Visual Studio, this typically means adding something like compute_11,sm_11 in the device configuration tab of the project properties.

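When you do this, you will get a warning (under CUDA 6.5) that device architecture 1.1 is deprecated. However, you can still compile for and target this architecture.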

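Even though this question is about Windows, the same applies on Linux. If you are using CUDA 6.5 on Linux, the default compile target is also cc 2.0. To compile for earlier devices, you need to add something like -arch=sm_11 to the compile command line.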
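If you are not sure which value to use, the following sketch may help. It reads each device's compute capability through the runtime API and prints the matching target names. The file name `suggest_arch.cu` is hypothetical, and the "major digit followed by minor digit" naming is inferred from the sm_11 / cc 1.1 pairing above. The program contains no kernels, so it should build with the default target.

```cuda
// Build (the file name is hypothetical):
//   nvcc suggest_arch.cu -o suggest_arch
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::printf("cudaGetDeviceProperties failed: %s\n",
                        cudaGetErrorString(err));
            return 1;
        }

        // prop.major and prop.minor hold the compute capability,
        // e.g. 1 and 1 for the GeForce 8500 GT in the question.
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
        std::printf("  nvcc switch:           -arch=sm_%d%d\n",
                    prop.major, prop.minor);
        std::printf("  Visual Studio setting: compute_%d%d,sm_%d%d\n",
                    prop.major, prop.minor, prop.major, prop.minor);
    }
    return 0;
}
```

For the GeForce 8500 GT in the question (capability 1.1), this would suggest -arch=sm_11 and compute_11,sm_11, which matches the values given above.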

+0

Thank you very much ;))) It works ;) – user3490530 2014-12-05 16:46:17

+1

@user3490530 If the answer satisfies you, please accept it by clicking the check mark to its left. – njuffa 2014-12-05 16:49:43