2016-03-11 203 views

I am having trouble running a TensorFlow program on GPU 1. Whether I launch it with CUDA_VISIBLE_DEVICES=1 python program.py or use tf.device('/gpu:1') inside the program, I keep getting the error below; TensorFlow only works for me on GPU 0:

I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: Tesla K40c 
major: 3 minor: 5 memoryClockRate (GHz) 0.745 
pciBusID 0000:04:00.0 
Total memory: 12.00GiB 
Free memory: 11.90GiB 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:717] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:04:00.0) 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.0KiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.00MiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00GiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00GiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00GiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00GiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00GiB 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:73] Allocating 11.31GiB bytes. 
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:83] GPU 0 memory begins at 0x9047a0000 extends to 0xbd8325334 
F tensorflow/stream_executor/cuda/cuda_driver.cc:383] Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(context) (0 vs. 216) 
F tensorflow/stream_executor/cuda/cuda_driver.cc:383] Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(context) (0 vs. 216) 
F tensorflow/stream_executor/cuda/cuda_driver.cc:383] Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(context) (0 vs. 216) 
Aborted 

This does not happen when running on GPU 0, but the problem is that someone else is using GPU 0 while GPU 1 sits idle.


My guess is that there is a hard-coded device-0 reference somewhere in the tensorflow code base. – talonmies


@talonmies I regularly use tensorflow with different GPUs (selecting the GPU with CUDA_VISIBLE_DEVICES), so I am fairly sure that approach works in general – etarion


@etarion: working *in general* and working in this particular case are two different things: clearly the internal context-handling routines are blowing up here. Error 216 is CUDA_ERROR_CONTEXT_ALREADY_IN_USE – talonmies

Answer


If you are running with CUDA_VISIBLE_DEVICES=1, make sure the "someone else" is running with CUDA_VISIBLE_DEVICES=0.
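For example, the two jobs could be launched like this (a sketch only: `sh -c 'echo ...'` stands in for the actual training scripts, which are not part of this post; each process sees only the devices listed in its own CUDA_VISIBLE_DEVICES):

```shell
# Pin each process to its own GPU by setting CUDA_VISIBLE_DEVICES per command.
CUDA_VISIBLE_DEVICES=0 sh -c 'echo "process A sees GPU(s): $CUDA_VISIBLE_DEVICES"'
CUDA_VISIBLE_DEVICES=1 sh -c 'echo "process B sees GPU(s): $CUDA_VISIBLE_DEVICES"'
```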

By default, if CUDA_VISIBLE_DEVICES is not specified, a tensorflow process grabs all GPUs on the machine, even if it is not actively using them. They may look idle, but they cannot be touched by subsequent tensorflow processes. You can, however, set CUDA_VISIBLE_DEVICES on each process to assign each one a different GPU.

Keep in mind that within each process the tensorflow devices /gpu:0, /gpu:1, etc. are relative to that process, not global: they refer to whichever CUDA-visible devices are available to that process. In other words, if you want each process to use one GPU, you can refer to /gpu:0 in your code while assigning a specific physical GPU to each process via CUDA_VISIBLE_DEVICES.
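The renumbering above can be made concrete with a small sketch. The helper below is hypothetical (not part of TensorFlow or CUDA); it just mirrors the renumbering the CUDA runtime performs, where the visible physical GPUs are renumbered 0, 1, ... in the order they appear in CUDA_VISIBLE_DEVICES:

```python
def local_device_name(physical_gpu, cuda_visible_devices):
    """Return the in-process TensorFlow device name for a physical GPU id,
    or None if that GPU is hidden from the process."""
    visible = [int(x) for x in cuda_visible_devices.split(",")]
    if physical_gpu not in visible:
        return None
    # Visible devices are renumbered by their position in the list.
    return "/gpu:%d" % visible.index(physical_gpu)

# With CUDA_VISIBLE_DEVICES=1, physical GPU 1 becomes the process's /gpu:0:
print(local_device_name(1, "1"))    # -> /gpu:0
print(local_device_name(0, "1"))    # -> None (GPU 0 is hidden)
print(local_device_name(3, "2,3"))  # -> /gpu:1
```

This is why a program that hard-codes tf.device('/gpu:1') will fail under CUDA_VISIBLE_DEVICES=1: the process only sees a single device, and it is named /gpu:0.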

Use the nvidia-smi command to see which processes are using which GPUs.