2015-10-26 85 views
3

RuntimeError when using Theano shared variables in a Celery worker

I have a Celery task called simple_theano_tasks:

@app.task(bind=True, queue='test') 
def simple_theano_tasks(self): 
    import theano, numpy as np 
    my_array = np.zeros((0,), dtype=theano.config.floatX) 
    shared = theano.shared(my_array, name='my_variable', borrow=True) 
    print 'Done. Shared value is {}'.format(shared.get_value()) 

When Theano is configured to use the CPU, everything works as expected (no error):

$ THEANO_FLAGS=device=cpu celery -A my_project worker -c1 -l info -Q test 

[INFO/MainProcess] Received task: my_project.tasks.simple_theano_tasks[xxxx]

[WARNING/Worker-1] Done. Shared value is []

[INFO/MainProcess] Task my_project.tasks.simple_theano_tasks[xxxx] succeeded in 0.00407959899985s


Now, when I do the same thing with the GPU enabled, Theano (or CUDA) raises an error:

$ THEANO_FLAGS=device=gpu celery -A my_project worker -c1 -l info -Q test 

...

Using gpu device 0: GeForce GTX 670M (CNMeM is enabled)

...

[INFO/MainProcess] Received task: my_project.tasks.simple_theano_tasks[xxx]

[ERROR/MainProcess] Task my_project.tasks.simple_theano_tasks[xxx] raised unexpected: RuntimeError("Cuda error 'initialization error' while copying %lli data element to device memory",)

Traceback (most recent call last):
  File "/.../local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/.../local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in protected_call
    return self.run(*args, **kwargs)
  File "/.../my_project/tasks.py", line 362, in simple_theano_tasks
    shared = theano.shared(my_array, name='my_variable', borrow=True)
  File "/.../local/lib/python2.7/site-packages/theano/compile/sharedvalue.py", line 247, in shared
    allow_downcast=allow_downcast, **kwargs)
  File "/.../local/lib/python2.7/site-packages/theano/sandbox/cuda/var.py", line 229, in float32_shared_constructor
    deviceval = type_support_filter(value, type.broadcastable, False, None)
RuntimeError: Cuda error 'initialization error' while copying %lli data element to device memory


Finally, when I run exactly the same code in a Python shell, I get no error:

$ THEANO_FLAGS=device=gpu python 
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> import theano, numpy as np 
Using gpu device 0: GeForce GTX 670M (CNMeM is enabled) 
>>> my_array = np.zeros((0,), dtype=theano.config.floatX) 
>>> shared = theano.shared(my_array, name='my_variable', borrow=True) 
>>> print 'Done. Shared value is {}'.format(shared.get_value()) 
Done. Shared value is [] 

Does anyone have an idea:

  • Why does Theano behave differently inside the Celery worker?
  • How can I fix this?

Some additional context:

  • I am using theano@0.7.0 and Celery@3.1.18

  • The "~/.theanorc" file:

[global]

floatX=float32

device=gpu

[mode]=FAST_RUN

[nvcc]

fastmath=True

[lib]

cnmem=0.1

[cuda]

root=/usr/local/cuda


@asksol do you have any idea? BR – nicolaspanel

Answers

4

One workaround is:

  1. Specify the CPU as the target device (in ".theanorc" or with "THEANO_FLAGS=device=cpu")
  2. Later, override the assigned device with the desired GPU
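Step 1 can also be applied from Python instead of the shell, provided the code runs before Theano is first imported in the process, since Theano reads THEANO_FLAGS at import time. The following is only a minimal sketch: the helper name with_device is hypothetical and not part of Theano or Celery.

```python
import os


def with_device(flags, device):
    """Return a THEANO_FLAGS string whose device entry is set to `device`.

    `flags` is a comma-separated list of key=value entries, as in the
    THEANO_FLAGS environment variable. Hypothetical helper for illustration.
    """
    entries = [e.strip() for e in flags.split(",")]
    # Drop empty entries and any existing device setting
    entries = [e for e in entries if e and not e.startswith("device=")]
    entries.append("device=" + device)
    return ",".join(entries)


# Must run before the first `import theano` in this process
os.environ["THEANO_FLAGS"] = with_device(os.environ.get("THEANO_FLAGS", ""), "cpu")
```

Other flags already present in the environment (floatX, for example) are preserved; only the device entry is replaced.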

The Celery task is now:

@app.task(bind=True, queue='test') 
def simple_theano_tasks(self): 
    # At this point, no theano import statements have been processed, and so the device is unbound 
    import theano, numpy as np 
    import theano.sandbox.cuda 
    theano.sandbox.cuda.use('gpu') # enable gpu 
    my_array = np.zeros((0,), dtype=theano.config.floatX) 
    shared = theano.shared(my_array, name='my_variable', borrow=True) 
    print 'Done. Shared value is {}'.format(shared.get_value()) 

I found the solution by reading this article about using multiple GPUs.