2016-05-30 29 views

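Can't understand the behavior of Theano scan when applying grad

So here's the thing: I have the code below. When I take the gradient with respect to A, everything runs fine and the gradient is computed correctly. But if I use wrt=i instead, I get a DisconnectedInputError. Why is that, and how can I differentiate with respect to i?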

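import theano
import theano.tensor as T

def step(i, A):
    # One scan step: return the scaled element and the element itself.
    return A*i, i

A = T.scalar("A")
outputs, _ = theano.scan(step, sequences=T.arange(2,6), non_sequences=A)
res, i = outputs  # i is the second scan output, not the input sequence
grad = T.grad(cost=res[3], wrt=A)  # works; wrt=i raises the error below
func = theano.function([A],[grad, res, i])

print(func(3.0))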

Traceback (most recent call last): 
    File "test.py", line 17, in <module> 
    grad = T.grad(cost=res[3], wrt=i) 
    File "/usr/local/lib/python2.7/dist-packages/theano/gradient.py", line 545, in grad 
    handle_disconnected(elem) 
    File "/usr/local/lib/python2.7/dist-packages/theano/gradient.py", line 532, in handle_disconnected 
    raise DisconnectedInputError(message) 
theano.gradient.DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: for{cpu,scan_fn}.1 
Backtrace when the node is created: 
    File "test.py", line 15, in <module> 
    outputs, _ = theano.scan(step, sequences=T.arange(2,6), non_sequences=A) 

Answer

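In the question, the i unpacked from outputs is the second output of scan (the for{cpu,scan_fn}.1 named in the error), and res[3] does not depend on it, so it is disconnected from the cost. To differentiate with respect to i, you need to declare the whole i array before the scan op and then compute the gradient with respect to that whole array.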

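import theano
import theano.tensor as T

# Declare the sequence up front so it is a variable in the cost graph.
i_array = T.arange(2,6)

def step(i, A):
    return A*i, i

A = T.scalar("A")
# Test value; only used when theano.config.compute_test_value is enabled.
A.tag.test_value = 5.0

outputs, _ = theano.scan(step, sequences=i_array, non_sequences=A)
res, i = outputs
# Differentiate with respect to the input sequence, not the scan output i.
grad = T.grad(cost=res[3], wrt=i_array)
func = theano.function([A],[grad, res, i])

print(func(3.0))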
If you want the gradient for a specific element of i, select it by indexing the gradient you computed with respect to i_array.
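
As a sanity check that does not need Theano, here is a minimal pure-Python sketch of what that gradient should look like. It assumes the sequence values are treated as real numbers, and it estimates the gradient of res[3] = A * i[3] with respect to every element using central finite differences. The names cost and numeric_grad are hypothetical helpers made up for this illustration; they are not part of Theano.

```python
def cost(i_values, A):
    """Mimic the scan: res[k] = A * i_values[k]; the cost is res[3]."""
    res = [A * v for v in i_values]
    return res[3]


def numeric_grad(i_values, A, eps=1e-6):
    """Central finite-difference estimate of d cost / d i_values[k] for every k."""
    grads = []
    for k in range(len(i_values)):
        plus = list(i_values)
        minus = list(i_values)
        plus[k] += eps
        minus[k] -= eps
        grads.append((cost(plus, A) - cost(minus, A)) / (2 * eps))
    return grads


# Same values as T.arange(2, 6), written as floats (an assumption of this sketch).
i_values = [2.0, 3.0, 4.0, 5.0]
grad_i = numeric_grad(i_values, A=3.0)

# Gradient for one specific element, selected by ordinary indexing.
grad_i3 = grad_i[3]
print(grad_i, grad_i3)
```

Only the element at index 3 contributes to the cost, so it is the only one with a nonzero gradient, and that gradient is approximately A. In the Theano version, the analogous step would be to index the symbolic gradient, for example grad[3].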