TensorFlow: tf.gradients between different paths of the graph

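I am working on a DDPG implementation, which requires computing the gradient of one network (below: critic) with respect to the output of another network (below: actor). My code already uses queues instead of feed dicts for the most part, but I have not yet been able to do so for this particular part: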

import tensorflow as tf 
tf.reset_default_graph() 

states = tf.placeholder(tf.float32, (None,)) 
actions = tf.placeholder(tf.float32, (None,)) 

actor = states * 1 
critic = states * 1 + actions 

grads_indirect = tf.gradients(critic, actions) 
grads_direct = tf.gradients(critic, actor) 

with tf.Session() as sess: 
    sess.run(tf.global_variables_initializer()) 

    act = sess.run(actor, {states: [1.]}) 
    print(act) # -> [1.] 
    cri = sess.run(critic, {states: [1.], actions: [2.]}) 
    print(cri) # -> [3.] 
    grad1 = sess.run(grads_indirect, {states: [1.], actions: act}) 
    print(grad1) # -> [[1.]] 
    grad2 = sess.run(grads_direct, {states: [1.], actions: [2.]}) 
    print(grad2) # -> TypeError: Fetch argument has invalid type 'NoneType'

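Here grad1 computes the gradient w.r.t. the fed-in action, which was previously computed by actor. grad2 should do the same, but directly inside the graph: instead of feeding the actions back in, it should evaluate actor directly. The problem is that grads_direct is None: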

print(grads_direct) # [None]

How can I achieve this? Is there a dedicated "evaluate this tensor" operation I could use? Thanks!


2017-06-08 ahoereth

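In your example, you do not use actor to compute critic, so there is no path in the graph from actor to critic and the gradient is None.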
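The reasoning above can be illustrated with a toy model in plain Python. This is only a sketch of the general idea (a gradient exists only if the output is reachable from the input through the graph), not TensorFlow's actual implementation; the `Node`, `scale`, `add`, and `gradient` names are hypothetical helpers invented for this illustration.

```python
class Node:
    """A value in a toy computation graph (hypothetical helper, not a TensorFlow class)."""

    def __init__(self, name, inputs=()):
        self.name = name
        # Each entry is (input_node, local_derivative_of_self_wrt_input).
        self.inputs = list(inputs)


def scale(a, k, name):
    """Toy op: name = a * k."""
    return Node(name, [(a, k)])


def add(a, b, name):
    """Toy op: name = a + b."""
    return Node(name, [(a, 1.0), (b, 1.0)])


def gradient(y, x):
    """Return dy/dx summed over all paths from x to y, or None if no path exists."""
    if y is x:
        return 1.0
    total = None
    for parent, local in y.inputs:
        upstream = gradient(parent, x)
        if upstream is not None:
            total = (total or 0.0) + local * upstream
    return total


states = Node("states")
actions = Node("actions")
actor = scale(states, 1.0, "actor")

# As in the question: critic recomputes states * 1 and never touches `actor`.
critic_disconnected = add(scale(states, 1.0, "states_times_one"), actions, "critic")
# Critic built on top of the `actor` node itself.
critic_connected = add(actor, actions, "critic")

grad_disconnected = gradient(critic_disconnected, actor)
grad_connected = gradient(critic_connected, actor)
print(grad_disconnected)  # None
print(grad_connected)     # 1.0
```

Although `actor` and `states * 1` hold the same values, they are distinct nodes, so only the second critic depends on `actor`.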

You should do this instead:

actor = states * 1 
critic = actor + actions # change here 

grads_indirect = tf.gradients(critic, actions) 
grads_direct = tf.gradients(critic, actor)
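For reference, a minimal self-contained sketch that contrasts the two graph constructions side by side. It assumes a TensorFlow version that provides the `tf.compat.v1` module (so it should also run under TensorFlow 2.x); the `critic_disconnected` and `critic_connected` names are introduced here only for illustration and do not appear in the original code.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode API, assumed to be available

graph = tf.Graph()
with graph.as_default():
    states = tf1.placeholder(tf.float32, (None,))
    actions = tf1.placeholder(tf.float32, (None,))

    actor = states * 1
    # As in the question: built from `states`, so `actor` is not on the path.
    critic_disconnected = states * 1 + actions
    # As in the answer: built from the `actor` tensor itself.
    critic_connected = actor + actions

    grads_disconnected = tf1.gradients(critic_disconnected, actor)
    grads_connected = tf1.gradients(critic_connected, actor)

print(grads_disconnected)  # [None]

with tf1.Session(graph=graph) as sess:
    # No need to evaluate `actor` separately and feed it back in.
    grad_values = sess.run(grads_connected, {states: [1.0], actions: [2.0]})

print(grad_values[0])  # [1.]
```

Because `critic_connected` consumes `actor` directly, a single `sess.run` call evaluates the actor and the gradient together.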


2017-06-08 20:34:31
