How do I access the weights of a recurrent cell in TensorFlow?

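One way to improve the stability of deep Q-learning is to maintain a set of target weights for the network, which are updated slowly and used to compute the Q-value targets. As a result, two different sets of weights are used in the forward pass at different points in the learning procedure. For a normal DQN this is not hard to implement, because the weights are TensorFlow variables whose values can be set in the feed_dict, i.e.: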

import tensorflow as tf

sess = tf.Session()
input = tf.placeholder(tf.float32, shape=[None, 5])
weights = tf.Variable(tf.random_normal(shape=[5, 4], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[4]))
output = tf.matmul(input, weights) + bias 
target = tf.placeholder(tf.float32, [None, 4]) 
loss = ... 

... 

# Here we explicitly set weights to be the slowly updated target weights
sess.run(output, feed_dict={input: states, weights: target_weights, bias: target_bias}) 

# Targets for the learning procedure are computed using this output. 

.... 

# Now we run the learning procedure, using the most up-to-date weights,
# as well as the previously computed targets
sess.run(loss, feed_dict={input: states, target: targets})

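I would like to use this target-network technique in a recurrent version of DQN, but I don't know how to access and set the weights used inside a recurrent cell. Specifically, I am using tf.nn.rnn_cell.BasicLSTMCell, but I would like to know how to do this for any type of recurrent cell.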


2016-11-27 John H

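BasicLSTMCell does not expose its variables as part of its public API. I recommend that you look up the names these variables have in the graph and feed those names (the names are unlikely to change, since they are stored in checkpoints and changing them would break checkpoint compatibility).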

Alternatively, you can make a copy of BasicLSTMCell that does expose the variables. I think this is the cleanest approach.


2016-11-28 18:01:18

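This works, thanks Alexander. For anyone who wants more detail: the weight and bias variables are created when you feed the recurrent cell into tf.nn.dynamic_rnn(). After running tf.initialize_all_variables() in your session, two new trainable tensors show up if you run tf.trainable_variables(). In my case they were named "RNN/BasicLSTMCell/Linear/Matrix:0" and "RNN/BasicLSTMCell/Linear/Bias:0".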
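
For readers who want the bookkeeping spelled out, here is a minimal, framework-agnostic sketch of what this implies: once the recurrent variables' names are known, the slowly updated target weights can be kept in a plain dictionary keyed by those names, from which a feed_dict for the target pass can be built. The helper names (soft_update, select_scope), the blending rate tau, the extra "Other/W:0" entry, and the use of short Python lists in place of real weight arrays are all assumptions for illustration, not part of TensorFlow. The two RNN variable names are the ones reported in the comment above and will likely differ between TensorFlow versions.

```python
# Framework-agnostic sketch: plain lists stand in for real weight arrays.
# These names come from the comment above; they vary by TensorFlow version.
MATRIX = "RNN/BasicLSTMCell/Linear/Matrix:0"
BIAS = "RNN/BasicLSTMCell/Linear/Bias:0"


def soft_update(target, online, tau):
    """Move each target weight a fraction tau toward the online weight.

    Blending is one common scheme (an assumption here); copying the online
    weights every N training steps is another.
    """
    return {
        name: [(1.0 - tau) * t + tau * o
               for t, o in zip(target[name], online[name])]
        for name in target
    }


def select_scope(values_by_name, scope):
    """Keep only the entries whose variable name lies under the given scope."""
    prefix = scope + "/"
    return {n: v for n, v in values_by_name.items() if n.startswith(prefix)}


# Hypothetical current values, as if fetched from the training network.
online = {MATRIX: [1.0, 2.0], BIAS: [0.5], "Other/W:0": [9.0]}
# Target weights start elsewhere and drift toward the online values.
target = {MATRIX: [0.0, 0.0], BIAS: [0.0]}

target = soft_update(target, select_scope(online, "RNN"), tau=0.5)
print(target[MATRIX], target[BIAS])
```

In a real setup the dictionary values would be the arrays fetched from the session for each named variable, and the same dictionary would supply the values fed during the target pass.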
