I am curious whether there is a good way to share the weights of different RNN cells while still feeding each cell a different input. How can I share weights across different RNN cells on different inputs in TensorFlow?
The graph that I want to build looks like this:
There are three orange LSTM cells in it that operate in parallel, and I would like to share the weights between them.
I have managed to implement something similar to what I want by using a placeholder (see the code below). However, using the placeholder breaks the optimizer's gradient computation, and nothing past the point where the placeholder is used gets trained. Is it possible to do this in TensorFlow?
I am using TensorFlow 1.2 and Python 3.5 in an Anaconda environment on Windows 7.
Code:
def ann_model(cls, data, act=tf.nn.relu):
    with tf.name_scope('ANN'):
        with tf.name_scope('ann_weights'):
            ann_weights = tf.Variable(tf.random_normal([1,
                                                        cls.n_ann_nodes]))
        with tf.name_scope('ann_bias'):
            ann_biases = tf.Variable(tf.random_normal([1]))
        out = act(tf.matmul(data, ann_weights) + ann_biases)
    return out
def rnn_lower_model(cls, data):
    with tf.name_scope('RNN_Model'):
        data_tens = tf.split(data, cls.sequence_length, 1)
        for i in range(len(data_tens)):
            data_tens[i] = tf.reshape(data_tens[i], [cls.batch_size,
                                                     cls.n_rnn_inputs])
        rnn_cell = tf.nn.rnn_cell.BasicLSTMCell(cls.n_rnn_nodes_lower)
        outputs, states = tf.contrib.rnn.static_rnn(rnn_cell,
                                                    data_tens,
                                                    dtype=tf.float32)
        with tf.name_scope('RNN_out_weights'):
            out_weights = tf.Variable(
                tf.random_normal([cls.n_rnn_nodes_lower, 1]))
        with tf.name_scope('RNN_out_biases'):
            out_biases = tf.Variable(tf.random_normal([1]))
        # Encode the output of the RNN into one estimate per entry in
        # the input sequence
        predict_list = []
        for i in range(cls.sequence_length):
            predict_list.append(tf.matmul(outputs[i],
                                          out_weights)
                                + out_biases)
    return predict_list
def create_graph(cls, sess):
    # Initializes the graph
    with tf.name_scope('input'):
        cls.x = tf.placeholder('float', [cls.batch_size,
                                         cls.sequence_length,
                                         cls.n_inputs])
    with tf.name_scope('labels'):
        cls.y = tf.placeholder('float', [cls.batch_size, 1])
    with tf.name_scope('community_id'):
        cls.c = tf.placeholder('float', [cls.batch_size, 1])
    # Define placeholder to provide variable input into the
    # RNNs with shared weights
    cls.input_place = tf.placeholder('float', [cls.batch_size,
                                               cls.sequence_length,
                                               cls.n_rnn_inputs])
    # Global step used in the optimizer
    global_step = tf.Variable(0, trainable=False)
    # Create ANN
    ann_output = cls.ann_model(cls.c)
    # Combine output of ANN with other input data x
    ann_out_seq = tf.reshape(tf.concat([ann_output for _ in
                                        range(cls.sequence_length)], 1),
                             [cls.batch_size,
                              cls.sequence_length,
                              cls.n_ann_nodes])
    cls.rnn_input = tf.concat([ann_out_seq, cls.x], 2)
    # Create 'unrolled' RNN by creating sequence_length many RNN cells that
    # share the same weights.
    with tf.variable_scope('Lower_RNNs'):
        # Create RNNs
        daily_prediction, daily_prediction1 = [cls.rnn_lower_model(cls.input_place)] * 2
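As a side note on the last line of `create_graph`: `[cls.rnn_lower_model(cls.input_place)] * 2` evaluates the expression inside the brackets only once and then repeats the *same* reference, so `daily_prediction` and `daily_prediction1` are the identical object rather than two weight-sharing copies of the model. A minimal pure-Python sketch of this pitfall (with a hypothetical `build_model` standing in for `cls.rnn_lower_model`):

```python
# Sketch of the list-multiplication pitfall: the expression in the brackets
# is evaluated once, and the list repeats the same reference twice.
calls = []

def build_model(x):
    # Hypothetical stand-in for cls.rnn_lower_model; records each invocation.
    calls.append(x)
    return {'output': x}

a, b = [build_model('input_place')] * 2

print(len(calls))  # 1 -- the model function ran only once
print(a is b)      # True -- both names refer to the identical object
```

To actually build two branches, the model function has to be called twice, once per input.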
When training, a mini-batch is computed in two steps:
RNNinput = sess.run(cls.rnn_input, feed_dict={cls.x: batch_x,
                                              cls.y: batch_y,
                                              cls.c: batch_c})
_ = sess.run(cls.optimizer, feed_dict={cls.input_place: RNNinput,
                                       cls.y: batch_y,
                                       cls.x: batch_x,
                                       cls.c: batch_c})
Thanks for your help. Any ideas would be appreciated.
Why do you have two `feed_dict`s? –
The second one is the same as the first, but it includes the result `RNNinput` of the first `sess.run`. This is how I pass the output of the lower layer with the shared RNN cells up to the upper layer. I do this in the second `sess.run` call via the placeholder `cls.input_place`. Unfortunately, this breaks TensorFlow's backpropagation computation. – AlexR
You should not do that. You can build a single graph as mentioned in the link, feed the input once, and let the whole network train. Is there any reason why you cannot do that? –
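A minimal sketch of the single-graph approach this comment describes, in TF 1.x style: call the model once per input inside the same variable scope, with `reuse=True` on the second call so both branches tie to one set of weights, and gradients then flow end to end without any intermediate placeholder. The shapes and scope names below are illustrative, not taken from the question; the `tf.compat.v1` import is an assumption for running on a TF 2.x install (on TF 1.x, plain `import tensorflow as tf` works).

```python
import tensorflow.compat.v1 as tf  # assumption: TF 2.x install; use `import tensorflow as tf` on 1.x
tf.disable_v2_behavior()
tf.reset_default_graph()

batch_size, seq_len, n_in, n_hidden = 4, 5, 3, 8  # illustrative sizes
x1 = tf.placeholder(tf.float32, [batch_size, seq_len, n_in])
x2 = tf.placeholder(tf.float32, [batch_size, seq_len, n_in])

def lower_rnn(data):
    # One cell object unrolled by static_rnn: all timesteps share its weights.
    cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
    steps = tf.unstack(data, axis=1)
    outputs, _ = tf.nn.static_rnn(cell, steps, dtype=tf.float32)
    return outputs

# Reusing the variable scope ties the weights of both branches together,
# so only one set of LSTM parameters is ever created.
with tf.variable_scope('Lower_RNNs'):
    out_a = lower_rnn(x1)
with tf.variable_scope('Lower_RNNs', reuse=True):
    out_b = lower_rnn(x2)

rnn_vars = tf.trainable_variables(scope='Lower_RNNs')
print(len(rnn_vars))  # one kernel and one bias, shared by both branches
```

Both `out_a` and `out_b` are distinct output tensors reading the same variables, so a loss built on either (or both) trains the shared weights in a single `sess.run`.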