How do I train different LSTMs in the same TensorFlow session?

I want to train two different LSTMs so that they interact in a dialogue setting (i.e., one produces a sequence, which is then used as the context for the second RNN, which replies to it, and so on). However, I don't know how to train them separately in TensorFlow (I think I have not fully understood the logic behind TF graphs). When I execute my code, I get the following error:

Variable rnn/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope?

The error occurs when I create the second RNN. Do you know how to fix this?

My code is as follows:

import tensorflow as tf

#User LSTM
no_units=100 
_seq_user = tf.placeholder(tf.float32, [batch_size, max_length_user, user_inputShapeLen], name='seq') 
_seq_length_user = tf.placeholder(tf.int32, [batch_size], name='seq_length') 

cell = tf.contrib.rnn.BasicLSTMCell(no_units)

output_user, hidden_states_user = tf.nn.dynamic_rnn(
    cell, 
    _seq_user, 
    dtype=tf.float32, 
    sequence_length=_seq_length_user 
) 
out2_user = tf.reshape(output_user, shape=[-1, no_units]) 
out2_user = tf.layers.dense(out2_user, user_outputShapeLen) 

out_final_user = tf.reshape(out2_user, shape=[-1, max_length_user, user_outputShapeLen]) 
y_user_ = tf.placeholder(tf.float32, [None, max_length_user, user_outputShapeLen]) 


softmax_user = tf.nn.softmax(out_final_user, dim=-1) 
loss_user = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_final_user, labels=y_user_)) 
optimizer = tf.train.AdamOptimizer(learning_rate=10**-4) 
minimize = optimizer.minimize(loss_user) 

init = tf.global_variables_initializer() 
sess = tf.Session() 
sess.run(init) 

for i in range(epoch): 
    print 'Epoch: ', i 
    batch_X, batch_Y, batch_sizes = lstm.batching(user_train_X, user_train_Y, sizes_user_train) 
    for data_, target_, size_ in zip(batch_X, batch_Y, batch_sizes): 
        sess.run(minimize, {_seq_user: data_, _seq_length_user: size_, y_user_: target_})

#System LSTM 
no_units_system=100 
_seq_system = tf.placeholder(tf.float32, [batch_size, max_length_system, system_inputShapeLen], name='seq_') 
_seq_length_system = tf.placeholder(tf.int32, [batch_size], name='seq_length_') 

cell_system = tf.contrib.rnn.BasicLSTMCell(no_units_system)

output_system, hidden_states_system = tf.nn.dynamic_rnn(
    cell_system, 
    _seq_system, 
    dtype=tf.float32, 
    sequence_length=_seq_length_system 
) 
out2_system = tf.reshape(output_system, shape=[-1, no_units_system])
out2_system = tf.layers.dense(out2_system, system_outputShapeLen) 

out_final_system = tf.reshape(out2_system, shape=[-1, max_length_system, system_outputShapeLen]) 
y_system_ = tf.placeholder(tf.float32, [None, max_length_system, system_outputShapeLen]) 

softmax_system = tf.nn.softmax(out_final_system, dim=-1) 
loss_system = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_final_system, labels=y_system_)) 
optimizer = tf.train.AdamOptimizer(learning_rate=10**-4) 
minimize = optimizer.minimize(loss_system) 

for i in range(epoch): 
    print 'Epoch: ', i 
    batch_X, batch_Y, batch_sizes = lstm.batching(system_train_X, system_train_Y, sizes_system_train) 
    for data_, target_, size_ in zip(batch_X, batch_Y, batch_sizes): 
        sess.run(minimize, {_seq_system: data_, _seq_length_system: size_, y_system_: target_})

Answers

Regarding the variable scope error, try setting a different variable scope for each graph.

with tf.variable_scope('User_LSTM'):
    # your user_lstm graph

with tf.variable_scope('System_LSTM'):
    # your system_lstm graph
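
Applied to the code in the question, that means building each cell and each tf.nn.dynamic_rnn call inside its own scope. A minimal sketch, reusing the names defined above:

with tf.variable_scope('User_LSTM'):
    cell = tf.contrib.rnn.BasicLSTMCell(no_units)
    output_user, hidden_states_user = tf.nn.dynamic_rnn(
        cell, _seq_user, dtype=tf.float32, sequence_length=_seq_length_user)

with tf.variable_scope('System_LSTM'):
    cell_system = tf.contrib.rnn.BasicLSTMCell(no_units_system)
    output_system, hidden_states_system = tf.nn.dynamic_rnn(
        cell_system, _seq_system, dtype=tf.float32, sequence_length=_seq_length_system)

The LSTM weights are then created as User_LSTM/rnn/basic_lstm_cell/weights and System_LSTM/rnn/basic_lstm_cell/weights, so the second dynamic_rnn call no longer collides with the first.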

Also, you should avoid using the same name for different Python objects (e.g. optimizer): the second assignment will overwrite the first, which will confuse you when you use TensorBoard. By the way, I would suggest training the model end to end instead of running the two sessions separately: try feeding the output tensor of the first LSTM into the second LSTM, with a single optimizer and a single loss function, as sketched below.
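
A rough sketch of that end-to-end idea, reusing the placeholders from the question; the target placeholder y_ is hypothetical, and details such as masking padded time steps are omitted:

with tf.variable_scope('User_LSTM'):
    output_user, _ = tf.nn.dynamic_rnn(
        tf.contrib.rnn.BasicLSTMCell(no_units), _seq_user,
        dtype=tf.float32, sequence_length=_seq_length_user)

with tf.variable_scope('System_LSTM'):
    # the first LSTM's outputs become the second LSTM's inputs
    output_system, _ = tf.nn.dynamic_rnn(
        tf.contrib.rnn.BasicLSTMCell(no_units_system), output_user,
        dtype=tf.float32, sequence_length=_seq_length_user)

logits = tf.layers.dense(output_system, system_outputShapeLen)
y_ = tf.placeholder(tf.float32, [None, max_length_user, system_outputShapeLen])  # hypothetical targets
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))
train_op = tf.train.AdamOptimizer(10**-4).minimize(loss)  # one optimizer updates both LSTMs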


In short, to solve the problem (Variable rnn/basic_lstm_cell/weights already exists), what you need are two separate variable scopes (as @J-min mentioned). In TensorFlow, variables are organized by their names, and by keeping the two sets of variables in these two scopes, TensorFlow is able to tell them apart.
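
To see what "organized by their names" means, you can list the trainable variables after building both scoped graphs; with the scopes above, the names look roughly like this (exact names depend on the TF version):

for v in tf.trainable_variables():
    print v.name
# User_LSTM/rnn/basic_lstm_cell/weights:0
# User_LSTM/rnn/basic_lstm_cell/biases:0
# System_LSTM/rnn/basic_lstm_cell/weights:0
# System_LSTM/rnn/basic_lstm_cell/biases:0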

And by "train them separately on tensorflow", I suppose you want to define two distinct loss functions and optimize the two LSTM networks with two optimizers, each corresponding to one of the loss functions.

In that case, you need to get the lists of these two sets of variables and pass each list to its optimizer, like this:

opt1 = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt_op1 = opt1.minimize(loss1, var_list=<list of variables from scope 1>)

opt2 = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt_op2 = opt2.minimize(loss2, var_list=<list of variables from scope 2>)
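
One way to obtain those variable lists is to filter the trainable variables by scope name (scope names assumed to match the ones suggested above):

user_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='User_LSTM')
system_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='System_LSTM')

opt_op1 = opt1.minimize(loss1, var_list=user_vars)      # updates only the user LSTM
opt_op2 = opt2.minimize(loss2, var_list=system_vars)    # updates only the system LSTM

This way each training op touches only its own network's weights.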