LSTM model with auxiliary inputs

I have a dataset with 2 columns, each column holding a set of documents. I have to match the documents in Col A with the documents provided in Col B. This is a supervised classification problem, so my training data contains a label column indicating whether the two documents match.

To solve this problem, I created a set of features, say F1-F25 (by comparing the two documents), and then trained a binary classifier on these features. This approach works reasonably well, but now I would like to evaluate deep learning models (specifically LSTM models) on the same problem. For context, the baseline looks roughly like the sketch below, as illustrated after this paragraph.
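A minimal sketch of that baseline setup, assuming the 25 comparison features per document pair are already computed; X and labels below are random placeholders, not real data:

# Hypothetical baseline sketch: a plain binary classifier on the 25
# hand-crafted comparison features F1-F25. X and labels are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 25)             # one row of F1-F25 per document pair
labels = np.random.randint(0, 2, 1000)   # 1 = documents match, 0 = no match

X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2)
clf = LogisticRegression().fit(X_train, y_train)
print('baseline accuracy:', clf.score(X_test, y_test))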

I am using the keras library in Python. After going through the keras documentation and other tutorials available online, I managed to do the following:

import keras
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Each document contains a series of 200 words
# The necessary text pre-processing steps have been completed to transform
# each doc to a fixed-length seq
main_input1 = Input(shape=(200,), dtype='int32', name='main_input1') 
main_input2 = Input(shape=(200,), dtype='int32', name='main_input2') 

# Next I add a word embedding layer (embed_matrix is separately created
# for each word in my vocabulary by reading from a pre-trained embedding model)
x = Embedding(output_dim=300, input_dim=20000,
              input_length=200, weights=[embed_matrix])(main_input1)
y = Embedding(output_dim=300, input_dim=20000,
              input_length=200, weights=[embed_matrix])(main_input2)

# Next separately pass each embedded sequence thru an LSTM layer to transform
# the seq of vectors into a single vector
lstm_out_x1 = LSTM(32)(x) 
lstm_out_x2 = LSTM(32)(y) 

# concatenate the 2 layers and stack a dense layer on top 
x = keras.layers.concatenate([lstm_out_x1, lstm_out_x2]) 
x = Dense(64, activation='relu')(x) 
# generate intermediate output 
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(x) 

# add auxiliary input - the auxiliary input contains 25 features for each document pair
auxiliary_input = Input(shape=(25,), name='aux_input') 

# merge aux output with aux input and stack dense layer on top 
main_input = keras.layers.concatenate([auxiliary_output, auxiliary_input]) 
x = Dense(64, activation='relu')(main_input) 
x = Dense(64, activation='relu')(x) 

# finally add the main output layer 
main_output = Dense(1, activation='sigmoid', name='main_output')(x) 

model = Model(inputs=[main_input1, main_input2, auxiliary_input], outputs=main_output)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy']) 

model.fit([x1, x2, aux_input], y,
          epochs=3, batch_size=32)

However, when I score this model on the training data, I get the same probability for all cases. The problem seems to be with the way the auxiliary input is fed in (it produces meaningful output when I remove the auxiliary input). I also tried inserting the auxiliary input at different places in the network, but somehow I could not get this to work.
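To narrow it down, one thing worth checking is what actually reaches the final dense stack: since aux_output is a single sigmoid unit, the concatenation feeds just one number in [0, 1] alongside the 25 raw features. A rough diagnostic sketch, assuming the model above has been built and x1, x2, aux_input hold the training arrays:

# Sketch: probe the intermediate aux_output to see whether the LSTM branch
# contributes any signal. Assumes the model defined above and the arrays
# x1, x2, aux_input used for training.
probe = Model(inputs=model.inputs,
              outputs=model.get_layer('aux_output').output)
aux_scores = probe.predict([x1, x2, aux_input])
# if min and max are (nearly) identical, the LSTM branch is contributing
# a constant and the final layers effectively see only the 25 raw features
print(aux_scores.min(), aux_scores.max())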

Any pointers?


Not sure whether this is intended, but auxiliary_output alone is just (1). Is that really what you expect — merging the 25 auxiliary inputs with only a single result? And is the model before the auxiliary output meant to be "non-trainable", so that you only train the last part? –


Well, it's a binary classification model, so the final output is (1,). Should the auxiliary output be different? I am just adding 25 features through the auxiliary input, hence the (25,) shape – Dataminer


Have you tried more epochs? –

Answer


Well, this question has been open for several months and people keep voting it up.
I recently did something very similar using this dataset, which can be used to predict credit card defaults. It contains categorical data about the customers (gender, education level, marital status, etc.) as well as their payment history as a time series, so I had to merge the time series with the non-series data. My solution was very similar to yours, combining an LSTM with dense layers, and I tried to adapt that approach to your problem. What worked for me was a dense layer (or two) on the auxiliary input.

Also, in your case a shared layer makes sense, so that the same weights are used to "read" both documents. My proposal, to be tested on your data:

import keras
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Each document contains a series of 200 words
# The necessary text pre-processing steps have been completed to transform
# each doc to a fixed-length seq
main_input1 = Input(shape=(200,), dtype='int32', name='main_input1') 
main_input2 = Input(shape=(200,), dtype='int32', name='main_input2') 

# Next I add a word embedding layer (embed_matrix is separately created
# for each word in my vocabulary by reading from a pre-trained embedding model)
x1 = Embedding(output_dim=300, input_dim=20000,
               input_length=200, weights=[embed_matrix])(main_input1)
x2 = Embedding(output_dim=300, input_dim=20000,
               input_length=200, weights=[embed_matrix])(main_input2)

# Next separately pass each embedded sequence thru an LSTM layer to transform
# the seq of vectors into a single vector
# Comment Manngo: Here I changed to a shared LSTM layer
# Also renamed the embedding outputs, as x and y were confusing
# Now x and y are x1 and x2
lstm_reader = LSTM(32) 
lstm_out_x1 = lstm_reader(x1) 
lstm_out_x2 = lstm_reader(x2) 

# concatenate the 2 layers and stack a dense layer on top 
x = keras.layers.concatenate([lstm_out_x1, lstm_out_x2]) 
x = Dense(64, activation='relu')(x) 
x = Dense(32, activation='relu')(x) 
# generate intermediate output 
# Comment Manngo: This is created as a dead-end 
# It will not be used as an input of any layers below 
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(x) 

# add auxiliary input - the auxiliary input contains 25 features for each document pair
# Comment Manngo: Dense branch on the comparison features
# (the dense outputs get a new name, aux_branch, so that the Input tensor
# itself stays available for the Model definition below)
auxiliary_input = Input(shape=(25,), name='aux_input')
aux_branch = Dense(64, activation='relu')(auxiliary_input)
aux_branch = Dense(32, activation='relu')(aux_branch)

# merge the dense branch on top of the LSTMs with the dense branch on the aux input
# Comment Manngo: this replaces the old merge of aux output with aux input;
# it actually merges the aux output preparation dense with the aux input processing dense
main_input = keras.layers.concatenate([x, aux_branch])
main = Dense(64, activation='relu')(main_input)
main = Dense(64, activation='relu')(main)

# finally add the main output layer 
main_output = Dense(1, activation='sigmoid', name='main_output')(main) 

# Define the model with 3 inputs and 2 outputs, then compile
# Comment Manngo: also define weighting of outputs, main as 1, auxiliary as 0.5
# (plain binary_crossentropy is used for both outputs here; swap in a custom
# weighted loss if your classes are imbalanced)
model = Model(inputs=[main_input1, main_input2, auxiliary_input],
              outputs=[main_output, auxiliary_output])
model.compile(optimizer='adam',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.5},
              metrics=['accuracy'])

# Train model on main_output and on auxiliary_output as a support 
# Comment Manngo: Unknown information marked with placeholders ____ 
# We have 3 inputs: x1 and x2: the 2 strings 
# aux_in: the 25 features 
# We have 2 outputs: main and auxiliary; both have the same targets -> (binary)y 


model.fit({'main_input1': __x1__, 'main_input2': __x2__, 'aux_input': __aux_in__},
          {'main_output': __y__, 'aux_output': __y__},
          epochs=1000,
          batch_size=__,
          validation_split=0.1,
          callbacks=[____])
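The placeholders above are intentional, since I don't know your data. Purely for illustration, here is one way the call might be filled in; EarlyStopping and ModelCheckpoint are standard Keras callbacks, and batch_size=32 is an arbitrary assumption, not a recommendation:

# Illustrative only: hypothetical concrete values for the placeholders above.
from keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    EarlyStopping(monitor='val_loss', patience=10),   # stop when val loss stalls
    ModelCheckpoint('best_model.h5', monitor='val_loss',
                    save_best_only=True),             # keep the best weights
]
model.fit({'main_input1': x1, 'main_input2': x2, 'aux_input': aux_in},
          {'main_output': y, 'aux_output': y},
          epochs=1000,
          batch_size=32,          # arbitrary choice
          validation_split=0.1,
          callbacks=callbacks)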

I don't know how much this helps, since I don't have your data and could not try it. Nevertheless, this is my best shot.
For obvious reasons, I did not run the code above.