
I also wanted to visualize the attention weights of the TensorFlow seq2seq ops for my text summarization task. I think the temporary solution is to use session.run() to evaluate the attention mask tensor mentioned above. Interestingly, the original seq2seq.py ops are considered a legacy version and can't easily be found on GitHub, so I just took the seq2seq.py file from the 0.12.0 wheel distribution and modified it. To draw the heatmap I used the matplotlib package, which is very convenient.
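
As a minimal self-contained sketch of that idea (not the actual textsum graph; the placeholder name and shape below are made up for illustration), you keep a Python handle to the softmax tensor when building the graph and then fetch it with session.run() like any other output:

import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the unnormalized attention scores 's' in attention()
scores = tf.placeholder(tf.float32, [None, 10], name="attention_scores")
attn_mask = tf.nn.softmax(scores)  # same op as nn_ops.softmax(s)

with tf.Session() as sess:
    # Fetch the attention mask exactly like any other tensor in the graph
    attn_values = sess.run(attn_mask, feed_dict={scores: np.random.randn(2, 10)})
    print(attn_values.shape)  # (2, 10): one weight per encoder position, per example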

The final output of textsum on a news headline, with the attention visualization, looks like this: (attention heatmap image)

I modified the code as follows: https://github.com/rockingdingo/deepnlp/tree/master/deepnlp/textsum#attention-visualization

seq2seq_attn.py

# Find the attention mask tensor inside the function attention_decoder() -> attention().
# Add the attention mask tensor to the return statement of every function that calls
# attention_decoder(), all the way up to model_with_buckets(), which is the final
# function I use for bucket training.

def attention(query):
    """Put attention masks on hidden using hidden_features and query."""
    ds = []  # Results of attention reads will be stored here.

    # some code

    for a in xrange(num_heads):
        with variable_scope.variable_scope("Attention_%d" % a):
            # some code

            s = math_ops.reduce_sum(v[a] * math_ops.tanh(hidden_features[a] + y),
                                    [2, 3])
            # This is the attention mask tensor we want to extract
            a = nn_ops.softmax(s)

            # some code

    # add 'a' to the return statement
    return ds, a
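
To make it concrete, here is a tiny NumPy sketch (toy shapes, not the real TF graph) of what attention() computes. It shows why 'a' is the tensor worth returning: it holds one normalized weight per encoder position for the current decoder step, while d is the weighted read that the decoder actually consumes:

import numpy as np

def toy_attention(query, hidden, v):
    # hidden: [attn_length, attn_size] encoder states; query: [attn_size] decoder state
    s = np.tanh(hidden + query).dot(v)        # unnormalized scores, shape [attn_length]
    a = np.exp(s) / np.exp(s).sum()           # softmax -> the attention mask
    d = (a[:, None] * hidden).sum(axis=0)     # attention-weighted context read
    return d, a                               # return the mask alongside the read

hidden = np.random.randn(7, 4)                # 7 encoder positions, size-4 states
d, a = toy_attention(np.random.randn(4), hidden, np.random.randn(4))
print(a.shape, round(a.sum(), 6))             # (7,) 1.0: one weight per encoder position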

seq2seq_model_attn.py

predict_attn.py

eval.py

# Use the plot_attention function in eval.py to visualize the 2D attention weight ndarray during prediction.

eval.plot_attention(attn_matrix[0:ty_cut, 0:tx_cut], X_label=X_label, Y_label=Y_label)
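
In case the repo is not handy, a plot_attention-style helper is only a few lines of matplotlib. This is a hypothetical minimal version (not the exact code in eval.py), assuming attn_matrix is a 2D ndarray of [output_length, input_length] weights:

import numpy as np
import matplotlib.pyplot as plt

def plot_attention(attn_matrix, X_label=None, Y_label=None):
    """Draw a 2D attention weight matrix as a heatmap.
    attn_matrix: ndarray [output_len, input_len]; X_label: input tokens; Y_label: output tokens."""
    fig, ax = plt.subplots()
    ax.imshow(attn_matrix, cmap='viridis', aspect='auto')
    if X_label is not None:
        ax.set_xticks(range(len(X_label)))
        ax.set_xticklabels(X_label, rotation=90)
    if Y_label is not None:
        ax.set_yticks(range(len(Y_label)))
        ax.set_yticklabels(Y_label)
    plt.tight_layout()
    plt.show()

# e.g. plot_attention(np.random.rand(5, 8),
#                     X_label=['src%d' % i for i in range(8)],
#                     Y_label=['out%d' % i for i in range(5)])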

Perhaps in the future TensorFlow will have a better way to extract and visualize the attention weight maps. Any thoughts?


Hey, nice answer. I tried the same thing but I'm getting unexpected attention vectors. Could you take a look: http://stackoverflow.com/questions/43123105/weird-attention-weights-when-trying-to-learn-to-inverse-sequence-with-seq2seq thx – pltrdy