Softmax Jacobian in TensorFlow

Suppose I have a simple one-layer neural network:

x = tf.placeholder(tf.float32, [batch_size, input_dim]) 
W = tf.Variable(tf.random_normal([input_dim, output_dim])) 
a = tf.matmul(x, W) 
y = tf.nn.softmax(a)

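So the variable y has shape batch_size by output_dim. I want to compute the Jacobian of y with respect to a for each sample in the batch, which would have shape batch_size by output_dim by output_dim. Mathematically, the Jacobian is (dy/da)_{i,j} = -y_i y_j for i != j, and (dy/da)_{i,i} = y_i (1 - y_i) on the diagonal.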
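
For reference, the formula above follows from the quotient rule applied to the softmax definition. A short derivation for a single sample, where S (the normalizing sum) and \delta_{ij} (the Kronecker delta) are notation introduced here and not part of the original post:

```latex
y_i = \frac{e^{a_i}}{S}, \qquad S = \sum_k e^{a_k}

\frac{\partial y_i}{\partial a_j}
  = \frac{\delta_{ij}\, e^{a_i}\, S - e^{a_i}\, e^{a_j}}{S^2}
  = y_i \left( \delta_{ij} - y_j \right)
```

Setting i = j gives y_i (1 - y_i), and i != j gives -y_i y_j, matching the two cases stated above.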

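How can I compute the Jacobian of the softmax with respect to its input in TensorFlow? I know tf.gradients computes the gradient of a scalar with respect to a tensor, so I figure that some combination of looping in TensorFlow with tf.gradients, or even just implementing the analytical form given above, should work. But I am not sure how to do this with TensorFlow ops and would appreciate any code that does it!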

2017-01-25 user19346

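It looks like tf.gradients sums over output_dim. Workaround: unstack y along that axis, take the gradient of each slice, then stack the results back together. Not sure how this affects efficiency...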

import numpy as np 
import tensorflow as tf 

batch_size = 3 
input_dim = 10 
output_dim = 20 

W_vals = np.random.rand(input_dim, output_dim).astype(np.float32) 

graph = tf.Graph() 
with graph.as_default(): 
    x = tf.placeholder(tf.float32, [batch_size, input_dim]) 
    # Use a constant for easier checking 
    W = tf.constant(W_vals, dtype=tf.float32) 
    a = tf.matmul(x, W) 
    y = a 
    # remove softmax for easier checking 
    # y = tf.nn.softmax(a) 

    grads = tf.stack([tf.gradients(yi, x)[0] for yi in tf.unstack(y, axis=1)], 
        axis=2) 

with tf.Session(graph=graph) as sess: 
    x_vals = np.random.rand(batch_size, input_dim).astype(np.float32) 
    g_vals = sess.run(grads, feed_dict={x: x_vals}) 

# check gradients match 
tol = 1e-10 
for i in range(batch_size): 
    if np.max(np.abs(g_vals[i] - W_vals)) >= tol: 
        raise Exception('Gradient mismatch for sample %d' % i) 
print('Gradients seem to match!')
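
The analytical form from the question can also be written without any loop over tf.gradients. Below is a minimal NumPy sketch of that idea, not TensorFlow code: the helper names softmax and softmax_jacobian are made up for this illustration, and the finite-difference loop is only there as an independent check. The broadcasting expression should carry over to the equivalent TensorFlow ops, but that is an assumption and is not tested here.

```python
import numpy as np

def softmax(a):
    # Subtract the row-wise max for numerical stability; the result is unchanged.
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def softmax_jacobian(y):
    # y: softmax outputs of shape [batch_size, output_dim].
    # Returns J of shape [batch_size, output_dim, output_dim] with
    # J[b, i, j] = y[b, i] * (delta_ij - y[b, j]).
    eye = np.eye(y.shape[1], dtype=y.dtype)
    return y[:, :, None] * (eye[None, :, :] - y[:, None, :])

batch_size, output_dim = 3, 20
rng = np.random.RandomState(0)
a = rng.rand(batch_size, output_dim)
y = softmax(a)
J = softmax_jacobian(y)

# Central finite differences: perturb one input column at a time.
eps = 1e-6
J_fd = np.zeros_like(J)
for j in range(output_dim):
    da = np.zeros_like(a)
    da[:, j] = eps
    J_fd[:, :, j] = (softmax(a + da) - softmax(a - da)) / (2 * eps)

print(J.shape)
print(np.max(np.abs(J - J_fd)))
```

Because each row of y sums to 1, every row of each per-sample Jacobian should sum to approximately zero, which makes a cheap extra sanity check.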

2017-01-25 01:58:31 DomJack
