Using a softmax layer inside the objective function itself

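I have a regular CNN with standard MLP layers on top of it. On top of the MLP I also have a softmax layer but, unlike in a conventional network, it is not fully connected to the MLP below; it is made up of subgroups.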

To describe the softmax layer further, it looks like this:

Neur1A Neur2A ... NeurNA  Neur1B Neur2B ... NeurNB  Neur1C Neur2C ... NeurNC
        Group A                   Group B                   Group C

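There are many more groups. Each group has its own softmax, independent of the other groups. So in a sense it is several independent classifications (although in reality it is not).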

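What I need is for the indices of the activated neurons to increase monotonically across groups. For example, if Neuron5 is activated in Group A, I want the activated neuron in Group B to have an index greater than or equal to 5. The same holds between Group B and Group C, and so on.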
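As a minimal sketch of this constraint (plain Python, not the TensorFlow code from this question; `group_argmax` and `is_monotonic` are hypothetical helper names and the scores are made up), the activated index of each group can be read off with an argmax and checked for a non-decreasing order. Softmax preserves the ordering of its inputs, so the argmax of the raw scores is the same as the argmax of the softmax output.

```python
def group_argmax(scores, group_size):
    """Split a flat list of scores into groups and return each group's argmax index."""
    indices = []
    for start in range(0, len(scores), group_size):
        group = scores[start:start + group_size]
        indices.append(max(range(len(group)), key=group.__getitem__))
    return indices


def is_monotonic(indices):
    """True if the activated indices never decrease from one group to the next."""
    return all(a <= b for a, b in zip(indices, indices[1:]))


# Three groups (A, B, C) of four neurons each; the values are illustrative only.
scores = [0.1, 0.7, 0.1, 0.1,   # Group A: neuron 1 is active
          0.1, 0.1, 0.6, 0.2,   # Group B: neuron 2 is active
          0.2, 0.1, 0.1, 0.6]   # Group C: neuron 3 is active
active = group_argmax(scores, group_size=4)
ok = is_monotonic(active)
```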

This softmax layer, which contains the neurons of all the groups, is in fact not my last layer; interestingly, it is an intermediate layer.

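To achieve this monotonicity, I added another term to my loss function that penalizes non-monotonic indices of activated neurons. Here is some code:

The code for the softmax layer and its output: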

def compute_image_estimate(layer2_input):
    # One column of indices is appended per softmax group.
    estimated_yps = tf.zeros([FLAGS.batch_size, 0], dtype=tf.int64)
    for pix in xrange(NUM_CLASSES):
        pixrow = int(pix / width)
        rowdata = image_pixels[:, pixrow * width:(pixrow + 1) * width]

        # Each group gets its own weights, biases and softmax.
        with tf.variable_scope('layer2_' + '_' + str(pix)) as scope:
            weights = _variable_with_weight_decay('weights', shape=[layer2_input.get_shape()[1], width], stddev=0.04, wd=0.0000000)
            biases = _variable_on_cpu('biases', [width], tf.constant_initializer(0.1))
            y = tf.nn.softmax(tf.matmul(layer2_input, weights) + biases)
            argyp = width - 1 - tf.argmax(y, 1)
            argyp = tf.reshape(argyp, [FLAGS.batch_size, 1])
        estimated_yps = tf.concat(1, [estimated_yps, argyp])

    # Return only after every group has been processed.
    return estimated_yps

estimated_yps is then passed to a function that quantifies monotonicity:

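def compute_monotonicity(yp):
    # Per-example penalty, accumulated over horizontally adjacent pairs.
    sm = tf.zeros([FLAGS.batch_size])

    for curr_row in xrange(height):
        for curr_col in xrange(width - 1):
            pix = curr_row * width + curr_col
            # Non-zero only when yp[:, pix] < yp[:, pix + 1].
            diff = tf.to_int32(yp[:, pix] - yp[:, pix + 1])
            sm = sm + alpha * tf.to_float(tf.square(tf.minimum(0, diff)))

    return sm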
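Written out, the loop above accumulates the following penalty for each example in the batch. The symbols are introduced here only for illustration: $y_p$ stands for `yp[:, p]`, $H$ for `height`, $W$ for `width` and $\alpha$ for `alpha`.

```latex
\mathrm{sm} \;=\; \alpha \sum_{r=0}^{H-1} \sum_{c=0}^{W-2}
  \min\bigl(0,\; y_{rW+c} - y_{rW+c+1}\bigr)^{2}
```

A pair contributes only when $y_{rW+c} < y_{rW+c+1}$. Because `argyp` is computed as `width - 1 - tf.argmax(y, 1)`, that is exactly the case where the argmax index falls from one position to the next within a row.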

And the loss function:

def loss(estimated_yp, SOME_OTHER_THINGS):
    tf.add_to_collection('losses', SOME_OTHER_THINGS)

    # Average the per-example monotonicity penalty over the batch.
    monotonicity_metric = tf.reduce_mean(compute_monotonicity(estimated_yp))
    tf.add_to_collection('losses', monotonicity_metric)
    return tf.add_n(tf.get_collection('losses'), name='total_loss')

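Now my problem is that when I do not use SOME_OTHER_THINGS, which are the conventional metrics, I get ValueError: No gradients provided for any variable for the monotonicity metric.

It seems that gradients are not defined for the softmax layer output when it is used like this.

What am I doing wrong? Any help would be appreciated.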


2016-02-26 eurotomania

Apologies.. I realized that the problem is that the tf.argmax function apparently has no gradient defined.
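One common way around this, not taken from the original post, is to replace the hard argmax with a "soft argmax": the expected index under the softmax distribution, which varies smoothly with the logits. The sketch below uses plain Python rather than TensorFlow to stay self-contained; `soft_argmax`, `monotonicity_penalty` and the `temperature` parameter are hypothetical names, and the logits are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_argmax(logits, temperature=1.0):
    """Expected index under the softmax: a smooth stand-in for argmax."""
    probs = softmax(logits, temperature)
    return sum(i * p for i, p in enumerate(probs))


def monotonicity_penalty(group_logits, temperature=1.0):
    """Sum of squared hinge terms that fire when the soft index drops between groups."""
    soft = [soft_argmax(g, temperature) for g in group_logits]
    return sum(max(0.0, a - b) ** 2 for a, b in zip(soft, soft[1:]))


# Illustrative logits only: group A peaks at index 2, group B at index 0.
groups = [[0.0, 0.0, 5.0], [5.0, 0.0, 0.0]]
penalty = monotonicity_penalty(groups)

# Nudging a single logit changes the penalty, so a useful gradient exists,
# whereas the hard argmax indices (2 and 0) would not move at all.
nudged = [[0.0, 0.0, 5.0], [5.0, 0.1, 0.0]]
penalty_nudged = monotonicity_penalty(nudged)
```

In TensorFlow the same idea would amount to a weighted sum of the softmax output with a vector of indices, which should keep the penalty differentiable with respect to the layer's weights. Lowering the temperature moves the soft index closer to the hard argmax, at the cost of smaller gradients away from the peak.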


2016-02-26 22:59:22 eurotomania
