
While studying TensorFlow, I ran into a problem: the cost function outputs 'nan'.

Also, if you find any other mistakes in the source code, please let me know about them.

I tried to feed the cost function's value into my training model, but it does not work.

import tensorflow as tf

tf.reset_default_graph()

tf.set_random_seed(777) 

X = tf.placeholder(tf.float32, [None, 20, 20, 3]) 
Y = tf.placeholder(tf.float32, [None, 1]) 

with tf.variable_scope('conv1') as scope: 
    W1 = tf.Variable(tf.random_normal([4, 4, 3, 32], stddev=0.01), name='weight1')  
    L1 = tf.nn.conv2d(X, W1, strides=[1, 1, 1, 1], padding='SAME') 
    L1 = tf.nn.relu(L1) 
    L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') 
    L1 = tf.reshape(L1, [-1, 10 * 10 * 32]) 

    W1_hist = tf.summary.histogram('conv_weight1', W1) 
    L1_hist = tf.summary.histogram('conv_layer1', L1) 

with tf.name_scope('fully_connected_layer1') as scope: 
    W2 = tf.get_variable('W2', shape=[10 * 10 * 32, 1], initializer=tf.contrib.layers.xavier_initializer())   
    b = tf.Variable(tf.random_normal([1])) 
    hypothesis = tf.matmul(L1, W2) + b 

    W2_hist = tf.summary.histogram('fully_connected_weight1', W2) 
    b_hist = tf.summary.histogram('fully_connected_bias', b) 
    hypothesis_hist = tf.summary.histogram('hypothesis', hypothesis) 

with tf.name_scope('cost') as scope: 
    cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) 
    cost_summary = tf.summary.scalar('cost', cost) 

with tf.name_scope('train_optimizer') as scope: 
    optimizer = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(cost) 

predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) 
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) 
accuracy_summary = tf.summary.scalar('accuracy', accuracy) 

train_data_batch, train_labels_batch = tf.train.batch([train_data, train_labels], enqueue_many=True , batch_size=100, allow_smaller_final_batch=True) 

with tf.Session() as sess: 
    # tensorboard --logdir=./logs/planesnet2_log 
    merged_summary = tf.summary.merge_all() 
    writer = tf.summary.FileWriter('./logs/planesnet2_log') 
    writer.add_graph(sess.graph) 

    sess.run(tf.global_variables_initializer()) 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 
    total_cost = 0 

    for step in range(20):
        x_batch, y_batch = sess.run([train_data_batch, train_labels_batch])
        feed_dict = {X: x_batch, Y: y_batch}
        _, cost_val = sess.run([optimizer, cost], feed_dict=feed_dict)
        total_cost += cost_val
        print('total_cost: ', total_cost, 'cost_val: ', cost_val)
    coord.request_stop() 
    coord.join(threads) 

Answers


You use a cross-entropy loss without a sigmoid activation on hypothesis, so your values are not bounded in [0, 1]. The log function is not defined for negative values, and it most likely got some. Add a sigmoid and an epsilon factor to avoid negative or zero values and you should be fine.
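
A minimal sketch of that fix (the epsilon value 1e-7 is an assumed choice, not from the answer; any small positive constant works):

# Bound hypothesis to (0, 1) with a sigmoid, then guard both logs
# with a small epsilon so log(0) can never occur.
hypothesis = tf.sigmoid(tf.matmul(L1, W2) + b)
epsilon = 1e-7  # assumed value
cost = -tf.reduce_mean(Y * tf.log(hypothesis + epsilon)
                       + (1 - Y) * tf.log(1 - hypothesis + epsilon))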


I see: 'hypothesis = tf.sigmoid(tf.matmul(L1, W2) + b)'. – Kim


However, I don't understand 'the log function is not defined for negative values' or the 'epsilon factor'. – Kim


Could you tell me what to do? – Kim


As far as I know,

the cross-entropy cost function assumes the hypothesis being predicted is a probability value, because cross entropy applies the log function in the formula -[Y * log(h) + (1 - Y) * log(1 - h)] (the same cost used in the question). The log is only defined for positive inputs, so cross-entropy loss can only be used when the predictions lie in (0, 1).
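
As a quick illustration of the failure mode (using NumPy only for the demo): the log of a non-positive number is nan or -inf, and either one propagates through the mean and makes the whole cost nan.

import numpy as np

# An unbounded hypothesis can go negative or hit exactly 0;
# the log then yields nan or -inf and the cost becomes nan.
print(np.log(-0.5))  # nan (RuntimeWarning: invalid value)
print(np.log(0.0))   # -inf (RuntimeWarning: divide by zero)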

Therefore, you have to use the softmax function to compute a probabilistic result for hypothesis:

W2 = tf.get_variable('W2', shape=[10 * 10 * 32, 1],
                     initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.random_normal([1]))

# hypothesis = tf.matmul(L1, W2) + b
hypothesis = tf.nn.softmax(tf.add(tf.matmul(L1, W2), b))
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))

Or you can use this code:

W2 = tf.get_variable('W2', shape=[10 * 10 * 32, 1],
                     initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.random_normal([1]))

hypothesis = tf.matmul(L1, W2) + b
# softmax_cross_entropy_with_logits returns one loss value per example,
# so reduce it to a scalar before passing it to the optimizer.
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=hypothesis))
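
Note that with a single output unit, as in this question's [None, 1] shape, a softmax over that one logit always returns 1.0, so for binary labels a sigmoid-based loss is the usual fit. A sketch of that variant (an assumption on my part, not part of the original answer):

logits = tf.matmul(L1, W2) + b
# Sigmoid cross entropy matches a [None, 1] binary target and avoids
# the degenerate single-unit softmax (which is constantly 1.0).
cost = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=Y, logits=logits))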

I know you stated it in the comments, but if someone copies code like this, they will end up with a double softmax computation. It is better to show the two approaches separately rather than merging them in an incompatible way. – lejlot


@lejlot Oh, yes. You are right. – yumere