
TensorFlow Python script dies

I am a beginner with TensorFlow. My TensorFlow script suddenly exits, saying Killed. My code is as follows:

import tensorflow as tf 
# Load data X_train, y_train and X_valid, y_valid 

# An image augmentation pipeline 
def augment(x): 
    x = tf.image.random_brightness(x, max_delta=0.2) 
    x = tf.image.random_contrast(x, 0.5, 2) 
    return x 

from sklearn.utils import shuffle 
X_train, y_train = shuffle(X_train, y_train) 

def LeNet(x): 
    # Define LeNet architecture 
    return logits 

# Features: 
x = tf.placeholder(tf.float32, (None, 32, 32, 3)) 
# Labels: 
y = tf.placeholder(tf.int32, (None)) 
# Dropout probability 
prob = tf.placeholder(tf.float32, (None)) 
# Learning rate 
rate = tf.placeholder(tf.float32, (None)) 
rate_summary = tf.summary.scalar('learning rate', rate) 

logits = LeNet(x) 
accuracy_operation = ...  # accuracy computation defined here (elided)

accuracy_summary = tf.summary.scalar('validation accuracy', accuracy_operation) 
saver = tf.train.Saver() 

summary = tf.summary.merge_all() 
writer = tf.summary.FileWriter('./summary', tf.get_default_graph()) 

def evaluate(X_data, y_data): 
    # Return accuracy with X_data, y_data 
    return accuracy 

with tf.Session() as sess: 

    saver.restore(sess, './lenet') 

    for i in range(EPOCHS):
        X_train, y_train = shuffle(X_train, y_train)
        for offset in range(0, len(X_train), BATCH_SIZE):
            end = offset + BATCH_SIZE
            batch_x, batch_y = X_train[offset:end], y_train[offset:end]
            batch_x = sess.run(augment(batch_x))

            # Run the training operation, update learning rate

        validation_accuracy = evaluate(X_valid, y_valid)
        writer.add_summary(sess.run(summary, feed_dict={x: X_valid, y: y_valid, prob: 1., rate: alpha}))

I have omitted the parts that I am sure are not causing the problem. I know those parts are fine because the script ran without any trouble before I changed it. After I added certain parts (mainly the summary-writing operations), the script suddenly says Killed and exits after running a certain number of training operations. I suspect a memory leak, but I have not been able to pin it down.
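One thing I am now wondering about: augment(batch_x) is called inside the training loop, so every batch adds fresh tf.image ops to the default graph, and the graph keeps growing for as long as training runs. A minimal sketch of what I mean by building the augmentation ops once instead, reusing the same augment, saver, X_train and BATCH_SIZE as above:

    # Build the augmentation graph once, before the session starts.
    aug_input = tf.placeholder(tf.float32, (None, 32, 32, 3))
    aug_op = augment(aug_input)

    with tf.Session() as sess:
        saver.restore(sess, './lenet')
        for offset in range(0, len(X_train), BATCH_SIZE):
            batch_x = X_train[offset:offset + BATCH_SIZE]
            # Reuse the same graph nodes every iteration instead of adding new ones.
            batch_x = sess.run(aug_op, feed_dict={aug_input: batch_x})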


Can you check the output of the Linux 'dmesg' command? Did your program die because it ran out of memory? – drpng
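(For anyone following this suggestion: a minimal sketch of scanning dmesg for OOM-killer entries from Python, assuming a Linux host and permission to read the kernel log; the exact message format varies by kernel.)

    import subprocess

    # Read the kernel ring buffer. A bare 'Killed' on the shell usually means
    # SIGKILL, and the OOM killer logs a 'Killed process ...' line when it fires.
    log = subprocess.check_output(['dmesg']).decode('utf-8', errors='replace')
    for line in log.splitlines():
        if 'oom' in line.lower() or 'killed process' in line.lower():
            print(line)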

Answer


I ran into a similar problem just a few days ago. In my case, some of my operations turned out to be computationally very heavy, as I only found out later. As soon as I reduced the size of my tensors, the message disappeared and my code ran. I cannot say exactly what caused the problem, but from my experience, and from what you describe (the error only appeared after you added the summaries), I would suggest playing with the size of your X_valid and y_valid. It may simply be that the writer cannot cope with that much data...
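For example, here is a sketch of evaluating in batches and writing the averaged value yourself, instead of feeding all of X_valid through the merged summary at once. It reuses x, y, prob, accuracy_operation, writer and the epoch counter i from your question, and builds a tf.Summary protobuf by hand:

    # Accumulate accuracy over small validation batches.
    total, weighted_acc = 0, 0.0
    for offset in range(0, len(X_valid), BATCH_SIZE):
        bx = X_valid[offset:offset + BATCH_SIZE]
        by = y_valid[offset:offset + BATCH_SIZE]
        acc = sess.run(accuracy_operation, feed_dict={x: bx, y: by, prob: 1.})
        weighted_acc += acc * len(bx)
        total += len(bx)

    # Write one scalar per epoch with a hand-built Summary protobuf.
    summ = tf.Summary(value=[tf.Summary.Value(
        tag='validation accuracy', simple_value=weighted_acc / total)])
    writer.add_summary(summ, i)

This keeps the memory footprint of evaluation bounded by BATCH_SIZE regardless of how large the validation set grows.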