TensorFlow: Using an input pipeline (.csv) as a dictionary for training

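I am trying to train a model on a .csv dataset (5008 columns, 533 rows). I use a text reader (tf.TextLineReader) to parse the data into two tensors: one holding the data to train on [example] and one holding the correct labels [label]: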

def read_my_file_format(filename_queue): 
    reader = tf.TextLineReader() 
    key, record_string = reader.read(filename_queue) 
    record_defaults = [[0.5] for row in range(5008)] 

    #Left out most of the columns for obvious reasons 
    col1, col2, col3, ..., col5008 = tf.decode_csv(record_string, record_defaults=record_defaults) 
    example = tf.stack([col1, col2, col3, ..., col5007]) 
    label = col5008 
    return example, label 

def input_pipeline(filenames, batch_size, num_epochs=None): 
    filename_queue = tf.train.string_input_producer(filenames, num_epochs=num_epochs, shuffle=True) 
    example, label = read_my_file_format(filename_queue) 
    min_after_dequeue = 10000 
    capacity = min_after_dequeue + 3 * batch_size 
    example_batch, label_batch = tf.train.shuffle_batch([example, label], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue) 
    return example_batch, label_batch

This part works when I execute something like:

with tf.Session() as sess: 
    ex_b, l_b = input_pipeline(["Tensorflow_vectors.csv"], 10, 1) 
    print("Test: ",ex_b)

My result is Test: Tensor("shuffle_batch:0", shape=(10, 5007), dtype=float32)

So far this seems fine to me. Next, I created a simple model with two hidden layers (512 and 256 nodes respectively) and tried to train it:

batch_x, batch_y = input_pipeline(["Tensorflow_vectors.csv"], batch_size) 
_, cost = sess.run([optimizer, cost], feed_dict={x: batch_x.eval(), y: batch_y.eval()})

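I based this approach on this example that uses the MNIST database. Training the model is where things go wrong: when I execute this, TensorFlow just hangs, even with batch_size = 1. If I leave out the .eval() calls that should fetch the actual data from the tensors, I get the following response: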

TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.

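This error I can understand, but I don't understand why the program hangs when I do include the .eval() calls, and I don't know where to find any information about this issue.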
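One detail of the input_pipeline function above that may be worth checking is how min_after_dequeue = 10000 compares with the 533 rows in the file. As documented for tf.RandomShuffleQueue, a dequeue blocks until enough elements are queued to leave min_after_dequeue behind, unless the queue has been closed. The plain-Python arithmetic below (no TensorFlow needed; the function name passes_before_first_batch is made up for illustration) estimates how many full passes over the file that takes when num_epochs=None. It is a rough sketch and is not confirmed to be the cause of the hang described here.

```python
import math

ROWS_IN_FILE = 533          # rows in the CSV, taken from the question
MIN_AFTER_DEQUEUE = 10000   # value used in input_pipeline
BATCH_SIZE = 10             # batch size used in the test call


def passes_before_first_batch(rows, min_after_dequeue, batch_size):
    """Estimate full passes over the file before one batch can be dequeued.

    Assumes a dequeue of `batch_size` elements needs at least
    min_after_dequeue + batch_size elements in an open shuffle queue.
    """
    needed = min_after_dequeue + batch_size
    return math.ceil(needed / rows)


# Same capacity formula as in input_pipeline.
capacity = MIN_AFTER_DEQUEUE + 3 * BATCH_SIZE
passes = passes_before_first_batch(ROWS_IN_FILE, MIN_AFTER_DEQUEUE, BATCH_SIZE)

print("capacity:", capacity)
print("passes over the file before the first batch:", passes)
```

With num_epochs=1 the input ends after a single pass and the shuffle queue is closed, at which point min_after_dequeue is ignored, so this estimate only applies to the call that uses num_epochs=None.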

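Edit: I have included the latest version of my entire script here. The program still hangs, even though I implemented (as far as I know) the solution provided by vijay m.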

2017-07-01 Voidling

Could you please add the entire code?

The entire code can be found here: [link](https://github.com/Voidling0/TFCSV2/blob/master/script.py) – Voidling

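As the error says, you are trying to feed a tensor through feed_dict. You have defined an input_pipeline queue, and its output cannot be passed through feed_dict. The proper way to pass the data to the model and train is shown in the code below: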

# A queue which will return batches of inputs 
batch_x, batch_y = input_pipeline(["Tensorflow_vectors.csv"], batch_size) 

# Feed it to your neural network model: 
# Every time this is called, it will pull data from the queue. 
logits = neural_network(batch_x, batch_y, ...) 

# Define cost and optimizer 
cost = ... 
optimizer = ... 

# Evaluate the graph on a session: 
with tf.Session() as sess: 
    init_op = ... 
    sess.run(init_op) 

    # Start the queues 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(sess=sess, coord=coord) 

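    # Loop through data and train
    for step in range(num_steps):  # num_steps: placeholder for the number of training steps
        # Store the fetched value under a new name so the `cost` tensor is not overwritten
        _, cost_value = sess.run([optimizer, cost])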

    coord.request_stop() 
    coord.join(threads)
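The key step in the code above is starting the queue runners before calling sess.run: until those threads run, nothing fills the queue and any read from it blocks. The sketch below is only a plain-Python analogy using the standard queue and threading modules, not TensorFlow's implementation; the names try_get and producer are made up for illustration, and a timeout stands in for the indefinite hang.

```python
import queue
import threading


def try_get(q, timeout=0.2):
    """Return the next item, or None if nothing arrives within `timeout` seconds."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None


def producer(q, rows):
    """Stand-in for a queue runner thread: pushes rows into the queue."""
    for row in rows:
        q.put(row)


q = queue.Queue(maxsize=8)

# No producer thread has been started yet, so the consumer gets nothing.
before = try_get(q)

# Start the producer (loosely analogous to tf.train.start_queue_runners).
worker = threading.Thread(target=producer, args=(q, [[0.1, 0.2], [0.3, 0.4]]))
worker.start()
after = try_get(q, timeout=2.0)
worker.join()

print("before starting the producer:", before)
print("after starting the producer:", after)
```

Without the timeout, the first read would wait forever, which resembles a session call that blocks on a queue nobody is filling.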

2017-07-01 16:13:04

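I really appreciate your help! I had to make one more change because the dimensions of my vectors did not match (I solved this with a reshape, 'batch_y = tf.reshape(batch_y, [12, 1])'), but I am still at a loss because the program hangs again. If you are willing to take a look, here is the link to my entire code: [link](https://github.com/Voidling0/TFCSV2/blob/master/scriptv2.py). I think this might also help others getting into TensorFlow, since it is sometimes hard to work out why a program hangs. – Voidling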

Note: to be precise, it hangs after line **118**. By the way, I am running this on a 980Ti, so I don't expect the hardware to be the cause of this issue. – Voidling

Can you share the input 'csv'?
