2017-04-18 25 views

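TensorFlow: int and float types in record_defaults do not work

I want to import columns of various types, such as int, float, and string, as tensors. Only the first record_defaults below works, the one where all the feature columns are set to string.

With the others I get the error shown below. Is there any way to use TensorFlow with placeholders of various types?

The CSV file comes from AWS.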

import tensorflow as tf 

# https://s3.amazonaws.com/aml-sample-data/banking.csv 
file1="/Users/Q/Downloads/banking.csv" 
filename_queue = tf.train.string_input_producer([file1]) 
reader = tf.TextLineReader(skip_header_lines=1) 
key, value = reader.read(filename_queue) 

# record_defaults = [[""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [0]] 
# record_defaults = [[0], [""], [""], [""], [""], [""], [""], [""], [""], [""], [0], [0], [0], [0], [""], [0.0], [0.0], [0.0], [0.0], [0.0], [0]] 
record_defaults = [[0], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [""], [0]] 

cols = tf.decode_csv(value, record_defaults=record_defaults, field_delim=",") 

features = tf.stack(cols[:-1]) 

with tf.Session() as sess: 
    # Start populating the filename queue. 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 

    for i in range(1000):
        # Retrieve a single instance:
        example, label = sess.run([features, cols[-1]])

    coord.request_stop() 
    coord.join(threads) 

Error message

/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Users/Q/Dropbox/code/Analyzing-Tea/tf-test.py 
Traceback (most recent call last): 
    File "/Users/Q/Dropbox/code/Analyzing-Tea/tf-test.py", line 14, in <module> 
    features = tf.stack(cols[:-1]) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/ops/array_ops.py", line 715, in stack 
    return gen_array_ops._pack(values, axis=axis, name=name) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1975, in _pack 
    result = _op_def_lib.apply_op("Pack", values=values, axis=axis, name=name) 
    File "/Library/Python/2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 463, in apply_op 
    raise TypeError("%s that don't all match." % prefix) 
TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [int32, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string] that don't all match. 

Process finished with exit code 1 

Answer

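The culprit is this line: features = tf.stack(cols[:-1]). It tries to stack all the feature column tensors into a single tensor, which only works if they all have the same type (and each column's type is inferred from its entry in record_defaults).

So don't stack them, and you will be fine.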
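As a rough illustration of "don't stack them", here is a minimal sketch that keeps the decoded columns as a plain Python list. It assumes TensorFlow 2.x with eager execution, where the decoder is exposed as tf.io.decode_csv (the question uses the older tf.decode_csv name). The three-column row is a made-up stand-in for one line of banking.csv, not real data from that file.

```python
import tensorflow as tf

# Hypothetical row standing in for one CSV line:
# an int feature, a string feature, and an int label.
line = tf.constant("44,blue-collar,0")
record_defaults = [[0], [""], [0]]

# decode_csv returns one tensor per column; each dtype follows
# the corresponding entry of record_defaults.
cols = tf.io.decode_csv(line, record_defaults=record_defaults, field_delim=",")

# Keep the feature columns as a list instead of calling tf.stack(cols[:-1]),
# so columns with different dtypes never have to share one tensor.
features = cols[:-1]
label = cols[-1]

feature_dtypes = [c.dtype for c in features]
print(feature_dtypes, label.dtype)
```

In the graph-mode code from the question, the same idea should apply: Session.run accepts nested lists of fetches, so sess.run([features, cols[-1]]) can stay as it is once features is a list rather than a stacked tensor.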

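Thanks, @alexandre-passos! I thought the "don't all match" error message meant that I had given the wrong type for some column, but it actually means that all the columns being stacked must have the same type.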

Yes, if you want to combine all the columns into a single tensor (tf.stack), they need to have the same type.

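According to the documentation on handling CSV files at https://www.tensorflow.org/programmers_guide/reading_data, we need to pack all the features together. How do we handle them if they are of different types? – bicepjai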
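One possible way to handle the mixed-type case raised in this comment, offered as a sketch and not as the approach the linked guide prescribes: group the columns by dtype, cast the numeric ones to a common float type before stacking, and keep the string columns in a separate tensor. The example assumes TensorFlow 2.x with eager execution and tf.io.decode_csv; the five-column row is hypothetical.

```python
import tensorflow as tf

# Hypothetical row: int, string, float, string, int label.
line = tf.constant("44,blue-collar,1.1,married,0")
record_defaults = [[0], [""], [0.0], [""], [0]]

cols = tf.io.decode_csv(line, record_defaults=record_defaults)
feature_cols, label = cols[:-1], cols[-1]

# Numeric columns: cast to float32 so they share one dtype, then stack.
numeric = tf.stack(
    [tf.cast(c, tf.float32) for c in feature_cols if c.dtype != tf.string]
)

# String columns already share a dtype, so they can be stacked as they are.
categorical = tf.stack([c for c in feature_cols if c.dtype == tf.string])

print(numeric.dtype, numeric.shape, categorical.dtype, categorical.shape)
```

The string tensor would still need some encoding step (for example a vocabulary lookup) before it could be fed to a numeric model; that step is left out here.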